CPU: i7-6500U, 2C4T (2 cores, 4 threads)
Parallelized the loops with TBB (also experimented with OpenMP along the way).
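As a rough illustration of what parallelizing a reduction loop looks like, here is a minimal sketch using the OpenMP variant mentioned above, since it needs no extra library and falls back to a serial loop when OpenMP is not enabled. The kernel body is an assumption (a hypothetical stand-in for `sqrtdot`, not the code in this PR); with TBB, the equivalent would be written with `tbb::parallel_reduce` over a `tbb::blocked_range`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the benchmark's sqrtdot kernel:
// returns the sum of sqrt(x[i] * y[i]). The real kernel body is an assumption.
float sqrtdot(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    float sum = 0.0f;
    // Each thread accumulates a private partial sum; the partial sums are
    // combined at the end. Without -fopenmp the pragma is ignored and the
    // loop runs serially.
    #pragma omp parallel for reduction(+ : sum)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        sum += std::sqrt(x[i] * y[i]);
    }
    return sum;
}
```

A parallel reduction adds the partial sums in a different order than the serial loop, so single-precision results can differ between the two versions. That may be why the `sqrtdot` and `scanner` values in the benchmark output are not identical before and after parallelization.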
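Tried to speed up the saxpy part with hand-written SIMD intrinsics, but it made little difference; presumably the compiler was already auto-vectorizing the loop.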
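For reference, a hand-vectorized saxpy might look like the sketch below. It is not the code from this PR: the function names, the choice of 128-bit SSE intrinsics, and the `std::vector` interface are all assumptions made for illustration. Compilers such as GCC and Clang typically vectorize a loop as simple as `saxpy_scalar` on their own at higher optimization levels, and saxpy does very little arithmetic per byte loaded, which would be consistent with the intrinsics version showing no measurable gain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics: _mm_set1_ps, _mm_loadu_ps, ...
#endif

// Plain loop: y[i] = a * x[i] + y[i]. Usually auto-vectorized by the compiler.
void saxpy_scalar(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Hand-written version: processes 4 floats per iteration when SSE is
// available, then finishes the remaining elements with a scalar tail loop.
void saxpy_simd(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    std::size_t i = 0;
#if defined(__SSE__)
    const __m128 va = _mm_set1_ps(a);  // broadcast a into all 4 lanes
    for (; i + 4 <= n; i += 4) {
        const __m128 vx = _mm_loadu_ps(&x[i]);  // unaligned loads
        const __m128 vy = _mm_loadu_ps(&y[i]);
        _mm_storeu_ps(&y[i], _mm_add_ps(_mm_mul_ps(va, vx), vy));
    }
#endif
    for (; i < n; ++i) {  // scalar tail (or the whole loop without SSE)
        y[i] = a * x[i] + y[i];
    }
}
```

Timing both functions on the same input, and inspecting the generated assembly for `saxpy_scalar`, would confirm whether the compiler is in fact emitting packed instructions for the plain loop.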
The benchmark output is as follows:
Original version
fill: 1.61105s
fill: 1.56055s
saxpy: 0.0412753s
sqrtdot: 0.104043s
5165.4
minvalue: 0.101617s
-1.11803
magicfilter: 0.51515s
55924034
scanner: 0.101111s
5.28566e+07
After TBB
fill: 0.6906s
fill: 0.753506s
saxpy: 0.040185s
sqrtdot: 0.024997s
5792.62
minvalue: 0.012356s
-1.11803
magicfilter: 0.256035s
55924034
scanner: 0.075566s
6.18781e+07