Environment details
Docker image: `mathdugre/freesurfer:debug-info` (also available at https://github.com/mathdugre/mri-bottleneck/blob/main/container/freesurfer.Dockerfile)

Multi-threading was set using:
- `-threads` argument
- `OMP_NUM_THREADS` env var
- `ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS` env var

All set to the same value.
Issue
Profiling the `recon-all` pipeline, we found that with multi-threading enabled, the majority of the time was spent waiting for OpenMP threads to synchronize. The two figures below show the CPU time spent in each function when using 1 and 32 threads, respectively. Note the difference in the y-axis scale as well.
We found similar results when using a lower number of threads. Furthermore, the parallel efficiency decreases significantly as the number of threads increases.
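For readers unfamiliar with the term, here is the usual textbook definition of parallel efficiency. It is an assumption that this is the metric meant above. Here $T_1$ is the wall-clock time with one thread and $T_p$ the wall-clock time with $p$ threads:

```latex
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p} = \frac{T_1}{p \, T_p}
```

Ideal scaling gives $E(p) = 1$. Time that threads spend waiting at synchronization points adds to $T_p$ without doing useful work, so $E(p)$ drops.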
Potential Solution
We think this issue might arise from the OpenMP scheduling policy: the static policy is used in most places. Using the dynamic policy might reduce the impact of thread synchronization. However, we couldn't test this hypothesis, since naively replacing the OpenMP scheduling type failed to compile.
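One possible way to compare policies without editing each pragma is `schedule(runtime)`, which defers the choice to the `OMP_SCHEDULE` environment variable. The sketch below is not FreeSurfer code: `uneven_work` and `sum_uneven_work` are hypothetical names, and the loop is a toy stand-in for a loop whose iterations have unequal cost, which is the case where a static schedule tends to leave some threads idle at the implicit barrier.

```c
#include <assert.h>

/* Hypothetical stand-in for a loop body whose cost grows with i.
   Returns 0 + 1 + ... + i, i.e. i * (i + 1) / 2. */
static double uneven_work(int i)
{
    double acc = 0.0;
    for (int k = 0; k <= i; ++k) {
        acc += (double)k;
    }
    return acc;
}

/* Sum uneven_work(i) for i in [0, n).
   schedule(runtime) makes OpenMP read the policy from OMP_SCHEDULE,
   so static and dynamic can be compared without recompiling.
   If the file is built without OpenMP support, the pragma is ignored
   and the loop simply runs serially. */
double sum_uneven_work(int n)
{
    assert(n >= 0);
    double total = 0.0;
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += uneven_work(i);
    }
    return total;
}
```

Built with, for example, `gcc -O2 -fopenmp`, a program calling `sum_uneven_work` can be run once with `OMP_SCHEDULE=static` and once with `OMP_SCHEDULE="dynamic,4"` to compare timings. Whether the same change compiles and helps in the FreeSurfer sources is untested here.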