
RF: configure num_threads==-1 as the value to use all cores #2352

Merged: 16 commits merged into dipy:master on Apr 25, 2021

Conversation

drombas (Contributor) commented Apr 6, 2021

Related to #2300 and a continuation of #2341.

Proposed Changes

  • Configure num_threads<=0 as the option to use all cores across the codebase
  • Modify the name of the argument num_processes --> num_threads in reslice.py and test_reslice.py
  • Delete the num_threads argument where it is not used

Important point

Unlike in #2341, most of the functions edited here used all cores by default (using None), so to keep this behavior I set the new default to 0. This implies that:

  1. The default number of cores used differs between functions. Does it make sense to use all cores by default in some functions and only 1 in others?
  2. This PR modifies the default value of num_threads in several functions, which was one of the main concerns discussed in NF: Add "None" options in the CLIs #2300. What do you think @jhlegarreta, @skoudoro?

codecov bot commented Apr 6, 2021

Codecov Report

Merging #2352 (1e439ce) into master (5394ac5) will decrease coverage by 6.15%.
The diff coverage is 78.29%.


@@            Coverage Diff             @@
##           master    #2352      +/-   ##
==========================================
- Coverage   91.38%   85.23%   -6.16%     
==========================================
  Files         254      126     -128     
  Lines       33851    16562   -17289     
  Branches     3569     2681     -888     
==========================================
- Hits        30936    14117   -16819     
+ Misses       2111     1759     -352     
+ Partials      804      686     -118     
Impacted Files Coverage Δ
dipy/data/__init__.py 81.18% <ø> (ø)
dipy/denoise/nlmeans.py 100.00% <ø> (ø)
dipy/denoise/non_local_means.py 100.00% <ø> (ø)
dipy/reconst/csdeconv.py 86.79% <ø> (-1.26%) ⬇️
dipy/reconst/shm.py 93.06% <ø> (-0.04%) ⬇️
dipy/workflows/base.py 76.15% <ø> (ø)
dipy/workflows/io.py 74.13% <ø> (ø)
dipy/workflows/mask.py 94.44% <ø> (ø)
dipy/workflows/stats.py 84.80% <0.00%> (-1.01%) ⬇️
dipy/workflows/tracking.py 96.51% <ø> (ø)
... and 181 more

grlee77 (Contributor) commented Apr 7, 2021

As an alternative, I would also consider using num_threads=-1 to mean the maximum number of workers (similar to array[-1] giving the last element of an array). Similarly, num_threads=-3 would be two fewer workers than the maximum. That is the approach taken for the workers argument in SciPy functions (example) and for the n_jobs argument in joblib.Parallel.

Unfortunately, there is no good consensus across scientific Python libraries on either this behavior or the name of the num_threads/workers/n_jobs argument! (see some discussion in scikit-image/scikit-image#4876)
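
For illustration, a minimal sketch of that negative-value convention (a hypothetical helper, not DIPY, SciPy, or joblib code; os.cpu_count() stands in for the backend's notion of the maximum):

import os


def resolve_workers(workers):
    """Hypothetical helper: workers=-1 -> all cores, workers=-2 -> all cores
    minus one, and so on, following the SciPy/joblib-style convention."""
    if workers == 0:
        raise ValueError("workers must be a non-zero integer")
    max_workers = os.cpu_count() or 1
    if workers < 0:
        # -1 maps to max_workers, -3 maps to max_workers - 2, ...
        return max(1, max_workers + 1 + workers)
    return workers

# On an 8-core machine: resolve_workers(-1) == 8, resolve_workers(-3) == 6.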

jhlegarreta (Contributor) commented Apr 8, 2021

Thanks for taking care of this @drombas !

Configure num_threads<=0 as the option to use all cores across the codebase

IMO using two (or more) different values to mean the same thing is misleading.

The default number of cores used differs between functions. Does it make sense to use all cores by default in some functions and only 1 in others?

At first sight this looks inconsistent/undesirable to me, but I have not investigated the reasons, if any. Maybe others can elaborate on this.

This PR modifies the default value of num_threads in several functions, which was one of the main concerns discussed in #2300. What do you think @jhlegarreta, @skoudoro?

Although not desirable, I'd definitely be for it if the new values are the ones that make sense/are sensible defaults.

What @grlee77 says in #2352 (comment) is interesting. That also broadens the scope of our issue. As for the terms used, I wouldn't know which term is more accurate/best honors its purpose, but I have seen mixed use of the terms process, thread, job, and worker elsewhere in the past.

skoudoro (Member) left a comment

Hi @drombas,

Thank you for doing this again!

After all the comments and some more thinking, see some suggestions below:

Modify the name of the argument num_processes --> num_threads in reslice.py and test_reslice.py

You should keep num_processes here. This is different from num_threads. Put simply, it means we spawn processes (run new Python interpreters to parallelize).

Delete the num_threads argument where it is not used

You will have to keep it and start a deprecation cycle to warn users.

The default number of cores used differs between functions. Does it make sense to use all cores by default in some functions and only 1 in others?

To me it makes sense. It depends a lot on the algorithm. Some of them are greedy, and we do not want to kill/freeze the user's laptop (they would blame the algorithm, which would be wrong).

This PR modifies the default value of num_threads in several functions, which was one of the main concerns discussed in #2300. What do you think @jhlegarreta, @skoudoro?

After @jhlegarreta's and @grlee77's comments, I recommend that you keep the default value at None or the value indicated, and inside each function just initialize num_threads by doing:

num_threads = num_threads or -1

Does it make sense to you @drombas?

Inline review comments on dipy/align/reslice.py (outdated, resolved)
drombas (Contributor, author) commented Apr 9, 2021

Thank you all for your comments! TBH I don't have a strong opinion on this kind of convention.

On your points @skoudoro:

You should keep num_processes here. This is different from num_threads. Put simply, it means we spawn processes (run new Python interpreters to parallelize).

Sorry about that, I thought it was just a different name for the same thing.

You will have to keep it and start a deprecation cycle to warn users.

Ok, I'll keep those.

I recommend that you keep the default value at None or the value indicated, and inside each function just initialize num_threads by doing:

num_threads = num_threads or -1

Just to clarify, you mean we keep None as the default:

def anyFunction(...,num_threads=None):

and then inside the function we initialize it to -1 or 1 (depending on the case):

if num_threads is None:
    num_threads = -1

It sounds reasonable as it doesn't change the default but incorporates -1 as the option for all cores.

As for choosing between threads, jobs, workers, ..., we could leave that for another issue/discussion and stick with num_threads for the moment.

skoudoro (Member) commented Apr 9, 2021

It sounds reasonable as it doesn't change the default but incorporates -1 as the option for all cores.

Exactly, and this num_threads = num_threads or -1 is equivalent to:

if num_threads is None:
    num_threads = -1

@drombas drombas changed the title RF: configure num_threads<=0 as the value to use all cores RF: configure num_threads==-1 as the value to use all cores Apr 13, 2021
drombas (Contributor, author) commented Apr 14, 2021

Before reviewing the code, please note that I added some extra tests to account for invalid num_threads values.

Could we restart those failing tests? (All related tests passed locally.)

jhlegarreta (Contributor) commented Apr 14, 2021

Thanks for the effort @drombas. Restarted the tests.

jhlegarreta (Contributor) left a comment

I had a quick look at the changes: the docstrings and the default values in the method signatures are contradictory, e.g.:

def _bundle_minimum_distance_matrix(double [:, ::1] static,
                                    double [:, ::1] moving,
                                    cnp.npy_intp static_size,
                                    cnp.npy_intp moving_size,
                                    cnp.npy_intp rows,
                                    double [:, ::1] D,
                                    num_threads=None):
(...)
    num_threads : int, optional
        Number of threads. If -1 (default) then all available threads will be
        used.

The default is None. I have not looked at all signatures, but I had a look at a few of them, and all of them fall into this contradiction. I'd dare say that for people who will be looking at the documentation this is quite confusing, and IMHO it is just as confusing from a developer's point of view.

drombas (Contributor, author) commented Apr 14, 2021

Thanks @jhlegarreta !

The default is None. I have not looked at all signatures, but I had a look at a few of them, and all of them fall into this contradiction. I'd dare say that for people who will be looking at the documentation this is quite confusing,

The intention was to keep None as the default but also tell the user that None in practice behaves as -1 (all cores). Any suggestion for a less confusing docstring?

We could also return to the original plan and get rid of None by setting -1 as the default in the method signature.

jhlegarreta (Contributor) commented Apr 14, 2021

The intention was to keep None as the default but also tell the user that None in practice behaves as -1 (all cores). Any suggestion for a less confusing docstring?

I still believe that using two values with the same meaning is confusing, and this means that I'd be revisiting the use of None in the CLI that triggered the issue this PR tries to fix. I think @grlee77's proposal in #2352 (comment), beyond the terminology, would not have any such contradiction.

Also, it looks like every method involved is forced to have its own if/else block or implement its own logic to potentially arrive at the same conclusion (provided that the convention is clear); I'd have a single utility function named e.g. determine_threads(num_threads), compute_threads(num_threads) or similar, defined in a single place and shared across all involved methods. It might not be straightforward, and might need a deprecation cycle if the logic changes for some methods, but from that moment on we'd only need to look at/change things in a single place (less prone to errors, etc.).

Sorry, but I feel I cannot comment further, as I am still not convinced by the solution.

Thanks for the effort @drombas.

skoudoro (Member) commented:

this means that I'd be revisiting the use of None in the CLI that triggered the issue this PR tries to fix

This PR does not try to fix that anymore, @jhlegarreta. It tries to standardize how we set up num_threads, since there are different rules in each function. It also tries to keep backward compatibility with None, so it is normal that two values will have the same meaning.

I think @grlee77's proposal in #2352 (comment), beyond the terminology, would not have any such contradiction.

That's why the first step here is -1 for all cores. Then, a new PR will be needed to deprecate None in this specific case (concerning the other case, we need to manage None at the workflow level). The proposal does not say what the behavior is when you pass 0 or None as a parameter. Should we raise an error, or interpret it as all cores, which would mean three values for the same behavior (-1, 0, None)? I need to look at the scikit-learn code, or maybe @grlee77 has an answer.

Also, it looks like every method involved is forced to have its own if/else block or implement its own logic to potentially arrive at the same conclusion (provided that the convention is clear); I'd have a single utility function named e.g. determine_threads(num_threads), compute_threads(num_threads) or similar, defined in a single place and shared across all involved methods.

I agree with this point. Could you create a function, @drombas, in dipy/utils/omp.pyx? Something like determine_num_threads(), as @jhlegarreta proposes, and then use it everywhere. It would be easier to maintain in the future. Thanks a lot!

skoudoro (Member) commented:

@jhlegarreta @drombas @grlee77: more details about their rules below or in this link. I am OK with following the same rules:

  • For n_threads = None,
    • if the OMP_NUM_THREADS environment variable is set, return
      openmp.omp_get_max_threads();
    • otherwise, return the minimum of openmp.omp_get_max_threads()
      and the number of CPUs, taking cgroups quotas into account. Cgroups
      quotas can typically be set by tools such as Docker.
      The result of omp_get_max_threads can be influenced by the environment
      variable OMP_NUM_THREADS or at runtime by omp_set_num_threads.
  • For n_threads > 0, return this as the maximal number of threads for
    parallel OpenMP calls.
  • For n_threads < 0, return the maximal number of threads minus
    |n_threads + 1|. In particular, n_threads = -1 will use as many
    threads as there are available cores on the machine.
  • Raise a ValueError for n_threads = 0.
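
A minimal pure-Python sketch of those rules (illustrative only: the eventual DIPY helper lives in Cython in dipy/utils/omp.pyx; here os.cpu_count() stands in for openmp.omp_get_max_threads() and cgroups quotas are not taken into account):

import os


def determine_num_threads(num_threads):
    # Approximation of the scikit-learn-style rules listed above.
    if num_threads == 0:
        raise ValueError("num_threads cannot be 0")
    max_threads = os.cpu_count() or 1
    if num_threads is None:
        # With real OpenMP, OMP_NUM_THREADS already bounds omp_get_max_threads().
        if "OMP_NUM_THREADS" in os.environ:
            return int(os.environ["OMP_NUM_THREADS"])
        return max_threads
    if num_threads < 0:
        # -1 -> max_threads, -2 -> max_threads - 1, ...
        return max(1, max_threads - abs(num_threads + 1))
    return num_threads

Every function accepting num_threads could then call such a helper once, instead of re-implementing its own if/else block, as suggested above.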

jhlegarreta (Contributor) commented:

We might want to check how the above (#2352 (comment), #2352 (comment)) would work when calling the CLIs, but it looks like a step forward.

drombas (Contributor, author) commented Apr 15, 2021

I agree with this point. Could you create a function, @drombas, in dipy/utils/omp.pyx? Something like determine_num_threads(), as @jhlegarreta proposes, and then use it everywhere.

Okay, let's try that. I think we already have something similar in omp.pyx that we can adapt to the scikit-learn logic. I will give it a try in the next few days.

skoudoro (Member) commented:

Hi @drombas,

Do you think you will have time to finish this PR before Friday and the new DIPY release, or should I move it to the next release cycle in June?

Thank you for your feedback

drombas (Contributor, author) commented Apr 20, 2021

Hi @skoudoro,

By tomorrow I should have finished the changes, so if it is OK we can decide then.

skoudoro (Member) commented:

Sounds like a plan 👍🏾. No problem.

skoudoro (Member) left a comment

This is very nice work @drombas. Thanks a lot for that.

Overall, it looks good. I still need to look at it more carefully. See some comments below.

Also, can you add your docstring as a note in doc/api_changes.rst (https://github.com/dipy/dipy/blob/master/doc/api_changes.rst)? You can create a section for DIPY 1.4.1.

Thanks!

Inline review comments on dipy/utils/multiproc.py, dipy/utils/omp.pyx, and dipy/workflows/reconst.py (outdated, resolved)
@skoudoro skoudoro added this to the 1.4.1 milestone Apr 21, 2021
drombas (Contributor, author) commented Apr 21, 2021

Thanks for the comments @skoudoro.

In summary, the selection of the number of cores is now centralized in two files:

  • omp.pyx: for OpenMP parallelization
  • multiproc.py: for multiprocessing parallelization

In the end I split it in two, as the logic is slightly different: for OpenMP the environment variable OMP_NUM_THREADS is considered, while for multiprocessing it is not. I also thought it could be confusing to use omp.pyx to define the logic of parallelization with the multiprocessing package. To help with the review a bit, here is an organized list of the main changed files.
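
For illustration, a rough sketch of what the multiprocessing counterpart could look like (assumptions: it is named determine_num_processes as in this PR, None keeps its old "use all cores" meaning, and OMP_NUM_THREADS is deliberately not consulted; the merged code in dipy/utils/multiproc.py may differ in detail):

from multiprocessing import cpu_count


def determine_num_processes(num_processes):
    # Sketch only; see dipy/utils/multiproc.py for the actual implementation.
    if num_processes == 0:
        raise ValueError("num_processes cannot be 0")
    if num_processes is None or num_processes < 0:
        # cpu_count() raises NotImplementedError on platforms where the
        # number of cores cannot be determined.
        max_procs = cpu_count()
        if num_processes is None:
            return max_procs
        return max(1, max_procs + 1 + num_processes)
    return num_processes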

skoudoro (Member) left a comment

It is good to go!

I will wait until Friday evening to see if there is any additional comment and then go ahead and merge it.

jhlegarreta (Contributor) left a comment

Thanks for the hard work and for having persevered @drombas.

Looks very clean.

Thanks for the spreadsheet. Very helpful. On a related note, automatically knowing which methods are multi-threaded can be helpful to developers and users at some point. But that is a separate endeavor.

I am not sure I followed the removal here:
https://github.com/dipy/dipy/pull/2352/files#diff-5fcc99cde72e6ea9a640d32a91756c178d916d26ae479e0da4c766a61c45dd4cL226

But if it's OK, then dismiss the observation.

I am missing dedicated test methods for the accepted values in the determine_num_processes and determine_num_threads methods.

A minor in-line comment.

The last two comments can be addressed at a later time if it is preferred to have this merged as soon as possible.

Inline review comment on doc/interfaces/gibbs_unringing_flow.rst (resolved)
jhlegarreta (Contributor) left a comment

💯 to the hard work @drombas.

drombas (Contributor, author) commented Apr 23, 2021

Thanks for the quick feedback!

I am not sure I followed the removal here:
https://github.com/dipy/dipy/pull/2352/files#diff-5fcc99cde72e6ea9a640d32a91756c178d916d26ae479e0da4c766a61c45dd4cL226

After the changes, that part is not reached. It is true, though, that there was a catch of the NotImplementedError exception that I did not reimplement (I am not sure how critical it is, as it is the only place I saw that check).

I added it just in case.

@skoudoro skoudoro merged commit a3d0fed into dipy:master Apr 25, 2021
skoudoro (Member) commented:

Thank you @drombas! Merging.

jhlegarreta (Contributor) commented:

I'd say that the exception added in 1e439ce should get tested. Thanks.

drombas (Contributor, author) commented Apr 26, 2021

That exception is raised when the number of cores cannot be determined, and TBH I don't know how we could test it.

jhlegarreta (Contributor) commented:

That exception is raised when the number of cores cannot be determined, and TBH I don't know how we could test it.

Can the exception be forced to be raised and the expected message or result be checked?

drombas (Contributor, author) commented Apr 27, 2021

Can the exception be forced to be raised and the expected message or result be checked?

I imagine it should be possible if we knew exactly how the number of cores is determined.
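
For what it is worth, a sketch of how the exception could be forced in a test by mocking the core-count lookup (everything below is an assumption: the patch target must match how cpu_count is imported inside dipy/utils/multiproc.py, and the expected exception type must match whatever the merged helper actually raises):

from unittest import mock

import pytest

from dipy.utils.multiproc import determine_num_processes


def test_determine_num_processes_undetermined_cores():
    # Simulate a platform where the number of cores cannot be determined.
    with mock.patch("dipy.utils.multiproc.cpu_count",
                    side_effect=NotImplementedError):
        # Replace Exception with the concrete exception the helper raises.
        with pytest.raises(Exception):
            determine_num_processes(-1)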
