New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix multiprocessing in face detection gallery example #7140
base: main
Are you sure you want to change the base?
Conversation
This change is informed by several issues: The previous version stacked `delayed` put a delayed call around a generator. Starting with dask 2023.9.2 this lead the traceback attached at the end of this message. I opted to remove dask completely because even though I managed to get it working with the newer dask version, the task apparently isn't gaining much from multi-threading. Even with only two workers, CPU cores aren't at 100% on my machine. Using a multi-processing scheduler resulted in errors ("worker unexpectetly finished" or something like that). So instead I opted to use Python's default multiprocessing pool which speeds up the example nicely (from 33s to 9s on my machine). Traceback (most recent call last): File "/home/lg/Res/scikit-image/doc/examples/applications/plot_haar_extraction_selection_classification.py", line 81, in <module> X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=150, [... shortened ...] File "/home/lg/.local/lib/venv/skimagedev/lib/python3.11/site-packages/sklearn/utils/validation.py", line 347, in _num_samples raise TypeError( TypeError: Singleton array array(<generator object <genexpr> at 0x7f5cee64d2a0>, dtype=object) cannot be considered a valid collection.
plot_haar_extraction_selection_classification.py
When trying to use
I am not sure how to add a main guard in context of the sphinx gallery example. It runs fine when I copy and run the content of |
When running the plot_haar_extraction_selection_classification.py directly it fails with the error attached at the end. However, executing the file via sphinx gallery or its content in an IPython console seems to work. So test it in the CI. RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable. To fix this issue, refer to the "Safe importing of main module" section in https://docs.python.org/3/library/multiprocessing.html
I am hoping to find a solution that allows for parallelism in this example because the long build time without even using all resources offends me personally. 😅 So I tried multiple approaches (therefore the going back and forth) locally and here:
Not sure how to proceed. Right now the only approach that works everywhere is the single-threaded one. But then the example is better off just ripping out the references to dask and parallelism... For now I'll settle on making a hopefully temporary hotfix in another PR and keep this one open for investigation... (see #7141) |
This is meant to be a temporary hotfix while [1] is being investigated. It naturally has a similar performance compared to the single-threaded dask solution. [1] scikit-image#7140
Aha, I just noticed that 0079f69 is actually passing fine everywhere except for Some digging with git blame shows that this has been an issue before:
I am also seeing this or similar looking behavior on Linux actually. What I don't understand about this explanation is, how this aspect is managed when running the example through sphinx or through IPython. @soupault, do you remember the context for testing the gallery example in two different ways on Windows? Examples are run indirectly through sphinx, and directly by passing the files to Python? I guess we want to keep it like that... Though I am noticing that we only do so on Windows. |
This is a temporary hotfix while [1] is being investigated. Naturally, it has a similar performance compared to the single-threaded Dask solution. [1] #7140
Sounds like a good topic to investigate further :).
Oh, I am not really sure. It might have been connected to the way how documentation errors were handled by AppVeyor. Most likely, it was just me making an exact replica of the setup on Travis (Linux/MacOS) once we moved to the fast and unlimited Azure. Building documentation on Windows, at this point, sounds rather unnecessary. |
Hello scikit-image core devs! There hasn't been any activity on this PR for more than 180 days. I have marked it as "dormant" to make it easy to find. |
Description
This change is informed by several issues:
The previous version stacked
delayed
put a delayed call around a generator. Starting with dask 2023.9.2 this lead the traceback attached at the end of this message.I opted to remove dask completely because even though I managed to get it working with the newer dask version, the task apparently isn't gaining much from multi-threading. Even with only two workers, CPU cores aren't at 100% on my machine. Using a multi-processing scheduler resulted in errors ("worker unexpectetly finished" or something like that). So instead I opted to use Python's default multiprocessing pool which speeds up the example nicely (from 33s to 9s on my machine).
See also our currently failing CI, e.g. https://github.com/scikit-image/scikit-image/actions/runs/6171455485/job/16883631460#step:5:2520.
Checklist
./doc/examples
for new featuresRelease note
Summarize the introduced changes in the code block below in one or a few sentences. The
summary will be included in the next release notes automatically: