Pandarallel can get stuck without raising any errors when using all the physical cores #262

Open
2 tasks done
LawrentChen opened this issue Dec 16, 2023 · 2 comments

Comments

@LawrentChen

General

  • Operating System: Windows 11 Professional 22H2 22621.2715
  • Python version: 3.10
  • Pandas version: 2.0.3
  • Pandarallel version: 1.6.5

Acknowledgement

  • My issue is NOT present when using pandas alone (without pandarallel)
  • If I am on Windows, I read the Troubleshooting page
    before writing a new bug report

Bug description

Pandarallel can get stuck without raising any errors when using all the physical cores, if some of them are occupied by other background tasks at the same time.

Observed behavior

My CPU is an Intel(R) Core(TM) i7-14700K, which has 20 physical cores as reported by psutil.cpu_count(logical=False). But when I try to use all of those cores in Pandarallel, it can get stuck without raising any errors. If I turn on progress_bar, I can see that a few of the bars are not moving at all.
I am fairly sure that nothing is wrong with my code, because after I reboot the system and re-run the code (without doing anything else and with no other tasks in the background), it works exactly as expected.

I think this problem is similar to #183 and #226.

Expected behavior

Ideally, Pandarallel would dispatch work to whichever cores are actually available at run time, perhaps the way joblib does.
I ran the same code with joblib using 20 workers while other background tasks were running on my Windows 11 machine, and it worked. It may be slower than truly using all 20 cores, but at least it does not get stuck without raising an error.

A less ideal alternative would be to raise an error so that users know their cores are occupied.

For now I work around this by leaving 4 cores out (nb_workers=20-4). That works well with my code, though it is a bit slower.
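
For anyone who wants to try the same workaround, here is a rough sketch of it. The helper names `pick_nb_workers` and `initialize_with_headroom` and the `reserve` parameter are made up for illustration and are not part of Pandarallel. The only library calls used are psutil.cpu_count(logical=False) and pandarallel.initialize(nb_workers=..., progress_bar=...). The fallback to os.cpu_count() when psutil is unavailable is an assumption, and it returns the logical core count rather than the physical one.

```python
import os
from typing import Optional


def pick_nb_workers(reserve: int = 4, physical_cores: Optional[int] = None) -> int:
    """Return a worker count that leaves `reserve` cores free (never below 1).

    Hypothetical helper, not part of Pandarallel.
    """
    if physical_cores is None:
        try:
            import psutil  # third-party; same call as in the report above
            physical_cores = psutil.cpu_count(logical=False)
        except ImportError:
            physical_cores = None
    if physical_cores is None:
        # Assumption: fall back to the logical core count if the
        # physical count cannot be determined.
        physical_cores = os.cpu_count() or 1
    return max(1, physical_cores - reserve)


def initialize_with_headroom(reserve: int = 4) -> int:
    """Initialize Pandarallel with some cores left free for other tasks."""
    from pandarallel import pandarallel  # imported lazily; third-party

    nb_workers = pick_nb_workers(reserve=reserve)
    pandarallel.initialize(nb_workers=nb_workers, progress_bar=True)
    return nb_workers
```

With the 20 physical cores from this report, pick_nb_workers(reserve=4, physical_cores=20) returns 16, which matches nb_workers=20-4. This only leaves headroom for background tasks. It does not detect which cores are busy, so the right value of `reserve` is a guess.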

Minimal but working code sample to ease bug fix for pandarallel team

Sorry, I am not a good developer. All I can do is describe this issue.
Pandarallel is an awesome package after all. Thank you very much.

@nalepae
Owner

nalepae commented Jan 23, 2024

Pandaral·lel is looking for a maintainer!
If you are interested, please open a GitHub issue.

@shermansiu

I'm glad that your code works if you leave out 4 of your 20 cores!

Nevertheless, it will be difficult to fix this issue if you don't provide some minimal code to reproduce your bug. Neither of the issues you linked included a minimal code sample either.

Since you have a viable workaround and there isn't a minimal code sample, I would prefer to close this issue.
