Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock due to lack of cross thread signalling #2

Open
Squadrick opened this issue Feb 2, 2023 · 2 comments
Open

Deadlock due to lack of cross thread signalling #2

Squadrick opened this issue Feb 2, 2023 · 2 comments

Comments

@Squadrick
Copy link

Squadrick commented Feb 2, 2023

Consider a series of enqueues to a thread pool with two threads: func1(), stalled_func(), func2(), wait a second, func3(). Execution of func3() will unstall stalled_func().

Since enqueue are round-robin, func1() and func2() are enqueued of Thread 1, stalled_func() and func3() on Thread 2.

After func2(), Thread 1 goes to sleep waiting to be signalled sem.acquire_many(), since _in_flight == 0. After a second, when func3() is enqueued, it is pushed to Thread 2's queue and its semaphore is signalled, but it cannot execute func3() since it is in the middle of executing stalled_func(). Thread 1 will continue to wait without stealing Thread 2's pending work.

This will cause the thread-pool to deadlock.


If this behaviour is not supported:

Execution of func3() will unstall stalled_func().

then, that's fair. It's a decent limitation for a thread pool. This behaviour happens when executing a DAG of dynamically connected tasks (e.g.: reading assets from disk). A task node can continue execution only if all its parent nodes have finished execution.

I'm opening an issue for posterity if/when anyone wants to handle this case too.

@ConorWilliams
Copy link
Owner

Many thanks for this, I will have to have a think, this:

This behaviour happens when executing a DAG of dynamically connected tasks

is definitly a use case I would like to support.

@Squadrick
Copy link
Author

Here's an idea to get started: Tracking idle threads. Before a thread goes to sleep (sem.acquire_many), add it to an idle thread queue. Whenever a task needs to be enqueued, pop a thread from the idle thread queue and push it to that thread.

This works reasonably well for some thread pools I've implemented. A very mature, advanced version of this is what is used by rayon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants