Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question: Why does General.Pool.step call worker (and hence run two tasks per thread)? #823

Open
mpickering opened this issue Mar 2, 2022 · 2 comments

Comments

@mpickering
Copy link
Contributor

When there is more work to do step spawns a new thread and runs the action new >> worker pool. I can't understand what exactly is going on with the worker pool call.

How I understand the code sequence:

  1. New thread if spawned which initially runs the action (now)
  2. worker pool runs, which calls withPool, and so will block until step finishes (because the thread is forked whilst step owns the Pool variable.
  3. step finishes and the spawned thread is added to the pool.
  4. This unblocks worker pool, which then unqueues another task? (This is the part I don't get) and runs it.
  5. Thread finishes and the spawned thread is removed from the Pool.

Step 4 is the part I don't understand, why does each spawned thread get two actions run in it?

Code in question:

worker :: Pool -> IO ()
worker pool = withPool pool $ \s -> pure $ case Heap.uncons $ todo s of
    Nothing -> (s, pure ())
    Just (Heap.Entry _ (_, now), todo2) -> (s{todo = todo2}, now >> worker pool)

-- | Given a pool, and a function that breaks the S invariants, restore them.
--   They are only allowed to touch threadsLimit or todo.
--   Assumes only requires spawning a most one job (e.g. can't increase the pool by more than one at a time)
step :: Pool -> (S -> IO S) -> IO ()
-- mask_ is so we don't spawn and not record it
step pool@(Pool _ done) op = mask_ $ withPool_ pool $ \s -> do
    s <- op s
    case Heap.uncons $ todo s of
        Just (Heap.Entry _ now, todo2) | threadsCount s < threadsLimit s -> do
            -- spawn a new worker
            t <- newThreadFinally (now >> worker pool) $ \t res -> case res of
                Left e -> withPool_ pool $ \s -> do
                    signalBarrier done $ Left e
                    pure (remThread t s){alive = False}
                Right _ ->
                    step pool $ pure . remThread t
            pure (addThread t s){todo = todo2}
@ndmitchell
Copy link
Owner

The reasoning for step spawning a new thread was to free up any used stack that might be on the thread already, and to make sure the exceptions didn't have any chance of bubbling up. Whether those are legitimate reasons is probably something you are most qualified to answer than me! Does that sound sensible or silly?

@mpickering
Copy link
Contributor Author

@ndmitchell Spawning a new thread is fine but the question is why two actions are run in the spawned thread rather than one thread per action.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants