-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
For Comment Only: Lums/sc 25954/random scheduler #3978
base: dev
Are you sure you want to change the base?
Conversation
…with `make_edge` but if given a mimo node, the user has to specify in `make_edge` which port is being connected from or to.
Revert "Very small update [skip ci]" This reverts commit 828e501. Revert "Implement API for MIMO nodes in taskgraph. Ports are still connected with `make_edge` but if given a mimo node," This reverts commit 8441b8b. Revert "Revert "Very small update [skip ci]"" This reverts commit bd118d4. Revert "Revert "Implement API for MIMO nodes in taskgraph. Ports are still connected with `make_edge` but if given a mimo node,"" This reverts commit 9389acf. include ctad file in unit_segmented_nodes [skip ci] Fix CTAD deduction guides for function nodes [skip ci]
…with `make_edge` but if given a mimo node, the user has to specify in `make_edge` which port is being connected from or to.
Revert "Very small update [skip ci]" This reverts commit 828e501. Revert "Implement API for MIMO nodes in taskgraph. Ports are still connected with `make_edge` but if given a mimo node," This reverts commit 8441b8b. Revert "Revert "Very small update [skip ci]"" This reverts commit bd118d4. Revert "Revert "Implement API for MIMO nodes in taskgraph. Ports are still connected with `make_edge` but if given a mimo node,"" This reverts commit 9389acf. include ctad file in unit_segmented_nodes [skip ci] Fix CTAD deduction guides for function nodes [skip ci]
This pull request has been linked to Shortcut Story #25954: Implement random scheduler. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a nice read, thanks a lot. It's not exactly clear to me how the random scheduler is supposed to prove the robustness of the system, I'd love if you could explain this a bit in more details. How is graph supposed to work overall if its nodes are executed in random order? Do nodes yield execution if they have nothing on the input ports and no matter in which order you execute the tasks wrapping the nodes, the nodes will eventually end up running according to how data flows within the graph?
On Apr 10, 2023, at 7:47 AM, Robert Bindar ***@***.***> wrote:
@robertbindar approved this pull request.
This was a nice read, thanks a lot. It's not exactly clear to me how the random scheduler is supposed to prove the robustness of the system, I'd love if you could explain this a bit in more details. How is graph supposed to work overall if its nodes are executed in random order? Do nodes yield execution if they have nothing on the input ports and no matter in which order you execute the tasks wrapping the nodes, the nodes will eventually end up running according to how data flows within the graph?
That is exactly right. And the point is to show that the performance of doing it that way is as good as doing all the things of putting tasks to sleep and putting them in waiting queues, waking them up with notification signals etc. With random a node will execute if it has input and won’t execute if it doesn’t.
By robustness I mean that the nodes and edges should behave properly and flow data through the graph even if they are selected to run at random rather than in topological sorted order, for example. I expect that after a graph (that is initially topologically sorted) is executed a few times the order of execution of nodes will become scrambled anyway.
There is a bit of contention and other overhead (due to too much shuffling) at the moment on the random task queue but I expect we can reduce those without too much effort (use double-ended queue, be smarter about when to shuffle, &c.). Right now we re-shuffle on every pop.
Cheers,
Andrew Lumsdaine
|
The random scheduler is intended to demonstrate the robustness of the task graph I/O - compute infrastructure, in preparation for full state management in the task graph.
This is probably the simplest possible finite thread pool scheduler, namely one that simply picks tasks at random and executes them. The nodes in the tasks are still stateful in that they have a "program counter" indicating the next step to be taken in their I/O - compute loop. However, the scheduler does not need to be aware of the node state. It simply picks a task and executes it (by invoking the
resume
method).The tasks are kept in a thread-safe
RandomizedQueue
, which is similar to theBoundedBufferQueue
except the items in the queue are kept in an underlying vector, which is shuffled on everypop()
from the queue.The random scheduler maintains all tasks in the single randomized queue. Resuming a task from any state is safe. In particular, if
resume
is invoked for a node that is waiting, it will simply continue to wait.Preliminary performance results (from the sieve benchmark) show that the random scheduler is as effective as the full Duff's device scheduler (which puts tasks into running, waiting, or runnable queues) -- as long as the number of threads per core does not become too large. Otherwise, the overhead of locking the single job queue dominates execution time. There are likely a number of optimizations that can be made to reduce these overheads.
TYPE: FEATURE
DESC: A scheduler that simply picks tasks at random to run.