How to avoid recursive async() from hanging. #575
For reference, what we're trying to do is convert the deal.II (https://www.dealii.org) library from using the TBB to Taskflow. The |
For the record, I also tried to use the
|
Try it this way:
#include <taskflow/algorithm/for_each.hpp>
#include <taskflow/taskflow.hpp>
#include <chrono>
#include <iostream>
#include <thread>

tf::Executor executor(2); // executor having two threads

void bottom()
{
  std::cout << " Starting task at the bottom\n";
  std::cout << " ... ... ...\n";
  std::this_thread::sleep_for(std::chrono::seconds(1));
  std::cout << " Ending task at the bottom\n";
}

void middle(tf::Runtime& rt)
{
  std::cout << " Starting task in the middle\n";
  rt.silent_async([]() { bottom(); });
  std::cout << " Waiting for sub-task in the middle\n";
  // Waits only for the tasks spawned through this runtime
  rt.corun_all();
  std::cout << " Ending task in the middle\n";
}

void top(tf::Runtime& rt)
{
  std::cout << " Starting task at the top\n";
  rt.silent_async([](tf::Runtime& rt) { middle(rt); });
  std::cout << " Waiting for sub-task at the top\n";
  rt.corun_all();
  std::cout << " Ending task at the top\n";
}

int main()
{
  std::cout << "Starting task in main()\n";
  executor.silent_async([](tf::Runtime& rt) { top(rt); });
  std::cout << "Waiting for task in main\n";
  executor.wait_for_all();
  std::cout << "Done in main\n";
} |
@makeuptransfer I have not quite been able to find out what In other words, the question comes down to where the |
@bangerth, in my opinion, a runtime is tied to a node. When you silent_async a task (node) through the executor, it adds 1 to the pending_cnt of the whole executor. When you silent_async a task through a runtime, it instead increments the pending_cnt of the runtime's parent, that is, the task that spawned it (for example, when top's rt calls silent_async on middle, top's pending count goes up). The runtime's corun_all method waits only for the tasks spawned through that runtime, so it will not block in this case. |
Hi @bangerth, yes,
tf::Executor executor(1);
auto fu1 = executor.async([](){
  // ... do some stuff
});
auto fu2 = executor.async([&](){
  // The worker does not block until fu1 completes; it joins the
  // work-stealing loop and runs tasks until fu1 becomes ready.
  executor.corun_until([&](){
    return fu1.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
  });
});
Does this make sense to you? |
@makeuptransfer and @tsung-wei-huang Thank you for your suggestions. I believe that the In the end, the kind of semantics I look for are basically like this:
In order for this to work without deadlocks, the @tsung-wei-huang I believe that your solution works for this. It makes sure that we return as soon as Out of curiosity, you wrote the second half of the code as a separate task:
Why not just the following?
|
Hi @bangerth, |
Hm, but then we're back to a chicken and egg problem: If I can only corun from within that second task, how do I make the outer task wait for the first task? In your example, I'd have to wait for |
Yes, ultimately something needs to synchronize it. If you have complicated dependencies around async tasks like this, I would recommend using Dependent Async, which allows you to do more fine-grained dependency building around dynamic task graphs. Have you ever looked into this? |
But I don't actually have complicated dependencies. Waiting for a previously submitted sub-task is not an uncommon operation. I believe that I must be misunderstanding something conceptual because I am certain that you have run into this case before. I just don't understand how you have solved it :-) The key piece is you can't use Do I read it right, then, that |
Hi @bangerth , your understanding is correct - Yes, I was just giving a hint if you have a complicated task graph that needs to be created on the fly, |
@bangerth, I don't get it. There are two threads (one from main, one from the executor), so why would waiting on fu2 block? |
@makeuptransfer Try it out :-) The executor allows for at most 2 tasks to run concurrently, but I'm recursively creating 3 tasks, so the innermost task will simply never be scheduled because the outer two are still running, and they will never finish because they are waiting for the innermost one to complete. |
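The starvation described in this reply, and the "keep running other tasks while waiting" idea behind corun_until suggested earlier in the thread, can be reproduced without Taskflow. The following is a minimal sketch using a toy pool built only on the standard library. It is not how Taskflow is implemented, and ToyPool, is_ready, innermost_task_starved and nested_helping are hypothetical names made up for this illustration. Unlike Taskflow, the toy corun_until may be called from any thread.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Toy fixed-size pool, for illustration only (hypothetical, not Taskflow).
class ToyPool {
public:
  explicit ToyPool(unsigned n_workers) {
    for (unsigned i = 0; i < n_workers; ++i)
      workers_.emplace_back([this] { worker_loop(); });
  }
  ~ToyPool() {
    { std::lock_guard<std::mutex> lock(mutex_); stop_ = true; }
    cv_.notify_all();
    for (auto &w : workers_) w.join();
  }
  // Queue a task and return a future for its result.
  std::future<int> async(std::function<int()> f) {
    auto task = std::make_shared<std::packaged_task<int()>>(std::move(f));
    std::future<int> fut = task->get_future();
    { std::lock_guard<std::mutex> lock(mutex_);
      queue_.emplace_back([task] { (*task)(); }); }
    cv_.notify_one();
    return fut;
  }
  // Instead of blocking, run queued tasks on the calling thread until
  // done() returns true.
  template <typename Predicate>
  void corun_until(Predicate done) {
    while (!done()) {
      std::function<void()> job;
      { std::lock_guard<std::mutex> lock(mutex_);
        if (!queue_.empty()) { job = std::move(queue_.front()); queue_.pop_front(); } }
      if (job) job(); else std::this_thread::yield();
    }
  }
private:
  void worker_loop() {
    for (;;) {
      std::function<void()> job;
      { std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
        if (queue_.empty()) return;  // stop requested and nothing left to run
        job = std::move(queue_.front());
        queue_.pop_front(); }
      job();
    }
  }
  std::mutex mutex_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> queue_;
  std::vector<std::thread> workers_;
  bool stop_ = false;
};

bool is_ready(std::future<int> &f) {
  return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Assumes an otherwise idle pool with exactly 2 workers. The top task blocks
// one worker and the middle task occupies the other, so the bottom task
// cannot start while middle is waiting. A timed wait is used so that the
// demo reports the starvation instead of hanging forever.
bool innermost_task_starved(ToyPool &pool) {
  std::future<int> top = pool.async([&pool] {
    std::future<int> middle = pool.async([&pool] {
      std::future<int> bottom = pool.async([] { return 0; });
      return bottom.wait_for(std::chrono::milliseconds(200)) ==
                     std::future_status::timeout ? 1 : 0;
    });
    return middle.get();  // blocking wait: this worker runs nothing else
  });
  return top.get() == 1;
}

// Each level spawns one child task and waits for it by helping, so the
// nesting depth may exceed the number of workers.
int nested_helping(ToyPool &pool, int depth) {
  if (depth == 0) return 0;
  std::future<int> child =
      pool.async([&pool, depth] { return nested_helping(pool, depth - 1); });
  pool.corun_until([&child] { return is_ready(child); });
  return child.get() + 1;
}
```

With a two-worker pool, innermost_task_starved(pool) reports that the third-level task never got a thread while its parent was waiting, whereas nested_helping(pool, 5) completes even though five levels of tasks are nested on two workers, because every waiting thread keeps draining the queue.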
@tsung-wei-huang Let me start by saying thank you for your patience with my question. Your feedback is much appreciated! I played with this some more. First, this approach with
But this is not what one wants. It creates an So we need a global executor object. This suggests the following code, but (see below) this does not work:
This deadlocks (of course) because the I then went through the list of members of
The second line needs to be something that references The functions of
Implementations of these approaches would look like this:
Unfortunately, this segfaults in the very first invocation of the wait operation in
As mentioned, this actually works even though the call to |
So I'm afraid that in truth, I'm not really any further towards a solution than I was before. Two key observations:
|
#include <taskflow/taskflow.hpp>
#include <chrono>
#include <exception>
#include <future>
#include <iostream>
#include <thread>

tf::Executor executor(2); // executor having two threads

void bottom()
{
  std::cout << " Starting task at the bottom\n";
  std::cout << " ... ... ...\n";
  std::this_thread::sleep_for(std::chrono::seconds(1));
  std::cout << " Ending task at the bottom\n";
}

void middle()
{
  std::cout << " Starting task in the middle\n";
  auto t = executor.async([]() { bottom(); });
  std::cout << " Waiting for sub-task in the middle\n";
  // Keep this worker running other tasks until t is ready
  executor.corun_until([&t]() { return (t.wait_for(std::chrono::seconds(0)) == std::future_status::ready); });
  std::cout << " Ending task in the middle\n";
}

void top()
{
  std::cout << " Starting task at the top\n";
  auto t = executor.async([]() { middle(); });
  std::cout << " Waiting for sub-task at the top\n";
  executor.corun_until([&t]() { return (t.wait_for(std::chrono::seconds(0)) == std::future_status::ready); });
  std::cout << " Ending task at the top\n";
}

int main()
{
  try
    {
      std::cout << "Starting task in main()\n";
      auto t = executor.async([]() { top(); });
      std::cout << "Waiting for task in main\n";
      // executor.corun_until ([&t](){ return (t.wait_for(std::chrono::seconds(0)) == std::future_status::ready); });
      // just wait_for_all()
      executor.wait_for_all();
      std::cout << "Done in main\n";
    }
  catch (std::exception &e)
    {
      std::cout << "Exception: " << e.what() << std::endl;
    }
}
Hi @bangerth, why not call wait_for_all in main? There are three threads in total: one main thread and two executor threads. The executor threads never block when you call corun_until inside a task. |
@makeuptransfer I don't want to wait for all tasks. In the example, there are no others around, but in other contexts I may have started more asynchronous tasks. I want to wait for a specific one. |
@bangerth Thanks, I get it. It seems you need to have the tasks that come after the "specific one" be scheduled by the "specific one" itself, or use Dependent Async. Would either of those help? |
I think I have finally figured it out. My solution 2 above was the right approach, but suffered from the fact that one can only call
|
@bangerth , yes, your understanding about the execution logic and |
I put that into deal.II in dealii/dealii#16976 and it runs successfully with all 13,000 of our tests. So this seems to work as intended :-) Thanks for your help! |
I must be missing something fundamental about how to make tf::Executor::async() work in actual practice. In the following little program, I'm telling the executor that it can use 2 threads, and then I have three levels of tasks I create with async(). Each time I create a task, I wait for it:
This program, perhaps unsurprisingly hangs:
The reason, of course, is that we create a task in main(), which creates a task in top(), which gets us into middle() which queues up a third task, but because we already have two tasks running, that third task never runs, and so middle() never stops waiting.
This is not surprising. executor.async() returns a std::future object, and waiting for it does not communicate to the executor that the current task isn't doing anything and that now would be a good time to run other tasks.
The question is, then, how one is supposed to use async() in contexts in which one has no control over the number of recursive invocations of async()?