analysis with CGraph #1
Comments
I am not sure whether my code for Scheduling is optimal, but on my computer there is a real time gap between CGraph and Scheduling in the linear case. Could you please run some benchmarks in your environment? Thank you.
using namespace CGraph;
unsigned g_node_cnt = 0;
class TestNode : public CGraph::GNode {
public:
CStatus run() override {
g_node_cnt++;
return CStatus();
}
};
void test_performance() {
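// Run 32 nodes serially (matches the second example): 1 thread, 32 serial nodes, "1000w" (10 million) runs; note that the loop below runs 1,000,000 times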
GPipelinePtr pipeline = GPipelineFactory::create();
CStatus status;
GElementPtr arr[32];
pipeline->registerGElement<TestNode>(&arr[0]);
for (int i = 1; i < 32; i++) {
pipeline->registerGElement<TestNode>(&arr[i], {arr[i - 1]});
}
pipeline->makeSerial();
pipeline->setAutoCheck(false);
status += pipeline->init();
{
for (int t = 0; t < 1000000; t++) {
pipeline->run();
}
std::cout << g_node_cnt << std::endl;
}
status += pipeline->destroy();
GPipelineFactory::remove(pipeline);
}
int main() {
auto start_ts = std::chrono::steady_clock::now();
test_performance();
std::chrono::duration<double, std::milli> span = std::chrono::steady_clock::now() - start_ts;
printf("time counter is : [%0.2lf] ms \n", span.count());
return 0;
} |
Hi! Thanks a lot for the reference to CGraph! I didn't know about CGraph earlier, this is a very interesting repository! Please give me some time to take a deeper look and provide benchmarks, I will be very glad to do this. |
Thank you very much |
Haven't had a chance to look into it in detail yet, but here are some comments which I can make now:
Thanks again for your interest in Scheduling and for the reference to CGraph! |
Oh, no, there is a mistake in your current demo. The idea is not to create a node and run it just once: as in my example, the graph is created once and then run 1,000,000 times. I think creating once and then running many times is a common situation for users. I hope Scheduling can support more scenarios. Thanks for your excellent code and for sharing it. |
yes yes, I understand what you mean, unfortunately this use case is currently not supported by scheduling. I will try to add support for this use case, thanks a lot for this great and very useful idea! I will post an update here once implemented. I have added a link to CGraph in my readme, I hope you don't mind. I will be glad to study your excellent library in more detail! |
it is my honor |
The honor is all mine, you did a great job writing CGraph, there's a lot to learn from this lib! |
I have reopened the issue if you don't mind, since the comparison with CGraph is not done yet. I would like to implement the use case which you suggested and provide benchmarks, so let's keep it opened until it's done. Thank you! |
Hi @ChunelFeng! I hope you are doing well! I have a small question about CGraph which I would be glad to discuss with you here when you have a moment. I tried the code snippet which you suggested above and observed that everything happens on the main thread:
CBool GElement::isAsync() const {
// If timeout != 0, execute asynchronously
return this->timeout_ != CGRAPH_DEFAULT_ELEMENT_TIMEOUT;
}
I tried setting the timeout for the tasks manually to make them async, as shown below, but this did not seem to work:
for (const auto& task : arr) {
task->setTimeout(1000);
}
May I kindly ask whether you have specific logic in your code which controls the number of available threads and affinity to avoid context switching? I have updated Scheduling to be able to run the same graph multiple times (not merged yet). When I limit the number of threads to one, I obviously get much better results compared to the multi-threaded version, which suffers very heavily from context switching in this use case... |
Two mechanisms both take effect when the DAG is a single line. First, we use makeSerial() to check whether this DAG can run in a single thread: pipeline->makeSerial(); Second, in the code we use analysisDagType to make sure that linear nodes run in the same thread (the main thread): analysisDagType(elements); Both methods only work when the DAG is a line. Also, when you set a node's timeout value, the DAG cannot run in one thread, because the timeout is handled asynchronously. |
We can simply check whether the DAG can run in one thread: CStatus status = pipeline->makeSerial(); If it can, the status is OK; otherwise, it cannot run in one thread. |
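To make the "the DAG is a line" condition concrete, here is a minimal sketch of such a check. It is an illustration only and not CGraph's implementation: the `Dag` adjacency-list type and the `isLinearChain` function are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical adjacency list (not a CGraph type): successors[i] holds the
// indices of the nodes that depend on node i.
using Dag = std::vector<std::vector<std::size_t>>;

// Returns true when the graph is a single linear chain: no node has more than
// one successor or more than one predecessor, there is exactly one source, and
// walking from that source reaches every node.
bool isLinearChain(const Dag& successors) {
    const std::size_t n = successors.size();
    if (n == 0) {
        return false;
    }

    std::vector<std::size_t> in_degree(n, 0);
    for (const auto& outs : successors) {
        if (outs.size() > 1) {
            return false;  // fan-out: two nodes could run in parallel
        }
        for (std::size_t to : outs) {
            assert(to < n && "successor index out of range");
            if (++in_degree[to] > 1) {
                return false;  // fan-in: a node waits for several predecessors
            }
        }
    }

    // Exactly one node may have no predecessor.
    std::size_t source = n;
    for (std::size_t i = 0; i < n; ++i) {
        if (in_degree[i] == 0) {
            if (source != n) {
                return false;  // two independent chains
            }
            source = i;
        }
    }
    if (source == n) {
        return false;  // no source at all, so the graph contains a cycle
    }

    // Follow the chain; any node left unvisited belongs to a separate cycle.
    std::size_t visited = 1;
    for (std::size_t cur = source; !successors[cur].empty(); cur = successors[cur][0]) {
        ++visited;
    }
    return visited == n;
}
```

A chain such as `{{1}, {2}, {}}` passes the check, while a fork such as `{{1, 2}, {}, {}}` does not, which mirrors the idea that only a linear graph can be executed entirely on the calling thread.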
so does it automatically define the minimum number of threads that are necessary to run a graph? |
Not automatically. In other cases, you can set the config as in this example: https://github.com/ChunelFeng/CGraph/blob/main/tutorial/T01-Simple.cpp. You can also get the minimum thread count automatically with this function: |
cool, thanks a lot for the explanations! |
What tool is that in your picture? It seems very useful. |
This is Concurrency Visualizer in Visual Studio, https://learn.microsoft.com/en-us/visualstudio/profiling/threads-view-parallel-performance?view=vs-2022 |
Hi @ChunelFeng! Below are a few observations that I can make:
Below is the Scheduling code which I used for experiments (you can also find it in unit tests):
auto counter = 0;
ThreadPool thread_pool;
std::vector<Task> tasks(32);
tasks[0] = Task([&] { ++counter; });
for (auto i = tasks.begin(), j = std::next(i); j != tasks.end(); ++i, ++j) {
*j = Task([&] { ++counter; });
j->Succeed(&*i);
}
for (int i = 0; i < 1'000'000; ++i) {
thread_pool.Submit(&tasks[0]);
thread_pool.Wait();
} |
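For readers wondering what "running the same graph multiple times" involves, the sketch below shows one possible approach on a single thread: each task keeps a per-run counter of unfinished predecessors, and that counter is reset before every run. This is an assumption-laden illustration, not how Scheduling or CGraph implement it; `SketchTask`, `runOnce` and `makeChain` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical task record, not the Scheduling `Task` class.
struct SketchTask {
    std::function<void()> body;
    std::vector<std::size_t> successors;  // indices of dependent tasks
    std::size_t predecessor_count = 0;    // fixed when the graph is built
    std::size_t pending = 0;              // reset at the start of every run
};

// Executes the whole graph once on the calling thread. Resetting `pending`
// first is what makes it safe to call this function repeatedly.
void runOnce(std::vector<SketchTask>& tasks) {
    std::vector<std::size_t> ready;
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        tasks[i].pending = tasks[i].predecessor_count;
        if (tasks[i].pending == 0) {
            ready.push_back(i);
        }
    }
    while (!ready.empty()) {
        const std::size_t idx = ready.back();
        ready.pop_back();
        assert(tasks[idx].body && "every task needs a body");
        tasks[idx].body();
        for (std::size_t next : tasks[idx].successors) {
            if (--tasks[next].pending == 0) {
                ready.push_back(next);  // all predecessors have finished
            }
        }
    }
}

// Builds a linear chain of `length` tasks that each increment `counter`.
std::vector<SketchTask> makeChain(std::size_t length, int& counter) {
    std::vector<SketchTask> tasks(length);
    for (std::size_t i = 0; i < length; ++i) {
        tasks[i].body = [&counter] { ++counter; };
        if (i + 1 < length) {
            tasks[i].successors.push_back(i + 1);
            tasks[i + 1].predecessor_count = 1;
        }
    }
    return tasks;
}
```

Building a chain of 32 tasks with `makeChain(32, counter)` and calling `runOnce` in a loop reproduces the shape of the benchmark in this thread without any thread pool, which can serve as a single-threaded baseline when comparing timings.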
In this case, CGraph runs in only one thread because we use
If you remove this line, CGraph will create 8 threads in the backend. Here is some code to run CGraph with 8 threads in this case. Please replace
#include "../_Materials/TestInclude.h"
using namespace CGraph;
void test_performance_02() {
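// Run 32 nodes serially (matches the second example): 1 thread, 32 serial nodes, "1000w" (10 million) runs; note that the loop below runs 1,000,000 times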
GPipelinePtr pipeline = GPipelineFactory::create();
CStatus status;
GElementPtr arr[32];
pipeline->registerGElement<TestAdd1GNode>(&arr[0]);
for (int i = 1; i < 32; i++) {
pipeline->registerGElement<TestAdd1GNode>(&arr[i], {arr[i - 1]});
}
pipeline->setAutoCheck(false);
UThreadPoolConfig config;
config.max_thread_size_ = 8;
config.primary_thread_busy_epoch_ = 2;
config.primary_thread_empty_interval_ = 0;
pipeline->setUniqueThreadPoolConfig(config);
status += pipeline->init();
{
UTimeCounter counter("test_performance_02");
for (int t = 0; t < 1000000; t++) {
pipeline->run();
}
}
if (32000000 != g_test_node_cnt) {
std::cout << "test_performance_02: g_test_node_cnt is not right : " << g_test_node_cnt << std::endl;
}
status += pipeline->destroy();
GPipelineFactory::remove(pipeline);
}
int main() {
test_performance_02();
return 0;
}
|
Thanks a lot for the code snippet, @ChunelFeng! Now the code runs 8 background threads as expected. I ran it multiple times, on my computer the test usually takes around 2.8 sec. Below is the sample output:
To run the similar test for Scheduling, you can use the ThreadPoolTest.ResubmitGraph unit test. Test duration on my computer mostly varies between 1.5 and 1.9 sec. The timing is much less stable compared to CGraph, I guess the variation comes from the random context switching issue which I described above, still not sure how to fix it. Below is a sample output:
|
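Since the timings reported above vary from run to run, one way to make such comparisons easier to read is to repeat the workload several times and report the minimum, median and maximum instead of a single number. The helper below is a minimal sketch of that idea; `TimingSummary` and `measure` are hypothetical names and are not part of CGraph or Scheduling.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Summary of several timed repetitions, in milliseconds.
struct TimingSummary {
    double min_ms;
    double median_ms;  // upper middle sample when the count is even
    double max_ms;
};

// Runs `workload` once as an untimed warm-up, then `repetitions` timed runs.
template <typename Workload>
TimingSummary measure(Workload&& workload, int repetitions) {
    assert(repetitions > 0);
    workload();  // warm-up: lets caches and worker threads settle

    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(repetitions));
    for (int r = 0; r < repetitions; ++r) {
        const auto start = std::chrono::steady_clock::now();
        workload();
        const std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - start;
        samples.push_back(elapsed.count());
    }

    std::sort(samples.begin(), samples.end());
    return TimingSummary{samples.front(), samples[samples.size() / 2], samples.back()};
}
```

Wrapping each library's benchmark loop in a lambda and passing it to `measure` with, say, 10 repetitions gives a spread for each side; a wide gap between the minimum and the maximum would be consistent with the scheduling noise discussed in this thread.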
By the way, please update CGraph to the newest version; we have optimized this case. |
Hi @ChunelFeng! Thank you for letting me know about the updates! There are two commits which I didn't have, shown on the screenshot below: May I ask you kindly what timings do you get on your computer? |
I can see you have Release on your screenshots in both cases, so accidentally running it in Debug is not the case. Could be some MacOS specific stuff, honestly I don't know.. 🤔 |
May I ask you also what results do you get if you force Scheduling to use 8 threads? This can be done as below: ThreadPool thread_pool(8); I think a possible explanation could be as follows: By default, Scheduling uses as many threads as are available on your computer (defined by |
Thank you for the quick response @ChunelFeng! Hm, okay, looks like now I've run out of ideas😄 But it's clear that I have a lot of work to do with Scheduling😉 Please give me some time to think about this, this is a very interesting discussion for me, I would be very glad if we could stay in touch about this. I will post here as soon as I have any updates on this topic |
Hi @ChunelFeng! |
Hi, nice to see this repo.
CGraph is a repo with almost the same functionality as Scheduling.
Could you please also run a performance comparison with CGraph, similar to the one with taskflow?
Thank you very much.