-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accelerated DAG: Support dynamic resizing of shared memory channels #45323
base: master
Are you sure you want to change the base?
Conversation
c4702ce
to
ced81ea
Compare
e1b8fd0
to
2ff3953
Compare
This is ready for another pass. |
a8ed927
to
2269eaf
Compare
8e2f5c4
to
017ea05
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Python changes overall look good to me. I am still trying to understand the C++ changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, very clean :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still trying to understand the C++ parts.
|
||
io_services_.push_back(std::make_unique<instrumented_io_context>()); | ||
instrumented_io_context &io_service = *io_services_.back(); | ||
io_works_.push_back(std::make_unique<boost::asio::io_service::work>(io_service)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use io_context
instead? io_service
seems to be deprecated boostorg/asio#110.
std::vector<std::unique_ptr<boost::asio::io_service::work>> io_works_; | ||
// Contexts in which the application looks for local changes to mutable objects and | ||
// sends the changes to remote nodes via the network. | ||
std::vector<std::unique_ptr<instrumented_io_context>> io_services_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to io_contexts_
.
}); | ||
} | ||
|
||
void MutableObjectProvider::RunIOService() { | ||
void MutableObjectProvider::RunIOService(instrumented_io_context &io_service) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void MutableObjectProvider::RunIOService(instrumented_io_context &io_service) { | |
void MutableObjectProvider::RunIOContext(instrumented_io_context &io_context) { |
|
||
io_services_.push_back(std::make_unique<instrumented_io_context>()); | ||
instrumented_io_context &io_service = *io_services_.back(); | ||
io_works_.push_back(std::make_unique<boost::asio::io_service::work>(io_service)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need boost::asio::io_service::work
here? Where do we call work.reset()
?
@@ -177,14 +183,19 @@ void MutableObjectProvider::PollWriterClosure( | |||
object->GetData()->Size(), | |||
object->GetMetadata()->Size(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raylet_client_factory_
: Creates a function for each object. This function waits for changes on the object and then sends those changes to a remote node via RPC.
The above comment is from experimental_mutable_object_provider.h
. Can you explain how does reader->PushMutableObject
"sends those changes to a remote node via RPC"? Thanks!
Why are these changes needed?
In an accelerated DAG, all channels have their backing store specific upfront. However, this size may not be large enough for all objects that are communicated via channels, particularly since channels are reused across multiple DAG invocations.
This PR adds support for increasing the size of a shared memory channel backing store. Once the backing store size is increased, the size is maintained (i.e., it is not decreased) until it needs to be increased again.
Related issue number
Addresses one issue noted in #45597
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.