Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Storage_ipc] Option II: Provides IPC extensions for 3rd devices. #126373

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mengpenghui
Copy link

I am working on IPC-related interfaces to support third-party devices.
This is another option for IPC support for third party devices.

  1. The share_cuda_ interface is changed to share_device_
  2. The backend uses getAccelerator to determine the device and distribute different devices.
  3. Add the share_device implementation of third-party devices to PrivateUse1HooksInterface

In this way, third-party devices and CUDA will use the same interface, and the device distinction is hidden in the backend. Users also need to manually set the device type (although these are internal interfaces).

another plan: #125122

Do you think which is feasible? Looking forward to your suggestions
Fixes #124902

cc @albanD

Copy link

pytorch-bot bot commented May 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126373

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit b7ead53 with merge base b24ad7e (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@drisspg drisspg added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label May 20, 2024
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, that sounds good in principle but I think the implementation needs a bit of work for two things:

  • the aten level hooks should not take as input and return PyObject*. This is the c++ layer and should not deal with arg-parsing in there. I think in this case, you want to refactor the arg-parsing from the cuda code to be shared. And then call into a c++ only hook.
  • The hook should be generic, I think the AcceleratorHookInterface is the right place to do that. This way it can be shared by CUDA/PrivateUse1/others
  • The python layer should be generic using acceleratorhooks so that you don't need if/else for each device.

@@ -294,7 +294,7 @@ struct C10_API StorageImpl : public c10::intrusive_ptr_target {
bool resizable_;
// Identifies that Storage was received from another process and doesn't have
// local to process cuda memory allocation
bool received_cuda_;
bool received_device_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on the line above needs updating as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for IPC features for PrivateUse devices
5 participants