Redefine FI_HMEM Interface In Libfabric 2.0 #9447

Open
a-szegel opened this issue Oct 18, 2023 · 2 comments

@a-szegel
Contributor

a-szegel commented Oct 18, 2023

FI_HMEM support in libfabric is currently a single boolean on/off switch that stands in for a wide variety of HMEM capabilities (GDR, memory synchronization status, async copies, etc.). We should use the opportunity of Libfabric 2.0 to define an interface that clearly states our HMEM capabilities and what we expect users to do for correct Libfabric behavior (e.g., set CU_POINTER_ATTRIBUTE_SYNC_MEMOPS on every CUDA pointer they pass into our interface).
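To make the contrast with a single boolean concrete, here is a minimal sketch of how separate HMEM capabilities could be expressed as bits and matched against what an application requires. Every name in it (`ex_hmem_cap`, `EX_HMEM_CAP_*`, `ex_hmem_caps_satisfied`) is hypothetical and chosen only for illustration; none of them exist in libfabric, and the actual 2.0 interface may look quite different.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; none of these names exist in libfabric. */
enum ex_hmem_cap {
    EX_HMEM_CAP_P2P_RDMA    = 1u << 0, /* NIC accesses device memory directly (GDR) */
    EX_HMEM_CAP_ASYNC_COPY  = 1u << 1, /* provider may use asynchronous device copies */
    EX_HMEM_CAP_SYNC_MEMOPS = 1u << 2, /* caller is expected to enable sync memops */
};

/* Returns true only if every bit in `required` is also set in `offered`. */
static bool ex_hmem_caps_satisfied(uint64_t offered, uint64_t required)
{
    return (offered & required) == required;
}
```

With a layout like this, an application could ask for exactly the behavior it depends on, and a provider could reject the request when any required bit is missing instead of silently falling back.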

@j-xiong
Contributor

j-xiong commented Apr 2, 2024

What are the possible capabilities? P2P RDMA with device memory support; device memory mapping (GDR copy), which the application may not need to know about; an option to disable the device copy method; and P2P copy (IPC).

How many of them need to be exposed to the application?

Maybe we need some internal, provider-only options? hmem_ops?

Expose hints about whether an offload is supported in hardware rather than emulated in software?

HMEM interface name as an additional input, e.g. "cuda", "rocr", etc. Today we have the FI_HMEM environment variables; we want a programmatic version, which needs some extra parameter in fi_getinfo(). Nice to have.
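As a rough sketch of the "programmatic version" idea, the helper below maps an interface name such as "cuda" or "rocr" to an enum value that could then be passed as an input. The enum `ex_hmem_iface` and the function `ex_hmem_iface_from_name` are hypothetical stand-ins written for this example; they are not libfabric definitions, and how such a value would actually reach fi_getinfo() is left open here.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for an HMEM interface enum; names are hypothetical. */
enum ex_hmem_iface {
    EX_HMEM_IFACE_UNKNOWN = -1,
    EX_HMEM_IFACE_CUDA = 0,
    EX_HMEM_IFACE_ROCR,
};

/* Maps an interface name to its enum value; unknown or NULL names
 * yield EX_HMEM_IFACE_UNKNOWN. The comparison is case-sensitive. */
static enum ex_hmem_iface ex_hmem_iface_from_name(const char *name)
{
    static const struct {
        const char *name;
        enum ex_hmem_iface iface;
    } table[] = {
        { "cuda", EX_HMEM_IFACE_CUDA },
        { "rocr", EX_HMEM_IFACE_ROCR },
    };

    if (name == NULL)
        return EX_HMEM_IFACE_UNKNOWN;

    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(name, table[i].name) == 0)
            return table[i].iface;
    }
    return EX_HMEM_IFACE_UNKNOWN;
}
```

A table-driven lookup keeps the set of accepted names in one place, so adding another interface would be a one-line change.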

@shijin-aws
Contributor

shijin-aws commented Apr 9, 2024

Another related capability: whether the HMEM iface supports dmabuf registration.

Currently, an application can't know this without calling fi_mr_regattr. Ideally, this would be returned by fi_getinfo.
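The difference between the two approaches can be sketched as follows: if discovery already reported dmabuf support, the application reads the answer; otherwise it has to fall back to a trial registration. The struct `ex_hmem_info`, the callback type `ex_dmabuf_probe_fn`, and the stub probe are all hypothetical and exist only in this example. In a real application the probe would wrap an actual registration attempt, which is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe callback: returns 0 if a trial dmabuf registration works. */
typedef int (*ex_dmabuf_probe_fn)(void *ctx);

/* Hypothetical discovery result; not a libfabric structure. */
struct ex_hmem_info {
    bool dmabuf_known; /* true if discovery reported the capability */
    bool dmabuf_reg;   /* meaningful only when dmabuf_known is true */
};

/* Prefers the answer reported at discovery time and falls back to the
 * probe only when discovery did not report one. */
static bool ex_dmabuf_supported(const struct ex_hmem_info *info,
                                ex_dmabuf_probe_fn probe, void *ctx)
{
    if (info != NULL && info->dmabuf_known)
        return info->dmabuf_reg;
    return probe != NULL && probe(ctx) == 0;
}

/* Stub probe for illustration only: pretends the trial registration succeeded. */
static int ex_probe_stub_ok(void *ctx)
{
    (void)ctx;
    return 0;
}
```

The point of the sketch is that the fallback path costs a registration attempt, whereas the discovery path costs nothing at the call site.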
