Enable shared-memory bypass comms for co-locales on InfiniBand #24944

Closed

bonachea
Contributor

Overview

This PR enables GASNet-level shared-memory bypass communication for all GASNet conduits (in fast/large segment modes), which should greatly accelerate co-locale communication for CHPL_COMM=gasnet CHPL_COMM_SUBSTRATE=ibv.

This fixes the defect acknowledged in the original PR #13473 regarding use of PSHM with CHPL_COMM_SUBSTRATE=ibv.

TODO:

  1. This PR has not yet been tested for correctness or performance, and thus should be considered a proof-of-concept.
  2. The PR currently modifies only the GASNet-EX backend (i.e. CHPL_GASNET_VERSION=ex). If it proves valuable there, an analogous change will probably need to be back-ported to the legacy GASNet-1 backend to avoid breaking it in the presence of co-locales.

CC: @PHHargrove , @ronawho , @jhh67

…use of co-locales

This is necessary to ensure asynchronous progress of AMs arriving
via the PSHM channel on ibv-conduit.

Signed-off-by: bonachea <dobonachea@lbl.gov>

PSHM is the GASNet term for shared-memory bypass communication, which
routinely improves latency and overhead by 2-3 orders of magnitude
relative to communication using a loopback network channel.

PSHM is now only disabled for CHPL_MAKE_COMM_SEGMENT=everything,
where it is currently unavailable in GASNet.

This should result in greatly accelerated performance for co-locale
communication when using CHPL_COMM_SUBSTRATE={ibv,aries,mpi}, where PSHM
was previously disabled.

Signed-off-by: bonachea <dobonachea@lbl.gov>
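
The selection rule stated in the commit message above (PSHM disabled only for the "everything" segment) can be sketched as a small predicate. This is a hypothetical illustration in Python, not Chapel's actual build logic: the function name `pshm_enabled` and the idea of passing the segment as a string are assumptions made for the example, and only the three segment names come from the PR text.

```python
# Hypothetical model of the PSHM rule described in this PR.
# It does not reproduce Chapel's real build scripts; names are illustrative.

# Segment modes named in the PR text (values of CHPL_MAKE_COMM_SEGMENT).
KNOWN_SEGMENTS = ("fast", "large", "everything")


def pshm_enabled(segment: str) -> bool:
    """Return True if PSHM would be enabled for the given segment mode.

    Under the rule in this PR, PSHM is disabled only for the
    'everything' segment, regardless of the substrate in use.
    """
    if segment not in KNOWN_SEGMENTS:
        raise ValueError(f"unknown segment mode: {segment!r}")
    return segment != "everything"


if __name__ == "__main__":
    for seg in KNOWN_SEGMENTS:
        state = "enabled" if pshm_enabled(seg) else "disabled"
        print(f"segment={seg}: PSHM {state}")
```

Note that the substrate does not appear in the predicate: per the commit message, substrates such as ibv, aries, and mpi no longer disable PSHM on their own.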

bonachea commented Jun 1, 2024

This PR is superseded by #25140, where @jhh67 is preparing to instead use GASNet's new GEX_FLAG_DEFER_THREADS and gex_System_QueryProgressThreads() features. These features, introduced in the 2024.5.0 GASNet-EX release, allow direct control of the GASNet progress threads.

@bonachea bonachea closed this Jun 1, 2024