Add a framework and components to expose shared memory single copy to all components #9154

@hjelmn hjelmn commented Jul 19, 2021

This PR moves the shared-memory single-copy code out of btl/sm and puts it in its own framework. This will allow other components to make use of this functionality without relying on btl/sm.

@hjelmn hjelmn requested review from jsquyres, bosilca and bwbarrett July 19, 2021 19:04
@hjelmn hjelmn force-pushed the add_a_component_to_expose_shared_memory_single_copy_to_all_components branch 2 times, most recently from add6656 to 2124629 Compare July 19, 2021 23:19

@bwbarrett bwbarrett left a comment

Overall, looks like a massive improvement. A couple of nits included.

One larger question... In the unlikely scenario that two processes come to different smsc selection answers, it looks like we're going to handle that poorly. Maybe I'm reading the code wrong, but only one of the get_endpoint() calls seems to check the other side is actually using the same smsc. Am I reading that wrong?
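To make the concern concrete, here is a minimal sketch of a symmetric check, not the code in this PR. It assumes each process can learn which component its peer selected (for example from data published at startup). The names `peer_info_t`, `endpoint_t`, and `get_endpoint_checked` are hypothetical and do not come from the smsc headers.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-peer record; a real runtime would fill this from
 * whatever each process publishes about itself at startup. */
typedef struct {
    int rank;
    const char *smsc_component; /* name of the component the peer selected */
} peer_info_t;

/* Hypothetical endpoint handle. */
typedef struct {
    int peer_rank;
} endpoint_t;

/* Return an endpoint only when the peer selected the same component as
 * the local process. NULL tells the caller not to use single copy. */
endpoint_t *get_endpoint_checked(const char *local_component,
                                 const peer_info_t *peer)
{
    if (NULL == peer || NULL == peer->smsc_component ||
        0 != strcmp(local_component, peer->smsc_component)) {
        return NULL;
    }

    endpoint_t *ep = malloc(sizeof(*ep));
    if (NULL != ep) {
        ep->peer_rank = peer->rank;
    }
    return ep;
}
```

If every component ran a check like this, both sides of a mismatched pair would decline single copy instead of only one of them.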

@gpaulsen

We discussed this on the Webex today. The consensus was that it would be good to get this into v5.0, since it enables other optimizations and code cleanup that Nathan is working on.

@hjelmn hjelmn force-pushed the add_a_component_to_expose_shared_memory_single_copy_to_all_components branch 2 times, most recently from 4430b6e to 6943e24 Compare July 28, 2021 17:08
@open-mpi open-mpi deleted a comment from ibm-ompi Jul 28, 2021
hjelmn added 5 commits July 29, 2021 07:28
…pport

This commit introduces opal/mca/smsc, which exposes support for shared-memory single-copy
mechanisms. The target is to support Linux CMA, XPMEM, and KNEM. This framework will
initially support btl/sm but will also be used to provide single copy for other
components (coll/sm, osc/rdma).

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit adds a component to the shared-memory single-copy framework to
support Linux Cross Memory Attach (CMA). This component supports copy_to
and copy_from without registration.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
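For readers unfamiliar with CMA, the sketch below shows the Linux system call that a copy_from without registration can be built on: `process_vm_readv`, which reads another process's memory given only its PID and a remote address. This is an illustration under stated assumptions, not the component's code. The wrapper name `cma_copy_from` and its return convention are hypothetical, and a production version would loop over partial transfers rather than report them as an error.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Copy len bytes from remote_addr in process pid into local_buf.
 * Returns 0 on success, the errno value if the call fails, or EIO if
 * fewer than len bytes were transferred. */
int cma_copy_from(pid_t pid, void *local_buf, const void *remote_addr,
                  size_t len)
{
    struct iovec local = {.iov_base = local_buf, .iov_len = len};
    struct iovec remote = {.iov_base = (void *) remote_addr, .iov_len = len};

    /* No prior registration of the remote buffer is needed; the kernel
     * only checks that the caller is allowed to access the target. */
    ssize_t n = process_vm_readv(pid, &local, 1, &remote, 1, 0);
    if (n < 0) {
        return errno;
    }
    return ((size_t) n == len) ? 0 : EIO;
}
```

The call can fail with EPERM when ptrace restrictions or a sandbox forbid access to the target process, so callers need a fallback path.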
This commit adds a knem component to the shared-memory single-copy framework. This
component supports copy_to and copy_from with registration required on the remote
peer.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit adds a new shared-memory single-copy component supporting
Cray/SGI XPMEM. This component supports copy_to, copy_from, and memory
mapping.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit removes the dedicated shared-memory single-copy support from
btl/sm. This support is now provided by the shared-memory single-copy
(smsc) framework. The btl_sm_single_copy_mechanism MCA variable has
been removed. Use either the component priority parameters (e.g.,
smsc_xpmem_priority) or component selection (--mca smsc) to select the
single-copy mechanism.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
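The two selection knobs named in the last commit message (per-component priority and an explicit component request) can be pictured with the simplified sketch below. It is not the framework's selection code. The type `smsc_candidate_t`, the function `select_smsc`, and the idea of passing a single requested name are assumptions made for illustration, and any priority values a caller supplies are illustrative rather than the components' defaults.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one single-copy component. */
typedef struct {
    const char *name;
    int priority;  /* higher wins */
    int available; /* nonzero if the mechanism works on this node */
} smsc_candidate_t;

/* Pick the available candidate with the highest priority. If requested
 * is non-NULL, only the candidate with that name is considered. Returns
 * NULL when nothing qualifies. */
const smsc_candidate_t *select_smsc(const smsc_candidate_t *c, size_t n,
                                    const char *requested)
{
    const smsc_candidate_t *best = NULL;

    for (size_t i = 0; i < n; ++i) {
        if (!c[i].available) {
            continue;
        }
        if (NULL != requested && 0 != strcmp(requested, c[i].name)) {
            continue;
        }
        if (NULL == best || c[i].priority > best->priority) {
            best = &c[i];
        }
    }
    return best;
}
```

Raising one component's priority changes which candidate wins the comparison, while an explicit request removes the other candidates from consideration altogether.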
jsquyres added a commit to jsquyres/ompi that referenced this pull request Aug 1, 2021
since this will likely be merged for v5.0.0

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>

@jsquyres jsquyres left a comment

I get this warning when compiling:

make[2]: Entering directory `/home/jsquyres/git/ompi/opal/mca/smsc'
  CC       base/smsc_base_frame.lo
In file included from ../../../opal/mca/smsc/base/base.h:14,
                 from base/smsc_base_frame.c:27:
../../../opal/mca/smsc/smsc.h:216:5: warning: "MCA_opal_smsc_DIRECT_CALL" is not defined, evaluates to 0 [-Wundef]
 #if MCA_opal_smsc_DIRECT_CALL
     ^~~~~~~~~~~~~~~~~~~~~~~~~

(repeats for all the BTL SM + SMSC component source files, too)

Feels like that is significant...
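For context on the warning itself: the C preprocessor treats an undefined identifier in `#if` as 0, and `-Wundef` flags that because it usually means a definition never reached the translation unit. The sketch below shows one generic way to make the intent explicit by defaulting the macro. It is not the fix applied in this PR. `SMSC_CALL_MODE` and `smsc_call_mode` are hypothetical names used only to make the example observable.

```c
/* Assumption: when the build system provides no definition, direct
 * calls are disabled. Defaulting the macro avoids relying on the
 * implicit "undefined identifier evaluates to 0" rule that -Wundef
 * reports. */
#ifndef MCA_opal_smsc_DIRECT_CALL
#define MCA_opal_smsc_DIRECT_CALL 0
#endif

#if MCA_opal_smsc_DIRECT_CALL
#define SMSC_CALL_MODE "direct"
#else
#define SMSC_CALL_MODE "indirect"
#endif

/* Report which branch of the conditional was compiled in. */
const char *smsc_call_mode(void)
{
    return SMSC_CALL_MODE;
}
```

A default like this silences the warning but can also hide a missing generated header, which is why the warning is worth investigating rather than suppressing.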

@jsquyres jsquyres dismissed bwbarrett’s stale review August 1, 2021 22:19

Brian's away on leave. @jsquyres will take over the review of this PR.

@hjelmn hjelmn force-pushed the add_a_component_to_expose_shared_memory_single_copy_to_all_components branch from 6943e24 to 95eb83f Compare August 11, 2021 19:19

hjelmn commented Aug 11, 2021

@jsquyres Fixed. A file was missing from one of the commits. Please re-test.

@hjelmn hjelmn requested a review from jsquyres August 11, 2021 20:05

@jsquyres jsquyres left a comment

Warning fixed.

@hjelmn hjelmn merged commit 386ba16 into open-mpi:master Aug 12, 2021
@gpaulsen gpaulsen mentioned this pull request Dec 9, 2021