The current OMPI dpm code uses the PMIx publish/lookup mechanism for rendezvous during MPI comm_spawn and connect/accept operations. This mechanism has proven fragile over time and does not scale well.
A better approach would be to use the PMIx "group" functions, which are designed to scale. We couldn't do this until now because the "group" operations weren't available in earlier versions of PMIx and PRRTE, but we now require versions high enough to ensure this support is present.
It would therefore be advisable to update the dpm to take advantage of those faster and more robust operations. The required code would be identical to that used for creating an MPI "session": a simple group construct (called by all participants) that includes a request to assign a new CID would suffice, and it would eliminate a good deal of complex code currently in the dpm.