MPI_THREAD_MULTIPLE and UCX #6593
Comments
@nevion You need to configure UCX with multi-thread support:
$ <ucx_dir>/contrib/configure-release --enable-mt
then, the check
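One way to confirm that an installed UCX was built as described above is to look for the flag in the build information. The following is a minimal sketch, not an official check: it assumes that `ucx_info -v` lists the configure flags of the build, and `ucx_has_mt` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: succeeds if the text on stdin mentions --enable-mt.
# Assumption: `ucx_info -v` prints the configure flags used for the build.
ucx_has_mt() {
    # "--" stops grep from parsing the pattern as an option.
    grep -q -- '--enable-mt'
}

# Usage (requires ucx_info on PATH), left commented out so the
# snippet can be sourced on a machine without UCX installed:
#   ucx_info -v | ucx_has_mt && echo "UCX was configured with --enable-mt"
```

The helper reads from stdin rather than calling `ucx_info` itself, so it can also be pointed at a saved build log.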
@yosefe Ah, I didn't glean from the comment block that it was a build-time issue. I thought I tried that, but I will try more carefully. I'm still interested in hearing how well tested the threading support is with Open MPI and UCX, though.
@nevion We are constantly running MTT with OpenMPI+UCX and multi-threaded tests.
@yosefe Can you guys add an FAQ item about this on the OMPI web site? That would make this kind of question Google-able. Thanks!
Since UCX is the way of the future and openib was... problematic wrt MPI_THREAD_MULTIPLE, what is the status and plan for MPI_THREAD_MULTIPLE and UCX?
It seems like it's still unsupported: ompi/ompi/mca/pml/ucx/pml_ucx.c, lines 282 to 288 in 07e5c54.
What would it take to get it there, and what plans are in the works? Is there any InfiniBand-based path that works with MPI_THREAD_MULTIPLE in Open MPI besides ipoib? Do any other MPI implementations support UCX and MPI_THREAD_MULTIPLE?
Is UCX ready for it internally with respect to the InfiniBand or mlx5 transports? A cursory review of the uct code suggests it has been thought about, at least.
I used to use openib with threads, but at some point I started getting hangs in software using it, like #4863, so I moved back to ipoib. And then there's the roadmap, so the state of openib is somewhat understood, but it's a trying time with no performant MPI_THREAD_MULTIPLE options and nothing materializing.