--host, binding and cpuset does not seem to work #6966
Let me note that I did try those different placements of --report-bindings. Then I get:

$> mpirun --report-bindings --bind-to core -np 1 --cpu-set 0 ./a.out : -np 1 --cpu-set 1 ./a.out
[nicpa-dtu:10600] MCW rank 0 is not bound (or bound to all available processors)
[nicpa-dtu:10600] MCW rank 1 is not bound (or bound to all available processors)
$> mpirun --bind-to core -np 1 --cpu-set 0 --report-bindings ./a.out : -np 1 --cpu-set 1 ./a.out
[nicpa-dtu:10611] MCW rank 0 is not bound (or bound to all available processors)
[nicpa-dtu:10611] MCW rank 1 is not bound (or bound to all available processors)
$> mpirun --bind-to core -np 1 --cpu-set 0 ./a.out : -np 1 --cpu-set 1 --report-bindings ./a.out
[nicpa-dtu:10628] MCW rank 1 bound to SK0:L30:L20:L10:CR0:HT0-1
$> mpirun --bind-to core -np 1 --cpu-set 0 --report-bindings ./a.out : -np 1 --cpu-set 1 --report-bindings ./a.out
[nicpa-dtu:10633] MCW rank 0 is not bound (or bound to all available processors)
[nicpa-dtu:10633] MCW rank 1 is not bound (or bound to all available processors)

Note the 3rd invocation: it is the only one that reports an actual binding (for rank 1)!
I'll check - it is possible that the confusion lies in the output. If the cpu-set is a single processor and you tell us to bind-to core, then the proc sees that "all available processors" is just the one that it is executing upon - which it interprets as being "bound to all available processors" as the message says.
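One way to disambiguate the two cases, independently of mpirun's report, is to have the launched program dump its own affinity mask. Below is a minimal Linux-only sketch added here for illustration (not code from the thread); the file name showmask.c is hypothetical:

```c
/* Hypothetical helper (not from this thread): print the Linux affinity mask of
 * the calling process, so "not bound" can be told apart from "restricted to a
 * single allowed cpu".  Linux-only.  Build with: gcc -o showmask showmask.c */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_getaffinity");
        return 1;
    }

    printf("allowed cpus (%d total):", CPU_COUNT(&mask));
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            printf(" %d", cpu);
    printf("\n");
    return 0;
}
```

If the printed mask contains only the cpu named in --cpu-set, the process really is restricted and only the wording of the report is misleading.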
@rhc54 ok, but I would have suspected something like an explicit binding report for each rank instead.
Also, doing:

$> mpirun --report-bindings --bind-to core -np 1 --cpu-set 0 ./a.out : --bind-to core -np 1 --cpu-set 2 ./a.out
[nicpa-dtu:07197] MCW rank 0 is not bound (or bound to all available processors)
[nicpa-dtu:07197] MCW rank 1 is not bound (or bound to all available processors)

However, if I do (with an application file):

$> cat appfile
--report-bindings --bind-to core -np 1 numactl --physcpubind=0 ./a.out
--report-bindings --bind-to core -np 1 numactl --physcpubind=1 ./a.out

I get:

$> mpirun --app appfile
[nicpa-dtu:08893] MCW rank 0 bound to SK0:L30:L20:L10:CR0:HT0-1
[nicpa-dtu:08894] MCW rank 1 bound to SK0:L30:L21:L11:CR1:HT2-3

which looks correct (although I am not used to that binding-report format).
What do you get if you run:

mpirun --report-bindings --bind-to core -np 1 --cpu-set 0 numactl --show : --bind-to core -np 1 --cpu-set 2 numactl --show
On my local machine I get:

$> mpirun --report-bindings --bind-to core -np 1 --cpu-set 0 numactl --show : --bind-to core -np 1 --cpu-set 2 numactl --show
policy: default
preferred node: current
physcpubind: 0 2
cpubind: 0
nodebind: 0
membind: 0
[nicpa-dtu:12015] MCW rank 0 is not bound (or bound to all available processors)
[nicpa-dtu:12015] MCW rank 1 is not bound (or bound to all available processors)
policy: default
preferred node: current
physcpubind: 0 2
cpubind: 0
nodebind: 0
membind: 0
With the appfile approach (adding --show):

$> cat appfile
--report-bindings --bind-to core -np 1 numactl --physcpubind=0 --show
--report-bindings --bind-to core -np 1 numactl --physcpubind=1 --show
$> mpirun --app appfile
policy: default
preferred node: current
physcpubind: 0
cpubind: 0
nodebind: 0
membind: 0
policy: default
preferred node: current
physcpubind: 1
cpubind: 0
nodebind: 0
membind: 0
I added two more tests to the cluster script:

# 7. Create an appfile
{
echo --report-bindings --bind-to none -np 1 --host $hosti numactl --physcpubind=$cpuseti ./run
echo --report-bindings --bind-to none -np 1 --host $hostj numactl --physcpubind=$cpusetj ./run
} > appfile
mpirun --app appfile > appfile-none.$hosti.$cpuseti-$hostj.$cpusetj
# 8. Create an appfile
{
echo --report-bindings --bind-to core -np 1 --host $hosti numactl --physcpubind=$cpuseti ./run
echo --report-bindings --bind-to core -np 1 --host $hostj numactl --physcpubind=$cpusetj ./run
} > appfile
mpirun --app appfile > appfile-core.$hosti.$cpuseti-$hostj.$cpusetj

In this case I find that 7. works correctly (YAY!):

[n-62-27-29:47440] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././././././././././././././././././././././././././.][./././././././././././././././././././././././././././././././.]
[n-62-27-29:47441] MCW rank 1 bound to socket 0[core 3[hwt 0]]: [./././B/./././././././././././././././././././././././././././.][./././././././././././././././././././././././././././././././.]

However, for 8. I get:

libnuma: Warning: cpu argument 3 is out of range

which suggests that the binding for MPI (and thus the cpuset) is already limited before numactl executes.

I hope this can be used to dig out what is happening and how to use OMPI to control this. Using simpler command-line arguments would be nice for end-users.
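The "out of range" warning is consistent with numactl inheriting an already-restricted cpuset from mpirun, so logical cpu 3 is simply not visible to it. One way to confirm that (an assumed sketch, not part of the original report; the file name showallowed.c is hypothetical) is to launch a tiny program that prints the cpu list it inherited before anything else runs:

```c
/* Assumed sketch (not in the original report): print the cpu list this process
 * inherited, read from /proc/self/status, before doing anything else.
 * Linux-only.  Build with: gcc -o showallowed showallowed.c */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];

    if (!f) {
        perror("fopen");
        return 1;
    }
    while (fgets(line, sizeof(line), f))
        if (strncmp(line, "Cpus_allowed_list:", 18) == 0)
            fputs(line, stdout);    /* e.g. "Cpus_allowed_list:  0" */
    fclose(f);
    return 0;
}
```

Running this in place of the numactl command in test 8 should show a single-cpu list if mpirun has already constrained the process before the appfile command executes.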
A couple of us looked into this and found that:
We concluded that trying to backport a fix to the v3.x series was unlikely to be either easy or small, which made the release manager for that series reluctant to consider accepting it. Fixing it for v5.x is definitely something we will do, and it might (depending on the solution) be acceptable to the release managers to backport that fix to v4.x.

Meantime, a workaround that appeared to work for us was:

mpirun --map-by core --bind-to core -np 1 --cpu-set 0,3 app1 : -np 1 app2

This tells mpirun to utilize an envelope of cpus 0 and 3, and to map/bind the procs by core within that envelope. So the first proc in the job will be bound to cpu 0 and the second proc in the job (which is the first proc of app2) will be bound to cpu 3. Give that a try and see if it works for you.
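As a sanity check of the envelope behaviour, one could run a small binding reporter as both app contexts. The sketch below is an assumption added for illustration (the thread itself does not contain it); it uses hwloc, the library Open MPI binds with, and the name showbind.c is hypothetical:

```c
/* Assumed sketch (not from this thread): report the process binding via hwloc,
 * the same library Open MPI uses internally.  Build with: gcc showbind.c -lhwloc */
#include <hwloc.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_bitmap_t set = hwloc_bitmap_alloc();
    char *str = NULL;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* Ask hwloc which cpuset this process is currently bound to. */
    if (hwloc_get_cpubind(topo, set, HWLOC_CPUBIND_PROCESS) == 0) {
        hwloc_bitmap_asprintf(&str, set);
        printf("process binding: %s\n", str);
        free(str);
    } else {
        printf("binding query not supported on this system\n");
    }

    hwloc_bitmap_free(set);
    hwloc_topology_destroy(topo);
    return 0;
}
```

For example, mpirun --map-by core --bind-to core -np 1 --cpu-set 0,3 ./showbind : -np 1 ./showbind should then print singleton cpusets containing 0 and 3 if the workaround behaves as described.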
Thanks! I am ok with having this for 5, but it would be really nice to have in 4. ;) If the options are not working, it should probably be noted in the documentation for the next 3.x release to ensure people are aware of it (it is not too much trouble for me :)).

I have also tried your suggestion. However, I can't get it to work:

$> mpirun --map-by core --bind-to core -np 1 --cpu-set $cpuseti,$cpusetj ./run : -np 1 ./run
--------------------------------------------------------------------------
Conflicting directives for mapping policy are causing the policy
to be redefined:
New policy: RANK_FILE
Prior policy: BYCORE
Please check that only one policy is defined.
--------------------------------------------------------------------------
$> mpirun --map-by core -np 1 --cpu-set $cpuseti,$cpusetj ./run : -np 1 ./run
--------------------------------------------------------------------------
Conflicting directives for mapping policy are causing the policy
to be redefined:
New policy: RANK_FILE
Prior policy: BYCORE
Please check that only one policy is defined.
--------------------------------------------------------------------------
$> mpirun --bind-to core -np 1 --cpu-set $cpuseti,$cpusetj ./run : -np 1 ./run
[n-62-27-29:05179] MCW rank 0 is not bound (or bound to all available processors)
[n-62-27-29:05179] MCW rank 1 is not bound (or bound to all available processors)

It doesn't seem to work. :( Also, the above won't work on 2 different hosts. Well, it seems that the app-file solution is sufficient and solves all problems. If you want me to test more, please do not hesitate to contact me.
This should be retested with master/v5.0.x with prrte.
Adding the blocker label for now - but it may not be.
This might be fixed in v5.0.x, just needs retesting.
Re-testing this on current master, it does appear to be fixed. Here are some example runs with master/prrte (v5.0):

Local host:

Remote host:
I think we can remove the v5 label, and possibly close this if there is no intention of bringing any required fixes to v4.
Confirmed: it also works with the current v5.0.x branch.
What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
ompi: 3.1.4
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Manual installation.
Please describe the system on which you are running
ScientificLinux 7.3
EPYC 7551 (currently just testing single node)
not important (single node)
Details of the problem
I would like to fully control MPI ranking and binding from the command-line interface, and optionally do it with --host assignments. In particular I would like to run the OSU benchmarks and do some benchmarking. However, I have the same problem for simple codes.
In the following I will use this code (for brevity):
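(The snippet itself was not captured in this copy of the issue; a minimal stand-in of the kind typically used for such binding tests might look like the following. The file name and the sched_getcpu() reporting are assumptions, not the reporter's original code.)

```c
/* Hypothetical stand-in for the test program (the original snippet was not
 * captured here): each rank reports the host and cpu it is running on.
 * Build with: mpicc -o a.out test.c */
#define _GNU_SOURCE
#include <mpi.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    printf("rank %d / %d on %s, currently on cpu %d\n",
           rank, size, host, sched_getcpu());

    sleep(2);   /* keep the procs alive long enough to inspect the bindings */
    MPI_Finalize();
    return 0;
}
```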
Cluster script:
I am requesting 64 cores on a 2-socket EPYC7551 machine (totalling 64 cores).
Explaining the script
Although I am allocating entire nodes and not using everything, I would still expect Open MPI to obey my requested bindings.
This gives me:
regardless of --cpu-set <args>.
2-6. All yield exactly the same output:
I have also tried adding --bind-to core with the same output. Possibly related issues: