Nexus: Set workstation thread binding to none #5240
base: develop
Is it possible to test whether OpenMPI is in use and add the binding option only if it is?
A fair question for @jtkrogel. My take is that, since the current workstation class assumes you are using some flavor of MPI, we could have different variants for different MPI implementations (OpenMPI, MPICH, even no MPI, but with threads set properly). Hopefully this current change can be merged quickly if no better route is close to ready. I'll note, though, that improvements to the Workstation class are not academic: if we are to run the Nexus examples in the nightlies or even CI, efficiency will be key.
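One possible way to do the detection asked about above is to inspect the launcher's version banner. The sketch below is only an illustration and is not Nexus code: the names `detect_mpi_flavor` and `query_launcher` are hypothetical, and the matching relies on two assumptions, namely that Open MPI's `mpirun --version` banner contains "Open MPI" and that MPICH's Hydra launcher prints "HYDRA build details". Other Hydra-derived launchers may match the MPICH branch as well, so the result should be treated as a hint rather than a guarantee.

```python
import subprocess


def detect_mpi_flavor(version_text):
    """Classify an MPI launcher from its --version banner (hypothetical helper)."""
    text = version_text.lower()
    # Assumption: Open MPI banners look like "mpirun (Open MPI) 4.1.6".
    if "open mpi" in text or "openmpi" in text or "open-mpi" in text:
        return "openmpi"
    # Checked before the Hydra test because MVAPICH may ship a Hydra launcher.
    if "mvapich" in text:
        return "mvapich"
    # Assumption: MPICH's Hydra launcher prints "HYDRA build details:".
    # Other Hydra-based launchers could also land here.
    if "hydra" in text or "mpich" in text:
        return "mpich"
    return "unknown"


def query_launcher(launcher="mpirun"):
    """Run `<launcher> --version` and classify it; 'unknown' if it cannot be run."""
    try:
        result = subprocess.run(
            [launcher, "--version"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "unknown"
    return detect_mpi_flavor(result.stdout + result.stderr)
```

A workstation class could call something like `query_launcher()` once and pick its launcher options per flavor, falling back to the current behavior when the result is `"unknown"`.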
Confirmed that both OpenMPI and MPICH accept --bind-to none. Unclear about MVAPICH.
Test this please
@prckent Nexus tests are failing on sulfur.
But why those sulfur configurations?
Proposed changes
The Nexus workstation configurations, e.g. ws16, implicitly assume that codes are built with OpenMPI for MPI runs. When MPI is combined with threading, the threads are bound in a way that is wrong for our use, giving poor performance. This binding is the default behavior of all recent OpenMPI releases. It is easily verified on a system with hyperthreading enabled: the per-process CPU% shows as 50% instead of the expected 100%, or, in a 16 "core" run, not all 16 cores are used. Performance is consequently halved. This PR adds --bind-to none, similar to what has already been done in some of the Supercomputer job definitions.
I am open to switching to more optimal settings or moving where the setting lives, but for everyday workstation runs I think this is close enough and simple.
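To make the effect of the change concrete, here is a minimal sketch of how a launch line with the binding disabled might be assembled for a hybrid MPI plus threads run. It is not the Nexus implementation: the helper `build_mpirun_command` and the executable name `./my_app` are hypothetical, and setting `OMP_NUM_THREADS` assumes the code uses OpenMP threading. Only the `--bind-to none` option itself comes from this PR.

```python
import os


def build_mpirun_command(executable, ranks, threads, bind_to_none=True):
    """Return (argv, env) for a hybrid MPI+threads run (hypothetical helper)."""
    argv = ["mpirun"]
    if bind_to_none:
        # Turn off the launcher's default process binding so that each
        # rank's threads are free to spread over the available cores.
        argv += ["--bind-to", "none"]
    argv += ["-np", str(ranks), executable]
    # Assumption: the application reads its thread count from OpenMP.
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(threads)
    return argv, env


# "./my_app" is a placeholder executable, not a name taken from the PR.
argv, env = build_mpirun_command("./my_app", ranks=4, threads=4)
print(" ".join(argv))  # mpirun --bind-to none -np 4 ./my_app
```

Keeping the flag behind a boolean mirrors the discussion in the thread: the option can be dropped for launchers that turn out not to accept it.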
Noticed while testing #5214.
What type(s) of changes does this code introduce?
Does this introduce a breaking change?
What systems has this change been tested on?
nitrogen2, GCC14 + OpenMPI, standard nightly configuration.
Checklist