Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable oversubscribing in OpenMPI 2.X #3335

Merged
merged 3 commits into from
Nov 22, 2019
Merged

Conversation

jngrad
Copy link
Member

@jngrad jngrad commented Nov 21, 2019

Fixes #3333

Centralize the TEST_NP and MPIEXEC_OVERSUBSCRIBE default value
deduction mechanism to the dedicated unit_test module.
The OpenMPI -oversubscribe feature was backported to the 2.X series.
Without the -oversubscribe flag, tests running on hyperthreaded CPUs
with more threads than cores using OpenMPI 2.X may exit early with
an exit code of 0, causing these tests to silently fail.
When no default value is provided for TEST_NP, value `nproc / 2 + 1`
is used instead, which can be large on hyperthreaded CPUs. This leads
to issues in parallel tests that don't scale well with a large number
of cores (e.g. when there aren't enough particles) or with a large
odd number of cores (e.g. LB or P3M).
@jngrad jngrad added this to the Espresso 4.1.2 milestone Nov 21, 2019
@jngrad jngrad requested a review from mkuron November 21, 2019 17:48
@codecov
Copy link

codecov bot commented Nov 21, 2019

Codecov Report

Merging #3335 into python will decrease coverage by <1%.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           python   #3335   +/-   ##
======================================
- Coverage      86%     86%   -1%     
======================================
  Files         538     538           
  Lines       25350   25350           
======================================
- Hits        21834   21833    -1     
- Misses       3516    3517    +1
Impacted Files Coverage Δ
src/core/electrostatics_magnetostatics/p3m.cpp 85% <0%> (-1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c91ea5e...d1440da. Read the comment docs.

Copy link
Member

@mkuron mkuron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you‘ve checked that OpenMPI 2.0.0 already had this option, then this is fine.

@jngrad
Copy link
Member Author

jngrad commented Nov 22, 2019

Yes, I checked the source code. I've also now compiled 2.0.0 on Fedora 31 and the -oversubscribe flag is recognized.

@jngrad
Copy link
Member Author

jngrad commented Nov 22, 2019

bors r=fweik

bors bot added a commit that referenced this pull request Nov 22, 2019
3334: Update io.rst r=jngrad a=jonaslandsgesell

link to h5md plugin for VMD

3335: Enable oversubscribing in OpenMPI 2.X r=fweik a=jngrad

Fixes #3333

3336: Fallthrough and remove duplicate line r=jngrad a=hirschsn

Description of changes:
 - Remove a duplicated line in `topology_check_resort` by falling through as done for other cell systems in this function.


PR Checklist
------------
 - [ ] Tests?
   - [ ] Interface
   - [ ] Core 
 - [ ] Docs?


Co-authored-by: Jonas Landsgesell <jonaslandsgesell@users.noreply.github.com>
Co-authored-by: Jean-Noël Grad <jgrad@icp.uni-stuttgart.de>
Co-authored-by: Steffen Hirschmann <steffen.hirschmann@ipvs.uni-stuttgart.de>
@bors
Copy link
Contributor

bors bot commented Nov 22, 2019

Build succeeded

@bors bors bot merged commit d1440da into espressomd:python Nov 22, 2019
@jngrad jngrad deleted the fix-3333 branch January 18, 2022 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

some python tests fail silently on some architectures with OpenMPI 2.X
4 participants