New arch: HPC2020 nvhpc/23.7 #103
base: develop
9eefa04 to 8af29b6
Thanks for rebasing!
Unfortunately, this doesn't seem to work out-of-the-box for me.
Building with
./cloudsc-bundle build --arch arch/ecmwf/hpc2020/nvhpc/23.7/ --with-gpu --with-loki --with-cuda
fails to discover OpenMP correctly:
-- Found OpenMP_Fortran: -mp -mp=gpu,bind,allcores,numa
-- Could NOT find OpenMP_C (missing: OpenMP_acchost_LIBRARY)
-- Could NOT find OpenMP (missing: OpenMP_C_FOUND C)
-- dwarf-p-cloudsc FAILED to find OPTIONAL package OpenMP
-- Could NOT find package OpenMP required for feature OMP --
CMake Error at ecbuild/cmake/ecbuild_log.cmake:190 (message):
CRITICAL - Feature OMP cannot be enabled -- following required
packages weren't found: OpenMP
Call Stack (most recent call first):
ecbuild/cmake/ecbuild_add_option.cmake:260 (ecbuild_critical)
cloudsc-dwarf/CMakeLists.txt:78 (ecbuild_add_option)
-- Configuring incomplete, errors occurred!
This is likely due to the explicit OpenMP library override in the toolchain.cmake:
dwarf-p-cloudsc/arch/toolchains/ecmwf-hpc2020-nvhpc.cmake
Lines 23 to 28 in 82fdf4b
# Note: OpenMP_C_FLAGS and OpenMP_C_LIB_NAMES have to be provided _both_ to
# keep FindOpenMP from overwriting the FLAGS variable (the cache entry alone
# doesn't have any effect here as the module uses FORCE to overwrite the
# existing value)
set( OpenMP_C_FLAGS "-mp -mp=bind,allcores,numa" CACHE STRING "" )
set( OpenMP_C_LIB_NAMES "acchost" CACHE STRING "")
With these lines removed, the build succeeds, and it has also succeeded for me with the 22.11 arch. IIRC, this may have been fixed in CMake, so the 3.25 version that we use in both arch files no longer requires the workaround. I would suggest simply removing these lines from the toolchain file.
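If removing the override outright is a concern for setups with an older CMake, one alternative is to guard it by CMake version. This is a minimal, untested sketch: the `3.25` threshold is an assumption taken from the version used in the arch files, not a confirmed release in which FindOpenMP changed.

```cmake
# Sketch only: keep the FindOpenMP override for older CMake releases.
# The 3.25 threshold is an assumption (the version used in the arch files),
# not a confirmed version of any upstream fix.
if( CMAKE_VERSION VERSION_LESS 3.25 )
  set( OpenMP_C_FLAGS "-mp -mp=bind,allcores,numa" CACHE STRING "" )
  set( OpenMP_C_LIB_NAMES "acchost" CACHE STRING "" )
endif()
```

With CMake 3.25 or newer the block is skipped and FindOpenMP detects the C flags and libraries itself, which is equivalent to removing the lines.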
However, even with this change, almost all variants fail with a SIGFPE:
The following tests FAILED:
3 - dwarf-cloudsc-fortran-serial (NUMERICAL)
4 - dwarf-cloudsc-fortran-omp (NUMERICAL)
10 - dwarf-cloudsc-gpu-scc-serial (NUMERICAL)
11 - dwarf-cloudsc-gpu-scc-stack-serial (NUMERICAL)
12 - dwarf-cloudsc-gpu-scc-hoist-serial (NUMERICAL)
13 - dwarf-cloudsc-gpu-scc-k-caching-serial (NUMERICAL)
14 - dwarf-cloudsc-gpu-omp-scc-serial (NUMERICAL)
15 - dwarf-cloudsc-gpu-omp-scc-stack-serial (NUMERICAL)
16 - dwarf-cloudsc-gpu-omp-scc-hoist-serial (NUMERICAL)
17 - dwarf-cloudsc-gpu-omp-scc-k-caching-serial (NUMERICAL)
18 - dwarf-cloudsc-gpu-scc-cuf-serial (NUMERICAL)
19 - dwarf-cloudsc-gpu-scc-cuf-k-caching-serial (NUMERICAL)
20 - dwarf-cloudsc-loki-idem-serial (NUMERICAL)
21 - dwarf-cloudsc-loki-idem-omp (NUMERICAL)
22 - dwarf-cloudsc-loki-idem-stack-serial (NUMERICAL)
23 - dwarf-cloudsc-loki-idem-stack-omp (NUMERICAL)
24 - dwarf-cloudsc-loki-sca-serial (NUMERICAL)
25 - dwarf-cloudsc-loki-scc-serial (NUMERICAL)
26 - dwarf-cloudsc-loki-scc-stack-serial (NUMERICAL)
27 - dwarf-cloudsc-loki-scc-hoist-serial (NUMERICAL)
28 - dwarf-cloudsc-loki-scc-cuf-parametrise-serial (NUMERICAL)
29 - dwarf-cloudsc-loki-scc-cuf-hoist-serial (NUMERICAL)
30 - dwarf-cloudsc-loki-c-serial (NUMERICAL)
31 - dwarf-cloudsc-loki-c-omp (NUMERICAL)
Could you have another look at that please?
Add new env and toolchain file for NVHPC 23.7