Since the beginning of December 2022, it is no longer possible to build a CUDA-enabled ESPResSo project on Ubuntu 22.04, either via NVCC or Clang. Below we share our experience and the solution that worked for us.
Problem statement
Error 1: NVCC cannot compile a simple CUDA file
CMake Error at /usr/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:726 (message):
Compiling the CUDA compiler identification source file
"CMakeCUDACompilerId.cu" failed.
Compiler: /usr/local/cuda-11.2/bin/nvcc
139 | #error -- unsupported GNU version! gcc versions later than 10 are not supported! The nvcc flag
| '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported
| host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
Error 2: Clang cannot find C++ standard headers like cmath or iostream
/usr/lib/llvm-14/lib/clang/14.0.0/include/__clang_cuda_runtime_wrapper.h:41:10: fatal error: 'cmath' file not found
#include <cmath>
^~~~~~~
1 error generated when compiling for sm_61
Error 3: CMake fails to detect MPI libraries when Clang is the compiler.
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so (found suitable version "3.1")
-- Could NOT find MPI_CXX (missing: MPI_CXX_WORKS) (Required is at least version "3.0")
CMake Error at /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find MPI (missing: MPI_CXX_FOUND) (found suitable version "3.1",
minimum required is "3.0")
Call Stack (most recent call first):
/usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake-3.22/Modules/FindMPI.cmake:1830 (find_package_handle_standard_args)
CMakeLists.txt:323 (find_package)
Error 4: The nvidia-cuda-toolkit cannot be installed together with nvidia-driver-515.
$ sudo apt install nvidia-cuda-toolkit
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be REMOVED:
libnvidia-compute-515 libnvidia-compute-515:i386 libnvidia-decode-515 libnvidia-decode-515:i386
libnvidia-encode-515 libnvidia-encode-515:i386 nvidia-compute-utils-515 nvidia-driver-515 nvidia-utils-515
The following NEW packages will be installed:
libnvidia-compute-495 libnvidia-compute-510 nvidia-cuda-dev nvidia-cuda-gdb nvidia-cuda-toolkit [...]
Do you want to continue? [Y/n] n
$ sudo apt install nvidia-cuda-toolkit nvidia-driver-515
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
nvidia-driver-515 is already the newest version (515.86.01-0ubuntu0.22.04.1).
Some packages could not be installed. This may mean that you have requested an impossible situation
or if you are using the unstable distribution that some required packages have not yet been created
or been moved out of Incoming. The following information may help to resolve the situation:
The following packages have unmet dependencies:
libnvidia-decode-515 : Depends: libnvidia-compute-515 (= 515.86.01-0ubuntu0.22.04.1) but it is not installable
nvidia-compute-utils-515 : Depends: libnvidia-compute-515 but it is not installable
nvidia-driver-515 : Depends: libnvidia-compute-515 (= 515.86.01-0ubuntu0.22.04.1) but it is not installable
Recommends: libnvidia-compute-515:i386 (= 515.86.01-0ubuntu0.22.04.1)
Recommends: libnvidia-decode-515:i386 (= 515.86.01-0ubuntu0.22.04.1)
Recommends: libnvidia-encode-515:i386 (= 515.86.01-0ubuntu0.22.04.1)
nvidia-utils-515 : Depends: libnvidia-compute-515 but it is not installable
E: Unable to correct problems, you have held broken packages.
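Before answering such a prompt, it can help to list exactly which packages apt would remove. The following is a minimal sketch, assuming the simulation mode of apt-get (`-s`), which prints one `Remv` line per package it would remove; the helper name `removed_packages` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `apt-get -s install ...` output on stdin and
# print only the names of the packages that would be removed.
removed_packages() {
    awk '$1 == "Remv" { print $2 }'
}

# Simulate the installation without changing the system (no root needed).
# The guard keeps the script harmless on machines without apt.
if command -v apt-get >/dev/null 2>&1; then
    apt-get -s install nvidia-cuda-toolkit 2>/dev/null | removed_packages
fi
```

If the list contains the installed driver packages, the installation can be abandoned before anything is touched.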
Things to know before attempting to solve these issues
CUDA 11 requires GCC <= 10.
When Clang 14 compiles CUDA sources, it automatically selects the most recent GCC version, i.e. GCC 12 on Ubuntu 22.04.
GCC 12 cannot be removed: on Ubuntu 22.04, the apt package manager must satisfy the following chain of hard dependencies: nvidia-driver-515 -> nvidia-dkms-515 -> dkms -> gcc-12. Even though the dkms page indicates a dependency on any gcc or c-compiler version, only gcc-12 satisfies that dependency from the point of view of apt (as of December 13, 2022), in spite of the fact that nvidia-driver-515 is in reality compatible with nvidia-cuda-toolkit.
Inside an Ubuntu 22.04 Docker container with nvidia-cuda-toolkit installed and no NVIDIA driver installed, as long as the container was started with docker run --runtime=nvidia, the NVIDIA driver of the host will be used and the NVIDIA toolkit will work as expected (no issues with CMake, NVCC, GCC or Clang).
Posts on StackOverflow suggest creating a new folder containing symlinks to the GCC 10 include and lib directories and passing that folder via --gcc-toolchain=my_folder to restrict GCC version detection to GCC 10. Unfortunately, the C++ header files are actually split across two separate directories, so that workaround no longer works.
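The first two points can be checked on a given machine with a short script. This is a minimal sketch, assuming the GCC <= 10 limit stated above for CUDA 11 and that the Clang binary is named clang++-14; the helpers `gcc_major` and `cuda11_accepts_gcc` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ESPResSo, CUDA or Clang.

# Print the major component of a version string such as "12.1.0" or "12".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed when the given GCC major version is accepted by CUDA 11 (GCC <= 10).
cuda11_accepts_gcc() {
    [ "$1" -le 10 ]
}

# Report whether the default GCC is usable as a CUDA 11 host compiler.
if command -v gcc >/dev/null 2>&1; then
    major=$(gcc_major "$(gcc -dumpversion)")
    if cuda11_accepts_gcc "$major"; then
        echo "gcc $major: accepted by CUDA 11"
    else
        echo "gcc $major: too recent for CUDA 11"
    fi
fi

# Show which GCC installation Clang selects for its C++ headers.
if command -v clang++-14 >/dev/null 2>&1; then
    clang++-14 -v 2>&1 | grep 'Selected GCC installation' || true
fi
```

The verbose output of Clang also lists every candidate GCC installation it found, which makes it easy to see that GCC 10 is present but not selected.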
How we solved these issues
To use Clang 14, it is necessary to install a complete GCC 10 toolchain:
run apt install gcc-10 g++-10 libstdc++-10-dev
pass extra flags so that Clang traverses the GCC 10 header files before the GCC 12 header files.
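One possible shape for such flags is sketched below; it is not necessarily the exact set of flags used here. It assumes that the GCC 10 C++ headers live in /usr/include/c++/10 and /usr/include/x86_64-linux-gnu/c++/10 (the usual layout of the Ubuntu packages) and that -isystem is enough to place them ahead of Clang's default search path. The helper `gcc_header_flags` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print include flags for the two directories that hold
# the libstdc++ headers of the given GCC major version.
# The directory layout is an assumption based on Ubuntu 22.04 packages.
gcc_header_flags() {
    printf '%s\n' "-isystem /usr/include/c++/$1 -isystem /usr/include/x86_64-linux-gnu/c++/$1"
}

flags=$(gcc_header_flags 10)
echo "$flags"

# Possible use (an untested assumption, adapt to the project):
#   CC=clang-14 CXX=clang++-14 cmake .. \
#     -D CMAKE_CXX_FLAGS="$flags" -D CMAKE_CUDA_FLAGS="$flags"
```

Both directories are needed because the C++ header files are split in two, which is also why the single-folder --gcc-toolchain workaround fails.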
Description of changes:
- restrict Clang-Tidy checks to the main project
- external libraries obtained via FetchContent and their consumer targets in ESPResSo no longer emit diagnostics
- use native CUDA support in CMake 3.22
- project option `ESPRESSO_CUDA_COMPILER` was removed
- the waLBerla library obtained via FetchContent can now be compiled with `WALBERLA_BUILD_WITH_CUDA=ON`
- the CUDA 11 circular dependency in Ubuntu 22.04 packages is now documented (closes #4630)
nvidia-cuda-toolkit
nvidia-driver-515
/usr/local/cuda-11.5
sh cuda_11.5.2_495.29.05_linux.run as a regular user
Continue and accept the EULA to go to CUDA Installer
CUDA Toolkit 11.5
Options -> Toolkit Options -> Change Toolkit Install Path -> /usr/local/cuda-11.5/ -> Done
Install
Many thanks to our IT administrator for helping us troubleshoot the compiler errors and CUDA packages.