sanity_check_rpath should not check for libcuda.so.1 #4095

Closed
casparvl opened this issue Oct 13, 2022 · 0 comments · Fixed by #4119
casparvl commented Oct 13, 2022

When installing Clang (e.g. Clang-12.0.1-GCCcore-10.3.0.eb) with --rpath, the installation errors out with:

== 2022-10-12 13:53:54,035 build_log.py:169 ERROR EasyBuild crashed with an error (at easybuild/RHEL8/2022/software/EasyBuild/4.6.1/lib/python3.6/site-packages/easybuild/base/exceptions.py:124 in __init__): Sanity check failed: One or more required libraries not found for /home/casparl/.local/easybuild/RHEL8/2022/software/Clang/12.0.1-GCCcore-10.3.0/lib/libomptarget.rtl.cuda.so:   linux-vdso.so.1 (0x00007fffb959f000)
        libcuda.so.1 => not found
        libelf.so.1 => /sw/arch/Centos8/EB_production/2021/software/elfutils/0.185-GCCcore-10.3.0/lib64/libelf.so.1 (0x0000152603766000)
        libstdc++.so.6 => /home/casparl/.local/easybuild/Centos8/2021/software/GCCcore/10.3.0/lib64/libstdc++.so.6 (0x000015260339b000)
        libm.so.6 => /lib64/libm.so.6 (0x0000152603019000)
        libgcc_s.so.1 => /home/casparl/.local/easybuild/Centos8/2021/software/GCCcore/10.3.0/lib64/libgcc_s.so.1 (0x000015260374a000)
        libc.so.6 => /lib64/libc.so.6 (0x0000152602c54000)
        /lib64/ld-linux-x86-64.so.2 (0x0000152603582000)
        libz.so.1 => /sw/arch/Centos8/EB_production/2021/software/zlib/1.2.11-GCCcore-10.3.0/lib/../lib64/libz.so.1 (0x0000152603726000)

One or more required libraries not found for /home/casparl/.local/easybuild/RHEL8/2022/software/Clang/12.0.1-GCCcore-10.3.0/lib64/libomptarget.rtl.cuda.so:     linux-vdso.so.1 (0x000014f10ef12000)
        libcuda.so.1 => not found
        libelf.so.1 => /sw/arch/Centos8/EB_production/2021/software/elfutils/0.185-GCCcore-10.3.0/lib64/libelf.so.1 (0x000014f10eec6000)
        libstdc++.so.6 => /home/casparl/.local/easybuild/Centos8/2021/software/GCCcore/10.3.0/lib64/libstdc++.so.6 (0x000014f10eb01000)
        libm.so.6 => /lib64/libm.so.6 (0x000014f10e77f000)
        libgcc_s.so.1 => /home/casparl/.local/easybuild/Centos8/2021/software/GCCcore/10.3.0/lib64/libgcc_s.so.1 (0x000014f10eeaa000)
        libc.so.6 => /lib64/libc.so.6 (0x000014f10e3ba000)
        /lib64/ld-linux-x86-64.so.2 (0x000014f10ece8000)
        libz.so.1 => /sw/arch/Centos8/EB_production/2021/software/zlib/1.2.11-GCCcore-10.3.0/lib/../lib64/libz.so.1 (0x000014f10ee86000)
 (at easybuild/RHEL8/2022/software/EasyBuild/4.6.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py:3471 in _sanity_check_step)

However, libcuda.so.1 should never be RPATH-ed. Nvidia supports cross-compilation by linking (at compile time) against stub libraries. Those stub libraries contain function declarations but no implementations, so they are never meant to be used at runtime. See #2683.

We made sure that the RPATH wrappers allow specifying certain paths that should never be RPATH-ed, and we use that to exclude the CUDA stub libraries. See #2725 and #3850 for the implementation.
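
To illustrate the idea of excluding directories from RPATH, here is a minimal Python sketch. It is not the code from #2725 or #3850: the function name `filter_rpath_dirs` and the pattern `STUBS_DIR_REGEX` are hypothetical, and the sketch simply assumes that a directory should be excluded when one of its path components is exactly `stubs`.

```python
import re

# Assumption: a directory is a "stubs" directory if one of its path
# components is exactly 'stubs' (e.g. .../lib64/stubs).
STUBS_DIR_REGEX = re.compile(r'(^|/)stubs(/|$)')


def filter_rpath_dirs(lib_dirs, filter_patterns=(STUBS_DIR_REGEX,)):
    """Return only the library directories that may be RPATH-ed.

    Hypothetical helper: drops every directory that matches one of the
    given filter patterns, and keeps the original order of the rest.
    """
    return [d for d in lib_dirs
            if not any(p.search(d) for p in filter_patterns)]


if __name__ == '__main__':
    dirs = [
        '/opt/CUDA/11.7.0/lib64',
        '/opt/CUDA/11.7.0/lib64/stubs',
        '/opt/zlib/lib',
    ]
    for path in filter_rpath_dirs(dirs):
        print(path)
```

With a filter like this, the stubs directory is only used to resolve symbols at link time and never ends up in the RPATH section of the resulting binary.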

The problem we didn't tackle is that, if you filter those directories from being RPATH-ed, the libraries that are linked from those directories will fail sanity_check_rpath (as above).

Minimal reproducer
Put the following C++ source file in a tarball (toy.tar.gz):

// toy.cpp
// Your First C++ Program

#include <iostream>

int main() {
    std::cout << "Hello World!";
    return 0;
}

Then create an EasyConfig that does e.g.

# toy-1.0.0-GCCcore-11.3.0.eb

easyblock = 'CmdCp'

name = 'Toy'
version = '1.0.0'

homepage = 'https://whatever.org/'
description = """Toy"""

toolchain = {'name': 'GCCcore', 'version': '11.3.0'}

sources = [
    'toy.tar.gz',
]

cmds_map = [('toy', 'g++ -o toy -lcuda toy.cpp')]

files_to_copy = [(["toy"], 'bin')]

builddependencies = [
    ('binutils', '2.38'),
]

dependencies = [
    ('CUDA', '11.7.0', '', SYSTEM),
]

sanity_check_paths = {
    'files': ['bin/toy'],
    'dirs': [],
}

moduleclass = 'tools'

Then try to install it with RPATH linking enabled (as in the Clang case above):

eb toy-1.0.0-GCCcore-11.3.0.eb --sourcepath=<path/containing/toy/tarball>

We discussed this a bit on Slack, and we should probably alter

def sanity_check_rpath(self, rpath_dirs=None):
to skip certain libraries, much like we hard-enabled filtering RPATH for the CUDA stubs dir in
# always include filter for 'stubs' library directory,
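
A rough sketch of what "skipping certain libraries" could look like is shown below. This is not the actual EasyBuild implementation (nor the fix from #4119): `NOT_FOUND_REGEX`, `SKIPPED_LIBS` and `missing_libs` are hypothetical names, and the sketch only assumes that the check works on `ldd` output like the log above.

```python
import re

# Matches ldd lines such as "        libcuda.so.1 => not found"
NOT_FOUND_REGEX = re.compile(r'^\s*(\S+)\s+=>\s+not found', re.M)

# Assumption: libraries that are expected to be unresolved at build time,
# because they are provided by the driver on the system that runs the code.
SKIPPED_LIBS = frozenset(['libcuda.so.1'])


def missing_libs(ldd_output, skipped=SKIPPED_LIBS):
    """Return libraries reported as 'not found', minus the skipped ones."""
    return [lib for lib in NOT_FOUND_REGEX.findall(ldd_output)
            if lib not in skipped]


if __name__ == '__main__':
    sample = """
        linux-vdso.so.1 (0x00007fffb959f000)
        libcuda.so.1 => not found
        libm.so.6 => /lib64/libm.so.6 (0x0000152603019000)
    """
    # Without the skip list, libcuda.so.1 is reported as missing
    print(missing_libs(sample, skipped=()))
    # With the skip list, nothing is reported
    print(missing_libs(sample))
```

Hard-coding the skip list mirrors how the stubs directory filter is always enabled; whether the list should also be configurable is a separate design question.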
