/usr/bin/toolbox linked against glibc-2.34 doesn't run on older glibc #821
Comments
Hi @kuba3351, thank you for opening this ticket. Could you, please, provide the output of |
Yes, I have got another error:
|
OK, I figured out how to partially work around this issue. I found that the toolbox binary on my host machine was upgraded and built against a newer glibc, and that it is mounted into the container, which has an older glibc. So, I ran the following commands:
After that I can create working toolboxes based on Fedora 35, but older toolboxes still don't work. Any idea how to deal with that? |
Created a PR with a proposed solution for this issue (#822) |
This is basically a duplicate of #529. We need to update |
…create and pthread_detach. Fixes containers#821
…stacksize, pthread_create and pthread_detach. Fixes containers#821
Do you have a doc somewhere explaining why we need to have |
It might make sense to create a separate, fully statically linked binary to use inside the toolbox for setup operations. This would thus be completely independent of the version of the glibc available inside the container. Some container images may not even have glibc installed at all. |
There's a paragraph in
|
Agreed. Let's discuss it more after we release v0.1. |
Filed as #832 for tracking. |
An upgrade of glibc has caused an issue on Fedora Rawhide[0]. We need a clear indicator that a change in glibc could cause it. [0] containers#821 containers#834
In the meantime, would it help if we had an image that actually contained the newer glibc? |
There is a new |
https://bugzilla.redhat.com/show_bug.cgi?id=1995439 is proposed as an F35 blocker. |
The /usr/bin/toolbox binary is not only used to interact with toolbox containers and images from the host. It's also used as the entry point of the containers by bind mounting the binary from the host into the container. This means that the /usr/bin/toolbox binary on the host must also work inside the container, even if they have different operating systems.

In the past, this worked perfectly well with the POSIX shell implementation because it got interpreted by whichever /bin/sh was available. However, the Go implementation can run into ABI compatibility issues because binaries built on newer toolchains aren't meant to be run against older runtimes.

The previous approach [1] of restricting the versions of the glibc symbols that are linked against isn't actually supported by glibc, and breaks if the early process start-up code changes. This is seen in glibc-2.34, which is used by Fedora 35 onwards, where a new version of the __libc_start_main symbol [2] was added as part of some security hardening:

    $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
    0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 __libc_start_main
    0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 pthread_detach
    0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 pthread_create
    0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the toolbox container and ensure that the binary always runs against it. Luckily, almost all supported containers have the host's /usr available at /run/host/usr. This is exploited by embedding RPATHs or RUNPATHs to /run/host/usr/lib and /run/host/usr/lib64 in the binary, and changing the path of the dynamic linker (i.e., PT_INTERP) to the one inside /run/host.

Unfortunately, there can only be one PT_INTERP entry inside the binary, so there must be a /run/host on the host too. Therefore, a /run/host symbolic link is created on the host that points to the host's /.

Based on ideas from Alexander Larsson and Ray Strode.

[1] Commit 6ad9c63
    containers#534
[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
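The symbol-version check described in the commit message above can be sketched in a few lines of Python. This is only an illustration, not code from the Toolbx repository: the function names (parse_version, required_glibc, can_run) are hypothetical, and the sample text is the objdump output quoted above.

```python
import re

# The objdump -T lines quoted in the commit message above.
SAMPLE = """\
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 __libc_start_main
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 pthread_detach
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 pthread_create
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.34 pthread_attr_getstacksize
"""

# Matches versioned tags such as GLIBC_2.34 or GLIBC_2.2.5, not GLIBC_PRIVATE.
_GLIBC_TAG = re.compile(r"\bGLIBC_(\d+(?:\.\d+)+)\b")


def parse_version(text):
    """Turn "2.34" into (2, 34) so that 2.4 sorts before 2.34."""
    return tuple(int(part) for part in text.split("."))


def required_glibc(objdump_output):
    """Return the newest glibc symbol version referenced, or None."""
    versions = [parse_version(m.group(1))
                for m in _GLIBC_TAG.finditer(objdump_output)]
    return max(versions, default=None)


def can_run(objdump_output, runtime_version):
    """True if a runtime with the given glibc version provides every symbol version."""
    needed = required_glibc(objdump_output)
    return needed is None or needed <= parse_version(runtime_version)


if __name__ == "__main__":
    print(required_glibc(SAMPLE))
    print(can_run(SAMPLE, "2.33"))
```

Comparing versions as integer tuples rather than strings matters here, because a plain string comparison would rank "2.4" above "2.34".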
This should be fixed by #897 |
The path of the dynamic linker (i.e., PT_INTERP), as specified in an architecture's ABI, often starts with /lib or /lib64, not /usr/lib or /usr/lib64. For example, it's /lib/ld-linux-aarch64.so.1 for aarch64 and /lib64/ld-linux-x86-64.so.2 for x86_64.

Unfortunately, until very recently [1], only the host's /usr was present inside a toolbox container's /run/host, not /lib or /lib64. Therefore, simply prepending /run/host to the /usr/bin/toolbox binary's existing PT_INTERP entry wouldn't locate the host's dynamic linker inside the toolbox container. This broke backwards compatibility with every container out there, except the ones created with the current development version in Git.

To restore backwards compatibility, the /lib and /lib64 symbolic links must be resolved to their respective locations inside /usr.

The following caveats must be noted:

  * With glibc, even the basename of the path of the dynamic linker, as specified in an architecture's ABI, is a symbolic link to a file named ld-<glibc-version>.so. However, this file can't be used as the PT_INTERP entry, because its name will change when glibc is updated and the PT_INTERP entry will become invalid until the /usr/bin/toolbox binary is rebuilt.

  * On Debian, a path like /lib64/ld-linux-x86-64.so.2 doesn't resolve to something inside /usr/lib64. Instead it ends up inside /usr/lib/x86_64-linux-gnu through a series of symbolic links:
      - /lib64 -> usr/lib64
      - /usr/lib64/ld-linux-x86-64.so.2 -> /lib/x86_64-linux-gnu/ld-2.28.so
      - /lib -> usr/lib

  * It's assumed that a symbolic link with the basename specified in the ABI lives in the same directory as the actual dynamic linker binary named ld-<glibc-version>.so.

Fallout from 6063eb2

[1] Commit d03a5fe
    containers#827

containers#821
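The resolution rule and its caveats can be illustrated with a short Python sketch. It is not the project's build logic: the function name interp_under_run_host and its root parameter are hypothetical, and it relies on the stated assumption that a link with the ABI basename sits next to the real ld-<glibc-version>.so file.

```python
import os


def interp_under_run_host(abi_interp, root="/"):
    """Map an ABI dynamic linker path to its location below /run/host/usr.

    abi_interp is a path such as /lib64/ld-linux-x86-64.so.2. root is a
    hypothetical parameter that lets the sketch inspect a tree other than /.
    """
    real_root = os.path.realpath(root)

    # Follow every symbolic link, which ends at the real dynamic linker file.
    real_file = os.path.realpath(
        os.path.join(real_root, abi_interp.lstrip("/")))

    # Keep the resolved directory, but return to the stable ABI basename,
    # because the versioned file name changes whenever glibc is updated.
    candidate = os.path.join(os.path.dirname(real_file),
                             os.path.basename(abi_interp))
    if not os.path.lexists(candidate):
        raise FileNotFoundError(candidate)

    return "/run/host/" + os.path.relpath(candidate, real_root)
```

On a layout like the Debian one listed above, this would return /run/host/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2, provided the link with the ABI basename exists in that directory.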
Fedora 32 reached End of Life on 25th May 2021:
https://docs.fedoraproject.org/en-US/releases/eol/

That's quite old because right now Fedora 35 is nearing its End of Life. Since the tests are intended for Toolbx, not the Fedora infrastructure, it will be better to use a newer image, because images that are too old can get lost from registry.fedoraproject.org.

The fedora-toolbox:34 image can be a drop-in replacement for the fedora-toolbox:32 image for the purposes of this test suite, and has the advantage of being newer.

Note that fedora-toolbox:34 is also old enough to test that the toolbox binary runs against its build-time ABI from the host, and not the Toolbx container's ABI, when it's invoked as the entry point of the container [1,2]. This is important because the subsequent commit will add a test to ensure that.

[1] Commit 6063eb2
    containers#821
[2] Commit 6ad9c63
    containers#529

containers#1187
Commit ae43560 added a test with a similar intention. When the test suite is run on a Fedora Rawhide host, it tests whether the containers for the two previous stable Fedora releases start or not. Fedora N-2 reaches End of Life 4 weeks after Fedora N is released [1]. So, testing the containers for Fedora Rawhide and the two previous stable releases on a Fedora Rawhide host is a decent test of general backwards compatibility.

However, as seen recently [2], this isn't enough to catch some known ABI compatibility issues [3,4]. These involve toolbox binaries built on hosts with newer toolchains that aren't meant to be run against containers with older runtimes. A targeted test is needed to defend against these scenarios.

The fedora-toolbox:34 image has glibc-2.33, which is old enough to be unable to run binaries compiled on Fedora 35 with glibc-2.34 and newer.

[1] https://docs.fedoraproject.org/en-US/releases/
[2] containers#1180
[3] Commit 6063eb2
    containers#821
[4] Commit 6ad9c63
    containers#529

containers#1187
On enterprise FreeIPA set-ups, the subordinate user and group IDs are provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS) functionality of the GNU C Library. They are not listed in /etc/subuid and /etc/subgid.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in github.com/containers/storage/pkg/idtools [1]. Unlike 'readSubid', this code considers the absence of any range (i.e., nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically load the symbols from libsubid.so, instead of linking to libsubid.so at build-time and having the dependency noted in the /usr/bin/toolbox binary. This is done because libsubid.so itself depends on several other shared libraries, and indirect dependencies can't be influenced by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3]. Hence, when the binary is used inside Toolbx containers (e.g., as the entry point), those indirect dependencies won't be picked from the host's runtime against which the binary was built. This can render the binary useless due to ABI compatibility issues. Using dlopen(3) avoids this problem, especially because libsubid.so is only used when running on the host.

Care was taken to not load and link libsubid.so twice to separately validate the subordinate ranges for the user and the group.

Sadly, there doesn't seem to be a way to close the file descriptor that's used by libsubid.so for logging. Hence, this one file descriptor (currently number 3, unless the parent process passes down others) pointing to /dev/null is leaked for the life cycle of the toolbox process.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10, which is newer than the versions shipped on RHEL 8 and Debian 10 [5], and even that newer version had some problems [6]. Therefore, support for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that older Shadow versions used to expect for logging, because from Shadow 4.9 onwards there's a separate function [4,7] to specify these. This can be changed if there are libsubid.so versions in the wild that really do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go
[2] https://man7.org/linux/man-pages/man8/ld.so.8.html
[3] Commit 6063eb2
    containers#821
[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443
[5] https://packages.debian.org/source/buster/shadow
[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465
[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
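The idea of loading a library at run time instead of recording it as a build-time dependency can be shown with Python's ctypes module, which wraps dlopen(3) and dlsym(3) on Linux. This is only a rough analogue of the CGO code described above, not a port of it. The helper names are hypothetical, and the sonames and symbol names in the usage part are assumptions based on the API versions mentioned in the commit message, so they should be checked against the installed libsubid headers.

```python
import ctypes


def load_first_available(sonames):
    """dlopen(3) the first library in sonames that can be loaded, else None."""
    for soname in sonames:
        try:
            return ctypes.CDLL(soname)
        except OSError:
            # Not installed, or not loadable in this runtime: try the next one.
            continue
    return None


def first_exported(library, symbol_names):
    """Return the first name in symbol_names that the library exports, else None."""
    for name in symbol_names:
        # Attribute access on a CDLL object performs the dlsym(3) look-up.
        if hasattr(library, name):
            return name
    return None


if __name__ == "__main__":
    # Assumed sonames for newer and older libsubid releases.
    lib = load_first_available(["libsubid.so.4", "libsubid.so.3"])
    if lib is None:
        print("libsubid is not available; the caller must cope without it")
    else:
        # Assumed names of the newer and older range look-up functions.
        print(first_exported(lib, ["subid_get_uid_ranges", "get_subuid_ranges"]))
```

Because nothing is recorded in the binary's dynamic section, a missing or incompatible library only matters on the code path that actually calls the loader, which matches the point that libsubid.so is only needed on the host.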
On enterprise FreeIPA set-ups, the subordinate user and group IDs are provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS) functionality of the GNU C Library. They are not listed in /etc/subuid and /etc/subgid. Therefore, it's necessary to use libsubid.so to check the subordinate ID ranges. The CGO interaction with libsubid.so is loosely based on 'readSubid' in github.com/containers/storage/pkg/idtools [1]. Unlike 'readSubid', this code considers the absence of any range (i.e., nRanges == 0) to be an error as well. More importantly, this code uses dlopen(3) and friends to dynamically load the symbols from libsubid.so, instead of linking to libsubid.so at build-time and having the dependency noted in the /usr/bin/toolbox binary. This is done because libsubid.so itself depends on several other shared libraries, and indirect dependencies can't be influenced by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3]. Hence, when the binary is used inside Toolbx containers (e.g., as the entry point), those indirect dependencies won't be picked from the host's runtime against which the binary was built. This can render the binary useless due to ABI compatibility issues. Using dlopen(3) avoids this problem, especially because libsubid.so is only used when running on the host. Care was taken to not load and link libsubid.so twice to separately validate the subordinate ID ranges for the user and the group. Sadly, there doesn't seem to be a way to close the file descriptor that's used by libsubid.so for logging. Hence, this one file descriptor (currently number 3, unless the parent process passes down others) pointing to /dev/null is leaked for the life cycle of the toolbox process. Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10, which is newer than the versions shipped on RHEL 8 and Debian 10 [5], and even that newer version had some problems [6]. Therefore, support for older versions, with the relevant workarounds, is necessary.
This code doesn't set the public variables Prog and shadow_logfd that older Shadow versions used to expect for logging, because from Shadow 4.9 onwards there's a separate function [4,7] to specify these. This can be changed if there are libsubid.so versions in the wild that really do need those public variables to be set. Finally, ISO C99 is required because of the use of <stdbool.h> in the libsubid.so API. Some changes by Debarshi Ray. [1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go [2] https://man7.org/linux/man-pages/man8/ld.so.8.html [3] Commit 6063eb2 containers#821 [4] Shadow commit 32f641b207f6ddff shadow-maint/shadow@32f641b207f6ddff shadow-maint/shadow#443 [5] https://packages.debian.org/source/buster/shadow [6] Shadow commit 79157cbad87f42cd shadow-maint/shadow@79157cbad87f42cd shadow-maint/shadow#465 [7] Shadow commit 2b22a6909dba60d shadow-maint/shadow@2b22a6909dba60d shadow-maint/shadow#325 containers#1074 Signed-off-by: Martin Jackson <martjack@redhat.com>
The Toolbx source code was being repeatedly re-built with 'go run ...' to generate each of the shell completions. A following commit will use CGO to link to libsubid.so, which will sufficiently complicate the build that a simple 'go run ...' won't be enough. Hence, it's better to use the same mechanisms in src/go-build-wrapper that are used to generate the Toolbx binary. However, the main Toolbx binary only works if /run/host exists [1], which is only created after 'meson install' and may not exist during 'meson compile'. This is solved by using a throwaway binary that lacks the ABI protections necessary to run as a container's entry, but is enough to generate the shell completions. Fallout from bafbbe8 [1] Commit 6063eb2 containers#821 containers#1074
On enterprise FreeIPA set-ups, the subordinate user and group IDs are provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS) functionality of the GNU C Library. They are not listed in /etc/subuid and /etc/subgid. Therefore, it's necessary to use libsubid.so to check the subordinate ID ranges. The CGO interaction with libsubid.so is loosely based on 'readSubid' in github.com/containers/storage/pkg/idtools [1]. However, unlike 'readSubid', this code considers the absence of any range (i.e., nRanges == 0) to be an error as well. More importantly, this code uses dlopen(3) and friends to dynamically load the symbols from libsubid.so, instead of linking to libsubid.so at build-time and having the dependency noted in the /usr/bin/toolbox binary. This is done because libsubid.so itself depends on several other shared libraries, and indirect dependencies can't be influenced by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3]. Hence, when the binary is used inside Toolbx containers (e.g., as the entry point), those indirect dependencies won't be picked from the host's runtime against which the binary was built. This can render the binary useless due to ABI compatibility issues. Using dlopen(3) avoids this problem, especially because libsubid.so is only used when running on the host. Care was taken to not load and link libsubid.so twice to separately validate the subordinate ID ranges for the user and the group. Note that libsubid_init() must be passed a FILE pointer for logging. Otherwise, it will create its own for logging, and there's no way to close it during dlclose(3). Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10, which is newer than the versions shipped on RHEL 8 and Debian 10 [5], and even that newer version had some problems [6]. Therefore, support for older versions, with the relevant workarounds, is necessary.
Fortunately, the oldest version that needs to be supported is Shadow 4.9 because that's when libsubid.so was introduced [7]. Note that SUBID_ABI_VERSION was only introduced with version 4 of the libsubid.so API/ABI released in Shadow 4.10 [8]. The first release of libsubid.so in Shadow 4.9 already had an ABI version of 3.0.0 [9], since it was bumped a few times during development, so that's what's assumed when SUBID_ABI_VERSION is absent. This code doesn't set the public variables Prog and shadow_logfd that older Shadow versions used to expect for logging, because from Shadow 4.9 onwards there's a separate function [4,10] to specify these. This can be changed if there are libsubid.so versions in the wild that really do need those public variables to be set. Finally, ISO C99 is required because of the use of <stdbool.h> in the libsubid.so API. Some changes by Debarshi Ray. [1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go [2] https://man7.org/linux/man-pages/man8/ld.so.8.html [3] Commit 6063eb2 containers#821 [4] Shadow commit 32f641b207f6ddff shadow-maint/shadow@32f641b207f6ddff shadow-maint/shadow#443 [5] https://packages.debian.org/source/buster/shadow [6] Shadow commit 79157cbad87f42cd shadow-maint/shadow@79157cbad87f42cd shadow-maint/shadow#465 [7] Shadow commit 0a7888b1fad613a0 shadow-maint/shadow@0a7888b1fad613a0 shadow-maint/shadow#154 [8] Shadow commit 0c9f64140852e8d5 shadow-maint/shadow@0c9f64140852e8d5 shadow-maint/shadow#449 [9] Shadow commit 3d670ba7ed58f910 shadow-maint/shadow@3d670ba7ed58f910 shadow-maint/shadow#339 [10] Shadow commit 2b22a6909dba60d shadow-maint/shadow@2b22a6909dba60d shadow-maint/shadow#325 containers#1074 Signed-off-by: Martin Jackson <martjack@redhat.com>
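For readers unfamiliar with the dlopen(3) approach described in the commit message above, here is a minimal C sketch of the idea. It is not the Toolbx implementation. The function and type names prefixed with sketch_ are hypothetical, the struct is redeclared locally so that the libsubid headers aren't needed, and the symbol names subid_init, subid_get_uid_ranges and subid_get_gid_ranges are assumed to be those of version 4 of the libsubid.so API (the Shadow 4.9 release is understood to use different names, such as the libsubid_init() mentioned above). The library is opened once, both lookups share the same handle, an explicit FILE pointer is passed for logging, and zero ranges are treated as an error.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Local stand-in for the range struct in the libsubid header (assumed layout). */
struct sketch_range {
    unsigned long start;
    unsigned long count;
};

/* Assumed signatures of the version 4 libsubid.so entry points. */
typedef bool (*sketch_init_fn)(const char *progname, FILE *logfd);
typedef int (*sketch_get_ranges_fn)(const char *owner, struct sketch_range **ranges);

/* Returns 0 if 'owner' has at least one subordinate UID range and at least
 * one subordinate GID range, and -1 on any failure, including zero ranges. */
int sketch_check_subid_ranges(const char *soname, const char *owner)
{
    sketch_init_fn init = NULL;
    sketch_get_ranges_fn get_uid = NULL;
    sketch_get_ranges_fn get_gid = NULL;
    struct sketch_range *ranges = NULL;
    int n;
    int ret = -1;

    assert(soname != NULL && owner != NULL);

    /* Load the library at run time, so the binary itself has no
     * link-time dependency on it. */
    void *handle = dlopen(soname, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "failed to load %s: %s\n", soname, dlerror());
        return -1;
    }

    /* Resolve every symbol from the same handle: one load, two checks. */
    *(void **) (&init) = dlsym(handle, "subid_init");
    *(void **) (&get_uid) = dlsym(handle, "subid_get_uid_ranges");
    *(void **) (&get_gid) = dlsym(handle, "subid_get_gid_ranges");
    if (init == NULL || get_uid == NULL || get_gid == NULL)
        goto out;

    /* Pass our own FILE pointer so the library doesn't open one itself. */
    if (!init("sketch", stderr))
        goto out;

    n = get_uid(owner, &ranges);
    free(ranges);
    ranges = NULL;
    if (n <= 0) /* a negative count is a failure; zero ranges is an error too */
        goto out;

    n = get_gid(owner, &ranges);
    free(ranges);
    if (n <= 0)
        goto out;

    ret = 0;

out:
    dlclose(handle);
    return ret;
}
```

A caller would invoke something like sketch_check_subid_ranges("libsubid.so.4", "alice"), where both the soname and the user name are placeholders. On glibc versions older than 2.34 the program may also need to be linked with -ldl.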
Hello
I am using Fedora Silverblue Rawhide 35, and after updating toolbox to version 0.0.99.2 I cannot start any container, whether it was created before or after the upgrade, because of this error: invalid entry point PID of container
I am attaching the verbose logs:
Please help me resolve the issue.