Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/usr/bin/toolbox linked against glibc-2.34 doesn't run on older glibc #821

Closed
kuba3351 opened this issue Jun 30, 2021 · 17 comments
Closed
Labels
1. Regression Something worked in a previous release but does not now 2. Container Realm The issue is related to what happens inside of a toolbox container

Comments

@kuba3351
Copy link
Contributor

kuba3351 commented Jun 30, 2021

Hello

I am using an Fedora Silverblur Rawhide 35 and after toolbox update to version 0.0.99.2 I cannot start any container created before or after the upgrade, because of error: invalid entry point PID of container

I am attaching a logs with verbose:

[jakub@NBJSIERZEGA sbpwsdlvalidator]$ toolbox enter -vvv
DEBU Running as real user ID 1000                 
DEBU Resolved absolute path to the executable as /usr/bin/toolbox 
DEBU Running on a cgroups v2 host                 
DEBU Checking if /etc/subgid and /etc/subuid have entries for user jakub 
DEBU Validating sub-ID file /etc/subuid           
DEBU Validating sub-ID file /etc/subgid           
DEBU TOOLBOX_PATH is /usr/bin/toolbox             
DEBU Migrating to newer Podman                    
DEBU Toolbox config directory is /var/home/jakub/.config/toolbox 
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called version.PersistentPreRunE(podman --log-level debug version --format json) 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Merged system config "/usr/share/containers/containers.conf" 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/jakub/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/jakub/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/jakub/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/jakub/.local/share/containers/storage/volumes 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Found CNI network podman (type=bridge) at /var/home/jakub/.config/cni/net.d/87-podman.conflist 
DEBU[0000] Default CNI network name podman is unchangeable 
INFO[0000] Setting parallel job count to 25             
DEBU[0000] Called version.PersistentPostRunE(podman --log-level debug version --format json) 
DEBU Current Podman version is 3.3.0-dev          
DEBU Creating runtime directory /run/user/1000/toolbox 
DEBU Old Podman version is 3.3.0-dev              
DEBU Migration not needed: Podman version 3.3.0-dev is unchanged 
DEBU Resolving container and image names          
DEBU Container: ''                                
DEBU Distribution: ''                             
DEBU Image: ''                                    
DEBU Release: ''                                  
DEBU Resolved container and image names           
DEBU Container: 'fedora-toolbox-35'               
DEBU Image: 'fedora-toolbox:35'                   
DEBU Release: '35'                                
DEBU Checking if container fedora-toolbox-35 exists 
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called exists.PersistentPreRunE(podman --log-level debug container exists fedora-toolbox-35) 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Merged system config "/usr/share/containers/containers.conf" 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/jakub/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/jakub/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/jakub/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/jakub/.local/share/containers/storage/volumes 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Found CNI network podman (type=bridge) at /var/home/jakub/.config/cni/net.d/87-podman.conflist 
DEBU[0000] Default CNI network name podman is unchangeable 
INFO[0000] Setting parallel job count to 25             
DEBU[0000] Called exists.PersistentPostRunE(podman --log-level debug container exists fedora-toolbox-35) 
DEBU Inspecting mounts of container fedora-toolbox-35 
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called inspect.PersistentPreRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-35) 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Merged system config "/usr/share/containers/containers.conf" 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/jakub/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/jakub/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/jakub/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/jakub/.local/share/containers/storage/volumes 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Found CNI network podman (type=bridge) at /var/home/jakub/.config/cni/net.d/87-podman.conflist 
DEBU[0000] Default CNI network name podman is unchangeable 
INFO[0000] Setting parallel job count to 25             
DEBU[0000] Called inspect.PersistentPostRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-35) 
DEBU Starting container fedora-toolbox-35         
DEBU Inspecting entry point of container fedora-toolbox-35 
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called inspect.PersistentPreRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-35) 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Merged system config "/usr/share/containers/containers.conf" 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/home/jakub/.local/share/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/home/jakub/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /var/home/jakub/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /var/home/jakub/.local/share/containers/storage/volumes 
DEBU[0000] overlay storage already configured with a mount-program 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] overlay: mount_program=/usr/bin/fuse-overlayfs 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Found CNI network podman (type=bridge) at /var/home/jakub/.config/cni/net.d/87-podman.conflist 
DEBU[0000] Default CNI network name podman is unchangeable 
INFO[0000] Setting parallel job count to 25             
DEBU[0000] Called inspect.PersistentPostRunE(podman --log-level debug inspect --format json --type container fedora-toolbox-35) 
DEBU Entry point PID is a float64                 
DEBU Entry point of container fedora-toolbox-35 is toolbox (PID=0) 
Error: invalid entry point PID of container fedora-toolbox-35

Please help for resolving the issue.

@HarryMichal HarryMichal added 1. Regression Something worked in a previous release but does not now 2. Container Realm The issue is related to what happens inside of a toolbox container labels Jun 30, 2021
@HarryMichal
Copy link
Member

Hi @kuba3351, thank you for opening this ticket. Could you, please, provide the output of podman start --attach fedora-toolbox-35?

@kuba3351
Copy link
Contributor Author

Yes, I have got another error:

[jakub@NBJSIERZEGA ~]$ podman start --attach fedora-toolbox-35
toolbox: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by toolbox)

@kuba3351
Copy link
Contributor Author

kuba3351 commented Jun 30, 2021

Ok, I figured out how to partially workaround this issue.

I found that toolbox binary on my host machine was upgraded and built with never GLIBC, and it is mounted into the container with older GLIBC. So, I run the following commands:

podman run -it --name temporary registry.fedoraproject.org/fedora-toolbox:35 /bin/bash
sudo dnf update
exit
podman commit temporary registry.fedoraproject.org/fedora-toolbox:35

After that I can create working toolboxes based on fedora 35 but older toolboxes still not work. Any idea what to deal with it?

@kuba3351
Copy link
Contributor Author

Created PR with proposal solution for this issue (#822)

@debarshiray
Copy link
Member

This is basically a duplicate of #529

We need to update src/libc-wrappers/libc-wrappers.c for the new GNU C Library release.

olivergs added a commit to olivergs/toolbox that referenced this issue Jul 5, 2021
olivergs added a commit to olivergs/toolbox that referenced this issue Jul 5, 2021
olivergs added a commit to olivergs/toolbox that referenced this issue Jul 5, 2021
@travier
Copy link
Member

travier commented Jul 7, 2021

Do you have a doc somewhere explaining why we need to have toolbox as the entrypoint of the container? Thanks!

@travier
Copy link
Member

travier commented Jul 7, 2021

It might make sense to create a separate, fully statically linked binary to use inside the toolbox for setup operations. This would thus be completely independent of the version of the glibc available inside the container. Some container images may not even have glibc installed at all.

@HarryMichal
Copy link
Member

Do you have a doc somewhere explaining why we need to have toolbox as the entrypoint of the container? Thanks!

There's a paragraph in toolbox-init-container(1) that touches the topic:

OCI containers are inherently immutable. Configuration options passed through podman create are baked into the definition of the OCI container, and can't be changed later. This means that changes and improvements made in newer versions of Toolbox can't be applied to pre-existing toolbox containers created by older versions of Toolbox. This is avoided by using the entry point to configure the container at runtime.

@HarryMichal
Copy link
Member

It might make sense to create a separate, fully statically linked binary to use inside the toolbox for setup operations. This would thus be completely independent of the version of the glibc available inside the container. Some container images may not even have glibc installed at all.

Agreed. Let's discuss it more after we release v0.1.

@travier
Copy link
Member

travier commented Jul 7, 2021

Filed as #832 for tracking.

HarryMichal added a commit to HarryMichal/toolbox that referenced this issue Jul 8, 2021
An upgrade of glibc has caused an issue on Fedora Rawhide[0]. We need a
clear indicator that a change in glibc could cause it.

[0] containers#821
HarryMichal added a commit to HarryMichal/toolbox that referenced this issue Jul 8, 2021
An upgrade of glibc has caused an issue on Fedora Rawhide[0]. We need a
clear indicator that a change in glibc could cause it.

[0] containers#821

containers#834
HarryMichal added a commit that referenced this issue Jul 8, 2021
An upgrade of glibc has caused an issue on Fedora Rawhide[0]. We need a
clear indicator that a change in glibc could cause it.

[0] #821

#834
@nanonyme
Copy link
Contributor

In the meantime, would it help if we had an image that actually contained the newer glibc?

@juhp
Copy link
Contributor

juhp commented Sep 9, 2021

There is a new candidate-registry.fedoraproject.org/fedora-toolbox:35 image now and also fedora-toolbox:36 finally.

@cheese
Copy link

cheese commented Sep 11, 2021

https://bugzilla.redhat.com/show_bug.cgi?id=1995439 is proposed as an F35 blocker.

debarshiray added a commit to debarshiray/toolbox that referenced this issue Oct 21, 2021
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.

In the past, this worked perfectly well with the POSIX shell
implementation because it got intepreted by whichever /bin/sh was
available. However, the Go implementation, can run into ABI
compatibility issues because binaries built on newer toolchains aren't
meant to be run against older runtimes.

The previous approach [1] of restricting the versions of the glibc
symbols that are linked against isn't actually supported by glibc, and
breaks if the early process start-up code changes. This is seen in
glibc-2.34, which is used by Fedora 35 onwards, where a new version of
the __libc_start_main symbol [2] was added as part of some security
hardening:
  $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    __libc_start_main
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_detach
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_create
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on
newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the
toolbox container and ensure that the binary always runs against it.

Luckily, almost all supported containers have the host's /usr available
at /run/host/usr. This is exploited by embedding RPATHs or RUNPATHs to
/run/host/usr/lib and /run/host/usr/lib64 in the binary, and changing
the path of the dynamic linker (ie., PT_INTERP) to the one inside
/run/host.

Unfortunately, there can only be one PT_INTERP entry inside the
binary, so there must be a /run/host on the host too. Therefore, a
/run/host symbolic link is created on the host that points to the
host's /.

[1] Commit 6ad9c63
    containers#534

[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
debarshiray added a commit to debarshiray/toolbox that referenced this issue Oct 21, 2021
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.

In the past, this worked perfectly well with the POSIX shell
implementation because it got intepreted by whichever /bin/sh was
available. However, the Go implementation, can run into ABI
compatibility issues because binaries built on newer toolchains aren't
meant to be run against older runtimes.

The previous approach [1] of restricting the versions of the glibc
symbols that are linked against isn't actually supported by glibc, and
breaks if the early process start-up code changes. This is seen in
glibc-2.34, which is used by Fedora 35 onwards, where a new version of
the __libc_start_main symbol [2] was added as part of some security
hardening:
  $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    __libc_start_main
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_detach
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_create
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on
newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the
toolbox container and ensure that the binary always runs against it.

Luckily, almost all supported containers have the host's /usr available
at /run/host/usr. This is exploited by embedding RPATHs or RUNPATHs to
/run/host/usr/lib and /run/host/usr/lib64 in the binary, and changing
the path of the dynamic linker (ie., PT_INTERP) to the one inside
/run/host.

Unfortunately, there can only be one PT_INTERP entry inside the
binary, so there must be a /run/host on the host too. Therefore, a
/run/host symbolic link is created on the host that points to the
host's /.

[1] Commit 6ad9c63
    containers#534

[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
debarshiray added a commit to debarshiray/toolbox that referenced this issue Oct 21, 2021
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.

In the past, this worked perfectly well with the POSIX shell
implementation because it got intepreted by whichever /bin/sh was
available. However, the Go implementation, can run into ABI
compatibility issues because binaries built on newer toolchains aren't
meant to be run against older runtimes.

The previous approach [1] of restricting the versions of the glibc
symbols that are linked against isn't actually supported by glibc, and
breaks if the early process start-up code changes. This is seen in
glibc-2.34, which is used by Fedora 35 onwards, where a new version of
the __libc_start_main symbol [2] was added as part of some security
hardening:
  $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    __libc_start_main
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_detach
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_create
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on
newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the
toolbox container and ensure that the binary always runs against it.

Luckily, almost all supported containers have the host's /usr available
at /run/host/usr. This is exploited by embedding RPATHs or RUNPATHs to
/run/host/usr/lib and /run/host/usr/lib64 in the binary, and changing
the path of the dynamic linker (ie., PT_INTERP) to the one inside
/run/host.

Unfortunately, there can only be one PT_INTERP entry inside the
binary, so there must be a /run/host on the host too. Therefore, a
/run/host symbolic link is created on the host that points to the
host's /.

[1] Commit 6ad9c63
    containers#534

[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
HarryMichal pushed a commit to HarryMichal/toolbox that referenced this issue Oct 21, 2021
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.

In the past, this worked perfectly well with the POSIX shell
implementation because it got intepreted by whichever /bin/sh was
available. However, the Go implementation, can run into ABI
compatibility issues because binaries built on newer toolchains aren't
meant to be run against older runtimes.

The previous approach [1] of restricting the versions of the glibc
symbols that are linked against isn't actually supported by glibc, and
breaks if the early process start-up code changes. This is seen in
glibc-2.34, which is used by Fedora 35 onwards, where a new version of
the __libc_start_main symbol [2] was added as part of some security
hardening:
  $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    __libc_start_main
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_detach
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_create
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on
newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the
toolbox container and ensure that the binary always runs against it.

Luckily, almost all supported containers have the host's /usr available
at /run/host/usr. This is exploited by embedding RPATHs or RUNPATHs to
/run/host/usr/lib and /run/host/usr/lib64 in the binary, and changing
the path of the dynamic linker (ie., PT_INTERP) to the one inside
/run/host.

Unfortunately, there can only be one PT_INTERP entry inside the
binary, so there must be a /run/host on the host too. Therefore, a
/run/host symbolic link is created on the host that points to the
host's /.

[1] Commit 6ad9c63
    containers#534

[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
debarshiray added a commit to debarshiray/toolbox that referenced this issue Oct 21, 2021
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.

In the past, this worked perfectly well with the POSIX shell
implementation because it got intepreted by whichever /bin/sh was
available. However, the Go implementation, can run into ABI
compatibility issues because binaries built on newer toolchains aren't
meant to be run against older runtimes.

The previous approach [1] of restricting the versions of the glibc
symbols that are linked against isn't actually supported by glibc, and
breaks if the early process start-up code changes. This is seen in
glibc-2.34, which is used by Fedora 35 onwards, where a new version of
the __libc_start_main symbol [2] was added as part of some security
hardening:
  $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    __libc_start_main
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_detach
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_create
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on
newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the
toolbox container and ensure that the binary always runs against it.

Luckily, almost all supported containers have the host's /usr available
at /run/host/usr. This is exploited by embedding RPATHs or RUNPATHs to
/run/host/usr/lib and /run/host/usr/lib64 in the binary, and changing
the path of the dynamic linker (ie., PT_INTERP) to the one inside
/run/host.

Unfortunately, there can only be one PT_INTERP entry inside the
binary, so there must be a /run/host on the host too. Therefore, a
/run/host symbolic link is created on the host that points to the
host's /.

Based on ideas from Alexander Larsson and Ray Strode.

[1] Commit 6ad9c63
    containers#534

[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
debarshiray added a commit to debarshiray/toolbox that referenced this issue Oct 21, 2021
The /usr/bin/toolbox binary is not only used to interact with toolbox
containers and images from the host. It's also used as the entry point
of the containers by bind mounting the binary from the host into the
container. This means that the /usr/bin/toolbox binary on the host must
also work inside the container, even if they have different operating
systems.

In the past, this worked perfectly well with the POSIX shell
implementation because it got intepreted by whichever /bin/sh was
available. However, the Go implementation, can run into ABI
compatibility issues because binaries built on newer toolchains aren't
meant to be run against older runtimes.

The previous approach [1] of restricting the versions of the glibc
symbols that are linked against isn't actually supported by glibc, and
breaks if the early process start-up code changes. This is seen in
glibc-2.34, which is used by Fedora 35 onwards, where a new version of
the __libc_start_main symbol [2] was added as part of some security
hardening:
  $ objdump -T ./usr/bin/toolbox | grep GLIBC_2.34
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    __libc_start_main
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_detach
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_create
  0000000000000000      DF *UND*	0000000000000000  GLIBC_2.34
    pthread_attr_getstacksize

This means that /usr/bin/toolbox binaries built against glibc-2.34 on
newer Fedoras fail to run against older glibcs in older Fedoras.

Another option is to make the host's runtime available inside the
toolbox container and ensure that the binary always runs against it.

Luckily, almost all supported containers have the host's /usr available
at /run/host/usr. This is exploited by embedding RPATHs or RUNPATHs to
/run/host/usr/lib and /run/host/usr/lib64 in the binary, and changing
the path of the dynamic linker (ie., PT_INTERP) to the one inside
/run/host.

Unfortunately, there can only be one PT_INTERP entry inside the
binary, so there must be a /run/host on the host too. Therefore, a
/run/host symbolic link is created on the host that points to the
host's /.

Based on ideas from Alexander Larsson and Ray Strode.

[1] Commit 6ad9c63
    containers#534

[2] glibc commit 035c012e32c11e84
    https://sourceware.org/git/?p=glibc.git;a=commit;h=035c012e32c11e84
    https://sourceware.org/bugzilla/show_bug.cgi?id=23323

containers#821
@debarshiray
Copy link
Member

This should be fixed by #897

debarshiray added a commit to debarshiray/toolbox that referenced this issue Oct 25, 2021
The path of the dynamic linker (ie., PT_INTERP), as specified in an
architecture's ABI, often starts with /lib or /lib64, not /usr/lib or
/usr/lib64. eg., it's /lib/ld-linux-aarch64.so.1 for aarch64 and
/lib64/ld-linux-x86-64.so.2 for x86_64.

Unfortunately, until very recently [1], only the host's /usr was
present inside a toolbox container's /run/host, not /lib or /lib64.
Therefore, simply prepending /run/host to the /usr/bin/toolbox
binary's existing PT_INTERP entry wouldn't locate the host's dynamic
linker inside the toolbox container. This broke backwards compatibility
with every container out there, except the ones created with the
current development version in Git.

To restore backwards compatibility, the /lib and /lib64 symbolic links
must be resolved to their respective locations inside /usr.

The following caveats must be noted:

  * With glibc, even the basename of the path of the dynamic linker as
    specified in an architecture's ABI, is a symbolic link to a file
    named ld-<glibc-version>.so. However, this file can't be used as
    the PT_INTERP entry, because its name will change when glibc is
    updated and the PT_INTERP entry will become invalid until the
    /usr/bin/toolbox binary is rebuilt.

  * On Debian, a path like /lib64/ld-linux-x86-64.so.2 doesn't resolve
    to something inside /usr/lib64. Instead it ends up inside
    /usr/lib/x86_64-linux-gnu through a series of symbolic links:
      - /lib64 -> usr/lib64
      - /usr/lib64/ld-linux-x86-64.so.2
          -> /lib/x86_64-linux-gnu/ld-2.28.so
      - /lib -> usr/lib

  * It's assumed that a symbolic link with the basename specified in
    the ABI lives in the same directory as the actual dynamic linker
    binary named ld-<glibc-version>.so.

Fallout from 6063eb2

[1] Commit d03a5fe
    containers#827

containers#821
debarshiray added a commit to debarshiray/toolbox that referenced this issue Oct 25, 2021
The path of the dynamic linker (ie., PT_INTERP), as specified in an
architecture's ABI, often starts with /lib or /lib64, not /usr/lib or
/usr/lib64. eg., it's /lib/ld-linux-aarch64.so.1 for aarch64 and
/lib64/ld-linux-x86-64.so.2 for x86_64.

Unfortunately, until very recently [1], only the host's /usr was
present inside a toolbox container's /run/host, not /lib or /lib64.
Therefore, simply prepending /run/host to the /usr/bin/toolbox
binary's existing PT_INTERP entry wouldn't locate the host's dynamic
linker inside the toolbox container. This broke backwards compatibility
with every container out there, except the ones created with the
current development version in Git.

To restore backwards compatibility, the /lib and /lib64 symbolic links
must be resolved to their respective locations inside /usr.

The following caveats must be noted:

  * With glibc, even the basename of the path of the dynamic linker as
    specified in an architecture's ABI, is a symbolic link to a file
    named ld-<glibc-version>.so. However, this file can't be used as
    the PT_INTERP entry, because its name will change when glibc is
    updated and the PT_INTERP entry will become invalid until the
    /usr/bin/toolbox binary is rebuilt.

  * On Debian, a path like /lib64/ld-linux-x86-64.so.2 doesn't resolve
    to something inside /usr/lib64. Instead it ends up inside
    /usr/lib/x86_64-linux-gnu through a series of symbolic links:
      - /lib64 -> usr/lib64
      - /usr/lib64/ld-linux-x86-64.so.2
          -> /lib/x86_64-linux-gnu/ld-2.28.so
      - /lib -> usr/lib

  * It's assumed that a symbolic link with the basename specified in
    the ABI lives in the same directory as the actual dynamic linker
    binary named ld-<glibc-version>.so.

Fallout from 6063eb2

[1] Commit d03a5fe
    containers#827

containers#821
debarshiray added a commit to debarshiray/toolbox that referenced this issue Dec 5, 2022
Fedora 32 reached End of Life on 25th May 2021:
https://docs.fedoraproject.org/en-US/releases/eol/

That's quite old because right now Fedora 35 is nearing its End of Life.

Since the tests are intended for Toolbx, not the Fedora infrastructure,
it will be better to use something newer, because images that are too
old can get lost from registry.fedoraproject.org.  The fedora-toolbox:34
image can be a drop-in replacement for the fedora-toolbox:32 image for
the purposes of this test suite, and has the advantage of being newer.

Note that fedora-toolbox:34 is also old enough to test that the toolbox
binary runs against it's build-time ABI (ie., the host's ABI), and not
the Toolbx container's ABI, when it's invoked as the entry point of the
container [1].  This is important because the subsequent commit will add
a test to ensure that.

[1] Commit 6063eb2
    containers#821
debarshiray added a commit to debarshiray/toolbox that referenced this issue Dec 7, 2022
Fedora 32 reached End of Life on 25th May 2021:
https://docs.fedoraproject.org/en-US/releases/eol/

That's quite old because right now Fedora 35 is nearing its End of Life.

Since the tests are intended for Toolbx, not the Fedora infrastructure,
it will be better to use a newer image, because images that are too old
can get lost from registry.fedoraproject.org.  The fedora-toolbox:34
image can be a drop-in replacement for the fedora-toolbox:32 image for
the purposes of this test suite, and has the advantage of being newer.

Note that fedora-toolbox:34 is also old enough to test that the toolbox
binary runs against it's build-time ABI from the host, and not the
Toolbx container's ABI, when it's invoked as the entry point of the
container [1,2].  This is important because the subsequent commit will
add a test to ensure that.

[1] Commit 6063eb2
    containers#821

[2] Commit 6ad9c63
    containers#529

containers#1187
debarshiray added a commit to debarshiray/toolbox that referenced this issue Dec 7, 2022
Commit ae43560 had added a test with a similar intention.  When
the test suite is run on a Fedora Rawhide host, it tests whether the
containers for the two previous stable Fedora releases start or not.
Fedora N-2 reaches End of Life four weeks after Fedora N is released.
So, testing the containers for Fedora Rawhide and the two previous
stable releases on a Fedora Rawhide host is a decent test of general
backwards compatibility.

However, as seen recently [1], this isn't enough to catch some known
ABI compatibility issues [2,3].  These involve toolbox binaries built
on hosts with newer toolchains that aren't meant to be run against
containers with older runtimes.  A targeted test is needed to defend
against these scenarios.

The fedora-toolbox:34 image has glibc-2.33, which is old enough to be
unable to run binaries compiled on Fedora 35 with glibc-2.34 and newer.

[1] containers#1180

[2] Commit 6063eb2
    containers#821

[3] Commit 6ad9c63
    containers#529

https://docs.fedoraproject.org/en-US/releases/

containers#1187
debarshiray added a commit to debarshiray/toolbox that referenced this issue Dec 7, 2022
Commit ae43560 had added a test with a similar intention.  When
the test suite is run on a Fedora Rawhide host, it tests whether the
containers for the two previous stable Fedora releases start or not.
Fedora N-2 reaches End of Life four weeks after Fedora N is released.
So, testing the containers for Fedora Rawhide and the two previous
stable releases on a Fedora Rawhide host is a decent test of general
backwards compatibility.

However, as seen recently [1], this isn't enough to catch some known
ABI compatibility issues [2,3].  These involve toolbox binaries built
on hosts with newer toolchains that aren't meant to be run against
containers with older runtimes.  A targeted test is needed to defend
against these scenarios.

The fedora-toolbox:34 image has glibc-2.33, which is old enough to be
unable to run binaries compiled on Fedora 35 with glibc-2.34 and newer.

[1] containers#1180

[2] Commit 6063eb2
    containers#821

[3] Commit 6ad9c63
    containers#529

https://docs.fedoraproject.org/en-US/releases/

containers#1187
debarshiray added a commit to debarshiray/toolbox that referenced this issue Dec 7, 2022
Commit ae43560 had added a test with a similar intention.  When
the test suite is run on a Fedora Rawhide host, it tests whether the
containers for the two previous stable Fedora releases start or not.
Fedora N-2 reaches End of Life 4 weeks after Fedora N is released [1].
So, testing the containers for Fedora Rawhide and the two previous
stable releases on a Fedora Rawhide host is a decent test of general
backwards compatibility.

However, as seen recently [2], this isn't enough to catch some known
ABI compatibility issues [3,4].  These involve toolbox binaries built
on hosts with newer toolchains that aren't meant to be run against
containers with older runtimes.  A targeted test is needed to defend
against these scenarios.

The fedora-toolbox:34 image has glibc-2.33, which is old enough to be
unable to run binaries compiled on Fedora 35 with glibc-2.34 and newer.

[1] https://docs.fedoraproject.org/en-US/releases/

[2] containers#1180

[3] Commit 6063eb2
    containers#821

[4] Commit 6ad9c63
    containers#529

containers#1187
debarshiray pushed a commit to mhjacks/toolbox that referenced this issue Dec 15, 2022
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

Unlike 'readSubid', this code considers the absence of any range (ie.,
nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,7] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
debarshiray pushed a commit to mhjacks/toolbox that referenced this issue Dec 16, 2022
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

Unlike 'readSubid', this code considers the absence of any range (ie.,
nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,7] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
debarshiray pushed a commit to mhjacks/toolbox that referenced this issue Dec 20, 2022
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

Unlike 'readSubid', this code considers the absence of any range (ie.,
nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Care was taken to not load and link libsubid.so twice to separately
validate the subordinate ranges for the user and the group.  Sadly,
there doesn't seem to be a way to close the file descriptor that's used
by libsubid.so for logging.  Hence, this one file descriptor (currently
number 3, unless the parent process passes down others) pointing to
/dev/null is leaked for the life cycle of the toolbox process.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,7] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
debarshiray pushed a commit to mhjacks/toolbox that referenced this issue Jan 26, 2023
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

Unlike 'readSubid', this code considers the absence of any range (ie.,
nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Care was taken to not load and link libsubid.so twice to separately
validate the subordinate ranges for the user and the group.  Sadly,
there doesn't seem to be a way to close the file descriptor that's used
by libsubid.so for logging.  Hence, this one file descriptor (currently
number 3, unless the parent process passes down others) pointing to
/dev/null is leaked for the life cycle of the toolbox process.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,7] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
debarshiray pushed a commit to mhjacks/toolbox that referenced this issue Jan 26, 2023
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.  Therefore, its necessary to use libsubid.so to check
the subordinate ID ranges.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

Unlike 'readSubid', this code considers the absence of any range (ie.,
nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Care was taken to not load and link libsubid.so twice to separately
validate the subordinate ID ranges for the user and the group.  Sadly,
there doesn't seem to be a way to close the file descriptor that's used
by libsubid.so for logging.  Hence, this one file descriptor (currently
number 3, unless the parent process passes down others) pointing to
/dev/null is leaked for the life cycle of the toolbox process.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,7] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
debarshiray pushed a commit to mhjacks/toolbox that referenced this issue Jan 26, 2023
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.  Therefore, its necessary to use libsubid.so to check
the subordinate ID ranges.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

Unlike 'readSubid', this code considers the absence of any range (ie.,
nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Care was taken to not load and link libsubid.so twice to separately
validate the subordinate ID ranges for the user and the group.  Sadly,
there doesn't seem to be a way to close the file descriptor that's used
by libsubid.so for logging.  Hence, this one file descriptor (currently
number 3, unless the parent process passes down others) pointing to
/dev/null is leaked for the life cycle of the toolbox process.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,7] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
debarshiray added a commit to mhjacks/toolbox that referenced this issue Jan 27, 2023
The Toolbx source code was being repeatedly re-built with 'go run ...'
to generate each of the shell completions.  A following commit will use
CGO to link to libsubid.so, which will sufficiently complicate the build
that a simple 'go run ...' won't be enough.

Hence, it's better to use the same mechanisms in src/go-build-wrapper
that are used to generate the Toolbx binary.

However, the main Toolbx binary only works if /run/host exists [1],
which is only created after 'meson install' and may not exist during
'meson compile'.  This is solved by using a throwaway binary that lacks
the ABI protections necessary to run as a container's entry, but is
enough to generate the shell completions.

Fallout from bafbbe8

[1] Commit 6063eb2
    containers#821

containers#1074
debarshiray pushed a commit to mhjacks/toolbox that referenced this issue Jan 27, 2023
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.  Therefore, its necessary to use libsubid.so to check
the subordinate ID ranges.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

Unlike 'readSubid', this code considers the absence of any range (ie.,
nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Care was taken to not load and link libsubid.so twice to separately
validate the subordinate ID ranges for the user and the group.  Sadly,
there doesn't seem to be a way to close the file descriptor that's used
by libsubid.so for logging.  Hence, this one file descriptor (currently
number 3, unless the parent process passes down others) pointing to
/dev/null is leaked for the life cycle of the toolbox process.

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,7] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 2b22a6909dba60d
    shadow-maint/shadow@2b22a6909dba60d
    shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
debarshiray pushed a commit to debarshiray/toolbox that referenced this issue Jan 28, 2023
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library.  They are not listed in /etc/subuid
and /etc/subgid.  Therefore, its necessary to use libsubid.so to check
the subordinate ID ranges.

The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].

However, unlike 'readSubid', this code considers the absence of any
range (ie., nRanges == 0) to be an error as well.

More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary.  This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3].  Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built.  This can render the binary
useless due to ABI compatibility issues.  Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.

Care was taken to not load and link libsubid.so twice to separately
validate the subordinate ID ranges for the user and the group.  Note
that libsubid_init() must be passed a FILE pointer for logging.
Otherwise, it will create it's own for logging, and there's no way to
close it during dlclose(3).

Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6].  Therefore, support
for older versions, with the relevant workarounds, is necessary.
Fortunately, the oldest that needs to be support is Shadow 4.9 because
that's when libsubid.so was introduced [7].

Note that SUBID_ABI_VERSION was only introduced with version 4 of the
libsubid.so API/ABI released in Shadow 4.10 [8].  The first release of
libsubid.so in Shadow 4.9 already had an ABI version of 3.0.0 [9], since
it was bumped a few times during development, so that's what's assumed
when SUBID_ABI_VERSION is absent.

This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,10] to specify these.  This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.

Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.

Some changes by Debarshi Ray.

[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go

[2] https://man7.org/linux/man-pages/man8/ld.so.8.html

[3] Commit 6063eb2
    containers#821

[4] Shadow commit 32f641b207f6ddff
    shadow-maint/shadow@32f641b207f6ddff
    shadow-maint/shadow#443

[5] https://packages.debian.org/source/buster/shadow

[6] Shadow commit 79157cbad87f42cd
    shadow-maint/shadow@79157cbad87f42cd
    shadow-maint/shadow#465

[7] Shadow commit 0a7888b1fad613a0
    shadow-maint/shadow@0a7888b1fad613a0
    shadow-maint/shadow#154

[8] Shadow commit 0c9f64140852e8d5
    shadow-maint/shadow@0c9f64140852e8d5
    shadow-maint/shadow#449

[9] Shadow commit 3d670ba7ed58f910
    shadow-maint/shadow@3d670ba7ed58f910
    shadow-maint/shadow#339

[10] Shadow commit 2b22a6909dba60d
     shadow-maint/shadow@2b22a6909dba60d
     shadow-maint/shadow#325

containers#1074

Signed-off-by: Martin Jackson <martjack@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1. Regression Something worked in a previous release but does not now 2. Container Realm The issue is related to what happens inside of a toolbox container
Projects
None yet
Development

No branches or pull requests

7 participants