Entering fails if NVIDIA Persistence Daemon is used #1572

jbtrystram · 2024-10-23T11:47:58Z

Describe the bug
When trying to enter a toolbox, podman fails with: Error: mount: /run/nvidia-persistenced/socket: mount point does not exist

Steps how to reproduce the behaviour

Fedora 40 (f40) Kinoite with an NVIDIA GPU
toolbox create
toolbox enter
See error

Expected behaviour
toolbox working awesome, as it's been for months

Actual behaviour

jib@fedora:/var/home/jib$ toolbox enter
Error: mount: /run/nvidia-persistenced/socket: mount point does not exist.
       dmesg(1) may have more information after failed mount system call.
failed to apply mount from Container Device Interface for NVIDIA

Output of toolbox --version (v0.0.90+)

toolbox version 0.0.99.6

Toolbx package info (rpm -q toolbox)

toolbox-0.0.99.6-1.fc40.x86_64

Output of podman version

Client:       Podman Engine
Version:      5.2.3
API Version:  5.2.3
Go Version:   go1.22.7
Built:        Tue Sep 24 02:00:00 2024
OS/Arch:      linux/amd64

Podman package info (rpm -q podman)
podman-5.2.3-1.fc40.x86_64

Info about your OS
universal-blue kinoite-nvidia build (f40)

Additional context

I think this coincides with me setting up a podman container using NVIDIA CUDA capabilities. Note that my other container works fine, as expected.
I ran nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml, following the podman documentation.

See attached log of toolbox enter -vv
toolbox-nvidia-issue.txt

lmgarret · 2024-10-23T15:13:04Z

I believe that this is a recently introduced problem; I'm running Bazzite on both my Desktop with a Nvidia GPU, and on a laptop with a disabled Nvidia dGPU (with envycontrol). I recently updated both Bazzite installations and can no longer enter any toolbox container, with the same error message that you shared.

Also, maybe a red herring but dmesg does bring up something about the pid file in the same dir, could it be related?

[   11.540144] systemd[1]: /usr/lib/systemd/system/nvidia-persistenced.service:7: PIDFile= references a path below legacy directory /var/run/, updating /var/run/nvidia-persistenced/nvidia-persistenced.pid → /run/nvidia-persistenced/nvidia-persistenced.pid; please update the unit file accordingly.

LoGaIta99 · 2024-10-23T18:10:15Z

Adding myself to the list of affected users. I am on Fedora Kinoite, and the last working build, which I pinned, was 40.20241011.0. Since then, these packages were updated and could be related to my issue:

amd-gpu-firmware 20240909-1.fc40 -> 20241017-2.fc40
intel-gpu-firmware 20240909-1.fc40 -> 20241017-2.fc40
nvidia-gpu-firmware 20240909-1.fc40 -> 20241017-2.fc40
toolbox 0.0.99.5-11.fc40 -> 0.0.99.6-1.fc40

Note that neither the Nvidia driver nor podman was updated between the working and faulty deployments.

I should also report that when the Nvidia GPU is deactivated through envycontrol, the error is:

Error: failed to initialize NVIDIA Management Library

I never installed the NVIDIA Container Toolkit and I don't need my Nvidia GPU inside the containers.

I read in a previous issue that everything should work seamlessly. That is not the case now.
Distrobox is unaffected by this problem.
I previously explained my situation on Fedora Discussion.

tfmoraes · 2024-10-23T18:31:51Z

I'm having the same problem. Adding log from systemd journal:

Oct 23 15:28:17 watchmen.scartissue conmon[33520]: conmon ccb5befa180ef889abac <ndebug>: failed to write to /proc/self/oom_score_adj: Permission denied
Oct 23 15:28:17 watchmen.scartissue conmon[33521]: conmon ccb5befa180ef889abac <ndebug>: addr{sun_family=AF_UNIX, sun_path=/proc/self/fd/12/attach}
Oct 23 15:28:17 watchmen.scartissue conmon[33521]: conmon ccb5befa180ef889abac <ndebug>: terminal_ctrl_fd: 12
Oct 23 15:28:17 watchmen.scartissue conmon[33521]: conmon ccb5befa180ef889abac <ndebug>: winsz read side: 15, winsz write side: 16
Oct 23 15:28:17 watchmen.scartissue systemd[3000]: Started libpod-ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c.scope - libcrun container.
Oct 23 15:28:17 watchmen.scartissue conmon[33521]: conmon ccb5befa180ef889abac <ndebug>: container PID: 33523
Oct 23 15:28:17 watchmen.scartissue podman[33501]: 2024-10-23 15:28:17.337042921 -0300 -03 m=+0.079351406 container init ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c (image=registry.fedoraproject.org/fedora-toolbox:41, name=fedora-toolbox-41, org.opencontainers.image.url=https://fedoraproject.org/, license=MIT, org.opencontainers.image.name=fedora-toolbox, org.opencontainers.image.license=MIT, org.opencontainers.image.version=41, version=41, vendor=Fedora Project, com.github.containers.toolbox=true, io.buildah.version=1.37.5, name=fedora-toolbox, org.opencontainers.image.vendor=Fedora Project)
Oct 23 15:28:17 watchmen.scartissue podman[33501]: 2024-10-23 15:28:17.340022121 -0300 -03 m=+0.082330606 container start ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c (image=registry.fedoraproject.org/fedora-toolbox:41, name=fedora-toolbox-41, version=41, io.buildah.version=1.37.5, org.opencontainers.image.version=41, name=fedora-toolbox, org.opencontainers.image.url=https://fedoraproject.org/, org.opencontainers.image.vendor=Fedora Project, license=MIT, org.opencontainers.image.license=MIT, org.opencontainers.image.name=fedora-toolbox, vendor=Fedora Project, com.github.containers.toolbox=true)
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Running as real user ID 0"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Resolved absolute path to the executable as /usr/bin/toolbox"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="TOOLBX_DELAY_ENTRY_POINT is "
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="TOOLBX_FAIL_ENTRY_POINT is "
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="TOOLBOX_PATH is /usr/bin/toolbox"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Migrating to newer Podman"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Migration not needed: running inside a container"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Setting up configuration"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Setting up configuration: file /etc/containers/toolbox.conf not found"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Setting up configuration: file /root/.config/containers/toolbox.conf not found"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Resolving container and image names"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Container: ''"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Distribution (CLI): ''"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Image (CLI): ''"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Release (CLI): ''"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Resolved container and image names"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Container: 'fedora-toolbox-41'"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Image: 'fedora-toolbox:41'"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Release: '41'"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating /run/.toolboxenv"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Path /run/host/etc exists"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Resolved /etc/localtime to /run/host/usr/share/zoneinfo/America/Sao_Paulo"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating regular file /etc/machine-id"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /etc/machine-id to /run/host/etc/machine-id"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/libvirt"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/libvirt to /run/host/run/libvirt"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/systemd/journal"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/systemd/journal to /run/host/run/systemd/journal"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/systemd/resolve"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/systemd/resolve to /run/host/run/systemd/resolve"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/systemd/sessions"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/systemd/sessions to /run/host/run/systemd/sessions"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/systemd/system"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/systemd/system to /run/host/run/systemd/system"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/systemd/users"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/systemd/users to /run/host/run/systemd/users"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/udev/data"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/udev/data to /run/host/run/udev/data"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /run/udev/tags"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/udev/tags to /run/host/run/udev/tags"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /tmp"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /tmp to /run/host/tmp"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /var/lib/flatpak"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /var/lib/flatpak to /run/host/var/lib/flatpak"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /var/lib/libvirt"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /var/lib/libvirt to /run/host/var/lib/libvirt"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /var/lib/systemd/coredump"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /var/lib/systemd/coredump to /run/host/var/lib/systemd/coredump"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /var/log/journal"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /var/log/journal to /run/host/var/log/journal"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /var/mnt"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /var/mnt to /run/host/var/mnt"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating directory /sys/fs/selinux"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /sys/fs/selinux to /usr/share/empty"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Preparing to redirect /home to /var/home"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="/var/home isn't a symbolic link"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Redirecting /home to /var/home"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Looking up group for sudo"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Group for sudo is wheel"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Modifying user thiago with UID 1000:"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=usermod
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=--append
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=--groups
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=wheel
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=--home
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=/var/home/thiago
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=--password
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=--shell
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=/usr/bin/fish
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=--uid
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=1000
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg=thiago
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: usermod: Warning: missing or non-executable shell '/usr/bin/fish'
Oct 23 15:28:17 watchmen.scartissue usermod[33559]: change user 'thiago' password
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Removing password for user root"
Oct 23 15:28:17 watchmen.scartissue passwd[33565]: password for 'root' changed by 'root'
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Creating runtime directory /run/user/1000/toolbox"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Loading Container Device Interface for NVIDIA from file /run/user/1000/toolbox/cdi-nvidia.json"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Applying Container Device Interface for NVIDIA"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Binding /run/nvidia-persistenced/socket to /run/host/run/nvidia-persistenced/socket"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: mount: /run/nvidia-persistenced/socket: mount point does not exist.
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]:        dmesg(1) may have more information after failed mount system call.
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: level=debug msg="Applying Container Device Interface for NVIDIA: failed to bind /run/nvidia-persistenced/socket to /run/host/run/nvidia-persistenced/socket"
Oct 23 15:28:17 watchmen.scartissue fedora-toolbox-41[33521]: Error: failed to apply mount from Container Device Interface for NVIDIA
Oct 23 15:28:17 watchmen.scartissue conmon[33521]: conmon ccb5befa180ef889abac <ninfo>: container 33523 exited with status 1
Oct 23 15:28:17 watchmen.scartissue conmon[33521]: conmon ccb5befa180ef889abac <nwarn>: Failed to open cgroups file: /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/user.slice/libpod-ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c.scope/container/memory.events
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Called cleanup.PersistentPreRunE(/usr/bin/podman --root /var/home/thiago/.local/share/containers/storage --runroot /run/user/1000/containers --log-level debug --cgroup-manager systemd --tmpdir /run/user/1000/libpod/tmp --network-config-dir  --network-backend netavark --volumepath /var/home/thiago/.local/share/containers/storage/volumes --db-backend sqlite --transient-store=false --runtime crun --storage-driver overlay --events-backend journald --syslog container cleanup ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c)"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Setting custom database backend: \"sqlite\""
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using conmon: \"/usr/bin/conmon\""
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=info msg="Using sqlite as database backend"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using graph driver overlay"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using graph root /var/home/thiago/.local/share/containers/storage"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using run root /run/user/1000/containers"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using static dir /var/home/thiago/.local/share/containers/storage/libpod"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using tmp dir /run/user/1000/libpod/tmp"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using volume path /var/home/thiago/.local/share/containers/storage/volumes"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using transient store: false"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="[graphdriver] trying provided driver \"overlay\""
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Cached value indicated that overlay is supported"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Cached value indicated that overlay is supported"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Cached value indicated that metacopy is not being used"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Cached value indicated that native-diff is usable"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Initializing event backend journald"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Using OCI runtime \"/usr/bin/crun\""
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=info msg="Setting parallel job count to 49"
Oct 23 15:28:17 watchmen.scartissue podman[33571]: 2024-10-23 15:28:17.416983073 -0300 -03 m=+0.026547419 container died ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c (image=registry.fedoraproject.org/fedora-toolbox:41, name=fedora-toolbox-41, com.github.containers.toolbox=true, license=MIT, org.opencontainers.image.url=https://fedoraproject.org/, org.opencontainers.image.vendor=Fedora Project, version=41, name=fedora-toolbox, org.opencontainers.image.name=fedora-toolbox, org.opencontainers.image.version=41, vendor=Fedora Project, io.buildah.version=1.37.5, org.opencontainers.image.license=MIT)
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Sending signal 9 to container ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Cleaning up container ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Network is already cleaned up, skipping..."
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Successfully cleaned up container ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Unmounted container \"ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c\""
Oct 23 15:28:17 watchmen.scartissue podman[33571]: 2024-10-23 15:28:17.458947593 -0300 -03 m=+0.068511929 container cleanup ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c (image=registry.fedoraproject.org/fedora-toolbox:41, name=fedora-toolbox-41, vendor=Fedora Project, version=41, name=fedora-toolbox, org.opencontainers.image.name=fedora-toolbox, org.opencontainers.image.version=41, license=MIT, org.opencontainers.image.url=https://fedoraproject.org/, org.opencontainers.image.vendor=Fedora Project, org.opencontainers.image.license=MIT, io.buildah.version=1.37.5, com.github.containers.toolbox=true)
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Called cleanup.PersistentPostRunE(/usr/bin/podman --root /var/home/thiago/.local/share/containers/storage --runroot /run/user/1000/containers --log-level debug --cgroup-manager systemd --tmpdir /run/user/1000/libpod/tmp --network-config-dir  --network-backend netavark --volumepath /var/home/thiago/.local/share/containers/storage/volumes --db-backend sqlite --transient-store=false --runtime crun --storage-driver overlay --events-backend journald --syslog container cleanup ccb5befa180ef889abaca823d1fb3b52a21e90529fc66f59b8b14f92c5939d5c)"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=debug msg="Shutting down engines"
Oct 23 15:28:17 watchmen.scartissue /usr/bin/podman[33571]: time="2024-10-23T15:28:17-03:00" level=info msg="Received shutdown.Stop(), terminating!" PID=33571

debarshiray · 2024-10-24T17:29:39Z

I suspect this is because nvidia-persistenced.service is enabled on the host operating system, and it's exposing a bug in our common bind mounting code that only handles directories and regular files, but not sockets.

tfmoraes · 2024-10-24T17:51:42Z

Stopping nvidia-persistenced.service makes toolbox work.

debarshiray · 2024-10-24T18:45:25Z

> Stopping nvidia-persistenced.service makes toolbox work.

Thanks for the confirmation! I won't be able to get to this until Tuesday. Maybe you want to submit a pull request? :)

The problem lies in the mountBind function in src/cmd/initContainer.go. I think the conditional branch for fileMode.IsRegular() also needs to cover fileMode&os.ModeSocket != 0.
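As a rough illustration of the suggestion above, the following self-contained Go sketch picks the kind of mount point from the source's file mode and treats sockets the same way as regular files. It is not the actual Toolbx code: `mountPointKind`, `createMountPoint` and `demo` are hypothetical names, and the demo uses a throwaway Unix socket in a temporary directory instead of /run/nvidia-persistenced/socket.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// mountPointKind reports which kind of empty mount point a bind mount needs
// for a source with the given mode: "dir" for directories, "file" for regular
// files and sockets, and "" for anything else. (Hypothetical helper.)
func mountPointKind(mode os.FileMode) string {
	switch {
	case mode.IsDir():
		return "dir"
	case mode.IsRegular(), mode&os.ModeSocket != 0:
		return "file"
	default:
		return ""
	}
}

// createMountPoint creates an empty directory or file at target, depending on
// the type of source, so that a later bind mount has something to attach to.
func createMountPoint(source, target string) error {
	info, err := os.Stat(source)
	if err != nil {
		return fmt.Errorf("failed to stat %s: %w", source, err)
	}

	switch mountPointKind(info.Mode()) {
	case "dir":
		return os.MkdirAll(target, 0o755)
	case "file":
		if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
			return err
		}
		file, err := os.OpenFile(target, os.O_CREATE|os.O_WRONLY, 0o644)
		if err != nil {
			return err
		}
		return file.Close()
	default:
		return fmt.Errorf("unsupported file type for %s: %s", source, info.Mode().Type())
	}
}

// demo listens on a Unix socket in a temporary directory, creates a mount
// point for it, and reports whether that mount point is a regular file.
func demo() (bool, error) {
	dir, err := os.MkdirTemp("", "mountpoint")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	source := filepath.Join(dir, "socket")
	listener, err := net.Listen("unix", source)
	if err != nil {
		return false, err
	}
	defer listener.Close()

	target := filepath.Join(dir, "container", "socket")
	if err := createMountPoint(source, target); err != nil {
		return false, err
	}

	info, err := os.Stat(target)
	if err != nil {
		return false, err
	}
	return info.Mode().IsRegular(), nil
}

func main() {
	ok, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("socket got a regular file mount point:", ok)
}
```

The sketch only prepares the mount point; the bind mount itself needs privileges and is left out. An empty regular file is enough as a target because a bind mount of a socket replaces whatever the target path shows.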

When a socket is bind-mounted to the container, also create a file mount point for it. Nvidia CDI on the proprietary driver added a socket for `nvidia-persistenced.service`, which was failing to be mounted in the container as no mount point existed. More logs in the issue below.
Fixes containers#1572

jbtrystram · 2024-10-24T20:49:58Z

@debarshiray thanks for such a precise hint.
With instructions that detailed, I couldn't not do it :D
It now works for me! :)

If the NVIDIA Persistence Daemon is used, then 'enter' fails with:
  $ sudo systemctl start nvidia-persistenced.service
  $ toolbox enter
  Error: mount: /run/nvidia-persistenced/socket: mount point does not exist.
         dmesg(1) may have more information after failed mount system call.
  failed to apply mount from Container Device Interface for NVIDIA

This is due to the socket at /run/nvidia-persistenced/socket being listed in the Container Device Interface specification when the NVIDIA Persistence Daemon is used.

Fallout from 6e848b2

containers#1572
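To make the point about the socket being listed in the specification more concrete, here is a minimal Go sketch that decodes the mounts section of a Container Device Interface specification and prints each entry. The JSON field names follow the CDI specification as far as I know it, but `sampleSpec` is a hypothetical, heavily trimmed example (the second mount and all options are made up for illustration), and `cdiMount`, `cdiSpec` and `parseMounts` are names invented here, not taken from Toolbx.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// cdiMount mirrors one entry of "containerEdits.mounts" in a CDI specification.
type cdiMount struct {
	HostPath      string   `json:"hostPath"`
	ContainerPath string   `json:"containerPath"`
	Options       []string `json:"options,omitempty"`
}

// cdiSpec covers only the fields this sketch needs; real specifications
// carry more (devices, hooks, device nodes, environment variables).
type cdiSpec struct {
	Version        string `json:"cdiVersion"`
	Kind           string `json:"kind"`
	ContainerEdits struct {
		Mounts []cdiMount `json:"mounts"`
	} `json:"containerEdits"`
}

// sampleSpec is a hypothetical specification written for this example. It is
// not the output of nvidia-ctk on any real system.
const sampleSpec = `{
  "cdiVersion": "0.5.0",
  "kind": "nvidia.com/gpu",
  "containerEdits": {
    "mounts": [
      {
        "hostPath": "/run/nvidia-persistenced/socket",
        "containerPath": "/run/nvidia-persistenced/socket",
        "options": ["ro", "nosuid", "nodev", "bind", "noexec"]
      },
      {
        "hostPath": "/usr/bin/nvidia-smi",
        "containerPath": "/usr/bin/nvidia-smi",
        "options": ["ro", "nosuid", "nodev", "bind"]
      }
    ]
  }
}`

// parseMounts returns the mounts listed under containerEdits in the given
// CDI specification.
func parseMounts(data []byte) ([]cdiMount, error) {
	var spec cdiSpec
	if err := json.Unmarshal(data, &spec); err != nil {
		return nil, fmt.Errorf("failed to parse CDI specification: %w", err)
	}
	return spec.ContainerEdits.Mounts, nil
}

func main() {
	mounts, err := parseMounts([]byte(sampleSpec))
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	for _, m := range mounts {
		fmt.Printf("%s -> %s\n", m.HostPath, m.ContainerPath)
	}
}
```

Every entry in that list has to be bind-mounted when the container starts, so a single host path of an unexpected type, such as a socket, is enough to make 'enter' fail.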

debarshiray · 2024-10-29T14:04:04Z

Fixed by #1576 (and #1577)

Thanks for your contribution, @jbtrystram !

jbtrystram added the "1. Bug" (Something isn't working) label Oct 23, 2024

picsel2 mentioned this issue Oct 24, 2024

Entering fails if Nouveau/Nova is being used while the proprietary NVIDIA driver is still installed #1573

Closed

jbtrystram mentioned this issue Oct 24, 2024

initContainer: also create mount points for sockets #1576

Closed

BraSDon mentioned this issue Oct 26, 2024

Toolbox unusable: failed to apply mount from Container Device Interface for NVIDIA ublue-os/bazzite#1769

Closed

debarshiray mentioned this issue Oct 29, 2024

cmd/initContainer: Unbreak 'enter' if NVIDIA Persistence Daemon is used #1577

Merged

debarshiray changed the title from "nvidia-persistenced.socket bind error" to "Entering fails if NVIDIA Persistence Daemon is used" Oct 29, 2024

debarshiray closed this as completed Oct 29, 2024

lmgarret mentioned this issue Oct 30, 2024

Unable to enter toolbox container after system update ublue-os/bazzite#1787

Closed
