Entering fails if NVIDIA Persistence Daemon is used #1572
Comments
I believe that this is a recently introduced problem; I'm running Bazzite on both my desktop with an NVIDIA GPU and on a laptop with a disabled NVIDIA dGPU (via envycontrol). I recently updated both Bazzite installations and can no longer enter any toolbox container, with the same error message that you shared. Also, maybe a red herring, but
I add myself to the list of affected users. I am on Fedora Kinoite and the last working build that I pinned was 40.20241011.0. In the meantime these packages were updated and could be related to my issue:
Note that neither the NVIDIA driver nor podman were updated between the working and faulty deployments. I might also report that my error when the NVIDIA GPU is deactivated through
I never installed the NVIDIA Container Toolkit and I don't need my NVIDIA GPU inside the containers. I read in a previous issue that everything should work seamlessly. That is not the case now.
I'm having the same problem. Adding a log from the systemd journal:
I suspect this is because
Stopping
Thanks for the confirmation! I won't be able to get to this until Tuesday. Maybe you want to submit a pull request? :) The problem lies in the
When a socket is bind-mounted to the container, also create a file mount point for it. The NVIDIA CDI specification on the proprietary driver added a socket for `nvidia-persistenced.service`, which was failing to be mounted in the container because no mount point existed. More logs in the issue below. Fixes containers#1572
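To make the idea behind this fix concrete, here is a minimal sketch in Go (the language Toolbx is written in): before bind-mounting a host socket into the container, create an empty regular file at the target path so the bind mount has something to attach to. The function name `ensureFileMountPoint` and the hard-coded path are hypothetical; this is not the actual Toolbx patch.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureFileMountPoint creates the parent directory and an empty regular
// file at path, so that a socket from the host can later be bind-mounted
// onto it. Without this, the bind mount fails with "mount point does not
// exist".
func ensureFileMountPoint(path string) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return fmt.Errorf("failed to create parent directory: %w", err)
	}

	file, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return fmt.Errorf("failed to create mount point: %w", err)
	}
	return file.Close()
}

func main() {
	// Hypothetical target path, mirroring the socket from this issue.
	if err := ensureFileMountPoint("/run/nvidia-persistenced/socket"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

A real implementation would resolve the path inside the container's root file system rather than on the host; note that `O_CREATE` without `O_TRUNC` leaves an already existing file untouched.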
@debarshiray thanks for the really precise hint.
If the NVIDIA Persistence Daemon is used, then 'enter' fails with:
  $ sudo systemctl start nvidia-persistenced.service
  $ toolbox enter
  Error: mount: /run/nvidia-persistenced/socket: mount point does not exist.
         dmesg(1) may have more information after failed mount system call.
         failed to apply mount from Container Device Interface for NVIDIA
This is due to the socket at /run/nvidia-persistenced/socket being listed in the Container Device Interface specification when the NVIDIA Persistence Daemon is used.
Fallout from 6e848b2
containers#1572
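For illustration, here is a rough sketch of how one could confirm that the generated CDI specification is what pulls the persistence daemon's socket into the container: it reads `/etc/cdi/nvidia.yaml` and prints every mount entry whose host path is a Unix socket. This is not Toolbx or podman code; the YAML field names (`containerEdits`, `mounts`, `hostPath`, `containerPath`) follow my understanding of the CDI format and should be checked against your own generated file, and it assumes the `gopkg.in/yaml.v3` module is available.

```go
package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

type mount struct {
	HostPath      string `yaml:"hostPath"`
	ContainerPath string `yaml:"containerPath"`
}

type containerEdits struct {
	Mounts []mount `yaml:"mounts"`
}

type device struct {
	Name           string         `yaml:"name"`
	ContainerEdits containerEdits `yaml:"containerEdits"`
}

type spec struct {
	ContainerEdits containerEdits `yaml:"containerEdits"`
	Devices        []device       `yaml:"devices"`
}

func main() {
	data, err := os.ReadFile("/etc/cdi/nvidia.yaml")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	var s spec
	if err := yaml.Unmarshal(data, &s); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Collect mounts from the top-level edits and from every device entry.
	mounts := s.ContainerEdits.Mounts
	for _, d := range s.Devices {
		mounts = append(mounts, d.ContainerEdits.Mounts...)
	}

	// Report any mount whose host path is currently a Unix socket.
	for _, m := range mounts {
		info, err := os.Stat(m.HostPath)
		if err != nil {
			continue
		}
		if info.Mode()&os.ModeSocket != 0 {
			fmt.Printf("socket mount: %s -> %s\n", m.HostPath, m.ContainerPath)
		}
	}
}
```

On an affected system, the expectation is that `/run/nvidia-persistenced/socket` only shows up in that list while `nvidia-persistenced.service` is running.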
Thanks for your contribution, @jbtrystram!
Describe the bug
When trying to enter a toolbox, podman fails with:
Error: mount: /run/nvidia-persistenced/socket: mount point does not exist
Steps how to reproduce the behaviour
Expected behaviour
toolbox works, as it has for months
Actual behaviour
Output of `toolbox --version` (v0.0.90+)
toolbox version 0.0.99.6
Toolbx package info (`rpm -q toolbox`)
toolbox-0.0.99.6-1.fc40.x86_64
Output of `podman version`
Podman package info (`rpm -q podman`)
podman-5.2.3-1.fc40.x86_64
Info about your OS
universal-blue kinoite-nvidia build (f40)
Additional context
I think this coincides with me setting up a podman container that uses NVIDIA CUDA capabilities. Note that my other container works fine, as expected.
I ran `nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml` following the podman documentation.
See the attached log of `toolbox enter -vv`:
toolbox-nvidia-issue.txt