GFD returns 'no labels generated from any source' #36
Comments
On k3s you need to update
Thanks for the answer @klueska I found
But as I did not have Restarting
But unfortunately, the GFD DS does not start after that.
I guess the
I have tried to follow tutorials that do not set the default runtime to
The labels are set, but I found these issues helpful:
After some tinkering I can report that I got it to work just fine. I forgot the
For reference and in order:
If this is running you should see labels being applied to your node.
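One way to confirm that labels were applied is to filter a node's labels for the GPU-related prefixes. The following is a minimal sketch, not part of the original thread: it assumes GFD publishes its labels under `nvidia.com/` and that NFD marks NVIDIA PCI devices with `feature.node.kubernetes.io/pci-10de`, matching the `pci-10de` entries mentioned in this issue. The function name `gpu_labels` is hypothetical.

```python
import json

# Assumed prefixes: "nvidia.com/" for labels written by GFD, and the NFD
# PCI label for vendor ID 10de (NVIDIA), as seen in the nfd logs.
GPU_LABEL_PREFIXES = ("nvidia.com/", "feature.node.kubernetes.io/pci-10de")


def gpu_labels(node_json: str) -> dict:
    """Return the GPU-related labels from one node's JSON description.

    `node_json` is expected to be the output of
    `kubectl get node <name> -o json` for a single node.
    """
    node = json.loads(node_json)
    labels = node.get("metadata", {}).get("labels") or {}
    # str.startswith accepts a tuple, so one call checks every prefix.
    return {k: v for k, v in labels.items() if k.startswith(GPU_LABEL_PREFIXES)}
```

Feeding it the JSON of the GPU node should return a non-empty dictionary once GFD is working. An empty result corresponds to the "no labels generated" situation described in this issue.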
Only after the device plugin finished, the
After that you can run a GPU pod, such as documented in the k3s guide.

I'm closing this issue. The use of a runtime class allowed the labels to be generated.
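For readers who have not used a runtime class before, here is a rough sketch of what that approach involves. It is not the exact manifest from this thread, which did not survive. It assumes the containerd runtime is registered under the handler name `nvidia`, and the helper names `runtime_class` and `with_runtime_class` are hypothetical. The manifests are built as Python dictionaries and printed as JSON, which `kubectl apply -f` accepts.

```python
import json


def runtime_class(name: str = "nvidia", handler: str = "nvidia") -> dict:
    """Build a RuntimeClass manifest.

    `handler` must match the runtime name configured in containerd;
    "nvidia" is an assumption here.
    """
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def with_runtime_class(pod_spec: dict, name: str = "nvidia") -> dict:
    """Return a copy of a pod spec that requests the given runtime class."""
    updated = dict(pod_spec)
    updated["runtimeClassName"] = name
    return updated


print(json.dumps(runtime_class(), indent=2))
```

The same `runtimeClassName` field would go into the pod template of the GFD and device plugin daemonsets, so that their containers start with the NVIDIA runtime even when it is not the default runtime.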
Dear all,
I have a setup of `k3s` and `rancher` on three nodes. One node has two Tesla T4 GPUs.

Running `nvidia-smi` on the node directly returns
Which tells me that the driver is installed correctly and I can proceed in the k3s guide.
In `/var/lib/rancher/k3s/agent/etc/containerd/config.toml` I added the line `default_runtime_name = "nvidia"`.
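A quick way to check which default runtime a containerd config actually declares is to scan the file text for the setting. This is only a sketch: the function name `default_runtime` is hypothetical, and the sample fragment is illustrative rather than the config from this issue. It uses a plain regular expression instead of a TOML parser so that it runs on any Python 3 version.

```python
import re

# Matches an uncommented line such as: default_runtime_name = "nvidia"
_DEFAULT_RUNTIME = re.compile(
    r'^\s*default_runtime_name\s*=\s*"([^"]*)"', re.MULTILINE
)


def default_runtime(config_text: str):
    """Return the first declared default runtime name, or None if absent."""
    match = _DEFAULT_RUNTIME.search(config_text)
    return match.group(1) if match else None


# Illustrative fragment only; a real config contains more sections.
SAMPLE = '''
[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "nvidia"
'''

print(default_runtime(SAMPLE))
```

Reading the real file and passing its text to `default_runtime` shows whether the setting is present. Running the check again after restarting k3s shows whether the edit survived the restart.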
I continue with
and also
which shows the following result in the logs of `nfd`
Which mentions `nvidia` and `pci-10de`, suggesting that the discovery was successful, as I do not get these entries on my non-GPU nodes.

After applying the above GFD daemonset and checking the logs
It says that no labels were generated. Is this because of the warning `WARNING: No valid resources detected; using empty manager.`?

As `nfd` seems to work but `gfd` does not, I `exec` into the `gfd` DS to run `gpu-feature-discovery` from the command line. No luck getting a different output here.

Notes
I tried this with nvidia-container-toolkit 1.12.1 and 1.13.0-rc.2
`nvidia` related packages are