Nvidia not getting detected on systems using nvidia-container-toolkit #12

Closed
FrantaNautilus opened this issue Sep 9, 2024 · 5 comments

@FrantaNautilus

After installing Harbor, I noticed that the Nvidia GPU is not being used: the output of harbor cmd -h does not include any compose files for Nvidia, and the readings from nvtop confirm it. The cause seems to be the check docker info | grep -q "Runtimes:.*nvidia", which fails because nvidia-container-runtime is not installed. However, nvidia-container-runtime is apparently deprecated by Nvidia, and nvidia-container-toolkit is the only component needed to run containers with Nvidia GPU acceleration. For this reason, the Nvidia GPU detection should use a different mechanism, e.g. the command nvidia-container-toolkit -version.

Link to Nvidia detection code from harbor.sh in question:
https://github.com/av/harbor/blob/f25978a445099884f248f1a2c70fd70d734f1636/harbor.sh#L206C5-L209C7
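
For illustration, the two detection approaches discussed here could be sketched roughly as follows. This is not Harbor's actual code: the function names are hypothetical, and the binary name passed to `command -v` is an assumption based on the package name mentioned above.

```shell
# Sketch only: hypothetical helpers, not taken from harbor.sh.

# Runtime-based check: succeeds only when the given `docker info` text
# lists an nvidia runtime on its "Runtimes:" line.
# $1: output of `docker info`
runtimes_list_nvidia() {
  printf '%s\n' "$1" | grep -q "Runtimes:.*nvidia"
}

# Toolkit-based check: succeeds when the named binary is found on PATH.
# $1: binary name (defaults to nvidia-container-toolkit, an assumption)
toolkit_on_path() {
  command -v "${1:-nvidia-container-toolkit}" >/dev/null 2>&1
}
```

On a real system the first helper would be called as `runtimes_list_nvidia "$(docker info 2>/dev/null)"`. Taking the text as an argument keeps the check easy to try out on machines without Docker.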

@av
Owner

av commented Sep 9, 2024

Hi, thank you for trying out Harbor and for such a detailed bug report!

Just fixed it in v0.1.17, switching to the nvidia-container-toolkit detection.

I think the drawback is that it won't catch the case where the toolkit is installed but has not been configured with:

sudo nvidia-ctk runtime configure --runtime=docker

but I didn't test that
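
A rough way to tell those cases apart, as a sketch only: the helper below is hypothetical and not part of Harbor. It assumes that a configured runtime shows up on the "Runtimes:" line of `docker info`, and that finding `nvidia-ctk` on PATH is a reasonable stand-in for "the toolkit is installed".

```shell
# Hypothetical helper: classify the Nvidia container setup into three states.
# $1: output of `docker info`
# $2: binary to look for (defaults to nvidia-ctk, an assumption)
nvidia_setup_state() {
  if printf '%s\n' "$1" | grep -q "Runtimes:.*nvidia"; then
    # Docker already lists an nvidia runtime.
    echo "configured"
  elif command -v "${2:-nvidia-ctk}" >/dev/null 2>&1; then
    # Toolkit binary is present, but Docker lists no nvidia runtime.
    echo "installed-only"
  else
    echo "missing"
  fi
}
```

Calling `nvidia_setup_state "$(docker info 2>/dev/null)"` would print one of the three labels. Whether the "installed-only" state should still count as a usable GPU setup is exactly the open question here.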

@FrantaNautilus
Author

Thank you for the virtually immediate reply and fix! Your speed of developing this software is incredible.
As for the configuration, I never configured Docker the way you and the Nvidia Container Toolkit documentation suggest. Instead, I only ran Docker with the --gpus all flag, and everything worked as expected.
I will test both the fix and the effect of running sudo nvidia-ctk runtime configure --runtime=docker on my setup and report back in this issue.

@FrantaNautilus
Author

I am sorry it took me so long, but I have the results. I tested v0.1.17 on systems with and without a configured nvidia-container-toolkit (the toolkit is installed on both, but it is configured as a runtime only on the first), and Harbor works on both without issues, tested with the default setup (ollama + webui).

@av
Owner

av commented Sep 12, 2024

That's awesome, thanks for a thorough test!

@av
Owner

av commented Sep 13, 2024

I'm closing this for now, but please feel free to re-open it if there is anything to revisit.

@av closed this as completed Sep 13, 2024