Ubuntu: Use NVIDIA server drivers from Canonical #24

Merged: 7 commits, Dec 15, 2020

@ajdecon (Collaborator) commented Nov 16, 2020

  • Add support for installing the NVIDIA headless server packages from the upstream Canonical repositories
  • Keep support for installing from the CUDA repository instead, but make it non-default

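Note that this changes which driver version is installed by default: the Canonical repositories provide the -server driver branches, which package the recommended NVIDIA datacenter driver. In the test runs below, the Canonical packages installed 450.80.02, while the CUDA repository installed 455.32.00.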

We default to -server because datacenter cluster usage accounts for most of our use of this role; for development systems, the CUDA repository drivers may still be preferable.
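
As a rough illustration of choosing between the two sources per system, the sketch below builds the playbook command line for a host. The function name `driver_playbook_cmd` and the `datacenter`/`dev` profile labels are hypothetical and not part of this role; the inventory path, playbook path, and the `nvidia_driver_ubuntu_install_from_cuda_repo` variable are the ones used in the test plan.

```shell
# Hypothetical helper: print (but do not run) the ansible-playbook command
# for one host. Profile "datacenter" (the default) uses the role default,
# i.e. the Canonical -server packages; profile "dev" opts in to the CUDA repo.
driver_playbook_cmd() {
    host="$1"
    profile="${2:-datacenter}"
    cmd="ansible-playbook -i virtual/config/inventory -l $host"
    case "$profile" in
        dev)
            # Same extra-vars form as in the test plan.
            cmd="$cmd -e '{\"nvidia_driver_ubuntu_install_from_cuda_repo\": yes}'"
            ;;
        datacenter)
            # Nothing to add: -server packages are the default.
            ;;
        *)
            echo "unknown profile: $profile" >&2
            return 1
            ;;
    esac
    echo "$cmd playbooks/nvidia-software/nvidia-driver.yml"
}
```

For example, `driver_playbook_cmd virtual-gpu01 dev` prints the CUDA-repository variant of the command; for longer-lived setups, setting the variable in inventory group variables is probably a better fit than a wrapper like this.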

Test plan

Canonical repositories

Tested using the DeepOps nvidia-driver playbook.

# Run playbook against test VM
$ ansible-playbook -i virtual/config/inventory -l virtual-gpu01 playbooks/nvidia-software/nvidia-driver.yml

# Check installed packages on test VM
$ vagrant ssh virtual-gpu01
vagrant@ubuntu1804:~$ dpkg -l | grep nvidia
ii  libnvidia-cfg1-450-server:amd64       450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-compute-450-server:amd64    450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA libcompute package
ii  nvidia-compute-utils-450-server       450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA compute utilities
ii  nvidia-dkms-450-server                450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA DKMS package
ii  nvidia-headless-450-server            450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA headless metapackage
ii  nvidia-headless-no-dkms-450-server    450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA headless metapackage - no DKMS
ii  nvidia-kernel-common-450-server       450.80.02-0ubuntu0.18.04.3        amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-450-server       450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA kernel source package
ii  nvidia-utils-450-server               450.80.02-0ubuntu0.18.04.3        amd64        NVIDIA Server Driver support binaries

# Check that nvidia-smi works
vagrant@ubuntu1804:~$ nvidia-smi -L
GPU 0: Tesla P4 (UUID: GPU-XXXXXXXXX)

CUDA repositories

# Run playbook against test VM with CUDA repo enabled
$ ansible-playbook -i virtual/config/inventory -l virtual-gpu01 -e '{"nvidia_driver_ubuntu_install_from_cuda_repo": yes}' playbooks/nvidia-software/nvidia-driver.yml

# Check installed packages on test VM
vagrant@ubuntu1804:~$ dpkg -l | grep nvidia
ii  libnvidia-cfg1-455:amd64              455.32.00-0ubuntu1                amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-455                  455.32.00-0ubuntu1                all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-455:amd64           455.32.00-0ubuntu1                amd64        NVIDIA libcompute package
ii  libnvidia-decode-455:amd64            455.32.00-0ubuntu1                amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-455:amd64            455.32.00-0ubuntu1                amd64        NVENC Video Encoding runtime library
ii  libnvidia-extra-455:amd64             455.32.00-0ubuntu1                amd64        Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-455:amd64              455.32.00-0ubuntu1                amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-455:amd64                455.32.00-0ubuntu1                amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-ifr1-455:amd64              455.32.00-0ubuntu1                amd64        NVIDIA OpenGL-based Inband Frame Readback runtime library
ii  nvidia-compute-utils-455              455.32.00-0ubuntu1                amd64        NVIDIA compute utilities
ii  nvidia-dkms-455                       455.32.00-0ubuntu1                amd64        NVIDIA DKMS package
ii  nvidia-driver-455                     455.32.00-0ubuntu1                amd64        NVIDIA driver metapackage
ii  nvidia-kernel-common-455              455.32.00-0ubuntu1                amd64        Shared files used with the kernel module
ii  nvidia-kernel-source-455              455.32.00-0ubuntu1                amd64        NVIDIA kernel source package
ii  nvidia-modprobe                       455.32.00-0ubuntu1                amd64        Load the NVIDIA kernel driver and create device files
ii  nvidia-prime                          0.8.8.2                           all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                       455.32.00-0ubuntu1                amd64        Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-455                      455.32.00-0ubuntu1                amd64        NVIDIA driver support binaries
ii  xserver-xorg-video-nvidia-455         455.32.00-0ubuntu1                amd64        NVIDIA binary Xorg driver

# Check that nvidia-smi works
vagrant@ubuntu1804:~$ nvidia-smi -L
GPU 0: Tesla P4 (UUID: GPU-09d2c3f1-e54a-055f-c996-601051acf4e9)
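
The two package listings above differ in their metapackage: `nvidia-headless-450-server` for the Canonical repositories versus `nvidia-driver-455` for the CUDA repository. A minimal sketch of checking which one a host ended up with is shown below. The function name `nvidia_driver_flavor` and its `server`/`full`/`none` labels are hypothetical, and the package-name patterns are inferred only from the listings in this PR, so they may not cover other driver branches.

```shell
# Hypothetical helper: read `dpkg -l`-style output on stdin and print which
# driver metapackage is installed:
#   server - nvidia-headless-<branch>-server (Canonical -server packages)
#   full   - nvidia-driver-<branch>          (as installed from the CUDA repo)
#   none   - neither metapackage found
# If both are present, "server" is reported.
nvidia_driver_flavor() {
    awk '
        $1 == "ii" && $2 ~ /^nvidia-headless-[0-9]+-server$/ { server = 1 }
        $1 == "ii" && $2 ~ /^nvidia-driver-[0-9]+$/          { full = 1 }
        END {
            if (server)    print "server"
            else if (full) print "full"
            else           print "none"
        }
    '
}
```

On the test VM this would be run as `dpkg -l | nvidia_driver_flavor`.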

@michael-balint (Collaborator) commented

Tested w/ DGX-1

@michael-balint michael-balint merged commit f04c440 into NVIDIA:master Dec 15, 2020