
Fix errors due to mismatch of GLIBC version caused from Go 1.20+ #82

Merged
merged 1 commit on Nov 10, 2023

Conversation

shivamerla
Contributor

@shivamerla commented Nov 7, 2023

Fix errors due to a GLIBC version mismatch introduced by Go 1.20+, which occur when the binary is built against a glibc version different from the one in the target image. This started happening after the Go version was updated in commit 24751ab. To avoid it, we now install Go in the same base image as the target image and build the binary there.

Errors:

kubevirt-gpu-device-plugin: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by kubevirt-gpu-device-plugin)
kubevirt-gpu-device-plugin: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by kubevirt-gpu-device-plugin)
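For reference, one way to see the mismatch is to compare the GLIBC symbol versions the binary requires with the glibc available in the target image (the binary path below is illustrative):

# GLIBC symbol versions required by the built binary
objdump -T kubevirt-gpu-device-plugin | grep -o 'GLIBC_[0-9.]*' | sort -Vu
# glibc version provided by the target image (run inside that image)
ldd --version | head -n1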

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
@shivamerla changed the title from "Fix errors due to mismatch of GLIBC version caused from Go 1.20+ when…" to "Fix errors due to mismatch of GLIBC version caused from Go 1.20+" on Nov 7, 2023
@shivamerla
Contributor Author

Tested with vGPU devices:

Allocatable:
  cpu:                            80
  devices.kubevirt.io/kvm:        1k
  devices.kubevirt.io/tun:        1k
  devices.kubevirt.io/vhost-net:  1k
  ephemeral-storage:              189217404206
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         394654312Ki
  nvidia.com/A10-12Q:             0
  nvidia.com/GA102GL_A10:         0
  nvidia.com/NVIDIA_A10-12Q:      0
  nvidia.com/NVIDIA_A10-8Q:       6
  nvidia.com/gpu:                 0
  pods:                           110
cnt-dev@cnt-server-2:~$ kubectl logs -f nvidia-sandbox-device-plugin-daemonset-zxjj8 -n gpu-operator
Defaulted container "nvidia-sandbox-device-plugin-ctr" out of: nvidia-sandbox-device-plugin-ctr, vfio-pci-validation (init), vgpu-devices-validation (init)
2023/11/07 21:31:46 Not a device, continuing
2023/11/07 21:31:46 Nvidia device  0000:3b:00.0
2023/11/07 21:31:46 Nvidia device  0000:3b:00.4
2023/11/07 21:31:46 Nvidia device  0000:3b:00.5
2023/11/07 21:31:46 Nvidia device  0000:3b:00.6
2023/11/07 21:31:46 Nvidia device  0000:3b:00.7
2023/11/07 21:31:46 Nvidia device  0000:3b:01.0
2023/11/07 21:31:46 Nvidia device  0000:3b:01.1
2023/11/07 21:31:46 Nvidia device  0000:3b:01.2
2023/11/07 21:31:46 Nvidia device  0000:3b:01.3
2023/11/07 21:31:46 Nvidia device  0000:3b:01.4
2023/11/07 21:31:46 Nvidia device  0000:3b:01.5
2023/11/07 21:31:46 Nvidia device  0000:3b:01.6
2023/11/07 21:31:46 Nvidia device  0000:3b:01.7
2023/11/07 21:31:46 Nvidia device  0000:3b:02.0
2023/11/07 21:31:46 Nvidia device  0000:3b:02.1
2023/11/07 21:31:46 Nvidia device  0000:3b:02.2
2023/11/07 21:31:46 Nvidia device  0000:3b:02.3
2023/11/07 21:31:46 Nvidia device  0000:3b:02.4
2023/11/07 21:31:46 Nvidia device  0000:3b:02.5
2023/11/07 21:31:46 Nvidia device  0000:3b:02.6
2023/11/07 21:31:46 Nvidia device  0000:3b:02.7
2023/11/07 21:31:46 Nvidia device  0000:3b:03.0
2023/11/07 21:31:46 Nvidia device  0000:3b:03.1
2023/11/07 21:31:46 Nvidia device  0000:3b:03.2
2023/11/07 21:31:46 Nvidia device  0000:3b:03.3
2023/11/07 21:31:46 Nvidia device  0000:3b:03.4
2023/11/07 21:31:46 Nvidia device  0000:3b:03.5
2023/11/07 21:31:46 Nvidia device  0000:3b:03.6
2023/11/07 21:31:46 Nvidia device  0000:3b:03.7
2023/11/07 21:31:46 Nvidia device  0000:3b:04.0
2023/11/07 21:31:46 Nvidia device  0000:3b:04.1
2023/11/07 21:31:46 Nvidia device  0000:3b:04.2
2023/11/07 21:31:46 Nvidia device  0000:3b:04.3
2023/11/07 21:31:46 Nvidia device  0000:86:00.0
2023/11/07 21:31:46 Nvidia device  0000:86:00.4
2023/11/07 21:31:46 Nvidia device  0000:86:00.5
2023/11/07 21:31:46 Nvidia device  0000:86:00.6
2023/11/07 21:31:46 Nvidia device  0000:86:00.7
2023/11/07 21:31:46 Nvidia device  0000:86:01.0
2023/11/07 21:31:46 Nvidia device  0000:86:01.1
2023/11/07 21:31:46 Nvidia device  0000:86:01.2
2023/11/07 21:31:46 Nvidia device  0000:86:01.3
2023/11/07 21:31:46 Nvidia device  0000:86:01.4
2023/11/07 21:31:46 Nvidia device  0000:86:01.5
2023/11/07 21:31:46 Nvidia device  0000:86:01.6
2023/11/07 21:31:46 Nvidia device  0000:86:01.7
2023/11/07 21:31:46 Nvidia device  0000:86:02.0
2023/11/07 21:31:46 Nvidia device  0000:86:02.1
2023/11/07 21:31:46 Nvidia device  0000:86:02.2
2023/11/07 21:31:46 Nvidia device  0000:86:02.3
2023/11/07 21:31:46 Nvidia device  0000:86:02.4
2023/11/07 21:31:46 Nvidia device  0000:86:02.5
2023/11/07 21:31:46 Nvidia device  0000:86:02.6
2023/11/07 21:31:46 Nvidia device  0000:86:02.7
2023/11/07 21:31:46 Nvidia device  0000:86:03.0
2023/11/07 21:31:46 Nvidia device  0000:86:03.1
2023/11/07 21:31:46 Nvidia device  0000:86:03.2
2023/11/07 21:31:46 Nvidia device  0000:86:03.3
2023/11/07 21:31:46 Nvidia device  0000:86:03.4
2023/11/07 21:31:46 Nvidia device  0000:86:03.5
2023/11/07 21:31:46 Nvidia device  0000:86:03.6
2023/11/07 21:31:46 Nvidia device  0000:86:03.7
2023/11/07 21:31:46 Nvidia device  0000:86:04.0
2023/11/07 21:31:46 Nvidia device  0000:86:04.1
2023/11/07 21:31:46 Nvidia device  0000:86:04.2
2023/11/07 21:31:46 Nvidia device  0000:86:04.3
2023/11/07 21:31:46 Not a device, continuing
2023/11/07 21:31:46 Gpu id is 0000:86:00.6
2023/11/07 21:31:46 Vgpu id is NVIDIA_A10-8Q
2023/11/07 21:31:46 Gpu id is 0000:3b:00.6
2023/11/07 21:31:46 Vgpu id is NVIDIA_A10-8Q
2023/11/07 21:31:46 Gpu id is 0000:3b:00.4
2023/11/07 21:31:46 Vgpu id is NVIDIA_A10-8Q
2023/11/07 21:31:46 Gpu id is 0000:86:00.5
2023/11/07 21:31:46 Vgpu id is NVIDIA_A10-8Q
2023/11/07 21:31:46 Gpu id is 0000:3b:00.5
2023/11/07 21:31:46 Vgpu id is NVIDIA_A10-8Q
2023/11/07 21:31:46 Gpu id is 0000:86:00.4
2023/11/07 21:31:46 Vgpu id is NVIDIA_A10-8Q
2023/11/07 21:31:46 Iommu Map map[]
2023/11/07 21:31:46 Device Map map[]
2023/11/07 21:31:46 vGPU Map  map[NVIDIA_A10-8Q:[{32e52bb7-29d9-45e7-8aec-b1f15dbcf887} {3c0b19b0-2355-4026-9bc0-7bc2bfad5b79} {5e535b46-ca45-4cb7-b8b6-169791609fc6} {8437039c-1751-4c83-b3d7-32f45928186b} {9b3bda2c-631b-4de4-8aa7-3e5c8c0c505d} {d808afcf-b924-42fe-b29b-781507a3ba52}]]
2023/11/07 21:31:46 GPU vGPU Map  map[0000:3b:00.4:[5e535b46-ca45-4cb7-b8b6-169791609fc6] 0000:3b:00.5:[9b3bda2c-631b-4de4-8aa7-3e5c8c0c505d] 0000:3b:00.6:[3c0b19b0-2355-4026-9bc0-7bc2bfad5b79] 0000:86:00.4:[d808afcf-b924-42fe-b29b-781507a3ba52] 0000:86:00.5:[8437039c-1751-4c83-b3d7-32f45928186b] 0000:86:00.6:[32e52bb7-29d9-45e7-8aec-b1f15dbcf887]]
2023/11/07 21:31:46 Could not find NVIDIA device with id: NVIDIA_A10-8Q
2023/11/07 21:31:46 DP Name NVIDIA_A10-8Q
2023/11/07 21:31:46 Devicename NVIDIA_A10-8Q
2023/11/07 21:31:46 NVIDIA_A10-8Q Device plugin server ready
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): invoked
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Loading NVML
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Failed to initialize NVML: could not load NVML library
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Adding watch for device path: /sys/bus/mdev/devices/32e52bb7-29d9-45e7-8aec-b1f15dbcf887
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Adding watch for device path: /sys/bus/mdev/devices/3c0b19b0-2355-4026-9bc0-7bc2bfad5b79
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Adding watch for device path: /sys/bus/mdev/devices/5e535b46-ca45-4cb7-b8b6-169791609fc6
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Adding watch for device path: /sys/bus/mdev/devices/8437039c-1751-4c83-b3d7-32f45928186b
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Adding watch for device path: /sys/bus/mdev/devices/9b3bda2c-631b-4de4-8aa7-3e5c8c0c505d
2023/11/07 21:31:46 healthCheck(NVIDIA_A10-8Q): Adding watch for device path: /sys/bus/mdev/devices/d808afcf-b924-42fe-b29b-781507a3ba52

@cdesiniotis
Contributor

Thanks @shivamerla. We avoid the potential glibc version mismatch by installing Go and building the Go binaries in the target base image (in this case the CUDA base image). This aligns with how we build our other containers, for example the k8s-device-plugin: https://gitlab.com/nvidia/kubernetes/device-plugin/-/blob/main/deployments/container/Dockerfile.ubi8
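A minimal sketch of that pattern, assuming a multi-stage build where Go is installed into the same base image that the final stage uses (the image tag, Go version, and build path below are illustrative, not the exact values from this repository):

FROM nvcr.io/nvidia/cuda:12.2.0-base-ubi8 AS builder

# Install Go inside the same base image that the final stage uses, so the
# plugin binary links against the glibc that will be present at runtime.
ARG GOLANG_VERSION=1.21.3
RUN yum install -y gcc make tar gzip curl && \
    curl -fsSL https://go.dev/dl/go${GOLANG_VERSION}.linux-amd64.tar.gz | tar -C /usr/local -xz
ENV PATH=/usr/local/go/bin:${PATH}

WORKDIR /src
COPY . .
# The package path is hypothetical; adjust to the repository layout.
RUN go build -o /kubevirt-gpu-device-plugin ./cmd/...

# The final stage uses the same base image, so the glibc versions match.
FROM nvcr.io/nvidia/cuda:12.2.0-base-ubi8
COPY --from=builder /kubevirt-gpu-device-plugin /usr/bin/kubevirt-gpu-device-plugin
ENTRYPOINT ["/usr/bin/kubevirt-gpu-device-plugin"]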

cc @rthallisey

@rthallisey
Collaborator

lgtm
