modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument #116

Open
nnurlan008 opened this issue Nov 4, 2023 · 4 comments

@nnurlan008

nnurlan008 commented Nov 4, 2023

Hello,

I am trying to install the nv_peer_memory module on my machine, which has the following specifications:

OS: Ubuntu 22.04.1
GPU: NVIDIA Tesla K40c
NVIDIA Driver Version: 470.199.02
MLNX Driver Version: MLNX_OFED_LINUX-23.07-0.5.1.2
RNIC: Mellanox ConnectX-4

I get the following error when I run sudo dpkg -i nvidia-peer-memory-dkms_1.2-0_all.deb:
depmod...
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument
dpkg: error processing package nvidia-peer-memory-dkms (--install):
 installed nvidia-peer-memory-dkms package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 nvidia-peer-memory-dkms

output of ls -l /lib/modules:
total 12
drwxr-xr-x 2 root root 4096 Nov 1 21:30 5.17.0-1035-oem
drwxr-xr-x 5 root root 4096 Nov 1 21:30 6.2.0-26-generic
drwxr-xr-x 6 root root 4096 Nov 3 21:48 6.2.0-36-generic

output of ls -l /usr/src/ofa_kernel/:
total 4
lrwxrwxrwx 1 root root 16 Nov 3 21:48 default -> 6.2.0-36-generic
drwxr-xr-x 3 root root 4096 Nov 3 17:38 x86_64

Can you please help me solve this issue?

Thanks and regards
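
The two listings above can be cross-checked with a small script. This is only a sketch: it compares the kernel that the /usr/src/ofa_kernel/default symlink points to with the running kernel. The helper name ofa_default_matches is hypothetical, and whether a mismatch would explain the "Invalid argument" error in this case is an assumption, not something confirmed in this thread.

```python
import os


def ofa_default_matches(link_target: str, running_release: str) -> bool:
    """Return True if the ofa_kernel 'default' symlink names the running kernel.

    link_target is the raw symlink target (relative or absolute);
    only its final path component is compared.
    """
    return os.path.basename(link_target.rstrip("/")) == running_release


if __name__ == "__main__":
    link = "/usr/src/ofa_kernel/default"  # path taken from the listing above
    running = os.uname().release          # same string as `uname -r`
    if os.path.islink(link):
        target = os.readlink(link)
        verdict = "matches" if ofa_default_matches(target, running) else "does NOT match"
        print(f"{link} -> {target} {verdict} running kernel {running}")
    else:
        print(f"{link} is not a symlink or does not exist")
```

If the two differ, rebooting into the kernel that MLNX_OFED was built for, or rebuilding it for the running kernel, would be the first thing to try.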

@nelsonsilva94

Hi,

Does anyone have any suggestions for this?
I am facing the same problem.

@nnurlan008
Author

I solved this issue by installing Ubuntu 20.04 and NVIDIA driver 470.

@javo9205

Hi, I am running into the same issue. My machine has the following specifications:

OS: Ubuntu 22.04.2
Kernel: 6.5.0-41-generic
GPU: NVIDIA GeForce GTX 1660
Driver: NVIDIA UNIX Open Kernel Module for x86_64 555.42.02
MLNX: MLNX_OFED_LINUX-23.10-2.1.3.1

I get the same modprobe error. dmesg spits out:

[ 4973.941875] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[ 4973.941879] nv_peer_mem: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[ 4973.941895] nv_peer_mem: disagrees about version of symbol nvidia_p2p_get_pages
[ 4973.941896] nv_peer_mem: Unknown symbol nvidia_p2p_get_pages (err -22)
[ 4973.941905] nv_peer_mem: disagrees about version of symbol nvidia_p2p_put_pages
[ 4973.941907] nv_peer_mem: Unknown symbol nvidia_p2p_put_pages (err -22)
[ 4973.941930] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_map_pages
[ 4973.941931] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[ 4973.941940] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[ 4973.941941] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[ 4973.941949] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[ 4973.941950] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)

Any assistance would be appreciated!
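
For anyone triaging similar output, here is a minimal sketch that collapses a dmesg log like the one above into the distinct symbols each module complains about. The function name mismatched_symbols and the regular expression are illustrative assumptions; they are written against the exact message wording shown in this post and may need adjusting for other kernels.

```python
import re

# Matches lines such as:
# [ 4973.941875] nv_peer_mem: disagrees about version of symbol nvidia_p2p_get_pages
_MISMATCH = re.compile(r"(\S+): disagrees about version of symbol (\S+)")


def mismatched_symbols(dmesg_text: str) -> dict[str, list[str]]:
    """Map each module name to the sorted symbols it reports a version mismatch for."""
    found: dict[str, set[str]] = {}
    for line in dmesg_text.splitlines():
        m = _MISMATCH.search(line)
        if m:
            found.setdefault(m.group(1), set()).add(m.group(2))
    return {mod: sorted(syms) for mod, syms in found.items()}


# Two lines copied from the log above, used as sample input.
sample = (
    "[ 4973.941875] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_unmap_pages\n"
    "[ 4973.941879] nv_peer_mem: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)\n"
)
print(mismatched_symbols(sample))
```

In the log above every affected symbol is an nvidia_p2p_* function. "disagrees about version of symbol" means the symbol checksum recorded in nv_peer_mem when it was built differs from the one exported by the loaded NVIDIA driver, which usually points to the module having been built against a different driver build than the one currently running.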

@nnurlan008
Author

Hi,

There is a module called nvidia-peermem, which is the same module as nv_peer_mem and is provided with the proprietary drivers, version 470 and later.
Use sudo modprobe nvidia-peermem to load the module manually.

But if you specifically want to use nv_peer_mem, I think you will need to downgrade to NVIDIA driver 470 and Ubuntu 20.04, which worked in my case.

Hope this is helpful.
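
After running the modprobe command above, a quick way to confirm which of the two modules actually loaded is to look at /proc/modules. The sketch below does that; the helper name loaded_peer_modules is hypothetical. Note that the kernel lists module names with underscores, so nvidia-peermem appears as nvidia_peermem.

```python
from pathlib import Path

# Names as they appear in /proc/modules (hyphens become underscores).
PEER_MODULES = ("nvidia_peermem", "nv_peer_mem")


def loaded_peer_modules(proc_modules_text: str) -> list[str]:
    """Return the peer-memory modules present in /proc/modules-style text."""
    # The first whitespace-separated field of each line is the module name.
    loaded = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in PEER_MODULES if name in loaded]


if __name__ == "__main__":
    path = Path("/proc/modules")
    text = path.read_text() if path.exists() else ""
    print(loaded_peer_modules(text) or "no peer-memory module loaded")
```

An empty result after modprobe returns successfully would be unexpected; in that case dmesg is the next place to look.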
