NO SHARED MEMORY FOR YEARS [NVIDIA_UVM] - BASIC FEATURE #663

Closed
2 tasks
bioluks opened this issue Jun 14, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@bioluks

bioluks commented Jun 14, 2024

NVIDIA Open GPU Kernel Modules Version

550.90.07 (latest)

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Multiple Setups (10+), for now on Arch

Kernel Release

multiple ones, right now on "6.9.3-hardened1-1-hardened"

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

NVIDIA GeForce GTX 1050 Ti

Describe the bug

This is ignored everywhere by NVIDIA employees and devs. Since 2016 we have no solution (I'm sure it was like this even before 2016). Do we need viral tweets and Reddit posts here and there bashing the company so they listen to us at all?

NVIDIA_UVM is not working even when loaded, checked via lsmod, also on a 30 series RTX. "nvidia-modprobe" does nothing. There is no dmesg to show since everything loads successfully. If the VRAM is full there is no backup option (no shared RAM like in Windows systems). We have high end graphics cards with very low VRAM, and it's slowly starting to become a fact they were produced this way on purpose.
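For readers who want to repeat the "checked via lsmod" step programmatically, here is a minimal sketch. `lsmod` formats the contents of `/proc/modules`, so the check can be done by parsing that text. The function names and the sample text in the usage note are hypothetical, chosen only for illustration. Note that this only tells you whether the module is loaded; it says nothing about whether graphics allocations can spill into system RAM.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each non-empty line starts with the module name, followed by size,
    reference count, dependents, state and load address.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_module_loaded(name, proc_modules_text):
    """Check for a module by name, accepting either hyphens or underscores.

    The kernel reports module names with underscores (nvidia_uvm) even when
    the file on disk is spelled with hyphens (nvidia-uvm.ko).
    """
    return name.replace("-", "_") in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the live module list; `path` is a parameter so it can be swapped in tests."""
    with open(path) as handle:
        return handle.read()
```

On a live system, `is_module_loaded("nvidia_uvm", read_proc_modules())` should agree with what `lsmod` shows.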

I'm obviously annoyed. It's 2024. All other known GPU brands (AMD, Intel) don't have this issue; shared memory works just fine. It's a basic feature that should just work, just like in Windows. The NVIDIA driver still has the most annoying issues on Linux; we know you don't care about Linux users. Wayland issues, late Optimus support on laptops, etc., you name it. If you hate open source this much, don't publish the driver at all and stop further updates. From now on I will vote with my wallet (I know this won't change anything). The internet is begging you for bug fixes, and your not caring just shows that you all think we have no alternative out there. For anyone here looking for fixes (there are none at the moment), check out:

  1. NVIDIA Forum Post from 2016 about this very issue
  2. Same issue on a 2023 NVIDIA Forum post with details
  3. Someone also raised this issue in the Discussions, but again. Dead silence.

No error logs are needed at this point, it's known shared memory (nvidia_uvm - unified shared memory) simply does not work.

If you don't want to buy an expensive GPU from NVIDIA, your only bet is to use Windows so your games/apps do not crash when your VRAM is full. The nvidia_uvm you see in lsmod acts like a placeholder for an empty file. Buy an AMD or Intel GPU for now. Like Linus said, this is the worst company they had to deal with.

So the question is: when will this feature of yours, advertised as working, actually start to work?

To Reproduce

Just install the latest proprietary driver and for once test the driver yourself as a dev. NVIDIA_UVM does not work, and if it works you used hidden parameters not known to us. As mentioned below, the nvidia-bug-report.sh script does not work, no matter which parameter is passed.

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.sh is not working no matter what I do; I tried the safe mode parameter, a reboot, etc. Of course I ran it as root. You have bigger issues if even this is hanging.
Since I do not know if a bot/AI manages these issues, I will upload an empty log.gz file.
nvidia-bug-report.log.gz

More Info

You know the problem better than me. Please check the links I posted. Important forum posts like these should at least get an answer.

@bioluks bioluks added the bug Something isn't working label Jun 14, 2024
@cngkyt

cngkyt commented Jun 26, 2024

I bought multiple NVIDIA cards for a business and they are in the rubbish bin now.
I have to use Windows, or I have to use AMD or Intel cards instead of these rubbish cards.
They don't have this feature and they won't have it any time soon.
They don't care about non-profit development.
DON'T BUY NVIDIA

@bioluks
Author

bioluks commented Jul 15, 2024

This issue is still getting ignored, like I said before. I wonder how long NVIDIA will dodge enabling this feature, which we should have had for years. It seems the AI wave made them ignore everything else. We are not even getting answers here.

I don't think anyone will buy the "lacking manpower/budget/time" argument anymore looking at the NVIDIA profits for the last 6 months.

We won't be running Windows servers, there are always alternatives.

@MishaProductions

Encountering the same issue as well. I am using KDE 6.1 with Wayland, and launching any 3d game that uses a lot of VRAM causes this issue, works fine on Windows 10.

Error log:

Sep 16 09:11:37 laptop-misha brave[12123]: src/gbm_drv_common.c:131: GBM-DRV error (get_bytes_per_component): Unknown or not supported format: 808530000
Sep 16 09:11:37 laptop-misha brave[12123]: src/gbm_drv_common.c:131: GBM-DRV error (get_bytes_per_component): Unknown or not supported format: 808530000
Sep 16 09:11:37 laptop-misha brave[12123]: src/gbm_drv_common.c:131: GBM-DRV error (get_bytes_per_component): Unknown or not supported format: 808530000
Sep 16 09:11:37 laptop-misha brave[12123]: src/gbm_drv_common.c:131: GBM-DRV error (get_bytes_per_component): Unknown or not supported format: 808530000
Sep 16 09:11:37 laptop-misha brave[12123]: src/gbm_drv_common.c:131: GBM-DRV error (get_bytes_per_component): Unknown or not supported format: 808530000
Sep 16 09:11:37 laptop-misha kwin_wayland[11127]: kf.windowsystem: static bool KX11Extras::mapViewport() may only be used on X11
Sep 16 09:11:38 laptop-misha wpa_supplicant[1041]: wlp2s0: CTRL-EVENT-SIGNAL-CHANGE above=0 signal=-81 noise=9999 txrate=103200
Sep 16 09:11:38 laptop-misha kwin_wayland[11127]: kwin_scene_opengl: 0x501: GL_INVALID_VALUE error generated. <levels>, <width> and <height> must be 1 or greater.
Sep 16 09:11:38 laptop-misha kwin_wayland[11127]: kwin_scene_opengl: Invalid framebuffer status:  "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT"
Sep 16 09:11:38 laptop-misha kwin_wayland[11127]: kwin_scene_opengl: 0x502: GL_INVALID_OPERATION error generated. Framebuffer name must be generated before being bound.
Sep 16 09:11:38 laptop-misha kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
Sep 16 09:11:38 laptop-misha kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
Sep 16 09:11:38 laptop-misha kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
Sep 16 09:11:38 laptop-misha kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
Sep 16 09:11:38 laptop-misha kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
Sep 16 09:11:38 laptop-misha kernel: [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
Sep 16 09:11:38 laptop-misha kwin_wayland_wrapper[11191]: src/nv_gbm.c:123: GBM-DRV error (nv_gbm_bo_create): DRM_IOCTL_NVIDIA_GEM_ALLOC_NVKMS_MEMORY failed (ret=-1)
Sep 16 09:11:38 laptop-misha kwin_wayland_wrapper[11191]: src/nv_gbm.c:123: GBM-DRV error (nv_gbm_bo_create): DRM_IOCTL_NVIDIA_GEM_ALLOC_NVKMS_MEMORY failed (ret=-1)
Sep 16 09:11:38 laptop-misha kwin_wayland_wrapper[11191]: src/nv_gbm.c:123: GBM-DRV error (nv_gbm_bo_create): DRM_IOCTL_NVIDIA_GEM_ALLOC_NVKMS_MEMORY failed (ret=-1)
Sep 16 09:11:38 laptop-misha kwin_wayland_wrapper[11191]: src/nv_gbm.c:123: GBM-DRV error (nv_gbm_bo_create): DRM_IOCTL_NVIDIA_GEM_ALLOC_NVKMS_MEMORY failed (ret=-1)
Sep 16 09:11:38 laptop-misha kwin_wayland_wrapper[11191]: src/nv_gbm.c:123: GBM-DRV error (nv_gbm_bo_create): DRM_IOCTL_NVIDIA_GEM_ALLOC_NVKMS_MEMORY failed (ret=-1)
Sep 16 09:11:38 laptop-misha kwin_wayland_wrapper[11191]: src/nv_gbm.c:123: GBM-DRV error (nv_gbm_bo_create): DRM_IOCTL_NVIDIA_GEM_ALLOC_NVKMS_MEMORY failed (ret=-1)
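The log above repeats the same few messages many times, which makes the failure sequence hard to see. A minimal sketch for collapsing such journal output into per-message counts follows. It assumes the classic syslog-style prefix visible in the log (`Mon DD HH:MM:SS host process[pid]: message`); the regular expression and the function name `summarize` are illustrative assumptions, not part of any NVIDIA or systemd tooling.

```python
import re
from collections import Counter

# Matches "Sep 16 09:11:38 host process[pid]: message".
# The "[pid]" part is optional because kernel lines do not carry one.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s[\d:]{8}\s(?P<host>\S+)\s"
    r"(?P<proc>[^\s:\[]+)(?:\[\d+\])?:\s(?P<msg>.*)$"
)


def summarize(lines):
    """Count identical (process, message) pairs, ignoring timestamps and PIDs."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match["proc"], match["msg"])] += 1
    return counts


def format_summary(counts):
    """Render the counts as one line per distinct message, most frequent first."""
    return [f"{n}x {proc}: {msg}" for (proc, msg), n in counts.most_common()]
```

Passing the lines of the log to `summarize` and printing `format_summary(...)` reduces it to one entry per distinct message, so the `Failed to allocate NVKMS memory for GEM object` kernel error and the matching `DRM_IOCTL_NVIDIA_GEM_ALLOC_NVKMS_MEMORY failed` compositor error stand out.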

It is ridiculous how NVIDIA is treating its customers. I will no longer buy anything with the word "NVIDIA" in it or recommend this company to others. I highly doubt spending a little money to improve their Linux drivers would cause them to no longer be the "most valuable company in the world". However, who knows, maybe Microsoft is paying off NVIDIA to make their Linux drivers worse.

@Hellzbellz123

Can't play any multiplayer games because my computer randomly crashes and locks up when my $1500 GPU runs out of VRAM. I wish the 64 GB of system memory that's always empty were usable by the NVIDIA driver so this stopped happening. Definitely learned my lesson.

@MishaProductions

Yeah, I learned my lesson too, never buying anything with the "nvidia" logo ever again.

@v1993

v1993 commented Sep 27, 2024

FWIW: nvidia_uvm does provide shared memory - for CUDA, that is:

NVIDIA Unified Memory kernel module (/lib/modules/`uname -r`/kernel/drivers/video/nvidia-uvm.ko); this kernel module provides functionality for sharing memory between the CPU and GPU in CUDA programs. It is generally loaded into the kernel when a CUDA program is started, and is used by the CUDA driver on supported platforms.

I don't think Nvidia claims otherwise anywhere? I guess from their perspective the Linux GPU compute market is generally more important than the desktop one, and I do wonder if NVK and the family will be the better option for non-CUDA use cases soon enough.
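The path in the quoted description embeds the output of uname -r, that is, the release string of the running kernel. A small sketch of how that path expands is below. The helper name `uvm_module_path` is hypothetical, and the directory layout is taken verbatim from the quoted text; distribution packages may install the module somewhere else or in compressed form, so treat the result as a starting point rather than a guaranteed location.

```python
import os


def uvm_module_path(release=None):
    """Expand the module path from the quoted description for a kernel release.

    When `release` is omitted, the running kernel's release is used, which is
    the same string that `uname -r` prints.
    """
    if release is None:
        release = os.uname().release
    return f"/lib/modules/{release}/kernel/drivers/video/nvidia-uvm.ko"
```

For the kernel named in the report, `uvm_module_path("6.9.3-hardened1-1-hardened")` gives `/lib/modules/6.9.3-hardened1-1-hardened/kernel/drivers/video/nvidia-uvm.ko`; `os.path.exists` on that string shows whether the file is actually at that location on a given system.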

@MishaProductions

MishaProductions commented Sep 28, 2024

This doesn't appear to be a kernel issue, but one with the NVIDIA drivers specifically. After switching to a computer with an AMD GPU, the issue is gone. When I used my laptop with an NVIDIA GPU, Nouveau worked pretty well for me and had better performance compared to the closed source junk, but USB-C DSC is unsupported.

@mtijanic
Collaborator

I think it's time to close this now (yes, yes I know).

As mentioned many times before, this is a repo for kernel modules, monitored by developers working on the kernel modules, and the only issues that belong here are bug reports relating to kernel modules. This is a feature request, rather than a bug report, and one that has no kernel component. nvidia_uvm.ko is not relevant here.

The proper place to make these requests is the forums, where the overall end user sentiment is collected and sent up the management chain to someone who has the power to prioritize work on a feature. The developers on this repo cannot do this.

@martynhare

In my opinion, this bug report was closed in error.

This report is for a kernel component and it's not a feature request, it's people reporting that the new kernel module is lacking standard DRM functionality (namely GTT support). It would be similar to a bug report if the open kernel module failed to set the GPU clock speed beyond its initial performance state (user rightfully expects it to work).

NVIDIA used to have this working in the proprietary drivers for GeForce TurboCache support about 12 years ago and any long term buyer of NVIDIA's products expects this to still work on Linux.

7 participants