Add cmake option to build without CUDA VMM #6889

WilliamTambellini · 2024-04-25T00:12:28Z

Prerequisites

Please answer the following questions for yourself before submitting an issue.

[X] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
[X] I carefully followed the README.md.
[X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
[X] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

As of today, ggml-cuda.cu tries to take advantage of CUDA VMM (virtual memory management) whenever it is available:

llama.cpp/ggml-cuda.cu

Line 116 in 784e11d

CUdevice device;

This is not necessarily possible or desired, e.g.:
https://forums.developer.nvidia.com/t/potential-nvshmem-allocated-memory-performance-issue/275416/4

Would you mind if I add a cmake/make option (a compile-time define) to build ggml-cuda with (default) or without VMM support?

Best
WT

Motivation

To be able to build llama.cpp/ggml without VMM.

Possible Implementation

Add a cmake arg 'GGML_USE_CUDA_VMM' (default ON), and then at line 115:

#if !defined(GGML_USE_HIPBLAS) && defined(GGML_USE_CUDA_VMM)
        CUdevice device;
        CU_CHECK(cuDeviceGet(&device, id));
        CU_CHECK(cuDeviceGetAttribute(&device_vmm, CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED, device));
     ...


slaren · 2024-04-25T00:20:28Z

Feel free to open a PR if this is useful for you; I was never able to measure a performance difference. However, make the flag negative so that VMM stays enabled without additional compilation flags, and use the GGML_CUDA prefix (e.g. something like GGML_CUDA_NO_VMM).
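
For illustration, here is a minimal sketch of what a negative flag could look like at the guard site. It is not the code from ggml-cuda.cu: query_vmm_supported_stub and device_vmm_enabled are hypothetical names, and the stub stands in for the real cuDeviceGet / cuDeviceGetAttribute calls so the snippet compiles without the CUDA toolkit. The macro name GGML_CUDA_NO_VMM simply follows the suggestion above.

```cpp
// Hypothetical stand-in for the driver query. The real code asks the CUDA
// driver via cuDeviceGetAttribute(...) with
// CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED.
static int query_vmm_supported_stub(int /*device_id*/) {
    return 1; // assumption for this sketch: the device reports VMM support
}

// Returns 1 if the VMM path should be used for this device, 0 otherwise.
// Because the flag is negative, a build without extra definitions keeps
// the VMM query, and only -DGGML_CUDA_NO_VMM compiles it out.
static int device_vmm_enabled(int device_id) {
    int device_vmm = 0;
#if !defined(GGML_USE_HIPBLAS) && !defined(GGML_CUDA_NO_VMM)
    device_vmm = query_vmm_supported_stub(device_id);
#else
    (void) device_id; // VMM compiled out: never query the device
#endif
    return device_vmm;
}
```

Compiled as is, device_vmm_enabled returns whatever the query reports; compiled with -DGGML_CUDA_NO_VMM, it always returns 0, so the default build behaves exactly as before.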

Add an option to build ggml cuda without CUDA VMM resolves ggerganov#6889 https://forums.developer.nvidia.com/t/potential-nvshmem-allocated-memory-performance-issue/275416/4

== Relevant log messages from source repo:

commit 858f6b73f6e57a62523d16a955d565254be889b4
Author: William Tambellini <william.tambellini@gmail.com>
Date: Mon May 6 11:12:14 2024 -0700

    Add an option to build without CUDA VMM (#7067)

    Add an option to build ggml cuda without CUDA VMM
    resolves ggerganov/llama.cpp#6889
    https://forums.developer.nvidia.com/t/potential-nvshmem-allocated-memory-performance-issue/275416/4


WilliamTambellini added the enhancement New feature or request label Apr 25, 2024

WilliamTambellini mentioned this issue May 4, 2024

Add an option to build without CUDA VMM #7067

Merged

slaren closed this as completed in 858f6b7 May 6, 2024
