Add cmake option to build without CUDA VMM #6889
Labels: enhancement (New feature or request)
Comments
Feel free to open a PR if this is useful for you; I was never able to measure a performance difference. However, make the flag negative so that it is still enabled without additional compilation flags, and use the |
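To illustrate the "negative flag" suggestion, here is a minimal sketch; the macro name GGML_CUDA_NO_VMM is an assumption for the example, not necessarily the spelling that was eventually merged:

```cpp
// Sketch: with a negatively-named macro, a default build (no extra flags)
// leaves it undefined, so CUDA VMM stays enabled; only builds that explicitly
// define the macro opt out.
#ifdef GGML_CUDA_NO_VMM
    // build opted out of VMM: fall back to the cudaMalloc-based memory pool
#else
    // default path: probe the device and use the VMM pool when supported
#endif
```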
Commits referencing this issue (all carry the same message: "Add an option to build ggml cuda without CUDA VMM", resolves ggerganov#6889, https://forums.developer.nvidia.com/t/potential-nvshmem-allocated-memory-performance-issue/275416/4):

- WilliamTambellini added commits to WilliamTambellini/llama.cpp on May 4 and May 6, 2024.
- github-actions bot pushed a commit to KerfuffleV2/ggml-sys-bleedingedge on May 7, 2024. Relevant log message from the source repo: commit 858f6b73f6e57a62523d16a955d565254be889b4, Author: William Tambellini <william.tambellini@gmail.com>, Date: Mon May 6 11:12:14 2024 -0700, "Add an option to build without CUDA VMM (#7067)".
- teleprint-me pushed a commit to teleprint-me/llama.cpp on May 7, 2024.
- ggerganov pushed commits to ggerganov/ggml on May 11, 2024.
- ggerganov pushed commits to ggerganov/whisper.cpp on May 12 and May 13, 2024.
- iThalay pushed commits to iThalay/whisper.cpp on Sep 23, 2024.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Feature Description
As of today, ggml-cuda.cu tries to take advantage of CUDA VMM when possible:
llama.cpp/ggml-cuda.cu
Line 116 in 784e11d
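(The embedded snippet does not render here. As an illustration only, not the verbatim ggml-cuda.cu code, the check in that area asks the CUDA driver whether the device supports virtual memory management, roughly like this:)

```cpp
#include <cuda.h>  // CUDA driver API

// Illustrative sketch: query whether a device supports the driver-API virtual
// memory management feature (cuMemCreate / cuMemAddressReserve / cuMemMap).
static bool device_supports_vmm(int id) {
    CUdevice device;
    int vmm_supported = 0;
    if (cuDeviceGet(&device, id) == CUDA_SUCCESS) {
        cuDeviceGetAttribute(&vmm_supported,
            CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED, device);
    }
    return vmm_supported != 0;
}
```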
This is not necessarily possible or desired, e.g.:
https://forums.developer.nvidia.com/t/potential-nvshmem-allocated-memory-performance-issue/275416/4
Would you mind if I add a cmake/make option/define in order to build ggml-cuda with (default) or without VMM support?
Best
WT
Motivation
To build llama.cpp/ggml without VMM support.
Possible Implementation
Add a CMake option 'GGML_USE_CUDA_VMM' (default ON),
and then, at line 115:
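A minimal sketch of how the proposal could be wired up, assuming the CMake option simply injects a compile definition. Apart from GGML_USE_CUDA_VMM, which is quoted from the proposal above, the names are illustrative and may differ from what was merged in #7067, which followed the maintainer's suggestion of a negatively-named flag:

```cpp
// CMake side (sketch):
//   option(GGML_USE_CUDA_VMM "ggml: use the CUDA VMM memory pool" ON)
//   if (NOT GGML_USE_CUDA_VMM)
//       add_compile_definitions(GGML_CUDA_NO_VMM)
//   endif()

// C++ side (sketch): compile the VMM probe out when the definition is present,
// leaving device_vmm at 0 so the allocator falls back to the legacy pool.
static int ggml_cuda_vmm_enabled(int id) {   // hypothetical helper name
    int device_vmm = 0;
    (void) id;                               // unused when VMM is compiled out
#if !defined(GGML_CUDA_NO_VMM)
    device_vmm = device_supports_vmm(id) ? 1 : 0;  // probe from the earlier sketch
#endif
    return device_vmm;
}
```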