
Add ROCM aliases for CUDA pool stuff #3918

Merged
Merged 1 commit into ggerganov:master from fix-rocm-build on Nov 2, 2023

Conversation

KerfuffleV2 (Collaborator) commented Nov 2, 2023

Hey, guess what? ROCM is broken again! The good news is it seems like an easy fix. (BTW, sorry GG, I didn't see your other request for review in time.)
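For context, the fix boils down to a handful of preprocessor aliases in ggml-cuda.cu that map the CUDA memory-pool API onto its HIP equivalents. An abridged sketch of the approach (the exact alias list is in commit 629f917 and may differ from what's shown here):

```cpp
// Abridged sketch of the alias approach in ggml-cuda.cu; the exact list
// lives in commit 629f917. When building for HIP/ROCm, the CUDA
// memory-pool symbols are redirected to their HIP counterparts.
#if defined(GGML_USE_HIPBLAS)
#define cudaMemPool_t                   hipMemPool_t
#define cudaDeviceGetDefaultMemPool     hipDeviceGetDefaultMemPool
#define cudaMemPoolAttrReleaseThreshold hipMemPoolAttrReleaseThreshold
#define cudaMemPoolSetAttribute         hipMemPoolSetAttribute
#define cudaMallocAsync                 hipMallocAsync
#define cudaFreeAsync                   hipFreeAsync
#endif
```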

@KerfuffleV2 added the bug (Something isn't working), build (Compilation issues), and AMD GPU (Issues specific to AMD GPUs) labels on Nov 2, 2023
@ggerganov merged commit 629f917 into ggerganov:master on Nov 2, 2023
27 of 31 checks passed
KerfuffleV2 (Collaborator, Author) commented Nov 2, 2023

Though it compiles with this, something still seems wrong. I'm not sure if it's ROCM specific. Offloading the last non-repeating layer (KV) produces all NaNs. I.e., the Orca 3B model with 26 actual layers is okay offloading up to 28, but 29 doesn't work. Mistral with 32 real layers is okay at 34, but 35 doesn't work.
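A hypothetical sketch of the arithmetic (illustrative names and thresholds, not llama.cpp's actual offload code): each --n-gpu-layers step past the repeating layer count pulls in one more non-repeating tensor, and in both reports the failure lands on the third extra step, n_layer + 3, i.e. the KV tensors.

```cpp
// Hypothetical illustration of the reported breakpoints; not the actual
// llama.cpp offload logic. For Orca 3B (n_layer = 26): 27 and 28 work,
// 29 (the KV step) produces NaNs. For Mistral (n_layer = 32): 34 works,
// 35 fails. Both failures match n_layer + 3.
#include <cstdio>

int main() {
    const int n_layer = 26;                        // repeating layers (Orca 3B)
    for (int ngl = n_layer + 1; ngl <= n_layer + 3; ++ngl) {
        const bool norm   = ngl >= n_layer + 1;    // output norm
        const bool output = ngl >= n_layer + 2;    // output tensor
        const bool kv     = ngl >= n_layer + 3;    // KV: NaNs observed here
        std::printf("ngl=%d norm=%d output=%d kv=%d\n", ngl, norm, output, kv);
    }
    return 0;
}
```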

edit: Actually, it seems like it's okay when compiling without LLAMA_FAST. I still don't know if it only affects ROCM, though.
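LLAMA_FAST builds with -Ofast, which implies -ffast-math; under that flag the compiler may assume NaN and Inf never occur, so NaN-producing math and NaN checks can behave differently than in a normal build. A minimal standalone illustration, unrelated to llama.cpp's actual kernels; compile once with -O2 and once with -Ofast and compare:

```cpp
// Illustrative only: shows how -ffast-math (implied by -Ofast, which
// LLAMA_FAST turns on) changes NaN handling.
#include <cmath>
#include <cstdio>

int main() {
    volatile float zero = 0.0f;   // volatile blocks constant folding
    float x = zero / zero;        // NaN at runtime
    // Under -ffast-math the compiler may assume NaN never occurs and
    // fold this check to false, silently hiding the NaN.
    std::printf("isnan(x) = %d\n", std::isnan(x) ? 1 : 0);
    return 0;
}
```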

cebtenzzre (Collaborator) commented:

> edit: Actually, it seems like it's okay when compiling without LLAMA_FAST. I still don't know if it only affects ROCM, though.

Could be related to #2268 (comment)

slaren added a commit that referenced this pull request Nov 4, 2023
ggerganov pushed a commit that referenced this pull request Nov 5, 2023
* Revert "cuda : add ROCM aliases for CUDA pool stuff (#3918)"

This reverts commit 629f917.

* Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)"

This reverts commit d606905.

ggml-ci
@KerfuffleV2 deleted the fix-rocm-build branch on November 17, 2023
olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023