
AMD support is completely broken - no load is placed on GPU #1684

Closed
Expro opened this issue Feb 6, 2024 · 6 comments
Labels
bug Something isn't working unconfirmed

Comments

@Expro

Expro commented Feb 6, 2024

LocalAI version:

Tested from 2.6.x to most current commit.

Environment, CPU architecture, OS, and Version:
x64, Fedora 39

Describe the bug
Successfully compiled LocalAI with both CLBlast and hipBLAS for the llama.cpp backend. With DEBUG=true, the logs say layers are offloaded to the GPU, but that is not true: GPU monitoring tools show 0% load the entire time. Placing other workloads on the GPU does show load in the monitoring tools, so it is not an issue with the monitoring tools.

To Reproduce

  1. Compile LocalAI for CLBlast or hipBLAS.
  2. Launch it on a machine with an AMD GPU, the amdgpu driver, and ROCm installed.
  3. Observe that the logs say "layers offloaded", but no load is placed on the GPU. Only the CPU is utilized.
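For reference, a single-backend build and debug run look roughly like this. This is a sketch: the `BUILD_TYPE` values are assumptions based on the LocalAI build documentation and may differ by version, and `rocm-smi` is assumed to be available from the ROCm install.

```shell
# Sketch of the two single-backend builds described above.
# BUILD_TYPE values are assumptions; verify against your checkout's Makefile.

# CLBlast build
make BUILD_TYPE=clblas build

# hipBLAS (ROCm) build
make BUILD_TYPE=hipblas build

# Run with debug logging to see the "layers offloaded" messages
DEBUG=true ./local-ai

# In another terminal, watch actual GPU utilization while inferencing
watch -n 1 rocm-smi
```

This is the check that exposes the bug: the debug log claims offloading while `rocm-smi` reports 0% GPU use.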

Expected behavior

Load is placed on the GPU.

EDIT: It seems the bug only triggers when building for a single backend. I rebuilt LocalAI for all backends and this time it works, despite using the same backend that was used in the single-backend build.


Considering that AMD GPUs give cheaper access to a bigger pool of VRAM, it would be very beneficial to have them properly supported.

@Expro Expro added bug Something isn't working unconfirmed labels Feb 6, 2024
@mudler
Owner

mudler commented Feb 7, 2024

@Expro did you set up gpu_layers in the model file? https://localai.io/features/gpu-acceleration/#model-configuration

Maybe we should just expose that option from the CLI, or default to a high number, since that seems harmless when there is no GPU.
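For context, the model-file setting being discussed looks roughly like this. This is a sketch based on the linked GPU-acceleration docs; the model name and file are placeholders, and field defaults may vary by LocalAI version.

```yaml
# Hypothetical model file, e.g. models/my-model.yaml (names are placeholders).
# gpu_layers tells the llama.cpp backend how many layers to offload to the GPU.
name: my-model
backend: llama
parameters:
  model: my-model.gguf
gpu_layers: 35   # raise toward the model's full layer count to offload everything
f16: true
```

Without `gpu_layers` set, the backend keeps all layers on the CPU, which matches the "0% GPU load" symptom reported above.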

@Expro
Author

Expro commented Feb 7, 2024

I did set up gpu_layers in the model file, despite the documentation stating in at least two places that gpu_layers is only used with cuBLAS, so not for AMD. I swapped between the single-backend and multi-backend builds without touching the models; one of them ran on the GPU and the other didn't.

Defaulting to a high number of layers seems like a good idea, as does exposing it through an environment variable and the CLI.

@TheDarkTrumpet

I'd also set threads to 0, and ensure gpu_layers is set to something like 100-120 in your case, to keep it off the CPU. I don't have an AMD GPU to reproduce this issue, so this is a bit of a stab in the dark.
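That suggestion translates to roughly the following fields in the model file (a sketch; field names follow the LocalAI model-configuration docs, and the exact values are the guess above):

```yaml
# Sketch: force full offload and avoid CPU threads competing with the GPU.
threads: 0       # don't pin CPU worker threads
gpu_layers: 120  # higher than the model's layer count => offload all layers
```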

@jamiemoller

This is the same sort of issue that I have had for a while; it is adjacent to #1592.

  • Limited success building on bare metal: it builds on openSUSE Leap 15.4 and I can run llama.cpp directly, but some issue with the LocalAI <-> llama.cpp calls prevents LocalAI from working.
  • Still no success at all in Docker on Debian 12, Ubuntu 22.04, or openSUSE Leap 15.4; these installs were only ever performed manually.
  • Recent builds on Arch work but fail to execute in the same manner as noted in this issue, with 0% GPU usage.

I will be rebuilding this workstation soon and may have time after next week to do some more build tests and debugging

@jtwolfe
Contributor

jtwolfe commented Jun 1, 2024

This issue should be considered resolved; I have been continuously using my Radeon VII for at least two months now without issue.

@mudler
Owner

mudler commented Jun 1, 2024

Right - closing then, and thanks @jtwolfe for confirming!

@mudler mudler closed this as completed Jun 1, 2024
No branches or pull requests

5 participants