AMD support is completely broken - no load is placed on GPU #1684
Comments
@Expro did you set up gpu_layers in the model file? https://localai.io/features/gpu-acceleration/#model-configuration Maybe we should just expose that option from the CLI, or we should default to a high number, since that seems harmless even when there is no GPU.
I did set up gpu_layers in the model file, despite the documentation stating in at least two places that gpu_layers is only used with cuBLAS, so not for AMD. I swapped between a build for a single backend and one for multiple backends without touching the models, and one of them ran on the GPU while the other didn't. Defaulting to a high number of layers seems like a good idea, and so does exposing it through an environment variable and the CLI.
I'd also set threads to 0 and ensure gpu_layers is set to something like 100-120 in your case, to keep it off the CPU. I don't have an AMD GPU to reproduce this issue, so I'm taking a bit of a stab in the dark.
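To make the suggestion above concrete, a minimal model definition might look like the sketch below. It follows the YAML layout described at https://localai.io/features/gpu-acceleration/#model-configuration; the model name, the GGUF file, and the exact values are placeholders rather than a tested configuration.

```yaml
# models/mistral.yaml -- hypothetical model definition, values are illustrative
name: mistral
parameters:
  model: mistral-7b-instruct.Q4_K_M.gguf
context_size: 2048
# Offload (effectively) all layers to the GPU; a value higher than the model's
# actual layer count just offloads everything, so 100-120 keeps a 7B model on the GPU.
gpu_layers: 100
# As suggested above: avoid CPU worker threads so inference stays on the GPU.
threads: 0
f16: true
```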
This is the same sort of issue that I have had for a while; it is adjacent to #1592
I will be rebuilding this workstation soon and may have time after next week to do some more build tests and debugging.
This issue should be considered resolved; I have been continuously using my Radeon VII for at least two months now without issue.
Right, closing then, and thanks @jtwolfe for confirming!
LocalAI version:
Tested from 2.6.x up to the most recent commit.
Environment, CPU architecture, OS, and Version:
x64, Fedora 39
Describe the bug
Successfully compiled LocalAI with both CLBlast and hipBLAS for the llama.cpp backend. With DEBUG=true, the logs say layers are offloaded to the GPU, but that's a lie: GPU monitoring tools show 0% load the whole time. Placing other workloads on the GPU does show load in the monitoring tools, so this is not an issue with the monitoring tools.
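For context, the builds and the check were roughly as follows. This is a sketch, assuming the BUILD_TYPE values documented for LocalAI's Makefile and standard AMD monitoring tools (rocm-smi, radeontop); exact flags may differ between versions.

```bash
# Build the llama.cpp backend with hipBLAS (ROCm) support
make BUILD_TYPE=hipblas build

# Alternatively, build with CLBlast
make BUILD_TYPE=clblas build

# Run with debug logging to see the "offloading layers to GPU" messages
DEBUG=true ./local-ai --models-path ./models

# In a second terminal, watch GPU utilization while sending a request;
# with this bug it stays at 0% even though the log claims layers were offloaded
watch -n 1 rocm-smi   # or: radeontop
```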
To Reproduce
Expected behavior
Load is placed on the GPU.
EDIT: It seems the bug only triggers when building for a single backend. I rebuilt LocalAI for all backends and this time it works, despite using the same backend as in the single-backend build.
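For reference, the difference between the two builds described in the EDIT is roughly the one below. The GRPC_BACKENDS variable and the backend path are assumptions based on how LocalAI's Makefile selects which gRPC backends to compile; check the Makefile of your checkout for the exact names.

```bash
# Build every backend (the variant that ended up using the GPU)
make BUILD_TYPE=hipblas build

# Build only the llama.cpp backend (the variant where no GPU load was observed);
# GRPC_BACKENDS and the backend path below are assumptions, not verified flags
make BUILD_TYPE=hipblas GRPC_BACKENDS=backend-assets/grpc/llama-cpp build
```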
Considering that AMD GPUs give cheaper access to a bigger pool of VRAM, it would be very beneficial to have them properly supported.