feat: fix CUDA images and update go-llama to use full GPU offloading #618

Merged: mudler merged 6 commits into master from cuda_images on Jun 18, 2023

Conversation

mudler (Owner) commented Jun 17, 2023

nvcc is not compatible with GCC 12 yet. The golang:1.20 images were upgraded to Debian bookworm, which uses GCC 12 by default. This PR downgrades the images to bullseye.

Fixes: #611

Signed-off-by: mudler <mudler@localai.io>
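
For illustration, a minimal sketch of the image change described above (the exact Dockerfile in this repository differs; the base-image tag here is just one plausible choice):

    # golang:1.20 now resolves to Debian bookworm, whose default compiler is GCC 12,
    # which nvcc cannot yet use as a host compiler. Pin the bullseye variant instead.
    FROM golang:1.20-bullseye
    # ... rest of the build stage (copy sources, install the CUDA toolkit, make build) ...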
@mudler mudler marked this pull request as draft June 17, 2023 17:51
@mudler mudler changed the title from "images: Use gcc-11 with CUDA images" to "feat: fix CUDA images and update go-llama to use full GPU offloading" Jun 17, 2023
@mudler mudler added the "enhancement" (New feature or request) label Jun 17, 2023
@mudler mudler marked this pull request as ready for review June 17, 2023 22:34
@Aisuko Aisuko self-requested a review June 18, 2023 02:52
A comment from Aisuko was marked as outdated.

@Aisuko Aisuko mentioned this pull request Jun 18, 2023
@Aisuko Aisuko self-requested a review June 18, 2023 03:44
Aisuko (Collaborator) requested changes Jun 18, 2023

I have tested this PR on Linux (x86_64) with docker build -t localai . and docker build --build-arg BUILD_TYPE=cublas -t localai . but it is still not working very well.

mudler (Owner, Author) commented Jun 18, 2023

I have tested this PR on Linux (x86_64) with docker build -t localai . and docker build --build-arg BUILD_TYPE=cublas -t localai . but it is still not working very well.

Don't get that. What's not working well? CI builds fine.
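
For reference, a rough sketch of building the cuBLAS image and running it with GPU access (the image tag and port here are assumptions; the --gpus flag requires the NVIDIA Container Toolkit):

    # build the cuBLAS-enabled image from the repository root
    docker build --build-arg BUILD_TYPE=cublas -t localai .
    # run it with the GPU exposed to the container
    docker run --gpus all -p 8080:8080 localai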

@mudler mudler dismissed Aisuko’s stale review June 18, 2023 06:27

Works here

@mudler mudler merged commit d3d3187 into master Jun 18, 2023
@mudler mudler deleted the cuda_images branch June 18, 2023 06:27
Aisuko (Collaborator) commented Jun 18, 2023

I have tested this PR on Linux (x86_64) with docker build -t localai . and docker build --build-arg BUILD_TYPE=cublas -t localai . but it is still not working very well.

Don't get that. What's not working well? CI builds fine.

I tested it again; it works for me now.

m4xw commented Jun 19, 2023

After 6-7 requests I stumbled upon:
CUDA error 12 at /build/go-llama/llama.cpp/ggml-cuda.cu:2127: invalid pitch argument
https://github.com/ggerganov/llama.cpp/blob/d411968e990c37f51328849c96a743dd78f3c3dd/ggml-cuda.cu#L2127

Not sure if it's a memory issue on my end, since I'm pushing it quite hard right now and it works reliably on the first few requests.
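
Since the error only appears after several requests, one generic way to check whether GPU memory pressure is involved is to watch usage while reproducing (not specific to LocalAI):

    # sample GPU memory usage once per second while sending requests
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1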

Labels: enhancement (New feature or request)
Projects: None yet
Successfully merging this pull request may close these issues: Cublas build broken (#611)
3 participants