feat: fix CUDA images and update go-llama to use full GPU offloading #618

Merged: mudler merged 6 commits into master from cuda_images on Jun 18, 2023

Conversation

mudler (Owner) commented Jun 17, 2023

nvcc is not compatible with GCC 12 yet. The golang:1.20 images were upgraded to Debian bookworm, which uses GCC 12 by default. This PR downgrades the images to bullseye.

Fixes: #611

Signed-off-by: mudler <mudler@localai.io>
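
For illustration, a minimal sketch of the image change described above (the exact Dockerfile in this repository differs; the base-image tag here is just one plausible choice):

    # golang:1.20 now resolves to Debian bookworm, whose default compiler is GCC 12,
    # which nvcc cannot yet use as a host compiler. Pin the bullseye variant instead.
    FROM golang:1.20-bullseye
    # ... rest of the build stage (copy sources, install the CUDA toolkit, make build) ...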
@mudler mudler marked this pull request as draft June 17, 2023 17:51
@mudler mudler changed the title from "images: Use gcc-11 with CUDA images" to "feat: fix CUDA images and update go-llama to use full GPU offloading" Jun 17, 2023
@mudler mudler added the "enhancement" (New feature or request) label Jun 17, 2023
@mudler mudler marked this pull request as ready for review June 17, 2023 22:34
@Aisuko Aisuko self-requested a review June 18, 2023 02:52
A comment from Aisuko was marked as outdated.

@Aisuko Aisuko mentioned this pull request Jun 18, 2023
@Aisuko Aisuko self-requested a review June 18, 2023 03:44
Aisuko (Collaborator) requested changes Jun 18, 2023

I have tested this PR on Linux (x86_64) with docker build -t localai . and docker build --build-arg BUILD_TYPE=cublas -t localai . but it is still not working very well.

mudler (Owner, Author) commented Jun 18, 2023

I have tested this PR on Linux (x86_64) with docker build -t localai . and docker build --build-arg BUILD_TYPE=cublas -t localai . but it is still not working very well.

Don't get that. What's not working well? CI builds fine.
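
For reference, a rough sketch of building the cuBLAS image and running it with GPU access (the image tag and port here are assumptions; the --gpus flag requires the NVIDIA Container Toolkit):

    # build the cuBLAS-enabled image from the repository root
    docker build --build-arg BUILD_TYPE=cublas -t localai .
    # run it with the GPU exposed to the container
    docker run --gpus all -p 8080:8080 localai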

@mudler mudler dismissed Aisuko’s stale review June 18, 2023 06:27

Works here

@mudler mudler merged commit d3d3187 into master Jun 18, 2023
@mudler mudler deleted the cuda_images branch June 18, 2023 06:27
Aisuko (Collaborator) commented Jun 18, 2023

I have tested this PR on Linux (x86_64) with docker build -t localai . and docker build --build-arg BUILD_TYPE=cublas -t localai . but it is still not working very well.

Don't get that. What's not working well? CI builds fine.

I tested it again; it works for me now.

m4xw commented Jun 19, 2023

After 6-7 requests I stumbled upon:
CUDA error 12 at /build/go-llama/llama.cpp/ggml-cuda.cu:2127: invalid pitch argument
https://github.com/ggerganov/llama.cpp/blob/d411968e990c37f51328849c96a743dd78f3c3dd/ggml-cuda.cu#L2127

Not sure if it's a memory issue on my end, since I'm pushing it quite hard right now and it works reliably on the first few requests.
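
Since the error only appears after several requests, one generic way to check whether GPU memory pressure is involved is to watch usage while reproducing (not specific to LocalAI):

    # sample GPU memory usage once per second while sending requests
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1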

Labels: enhancement (New feature or request)
Projects: None yet
Successfully merging this pull request may close these issues: Cublas build broken (#611)
3 participants