Bug: Gemma-2 not supported on b3262 #8195
Comments
Can you show your build log? I can confirm it works for me (though not specifically b3262; I will update and verify), so I'm wondering if it cached some build artifacts and you didn't actually get the latest code.
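One way to rule out a stale build is to wipe the build directory and rebuild from the current checkout. A minimal sketch, assuming a CMake-based build (a Makefile build can get the same effect with make clean):
$ git pull
$ rm -rf build
$ cmake -B build
$ cmake --build build --config Release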
gemma-2-9b-it works fine for me: Q6_K quant, converted and launched using llama.cpp b3259.
The name of the server binary has been changed to llama-server.
@nmandic78 See line 14 in 26a39bb and try that.
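For reference, a minimal sketch of launching the renamed server binary; the model path and port here are placeholders, not values from the original comment:
$ ./llama-server -m ./gemma-2-9b-it-Q6_K.gguf -c 4096 --port 8080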
Oh, I feel so stupid now :D Should have read that. Thank you!
Gemma 9B does not work with llama.cpp b3259: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'gemma2'
I still see the following when running the gemma-2 27B Q4_K_M from https://huggingface.co/bartowski/gemma-2-27b-it-GGUF after updating to the latest commits: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'gemma2'
I just tested a build from current master (d0a7145) using a freshly downloaded Q3_K_S quant from this repo, launched like this:
@wesleysanjose make sure you're using llama-cli and have all the latest binaries.
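A quick way to confirm you are running a freshly built binary is to compare its reported commit against the checkout, for example:
$ ./llama-cli --version
$ git log -1 --oneline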
@bartowski1182 I use server to launch an OpenAI-compatible server; what's the difference? I always use this: ./server -m $1 --n-gpu-layers $2 -c $3 --host 192.168.0.184 --port 5000 -b 4096 -to 120 -ts 20,6
The binary names were updated a few weeks ago, so you're using the old ones that have been sitting around. It should be ./llama-server.
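Applied to the command above, the launch line would look roughly like this (same flags, only the binary name changes; $1, $2 and $3 are the user's own placeholders):
$ ./llama-server -m $1 --n-gpu-layers $2 -c $3 --host 192.168.0.184 --port 5000 -b 4096 -to 120 -ts 20,6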
That works, thank you so much @bartowski1182!
I'm getting this error when attempting to quantize a bf16 GGUF after building on Windows. I'm tacking this on here, as a fix may be related.
$ ./llama-cli --version
And yet:
Eventually ends with:
@jim-plus the quantize binary was also renamed alongside main & server; give llama-quantize a try.
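A hedged sketch of the renamed quantize invocation; the input and output filenames here are placeholders:
$ ./llama-quantize ./model-bf16.gguf ./model-Q4_K_M.gguf Q4_K_M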
Ah, that did it (along with clearing out all the old binaries for a clean rebuild). Thanks!
What happened?
I pulled and built b3262, but when loading the model (with both the server and the CLI) I get a response that gemma2 is an unknown architecture.
$ git log -1 --oneline
38373cf (HEAD -> master, tag: b3262, origin/master, origin/HEAD) Add SPM infill support (#8016)
Looking at the release notes, I expected it to be supported since two releases earlier:
b3259
llama: Add support for Gemma2ForCausalLM (#8156)
Inference support for Gemma 2 model family
Am I missing something (as I don't see anybody else complaining)?
Name and Version
version: 3262 (38373cf)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output