Can't import vicuna models : (bad f16 value 5) #7

Closed
mh4ckt3mh4ckt1c4s opened this issue May 10, 2023 · 7 comments

@mh4ckt3mh4ckt1c4s

When I try to load the Vicuna models downloaded from this page, I get the following error:

# pyllamacpp /models/ggml-vicuna-7b-1.1-q4_2.bin 


██████╗ ██╗   ██╗██╗     ██╗      █████╗ ███╗   ███╗ █████╗  ██████╗██████╗ ██████╗ 
██╔══██╗╚██╗ ██╔╝██║     ██║     ██╔══██╗████╗ ████║██╔══██╗██╔════╝██╔══██╗██╔══██╗
██████╔╝ ╚████╔╝ ██║     ██║     ███████║██╔████╔██║███████║██║     ██████╔╝██████╔╝
██╔═══╝   ╚██╔╝  ██║     ██║     ██╔══██║██║╚██╔╝██║██╔══██║██║     ██╔═══╝ ██╔═══╝ 
██║        ██║   ███████╗███████╗██║  ██║██║ ╚═╝ ██║██║  ██║╚██████╗██║     ██║     
╚═╝        ╚═╝   ╚══════╝╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝  ╚═╝ ╚═════╝╚═╝     ╚═╝     
                                                                                    

PyLLaMACpp
A simple Command Line Interface to test the package
Version: 2.1.3 

         
=========================================================================================

[+] Running model `/models/ggml-vicuna-7b-1.1-q4_2.bin`
[+] LLaMA context params: `{}`
[+] GPT params: `{}`
llama_model_load: loading model from '/models/ggml-vicuna-7b-1.1-q4_2.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 5
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: invalid model file '/models/ggml-vicuna-7b-1.1-q4_2.bin' (bad f16 value 5)
llama_init_from_file: failed to load model
Segmentation fault (core dumped)

I do not have this problem when using the gpt4all models, and running the Vicuna models with the latest version of llama.cpp works just fine.
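
For reference, a rough sketch of checking the same file against upstream llama.cpp (the paths and the prompt are placeholders, and the exact build steps depend on the llama.cpp revision):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make
# try to load the same model file that pyllamacpp rejects
./main -m /models/ggml-vicuna-7b-1.1-q4_2.bin -p "Hello" -n 16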

@pajoma
Contributor

pajoma commented May 10, 2023

Same for me with the model Pi3141/alpaca-native-7B-ggml.

Output from llama.cpp

llama.cpp: loading model from ./models/ggml-model-q5_1.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 9 (mostly Q5_1)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =  68,20 KB
llama_model_load_internal: mem required  = 6612,58 MB (+ 1026,00 MB per state)
llama_init_from_file: kv self size  = 1024,00 MB

Output from pyllamacpp

[+] Running model `models/ggml-model-q5_1.bin`
[+] LLaMA context params: `{'n_ctx': 2048}`
[+] GPT params: `{}`
llama_model_load: loading model from 'models/ggml-model-q5_1.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 2048
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 9
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: invalid model file 'models/ggml-model-q5_1.bin' (bad f16 value 9)
llama_init_from_file: failed to load model
Segmentation fault

@mh4ckt3mh4ckt1c4s
Author

The original page has been archived, but the links are still available here: https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-chat

@absadiki
Owner

Thanks @mh4ckt3mh4ckt1c4s for reporting the issue.
Maybe there are new updates on the llama.cpp side. I'll try to sync the repo once I get some time.

@absadiki
Owner

Hi guys, I pushed a new release, v2.2.0. Could you please give it a try?
I tested it with Vicuna and Alpaca and both seem to be working on my end.
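
To try it, upgrading the package from PyPI should be all that's needed, e.g.:

pip install --upgrade pyllamacpp
# or pin the exact release
pip install pyllamacpp==2.2.0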

@mh4ckt3mh4ckt1c4s
Author

Hello, I tested with Vicuna and it works with 2.2.0 but not with the latest 2.3.0. Is that normal?

@absadiki
Owner

Hello, I tested with Vicuna and it works with 2.2.0 but not with the latest 2.3.0. Is that normal?

@mh4ckt3mh4ckt1c4s Yes, that is normal: recent llama.cpp changes broke older model files, so you will need to re-quantize the old models to work with the new update.
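
For anyone else hitting this, the usual re-quantization workflow with the llama.cpp tools looks roughly like the sketch below; the script name, type argument, and paths are assumptions that depend on the llama.cpp revision and on having the original (unquantized) weights:

# convert the original weights to the current ggml format (f16)
python3 convert.py models/7B/
# quantize the f16 file to the desired type (name or numeric id, depending on the revision)
./quantize models/7B/ggml-model-f16.bin models/7B/ggml-model-q5_1.bin q5_1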

@mh4ckt3mh4ckt1c4s
Author

Okay, so from my point of view this issue is closed. Thanks for your work!
