
Vicuna Models checkpoints transfer script #1657

Merged 8 commits into keras-team:master on Jun 7, 2024

Conversation

sineeli
Collaborator

@sineeli sineeli commented May 28, 2024

Successfully converted the checkpoints from Vicuna (torch) to a Keras 3 compatible format; please let me know if any refactoring is needed.

[screenshot: conversion output]

Thanks

@sineeli
Collaborator Author

sineeli commented May 29, 2024

cc: @mattdangerw

@mattdangerw (Member) left a comment


Looks good! A couple comments...

Next up, you could try uploading these to your individual user on Kaggle, and making a PR that updates our presets file here -> https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/src/models/llama/llama_presets.py

That would give us all a way to test the vicuna models end to end, then we can copy them to the Keras org on Kaggle when they look good.

Thanks!

print("\n-> Saved the tokenizer")

# === Upload the preset ===
uri = f"kaggle://keras/vicuna/keras/{preset}"
Member


let's do this like the phi3 script

from keras_nlp import upload_preset # noqa: E402

That will still allow people who do not have access to the keras Kaggle org to run this script.
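A minimal sketch of what that phi3-style layout looks like (the flag names and structure here are assumptions for illustration, not the actual script's CLI): the keras_nlp import needed for uploading is deferred until an upload URI is actually passed, so the conversion itself runs for everyone.

```python
import argparse

# Illustrative sketch of the phi3-style layout; flag names and structure
# are assumptions, not the actual conversion script's CLI.
parser = argparse.ArgumentParser(
    description="Convert Vicuna checkpoints to Keras 3."
)
parser.add_argument(
    "--preset", default="vicuna_1.5_7b_en", help="Preset name to convert."
)
parser.add_argument(
    "--upload_uri",
    default=None,
    help='Optional, e.g. "kaggle://<username>/vicuna/keras/vicuna_1.5_7b_en".',
)


def main(args):
    preset_dir = f"./{args.preset}"  # written by the conversion steps (omitted)
    if args.upload_uri:
        # Deferred import: only users who actually upload need keras_nlp's
        # Kaggle integration, so anyone can still run the conversion itself.
        from keras_nlp import upload_preset  # noqa: E402

        upload_preset(args.upload_uri, preset_dir)
```

Run without --upload_uri, such a script would only convert; passing an upload URI pointing at the user's own Kaggle account would push the preset there for end-to-end testing.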

from keras_nlp.models import LlamaCausalLMPreprocessor
from keras_nlp.models import LlamaTokenizer

PRESET_MAP = {"vicuna_1.5_7b_en": "lmsys/vicuna-7b-v1.5"}
Member


Is the weight conversion all the same as Llama 2? If so, could we consider consolidating the conversion scripts?

Collaborator Author


Yes, the weights follow the same Llama 2 architecture, so we can merge this with the existing script. I will try that. Thanks!
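A sketch of what that consolidation might look like: extend the Llama conversion script's preset map with the vicuna entry. The vicuna mapping is from this PR; the llama2 entry is illustrative and would need to be checked against the existing script.

```python
# Hypothetical consolidated preset map: the vicuna entry is from this PR,
# the llama2 entry is illustrative and should match the existing script.
PRESET_MAP = {
    "llama2_7b_en": "meta-llama/Llama-2-7b-hf",
    "vicuna_1.5_7b_en": "lmsys/vicuna-7b-v1.5",
}

# The rest of the conversion logic could then be shared, since both
# checkpoints use the same Llama 2 architecture.
```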

@sineeli
Collaborator Author

sineeli commented Jun 5, 2024

@mattdangerw

When we run on CPU and call .numpy() on the weights, it raises an error: bfloat16 scalar not supported. The phi3 script has no such call: it sets the backend to torch at the global level and loads the weights directly.

The Llama 2 weights are all in torch as well, so we can use the same approach as the phi3 weight conversion script.

The phi3 approach seems more fault tolerant.

When tested on CPU converting bfloat16 to bfloat16:

[screenshot: bfloat16 conversion error]

But when converting float16 (the Hugging Face default) to float32 (the Keras model), we do not hit this mismatch.

Thanks

@mattdangerw
Member

When we run on CPU and call .numpy() on the weights, it raises an error: bfloat16 scalar not supported. The phi3 script has no such call: it sets the backend to torch at the global level and loads the weights directly.

@sineeli I think keras.ops.convert_to_numpy(x) would gracefully handle bfloat16, maybe try that?
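As a standalone illustration of why that helps (this to_numpy helper is hypothetical; in the script you would call keras.ops.convert_to_numpy directly): NumPy has no native bfloat16 dtype, so a torch-style tensor must be upcast to float32 before .numpy() is called.

```python
import numpy as np


def to_numpy(x):
    # Hypothetical helper mirroring what keras.ops.convert_to_numpy does
    # for torch tensors on CPU: NumPy has no bfloat16 dtype, so upcast
    # to float32 before calling .numpy().
    if "bfloat16" in str(getattr(x, "dtype", "")):
        x = x.float()  # torch: bfloat16 -> float32
    if hasattr(x, "numpy"):
        return x.numpy()  # torch tensors and similar
    return np.asarray(x)  # already a NumPy array or plain Python data
```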

With what precision are the original pytorch checkpoints stored on disk? If they are at float16, we could just do the same (and store at float16 on disk). The disk format does not mean we need to load at that format.

Anyway, as soon as you push with the comments addressed above we can merge this PR. And keep working on the actual checkpoints we ship.

@mattdangerw
Member

@sineeli can you make your kaggle model public? I'll pull in the script but leave the new preset off for now, we can do that on a follow up PR.

@mattdangerw merged commit 50e0414 into keras-team:master on Jun 7, 2024
6 checks passed
@sineeli
Collaborator Author

sineeli commented Jun 8, 2024

@sineeli can you make your kaggle model public? I'll pull in the script but leave the new preset off for now, we can do that on a follow up PR.

Sure, waiting for the page to update. Thanks!

https://www.kaggle.com/models/sineeli/vicuna/keras/vicuna_1.5_7b_en
