I tried to load the LoRA training adapters from a DeepSpeed checkpoint:
Directory listing:
ls Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2/checkpoint-6000
total 696M
-rw-r--r-- 1 schwan46494@gmail.com CU 775 Nov 18 11:03 adapter_config.json
-rw-r--r-- 1 schwan46494@gmail.com CU 686M Nov 18 11:03 adapter_model.safetensors
-rw-rw-r-- 1 schwan46494@gmail.com CU 1.4K Nov 18 16:54 config.json
drwxr-xr-x 2 schwan46494@gmail.com CU 4.0K Nov 18 11:03 global_step6000
-rw-r--r-- 1 schwan46494@gmail.com CU 15 Nov 18 11:03 latest
-rw-r--r-- 1 schwan46494@gmail.com CU 5.1K Nov 18 11:03 README.md
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_0.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_1.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_2.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_3.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_4.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_5.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_6.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 16K Nov 18 11:03 rng_state_7.pth
-rw-r--r-- 1 schwan46494@gmail.com CU 1.1K Nov 18 11:03 scheduler.pt
-rw-r--r-- 1 schwan46494@gmail.com CU 221 Nov 18 11:03 special_tokens_map.json
-rw-r--r-- 1 schwan46494@gmail.com CU 50K Nov 18 11:03 tokenizer_config.json
-rw-r--r-- 1 schwan46494@gmail.com CU 8.7M Nov 18 11:03 tokenizer.json
-rw-r--r-- 1 schwan46494@gmail.com CU 1023K Nov 18 11:03 trainer_state.json
-rw-r--r-- 1 schwan46494@gmail.com CU 6.5K Nov 18 11:03 training_args.bin
-rwxr--r-- 1 schwan46494@gmail.com CU 25K Nov 18 11:03 zero_to_fp32.py
instead of the usual Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2 directory, because I want to perform error analysis of when my model becomes corrupted.
I load it with this code:
# model_path is Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2/checkpoint-6000
# base_model_path is a Bunny variant model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,  # use torch.float32 on CPU
    device_map='auto',
    trust_remote_code=True)
model.load_adapter(model_path)

tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    trust_remote_code=True)
However, I get these warnings:
Some weights of the model checkpoint at /home/11001207/chawanP/Teerapol/llama-3-typhoon-v1.5-8b-vision-preview were not used when initializing BunnyLlamaForCausalLM: ['model.vision_tower.vision_tower.vision_model.encoder.layers.26.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.head.attention.in_proj_bias', 'model.vision_tower.vision_tower.vision_model.head.attention.in_proj_weight', 'model.vision_tower.vision_tower.vision_model.head.attention.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.head.attention.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.head.layernorm.bias', 'model.vision_tower.vision_tower.vision_model.head.layernorm.weight', 'model.vision_tower.vision_tower.vision_model.head.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.head.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.head.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.head.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.head.probe']
This IS expected if you are initializing BunnyLlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing BunnyLlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Loading adapter weights from /home/11001207/chawanP/pak/Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2/checkpoint-6000 led to unexpected keys not found in the model: ['model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.k_proj.lora_A.default.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.k_proj.lora_B.default.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.q_proj.lora_A.default.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.q_proj.lora_B.default.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.v_proj.lora_A.default.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.26.self_attn.v_proj.lora_B.default.weight'].
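For error analysis it may help to first check which LoRA keys the checkpoint's adapter_model.safetensors actually contains. A minimal sketch using safetensors, assuming the checkpoint path from the listing above:

# Minimal sketch: list the LoRA tensors stored in adapter_model.safetensors
# to see whether vision-tower modules (e.g. encoder layer 26) were adapted.
import os
from safetensors import safe_open

ckpt_dir = "Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2/checkpoint-6000"
with safe_open(os.path.join(ckpt_dir, "adapter_model.safetensors"),
               framework="pt", device="cpu") as f:
    vision_keys = [k for k in f.keys() if "vision_tower" in k]

# If layer-26 vision-tower LoRA keys appear here but the loaded base model has
# no matching modules, that would line up with the "unexpected keys" warning above.
print(len(vision_keys), "vision-tower LoRA tensors")
print(vision_keys[:5])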
Questions
Is the model not fusing the vision adapters?
How can I load or convert these checkpoints? (Their schema is different: they have no non_lora_trainable.bin, no config.json, and so on.)
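For the conversion part, the global_step6000 folder and zero_to_fp32.py in the listing look like a standard DeepSpeed ZeRO checkpoint, so DeepSpeed's documented consolidation helper might apply. A minimal sketch, assuming that helper fits this checkpoint layout:

# Minimal sketch, assuming DeepSpeed's documented fp32 consolidation helper
# applies to this checkpoint layout; the path is taken from the listing above.
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

ckpt_dir = "Bunny/checkpoints-llama3-8b/bunny-lora-llama3-8b-attempt2/checkpoint-6000"

# Reads the shards under global_step6000 (the tag recorded in the `latest` file)
# and returns a consolidated fp32 state dict on CPU.
state_dict = get_fp32_state_dict_from_zero_checkpoint(ckpt_dir)
print(len(state_dict), "consolidated tensors")

Whether this also recovers the non-LoRA trainables that non_lora_trainable.bin normally holds is part of what I am unsure about.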