With tied embeddings adapter merged to tied layers #2018

Closed
2 of 4 tasks
ltoniazzi opened this issue Aug 19, 2024 · 5 comments

Comments

@ltoniazzi
Contributor

ltoniazzi commented Aug 19, 2024

System Info

peft=0.12.0
transformers=4.44.0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

With Gemma2, a model where tie_word_embeddings = True, using target_modules=["lm_head"] and then merging the adapter also merges the adapter into the tied embedding layer, which is incorrect.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
import torch
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
# model.config.tie_word_embeddings = False # doing this does not change the outcome

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["lm_head"],
    bias="none",
    task_type="CAUSAL_LM",
    init_lora_weights=False,
)
clone_lm_head_orig = model.lm_head.weight.data.clone()
model = get_peft_model(model, config)

# Check embed_tokens and lm_head point to the same data
assert model.model.model.embed_tokens.weight.data.data_ptr() == model.model.lm_head.weight.data.data_ptr()

# Merge adapter in the base model
model.merge_and_unload(safe_merge=True, adapter_names=["default"])

# Check adapter is merged
assert not torch.equal(clone_lm_head_orig, model.model.lm_head.weight.data)

# Check embedding layer is unchanged by the lm_head adapter merging
assert model.model.model.embed_tokens.weight.data.data_ptr() != model.model.lm_head.weight.data.data_ptr(), "Embedding layer should not have changed"

Expected behavior

I think that merging should not succeed silently, but should instead either:

  • succeed and raise a warning that you are merging adapters into layers whose weights are tied to other layers,
  • raise an error, or
  • perform the merge correctly by untying the weights first (though this might complicate saving the model, since a new, untied layer will be present); a manual version of this workaround is sketched below.
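As an illustration of the last option, here is a minimal sketch (not an official PEFT feature) of manually untying the weights before attaching the adapter, so that the merge only touches lm_head. It assumes the same Gemma2 model as in the reproduction above; module paths may differ for other architectures.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
import torch

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

# Untie: give lm_head its own copy of the weight so it no longer shares
# storage with embed_tokens, and record this in the config.
model.lm_head.weight = torch.nn.Parameter(model.lm_head.weight.data.clone())
model.config.tie_word_embeddings = False
assert model.lm_head.weight.data_ptr() != model.model.embed_tokens.weight.data_ptr()

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["lm_head"],
    bias="none",
    task_type="CAUSAL_LM",
    init_lora_weights=False,
)
clone_embed_orig = model.model.embed_tokens.weight.data.clone()
model = get_peft_model(model, config)
merged = model.merge_and_unload(safe_merge=True, adapter_names=["default"])

# The merge now only affects lm_head; the embedding weights are untouched.
assert torch.equal(clone_embed_orig, merged.model.embed_tokens.weight.data)

The cost is that the saved model now stores two separate weight matrices instead of one shared matrix, which is the saving complication mentioned in the last bullet.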

@BenjaminBossan
Member

Thanks for opening this issue. Yes, I agree that this is an easy source of errors, and having a warning would help.

The main reason why this is not implemented yet is that merging is a layer-level operation in PEFT. An individual layer, however, cannot know whether its weights are tied or not, so we cannot easily check for this. It could be possible to refactor this to work differently, but I don't see an easy way.

We could still try to make an educated guess based on model.config.tie_word_embeddings and the actual target_modules, which should help most users who face this situation. If you are interested in working on this, feel free to create a PR. Otherwise, I'll put this on the backlog and work on it when I have a bit of time on my hands.

Make the B matrix non-zero

This can also be achieved by passing init_lora_weights=False to the LoraConfig :)
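For reference, a rough sketch of what such an educated-guess check could look like. This is not the actual PEFT implementation; the helper name and the exact set of module names to flag are assumptions for illustration.

import warnings

def maybe_warn_tied_target(model_config, target_modules):
    # Hypothetical helper: warn when the adapter targets a module that is
    # commonly weight-tied (lm_head <-> embed_tokens) and the model config
    # declares tied word embeddings. An educated guess, not a guarantee.
    tied = getattr(model_config, "tie_word_embeddings", False)
    commonly_tied = {"lm_head", "embed_tokens"}
    hits = commonly_tied.intersection(set(target_modules or []))
    if tied and hits:
        warnings.warn(
            f"target_modules includes {sorted(hits)} but the model has "
            "tie_word_embeddings=True; merging the adapter into these layers "
            "will also modify the weights they are tied to."
        )

A call such as maybe_warn_tied_target(model.config, config.target_modules) at adapter-injection or merge time would cover the Gemma2 case from the reproduction above.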

@ltoniazzi
Contributor Author

If you are interested in working on this, feel free to create a PR.

Yes sure, happy to have a go at it later this week!

@BenjaminBossan
Member

Fantastic, thanks. Don't hesitate to ask me if something is unclear, or to create a draft PR for early feedback.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan
Member

Resolved via #2025.
