With tied embeddings adapter merged to tied layers #2018

Closed
2 of 4 tasks
ltoniazzi opened this issue Aug 19, 2024 · 5 comments

Comments

@ltoniazzi
Contributor

ltoniazzi commented Aug 19, 2024

System Info

peft=0.12.0
transformers=4.44.0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

With Gemma2, a model where tie_word_embeddings = True, using target_modules=["lm_head"] and then merging the adapter also merges the adapter into the tied embedding layer, which is incorrect.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
import torch
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
# model.config.tie_word_embeddings = False # doing this does not change the outcome

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["lm_head"],
    bias="none",
    task_type="CAUSAL_LM",
    init_lora_weights=False,
)
clone_lm_head_orig = model.lm_head.weight.data.clone()
model = get_peft_model(model, config)

# Check embed_tokens and lm_head point to the same data
assert model.model.model.embed_tokens.weight.data.data_ptr() == model.model.lm_head.weight.data.data_ptr()

# Merge adapter in the base model
model.merge_and_unload(safe_merge=True, adapter_names=["default"])

# Check adapter is merged
assert not torch.equal(clone_lm_head_orig, model.model.lm_head.weight.data)

# Check embedding layer is unchanged by the lm_head adapter merging
assert model.model.model.embed_tokens.weight.data.data_ptr() != model.model.lm_head.weight.data.data_ptr(), "Embedding layer should not have changed"

Expected behavior

I think that merging should not succeed silently, but should instead either:

  • succeed and raise a warning that you are merging adapters into layers whose weights are tied to other layers,
  • raise an error, or
  • perform the merge correctly by untying the weights first (though this might complicate saving the model, since a new, untied layer will be present); a manual version of this workaround is sketched below.
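As an illustration of the last option, here is a minimal sketch (not an official PEFT feature) of manually untying the weights before attaching the adapter, so that the merge only touches lm_head. It assumes the same Gemma2 model as in the reproduction above; module paths may differ for other architectures.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
import torch

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

# Untie: give lm_head its own copy of the weight so it no longer shares
# storage with embed_tokens, and record this in the config.
model.lm_head.weight = torch.nn.Parameter(model.lm_head.weight.data.clone())
model.config.tie_word_embeddings = False
assert model.lm_head.weight.data_ptr() != model.model.embed_tokens.weight.data_ptr()

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["lm_head"],
    bias="none",
    task_type="CAUSAL_LM",
    init_lora_weights=False,
)
clone_embed_orig = model.model.embed_tokens.weight.data.clone()
model = get_peft_model(model, config)
merged = model.merge_and_unload(safe_merge=True, adapter_names=["default"])

# The merge now only affects lm_head; the embedding weights are untouched.
assert torch.equal(clone_embed_orig, merged.model.embed_tokens.weight.data)

The cost is that the saved model now stores two separate weight matrices instead of one shared matrix, which is the saving complication mentioned in the last bullet.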

@BenjaminBossan
Member

Thanks for opening this issue. Yes, I agree that this is an easy source of errors, and having a warning would help.

The main reason why this is not implemented yet is that merging is a layer-level operation in PEFT. An individual layer, however, cannot know whether its weights are tied or not, so we cannot easily check for this. It could be possible to refactor this to work differently, but I don't see an easy way.

We could still try to make an educated guess based on model.config.tie_word_embeddings and the actual target_modules, which should help most users who face this situation. If you are interested in working on this, feel free to create a PR. Otherwise, I'll put this on the backlog and work on it when I have a bit of time on my hands.

Make the B matrix non-zero

This can also be achieved by passing init_lora_weights=False to the LoraConfig :)
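For reference, a rough sketch of what such an educated-guess check could look like. This is not the actual PEFT implementation; the helper name and the exact set of module names to flag are assumptions for illustration.

import warnings

def maybe_warn_tied_target(model_config, target_modules):
    # Hypothetical helper: warn when the adapter targets a module that is
    # commonly weight-tied (lm_head <-> embed_tokens) and the model config
    # declares tied word embeddings. An educated guess, not a guarantee.
    tied = getattr(model_config, "tie_word_embeddings", False)
    commonly_tied = {"lm_head", "embed_tokens"}
    hits = commonly_tied.intersection(set(target_modules or []))
    if tied and hits:
        warnings.warn(
            f"target_modules includes {sorted(hits)} but the model has "
            "tie_word_embeddings=True; merging the adapter into these layers "
            "will also modify the weights they are tied to."
        )

A call such as maybe_warn_tied_target(model.config, config.target_modules) at adapter-injection or merge time would cover the Gemma2 case from the reproduction above.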

@ltoniazzi
Contributor Author

If you are interested in working on this, feel free to create a PR.

Yes sure, happy to have a go at it later this week!

@BenjaminBossan
Member

Fantastic, thanks. Don't hesitate to ask me if something is unclear, or to create a draft PR for early feedback.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan
Member

Resolved via #2025.
