There seems to be an error in LoRA's initialization #1060

Closed

ZexiLee opened this issue Oct 28, 2023 · 3 comments

Comments

ZexiLee commented Oct 28, 2023

System Info

In lora.py, lines 205-213, the code is:

    def reset_lora_parameters(self, adapter_name):
        if adapter_name in self.lora_A.keys():
            # initialize A the same way as the default for nn.Linear and B to zero
            nn.init.kaiming_uniform_(self.lora_A[adapter_name].weight, a=math.sqrt(5))
            nn.init.zeros_(self.lora_B[adapter_name].weight)
        if adapter_name in self.lora_embedding_A.keys():
            # initialize a the same way as the default for nn.linear and b to zero
            nn.init.zeros_(self.lora_embedding_A[adapter_name])
            nn.init.normal_(self.lora_embedding_B[adapter_name])

Two questions:

  • For the "if adapter_name in self.lora_embedding_A.keys():" branch, the initialization of A and B looks reversed: B should be zeros, but the implementation sets A to zeros. The correct code would therefore be:
        if adapter_name in self.lora_embedding_A.keys():
            # initialize A the same way as the default for nn.Linear and B to zero
            nn.init.normal_(self.lora_embedding_A[adapter_name])
            nn.init.zeros_(self.lora_embedding_B[adapter_name])
  • Additionally, why is Kaiming initialization used for A in the "if adapter_name in self.lora_A.keys():" branch? In the original paper, A is initialized from a Gaussian distribution (a small sketch of the paper-style initialization follows this list).
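
For reference, a minimal self-contained sketch of the paper-style initialization (Gaussian A, zero B) around a linear layer; the class and attribute names are illustrative, not the actual peft implementation:

    import torch
    import torch.nn as nn

    class TinyLoRALinear(nn.Module):
        """Illustrative LoRA wrapper, not the peft API."""
        def __init__(self, in_features, out_features, r=8):
            super().__init__()
            self.base = nn.Linear(in_features, out_features)
            # in LoRA the base weights are frozen; only A and B are trained
            for p in self.base.parameters():
                p.requires_grad_(False)
            self.lora_A = nn.Linear(in_features, r, bias=False)
            self.lora_B = nn.Linear(r, out_features, bias=False)
            # paper-style init: A ~ N(0, sigma^2), B = 0, so B(A(x)) contributes nothing at the start
            nn.init.normal_(self.lora_A.weight, std=0.02)
            nn.init.zeros_(self.lora_B.weight)

        def forward(self, x):
            return self.base(x) + self.lora_B(self.lora_A(x))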

Who can help?

@pacman100 @younesbelkada @sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

See above.

Expected behavior

If my understanding is correct, please fix the error; otherwise, kindly let me know where I am mistaken.

BenjaminBossan (Member) commented:

This is based on the LoRA implementation by Microsoft, so it's pretty much "official":

https://github.com/microsoft/LoRA/blob/a0a92e0f26c067cf94747bdbf1ce73793fa44d19/loralib/layers.py#L122-L125

Notice the comment they added, which addresses your point.
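
For what it's worth, the only invariant required at initialization is that the adapter's contribution is zero, so the pretrained output is unchanged; which of the two factors is zeroed can be swapped, as in the embedding branch above. A tiny check with made-up shapes:

    import torch

    r, d_in, d_out = 4, 16, 32

    # Linear-style LoRA: A random (Kaiming/Gaussian), B zero -> delta = B @ A is zero
    A = torch.randn(r, d_in)
    B = torch.zeros(d_out, r)
    assert torch.count_nonzero(B @ A) == 0

    # Embedding-style LoRA (roles swapped): A zero, B random -> delta is still zero
    A_emb = torch.zeros(r, d_in)
    B_emb = torch.randn(d_out, r)
    assert torch.count_nonzero(B_emb @ A_emb) == 0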


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

younesbelkada (Contributor) commented Dec 4, 2023

Closing, as I believe this has been addressed by Benjamin's comment; also attaching #1189, which adds different init methods for LoRA.
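
If the option from #1189 is exposed the way I expect (as init_lora_weights on LoraConfig), selecting a Gaussian initialization for A would look roughly like this; treat the exact parameter values as an assumption and check the peft docs for your version:

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # "gaussian" draws lora_A from a normal distribution instead of the default
    # Kaiming-uniform; lora_B remains zero-initialized either way.
    config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["c_attn"],
        init_lora_weights="gaussian",
    )
    peft_model = get_peft_model(model, config)
    peft_model.print_trainable_parameters()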
