There seems to be an error in LoRA's initialization #1060

Closed

ZexiLee opened this issue Oct 28, 2023 · 3 comments

Comments

ZexiLee commented Oct 28, 2023

System Info

In lora.py, lines 205-213, the code is:

    def reset_lora_parameters(self, adapter_name):
        if adapter_name in self.lora_A.keys():
            # initialize A the same way as the default for nn.Linear and B to zero
            nn.init.kaiming_uniform_(self.lora_A[adapter_name].weight, a=math.sqrt(5))
            nn.init.zeros_(self.lora_B[adapter_name].weight)
        if adapter_name in self.lora_embedding_A.keys():
            # initialize a the same way as the default for nn.linear and b to zero
            nn.init.zeros_(self.lora_embedding_A[adapter_name])
            nn.init.normal_(self.lora_embedding_B[adapter_name])

Two questions:

  • For the "if adapter_name in self.lora_embedding_A.keys():" branch, the initialization of A and B looks reversed: B should be zeros, but the implementation sets A to zeros. The correct code would therefore be:
        if adapter_name in self.lora_embedding_A.keys():
            # initialize A the same way as the default for nn.Linear and B to zero
            nn.init.normal_(self.lora_embedding_A[adapter_name])
            nn.init.zeros_(self.lora_embedding_B[adapter_name])
  • Additionally, why is Kaiming initialization used for A in the "if adapter_name in self.lora_A.keys():" branch? In the original paper, A is initialized from a Gaussian distribution (a small sketch of the paper-style initialization follows this list).
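
For reference, a minimal self-contained sketch of the paper-style initialization (Gaussian A, zero B) around a linear layer; the class and attribute names are illustrative, not the actual peft implementation:

    import torch
    import torch.nn as nn

    class TinyLoRALinear(nn.Module):
        """Illustrative LoRA wrapper, not the peft API."""
        def __init__(self, in_features, out_features, r=8):
            super().__init__()
            self.base = nn.Linear(in_features, out_features)
            # in LoRA the base weights are frozen; only A and B are trained
            for p in self.base.parameters():
                p.requires_grad_(False)
            self.lora_A = nn.Linear(in_features, r, bias=False)
            self.lora_B = nn.Linear(r, out_features, bias=False)
            # paper-style init: A ~ N(0, sigma^2), B = 0, so B(A(x)) contributes nothing at the start
            nn.init.normal_(self.lora_A.weight, std=0.02)
            nn.init.zeros_(self.lora_B.weight)

        def forward(self, x):
            return self.base(x) + self.lora_B(self.lora_A(x))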

Who can help?

@pacman100 @younesbelkada @sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

See above.

Expected behavior

If my understanding is correct, please fix the error; otherwise, kindly let me know where I am mistaken.

BenjaminBossan (Member) commented:

This is based on the LoRA implementation by Microsoft, so it's pretty much "official":

https://github.com/microsoft/LoRA/blob/a0a92e0f26c067cf94747bdbf1ce73793fa44d19/loralib/layers.py#L122-L125

Notice the comment they added, which addresses your point.
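
For what it's worth, the only invariant required at initialization is that the adapter's contribution is zero, so the pretrained output is unchanged; which of the two factors is zeroed can be swapped, as in the embedding branch above. A tiny check with made-up shapes:

    import torch

    r, d_in, d_out = 4, 16, 32

    # Linear-style LoRA: A random (Kaiming/Gaussian), B zero -> delta = B @ A is zero
    A = torch.randn(r, d_in)
    B = torch.zeros(d_out, r)
    assert torch.count_nonzero(B @ A) == 0

    # Embedding-style LoRA (roles swapped): A zero, B random -> delta is still zero
    A_emb = torch.zeros(r, d_in)
    B_emb = torch.randn(d_out, r)
    assert torch.count_nonzero(B_emb @ A_emb) == 0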


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

younesbelkada (Contributor) commented Dec 4, 2023

Closing, as I believe this has been addressed by Benjamin's comment; also attaching #1189, which adds different init methods for LoRA.
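
If the option from #1189 is exposed the way I expect (as init_lora_weights on LoraConfig), selecting a Gaussian initialization for A would look roughly like this; treat the exact parameter values as an assumption and check the peft docs for your version:

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # "gaussian" draws lora_A from a normal distribution instead of the default
    # Kaiming-uniform; lora_B remains zero-initialized either way.
    config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["c_attn"],
        init_lora_weights="gaussian",
    )
    peft_model = get_peft_model(model, config)
    peft_model.print_trainable_parameters()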
