FIX PiSSA & OLoRA with rank/alpha pattern, rslora (#1930)
* FIX PiSSA & OLoRA with rank/alpha pattern, rslora

  See #1929 (comment)

  At the moment, when using PiSSA or OLoRA with weight conversion to restore the original base weights, there is an error when either of rank_pattern, alpha_pattern, or rslora is being used. This PR fixes this.

  The issue is that we need to double the rank of the LoRA adapter. Right now, this is done by simply doubling r and alpha. But if rank_pattern and alpha_pattern are being used, those need to be doubled too.

  Furthermore, when using rslora, the scaling is again different, namely alpha / sqrt(r). This also needs to be adjusted.

  Unfortunately, when using rslora with rank_pattern and alpha_pattern, this gets way more complicated. Since we don't store the scaling in the state_dict, we would have to resolve all the patterns here to determine the correct scaling, i.e. reimplement the whole matching and init logic. This is a lot of work for a very edgy edge case.

  Therefore, I opted to raise an error instead. This is not super nice, as the error is only raised when trying to save the model, i.e. a lot of time may already have been spent to train the model. But we cannot know this earlier, so not much can be done.

  Overall, this fix is ugly because it further couples unrelated code. For instance, if we add new init methods that affect the scaling, we need to remember to change the saving logic accordingly. If anyone has a better idea, LMK.

* Make style
* Also warn during init if there is a potential ... for saving not to work
* Ensure that GPU tests are run for PiSSA+OLoRA
* Use renamed argument name
* Make style
* Reviewer feedback: Better document the change
* Add clarifying comments to tests
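The scaling adjustment described in the commit message can be illustrated with a small standalone sketch. It is not the code from this commit: `AdapterConfig`, `scaling`, and `doubled_config` are hypothetical names introduced here, and the choice to multiply alpha by sqrt(2) under rslora is an assumption derived from the stated formula alpha / sqrt(r), so that the scaling stays unchanged when r is doubled. Raising as soon as rslora is combined with either pattern is likewise an assumption about how the error condition might be checked.

```python
import math
from dataclasses import dataclass, field, replace


@dataclass
class AdapterConfig:
    # Hypothetical stand-in for a LoRA config; not the library's class.
    r: int
    alpha: float
    use_rslora: bool = False
    rank_pattern: dict = field(default_factory=dict)
    alpha_pattern: dict = field(default_factory=dict)


def scaling(r, alpha, use_rslora):
    # Standard LoRA scales by alpha / r, rslora by alpha / sqrt(r).
    return alpha / math.sqrt(r) if use_rslora else alpha / r


def doubled_config(cfg):
    """Return a copy of cfg with doubled rank and unchanged scaling."""
    if cfg.use_rslora and (cfg.rank_pattern or cfg.alpha_pattern):
        # Per-module scaling would require resolving every pattern,
        # so this sketch refuses the combination (assumed condition).
        raise ValueError("rslora with rank_pattern/alpha_pattern is not supported")
    # Doubling r doubles the denominator for standard LoRA, but only
    # multiplies it by sqrt(2) for rslora; alpha must follow suit.
    alpha_factor = math.sqrt(2) if cfg.use_rslora else 2
    return replace(
        cfg,
        r=2 * cfg.r,
        alpha=cfg.alpha * alpha_factor,
        rank_pattern={k: 2 * v for k, v in cfg.rank_pattern.items()},
        alpha_pattern={k: alpha_factor * v for k, v in cfg.alpha_pattern.items()},
    )
```

Under these assumptions, `doubled_config(AdapterConfig(r=8, alpha=16))` yields `r=16, alpha=32`, and the value returned by `scaling` is the same before and after the conversion in both the standard and the rslora case.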
1 parent 5268495, commit e02b938
Showing 6 changed files with 548 additions and 7 deletions.