Add implementation of LyCORIS LoKr (KronA-like adapter) for SD&SDXL models #978
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Not quite sure what the status is: is the PR ready for review, or are you still working on it? Let us know when it is ready.
@BenjaminBossan hi! I've just finished the LoKr implementation. As expected, this adapter is quite similar to LoHa, but from my point of view it is quite hard to reuse the LoHa code for LoKr in a way that stays easy to maintain. Currently the code is working fine: we can train / convert / infer existing LoKr checkpoints for SD from civitai.com. I've also refactored the script for SD DreamBooth training and incorporated subparsers to simplify training with different adapters (lora, loha, lokr). Here are the results from my sample training, which show that this LoKr implementation is able to train (the output adapter weight is several megabytes in size with the requested rank 32). I used the following settings for training:
python train_dreambooth.py \
--pretrained_model_name_or_path=... \
--instance_data_dir=... \
--instance_prompt="AlexanderKovalchuk" \
--output_dir=... \
--seed=42 \
--resolution=512 \
--train_text_encoder \
--train_batch_size=2 \
--max_train_steps=1000 \
--learning_rate=1e-4 \
--num_validation_images=4 \
--validation_steps=50 \
--validation_prompt="AlexanderKovalchuk" \
--logging_dir=./output_logs \
--report_to=tensorboard \
--lr_warmup_steps=300 \
--lr_scheduler=constant_with_warmup \
lokr --unet_r=32 --unet_alpha=32 --te_r=32 --te_alpha=32 --unet_use_effective_conv2d --unet_decompose_both --te_decompose_both
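(For readers who prefer the Python API over the CLI, here is a rough, hypothetical sketch of how the lokr flags above would map onto a LoKrConfig; the target_modules list is only an illustrative subset, not the exact set used by the training script.)

# Hypothetical sketch, not the example script itself: the lokr subparser flags
# above roughly correspond to a LoKrConfig like this one.
from peft import LoKrConfig, get_peft_model

unet_lokr_config = LoKrConfig(
    r=32,                       # --unet_r
    alpha=32,                   # --unet_alpha
    use_effective_conv2d=True,  # --unet_use_effective_conv2d
    decompose_both=True,        # --unet_decompose_both
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],  # illustrative subset of UNet modules
)
# unet = get_peft_model(unet, unet_lokr_config)  # wrap the diffusers UNet with the LoKr adapter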
We may need to update the docs for LoHa and LoKr in follow-up PRs; from my point of view, LoKr is a bit tricky.
@BenjaminBossan I've also spotted several errors in the convert_sd_adapter_to_peft.py script, so I fixed them in this PR.
@BenjaminBossan, would you have some time to review this PR?
Sorry for the delay; the last few days were very busy. I'll try to give a review today or in the next few days.
I didn't have time for a full review yet (hopefully tomorrow), but I did take a quick look. As always from you, it already looks quite excellent: great attention to detail, and thanks also for including tests. I saw that the … When it comes to code duplication, when it comes to …
@BenjaminBossan, hi! I updated … I've also heavily refactored …
Finally I got around to giving this a proper review. Thanks for your patience.
This already looks quite good. Big thanks also for working on the abstractions; it's really nice to see how much code duplication can be avoided that way. I wonder if we can extend it to encompass LoRA as well (of course, that would be a separate PR).
When it comes to the actual implementation of the LoKr method, I honestly haven't checked the details, so I'd trust you and the pre-existing code here to do the right thing.
Regarding the name of LyCORISConfig etc.: I could see that this may be confusing to some, though I understand the choice. From a pure spelling perspective, however, I wonder if we should switch to LycorisConfig etc. This is much easier to type and also more consistent with other parts of PEFT (LoraConfig, not LoRAConfig). IMHO, it's also easier to read (Lycoris-Layer, not LyCORI-SLayer).
Regarding the update to train_dreambooth.py, which can now also do LoHa: does it mean that train_dreambooth_loha.py can be deleted?
...

def forward(self, x: torch.Tensor) -> torch.Tensor:
    previous_dtype = x.dtype
This may need adjusting, depending on whether #1010 is merged.
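(For context, a simplified, self-contained sketch of the dtype-preserving pattern the quoted forward lines follow; this is not the PR's actual layer code.)

import torch
import torch.nn as nn

class AdapterLinear(nn.Linear):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Remember the incoming dtype, compute in the weight's dtype,
        # then cast the result back so callers see the dtype they passed in.
        previous_dtype = x.dtype
        result = nn.functional.linear(x.to(self.weight.dtype), self.weight, self.bias)
        # ... an adapter delta (e.g. the LoKr update) would be added to `result` here ...
        return result.to(previous_dtype)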
@BenjaminBossan, thank you for giving a review on this PR!
I suppose there is no problem incorporating LoRA into this config; we can do it in a separate PR (taking into account that we'll have to add/modify additional dropout strategies from LyCORIS).
No problem, I agree with that. Do you mind if I also refactor LoHa / LoKr using the same naming strategy (in this PR)? The LoHa adapter has not been released yet, so I don't think it will be a problem.
Definitely, I'll remove it in the following commits. (UPD: I've actually already refactored it to …)
@BenjaminBossan, I addressed most of the comments you've left.
I have one idea, but I'm not sure whether you'll find it useful and appropriate. I thought we might showcase that PEFT is able to load most of the adapters for Stable Diffusion from civitai.com through a blog post on the Hugging Face blog (as far as I know, PEFT is currently the only way to achieve this within the Hugging Face ecosystem). We could collaborate and co-author it, and I'm sure it would help attract more SD/SDXL users to your library. If you find it useful as well, it would probably be best to do it after the LoKr adapter finally gets into a release.
I'm personally not an expert on SD and the ecosystem, but I'm sure we can manage something. Let me discuss this with my colleagues. Meanwhile, could you please fix the merge conflicts? Then I can give a (hopefully) final review.
Sure, I fixed all the merge conflicts.
So I did another review, and it seems like we're almost done. I have some minor comments left; please take a look.
r""" | ||
A base config for LyCORIS like adapters | ||
""" | ||
rank_pattern: Optional[dict] = field( |
In LohaConfig and LoKrConfig, the arguments rank_pattern and alpha_pattern are also defined. Does it make sense to have them in both the parent class and the child classes?
You are right. I just wanted to leave them explicit for anyone reading the code, but on the other hand it may lead to errors during refactoring or other modifications, so let's remove them from the child classes.
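(A simplified, self-contained sketch of the agreed layout, with the pattern fields only on the parent config; these are not the PR's exact classes.)

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LycorisConfig:
    """Base config for LyCORIS-like adapters (simplified)."""
    rank_pattern: Optional[dict] = field(default_factory=dict)
    alpha_pattern: Optional[dict] = field(default_factory=dict)

@dataclass
class LoKrConfig(LycorisConfig):
    # Only LoKr-specific fields live here; rank_pattern / alpha_pattern are inherited.
    r: int = 8
    alpha: int = 8
    decompose_both: bool = False

cfg = LoKrConfig(r=32, alpha=32, rank_pattern={"to_q": 16})
print(cfg.rank_pattern)  # {'to_q': 16}, inherited from LycorisConfig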
self.weight.data += self.get_delta_weight(active_adapter)
self.merged_adapters.append(active_adapter)

def reset_adapter_parameters(self, adapter_name: str):
Is this really needed as an abstract method? The abstract class does not require it.
Yes, it should be an abstract method, so all child adapters need to implement it in their own way.
I missed explicitly stating that these methods should be abstract, so thank you for bringing it up!
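(A minimal toy sketch of the intent, not the PR's actual base class: declaring these methods abstract forces every concrete LyCORIS-style layer to provide its own parameter setup.)

from abc import ABC, abstractmethod

class LycorisLayerBase(ABC):
    def __init__(self):
        self.r = {}
        self.scaling = {}

    @abstractmethod
    def reset_adapter_parameters(self, adapter_name: str):
        """Each adapter type initializes its own factor matrices."""

    @abstractmethod
    def update_layer(self, adapter_name: str, r: int, alpha: float, **kwargs):
        """Each adapter type creates its own parameters and sets the scaling."""

class ToyLayer(LycorisLayerBase):
    # A subclass that forgot either method would raise TypeError on instantiation.
    def reset_adapter_parameters(self, adapter_name: str):
        pass

    def update_layer(self, adapter_name: str, r: int, alpha: float, **kwargs):
        self.r[adapter_name] = r
        self.scaling[adapter_name] = alpha / r
        self.reset_adapter_parameters(adapter_name)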
r = self.r[active_adapter]
self.scaling[active_adapter] = alpha / r

def update_layer(self, adapter_name: str, r: int, alpha: float, **kwargs):
Same as above: Is this really needed as an abstract method? The abstract class does not require it.
Answered above.
Great work, Alex, this looks stellar to me. Thanks for all the effort you put into this.
From my point of view, this is ready; I'll check with colleagues whether we want a second review just to be sure.
I took a closer look at test coverage and found some small gaps, but they're not blockers for me. Up to you if you want to work on those:
- We can improve test coverage slightly by adding a LoKr config with rank dropout (a rough sketch of what this would exercise follows after this list), e.g.:
("Vanilla MLP 7 LOKR", "MLP", LoKrConfig, {"target_modules": "lin0", "rank_dropout": 0.5}),
- There is currently no code path that leads to use_w1=False or to use_w2=False with conv2d. Honestly, I'm not sure what needs to be done to get those.
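(A rough sketch of what the suggested rank-dropout case would exercise, using a hypothetical tiny MLP similar in spirit to the ones in PEFT's custom-model tests.)

import torch.nn as nn
from peft import LoKrConfig, get_peft_model

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(10, 20)
        self.lin1 = nn.Linear(20, 2)

    def forward(self, x):
        return self.lin1(self.lin0(x))

config = LoKrConfig(target_modules=["lin0"], rank_dropout=0.5)
model = get_peft_model(MLP(), config)
model.print_trainable_parameters()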
Found a tiny area for improvement: in the docstring of …
@BenjaminBossan, I updated the docstring of …
Great, thanks a lot.
@kovalexal I just noticed that LoHa and LoKr are missing …
@BenjaminBossan Yes, it slipped under my radar (I added it in the most recent commit), thank you for noticing!
Oh, I noticed a small error in your last commit: …
@BenjaminBossan sorry, fixed that one.
Thanks!
@@ -43,10 +43,6 @@ def __init__(
    self.out_features = out_features
    self.is_feedforward = is_feedforward

@property
Why has this been removed?
def merge(self, *args) -> None:
    raise NotImplementedError

def unmerge(self, *args) -> None:
    raise NotImplementedError

@property
I see, because you moved it here
This PR focuses on increasing the compatibility of SD & SDXL adapters in PEFT with other open-source tools like LyCORIS. Feel free to learn more about LyCORIS adapters from resources like this.
This specific PR is aimed at adding proper compatibility of PEFT with LoKr adapters. LoKr adapters are described in detail in the LyCORIS paper and are inspired by the KronA paper.
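(For readers unfamiliar with the method, a minimal, self-contained sketch of the core idea behind a LoKr / KronA-style update, not the PR's actual implementation: the weight delta is factored as a Kronecker product of two small matrices, optionally with one factor further low-rank decomposed.)

import torch

out_features, in_features = 64, 64
a1, a2 = 8, 8   # out_features == a1 * a2 (hypothetical factorization)
b1, b2 = 8, 8   # in_features  == b1 * b2

w1 = torch.randn(a1, b1) * 0.01  # small trainable factor
w2 = torch.randn(a2, b2) * 0.01  # small trainable factor

alpha, r = 32, 32
delta_w = (alpha / r) * torch.kron(w1, w2)  # shape (a1 * a2, b1 * b2) == (64, 64)

base_weight = torch.randn(out_features, in_features)
adapted_weight = base_weight + delta_w  # far fewer trainable params than a full 64x64 update
print(delta_w.shape)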
Currently, it's a draft PR, so the following things are needed:
It would be great to take into account that LoKr may reuse a lot of code from LoHa #956.
@pacman100 @BenjaminBossan @younesbelkada FYI