
Add implementation of LyCORIS LoKr (KronA-like adapter) for SD&SDXL models #978

Merged
37 commits merged into huggingface:main on Oct 30, 2023

Conversation

@kovalexal (Contributor) commented Sep 29, 2023

This PR focuses on increasing compatibility of SD&SDXL adapters in PEFT with other open-source tools like LyCORIS. Feel free to learn more about LyCORIS adapters from resources like this.

This specific PR is aimed at adding proper compatibility of PEFT with LoKr adapters. LoKr adapters are described in detail in the LyCORIS paper and are inspired by the KronA paper.

Currently, it's a draft PR, so the following things are needed:

  • Actual implementation of LoKr
  • Unit tests
  • Conversion script for SD&SDXL for kohya_ss / LyCORIS trained LoKrs
  • Sample training script for SD / SDXL / SD&SDXL

It is worth taking into account that LoKr may be able to reuse a lot of code from LoHa (#956).

@pacman100 @BenjaminBossan @younesbelkada FYI
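
For context, here is a minimal usage sketch of what applying a LoKr adapter with PEFT could look like once this PR lands. This is hedged: parameter names follow the discussion in this PR, the final API may differ, and the toy model and target module names are purely illustrative.

import torch.nn as nn
from peft import LoKrConfig, get_peft_model

# Toy base model standing in for a real SD UNet / text encoder.
base_model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))

config = LoKrConfig(
    r=32,
    alpha=32,
    target_modules=["0", "2"],  # apply LoKr to both Linear layers of the toy model
    rank_dropout=0.0,
    decompose_both=True,        # also low-rank-decompose the first Kronecker factor
    decompose_factor=-1,        # -1: split dimensions as close to their square root as possible
    use_effective_conv2d=True,  # Tucker-style handling of 3x3 convolutions (no-op here)
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()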

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@BenjaminBossan (Member)

Not quite sure what the status is: is the PR ready for review, or are you still working on it? Let us know when it is ready.

@kovalexal (Contributor, Author) commented Oct 4, 2023

@BenjaminBossan hi!

I've just finished the LoKr implementation. As expected, this adapter is quite similar to LoHa, but from my point of view it is quite hard to reuse the LoHa code for LoKr in a way that would be easier to maintain.

Currently, the code works fine: we can train / convert / run inference with existing LoKr checkpoints for SD from civitai.com.

I've also refactored the script for SD dreambooth training and incorporated subparsers to simplify training with different adapters (lora, loha, lokr); a small sketch of this subparser layout follows the training command below. Here are the results from my sample training, which show that this LoKr implementation is able to train (the output adapter weights are several megabytes in size with the requested rank of 32):

(Screenshot of the sample training results, 2023-10-04.)

I used the following settings for training:

python train_dreambooth.py \
--pretrained_model_name_or_path=... \
--instance_data_dir=... \
--instance_prompt="AlexanderKovalchuk" \
--output_dir=... \
--seed=42 \
--resolution=512 \
--train_text_encoder \
--train_batch_size=2 \
--max_train_steps=1000 \
--learning_rate=1e-4 \
--num_validation_images=4 \
--validation_steps=50 \
--validation_prompt="AlexanderKovalchuk" \
--logging_dir=./output_logs \
--report_to=tensorboard \
--lr_warmup_steps=300 \
--lr_scheduler=constant_with_warmup \
lokr --unet_r=32 --unet_alpha=32 --te_r=32 --te_alpha=32 --unet_use_effective_conv2d --unet_decompose_both --te_decompose_both
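
For reference, here is a minimal sketch of the subparser layout mentioned above (illustrative only, not the actual train_dreambooth.py code; argument names are assumptions based on the command above): shared training arguments live on the main parser, adapter-specific ones on per-adapter subcommands.

import argparse

# Shared args on the main parser, adapter-specific args on subcommands.
parser = argparse.ArgumentParser(description="Dreambooth training with a selectable adapter")
parser.add_argument("--pretrained_model_name_or_path", type=str)
parser.add_argument("--resolution", type=int, default=512)
parser.add_argument("--learning_rate", type=float, default=1e-4)

subparsers = parser.add_subparsers(dest="adapter", required=True)

lokr = subparsers.add_parser("lokr", help="train with a LoKr adapter")
lokr.add_argument("--unet_r", type=int, default=8)
lokr.add_argument("--unet_alpha", type=int, default=8)
lokr.add_argument("--unet_decompose_both", action="store_true")
lokr.add_argument("--unet_use_effective_conv2d", action="store_true")

loha = subparsers.add_parser("loha", help="train with a LoHa adapter")
loha.add_argument("--unet_r", type=int, default=8)
loha.add_argument("--unet_alpha", type=int, default=8)

args = parser.parse_args()
# e.g.: python train_dreambooth.py --resolution=512 lokr --unet_r=32 --unet_decompose_both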

@kovalexal kovalexal marked this pull request as ready for review October 4, 2023 16:16
@kovalexal (Contributor, Author) commented Oct 4, 2023

We may need to update the docs for LoHa and LoKr in follow-up PRs; from my point of view, LoKr is a bit tricky.

  1. First, LoKr decomposes $\Delta W$ into two matrices $\Delta W_1$ and $\Delta W_2$ so that:
    $\Delta W = \Delta W_1 \otimes \Delta W_2$
    $\Delta W \in \mathbb{R}^{N \times M}, \Delta W_1 \in \mathbb{R}^{N_1 \times M_1}, \Delta W_2 \in \mathbb{R}^{N_2 \times M_2}$
    $N = N_1 N_2, M = M_1 M_2, N_1 < N_2, M_1 < M_2$

    There may exist multiple such decompositions, and decompose_factor controls which one is used. If -1 is passed, it tries to split each dimension into factors close to its square root; otherwise it tries to find the decomposition closest to the number you've provided. In general, the first matrix is smaller than the second one (see the sketch after this list).

  2. Then the resulting matrices $\Delta W_1$ and $\Delta W_2$ are decomposed using the provided rank with the following logic:

    • In general, $\Delta W_1$ is not decomposed (because it often has a small size) unless requested with decompose_both
    • If the requested rank is lower than the smallest dimension of $\Delta W_{i}$, it is decomposed to two matrices $\Delta W_{ia}$ and $\Delta W_{ib}$, $\Delta W_{ia} \in \mathbb{R}^{X \times r}$, $\Delta W_{ib} \in \mathbb{R}^{r \times Y}$
    • If the requested rank is bigger than the smallest dimension of $\Delta W_{i}$, the matrix is not decomposed
    • For a 3x3 Conv2d layer, $\Delta W_2$ may additionally use a Tucker decomposition (as in LoHa, via use_effective_conv2d)
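
A hedged sketch of the two steps above for a plain Linear weight (illustrative only, not the exact PEFT / LyCORIS code, and it ignores the Conv2d / Tucker case):

import torch

def factorize(dim: int, factor: int = -1) -> tuple[int, int]:
    # Split `dim` into (m, n) with m * n == dim and m <= n.
    # factor == -1 means "as close to sqrt(dim) as possible".
    target = int(dim**0.5) if factor <= 0 else factor
    best = 1
    for m in range(1, int(dim**0.5) + 1):
        if dim % m == 0 and abs(m - target) < abs(best - target):
            best = m
    return best, dim // best

# Step 1: factor the dimensions of a (768, 3072) weight.
n1, n2 = factorize(768)   # (24, 32)
m1, m2 = factorize(3072)  # (48, 64)

# Step 2: W1 (24x48) stays undecomposed, W2 (32x64) is decomposed with r = 8 < min(32, 64).
r = 8
w1 = torch.randn(n1, m1)
w2_a, w2_b = torch.randn(n2, r), torch.randn(r, m2)

# delta_W = W1 ⊗ (W2_a @ W2_b) has shape (n1 * n2, m1 * m2) = (768, 3072).
delta_w = torch.kron(w1, w2_a @ w2_b)
assert delta_w.shape == (768, 3072)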

@kovalexal (Contributor, Author)

@BenjaminBossan I've also spotted several errors in the convert_sd_adapter_to_peft.py script, so I fixed them in this PR.

@kovalexal (Contributor, Author)

@BenjaminBossan, would you have some time to review this PR?

@BenjaminBossan (Member)

Sorry for the delay, the last few days were very busy. I'll try to give a review today or in the next few days.

@BenjaminBossan (Member)

I didn't have time for a full review yet (hopefully tomorrow), but did take a quick look. As always from you, it already looks quite excellent. Great attention to detail and thanks also for including tests.

I saw that the LoKrLayer does not yet have the changes from #979, could you please update it?

Regarding code duplication: in layer.py there are quite a few differences between this and LoHa, so I think there is no reasonable way to abstract there. In model.py, however, we might have a chance. The main differences I could spot were some changed variable names in code/docstrings. Do you think we could reuse a base model class here?

@kovalexal (Contributor, Author)

@BenjaminBossan, hi!

I updated LoKrLayer according to #979.

I've also heavily refactored LoHaModel and LoKrModel - feel free to share your thoughts about the current implementation.

@BenjaminBossan (Member) left a review:

Finally I got around to giving this a proper review. Thanks for your patience.

This looks already quite good. Big thanks also for working on the abstractions, it's really nice to see how much code duplication can be avoided that way. I wonder if we can extend it to encompass LoRA as well (of course that would be a separate PR).

When it comes to the actual implementation of the LoKr method, I honestly haven't checked the details, so I'd trust you and the pre-existing code here to do the right thing.

Regarding the name of LyCORISConfig etc.: I could see that this may be confusing to some, though I understand the choice. From a pure spelling perspective, I wonder, however, if we should switch to LycorisConfig etc. This is much easier to type and also more consistent with other parts of PEFT (LoraConfig, not LoRAConfig). IMHO, it's also easier to read (Lycoris-Layer, not LyCORI-SLayer).

Regarding the update to train_dreambooth.py, which can now also do LoHa, does it mean that train_dreambooth_loha.py can be deleted?

Inline review comments (resolved) on: examples/stable_diffusion/convert_sd_adapter_to_peft.py (3), src/peft/tuners/lokr/layer.py, src/peft/tuners/lycoris_utils.py
...

def forward(self, x: torch.Tensor) -> torch.Tensor:
    previous_dtype = x.dtype
@BenjaminBossan (Member):

This may need adjusting, depending on whether #1010 is merged.

Further inline comments on: src/peft/tuners/lycoris_utils.py (2), tests/test_custom_models.py
@kovalexal (Contributor, Author) commented Oct 13, 2023

@BenjaminBossan, thank you for giving a review on this PR!

I wonder if we can extend it to encompass LoRA as well (of course that would be a separate PR).

I suppose there is no problem with incorporating LoRA into this config; we can do it in a separate PR (taking into account that we'll have to add/modify additional dropout strategies from LyCORIS).

Regarding the name of LyCORISConfig etc.: I could see that this may be confusing to some, though I understand the choice.

No problem, I agree with that. Do you mind if I also refactor LoHa / LoKr using the same naming strategy (in this PR)? The LoHa adapter has not been released yet, so I don't think it will be a problem.

Regarding the update to train_dreambooth.py, which can now also do LoHa, does it mean that train_dreambooth_loha.py can be deleted?

Definitely, I'll remove it in the following commits. (Update: I've actually already folded it into train_dreambooth.py.)

@kovalexal (Contributor, Author)

@BenjaminBossan, I addressed most of the comments you've left.

The reason is that we aim at having a release soon-ish once we return from the PT conference. That's probably too soon for this PR to be finished

I have one idea, but I am not sure whether you will find it useful and appropriate. We could showcase, through a blog post on the Hugging Face blog, that PEFT is able to load most of the adapters for Stable Diffusion from civitai.com (as far as I know, PEFT is currently the only way to achieve this within the Hugging Face ecosystem). We could collaborate and co-author it, and I am sure it would help attract more SD/SDXL users to your library.

If you find it useful as well, it would probably be best to do it after the LoKr adapter finally makes it into a release.

@BenjaminBossan (Member)

We could showcase, through a blog post on the Hugging Face blog, that PEFT is able to load most of the adapters for Stable Diffusion from civitai.com […]

I'm personally not an expert on SD and the ecosystem, but I'm sure we can manage something. Let me discuss this with my colleagues.

Meanwhile, could you please fix the merge conflicts and then I can give a (hopefully) final review.

@kovalexal (Contributor, Author)

Meanwhile, could you please fix the merge conflicts and then I can give a (hopefully) final review.

Sure, I fixed all the merge conflicts.

@BenjaminBossan (Member) left a review:

So I did another review and it seems like we're almost done. I have some minor comments left, please take a look.

Inline comments (resolved) on: src/peft/tuners/lycoris_utils.py, src/peft/tuners/lokr/layer.py
r"""
A base config for LyCORIS like adapters
"""
rank_pattern: Optional[dict] = field(
@BenjaminBossan (Member):

In LohaConfig and LoKrConfig, the arguments rank_pattern and alpha_pattern are also defined. Does it make sense to have them in both the parent class and the child classes?

@kovalexal (Contributor, Author):

You are right. I just wanted to leave them explicit for anyone reading the code, but on the other hand it may lead to errors during refactoring or other modifications, so let's remove them from the child classes.
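
A hedged sketch of this point (field types and defaults simplified, not the exact PEFT code): rank_pattern and alpha_pattern get declared once on the LyCORIS base config and are inherited by the child configs.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LycorisConfig:
    # shared across all LyCORIS-like adapters
    rank_pattern: Optional[dict] = field(default_factory=dict)
    alpha_pattern: Optional[dict] = field(default_factory=dict)

@dataclass
class LoKrConfig(LycorisConfig):
    # LoKr-specific fields only; rank_pattern / alpha_pattern are inherited
    r: int = 8
    alpha: int = 8
    decompose_both: bool = False
    decompose_factor: int = -1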

(Inline comment on src/peft/tuners/lycoris_utils.py, resolved)
        self.weight.data += self.get_delta_weight(active_adapter)
        self.merged_adapters.append(active_adapter)

    def reset_adapter_parameters(self, adapter_name: str):
@BenjaminBossan (Member):

Is this really needed as an abstract method? The abstract class does not require it.

@kovalexal (Contributor, Author):

Yes, it should be an abstract method, so that all child adapters need to implement it in their own way.
I missed explicitly marking these methods as abstract, so thank you for bringing it up!

(Inline comment on src/peft/tuners/lycoris_utils.py, resolved)
        r = self.r[active_adapter]
        self.scaling[active_adapter] = alpha / r

    def update_layer(self, adapter_name: str, r: int, alpha: float, **kwargs):
@BenjaminBossan (Member):

Same as above: Is this really needed as an abstract method? The abstract class does not require it.

@kovalexal (Contributor, Author):

Answered above.
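
A hedged sketch of the resolution for both threads above (illustrative, not the exact lycoris_utils.py code): the base layer declares these methods as abstract so that each adapter has to provide its own implementation.

from abc import ABC, abstractmethod

class LycorisLayer(ABC):
    @abstractmethod
    def reset_adapter_parameters(self, adapter_name: str):
        # each adapter (LoHa, LoKr, ...) initializes its own parameters
        ...

    @abstractmethod
    def update_layer(self, adapter_name: str, r: int, alpha: float, **kwargs):
        # each adapter creates/updates its parameters for the given rank and alpha
        ...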

(Inline comment on tests/test_custom_models.py, resolved)
@BenjaminBossan (Member) left a review:

Great work, Alex, this looks stellar to me. Thanks for all the effort you put into this.

From my point of view, this is ready, I'll check with colleagues if we want to have a second review just to be sure.

I took a closer look at test coverage and found some small gaps, but they're not blockers for me. Up to you if you want to work on those:

  1. We can improve test coverage slightly by adding a LoKr config with rank dropout, e.g.:

("Vanilla MLP 7 LOKR", "MLP", LoKrConfig, {"target_modules": "lin0", "rank_dropout": 0.5}),

  2. There is currently no code path that leads to use_w1=False or to use_w2=False with conv2d. Honestly, I'm not sure what needs to be done to hit those.

@BenjaminBossan (Member)

Found a tiny area for improvement: in the docstring of check_target_module_exists in tuners_utils.py, the type for config is given as LoraConfig | LoHaConfig; it can be updated to LoraConfig | LycorisConfig.

@kovalexal (Contributor, Author)

@BenjaminBossan, I updated the docstring of check_target_module_exists and also added several tests to cover rank_dropout functionality for LoHa & LoKr + path that leads to use_w1=False and use_w2=False.

@BenjaminBossan (Member)

I updated the docstring of check_target_module_exists and also added several tests to cover rank_dropout functionality for LoHa & LoKr + path that leads to use_w1=False and use_w2=False.

Great, thanks a lot.

@BenjaminBossan (Member)

@kovalexal I just noticed that LoHa and LoKr are missing delete_adapter, or am I missing it? If true, it can be added in a later PR, just wanted to confirm.

@kovalexal (Contributor, Author)

@BenjaminBossan Yes, it slipped under my radar (added it in the recent commit), thank you for noticing!

@BenjaminBossan (Member)

Oh, noticed a small error in your last commit:

[key for key, _ in self.model.named_modules() if "lora" not in key]
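
Presumably the filter should use the tuner's own prefix rather than the hard-coded "lora" string; a hedged sketch of the likely fix (the prefix attribute name is an assumption):

# filter by the tuner's own prefix (e.g. "lokr_" for LoKr), not "lora";
# `self.prefix` is an assumed attribute name, used here for illustration
key_list = [key for key, _ in self.model.named_modules() if self.prefix not in key]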

@kovalexal (Contributor, Author)

@BenjaminBossan sorry, fixed that one.

@younesbelkada (Contributor) left a review:

Thanks!

@@ -43,10 +43,6 @@ def __init__(
self.out_features = out_features
self.is_feedforward = is_feedforward

@property
@younesbelkada (Contributor):

Why has this been removed?

    def merge(self, *args) -> None:
        raise NotImplementedError

    def unmerge(self, *args) -> None:
        raise NotImplementedError

    @property
@younesbelkada (Contributor):

I see, because you moved it here

@BenjaminBossan BenjaminBossan merged commit 884b1ac into huggingface:main Oct 30, 2023