
Enable ability to resize lora dim based off sv ratios #243

Merged (5 commits) on Mar 10, 2023

Conversation

@mgz-dev (Contributor) commented Feb 28, 2023

I had the idea after looking at KohakuBlueleaf's implementation, which uses an SV threshold to dynamically determine layer dimension while extracting dreambooth models to LoRA.

This uses a maximum SV ratio instead of a threshold: all singular value ranks below the ratio are dropped. For example, if the ratio is 10 and the largest singular value is 5, all singular values below 0.5 are removed, and the dim is calculated dynamically per layer.

LoRAs resized this way currently do not work with sd-webui-additional-networks, since it gives size mismatch warnings, but they do function with the built-in LoRA support in the webui.

The same logic can be applied to extract_lora_from_models.
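
Roughly, the ratio rule looks like this (a minimal sketch with a hypothetical helper name; the actual script operates on the LoRA up/down weight pairs rather than a single raw weight matrix):

```python
import torch

def rank_from_sv_ratio(weight: torch.Tensor, sv_ratio: float) -> int:
    """Smallest rank that keeps every singular value within
    `sv_ratio` of the largest one."""
    S = torch.linalg.svdvals(weight.float())  # sorted in descending order
    cutoff = S[0] / sv_ratio                  # ratio 10, max sv 5 -> cutoff 0.5
    return max(int((S > cutoff).sum().item()), 1)  # never return an empty layer
```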

@TingTingin (Contributor) commented:

Any comparisons?

@mgz-dev (Contributor, Author) commented Feb 28, 2023

Sure. I grabbed the Arcane, Amber, and Makima LoRAs from CivitAI, since they are ranked as popular, and processed them at an sv_ratio of 5 (you can likely be more aggressive if you want). I'm listing the file size reduction, since dim is dynamic in the new LoRAs. The prompt was mostly just copied from a preview image.

  • Arcane (295 MB to 55 MB)
    [image: 1851130503 cfg7 steps20 1950 (Custom)]

  • Amber (144 MB to 23 MB)
    [image: 1851130503 cfg7 steps20 1619 (Custom)]

  • Makima (148 MB to 49 MB)
    [image: 1851130503 cfg7 steps20 1336 (Custom)]

@kohya-ss (Owner) commented:

Thank you for this! It looks good!

The extension needs to be updated first. The modification is not so difficult, and I will make the change as soon as I have time. I'd like to support variable alpha and Conv2d at the same time.

@thojmr commented Mar 1, 2023

I was just about to mention Conv2d, nice! Looking forward to testing dreambooth extraction after Conv2d is implemented. It should be way more accurate.

@ashen-sensored commented Mar 1, 2023

Using a k-percentile cutoff (cumulative sum) instead of an SV threshold, with the Frobenius norm as a precision-error check, may be a better way of doing it.
We can either use k directly as the controllable input, or use the precision error as the controllable input and do a sweep.
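
Roughly what I mean (a sketch; the helper names are mine, not from any existing code):

```python
import torch

def rank_from_cumsum(S: torch.Tensor, k: float) -> int:
    """Smallest rank whose singular values cover fraction `k` of the
    total sum of singular values (S sorted in descending order)."""
    cum = torch.cumsum(S, dim=0) / S.sum()
    return int(torch.searchsorted(cum, torch.tensor(k)).item()) + 1

def fro_error(S: torch.Tensor, rank: int) -> float:
    """Relative Frobenius-norm error of the rank-`rank` truncation:
    ||A - A_r||_F / ||A||_F = sqrt(sum of dropped sv^2 / sum of all sv^2)."""
    return float(torch.sqrt(S[rank:].pow(2).sum() / S.pow(2).sum()))

def rank_for_error(S: torch.Tensor, max_err: float) -> int:
    """The sweep: smallest rank whose truncation error fits the budget."""
    for r in range(1, len(S) + 1):
        if fro_error(S, r) <= max_err:
            return r
    return len(S)
```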

The LoCon author's comment about this, just FYI:

[screenshot of the LoCon author's comment]


@mgz-dev (Contributor, Author) commented Mar 1, 2023

Thanks for the suggestion. It is easy to implement resizing based on any of these methods (cumulative sum, Frobenius norm, or SV ratio), so I went ahead and added them all as options. I think they each have their strengths and weaknesses, though, and I don't necessarily agree that cumulative sum or Frobenius norm is strictly better (see the sketch after this list):

  • Singular value ratio should work well if the goal is to denoise lower-ranking data. It will likely also perform best at extremely high levels of compression (though this should be checked on a case-by-case basis).
  • Cumulative sum is similar to SV ratio, though the meaning of the value does not translate as well between different starting dims. It is important for people to understand that a "lower" cumulative sum is not necessarily bad. I originally avoided this because I was afraid people would misread 0.6 as "only 60% as good as the original".
  • Frobenius norm is likely the most intuitive number for most users. It tries to represent the original matrix as accurately as possible, so it is a good choice for recreating a LoRA with the highest accuracy at the cost of compression, but it is not a good option for users trying to salvage an overfit LoRA. It will have lower compression on average but the least variance per layer.
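
Here is how each criterion maps the same set of singular values to a rank (a sketch for illustration, not the script's actual code; the method names just mirror the option names above):

```python
import torch

def dynamic_rank(S: torch.Tensor, method: str, param: float) -> int:
    """S: singular values in descending order, e.g. torch.linalg.svdvals(W)."""
    if method == "sv_ratio":
        # keep every sv within `param` of the largest
        rank = int((S > S[0] / param).sum().item())
    elif method == "sv_cumulative":
        # smallest rank covering fraction `param` of the sv sum
        cum = torch.cumsum(S, dim=0) / S.sum()
        rank = int(torch.searchsorted(cum, torch.tensor(param)).item()) + 1
    elif method == "sv_fro":
        # smallest rank whose kept Frobenius norm reaches `param` of the total
        energy = torch.cumsum(S.pow(2), dim=0) / S.pow(2).sum()
        rank = int(torch.searchsorted(energy, torch.tensor(param ** 2)).item()) + 1
    else:
        raise ValueError(method)
    return max(1, min(rank, len(S)))
```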

@TingTingin (Contributor) commented Mar 1, 2023

What value ranges is each of those operating in? Especially cumulative sum, since, as you say, the values are unintuitive.
Also, is there any explanation of how resizing helps with overfitting?

@bmaltais (Contributor) commented Mar 3, 2023

I discovered an interesting thing... clamping is sort of degrading the output. To my surprise, changing CLAMP_QUANTILE from 0.99 to 1 improves the results drastically... here is an example:

Original rank 256 model:

[image: grid-0205]

Resized with sv 5 clamp 0.99:

[image: grid-0207]

Resized with sv 5 clamp 1:

[image: grid-0206]

The file size reduction for both was the same... from 295 MB to 19.4 MB... but the output of clamp 1 is much closer to the original... perhaps the clamp value needs to be exposed as a parameter?

@mgz-dev (Contributor, Author) commented Mar 3, 2023

@bmaltais good observation. I actually think clamping may not be required at all, since SVD should always return norm-1 vectors within U and Vh, but I wasn't sure whether some unique exceptions in the algorithm used in torch caused problems that warranted its inclusion in the original dreambooth extraction code. It may be best to just remove the quantile/clamp processing entirely.
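
For context, the clamp step in question looks roughly like this (paraphrased from the extraction code from memory, so treat it as a sketch rather than an exact quote):

```python
import torch

CLAMP_QUANTILE = 0.99  # the constant bmaltais changed to 1.0

def truncate_and_clamp(weight: torch.Tensor, rank: int):
    U, S, Vh = torch.linalg.svd(weight.float())
    U = U[:, :rank] @ torch.diag(S[:rank])  # fold singular values into U
    Vh = Vh[:rank, :]
    # the step in question: clip the most extreme ~1% of values, which
    # after scaling by S tend to belong to the most significant components
    dist = torch.cat([U.flatten(), Vh.flatten()])
    hi_val = torch.quantile(dist, CLAMP_QUANTILE)
    return U.clamp(-hi_val, hi_val), Vh.clamp(-hi_val, hi_val)
```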

@TingTingin as a brief explanation, it is somewhat similar to the theory behind PCA.
[image: PCA example]
In our case, the assumption is that dimensions with lower variance (you can think of smaller singular values as a proxy for less information) are less likely to contain useful data and more likely to contain noise from the training set that does not reflect what was actually meant to be learned during training. You can also think of it as similar to signal denoising.
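
As a toy illustration (synthetic data, nothing from the actual scripts): build a low-rank "signal" plus noise, and truncating the small singular values recovers the signal better than keeping everything:

```python
import torch

torch.manual_seed(0)
signal = torch.randn(64, 4) @ torch.randn(4, 64)   # a rank-4 "signal"
noisy = signal + 0.3 * torch.randn(64, 64)         # plus training noise

U, S, Vh = torch.linalg.svd(noisy)
r = 4
denoised = U[:, :r] @ torch.diag(S[:r]) @ Vh[:r, :]

print((noisy - signal).norm())     # error if we keep everything
print((denoised - signal).norm())  # smaller: the truncation removed mostly noise
```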

As far as valid ranges,

  • sv_ratio should be greater than 1 (I'd recommend staying at 2 or greater at a minimum)
  • sv_cumulative is somewhat dependent on the original rank of a matrix, but values are restricted to the interval (0, 1)
  • sv_fro is also restricted to (0, 1), but in practice it should be set much higher than sv_cumulative

You can use the --verbose flag and compare the numbers. I'd recommend doing some amount of iteration to see what gives the best results for whichever LoRA you are trying to reduce, since there is some variation on a case-by-case basis.
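
For example, a quick way to eyeball the tradeoff before committing to a value (a sketch using a random matrix as a stand-in for a real layer):

```python
import torch

# stand-in for a real layer's weight delta
S = torch.linalg.svdvals(torch.randn(320, 768))

for ratio in (2, 5, 10, 20):
    rank = max(int((S > S[0] / ratio).sum().item()), 1)
    kept = float(S[:rank].pow(2).sum() / S.pow(2).sum())
    print(f"sv_ratio={ratio:>3}: rank={rank:>3}, fro retention={kept:.3f}")
```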

@bmaltais (Contributor) commented Mar 3, 2023

Based on testing, it is better to create a LoRA in two steps: train at the highest rank possible to get the most out of the training, then resize it to a much smaller size with minimal loss. This provides way better results than trying to train a small-rank LoRA in one step.

@TingTingin (Contributor) commented:

Thanks for the explanation.

- code refactor to be able to reuse the same function for dynamic LoRA extraction
- remove clamp
- fix issue where the index goes out of bounds if sv_ratio is too high

Modified resize script to support different types of LoRA networks (refer to Kohaku-Blueleaf's module implementation for structure).
- --new_rank arg changed to limit the max rank of any layer
- added logic to make sure zeroed layers do not create a large lora dim
@mgz-dev (Contributor, Author) commented Mar 4, 2023

Apologies for so many additional changes. I was working to make the script modular so I could recycle code to handle dreambooth extraction, but it looks like Kohaku-Blueleaf pushed an update to their repo which overlaps with that task. I'll table dreambooth extraction for now unless there is some reason to bring it back up.

The latest version should also be able to handle the conv2d layers.

Important: --new_rank will now also limit the max rank of any individual LoRA layer during dynamic resizing. I added this because I noticed some layers could be disproportionately expensive space-wise. This will let people see whether restricting those layers further results in quality loss (users can test on a model-by-model basis if they're trying to maximize space savings). A sketch of the behavior follows.
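
(A hypothetical helper illustrating the new behavior, not the script's actual code:)

```python
import torch

def capped_dynamic_rank(S: torch.Tensor, sv_ratio: float, new_rank: int) -> int:
    """Rank from the sv_ratio rule, capped at new_rank, with a guard so a
    zeroed-out layer cannot blow up to a huge dim."""
    if S[0] == 0:  # layer contributed nothing; keep it minimal
        return 1
    rank = int((S > S[0] / sv_ratio).sum().item())
    return max(1, min(rank, new_rank))
```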

@bmaltais I have not done enough testing to say whether the 2-step LoRA is the best method, but I have also seen examples of trimming a LoRA even improving quality. As a caveat, though, I would think that trimming too many ranks runs the risk of losing important information in the model. There is probably some range that is best to stay within when resizing.

@bmaltais (Contributor) commented Mar 4, 2023

@mgz-dev I will refer to your repo for updated code until you are ready to update the PR, so as to keep up with the improvements. Nice work. I like that it now supports LoCon too.

I have a question for you... if one uses SD1.5 as the model, what would you say is the highest rank one should use when creating a LoRA? Since the model uses 768-dimensional vectors, would rank 768 be the limit after which there is no return?

I am asking because I think LoRA can be used as a nice "frozen" model training method. Once the LoRA has been created at the model's full potential rank, one simply uses your tool to reduce it down to what really constitutes the model. In my tests this is providing the best LoRAs... but I wonder what the practical rank limit should be for a model...

@kohya-ss (Owner) commented Mar 6, 2023

Thank you all! I am not familiar with the mathematical theory of LoRA and svd, so this is very helpful.

The support for dynamic rank in sd-webui-additional-networks has been completed in the dev branch.

I will test and merge this today after work.
