Wrong indented lines cause bugs for a long time #659

Closed
KohakuBlueleaf opened this issue Jul 31, 2023 · 7 comments

@KohakuBlueleaf

This issue has been mentioned before, but since no one replied, I am opening a new one.

https://github.com/TimDettmers/bitsandbytes/blob/a06a0f6a08cb23754b110359a109e069fa97ce9e/bitsandbytes/functional.py#L337-L342

#227
#262

The if-block around functional.py:218 in 2f2063b may have been accidentally indented. tests.test_functional.test_few_bit_quant doesn't cover the inside of the if block, so maybe this was missed (or it was intentional and it'd be great to have confirmation).
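To illustrate the kind of failure being described, here is a minimal toy sketch. It is not the bitsandbytes code: the function names, arguments, and values are all hypothetical, and the only thing it tries to show is how a block that is indented one level too deep runs on every loop iteration instead of once.

```python
def build_values_intended(levels, extra_items):
    """Toy example: the extra block runs once, after the loop."""
    values = []
    for i in range(levels):
        values.append(float(i))
    # Runs a single time, once the loop has finished.
    if extra_items > 0:
        values.extend([-1.0] * extra_items)
    return values


def build_values_misindented(levels, extra_items):
    """Same code, but the extra block is indented into the loop body."""
    values = []
    for i in range(levels):
        values.append(float(i))
        # One indent level deeper: now runs on every iteration.
        if extra_items > 0:
            values.extend([-1.0] * extra_items)
    return values


# With extra_items > 0 the two versions disagree.
print(len(build_values_intended(4, 2)), len(build_values_misindented(4, 2)))

# With extra_items == 0 the if block is never entered, so they agree.
print(build_values_intended(4, 0) == build_values_misindented(4, 0))
```

The second call shows why a test that never enters the if block cannot tell the two versions apart, which matches the remark above about test coverage.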

This bug (or feature) has persisted across 6 versions (0.36~0.41) and has caused a lot of problems:
ShivamShrirao/diffusers#178
#152
kohya-ss/sd-scripts#523

This really needs a fix or an explanation.

Sorry for the ping, @TimDettmers.

@sdbds

sdbds commented Jul 31, 2023

There are too many reports about this, all describing different results with bitsandbytes versions above 0.35:

bmaltais/kohya_ss#1291
bmaltais/kohya_ss#1252
bmaltais/kohya_ss#775

@TimDettmers, please take a look.

@2kpr

2kpr commented Aug 1, 2023

I did a few hours of comparison tests today with 0.35.0 and 0.41.0, full DreamBooth, batch size of 4, 400, 800, and 1200 steps.

I altered 0.41.0 in the tests by removing the indents in the code in question above.

The result was that removing the indents in 0.41.0 did basically fix the 'blotchiness' and the hard, overtrained look. BUT when testing with the exact same prompts against 0.35.0, it is obvious that something else is wrong besides the 'indented code' above.

For instance in the 800/1200 steps cases the 0.41.0 is producing images very similar to what is shown here: ShivamShrirao/diffusers#230
Basically the people are heavily 'elongated' and the inferences/generations are NOT following the prompts properly at all, almost like it's been heavily overtrained but also heavily distorted.

On the other hand, with 0.35.0 at 800 and 1200 steps and the exact same prompts, the inferences/generations are perfectly fine, and they follow the prompts completely, with all the added detail, etc.

So I just wanted to mention the results. I realize it might be extremely hard to isolate the problem, but at this point, and pretty much since the end of 2022, almost all the repos for training Stable Diffusion (fine-tuning, DreamBooth, LoRA) have been pinning bitsandbytes to version 0.35.0 for these very reasons.

It would be great to be able to try the newer optimizers in the versions above 0.35.0, but unfortunately doing so consistently comes at the cost of bad inferences/generations.

@KohakuBlueleaf
Author

> I did a few hours of comparison tests today with 0.35.0 and 0.41.0, full DreamBooth, batch size of 4, 400, 800, and 1200 steps.
>
> I altered 0.41.0 in the tests by removing the indents in the code in question above.
>
> The results were that removing the indents in 0.41.0 did fix the 'blotchyness' and that hard overtrained look basically completely, BUT, when testing with the exact same prompts vs 0.35.0 it's obvious that there is something else wrong.
>
> For instance in the 800/1200 steps cases the 0.41.0 is producing images very similar to what is shown here: ShivamShrirao/diffusers#230 Basically the people are heavily 'elongated' and the inferences/generations are NOT following the prompts properly at all, almost like it's been heavily overtrained but also heavily distorted.
>
> On the other hand 0.35.0 at 800, 1200 steps, with the exact same prompts, the inferences/generations are perfectly fine, and they are completely following the prompts with all the added detail added, etc.
>
> So, I just wanted to mention the results, I realize it might be extremely hard to isolate the problem, but basically at this point, and pretty much since the end of 2022, most all the repos for training have been locking their bitsandbytes down to version 0.35.0 for these very reasons.
>
> It would be great to be able to play with the newer optimizers in the versions above 0.35.0, but doing so just happens at the cost of constant bad inferences/generations unfortunately.

I think some calculations may have changed between 0.35 and 0.41, which could change some properties of the optimizer.

But the most important point is that the indented block definitely should not be inside the loop.

For the other differences, we may need to investigate further what bnb has changed.

As I said, we don't know whether this is a "feature". We need an explanation.

@2kpr

2kpr commented Aug 1, 2023

> I think 0.35~0.41 may change some calculation things and it may change some property of the optimizer
>
> But the most important thing is the indented block definitely should not in the loop(
>
> For other difference, we may need more investigation on what bnb have changed
>
> Just like what I said, we don't know if it is "features" We need explanation

Completely agreed! :)

@rationalism

@KohakuBlueleaf @2kpr This should be fixed by 0.41.1
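
For anyone who wants to check an environment against the range discussed in this thread, a rough sketch follows. The boundaries are taken only from the comments here (reported from 0.36 up to 0.41.0, reported fixed in 0.41.1) and are assumptions, not an official compatibility statement. The helper names `parse_version` and `in_reported_range` are hypothetical, and the parser assumes plain numeric versions such as `0.41.0`.

```python
# Assumed boundaries, taken from this thread only.
FIRST_REPORTED = (0, 36, 0)  # first version reported as affected
FIRST_FIXED = (0, 41, 1)     # version reported to contain the indentation fix


def parse_version(text):
    """Turn 'X.Y' or 'X.Y.Z' into a 3-tuple of ints (numeric fields only)."""
    fields = [int(part) for part in text.split(".")[:3]]
    return tuple(fields + [0] * (3 - len(fields)))


def in_reported_range(version_text):
    """True if the version falls in the range reported as affected here."""
    return FIRST_REPORTED <= parse_version(version_text) < FIRST_FIXED


for candidate in ("0.35.0", "0.36.0", "0.41.0", "0.41.1"):
    print(candidate, in_reported_range(candidate))

# To check the installed package instead of a literal string:
#   from importlib.metadata import version
#   print(in_reported_range(version("bitsandbytes")))
```

Note that being outside this range only concerns the indentation issue. The comparison tests above reported other differences from 0.35.0 that were not explained by the indentation alone.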

@wkpark
Contributor

wkpark commented Nov 16, 2023

indentation fixed by 3c9aca9

commit 3c9aca9124ab8bcd160a8c90bba0d6ca361c141f
Author: Tim Dettmers <tim.dettmers@gmail.com>
Date:   Thu Aug 3 19:47:15 2023 -0700

    Fixed two bugs in dynamic data type creation.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
