Dango/timesteps fix #1768
Conversation
Dango233 commented Nov 7, 2024
- Remove diffusers dependency in ts & sigma calc
- Support Shift Setting
- Support timesteps range setting
- Add uniform distribution
- Default to Uniform distribution and shift 1
With the default settings, training should pick up patterns/details much quicker and reduce overfitting on early/mid timesteps.
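As a rough illustration (a minimal sketch, not the exact PR code), uniform timestep sampling with a timestep range and a shift applied to the flow-matching sigmas could look like the following; the parameter names `t_min`, `t_max`, `shift`, and `num_train_timesteps` are assumptions for the example:

```python
import torch

def sample_uniform_timesteps(batch_size, num_train_timesteps=1000,
                             t_min=0, t_max=1000, shift=1.0,
                             device="cpu", dtype=torch.float32):
    # Draw u ~ U(0, 1) and map it onto the requested timestep range.
    u = torch.rand(batch_size, device=device)
    indices = (u * (t_max - t_min) + t_min).long()
    timesteps = indices.to(device=device, dtype=dtype)

    # Flow-matching sigma in [0, 1], then the shift mapping
    # sigma' = shift * sigma / (1 + (shift - 1) * sigma).
    sigmas = timesteps / num_train_timesteps
    sigmas = shift * sigmas / (1.0 + (shift - 1.0) * sigmas)
    return timesteps, sigmas
```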
Thank you for this!
library/sd3_train_utils.py
Outdated
indices = (u * (t_max-t_min) + t_min).long()
timesteps = indices.to(device=device, dtype=dtype)

# sigmas according to dlowmatching
flowmatching*
It seems to be fixed in bafd10d :)
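For context, here is a sketch of how sigmas computed this way are typically consumed in flow matching (an assumption based on the rectified-flow formulation used by SD3-style models, not necessarily the repository's exact code):

```python
import torch

def flow_matching_noisy_sample(latents, noise, sigmas):
    # sigmas: shape (batch,); broadcast over the latent dimensions.
    sigmas = sigmas.view(-1, 1, 1, 1).to(latents.dtype)
    # Rectified-flow interpolation between data and noise.
    noisy = (1.0 - sigmas) * latents + sigmas * noise
    # Velocity target commonly used with this formulation.
    target = noise - latents
    return noisy, target
```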
@Dango233 you guys are seeing better results than the normal flux schedule of sigmoid sampling?
It's dataset- and training-purpose dependent. Sigmoid/logit-normal works really well for the initial construction of denoising capabilities, but it isn't necessarily best for small-scale finetuning; the uniform distribution is more universal for downstream use cases.
If a training run focuses heavily on learning overall structures and can ignore details (details/objects/patterns already in the model's base weights), logit-normal (with a shift > 1) still performs great;
but for datasets that need a lot of attention to details (like anime characters), the uniform distribution should work better.
Some extreme cases even need a shift < 1 to emphasize details (like if you are training a detailed pattern).
So it really depends on what you want to achieve.
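To make the trade-off concrete, here is a small sketch contrasting the two sampling strategies discussed above; the parameter names (`mode`, `logit_mean`, `logit_std`) are hypothetical and not necessarily the ones used in this repository:

```python
import torch

def sample_sigmas(batch_size, mode="uniform", shift=1.0,
                  logit_mean=0.0, logit_std=1.0):
    if mode == "uniform":
        # Every noise level gets equal attention; tends to help when the
        # dataset demands a lot of fine detail.
        sigmas = torch.rand(batch_size)
    else:
        # Logit-normal: sigmoid of a Gaussian, concentrating samples around
        # the mid timesteps, which favors learning overall structure.
        sigmas = torch.sigmoid(torch.randn(batch_size) * logit_std + logit_mean)
    # shift > 1 pushes sampling toward higher noise levels (structure);
    # shift < 1 pushes it toward lower noise levels (details).
    return shift * sigmas / (1.0 + (shift - 1.0) * sigmas)
```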
Does this scale with batch size, such that around 2048 we really want weighted sampling, or is uniform alright then as well? The explanation about early structure during pretraining does make sense.
I would have to say that's task-dependent, but in general, if your dataset is very large and the samples share similar details, logit_normal still makes sense sometimes.
I'm having problems setting the LR for the TE and the unet independently; it trains the TE with the same LR I set for the unet. This is in my config file: "learning_rate": 1e-06. Additional parameters: --fused_backward_pass --use_t5xxl_cache_only --train_text_encoder. I think the problem started after the last update, but I'm not sure.
This PR is not related to the learning rates. I will check it soon.