[Wuerstchen] text to image training script #5052

kashif · 2023-09-15T10:25:33Z

What does this PR do?

For issue #5043

cc @sayakpaul

examples/wuerstchen/text_to_image/train_text_to_image.py

examples/wuerstchen/text_to_image/train_text_to_image_prior.py

pcuenca

Very cool!

examples/wuerstchen/text_to_image/README.md

examples/wuerstchen/text_to_image/train_text_to_image_prior.py

examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py

pcuenca · 2023-10-09T11:37:15Z

examples/wuerstchen/text_to_image/requirements.txt

+accelerate>=0.16.0
+torchvision
+transformers>=4.25.1
+wandb


is it required?

No, it's not technically required, but the sample snippet in the README uses the --report_to="wandb" option.
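As a rough illustration of the trade-off discussed here, a script can treat a tracker package as optional and fail early only when it is actually requested. The sketch below is not code from this PR; `is_package_available` and `check_tracker_available` are hypothetical helper names chosen for the example.

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    return importlib.util.find_spec(name) is not None


def check_tracker_available(report_to: str) -> None:
    """Fail early with a clear message if the chosen tracker is not installed.

    Hypothetical helper for illustration only; not taken from this PR.
    """
    if report_to == "wandb" and not is_package_available("wandb"):
        raise ImportError(
            "--report_to='wandb' was requested but wandb is not installed. "
            "Install it with `pip install wandb` or pick another tracker."
        )
```

With a guard like this, wandb could stay out of the hard requirements while the README snippet would still produce an actionable error when the package is missing.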

src/diffusers/loaders.py

src/diffusers/schedulers/scheduling_ddpm_wuerstchen.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

…or.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

pcuenca

Thanks for the quick attr fix!

patrickvonplaten · 2023-10-11T11:04:24Z

@kashif this test:

FAILED tests/pipelines/wuerstchen/test_wuerstchen_prior.py::WuerstchenPriorPipelineFastTests::test_save_load_local - AssertionError: 0.4609375 not less than 0.0005

seems to fail quite persistently

kashif · 2023-10-11T11:06:00Z

@patrickvonplaten I wanted to ask you about it. All the seeds etc. are set, and I don't get why the values are different each time... is it the LoRA adapters in the prior?

I have checked that all is in eval mode as well...
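For readers unfamiliar with this kind of check, the failing assertion compares two outputs and requires their maximum absolute difference to be below 0.0005. Below is a minimal standard-library sketch of that idea. It assumes a stand-in function `noisy_forward` in place of the real pipeline call, and the helper names are hypothetical rather than taken from the diffusers test suite.

```python
import random


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


def run_seeded(fn, seed):
    """Seed the global RNG, then call `fn` so its random draws are repeatable."""
    random.seed(seed)
    return fn()


def noisy_forward():
    # Stand-in for a pipeline call (assumption): output depends on RNG state.
    return [random.random() for _ in range(8)]


first = run_seeded(noisy_forward, seed=0)
second = run_seeded(noisy_forward, seed=0)

# Same seed, same code path: the difference should be within the tolerance.
assert max_abs_diff(first, second) < 5e-4
```

If any state that affects the output is not covered by the seed or is not restored between the two runs, the difference can exceed the tolerance by a wide margin, which is the symptom reported above.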

patrickvonplaten · 2023-10-13T14:06:23Z

Hey @kashif,

I'm not sure why it fails either, but it doesn't seem to fail on current main, so it does look like this PR introduces the bug. I think we have to investigate/debug a bit to figure out what's going on.

kashif · 2023-10-13T14:08:11Z

Agreed! I'll check too.

kashif · 2023-10-13T16:55:44Z

@patrickvonplaten so the test passes if I comment out the set_default_attn_processor method in the WuerstchenPrior model...

patrickvonplaten · 2023-10-16T13:00:28Z

Very nice!

* initial script
* formatting
* prior trainer wip
* add efficient_net_encoder
* add CLIPTextModel
* add prior ema support
* optimizer
* fix typo
* add dataloader
* prompt_embeds and image_embeds
* intial training loop
* fix output_dir
* fix add_noise
* accelerator check
* make effnet_transforms dynamic
* fix training loop
* add validation logging
* use loaded text_encoder
* use PreTrainedTokenizerFast
* load weigth from pickle
* save_model_card
* remove unused file
* fix typos
* save prior pipeilne in its own folder
* fix imports
* fix pipe_t2i
* scale image_embeds
* remove snr_gamma
* format
* initial lora prior training
* log_validation and save
* initial gradient working
* remove save/load hooks
* set set_attn_processor on prior_prior
* add lora script
* typos
* use LoraLoaderMixin for prior pipeline
* fix usage
* make fix-copies
* yse repo_id
* write_lora_layers is a staitcmethod
* use defualts
* fix defaults
* undo
* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_prior.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/loaders.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/loaders.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/pipelines/wuerstchen/modeling_wuerstchen_prior.py
* Update src/diffusers/loaders.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/loaders.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add graident checkpoint support to prior
* gradient_checkpointing
* formatting
* Update examples/wuerstchen/text_to_image/README.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/wuerstchen/text_to_image/README.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/wuerstchen/text_to_image/README.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/wuerstchen/text_to_image/README.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/wuerstchen/text_to_image/README.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/diffusers/loaders.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/wuerstchen/text_to_image/train_text_to_image_prior.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* use default unet and text_encoder
* fix test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

initial script

269ccf8

kashif marked this pull request as draft September 15, 2023 10:25

kashif added 15 commits September 15, 2023 12:27

formatting

67d734d

Merge branch 'main' into wuerstchen-train

ba1c3b7

prior trainer wip

3c7ac6f

add efficient_net_encoder

b412828

add CLIPTextModel

a24131a

add prior ema support

b4f2cdb

optimizer

3c8f6ed

fix typo

34aab3e

add dataloader

9def4b5

prompt_embeds and image_embeds

d8fb19c

intial training loop

3fe9079

fix output_dir

3a22be0

fix add_noise

6b5d2e7

accelerator check

8f9a683

make effnet_transforms dynamic

8d93fe5