
Add support for fine-tuning with LoRA (text2image example) #2002

Closed
wants to merge 57 commits

Conversation

sayakpaul
Member

@sayakpaul sayakpaul commented Jan 16, 2023

Most of it is the same as #1884. I guess the only script that needs reviewing is examples/text_to_image/train_text_to_image_lora.py.

@sayakpaul sayakpaul requested a review from patil-suraj January 16, 2023 09:26
@sayakpaul
Member Author

sayakpaul commented Jan 16, 2023

With the following:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch --gpu_ids="0," \
   ./train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME --caption_column="text" \
  --resolution=512 --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=100 --checkpointing_steps=5000 \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42 \
  --save_sample_prompt="cute Sundar Pichai creature" --report_to="wandb"

the run still leads to:

Steps:   0%|                                                                                                                                                  | 1/83300 [00:08<204:55:05,  8.86s/it, lr=0.0001, step_loss=0.209]Traceback (most recent call last):
  File "./train_text_to_image_lora.py", line 891, in <module>
    main()
  File "./train_text_to_image_lora.py", line 819, in main
    accelerator.backward(loss)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/accelerate/accelerator.py", line 1316, in backward
    loss.backward(**kwargs)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 14.56 GiB total capacity; 12.59 GiB already allocated; 500.44 MiB free; 13.01 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

With xformers enabled, it still leads to:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 14.56 GiB total capacity; 12.59 GiB already allocated; 474.44 MiB free; 13.01 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The above experiments were run on a single T4 machine.

On a V100 with xformers, it works. Logs will be here: https://wandb.ai/sayakpaul/stable_diffusion_ft_lora/runs/0b88cwxc.
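
For anyone hitting the same OOM on a 16 GB T4, the usual mitigations would be gradient checkpointing and 8-bit Adam (the latter needs bitsandbytes installed). A minimal sketch, assuming this LoRA script exposes the same --gradient_checkpointing and --use_8bit_adam flags as the existing train_text_to_image.py example; that is an assumption, not something verified here:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

# Sketch only: --gradient_checkpointing and --use_8bit_adam are taken from the
# non-LoRA train_text_to_image.py example and may not be wired up in this script yet.
accelerate launch --gpu_ids="0," \
  ./train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME --caption_column="text" \
  --resolution=512 --random_flip \
  --train_batch_size=1 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --enable_xformers_memory_efficient_attention \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42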

@sayakpaul
Member Author

When I tried enabling mixed-precision on T4, it led to:

Traceback (most recent call last):
  File "./train_text_to_image_lora.py", line 891, in <module>
    main()
  File "./train_text_to_image_lora.py", line 822, in main
    accelerator.clip_grad_norm_(params_to_clip, args.max_grad_norm)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/accelerate/accelerator.py", line 1373, in clip_grad_norm_
    self.unscale_gradients()
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/accelerate/accelerator.py", line 1336, in unscale_gradients
    self.scaler.unscale_(opt)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 282, in unscale_
    optimizer_state["found_inf_per_device"] = self._unscale_grads_(optimizer, inv_scale, found_inf, False)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 210, in _unscale_grads_
    raise ValueError("Attempting to unscale FP16 gradients.")
ValueError: Attempting to unscale FP16 gradients.
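
The error comes from the GradScaler seeing fp16 gradients on the trainable parameters. A minimal sketch of the usual workaround, assuming the trainable LoRA parameters are collected in something like lora_layers while unet/vae/text_encoder stay frozen (the variable names here are assumptions about the script, not quotes from it): keep the frozen models in fp16 but keep the LoRA parameters in fp32 so unscale_ has fp32 gradients to work with.

import torch

# Frozen models can stay in half precision to save memory under mixed precision.
weight_dtype = torch.float16
unet.to(accelerator.device, dtype=weight_dtype)
vae.to(accelerator.device, dtype=weight_dtype)
text_encoder.to(accelerator.device, dtype=weight_dtype)

# The trainable LoRA parameters must stay in fp32, otherwise GradScaler.unscale_
# raises "Attempting to unscale FP16 gradients." (assumed name: lora_layers)
for param in lora_layers.parameters():
    param.data = param.data.to(torch.float32)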

@sayakpaul
Member Author

sayakpaul commented Jan 17, 2023

@patil-suraj the training is completed and the results look good: https://wandb.ai/sayakpaul/stable_diffusion_ft_lora/reports/LoRA-fine-tuning-of-text2image--VmlldzozMzUxNjI5

Let me know if it makes sense to continue to work on this PR and add LoRA support formally to our text2image fine-tuning script. Happy to take care of it :)

Update: Talked to Suraj offline. I will continue working on this PR and let y'all know (@patrickvonplaten @patil-suraj) when it's ready for reviews.

@sayakpaul sayakpaul self-assigned this Jan 17, 2023
@patil-suraj
Contributor

Thanks a lot for working on this! Feel free to continue on the PR. We could add the train_lora_text_to_image.py script under the text_to_image directory.

@sayakpaul
Member Author

Things seem to be working on both T4 and V100.

My command:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch \
  train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME --caption_column="text" \
  --resolution=512 --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=100 --checkpointing_steps=5000 \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42 \
  --enable_xformers_memory_efficient_attention \
  --validation_prompt="cute Sundar Pichai creature" --report_to="wandb" \
  --output_dir="sd-model-finetuned-lora-v100" \
  --push_to_hub && sudo shutdown now

The final weights will be pushed to https://huggingface.co/sayakpaul/sd-model-finetuned-lora-v100/tree/main and an experimentation run is available here: https://wandb.ai/sayakpaul/text2image-fine-tune/runs/782txylu (currently running). Once these are done, I will update the appropriate sections in the README.
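
For completeness, a minimal inference sketch for the pushed weights, assuming the script saves them in the attention-processor format that UNet2DConditionModel.load_attn_procs understands, as the DreamBooth LoRA example from #1884 does:

import torch
from diffusers import StableDiffusionPipeline

# Load the frozen base model in fp16 and add the LoRA layers on top of its UNet.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
pipe.unet.load_attn_procs("sayakpaul/sd-model-finetuned-lora-v100")

image = pipe("cute Sundar Pichai creature", num_inference_steps=30).images[0]
image.save("pokemon-lora.png")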

@sayakpaul sayakpaul marked this pull request as ready for review January 18, 2023 09:48
@sayakpaul
Member Author

Closing this PR since the merge conflicts are a little too brutal to resolve. I will create a fresh PR.

@sayakpaul sayakpaul closed this Jan 18, 2023
@sayakpaul sayakpaul deleted the feat/lora-fit branch January 18, 2023 17:55
@sayakpaul
Member Author

The fresh PR is #2031.
