
make sure everything stays in the same dtype when using dpo + FSDP #1559

Merged
merged 1 commit into main on Apr 22, 2024

Conversation

winglian
Collaborator

trl's DPO Trainer attempts to call prepare_for_kbit_training on the peft model, but this should be avoided when using FSDP, otherwise you end up with the error:

ValueError: Must flatten tensors with uniform dtype but got torch.bfloat16 and torch.float32

We go in behind it after instantiation and fix the dtypes.
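A minimal sketch of the idea, not the actual PR diff: after the trainer has prepared the PEFT model, walk its parameters and buffers and cast any floating-point stragglers back to a single dtype so FSDP can flatten them uniformly. The helper name `unify_dtypes` is hypothetical.

```python
# Hypothetical helper (not the PR's code): normalize all floating-point
# parameters and buffers to one dtype after the trainer has modified the model.
import torch
import torch.nn as nn


def unify_dtypes(model: nn.Module, dtype: torch.dtype = torch.bfloat16) -> nn.Module:
    """Cast every floating-point parameter/buffer to `dtype` in place."""
    for param in model.parameters():
        if param.is_floating_point() and param.dtype != dtype:
            param.data = param.data.to(dtype)
    for buf in model.buffers():
        if buf.is_floating_point() and buf.dtype != dtype:
            buf.data = buf.data.to(dtype)
    return model


if __name__ == "__main__":
    # Simulate the failure mode: one parameter left in float32 while the
    # rest of the model is bfloat16 (what prepare-for-kbit-style wrapping
    # can produce), which FSDP refuses to flatten.
    model = nn.Linear(4, 4).to(torch.bfloat16)
    model.bias.data = model.bias.data.float()  # stray float32 param
    unify_dtypes(model)
    assert all(p.dtype == torch.bfloat16 for p in model.parameters())
```

Casting `param.data` in place keeps the `Parameter` objects (and anything holding references to them) intact, which matters if this runs after the trainer has already registered the model.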

@maziyarpanahi
Contributor

can confirm, this PR fixes the issue for me. (QLoRA+FSDP Llama-3-70B)

@winglian winglian merged commit 68601ec into main Apr 22, 2024
7 checks passed
@winglian winglian deleted the dpo-fsdp-fix branch April 22, 2024 20:00