For the same effective batch size, it is recommended to use more gradient accumulation steps on a single GPU rather than splitting the batch across multiple GPUs, given huggingface/diffusers#4046. Multi-GPU training may cause fluctuations in the reward.
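For context, the equivalence being relied on here can be sketched in plain Python (a toy scalar model for illustration, not the training code from this repo): accumulating gradients over k micro-batches and averaging them gives the same update as one large batch, because the mean-loss gradient is linear over examples.

```python
# Toy sketch: gradient accumulation over micro-batches reproduces the
# gradient of one large batch (scalar linear model, MSE loss).

def grad_mse(w, xs, ys):
    """Gradient of (1/n) * sum((w*x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]

# One big batch of size 4.
full_grad = grad_mse(w, xs, ys)

# Two accumulation steps with micro-batches of size 2;
# average the micro-batch gradients before the optimizer step.
g1 = grad_mse(w, xs[:2], ys[:2])
g2 = grad_mse(w, xs[2:], ys[2:])
accumulated = (g1 + g2) / 2

assert abs(full_grad - accumulated) < 1e-9
```

This is exactly why accumulation on one GPU can substitute for data parallelism; the issue above concerns the framework failing to synchronize these gradients correctly across processes, not the math itself.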
Thanks so much for pointing this out! What a terrible bug. I've fixed it so that gradients are synchronized properly across GPUs, but for some reason the fix uses more memory (up to 16 GB, from 10 GB before the change).