Assign device to unet. Resolves #5897 #6061
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Can you ensure you get expected quality with this change?
I've removed the initial unet.to(device) that's now redundant. Regarding quality: I don't have much experience training these models, but the performance is roughly on par with what I expect. I'm training a LoRA with rank 16 on "stabilityai/stable-diffusion-2" at the usual resolution of 768. On a 3090 I can fit a batch size of 8, and I'm getting 1.65 it/s. That's roughly 4 times faster than a LoRA I trained on the same machine on SDXL using another example script that had no issues. I hope that answers your question, but let me know if you need anything else.
Hi there. Will this change still matter in light of #5388?
I'll let you know when I get a chance to test this. I had a look at the peft code and I'm not sure: the LoRA weights are still added after moving the unet to the device, which is what caused the issue before. But I'm not sure how peft handles this; it might detect the mismatch and put the new weights on the right device.
@sayakpaul In my one test, it looks like peft fixes the issue; it must be checking the device under the hood and making sure the adapter ends up on the same device as the unet.
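For reference, here is a minimal sketch of the kind of check that would confirm this behaviour. It assumes the PEFT-backed path attaches LoRA via `add_adapter` with a `LoraConfig`; the rank, target modules, and checkpoint below are illustrative and not taken from this PR.

```python
# Sketch: add a LoRA adapter *after* the unet has been moved to CUDA, then verify that
# the new adapter parameters landed on the same device as the base weights.
# (Rank, target modules, and the model checkpoint here are illustrative assumptions.)
from diffusers import UNet2DConditionModel
from peft import LoraConfig

unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-2", subfolder="unet"
)
unet.to("cuda")  # base model is moved first, as in the training script

# diffusers models expose add_adapter when peft is installed
unet.add_adapter(
    LoraConfig(r=16, lora_alpha=16, target_modules=["to_q", "to_k", "to_v", "to_out.0"])
)

lora_devices = {p.device for name, p in unet.named_parameters() if "lora" in name}
print(lora_devices)  # expect only cuda devices, i.e. no adapter parameters left on CPU
```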
That is very good to know. Feel free to continue to test and let us know about your findings.
Is this still a problem? Note that we recently added
We recently incorporated
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
Resolves issue #5897: training with the example script examples/text_to_image/train_text_to_image_lora.py fails because parameters and data end up on two devices, CPU and CUDA. This is caused by the unet LoRA weights. I fix it by sending the unet to accelerator.device after setting up the LoRA weights.
Fixes #5897 (issue)
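To make the failure mode concrete, here is a minimal, self-contained sketch of the ordering issue; it illustrates the general PyTorch behaviour rather than the literal diff to the training script, and the unet and lora_delta names below are stand-ins.

```python
# Sketch of the bug: parameters created *after* a module has been moved to the
# accelerator device start out on CPU, so the .to(device) call has to happen
# (or be repeated) after they are added.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # stands in for accelerator.device

unet = nn.Linear(8, 8).to(device)   # stand-in for moving the unet to the device first
unet.lora_delta = nn.Linear(8, 8)   # stand-in for attaching LoRA weights afterwards

print(unet.weight.device)            # the accelerator device
print(unet.lora_delta.weight.device) # cpu -> "tensors on two devices" error during training

unet.to(device)                      # the fix: move the model again once the LoRA weights exist
print(unet.lora_delta.weight.device) # now on the accelerator device
```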
Before submitting
See the documentation guidelines and the tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul