
running on tpu #25

Merged · 1 commit into ml-gde:main on Oct 19, 2024

Conversation

@entrpn (Contributor) commented on Oct 17, 2024

This PR adds the ability to run on TPUs, although it's very slow right now. I'm unsure how this hooks into uv, as I used pip. Please review.

  • Loads encoders on CPU when device_type == 'tpu' (see the sketch below).
  • Can run on 32GB of HBM, i.e., TPUv4.
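
A minimal sketch of that device-placement idea (the function and model IDs here are illustrative, not the PR's actual diff):

```python
from transformers import CLIPTextModel, T5EncoderModel

def load_encoders(device_type: str):
    """Keep the large text encoders in host RAM when targeting a TPU.

    A TPUv4 chip has 32GB of HBM, so offloading the T5/CLIP encoders
    to CPU leaves that memory for the main transformer. Model IDs are
    placeholders.
    """
    clip = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
    t5 = T5EncoderModel.from_pretrained("google/t5-v1_1-xxl")
    if device_type == "tpu":
        # Text encoding runs once per prompt, so CPU placement trades a
        # little latency for a large HBM saving.
        return clip.to("cpu"), t5.to("cpu")
    return clip.to(device_type), t5.to(device_type)
```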

@SauravMaheshkar linked an issue on Oct 18, 2024 that may be closed by this pull request: add TPU support
@SauravMaheshkar (Collaborator) left a comment

Thanks a lot for the PR @entrpn. I've had a look at the diff and it makes sense to me. I'll just confirm by running this on TPU later today.

@SauravMaheshkar added the feature 🚀 New feature or request label on Oct 18, 2024
@ariG23498 (Collaborator) left a comment

LGTM!

I have asked for some TPU resources. Once I have them, I can check this PR.

On another note, I have the following queries:

  1. We have Flax implementations of T5 and CLIP in Hugging Face; would it be easier for us to load those instead of the PyTorch versions? (A rough sketch of what that could look like follows this list.)
  2. Do you see any place we can apply parallelization to make this implementation faster? I understand this is a denser question that requires one to look into the code; if that is too much to ask at this time, I completely understand if you don't want to answer it now.
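
For illustration, loading the Flax variants through transformers could look roughly like this (a sketch only; the model IDs are placeholders, not necessarily what this repo uses):

```python
from transformers import FlaxCLIPTextModel, FlaxT5EncoderModel

# Illustrative only: load JAX/Flax checkpoints directly instead of the
# PyTorch ones. `from_pt=True` converts PyTorch weights on the fly when
# a checkpoint ships no native Flax weights.
clip = FlaxCLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
t5 = FlaxT5EncoderModel.from_pretrained("google/t5-v1_1-xxl", from_pt=True)
```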

@ariG23498 merged commit 5892764 into ml-gde:main on Oct 19, 2024
@ariG23498 (Collaborator) commented

I checked the implementation on my side on a TPU v4.

Thanks for the great contribution. ❤️

@cataluna84 left a comment

LGTM 🤩

This work to use TPUs looks fantastic.

@entrpn (Contributor, Author) commented on Oct 19, 2024

> LGTM!
>
> I have asked for some TPU resources. Once I have them, I can check this PR.
>
> On another note, I have the following queries:
>
>   1. We have Flax implementations of T5 and CLIP in Hugging Face; would it be easier for us to load those instead of the PyTorch versions?
>   2. Do you see any place we can apply parallelization to make this implementation faster? I understand this is a denser question that requires one to look into the code; if that is too much to ask at this time, I completely understand if you don't want to answer it now.

I will take some time to understand the code better. This should be able to run faster and be parallelizable across all TPU devices.

For reference, I created a PyTorch/XLA version that runs Flux on TPUs and generates 4 images in parallel in under 10 seconds: https://github.com/entrpn/diffusers/tree/flux_ptxla
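
As a sketch of that kind of per-device parallelism (illustrative, not the code at the link above; `load_pipeline` is a hypothetical helper), PyTorch/XLA can spawn one process per TPU device, each generating its own image:

```python
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp

def _generate(index: int, prompt: str = "a photo of an astronaut") -> None:
    # Each spawned process owns one XLA (TPU) device and runs the
    # pipeline independently, so n devices yield n images per step.
    device = xm.xla_device()
    pipe = load_pipeline().to(device)  # hypothetical loader
    image = pipe(prompt).images[0]
    image.save(f"out_{index}.png")

if __name__ == "__main__":
    # By default xmp.spawn uses all available TPU devices.
    xmp.spawn(_generate)
```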

Labels: feature 🚀 New feature or request
Projects: Status: Done
Linked issues (may be closed by merging this pull request): add TPU support
Participants: 5