Description & Motivation
Support for MPS autocasting has recently been added in PyTorch 2.5.0 (here), and there is an ongoing effort to implement gradient scaling (here).
PyTorch Lightning does not currently support mixed precision on the MPS device, but it could be added in the near future once gradient scaling is finalized.
Is this feature being considered? It would reduce memory usage and improve training time for some models.
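For context, here is a minimal sketch of the upstream autocast support, assuming PyTorch >= 2.5 on Apple silicon with the MPS backend available; the model and tensor shapes are placeholders:

```python
import torch

# Fall back to CPU if the MPS backend is unavailable (placeholder guard).
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = torch.nn.Linear(128, 64).to(device)
x = torch.randn(32, 128, device=device)

# MPS autocast (added in PyTorch 2.5.0) runs eligible ops such as matmuls
# in float16 while keeping numerically sensitive ops in float32.
with torch.autocast(device_type="mps", dtype=torch.float16):
    y = model(x)

print(y.dtype)  # torch.float16 for the autocast-eligible linear layer
```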
Pitch
Currently, PyTorch Lightning falls back to FP32 when mixed precision is requested on MPS and issues a warning that mentions CUDA. Adding a dedicated path for MPS mixed precision would be great; a sketch of the intended usage follows.
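A hedged sketch of the intended user-facing behavior, assuming Lightning's existing `Trainer` precision API; `MyModel` is a hypothetical LightningModule standing in for any model:

```python
import lightning as L

# Hypothetical target behavior: today this configuration falls back to FP32
# with a CUDA-oriented warning; with MPS support it would enable autocast
# (and, once available upstream, gradient scaling) on Apple-silicon GPUs.
# MyModel is a placeholder LightningModule defined elsewhere.
trainer = L.Trainer(accelerator="mps", devices=1, precision="16-mixed")
trainer.fit(MyModel())
```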
Alternatives
Stick to FP32 training when using the MPS device.
Additional context
Thanks for your work!
cc @Borda
Thank you @laclouis5, we'll be happy to accept a PR when grad scaling is in. Would you like to give it a shot?
@lantiga Yes, sure! The draft pull request is here: #20531.