Mixed precision training #2291

Open · y-akbal opened this issue Jul 16, 2023 · 4 comments

y-akbal commented Jul 16, 2023

Motivation and description

Just wondering if there is a way to do mixed precision training in Flux?

Possible Implementation

No response

mcabbott (Member) commented

With the new-style training, I think this should basically just work.

`m16 = f16(m32)` makes a low-precision copy of the model; you can use that copy to compute the gradient `g16`, and then `update!(opt_state, m32, g16)` will apply the change to the original model.
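A minimal sketch of that recipe (the model, data, and loss below are placeholders; it assumes the new-style `Flux.setup`/`update!` API):

```julia
using Flux

m32 = Chain(Dense(4 => 8, relu), Dense(8 => 1))  # full-precision (Float32) model
opt_state = Flux.setup(Adam(), m32)              # optimiser state kept at full precision

x, y = randn(Float32, 4, 16), randn(Float32, 1, 16)

# One mixed-precision step:
m16 = f16(m32)                                   # low-precision copy of the model
g16 = gradient(m -> Flux.mse(m(f16(x)), f16(y)), m16)[1]
Flux.update!(opt_state, m32, g16)                # apply the Float16 gradient to the Float32 model
```

In a training loop you would re-make `m16` from the updated `m32` at each iteration.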

Not all operations support Float16, though; e.g. I'm not sure about convolutions. There may be other unanticipated problems.

It would be super-nice to have an example of this, e.g. a model zoo page which uses it.

CarloLucibello (Member) commented

In FluxML/Optimisers.jl#152 I introduce an optimiser that handles behind the scenes what @mcabbott described.
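For a rough picture of how such a rule could work, here is a hypothetical sketch written against Optimisers.jl's public rule interface (`Optimisers.init`/`Optimisers.apply!`); the `MixedPrecision` name and all details are made up for illustration and are not the PR's actual code:

```julia
using Optimisers

# Hypothetical wrapper rule: keep a Float32 master copy of each parameter,
# run the inner rule at full precision, and step the low-precision parameter
# onto the updated master copy.
struct MixedPrecision{R} <: Optimisers.AbstractRule
    inner::R
end

Optimisers.init(o::MixedPrecision, x::AbstractArray) =
    (Float32.(x), Optimisers.init(o.inner, Float32.(x)))

function Optimisers.apply!(o::MixedPrecision, state, x, dx)
    x32, inner_state = state
    inner_state, dx32 = Optimisers.apply!(o.inner, inner_state, x32, Float32.(dx))
    x32 = x32 .- dx32                # update the master copy in Float32
    # update! subtracts the returned value from x, so x ends up ≈ x32:
    return (x32, inner_state), x .- eltype(x).(x32)
end
```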

mcabbott (Member) commented

y-akbal (Author) commented Jul 26, 2023

Great, thank you!
