
[feat] OSS: Support nvidia's LARC #81

Closed
blefaudeux opened this issue Sep 11, 2020 · 3 comments · Fixed by #86

@blefaudeux
Contributor

🚀 Feature

Make it possible to support LARC with OSS

Motivation

LARC is a must-have for large-batch jobs. Right now OSS breaks with LARC because of the closure being passed to step().

Pitch

Should be doable to gracefully handle optimizers which do not support closures in step().
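
A minimal sketch of one way this could work (illustrative only, not fairscale's actual API; `_supports_closure` and `step_with_optional_closure` are hypothetical helper names):

```python
import inspect

import torch

def _supports_closure(optimizer: torch.optim.Optimizer) -> bool:
    # torch.optim optimizers define step(closure=None); LARC's step() does not.
    # Inspect the signature rather than hard-coding optimizer types.
    return "closure" in inspect.signature(optimizer.step).parameters

def step_with_optional_closure(optimizer, closure=None):
    # Pass the closure through when the wrapped optimizer supports it;
    # otherwise evaluate it up front and call step() without arguments.
    if closure is None:
        return optimizer.step()
    if _supports_closure(optimizer):
        return optimizer.step(closure)
    loss = closure()
    optimizer.step()
    return loss
```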

Alternatives

Not supporting LARC, which would greatly reduce interest in OSS.

Additional context

cc @mannatsingh @prigoyal @msbaines

@blefaudeux blefaudeux self-assigned this Sep 11, 2020
@msbaines
Contributor

SGTM

But if you are using LARC to wrap Adam, then you may want to consider the fused implementation of LAMB available in apex. It is implemented as a fused multi-tensor CUDA kernel, so it will run orders of magnitude faster than LARC, which is implemented in Python as a wrapper around Adam.

If you are wrapping something other than Adam, you may obtain a noticeable speedup by implementing a fused multi-tensor kernel. Something like this could be added to fairscale.
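
For reference, a sketch of dropping in apex's fused LAMB in place of a python-level LARC(Adam) wrapper (assumes apex is installed with its CUDA extensions; the model here is a stand-in):

```python
import torch
from apex.optimizers import FusedLAMB  # NVIDIA apex, built with CUDA extensions

model = torch.nn.Linear(1024, 1024).cuda()

# Fused multi-tensor CUDA kernels update all parameters per step,
# instead of a python loop over parameters as in LARC wrapping Adam.
optimizer = FusedLAMB(model.parameters(), lr=1e-3, weight_decay=0.01)
```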

@blefaudeux
Contributor Author

The need for now (classy/vissl) is mostly around SGD, using LARC as a wrapper, which breaks on the closure keyword argument. I'll keep the rest of Apex in mind; we should probably converge indeed.
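
For context, a sketch of the failure mode described above (assuming apex's `apex.parallel.LARC` import path; the model is a stand-in):

```python
import torch
from apex.parallel import LARC  # python-level LARC wrapper from NVIDIA apex

model = torch.nn.Linear(8, 8)
optimizer = LARC(torch.optim.SGD(model.parameters(), lr=0.1))

# OSS forwards a closure into step(), but LARC.step() takes no closure
# argument, so this call raises a TypeError.
optimizer.step(closure=lambda: None)
```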

@mannatsingh

Just to add a data point: I trained a RegNetY 128GF model on 8 nodes using FusedSGD and didn't notice any significant speedup.
