[feat] OSS: Support NVIDIA's LARC #81
Comments
SGTM. But if you are using LARC to wrap Adam, then you may want to consider using the fused implementation of LAMB available in apex. It is implemented as a fused multi-tensor CUDA kernel, so it will run orders of magnitude faster than LARC, which is implemented in Python and wraps Adam. If you are wrapping something other than Adam, you may still obtain a noticeable speedup by implementing a fused multi-tensor kernel. Something like this could be added to fairscale.
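For reference, a minimal sketch of the suggested apex alternative, assuming apex is installed with its CUDA extensions; the model and hyperparameters are placeholders:

```python
import torch
from apex.optimizers import FusedLAMB

model = torch.nn.Linear(128, 10).cuda()

# FusedLAMB applies the layer-wise adaptive scaling inside a fused
# multi-tensor CUDA kernel, avoiding the per-parameter Python loop
# that a LARC wrapper around Adam would perform.
optimizer = FusedLAMB(model.parameters(), lr=1e-3, weight_decay=0.01)

loss = model(torch.randn(4, 128, device="cuda")).sum()
loss.backward()
optimizer.step()
```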
The need for now (classy/vissl) is mostly around SGD, using LARC as a wrapper, which breaks with the closure being passed to step().
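For context, a minimal sketch of this classy/vissl-style use case, with LARC wrapping plain SGD; the model and hyperparameters are placeholders:

```python
import torch
from apex.parallel.LARC import LARC

model = torch.nn.Linear(128, 10)

base_optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# LARC is a thin Python wrapper: it rescales each parameter's update by a
# layer-wise trust ratio before delegating to the base optimizer's step().
optimizer = LARC(base_optimizer, trust_coefficient=0.001)

loss = model(torch.randn(4, 128)).sum()
loss.backward()
optimizer.step()  # LARC.step() does not accept a closure argument
```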
Just to add a data point: I trained a RegNetY 128GF model on 8 nodes using FusedSGD and didn't notice any significant speedup.
🚀 Feature
Make it possible to support LARC with OSS
Motivation
LARC is a must-have for large-batch jobs. Right now OSS breaks on LARC because of the closure() being passed to the wrapped optimizer's step().
Pitch
Should be doable to gracefully handle optimizers which do not support closures in step().
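A minimal sketch of what graceful handling could look like, assuming a wrapper class that keeps the sharded optimizer in a `self.optim` attribute; this is not fairscale's actual implementation:

```python
import inspect

def step(self, closure=None, **kwargs):
    # Only forward the closure if the wrapped optimizer's step() accepts one.
    accepts_closure = "closure" in inspect.signature(self.optim.step).parameters

    if closure is not None and accepts_closure:
        loss = self.optim.step(closure=closure, **kwargs)
    else:
        # Evaluate the closure locally (if any), then step without it,
        # so optimizers like LARC whose step() takes no closure still work.
        loss = closure() if closure is not None else None
        self.optim.step(**kwargs)

    # ... broadcasting the updated shards across ranks would follow here ...
    return loss
```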
Alternatives
Not supporting LARC would significantly reduce interest in OSS.
Additional context
cc @mannatsingh @prigoyal @msbaines