Discriminative learning rates PyTorch

Adaptation of discriminative learning rates from the Fastai library for standard PyTorch.

This is an adaptation of the functions from the fastai library to be used with standard PyTorch. Please see the fastai GitHub repository for the original implementation.

Example of use

import torch
import torchvision

# Discriminative learning rates using Mask R-CNN (ResNet-50 FPN backbone)
# with SGD and CyclicLR. discriminative_lr_params is the function provided
# by this repository (adjust the import to wherever you place the file).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
params, lr_arr, _ = discriminative_lr_params(model, slice(1e-5, 1e-3))
optim = torch.optim.SGD(params, lr=1e-3, momentum=0.9, weight_decay=1e-1)
lr_scheduler = torch.optim.lr_scheduler.CyclicLR(optim, base_lr=list(lr_arr),
                                                 max_lr=list(lr_arr * 100))
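Note that CyclicLR accepts one base_lr/max_lr per parameter group, which is why both are passed as lists with one entry per group. The elementwise lr_arr * 100 suggests lr_arr is returned as a NumPy array; here each group's cyclical maximum is 100x its base. As with any CyclicLR schedule, lr_scheduler.step() is then called after every batch.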
  • Using slice(min_lr, max_lr) the last layer will have max_lr as its learning rate and the first one will have min_lr. All middle layers will have learning rates logarithmically interpolated between min_lr and max_lr (see the sketch after this list).
  • Using slice(lr) the last layer will have the given learning rate lr and all other layers will have lr/10.
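For illustration, such a log-spaced schedule can be reproduced with NumPy's geomspace (a minimal sketch, not the repository's implementation; min_lr, max_lr, and n_groups are placeholder names, with n_groups matching the 74 parameter groups in the table below):

import numpy as np

min_lr, max_lr, n_groups = 1e-5, 1e-3, 74  # placeholder values
lr_arr = np.geomspace(min_lr, max_lr, num=n_groups)  # log-spaced learning rates
print(lr_arr[:3])  # ~ [1e-05, 1.06512e-05, 1.13447e-05], cf. the table below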

We can show the learning rates using optim.state_dict()

Parameter   Lr*
Group 0     1e-05
Group 1     1.06512e-05
Group 2     1.13447e-05
Group 3     1.2085e-05
...         ...
Group 70    0.00083
Group 71    0.00088
Group 72    0.00094
Group 73    0.00099

* Learning rates are rounded here.
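For instance, the per-group learning rates can be printed straight from the optimizer's state:

for i, group in enumerate(optim.state_dict()['param_groups']):
    print(f"Group {i}: lr = {group['lr']:.6g}")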

Difference from the fastai version

The big difference is that each layer gets its own independent learning rate here, whereas in the fastai implementation layers are regrouped into blocks.
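As a rough sketch of the distinction on a toy model (illustrative only, not code from either library; the leaf-module detection and block split are assumptions made for the example):

import numpy as np
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))

# This repository's idea: one parameter group (and one lr) per layer.
layers = [m for m in model.modules()
          if not list(m.children()) and list(m.parameters())]
per_layer = [{'params': list(m.parameters()), 'lr': float(lr)}
             for m, lr in zip(layers, np.geomspace(1e-5, 1e-3, len(layers)))]

# fastai-style idea: layers regrouped into a few blocks, one lr per block.
blocks = [model[:2], model[2:]]
per_block = [{'params': list(b.parameters()), 'lr': float(lr)}
             for b, lr in zip(blocks, np.geomspace(1e-5, 1e-3, len(blocks)))]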
