pytorch-lr-explorer

A Jupyter notebook exploring several sophisticated techniques for managing the learning rate when training deep neural networks, including:

  • A systematic method for estimating a good learning rate, in which the learning rate is steadily increased over a short preliminary run of training and the training loss or accuracy is plotted against it. This idea was introduced by Leslie N. Smith in Cyclical Learning Rates for Training Neural Networks (a sketch appears after this list).
  • Time-based learning rate scheduling, in particular cosine annealing with warm restarts, in which the learning rate is cyclically varied between an upper and a lower bound along a cosine curve. This technique comes from SGDR: Stochastic Gradient Descent with Warm Restarts by Ilya Loshchilov and Frank Hutter (see the second sketch below).
  • Snapshot ensembling, an extension of the previous technique in which a snapshot of the model is saved at the end of each annealing cycle and the last M snapshots are used as an ensemble. This is based on Snapshot Ensembles: Train 1, Get M for Free by Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft and Kilian Q. Weinberger (see the final sketch below).
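
As a rough illustration of the first technique, the sketch below runs a short learning rate range test: the learning rate is multiplied by a constant factor after every batch, and the resulting (lr, loss) pairs can then be plotted. The function name and its arguments (`model`, `loader`, `loss_fn`) are illustrative, not taken from the notebook.

```python
import torch

def lr_range_test(model, loader, loss_fn, min_lr=1e-6, max_lr=1.0, num_steps=100):
    """Increase the LR exponentially each batch and record the training loss."""
    optimizer = torch.optim.SGD(model.parameters(), lr=min_lr)
    gamma = (max_lr / min_lr) ** (1.0 / num_steps)  # per-batch multiplier
    lrs, losses = [], []
    data_iter = iter(loader)
    for _ in range(num_steps):
        try:
            inputs, targets = next(data_iter)
        except StopIteration:  # recycle the loader if it runs out
            data_iter = iter(loader)
            inputs, targets = next(data_iter)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        lrs.append(optimizer.param_groups[0]["lr"])
        losses.append(loss.item())
        for group in optimizer.param_groups:
            group["lr"] *= gamma  # raise the LR for the next batch
    return lrs, losses  # plot losses against lrs to choose a setting
```

A good setting typically lies just below the learning rate at which the plotted loss stops decreasing and begins to diverge.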
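The cosine-annealing-with-warm-restarts schedule can be written as a pure function of the training step. This is a minimal sketch assuming the schedule is applied once per epoch; `cycle_len` and `t_mult` correspond to T_0 and T_mult in the paper.

```python
import math

def sgdr_lr(step, lr_max, lr_min=0.0, cycle_len=10, t_mult=2):
    """SGDR: lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t_cur / t_i))."""
    t_i, t_cur = cycle_len, step
    while t_cur >= t_i:  # locate the current cycle; a restart resets t_cur to 0
        t_cur -= t_i
        t_i *= t_mult    # each successive cycle is t_mult times longer
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / t_i))
```

Recent versions of PyTorch also ship this schedule as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.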
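Snapshot ensembling then only requires saving the model's weights whenever the schedule above completes a cycle (i.e. the learning rate reaches its minimum) and averaging the snapshots' predictions at test time. A hypothetical sketch, with names of my own choosing:

```python
import copy
import torch

snapshots = []  # one state_dict per completed annealing cycle

# ...inside the training loop, at the end of each cosine cycle:
#     snapshots.append(copy.deepcopy(model.state_dict()))

def ensemble_predict(model, snapshots, inputs, m=5):
    """Average the softmax outputs of the last m snapshots."""
    probs = []
    with torch.no_grad():
        for state in snapshots[-m:]:
            model.load_state_dict(state)
            model.eval()
            probs.append(torch.softmax(model(inputs), dim=1))
    return torch.stack(probs).mean(dim=0)  # averaged class probabilities
```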

These strategies are demonstrated on an image classification problem (CIFAR-10) using a simple ResNet-style convolutional neural network built with PyTorch.
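
For reference, a residual block in such a network might look like the following; this is a generic sketch rather than the notebook's exact architecture.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: two 3x3 convolutions with an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # add the identity shortcut, then activate
```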
