This is a modification of the gdtuo source code from https://github.com/kach/gradient-descent-the-ultimate-optimizer that adds an implementation of norm-based averaged gradient clipping, as outlined in the following paper by Pascanu et al.: https://proceedings.mlr.press/v28/pascanu13.pdf.
For any
Open sample_usage_mnist.ipynb or sample_usage_cifar10.ipynb in JupyterLab. In cell 3, uncomment the desired optimizer stack and leave the others commented. Optimizers with clip=True have gradient clipping enabled; otherwise, gradient clipping is disabled. Execute the cells in the notebook as usual.
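
For readers unfamiliar with the technique, the following is a minimal sketch of norm-based gradient clipping in the spirit of Pascanu et al.: when the L2 norm of the gradient exceeds a threshold, the gradient is rescaled so that its norm equals the threshold, and it is left unchanged otherwise. This is not the code used in this repository. The helper name clip_by_norm, the threshold argument, and the use of plain Python lists are assumptions made for illustration, and the sketch does not cover the averaging step.

```python
import math


def clip_by_norm(grad, threshold):
    """Hypothetical helper: rescale grad so its L2 norm is at most threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > threshold:
        # Shrink every component by the same factor, preserving direction.
        scale = threshold / norm
        return [g * scale for g in grad]
    # Gradient is already small enough; return an unchanged copy.
    return list(grad)


# A gradient with norm 5 is scaled down to norm 1.
clipped = clip_by_norm([3.0, 4.0], threshold=1.0)

# A gradient with norm 0.5 is below the threshold and is left as-is.
unchanged = clip_by_norm([0.3, 0.4], threshold=1.0)
```

Because every component is scaled by the same factor, clipping changes only the step length and not the direction of the update.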