
Nvidia Apex for FP16 calculations #36

Merged 2 commits on Jul 24, 2019
Conversation

YacobBY (Contributor) commented on Jul 23, 2019

Added compatibility with Nvidia's Apex library, which can do floating-point 16 (FP16) calculations. This gives a significant speedup in training. The code has been tested on a single RTX 2070. If the Nvidia Apex library is not found, the code should run as normal. A minimal sketch of the usual integration pattern is included after the examples below.

To install Apex: https://github.com/NVIDIA/apex#quick-start

Known bugs:
- Does not work with the adam parameter
- Gradient overflow keeps happening at the start, but Apex automatically reduces the loss scale to 8192, after which the notification disappears

Examples:
Loading: https://i.imgur.com/3nZROJz.png
Training: https://i.imgur.com/Q2w52m7.png
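
For context, the typical Apex amp integration boils down to two calls, `amp.initialize` and `amp.scale_loss`. The following is a minimal sketch of that pattern under assumptions (placeholder model, optimizer, and data; opt level "O1"); it is not the exact diff of this pull request, but it shows how training can fall back to plain FP32 when Apex is missing.

```python
# Minimal sketch (assumed names, not the exact PR code): use Apex amp for FP16
# training when the library is installed, otherwise fall back to plain FP32.
import torch
import torch.nn as nn

try:
    from apex import amp
    APEX_AVAILABLE = True
except ImportError:
    APEX_AVAILABLE = False

device = "cuda"
model = nn.Linear(128, 10).to(device)                      # placeholder model
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0)
criterion = nn.CrossEntropyLoss()

if APEX_AVAILABLE:
    # "O1" mixed precision; amp also does dynamic loss scaling, which is why
    # the loss scale drops (e.g. to 8192) after the initial overflow messages.
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

for _ in range(10):                                         # placeholder training loop
    x = torch.randn(32, 128, device=device)
    y = torch.randint(0, 10, (32,), device=device)
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    if APEX_AVAILABLE:
        # Scale the loss before backward so FP16 gradients do not underflow.
        with amp.scale_loss(loss, optimizer) as scaled_loss:
            scaled_loss.backward()
    else:
        loss.backward()
    optimizer.step()
```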

ku21fan merged commit 5d4ed38 into clovaai:master on Jul 24, 2019

ku21fan (Contributor) commented on Jul 24, 2019

@YacobBY Thank you for the pull request!
I have not tried floating-point 16 calculation yet, but it seems to work, so I merged it :)


ku21fan (Contributor) commented on Jul 24, 2019

@YacobBY
I am sorry, but I will revert to the previous version.
Instead, I will refer to this pull request in the README.
The reasons are below.

  1. Using floating-point 16 calculation is not the default option in our paper, so not everyone needs to install Apex or know about it.
  2. I know that the code runs as normal if the Apex library is not found,
    but I feel the code became slightly more complex after merging this pull request.
    I hope to keep this code simple.
  3. The known bugs you mentioned :'(

Best.


YacobBY (Contributor, Author) commented on Jul 24, 2019

@ku21fan Hello JeongHun,

I understand. The Apex compatibility code has indeed added a lot of lines, and FP16 is very new, so not many people have the hardware and library to run it yet.

Currently I'm trying out some newer deep-learning tools such as PyTorch Lightning and Nvidia Apex. I might be able to use the Apex FusedAdam optimizer instead of the default Adam option when Apex is available. This should fix the Adam bug, but it still adds quite a few extra lines of code.
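
For illustration, here is a rough sketch of what that conditional swap could look like (placeholder model and hyperparameters, assumed names; not code from this pull request). Apex's `FusedAdam` lives in `apex.optimizers`, and the fallback is the stock `torch.optim.Adam`.

```python
# Rough sketch (assumption, not this PR's code): use Apex's fused Adam kernel
# when the library is available, otherwise the stock PyTorch Adam.
import torch
import torch.nn as nn

try:
    from apex.optimizers import FusedAdam
    APEX_AVAILABLE = True
except ImportError:
    APEX_AVAILABLE = False

model = nn.Linear(128, 10).cuda()   # placeholder model

if APEX_AVAILABLE:
    # Fused CUDA implementation of Adam; pairs well with amp FP16 training.
    optimizer = FusedAdam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
else:
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
```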

If I can get the other Apex functionality working, I'll try to get back to you with a neater and more modular version.

In any case, thanks for your open-source code! It's really helpful and I've learned a lot from it.
