LRFinder w/ Gradient Accumulation #8
As of right now, I'm not planning on developing the package further. But I'll gladly review PRs.
Hi @rsomani95, I've implemented a version of LRFinder with support for gradient accumulation. Before the PR is merged, you can clone that version from my forked repository to run any tests you want:

```sh
$ git clone -b grad_acc_amp --single-branch https://github.com/NaleRaphael/pytorch-lr-finder
$ cd pytorch-lr-finder
$ pip install .
```

And please feel free to let me know if there is anything that needs to be improved!
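For context, the core idea behind a gradient-accumulation LR finder is to split one effective batch into several smaller mini-batches and only step the optimizer after their gradients have been accumulated. The snippet below is a minimal, self-contained sketch of that training step in plain PyTorch; the function name and arguments are illustrative placeholders, not the package's API.

```python
import torch


def train_step_with_accumulation(model, criterion, optimizer, data_iter,
                                 accumulation_steps=4, device="cuda"):
    """Run one effective training step built from several smaller mini-batches.

    Gradients from `accumulation_steps` mini-batches are summed before a single
    optimizer step, simulating training with a larger batch size.
    """
    model.train()
    optimizer.zero_grad()
    total_loss = 0.0

    for _ in range(accumulation_steps):
        inputs, targets = next(data_iter)
        inputs, targets = inputs.to(device), targets.to(device)

        loss = criterion(model(inputs), targets)
        # Divide by accumulation_steps so the summed gradients match the
        # gradient of the full-batch average loss.
        loss = loss / accumulation_steps
        loss.backward()
        total_loss += loss.item()

    optimizer.step()
    # Returns the averaged loss over the whole effective batch.
    return total_loss
```

Scaling each mini-batch loss by `accumulation_steps` is what keeps the accumulated gradient equivalent to the one a single large batch would produce (assuming equally sized mini-batches).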
@NaleRaphael this looks great! Give me a few days to get back to you with some feedback.
Hi @rsomani95! I've also added support for mixed precision training.
However, I ran into a little problem with it: mixed precision training takes longer than training in FP32. I'll try to validate it in the next few days. ;)
I've run more tests on my machine and found that there is a flag we can set for this. I also wrote a script for benchmarking LRFinder with mixed precision training. Here is the table of results from running that script under different conditions:
Besides, I've created a notebook on Colab. But the GPU used by Colab is a K80, which has no tensor cores. Although the performance doesn't benefit from tensor cores, LRFinder still seems stable enough to run with mixed precision.
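For reference, mixed precision training with NVIDIA apex typically follows the pattern below. This is a generic sketch of the standard `apex.amp` workflow with a synthetic model and data, not the exact code added in the PR.

```python
import torch
from torch import nn
from apex import amp  # requires NVIDIA apex: https://github.com/NVIDIA/apex

device = "cuda"
model = nn.Linear(128, 10).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# "O1" patches common ops to run in FP16 while keeping FP32 master weights.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

for _ in range(10):
    # Synthetic batch; in practice these would come from a DataLoader.
    inputs = torch.randn(32, 128, device=device)
    targets = torch.randint(0, 10, (32,), device=device)

    loss = criterion(model(inputs), targets)
    optimizer.zero_grad()
    # Scale the loss to avoid FP16 gradient underflow before backprop.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
```

On GPUs without tensor cores (such as the K80 mentioned above), this path still runs correctly, but most of the potential speed-up comes from tensor-core hardware.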
Hello @NaleRaphael. This is great work!! Thank you for sharing it.
In my experiments, I used an RTX 2080 Ti, so I expected performance gains with FP16. With FP32, it took ~18:47, whereas with FP16, it took ~15:00. Thank you again for your time and effort!
Hi @rsomani95. It's quite weird that it takes longer to run with mixed precision on my machine. However, it seems to me that it's not harmful to set that issue aside for now. Besides, it seems that apex is going to be integrated as a built-in component of PyTorch in the future (nvidia/apex#659). I will keep tracking this, too.
@davidtvs Before merging this PR, I would like to add some code so that users can install this package with optional mixed precision (amp) support. Thank you, guys!
Sounds good, I'll wait for your changes and then merge. Thanks
Hi @davidtvs. The installation command for mixed precision support is:

```sh
$ git clone -b grad_acc_amp --single-branch https://github.com/NaleRaphael/pytorch-lr-finder
$ cd pytorch-lr-finder
$ pip install -v --global-option="amp" ./
```

After the latest version of this package is published on PyPI, the corresponding command will also work when installing from PyPI. Thanks a lot for your review!
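One common way to support such a `--global-option` flag is to look for the extra token in `sys.argv` inside `setup.py` and adjust behaviour accordingly. The sketch below only illustrates that pattern; the package name, dependency list, and warning message are placeholders, not this package's actual `setup.py`.

```python
# setup.py -- illustrative sketch only, not this package's actual setup script
import sys
from setuptools import setup, find_packages

# `pip install -v --global-option="amp" ./` passes the extra token "amp"
# through to setup.py, so it shows up in sys.argv.
USE_AMP = "amp" in sys.argv
if USE_AMP:
    sys.argv.remove("amp")  # setuptools would reject an unknown argument

install_requires = ["torch", "numpy", "matplotlib", "tqdm"]  # placeholder deps

if USE_AMP:
    # apex is not distributed on PyPI, so the best we can do here is
    # check for it and warn the user to install it manually.
    try:
        import apex  # noqa: F401
    except ImportError:
        print("Warning: mixed precision requested but NVIDIA apex is not "
              "installed. See https://github.com/NVIDIA/apex for instructions.")

setup(
    name="example-lr-finder",  # placeholder metadata
    version="0.0.1",
    packages=find_packages(),
    install_requires=install_requires,
)
```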
* UPDATE: implement a new LRFinder with support for gradient accumulation. `AccumulationLRFinder` is a learning rate finder implemented with the mechanism of gradient accumulation. Besides, the iterator used for getting batches of training data is now replaced by `DataLoaderIterWrapper` to simplify the code and make the implementation of `AccumulationLRFinder` easier. The input parameters of `LRFinder._train_batch()` are also modified for the same reason.
* UPDATE: add support for mixed precision training (#8)
* UPDATE: add requirements for mixed precision training and update README
* MAINT: improve compatibility with Python 2 and some minor fixes
  - add `next = __next__` in the class `DataLoaderIterWrapper`
  - call `logging.basicConfig()` before getting a logger, see also: https://docs.python.org/2.7/library/logging.html#logging.log
  - add more information about installing this package for users who need to use it with mixed precision training
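The `next = __next__` alias mentioned above is the usual trick for making an iterator class work under both Python 2 and Python 3. Below is a minimal sketch of such a wrapper; the auto-restart behaviour and method bodies are illustrative and may differ from the package's actual implementation.

```python
class DataLoaderIterWrapper(object):
    """Wrap a DataLoader so batches can be fetched one at a time,
    restarting from the beginning once the loader is exhausted."""

    def __init__(self, data_loader):
        self.data_loader = data_loader
        self._iterator = iter(data_loader)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            batch = next(self._iterator)
        except StopIteration:
            # Restart the iterator so an LR range test can run for more
            # iterations than one pass over the DataLoader provides.
            self._iterator = iter(self.data_loader)
            batch = next(self._iterator)
        return batch

    # Python 2 calls `next()` instead of `__next__()`, so alias it.
    next = __next__
```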
The PR from @NaleRaphael is merged. Thanks @rsomani95 for raising the issue.
I'm considering changing the API for gradient accumulation; please have a look at PR #13 and give your feedback.
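As an illustration only: one way such an API change could look is to expose gradient accumulation as a parameter of the existing range test instead of a separate finder class. The `accumulation_steps` argument shown below is a hypothetical sketch, not necessarily the API proposed in PR #13.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from torch_lr_finder import LRFinder

# Tiny synthetic setup so the sketch is self-contained.
model = nn.Linear(16, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)
dataset = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
train_loader = DataLoader(dataset, batch_size=8)

lr_finder = LRFinder(model, optimizer, criterion, device="cpu")
# Hypothetical: an `accumulation_steps` argument on range_test() could
# replace the separate AccumulationLRFinder class.
lr_finder.range_test(train_loader, end_lr=10, num_iter=100, accumulation_steps=4)
lr_finder.plot()
lr_finder.reset()
```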
Great package! Thank you for sharing :)
Is it possible to use LRFinder with gradient accumulation to simulate training with a larger batch size?