
auto learning rate finder fails AFTER optimization due to misconfiguration #5487

Closed
noamzilo opened this issue Jan 12, 2021 · 7 comments · Fixed by #5638
Labels: bug (Something isn't working), help wanted (Open to be worked on)

@noamzilo (Contributor) commented Jan 12, 2021

🐛 Bug

Optimizing learning rate using auto_lr_find=True fails AFTER a learning rate was found with error

```
pytorch_lightning.utilities.exceptions.MisconfigurationException: When auto_lr_find is set to True, expects that `model` or `model.hparams` either has field `lr` or `learning_rate` that can be overridden
```

AFTER having waited for several minutes.

This can be caught BEFORE the long run to save time.

Please reproduce using the BoringModel

I could paste everything here; the only missing line is `self.lr = 999` in the model's `__init__()`. If it is not present the error happens; otherwise it doesn't. A minimal sketch of what I mean is below.
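A BoringModel-style sketch (the layer size and dummy loss are placeholders, and `999` is just a stand-in value for the lr finder to override):

```python
import torch
from pytorch_lightning import LightningModule


class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)
        self.lr = 999  # removing this line triggers the MisconfigurationException

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        return self(batch).sum()

    def configure_optimizers(self):
        # the lr finder overwrites self.lr, so the optimizer reads it from here
        return torch.optim.SGD(self.parameters(), lr=self.lr)
```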

Reproducing this a single time takes me 15 minutes (due to the lr finding process), but I still wanted to report this because it seems important and I don't have time right now for a full reproduction on the template.

```python
trainer = Trainer(auto_lr_find=True)
trainer.tune(model)
```

then for a while I see the progress bar

```
Finding best initial lr:  27%|██▋       | 27/100 [02:43<07:37, 6.27s/it]
```

and then the error after a lr has been found.

Expected behavior

```
pytorch_lightning.utilities.exceptions.MisconfigurationException: When auto_lr_find is set to True, expects that `model` or `model.hparams` either has field `lr` or `learning_rate` that can be overridden
```

should hit BEFORE the long calculation.
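A sketch of the kind of up-front check I have in mind, using `lightning_hasattr` from `pytorch_lightning.utilities.parsing` (where exactly this hook would live in the tuner is an assumption on my part):

```python
from pytorch_lightning.utilities.exceptions import MisconfigurationException
from pytorch_lightning.utilities.parsing import lightning_hasattr


def _check_lr_attr_exists(model):
    # Run this before starting the lr search, so a misconfigured model
    # fails immediately instead of after minutes of searching.
    if not lightning_hasattr(model, "lr") and not lightning_hasattr(model, "learning_rate"):
        raise MisconfigurationException(
            "When auto_lr_find is set to True, expects that `model` or "
            "`model.hparams` either has field `lr` or `learning_rate` "
            "that can be overridden"
        )
```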

Environment

  • PyTorch Version (e.g., 1.0): 1.7.1
  • OS (e.g., Linux): Windows
  • How you installed PyTorch (conda, pip, source): pip
  • Build command you used (if compiling from source): -
  • Python version: 3.7
  • CUDA/cuDNN version: 10.1
  • GPU models and configuration: single NVIDIA Quadro P4000
  • Any other relevant information: -
@noamzilo added the bug (Something isn't working) and help wanted (Open to be worked on) labels on Jan 12, 2021
@rohitgr7 (Contributor)

Good catch! Mind sending a PR with a fix?

@noamzilo (Contributor, Author)

> Good catch! Mind sending a PR with a fix?

I've actually never contributed to open source before :)

How does this work? Do I get assigned, and is there a deadline?
Is there a readme?

I want to do it!

@carmocca (Contributor)

We have a basic guide here!

There's no deadline! Any contribution is welcome 😄

@noamzilo (Contributor, Author)

Can two people end up working on the same issue without knowing about each other, until one of them opens a PR?

@carmocca (Contributor)

Yes. For this reason, you might want to create the PR in draft mode while you are working on it, so others know about it:

https://github.blog/2019-02-14-introducing-draft-pull-requests/

@edenlightning (Contributor)

Thanks for contributing @noamzilo! I have assigned this ticket to you so other contributors know it's already in progress. Please let us know if you need any help!

@noamzilo (Contributor, Author) commented Jan 24, 2021

How do I link this issue to the PR?
PR at:

#5638

Currently my test doesn't pass because `pytest-timeout` is not installed on the test server (`pip install pytest-timeout`; see https://stackoverflow.com/questions/19527320/how-can-i-limit-the-maximum-running-time-for-a-unit-test).

Please allow this library, or advise how I should limit the test's running time. A sketch of the kind of limit I mean is below.
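A minimal sketch, assuming `pytest-timeout` is available (the model, the 10-second budget, and the `match` pattern are placeholders, not the exact test in the PR):

```python
import pytest
import torch
from torch.utils.data import DataLoader
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.utilities.exceptions import MisconfigurationException


class ModelWithoutLr(LightningModule):
    # deliberately defines no `lr` / `learning_rate` attribute
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        return self(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


@pytest.mark.timeout(10)  # fail the test if the check doesn't fire quickly
def test_auto_lr_find_raises_before_search(tmpdir):
    trainer = Trainer(default_root_dir=tmpdir, auto_lr_find=True)
    dl = DataLoader(torch.randn(64, 32), batch_size=2)
    with pytest.raises(MisconfigurationException, match="auto_lr_find"):
        trainer.tune(ModelWithoutLr(), dl)
```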

Thanks :)
