Revert/Fix: epoch indexing from 1, to be from 0 #2289

Borda · 2020-06-19T22:46:37Z

What does this PR do?

This reverts commit f94b919

Reaction to #1946
Fixes #2206

Before submitting

Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?
Did you verify new and existing tests pass locally with your changes?
If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

pytorch_lightning/callbacks/gradient_accumulation_scheduler.py

pytorch_lightning/callbacks/progress.py

pytorch_lightning/trainer/training_loop.py

tests/models/test_restore.py

codecov · 2020-06-20T00:08:08Z

Codecov Report

Merging #2289 into master will increase coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #2289   +/-   ##
======================================
  Coverage      88%     88%           
======================================
  Files          70      70           
  Lines        5490    5490           
======================================
+ Hits         4819    4821    +2     
+ Misses        671     669    -2

williamFalcon · 2020-06-20T03:40:03Z

@Borda we need to index epoch from 0

Borda · 2020-06-20T05:42:15Z

> @Borda we need to index epoch from 0

yes, it is now... with this PR
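To make the indexing convention under discussion concrete, here is a minimal sketch of a zero-based epoch loop. It is not the PyTorch Lightning training loop; `epoch_indices` and its parameters are hypothetical names used only for illustration. The assumption is that the epoch counter starts at 0 and training stops once the counter reaches `max_epochs`, so `max_epochs=3` runs epochs 0, 1 and 2.

```python
def epoch_indices(max_epochs, start_epoch=0):
    """Return the zero-based epoch indices a run would visit.

    Hypothetical helper for illustration only: `start_epoch` models
    resuming from a checkpoint, and the run stops once the index
    reaches `max_epochs`.
    """
    if start_epoch < 0:
        raise ValueError("start_epoch must be >= 0 under zero-based indexing")
    return list(range(start_epoch, max_epochs))


# A fresh run of three epochs visits indices 0, 1, 2.
fresh_run = epoch_indices(max_epochs=3)

# Resuming after epoch index 1 has finished continues at index 2.
resumed_run = epoch_indices(max_epochs=3, start_epoch=2)
```

With one-based indexing the same `max_epochs=3` run would be numbered 1, 2, 3, which is the kind of mismatch that produces off-by-one errors in code that compares the epoch counter against zero-based schedules or saved checkpoints.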

* add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * small fixes * revert model checkpoint change * typo fix * fix tests * update train loop * cannot pass an int as default_save_path * refactor log message * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * fixes from rebase * chlog * docs * reformat * formatting * fix * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * small fixes * revert model checkpoint change * typo fix * fix tests * update 
train loop * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * fixes from rebase * chlog * docs * reformat * formatting * fix * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * fix test with new epoch indexing * fix progress bar totals * fix off by one error (see #2289) epoch starts at 0 now * added missing imports * fix hpc_save folderpath * fix formatting * fix tests * small fixes from a rebase * fix * tmpdir * tmpdir * tmpdir * wandb * fix merge conflict * add back evaluation after training * test_resume_early_stopping_from_checkpoint TODO * undo the horovod check * update changelog * remove a duplicate test from merge error * try fix dp_resume test * add the logger fix from master * try remove default_root_dir * try mocking numpy * try import numpy in docs test * fix wandb test * pep 8 fix * skip if no amp * dont mock when doctesting * install extra * fix the resume ES test * undo conf.py changes * revert remove comet pickle from test * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update weights_loading.rst * Update weights_loading.rst * Update weights_loading.rst * renamed flag * renamed flag * revert the None check in logger experiment name/version * add the old comments * _experiment * test chckpointing on DDP * skip the ddp test on windows * cloudpickle * renamed flag * renamed flag * parentheses for clarity * apply suggestion max epochs Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu> Co-authored-by: 
Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>

Revert "deprecated: epoch indexing from 1 (#2206)"

65bcefd

This reverts commit f94b919

Borda added the "bug" (Something isn't working) label Jun 19, 2020

Borda added this to the 0.8.x milestone Jun 19, 2020

mergify bot requested a review from a team June 19, 2020 22:47

Borda added 2 commits June 20, 2020 00:51

chlog

d461b16

grad index

bdc502c

Borda marked this pull request as ready for review June 19, 2020 22:59

Borda commented Jun 19, 2020

pytorch_lightning/callbacks/gradient_accumulation_scheduler.py (resolved)

pytorch_lightning/callbacks/progress.py (resolved)

pytorch_lightning/trainer/training_loop.py (resolved)

tests/models/test_restore.py (resolved)

Borda and others added 4 commits June 20, 2020 01:07

Apply suggestions from code review

1325ba4

tests

d864667

fix

92dcc80

test

6d1de81

Borda added the "priority: 0" (High priority task) label Jun 19, 2020

williamFalcon merged commit f278ac4 into master Jun 20, 2020

Borda deleted the epoch-indexing branch June 20, 2020 05:41

awaelchli added a commit that referenced this pull request Jun 22, 2020

fix off by one error (see #2289) epoch starts at 0 now

c5330ed

awaelchli mentioned this pull request Jun 22, 2020

fixes for early stopping and checkpoint callbacks #1504

Merged

10 tasks

This was referenced Jul 3, 2020

For versions >0.8.2 learning rate is zero for last epoch (potentially a logging bug) #2480

Closed

Start accumulate gradients schedule at epoch 0 #2490

Closed

Start accumulate gradients schedule at epoch 0 (continued) #2513

Merged
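The follow-up issues above concern a gradient accumulation schedule whose keys are epoch numbers. As a rough sketch of what "starting the schedule at epoch 0" means once epochs are indexed from 0, the standalone function below looks up the accumulation factor for a zero-based epoch. It is an assumption-laden illustration, not the code in `gradient_accumulation_scheduler.py`; the name `accumulation_factor`, the dict-based schedule format, and the default factor of 1 are all hypothetical.

```python
def accumulation_factor(schedule, epoch, default=1):
    """Return the accumulation factor that applies at a zero-based epoch.

    `schedule` maps the epoch index at which a factor takes effect to
    that factor. The factor of the largest key <= `epoch` applies; if no
    key has been reached yet, `default` is used. Hypothetical helper.
    """
    if any(start < 0 for start in schedule):
        raise ValueError("schedule keys must be zero-based epoch indices (>= 0)")
    factor = default
    for start in sorted(schedule):
        if epoch >= start:
            factor = schedule[start]
    return factor


# Accumulate 1 batch from epoch 0, 4 batches from epoch 2, 8 from epoch 5.
schedule = {0: 1, 2: 4, 5: 8}
factors = [accumulation_factor(schedule, e) for e in range(6)]
```

Under this convention a key of 0 refers to the very first epoch. If the epoch counter were one-based while the schedule keys stayed zero-based, every factor would take effect one epoch off from what the user wrote.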
