
Make training_epoch_end behave like validation_epoch_end #1357

Merged (4 commits) Apr 3, 2020

Conversation

@jbschiratti (Contributor) commented Apr 3, 2020

What does this PR do?

This PR fixes #914.

The LightningModule class currently implements a validation_epoch_end method but no training_epoch_end method. The CHANGELOG.md file suggests that, from version 0.7.0, the training_end method (now training_step_end) is to be renamed to training_epoch_end. However, this is not satisfactory, as training_epoch_end would still not behave like validation_epoch_end.

This PR ensures that training_epoch_end and validation_epoch_end have the same behavior (for instance, allowing the user to log metrics only once, at the end of each epoch).
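To illustrate the behavior the PR enables, here is a hedged sketch of the kind of reduction such an epoch-end hook performs: it receives the list of outputs returned by training_step over the epoch and collapses them into a single logged metric. This is plain Python (no torch or pytorch_lightning imports), and the function and key names are illustrative, not the library's exact API.

```python
# Hypothetical sketch of what an epoch-end hook does: reduce the list of
# per-step outputs into one epoch-level metric dict. In Lightning (at the
# time of this PR), values under the 'log' key are sent to the logger.

def training_epoch_end(outputs):
    """Average the per-step 'loss' values into a single epoch metric."""
    losses = [out["loss"] for out in outputs]
    avg_loss = sum(losses) / len(losses)
    return {"log": {"train_loss_epoch": avg_loss}}

step_outputs = [{"loss": 1.0}, {"loss": 0.5}, {"loss": 0.0}]
print(training_epoch_end(step_outputs))
# {'log': {'train_loss_epoch': 0.5}}
```

The point of the PR is that this reduction runs once per epoch over the accumulated step outputs, mirroring what validation_epoch_end already did for validation steps.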

PR review

Anyone in the community is free to review the PR once the tests have passed 👍

@pep8speaks commented Apr 3, 2020

Hello @jbschiratti! Thanks for updating this PR.

Line 575:111: E501 line too long (118 > 110 characters)

Comment last updated at 2020-04-03 11:15:16 UTC

@mergify mergify bot requested a review from a team April 3, 2020 08:20
@Borda Borda added bug Something isn't working feature Is an improvement or enhancement labels Apr 3, 2020
@Borda Borda added this to the 0.7.2 milestone Apr 3, 2020
@Borda (Member) left a comment

Great catch! thx 🚀 pls add a note to changelog...

Review threads: pytorch_lightning/trainer/training_loop.py (2, resolved)
@Borda Borda requested review from jeffling, MattPainter01 and a team April 3, 2020 08:43
@justusschock justusschock requested a review from a team April 3, 2020 08:48
@mergify bot (Contributor) commented Apr 3, 2020

This pull request is now in conflict... :(

@ethanwharris (Member) left a comment

Thanks for the PR. Unfortunately, there's a potential memory leak here: the process_output function doesn't detach the tensors, so when I run your code, each tensor in the OrderedDict given to on_training_epoch_end still has an associated grad_fn. To resolve this, we need a way to detach any tensors in the output, perhaps based on process_output.
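The fix the reviewer suggests can be sketched as a recursive walk that detaches every tensor-like leaf in a (possibly nested) output structure. The sketch below is an assumption about the shape of such a helper, not the code the PR actually merged; a tiny stand-in class mimics a tensor's .detach() so the example is self-contained (with real torch, the duck-typed hasattr check would match torch.Tensor).

```python
# Hypothetical recursive-detach helper: returns a copy of `value` with
# .detach() called on every tensor-like leaf, so no grad_fn (and hence no
# autograd graph) is kept alive across the epoch.

def recursive_detach(value):
    if hasattr(value, "detach"):          # tensor-like leaf
        return value.detach()
    if isinstance(value, dict):           # recurse into dict values
        return {k: recursive_detach(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):  # recurse into sequences
        return type(value)(recursive_detach(v) for v in value)
    return value                          # plain Python value: keep as-is

class FakeTensor:
    """Stand-in for torch.Tensor; detached tensors have grad_fn = None."""
    def __init__(self, grad_fn="AddBackward"):
        self.grad_fn = grad_fn
    def detach(self):
        return FakeTensor(grad_fn=None)

out = {"loss": FakeTensor(), "log": {"acc": FakeTensor()}}
clean = recursive_detach(out)
print(clean["loss"].grad_fn, clean["log"]["acc"].grad_fn)
# None None
```

Detaching matters because a tensor that still carries a grad_fn keeps its whole backward graph reachable, so accumulating such tensors over an epoch can grow memory unboundedly.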

Review threads: pytorch_lightning/core/lightning.py (resolved), pytorch_lightning/trainer/training_loop.py (resolved)
@Borda Borda requested a review from ethanwharris April 3, 2020 11:26
@Borda Borda added the ready PRs ready to be merged label Apr 3, 2020
@jbschiratti (Contributor, Author) commented

CircleCI failed on test_ddp_all_dataloaders_passed_to_fit; however, AFAIK, this does not seem to be related to the PR. Is it?

@Borda Borda merged commit 868b172 into Lightning-AI:master Apr 3, 2020
@jbschiratti (Contributor, Author) commented

Thanks @Borda, @ethanwharris and @justusschock 👍🎉!

@awaelchli (Contributor) left a comment

This is great! One obvious thing that was missing is now finally here :))

I would like to point out some things that were overlooked.

Review threads: pytorch_lightning/core/lightning.py (4, resolved)
@Borda (Member) commented Apr 3, 2020

@awaelchli thank you for your careful reading. May we apply your suggestions in a follow-up PR?

@awaelchli (Contributor) commented

Yeah, sure :)
By the way, is it not possible to add more commits to this branch and merge again?

@Borda (Member) commented Apr 3, 2020

Well, you would need to fork jbschiratti:fix_914 (another forked repo), so I would suggest starting from the actual master, as this is already merged...

jbschiratti added a commit to jbschiratti/pytorch-lightning that referenced this pull request Apr 3, 2020
@jbschiratti jbschiratti mentioned this pull request Apr 3, 2020
williamFalcon pushed a commit that referenced this pull request Apr 3, 2020
* Doc fixes from #1357 (awaelchli's comments) + changelog.

* Fix indentation.

* Add blank line to fix doc build?
alexeykarnachev pushed a commit to alexeykarnachev/pytorch-lightning that referenced this pull request Apr 4, 2020
…I#1357)

* Make training_epoch_end behave like validation_epoch_end + minor fixes in docstrings.

* Minor fixes (Borda's comments).

* Detach tensors in batch_output (to avoid possible memory leak) + doc fix.

Co-authored-by: Jean-Baptiste SCHIRATTI <jean-baptisteschiratti@MacBook-Pro-de-Jean-Baptiste.local>
alexeykarnachev pushed a commit to alexeykarnachev/pytorch-lightning that referenced this pull request Apr 4, 2020
* Doc fixes from Lightning-AI#1357 (awaelchli's comments) + changelog.

* Fix indentation.

* Add blank line to fix doc build?
@Borda Borda mentioned this pull request Apr 4, 2020
tullie pushed a commit to tullie/pytorch-lightning that referenced this pull request Jun 7, 2020
* Doc fixes from Lightning-AI#1357 (awaelchli's comments) + changelog.

* Fix indentation.

* Add blank line to fix doc build?
@Borda Borda modified the milestones: v0.7., v0.7.x Apr 18, 2021
Successfully merging this pull request may close these issues.

Log training metrics for each epoch
6 participants