
Weekly Patch Release v1.4.6 [full merge, no squash] #9358

Merged: 17 commits from v1.4.6 into release/1.4.x on Sep 10, 2021

Conversation

@justusschock (Member) commented Sep 7, 2021

What does this PR do?

gh pr list -s merged -S 'merged:2021-09-01T16:30:00.000Z..2021-09-07T22:30:00.000Z' --json mergedAt,milestone,url,mergeCommit,title --jq 'sort_by(.mergedAt) | reverse | .[] | select((.milestone.title == "v1.4.x") or (.milestone.title == null)) | [.url, .mergeCommit.oid, .title] | join(" ")' --limit 100
https://github.com/PyTorchLightning/pytorch-lightning/pull/9288 6892d533ea1c743f7e05171846a28e685db85f51 Run plugin closure before `on_before_optimizer_step` [1/2]
https://github.com/PyTorchLightning/pytorch-lightning/pull/9311 d49709e29c6174be8bde7e1edc288180a2173adc Fix DeepSpeed warning CI Test
https://github.com/PyTorchLightning/pytorch-lightning/pull/9319 0135a4bd1ca338ddd1ceedd17c9a91dcd8d8be1f Remove some incorrect comments in ddp.py
https://github.com/PyTorchLightning/pytorch-lightning/pull/9336 98e2f56db090f38a09d5f63202688590702ba15a Clear reference to training loss at the end of train step
https://github.com/PyTorchLightning/pytorch-lightning/pull/9316 9149b649089976d2d723e33fca929bf92f192ff8 [bugfix] Resolve PyTorch Profiling for Manual Optimization
https://github.com/PyTorchLightning/pytorch-lightning/pull/9125 904dde7573c97245b45e477631ece989bd8c01e9 Fix inspection of unspecified args for container hparams
https://github.com/PyTorchLightning/pytorch-lightning/pull/9279 dc3391beaec4e16b08ffe8bf9a05cf8039f8b9e7 Remove deprecation warnings being called for `on_{task}_dataloader`
https://github.com/PyTorchLightning/pytorch-lightning/pull/8877 cf1a589956f86a0cf1a50c0710051eee9b082094 Allow disabling automatic stopping after max_steps or max_epochs
https://github.com/PyTorchLightning/pytorch-lightning/pull/9308 f6d40871bd52ac755a146958513a0a330b813b52 Prevent loss to be moved to the cpu before backward call.
https://github.com/PyTorchLightning/pytorch-lightning/pull/9301 9d0caa6928c28fcf2252c3acdc6fda8570e5adb9 Fix TPU cleaning job
https://github.com/PyTorchLightning/pytorch-lightning/pull/9156 d5ee8d8e3f46f0e5a6789f45d865fb348fd738f3 Disable `{save,check}_on_train_epoch_end` with `check_val_every_n_epoch>1`
https://github.com/PyTorchLightning/pytorch-lightning/pull/9261 f745aa9ce1b8a78b8ef27b939dc1db456837b374 Move tracking epoch end outputs logic to the `EvaluationEpochLoop`
https://github.com/PyTorchLightning/pytorch-lightning/pull/9223 a7461bfc3b98da2314c21603ee457c4b604f4c9a Add missing callbacks to `callbacks.rst`
https://github.com/PyTorchLightning/pytorch-lightning/pull/9232 ead2404aac20658b6ca0d99317bbaabc94f99f87 Added doc strings to base logger file
https://github.com/PyTorchLightning/pytorch-lightning/pull/9231 f0788b3bbc8773543297ee8d8c6d17a679703bb1 scheduled removal of auto_move_data decorator
https://github.com/PyTorchLightning/pytorch-lightning/pull/9267 69cdb79e33de3dc0b19aad4c6fe8c5c9d21d28c4 Add check for uninitialized `_sync_dir` in DDP Plugin to avoid errors during error handling
https://github.com/PyTorchLightning/pytorch-lightning/pull/9289 071ae498083afc131828c982b3fcb62944a751d1 Fix `LightningOptimizer.step` signature
https://github.com/PyTorchLightning/pytorch-lightning/pull/8800 e2ecb8f8591d79e81512cd70d773cb9b4c390132 Allow exporting to onnx when input is tuple
https://github.com/PyTorchLightning/pytorch-lightning/pull/9255 f9994d456cb264f8d66002eae6d7d51bd1ecc94d Update CHANGELOG following patch releases

Excluded #9231 due to breaking changes and #8877 since it is not a bugfix.

Also skipping #9308 since it is entangled with data fetching, and #9319 because it requires Post-LocalSGD.

Fixes #<issue_number>

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@justusschock justusschock changed the base branch from master to release/1.4.x September 7, 2021 12:26
@justusschock justusschock marked this pull request as ready for review September 7, 2021 12:28
@justusschock justusschock changed the title from "V1.4.6" to "Weekly Patch Release v1.4.6 [full merge, no squash]" Sep 7, 2021
@codecov (bot) commented Sep 7, 2021

Codecov Report

Merging #9358 (520d85d) into release/1.4.x (a61cc72) will increase coverage by 0%.
The diff coverage is 95%.

@@              Coverage Diff              @@
##           release/1.4.x   #9358   +/-   ##
=============================================
  Coverage             92%     92%           
=============================================
  Files                218     218           
  Lines              14490   14511   +21     
=============================================
+ Hits               13393   13419   +26     
+ Misses              1097    1092    -5     

@Borda Borda added the "Important" and "let's do it!" (approved to implement) labels Sep 7, 2021
@Borda Borda marked this pull request as draft September 7, 2021 15:04
@justusschock justusschock marked this pull request as ready for review September 7, 2021 16:22
@tchaton (Contributor) left a comment

LGTM !

justusschock and others added 5 commits September 7, 2021 18:38
* Update parsing.py

* add todo (for single arg)

* unblock non container single arg

* init test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update CHANGELOG.md

* pep8 line length

* Update pytorch_lightning/utilities/parsing.py

* remove dict namespace conversion

* add omegaconf support

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add dict test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add omegaconf test

* Update CHANGELOG.md

* Update pytorch_lightning/utilities/parsing.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/utilities/parsing.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Without clearing this reference, the loss tensor stays live through the next training
step. This can be a problem for memory-intensive models that produce very deep backward
graphs, such as neural ODEs. For these models, keeping the backward graph of the previous
loss in memory can lead to OOM errors in the next training step, even though the step might
have succeeded if we had cleared (and thus GC'd) the previous backward graph (see the
illustrative sketch below).

Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
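
To illustrate the point in the commit message above, here is a minimal sketch in plain PyTorch (not Lightning's actual loop code; the model, shapes, and optimizer are made up for illustration) showing why dropping the reference to the previous loss matters:

import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(3):
    x = torch.randn(64, 10)
    loss = model(x).pow(2).sum()  # loss.grad_fn keeps the backward graph reachable
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    # If `loss` stays referenced past this point, whatever is still reachable through
    # its graph can only be freed once the next iteration rebinds the name. Clearing
    # the reference eagerly lets Python release that memory before the next forward
    # pass builds a new graph, which mirrors what the fix above does at the end of
    # the training step.
    loss = None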
@mergify mergify bot added the "ready" (PRs ready to be merged) label Sep 7, 2021
@carmocca (Contributor) commented Sep 7, 2021

I think #9288 and #9308 should make it in.

Also, remember to update the milestones of the PRs that have no milestone or that you've decided not to include.

@justusschock (Member, Author) commented

I can add #9308 manually. When I tried to cherry-pick it, it somehow messed everything up since it included the data fetching. Not sure why...

@awaelchli awaelchli added this to the v1.4.x milestone Sep 7, 2021
@leezu (Contributor) commented Sep 7, 2021

Would it make sense to revert #9239 as part of 1.4.6? It can trigger "RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment". I'll open an issue about it soon.
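
For context, a minimal reproduction of that PyTorch limitation (illustrative only, not the exact code path touched by #9239): copy.deepcopy works on leaf tensors but raises this RuntimeError for tensors that were produced by autograd operations.

import copy

import torch

leaf = torch.ones(3, requires_grad=True)  # created by the user: a graph leaf
non_leaf = leaf * 2                       # produced by an op, so it carries a grad_fn

copy.deepcopy(leaf)  # works
try:
    copy.deepcopy(non_leaf)
except RuntimeError as err:
    # "Only Tensors created explicitly by the user (graph leaves) support the
    # deepcopy protocol at the moment"
    print(err)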

@carmocca (Contributor) commented Sep 7, 2021

> I'll open an issue about it soon

You can use #8821 - check the last few comments. A reproduction would be appreciated.

@justusschock justusschock force-pushed the v1.4.6 branch 2 times, most recently from 48483ef to fe4d3dd on September 8, 2021 20:58
@awaelchli awaelchli force-pushed the v1.4.6 branch 3 times, most recently from fbb7f16 to 269cb03 on September 9, 2021 21:24
Co-authored-by: Justus Schock <justus.schock@lfb.rwth-aachen.de>
@awaelchli (Contributor) commented

test_deepspeed_multigpu_stage_3 test passes but pytest hangs. 🙈 😭

Somehow DeepSpeed stage 3 changes made it into this PR even though none of the listed commits have any DeepSpeed fixes.

@lexierule lexierule merged commit 00c6640 into release/1.4.x Sep 10, 2021
@lexierule lexierule deleted the v1.4.6 branch September 10, 2021 13:39
Labels
  • let's do it! (approved to implement)
  • ready (PRs ready to be merged)
Projects
  None yet
10 participants