Weekly Patch Release v1.4.6 [full merge, no squash] #9358
Codecov Report
@@ Coverage Diff @@
## release/1.4.x #9358 +/- ##
=============================================
Coverage 92% 92%
=============================================
Files 218 218
Lines 14490 14511 +21
=============================================
+ Hits 13393 13419 +26
+ Misses 1097 1092 -5
LGTM !
…uring error handling (#9267)
* added doc strings to base logger
* updated docs
* Update parsing.py
* add todo (for single arg)
* unblock non container single arg
* init test
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update CHANGELOG.md
* pep8 line length
* Update pytorch_lightning/utilities/parsing.py
* remove dict namespace conversion
* add omegaconf support
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add dict test
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add omegaconf test
* Update CHANGELOG.md
* Update pytorch_lightning/utilities/parsing.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/utilities/parsing.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Without clearing this reference, the loss tensor stays live through the next training step. This can be a problem for memory-intensive models that produce very deep backward graphs, such as neural ODEs. For these models, keeping the backward graph of the previous loss in memory can lead to OOM errors in the next training step, even though the step might have succeeded if we had cleared (and thus GC'd) the previous backward graph.
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
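The effect described in the commit message above (a cached reference keeping the previous backward graph alive) can be illustrated with a minimal pure-Python sketch. This is not the Lightning implementation: `FakeGraph`, `FakeLoss`, `Loop`, and `graph_alive_after_step` are hypothetical stand-ins introduced only for illustration, and a `weakref` is used to observe whether the "graph" object has been freed.

```python
import gc
import weakref


class FakeGraph:
    """Hypothetical stand-in for a large backward graph (not a PyTorch type)."""


class FakeLoss:
    """Hypothetical stand-in for a loss tensor that owns its backward graph."""

    def __init__(self):
        self.graph = FakeGraph()


class Loop:
    """Hypothetical training loop that caches the result of the last step."""

    def __init__(self, clear_reference):
        self.clear_reference = clear_reference
        self._last_loss = None

    def step(self):
        loss = FakeLoss()
        self._last_loss = loss  # cached reference keeps loss (and graph) alive
        graph_ref = weakref.ref(loss.graph)  # observe the graph without owning it
        # ... backward pass and optimizer step would happen here ...
        if self.clear_reference:
            # Drop the cached reference so the graph can be collected
            # before the next step allocates a new one.
            self._last_loss = None
        return graph_ref


def graph_alive_after_step(clear_reference):
    """Return True if the previous step's graph is still in memory."""
    loop = Loop(clear_reference)
    graph_ref = loop.step()
    gc.collect()
    return graph_ref() is not None


print(graph_alive_after_step(clear_reference=False))
print(graph_alive_after_step(clear_reference=True))
```

With the reference kept, the stand-in graph survives past the end of the step; once the cached reference is cleared, nothing else owns the loss and the graph can be reclaimed before the next step runs.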
I can add #9308 manually. When I tried to cherry-pick it, it somehow messed everything up since it included the data fetching. Not sure why...
Would it make sense to revert #9239 as part of 1.4.6? It can trigger "RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment". I'll open an issue about it soon.
You can use #8821 - check the last few comments. A reproduction would be appreciated.
Co-authored-by: Justus Schock <justus.schock@lfb.rwth-aachen.de>
Somehow the DeepSpeed stage 3 changes made it into this PR even though none of the listed commits contain any DeepSpeed fixes.
What does this PR do?
Excluded #9231 because it introduces breaking changes, and #8877 because it is not a bugfix.
Also skipping #9308 since it's entangled with data fetching,
and #9319 because it requires Post LOCALSGD.
Does your PR introduce any breaking changes? If yes, please list them.
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃