[feat] Add Logging Restoration on Failure 2/2 #7966
…ghtning/pytorch-lightning into fault_tolerant_log_2/n
Codecov Report
@@ Coverage Diff @@
## master #7966 +/- ##
======================================
- Coverage 93% 92% -0%
======================================
Files 211 211
Lines 13416 13448 +32
======================================
- Hits 12447 12437 -10
- Misses 969 1011 +42
Epic!
- return partial(self.fn, reduce_op=self.op, group=self.group) if self.should else self.no_op
+ return (
+     partial(self.fn, reduce_op=self.op, group=self.group)
+     if self.should and not self.rank_zero_only else self.no_op
@kazhang as an FYI, this is where the `self.rank_zero_only` check was introduced.
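For readers following the review, here is a minimal, self-contained sketch of what the changed expression does. It is not the actual Lightning class: `SyncSketch`, `sync_fn`, and `fake_reduce` are hypothetical names invented for illustration, and only the attribute names (`fn`, `should`, `op`, `group`, `rank_zero_only`, `no_op`) come from the diff. The idea, as far as the diff shows, is that the reduction function is only bound when syncing is requested and the value is not restricted to rank zero; otherwise a no-op is returned.

```python
from dataclasses import dataclass
from functools import partial
from typing import Any, Callable, Optional


@dataclass
class SyncSketch:
    """Hypothetical stand-in for the object touched by the diff."""

    fn: Callable[..., Any]
    should: bool = False
    op: Optional[str] = None
    group: Optional[Any] = None
    rank_zero_only: bool = False

    @staticmethod
    def no_op(value: Any, *_: Any, **__: Any) -> Any:
        # Return the value unchanged when no cross-process sync is wanted.
        return value

    @property
    def sync_fn(self) -> Callable[..., Any]:
        # Same conditional as in the diff: skip the reduction entirely
        # when the value is logged on rank zero only.
        return (
            partial(self.fn, reduce_op=self.op, group=self.group)
            if self.should and not self.rank_zero_only else self.no_op
        )


def fake_reduce(value: Any, reduce_op: Optional[str] = None, group: Any = None) -> Any:
    # Assumption for the demo: two ranks hold the same value and "sum" adds them.
    return value * 2 if reduce_op == "sum" else value


synced = SyncSketch(fake_reduce, should=True, op="sum").sync_fn(3)
rank_zero = SyncSketch(fake_reduce, should=True, op="sum", rank_zero_only=True).sync_fn(3)
```

With these assumptions, `synced` goes through the reduction while `rank_zero` is passed through untouched, which is the behavior the added `not self.rank_zero_only` condition appears to be guarding.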
What does this PR do?
Improve support for state restoration of results
Requires the TorchMetrics (TM) release candidate.
Tracking Issue: #7898
Follow up to #7948
Before submitting
PR review