-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove on epoch guard from the should stop validation check #7701
Conversation
Codecov Report
@@ Coverage Diff @@
## master #7701 +/- ##
======================================
- Coverage 93% 92% -0%
======================================
Files 199 199
Lines 12965 12960 -5
======================================
- Hits 11995 11952 -43
- Misses 970 1008 +38 |
Can you include the refactors in the changelog, including #7682 ? I have a section "Refactored Loops" |
|
||
assert trainer.current_epoch == 0 | ||
assert trainer.global_step == 5 | ||
assert model.validation_called_at == (0, 4) # TODO(@carmocca): should be 5 - will be fixed in next PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why 5, in the current way global step is used it is expected.
what are you planning to fix here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, but the current way is "wrong" (although expected given the design in master). That's why I used the verb "fix"
The fix will be incrementing the global_step count appropriately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we said we would use the progress tracking to count optimizer step?
The only difference between global_step and the real optimizer step count is that the global_step update is delayed for logging.
Sorry if I mix something up again here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM !
* Remove on epoch guard from the should stop validation check * Formatting
* Remove on epoch guard from the should stop validation check * Formatting
* Remove on epoch guard from the should stop validation check * Formatting
What does this PR do?
Remove
if on_epoch
when deciding whether to run validation ifshould_stop=True
.Add a test.
The behavior is the same in both cases because even when
should_stop=True
and_should_check_val_fx()
returns False mid-epoch, when it gets called again out of the batch-loop, it returns True.Part of #7357
Before submitting
PR review