log/save_interval based on global step #3667
Conversation
Hello @awaelchli! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-09-29 14:19:21 UTC
Global step updates only on the training step, right? I tried to log it after every epoch, and it is incrementing by an extra +1 after every epoch. Is this expected behavior or a bug?
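(A minimal way to observe this with a Lightning callback; this is an illustrative snippet, not code from the PR, and assumes the standard Callback hooks:)

    import pytorch_lightning as pl

    class GlobalStepPrinter(pl.Callback):
        def on_train_epoch_end(self, trainer, pl_module):
            # With N batches per epoch, global_step would be expected to
            # land on multiples of N here; the reported bug made it drift
            # by an extra +1 each epoch.
            print(f"epoch={trainer.current_epoch} global_step={trainer.global_step}")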
@rohitgr7 Yes, I remember discussing this before with someone; I can't find the issue or PR, unfortunately.
Agreed. @awaelchli Ideally IMO …
So you say we should first fix this global step problem before we can switch to logging on global step? OK, that makes sense.
Maybe just removing this line will do the job, but only if it is meant to not update in …
@rohitgr7 William fixed it; it should be good now, and we can trust global_step to be aligned in all epochs.
This pull request is now in conflict... :(
LGTM ✌️
LGTM
Awesome fix!
Codecov Report
@@            Coverage Diff            @@
##           master    #3667     +/-   ##
=========================================
+ Coverage      85%      90%       +5%
=========================================
  Files         110      110
  Lines        8355     8849      +494
=========================================
+ Hits         7141     7982      +841
+ Misses       1214      867      -347
What does this PR do?
This issue came up frequently: nothing is getting logged, so what's the problem?
If the number of training batches in an epoch is smaller than row_log_interval, nothing gets logged, because the per-epoch batch counter never reaches the interval.
This PR fixes this by basing row_log_interval on global_step, which keeps increasing across epochs.
The current logging behaviour does not change, except in the edge case where epochs are shorter than the logging interval.
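A minimal sketch of the old vs. new check (illustrative only; row_log_interval and global_step come from the PR description, the surrounding class and method names are assumed):

    # Illustrative sketch, not the actual Trainer code.
    class LoggingLoop:
        def __init__(self, row_log_interval: int):
            self.row_log_interval = row_log_interval
            self.global_step = 0

        def should_log_old(self, batch_idx: int) -> bool:
            # Old behaviour (assumed): based on the batch index within the
            # epoch, so an epoch shorter than the interval never triggers a log.
            return (batch_idx + 1) % self.row_log_interval == 0

        def should_log_new(self) -> bool:
            # New behaviour per this PR: based on global_step, which keeps
            # increasing across epoch boundaries.
            return (self.global_step + 1) % self.row_log_interval == 0

        def training_step(self, batch_idx: int) -> None:
            if self.should_log_new():
                print(f"logging at global_step={self.global_step}")
            self.global_step += 1

    # With 5 batches per epoch and row_log_interval=7, the old check never
    # fires; the new one fires at global steps 6 and 13 over these 3 epochs.
    loop = LoggingLoop(row_log_interval=7)
    for epoch in range(3):
        for batch_idx in range(5):
            loop.training_step(batch_idx)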
Fixes #3466
Fixes #3506
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃