cut-out train step [wip] #891
Conversation
Tests are failing because of some cycling...
I'm not convinced accessing `last_epoch` directly is the best idea. Could always put in a check for the PyTorch version and do different things for different versions if that's the problem.
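If version-dependent behaviour really is the problem, a minimal sketch of that check might look like the following (purely illustrative: the `1.4.0` cutoff, `StepLR`, and `current_epoch` are placeholder assumptions, not code from this PR).

```python
from packaging import version
import torch

model = torch.nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
current_epoch = 5  # placeholder for the trainer's current epoch

# Branch on the installed PyTorch version and step the scheduler accordingly.
if version.parse(torch.__version__) >= version.parse("1.4.0"):
    scheduler.step()  # newer releases track the epoch counter internally
else:
    scheduler.step(epoch=current_epoch)  # older API accepted an explicit epoch
```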
m = "ReduceLROnPlateau conditioned on metric val_loss" \ | ||
f" which is not available. Available metrics are: {avail_metrics}" | ||
raise MisconfigurationException(m) | ||
self.reduce_lr_on_plateau_scheduler.step(val_loss, epoch=self.current_epoch) |
Why use `.step` here if not above?
thx, I have overlooked this one :]
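For context on the exchange above: `ReduceLROnPlateau` is the one built-in scheduler whose `.step()` expects the monitored value, which is why `val_loss` is passed in the snippet. A standalone toy sketch of that pattern (all values made up, not code from this PR):

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", patience=2)

for epoch in range(10):
    optimizer.step()              # stand-in for a real training epoch
    val_loss = 1.0 / (epoch + 1)  # made-up validation loss
    scheduler.step(val_loss)      # the plateau scheduler needs the monitored metric
```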
```python
for lr_scheduler in self.lr_schedulers:
    # accessing the Scheduler state directly,
    # see the base class `torch.optim.lr_scheduler._LRScheduler`
    lr_scheduler.last_epoch = self.current_epoch
```
Not sure this is a good idea, any alternatives that involve just calling `.step`?
Well, I can make a loop and call `.step()` until `.last_epoch` reaches `current_epoch - 1`. Unless set differently at init, it starts stepping from 0. On the other hand, there is a desired feature to resume the process from step N, so you need to set that particular step... but I see your point that it would be better to do this at scheduler creation...
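A minimal sketch of that fast-forward idea, assuming a standard `_LRScheduler`-style object and an externally tracked `current_epoch` (illustrative only, not the code that ended up in the PR):

```python
def fast_forward_scheduler(lr_scheduler, current_epoch):
    """Advance the scheduler via repeated .step() calls instead of writing
    to .last_epoch directly, so any side effects inside .step() still run."""
    while lr_scheduler.last_epoch < current_epoch - 1:
        lr_scheduler.step()

# e.g. when resuming training at epoch 7:
# fast_forward_scheduler(scheduler, current_epoch=7)
```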
Yeah, it's a difficult choice, but the while-loop solution seems good. At least this way, if something important happens inside `.step()`, it still gets run.
The original issue with schedulers was solved by #890.
What does this PR do?
Cut the overly large train epoch out into a separate method; see #856 (comment).
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.