🐛 Bug
When auto_scale_batch_size is enabled, the batch size finder first runs a few short training steps with varying batch sizes. When the actual training then begins, trainer.current_epoch equals 1 instead of 0.
To Reproduce
Either observe the progress bar or use a simple callback to track the epoch number, once with auto_scale_batch_size enabled and once with auto_scale_batch_size disabled.
from pytorch_lightning import Callback

class PrintCallback(Callback):
    def __init__(self):
        self.observed_epochs = []

    def on_train_epoch_start(self, trainer, pl_module):
        # Record the epoch index Lightning reports at the start of each epoch.
        print(f'Current Epoch: {trainer.current_epoch}')
        self.observed_epochs.append(trainer.current_epoch)
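For completeness, a minimal end-to-end sketch using the callback above. RandomDataset and BoringModel are stand-ins written for this example, and it assumes a Lightning version where auto_scale_batch_size is a Trainer argument and tuning is triggered via trainer.tune(model):

import torch
from torch.utils.data import DataLoader, Dataset
from pytorch_lightning import LightningModule, Trainer

class RandomDataset(Dataset):
    def __len__(self):
        return 64

    def __getitem__(self, idx):
        return torch.randn(32)

class BoringModel(LightningModule):
    def __init__(self, batch_size=2):
        super().__init__()
        # The batch size finder tunes this attribute in place.
        self.batch_size = batch_size
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        return self.layer(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def train_dataloader(self):
        return DataLoader(RandomDataset(), batch_size=self.batch_size)

callback = PrintCallback()
model = BoringModel()

trainer = Trainer(max_epochs=2, callbacks=[callback], auto_scale_batch_size=True)
trainer.tune(model)  # runs the batch size finder
trainer.fit(model)
print(callback.observed_epochs)  # with the bug, the first recorded epoch is 1, not 0

Running the same script with auto_scale_batch_size left at its default (disabled) records epoch 0 first, as expected.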
The problem is in model checkpointing. The checkpoint saved before batch size scaling sets 'epoch': self.current_epoch + 1, and that checkpoint is restored after the batch size finder completes. Since the epoch counter is never incremented during batch size scaling, training then resumes at epoch 1 instead of 0.
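In other words, the off-by-one comes from the checkpoint round trip wrapped around the batch size finder. A standalone sketch of that mechanism (ToyTrainer is a simplified stand-in written for this illustration, not Lightning's actual class):

class ToyTrainer:
    def __init__(self):
        self.current_epoch = 0

    def dump_checkpoint(self):
        # Mirrors 'epoch': self.current_epoch + 1; the checkpoint records the
        # next epoch so that a genuinely resumed run continues where it left off.
        return {'epoch': self.current_epoch + 1}

    def restore(self, checkpoint):
        self.current_epoch = checkpoint['epoch']

trainer = ToyTrainer()
ckpt = trainer.dump_checkpoint()  # saved before scaling: {'epoch': 1}
# ... the batch size finder runs here; current_epoch is never incremented ...
trainer.restore(ckpt)             # restores epoch 1
print(trainer.current_epoch)      # 1, so training begins at epoch 1 instead of 0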