Trainer(precision=16) fails with optim.lr_scheduler.ReduceLROnPlateau #2078
Comments
Hi! Thanks for your contribution, great first issue!
@naokishibuya good catch. It seems like a problem that should be solved upstream in PyTorch, but for now we can solve this locally. Would you be up for a PR?
When I tried this fix, it solved the error but unfortunately …
I think that the fix is actually working, however only calling …
Again, I think this is a bit hacky, and a proper solution upstream in PyTorch is better.
I think this does the trick for me (here `optim` is `torch.optim`, imported at the top of the trainer module):

def reinit_scheduler_properties(self, optimizers: list, schedulers: list):
    # Reinitialize optimizer.step properties added by schedulers
    for scheduler in schedulers:
        # unwrap the scheduler dict returned from configure_optimizers
        scheduler = scheduler["scheduler"]

        for optimizer in optimizers:
            # check that we don't mix users optimizers and schedulers
            if scheduler.optimizer == optimizer:
                # Find the mro belonging to the base lr scheduler class
                state = None
                for i, mro in enumerate(scheduler.__class__.__mro__):
                    if (
                        mro == optim.lr_scheduler._LRScheduler
                        or mro == optim.lr_scheduler.ReduceLROnPlateau
                    ):
                        idx = i
                        state = scheduler.state_dict()
                        break

                # re-run the base class __init__ against the current optimizer,
                # then restore the scheduler state
                scheduler.__class__.__mro__[idx].__init__(scheduler, optimizer)
                if state is not None:
                    scheduler.load_state_dict(state)

Happy to open a PR if it looks ok to you guys.
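For what it's worth, a small standalone sketch (plain PyTorch, not Lightning code; the optimizer and parameter names are illustrative) of what this re-initialization does for a ReduceLROnPlateau instance:

import torch
from torch import optim

param = torch.nn.Parameter(torch.zeros(1))
old_opt = optim.SGD([param], lr=0.1)
new_opt = optim.SGD([param], lr=0.1)  # stands in for the re-created/wrapped optimizer

scheduler = optim.lr_scheduler.ReduceLROnPlateau(old_opt, patience=3)
state = scheduler.state_dict()

# ReduceLROnPlateau itself is the matching MRO entry, because it does not
# inherit from _LRScheduler in the PyTorch releases this issue was filed against
idx = type(scheduler).__mro__.index(optim.lr_scheduler.ReduceLROnPlateau)

# re-run the base __init__ against the new optimizer, then restore the state
type(scheduler).__mro__[idx].__init__(scheduler, new_opt)
scheduler.load_state_dict(state)

print(scheduler.optimizer is new_opt)  # True
print(scheduler.patience)              # 3, preserved through the state_dict round-trip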
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
1. Define a pl.LightningModule that returns your optimizer along with an optim.lr_scheduler.ReduceLROnPlateau scheduler from configure_optimizers
2. Create a pl.Trainer with precision=16
3. Call trainer.fit(model) (see the sketch below)
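A minimal sketch of such a setup (the module, dataloader, and hyperparameters are illustrative rather than taken from the report, and precision=16 assumes an AMP-capable environment for the PyTorch Lightning version in use at the time):

import torch
from torch import nn, optim
from torch.utils.data import DataLoader
import pytorch_lightning as pl

class PlateauModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        return {"loss": loss}

    def train_dataloader(self):
        return DataLoader(torch.randn(64, 32), batch_size=8)

    def configure_optimizers(self):
        optimizer = optim.SGD(self.parameters(), lr=0.1)
        scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer)
        # dict format so Lightning knows which metric the scheduler monitors
        return [optimizer], [{"scheduler": scheduler, "monitor": "loss"}]

model = PlateauModel()
trainer = pl.Trainer(precision=16, max_epochs=1)
# with the affected versions, fitting raises UnboundLocalError for `idx`
# while the trainer re-initializes the scheduler
trainer.fit(model)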
The error occurs in pytorch-lightning/pytorch_lightning/trainer/optimizers.py, line 122. The idx local variable is unassigned because optim.lr_scheduler.ReduceLROnPlateau is not a subclass of optim.lr_scheduler._LRScheduler.

I could work around the error by adding a specific check for optim.lr_scheduler.ReduceLROnPlateau, but I'm not sure if this is a good solution.

Related issue in PyTorch:
ReduceLROnPlateau parent class is not _LRScheduler (pytorch/pytorch#21981)
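For reference, the parent-class relationship that linked issue describes can be checked directly (on the PyTorch 1.x releases current when this was filed):

from torch import optim

print(optim.lr_scheduler.ReduceLROnPlateau.__bases__)
# (<class 'object'>,)
print(issubclass(optim.lr_scheduler.ReduceLROnPlateau, optim.lr_scheduler._LRScheduler))
# False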