
Move code from TrainerOptimizersMixin into TrainingTypePlugin #10681

Closed
daniellepintz opened this issue Nov 22, 2021 · 7 comments · Fixed by #11155

@daniellepintz (Contributor) commented Nov 22, 2021

Proposed refactor

Move all the code from this class into the TrainingTypePlugin (TTP):
https://github.com/PyTorchLightning/pytorch-lightning/blob/2036dfb5df2a48056d1c043ee311aed793187070/pytorch_lightning/trainer/optimizers.py#L28

Motivation

After #10596, the TTP will own the optimizer-related logic previously owned by the Accelerator. It makes sense for the TTP to also own the optimizer-related code in TrainerOptimizersMixin. This would let us avoid circuitous logic like this: https://github.com/PyTorchLightning/pytorch-lightning/blob/dcafc95f2b0fd3f176d425139ca99676ce943a12/pytorch_lightning/plugins/training_type/training_type_plugin.py#L244-L245

In general, we want to reduce the number of mixins (#10417).

cc @justusschock @awaelchli @akihironitta

@awaelchli (Contributor)

Thanks for the initiative.

I propose to move only the contents of the init_optimizers method to the TTP. There, we would check whether the model has a configure_optimizers method and call it.
The static methods could become functions and be part of core (LightningModule), because they are responsible for parsing the output of the LightningModule hook. They don't belong in the TTP, imo.
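
For illustration only, a rough sketch of that split; parse_optimizer_output and the simplified TrainingTypePlugin below are placeholder names, not the actual Lightning source:

from typing import Any, List, Tuple


def parse_optimizer_output(output: Any) -> Tuple[List, List]:
    # Hypothetical free function living in core: it validates and normalizes
    # whatever configure_optimizers returned into (optimizers, lr_schedulers).
    # The real parser would handle all of the supported return formats.
    if (
        isinstance(output, (list, tuple))
        and len(output) == 2
        and isinstance(output[0], (list, tuple))
        and isinstance(output[1], (list, tuple))
    ):
        optimizers, schedulers = output
        return list(optimizers), list(schedulers)
    return [output], []


class TrainingTypePlugin:
    def __init__(self, model) -> None:
        self.model = model

    def init_optimizers(self) -> Tuple[List, List]:
        # The plugin owns the call into the user hook ...
        if not hasattr(self.model, "configure_optimizers"):
            raise TypeError("The model must define configure_optimizers")
        output = self.model.configure_optimizers()
        # ... but delegates parsing/validation to the shared helper in core.
        return parse_optimizer_output(output)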

daniellepintz self-assigned this on Nov 23, 2021
@daniellepintz (Contributor, Author)

@awaelchli Sounds good!!

@daniellepintz (Contributor, Author)

> and be part of core (LightningModule)

@awaelchli do you mean in lightning.py?

@awaelchli (Contributor)

Yes, basically in core/lightning.py or core/optimizer.py, somewhere close to the hook that retrieves the optimizers and schedulers from the user (configure_optimizers).

@ananthsub (Contributor)

@awaelchli is something like this what you had in mind?

class LightningModule(...):
    
    @abstractmethod
    def configure_optimizers(self):
        ...

    def init_optimizers_and_lr_schedulers(self) -> TypedDict[...]:
        output = self.configure_optimizers()
        # parse & validate output
        # return components

Then we could also schematize / add typing for configure_optimizers so that all of the type checking & validation lives in one place, with the strategy calling init_optimizers rather than configure_optimizers directly.
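
A hedged sketch of the strategy side of that arrangement; the Strategy class, setup_optimizers, and the dictionary keys here are assumptions made for the example, not the real Lightning API:

class Strategy:
    # Illustrative stand-in for the TTP / strategy class, not the real one.
    def __init__(self, lightning_module) -> None:
        self.lightning_module = lightning_module
        self.optimizers = []
        self.lr_schedulers = []

    def setup_optimizers(self) -> None:
        # Call the LightningModule wrapper rather than configure_optimizers directly,
        # so all type checking and validation stays in one place on the module.
        out = self.lightning_module.init_optimizers_and_lr_schedulers()
        self.optimizers = out["optimizers"]
        self.lr_schedulers = out["lr_schedulers"]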

@daniellepintz (Contributor, Author)

@ananthsub it sounds like you are saying we should move init_optimizers to the LM as well, rather than putting it in the TTP?

@ananthsub (Contributor) commented Dec 18, 2021

> @ananthsub it sounds like you are saying we should move init_optimizers to the LM as well, rather than putting it in the TTP?

I think the caller of configure_optimizers should be the strategy. However, the parsing & validation logic for the output of configure_optimizers should be defined in one shared place.
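
To make that division of responsibility concrete, here is a rough sketch with hypothetical names (_init_optimizers_and_lr_schedulers as a shared helper, e.g. in core/optimizer.py, and AnyStrategy); the real parsing would handle the various supported return formats of configure_optimizers, which is elided here:

from typing import Any, List, Tuple


def _init_optimizers_and_lr_schedulers(model: Any) -> Tuple[List, List]:
    # Shared, single definition of the parsing & validation (hypothetical name/location).
    output = model.configure_optimizers()
    if output is None:
        raise ValueError("configure_optimizers must return at least one optimizer")
    optimizers = list(output) if isinstance(output, (list, tuple)) else [output]
    lr_schedulers: List = []  # validation of scheduler configs would also live here
    return optimizers, lr_schedulers


class AnyStrategy:
    # The strategy is the caller of the shared helper; it never parses the output itself.
    def __init__(self, model: Any) -> None:
        self.model = model
        self.optimizers: List = []
        self.lr_schedulers: List = []

    def setup_optimizers(self) -> None:
        self.optimizers, self.lr_schedulers = _init_optimizers_and_lr_schedulers(self.model)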
