Core Trainer Connectors #10417

awaelchli · 2021-11-08T17:57:23Z

Proposed refactoring or deprecation

Reduce the number of connectors the Trainer relies on to only three core ones:

AcceleratorConnector
LoggerConnector
DataConnector

Motivation

As part of the Lightning API audit led by @ananthsub + co., we have already proposed several simplifications and code-quality improvements to connectors in #7493, #7654, #9778, #10161, #10119, #10108, #10110, etc.
A few connectors remain problematic, for several reasons:

Their responsibilities overlap too closely with those the Trainer should have (e.g. data connector vs. data_loading mixin)
They modify the state of the Trainer, effectively impersonating it (see all connectors)
They have been reduced over time and are no longer useful (e.g. optimizer connector)

These three properties make most connectors a burden to maintain, as they merely obscure the fact that the Trainer remains too powerful a class.

Pitch

Remove (refactor away) all connectors except the core ones:

AcceleratorConnector
LoggerConnector
DataConnector

We (@awaelchli @daniellepintz + co) believe that these three have enough complexity and encapsulate enough responsibility to warrant their existence as standalone classes. Hence, we formulate these goals:

Simplify and document the three core connectors listed above
Remove Trainer references
Arrange ownership of components: LoggerConnector should own the logger instance, AcceleratorConnector should own the accelerator instance, etc.
Refactor away all others such that their logic lives in the Trainer directly.
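
To make the ownership goal concrete, here is a minimal sketch contrasting the two shapes. Every class and attribute name below (`ToyLogger`, `TrainerBoundConnector`, `OwningConnector`, `LegacyTrainer`, `SketchTrainer`) is a hypothetical stand-in chosen for illustration; none of this is Lightning's actual implementation.

```python
from typing import Any, Dict, List, Optional


class ToyLogger:
    """Hypothetical stand-in for a logger; keeps metrics in memory."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    def log_metrics(self, metrics: Dict[str, Any], step: int) -> None:
        self.records.append({"step": step, **metrics})


class TrainerBoundConnector:
    """Pattern to move away from: holds a Trainer reference and mutates it."""

    def __init__(self, trainer: "LegacyTrainer") -> None:
        self.trainer = trainer

    def on_trainer_init(self, logger: ToyLogger) -> None:
        # The connector writes state onto the Trainer, so Trainer attributes
        # end up being defined outside the Trainer class.
        self.trainer.logger = logger


class LegacyTrainer:
    def __init__(self, logger: ToyLogger) -> None:
        self.logger: Optional[ToyLogger] = None
        self.logger_connector = TrainerBoundConnector(self)
        self.logger_connector.on_trainer_init(logger)


class OwningConnector:
    """Proposed shape: owns the logger and has no Trainer reference."""

    def __init__(self, logger: ToyLogger) -> None:
        self.logger = logger

    def log(self, metrics: Dict[str, Any], step: int) -> None:
        self.logger.log_metrics(metrics, step)


class SketchTrainer:
    def __init__(self, logger: ToyLogger) -> None:
        self._logger_connector = OwningConnector(logger)

    @property
    def logger(self) -> ToyLogger:
        # The Trainer exposes the logger but delegates ownership.
        return self._logger_connector.logger
```

In the second shape the connector can be constructed and tested without a Trainer, which is one way to read the "Remove Trainer references" goal.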

Additional context

The "DataLoadingMixin" and the DataConnector have a great deal in common. Since the "DataLoadingMixin" is not a true mixin and we aim to remove the "mixins" from the Trainer completely, the DataConnector is the natural home for this logic.
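
A rough sketch of what such a move could look like, under the assumption that "not a true mixin" means the class depends on attributes that only the Trainer defines. The names (`DataLoadingMixinSketch`, `DataConnectorSketch`, `batch_limit`, and so on) are hypothetical and do not mirror Lightning's real classes.

```python
from typing import Callable, Iterable, List


class DataLoadingMixinSketch:
    """Hypothetical 'mixin' that only works when mixed into a Trainer,
    because it reads and writes attributes it does not define itself."""

    def reset_train_dataloader(self, source: Callable[[], Iterable[int]]) -> None:
        # Relies on `self.batch_limit` being set by the inheriting class.
        self.train_batches = list(source())[: self.batch_limit]


class MixinTrainer(DataLoadingMixinSketch):
    def __init__(self, batch_limit: int) -> None:
        self.batch_limit = batch_limit
        self.train_batches: List[int] = []


class DataConnectorSketch:
    """Same logic as a standalone class: its inputs are explicit and it
    returns a result instead of mutating the Trainer."""

    def __init__(self, batch_limit: int) -> None:
        self.batch_limit = batch_limit

    def load_train_batches(self, source: Callable[[], Iterable[int]]) -> List[int]:
        return list(source())[: self.batch_limit]


class ComposedTrainer:
    def __init__(self, batch_limit: int) -> None:
        self._data_connector = DataConnectorSketch(batch_limit)
        self.train_batches: List[int] = []

    def reset_train_dataloader(self, source: Callable[[], Iterable[int]]) -> None:
        self.train_batches = self._data_connector.load_train_batches(source)
```

Both trainers behave the same from the outside; the difference is that the connector version states its dependencies in its constructor rather than inheriting them implicitly.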

cc @justusschock @awaelchli @akihironitta @rohitgr7 @kaushikb11 @ninginthecloud

tchaton · 2021-11-15T11:19:54Z

Hey @awaelchli,

I know there is a strong push to reduce the connectors to a minimum, and I don't like this effort.
@williamFalcon introduced the connectors in the first place to make the Trainer approachable to new readers and contributors. The goal was to make the highest layer of Lightning as clean as possible.

IMO, the Trainer code right now is getting more complex than it used to be: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/trainer.py#L447 and I have seen tweets about Lightning becoming unreadable.

I would prefer that we come up with a better approach to organising the code instead of dumping everything into the Trainer class and making it unreadable.

awaelchli added the refactor label Nov 8, 2021

awaelchli changed the title ~~Trainer Connectors~~ Core Trainer Connectors Nov 8, 2021

awaelchli added the "design" (Includes a design discussion) label Nov 8, 2021

four4fish mentioned this issue Nov 8, 2021

[RFC] Simplifying the Accelerator Connector logic and flags #10422

Closed

kaushikb11 mentioned this issue Nov 12, 2021

Moved env_vars_connector._defaults_from_env_vars to utilities.argsparse._defaults_from_env_vars #10501

Merged

12 tasks

awaelchli mentioned this issue Nov 16, 2021

Improve code quality in AcceleratorConnector._configure_slurm_ddp #10102

Merged

11 tasks

daniellepintz mentioned this issue Nov 22, 2021

Move code from TrainerOptimizersMixin into TrainingTypePlugin #10681

Closed

awaelchli mentioned this issue Dec 14, 2021

Fix _should_reload_dl_epoch causing inconsistent validation dataloader reloading #11036

Merged

12 tasks

ananthsub mentioned this issue Dec 17, 2021

Remove LoggerConnector.on_trainer_init #11121

Closed

9 tasks

carmocca mentioned this issue Feb 1, 2022

Deprecate the connector + trainer dependency inside the Trainer class #7493

Closed

carmocca added the "let's do it!" (approved to implement) and "trainer: connector" labels and removed the "design" (Includes a design discussion) label Feb 1, 2022

carmocca added this to the 1.6 milestone Feb 1, 2022

awaelchli mentioned this issue Feb 8, 2022

Create loggers property for Trainer and LightningModule #11683

Merged

12 tasks

daniellepintz mentioned this issue Feb 9, 2022

[RFC] Logger_connector should own the logger, and should not have a Trainer reference #11816

Open

carmocca modified the milestones: 1.6, future Feb 14, 2022

awaelchli closed this as completed Mar 15, 2023

awaelchli removed this from the future milestone Mar 15, 2023
