Unable to access dataloaders within configure_optimizers
#10430
@rohitgr7 Good observation. Yes, this was indeed a somewhat breaking change and it was not fully intentional. However, one could also argue this was a bug, or at least undefined behavior. Previously, if we called train_dataloader(), e.g., through the trainer, it would still have led to the Trainer calling it a second time later on when it got reset, if I recall correctly.

For the purpose of documentation, here is the sequence of hooks we execute around … With this issue request, either the …

What would make this configuration more favorable than the current one? Isn't it equally valid to request having the optimizers available in the dataloader methods? If both need to be the case, then this is hitting the fundamental limits of the hook system in Lightning. A similar issue was faced in #8485. All I'm saying is, if the hook order changes, then we need to be 100% sure that the order we choose is more favorable than the other in most cases, and we need to be aware of the limitations.
Hi @rohitgr7 @awaelchli! I'm working on integrating Lightning with Opacus (it enables training with differential privacy). This pull request demonstrates how a PL user can add privacy by just wrapping the optimizer in … The problem is, we need to access the training data to configure noise generation and privacy accounting properly. We can pass some stats about the training data as model parameters, but this can easily lead to inconsistent optimizer configuration and, as a result, incorrect privacy accounting. My current solution touches configure_optimizers:

```python
def configure_optimizers(self):
    optimizer = optim.SGD(self.parameters(), lr=self.lr, momentum=0)
    if self.enable_dp:
        # dirty introspection of the trainer instance to get the training data
        data_loader = (
            self.trainer._data_connector._train_dataloader_source.dataloader()
        )
        # transform (model, optimizer, dataloader) to DP versions
        model, optimizer, dataloader = self.privacy_engine.make_private(
            self,
            optimizer,
            data_loader,
            ...
        )
    return optimizer
```

cc @ananthsub (to our discussion about using TrainingTypePlugin: I think this way of enabling DP by modifying configure_optimizers can be useful for advanced users and DP researchers; I would prefer to have both on the table)
I think moving configure_optimizers later would be better, since moving the dataloader initialization call outside the FitLoop is inconvenient, plus we only initialize optimizers during .fit.
Do we have any use-case/example for it? @awaelchli
@rohitgr7 Would you like to try to invert the order in a draft? Failing tests should then reveal potential challenges we have not considered yet, if there are any.
Yep! We do have some failing tests.
Any update?
Hey @andreimargeloiu! The current workaround is that you can initialize the dataloader before it's initialized by Lightning using:

```python
def configure_optimizers(self):
    self.trainer.reset_train_dataloader()
    self.trainer.train_dataloader.loaders  # access the train dataloader here
```
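For illustration, a minimal sketch of how this workaround could be used for the scheduler-sizing use case mentioned in this issue; it assumes PL 1.x, a sized map-style train dataloader, and a self.lr hyperparameter on the module, and the optimizer/scheduler choices are illustrative, not from the original comment:

```python
import torch

# Sketch only: assumes PL 1.x (where Trainer.reset_train_dataloader exists),
# a sized train dataloader, and an lr hyperparameter stored on the module.
def configure_optimizers(self):
    optimizer = torch.optim.AdamW(self.parameters(), lr=self.lr)

    # Build the train dataloader early so its length is known here.
    self.trainer.reset_train_dataloader()
    train_loader = self.trainer.train_dataloader.loaders

    total_steps = len(train_loader) * self.trainer.max_epochs
    scheduler = torch.optim.lr_scheduler.LinearLR(
        optimizer, start_factor=0.1, total_iters=total_steps
    )
    return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```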
A bit confused here. So in the current PyTorch Lightning version, if I want to access the train dataloader, for example, in my model's forward method, I do … Looks like if I do this in the model's … Would be happy to see if there is a more elegant way of doing this.
The above workaround by @rohitgr7 does not seem to work anymore in … Is there another workaround or any plans to fix this? I think it is quite common to access the dataset in …
In 2.0, you can do:

```python
def configure_optimizers(self):
    self.trainer.fit_loop.setup_data()
    dataloader = self.trainer.train_dataloader
    ...
```

You could also call …
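To make that concrete, here is a minimal sketch of the 2.0 pattern applied to the scheduler-sizing use case from this issue; the SGD/OneCycleLR choices and learning rates are illustrative assumptions:

```python
import torch

# Sketch only: assumes PL >= 2.0 and a single, sized training dataloader.
def configure_optimizers(self):
    optimizer = torch.optim.SGD(self.parameters(), lr=0.01, momentum=0.9)

    # Ask the fit loop to build the dataloaders now instead of waiting for training to start.
    self.trainer.fit_loop.setup_data()
    steps_per_epoch = len(self.trainer.train_dataloader)

    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer,
        max_lr=0.1,
        epochs=self.trainer.max_epochs,
        steps_per_epoch=steps_per_epoch,
    )
    return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```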
Closing as I don't think there's anything actionable. Feel free to post further questions, though.
Proposed refactoring or deprecation
Before v1.5, the dataloader hooks were patched onto the model and were easily accessible within configure_optimizers to set up the total training steps for a scheduler. But now, since they are no longer patched,

https://github.com/PyTorchLightning/pytorch-lightning/blob/0ed5e3dc8abcec40aacd64cc9175590bb1409759/pytorch_lightning/trainer/connectors/data_connector.py#L213-L224

they are no longer available directly if using a datamodule or if dataloaders are passed directly to .fit. Nor can they be accessed using self.trainer.train_dataloaders, because dataloaders are loaded within the fit loop:

https://github.com/PyTorchLightning/pytorch-lightning/blob/0ed5e3dc8abcec40aacd64cc9175590bb1409759/pytorch_lightning/loops/fit_loop.py#L194-L197
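For reference, a minimal sketch of the pre-1.5 pattern described above, where the patched self.train_dataloader() could be called directly from configure_optimizers; the Adam/CosineAnnealingLR choices are illustrative assumptions:

```python
import torch

# Sketch only: relies on the pre-1.5 behavior where dataloader hooks were
# patched onto the LightningModule and callable from configure_optimizers.
def configure_optimizers(self):
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
    total_steps = len(self.train_dataloader()) * self.trainer.max_epochs
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_steps)
    return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```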
Motivation
I'd suggest these dataloaders should be available to users, no matter how they are passed during .fit.

Pitch

If possible, we should call configure_optimizers after loading the dataloaders for the first time within the fit loop. Not sure if it will bring some complications and failures, because we load the optimizers differently for DeepSpeed.

Additional context
As always, alternative suggestions/thoughts would be appreciated :)
cc: @karthikrangasai @awaelchli