new way of passing dataloaders #759
Conversation
pytorch_lightning/trainer/trainer.py
Outdated
else:
    logging.info('Model has predefined val_dataloader, '
                 'will skip the val_dataloader passed to fit method ')
else:
you may merge this with the first `if`, so: `elif val_dataloader:`
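For context, a minimal before/after sketch of the restructuring being suggested; the full surrounding code is not shown in this hunk, so the outer condition and the branch bodies below are assumptions:

# Before (assumed structure): a separate check nested inside the outer else
if isinstance(val_dataloader, torch.utils.data.DataLoader):
    ...  # attach the passed loader to the model
else:
    if val_dataloader:
        raise ValueError('val_dataloader must be a torch.utils.data.DataLoader')

# After the suggested merge: fold the nested if into an elif on the outer check
if isinstance(val_dataloader, torch.utils.data.DataLoader):
    ...  # attach the passed loader to the model
elif val_dataloader:
    raise ValueError('val_dataloader must be a torch.utils.data.DataLoader')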
pytorch_lightning/trainer/trainer.py
Outdated
""" | ||
if train_dataloader: |
I would make it a function since the three are almost the same:
def _set_dataloader(dataloader, attribute):
    # `model` is the LightningModule already in scope inside fit()
    if isinstance(dataloader, torch.utils.data.DataLoader):
        if getattr(model, attribute) is None:
            # wrap in a lambda so the attribute stays callable, like the usual hook
            setattr(model, attribute, lambda: dataloader)
        else:
            logging.info(f'Model has predefined `{attribute}`, '
                         f'will skip the `{attribute}` passed to fit method.')
    elif dataloader:
        raise ValueError(f'`{attribute}` needs to be an instance of `torch.utils.data.DataLoader`')

_set_dataloader(train_dataloader, 'train_dataloader')
...
@Borda, took your changes into consideration and made it all into a function. In addition I added:
@SkafteNicki this has been brought up before. Can you tell me again why we should enable this? I do agree that it might be easier to allow the model to serve arbitrary data. BUT this will come at the expense of breaking the LightningTemplate and having data + transforms defined in unconventional ways that don't lend themselves to reproducibility.
As I initially wrote, I think the current dataloader setup is not intuitive for newcomers. It is easy to learn, but it is just different from other common frameworks. My personal use case is where I have two models that need to be trained on different training sets but evaluated on the same val/test set. For me it makes sense to then let the training dataloaders be defined as part of the model, but pass the val/test dataloaders to the trainer. Lastly, I do not think there is anything wrong with having this as an additional option for passing data. It should in no way replace the current way of doing it. If this means that we are giving up too much reproducibility, then maybe it is not worth it.
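For illustration, a rough sketch of that workflow under the interface proposed in this PR; `ModelA`/`ModelB` stand in for hypothetical LightningModules that each define their own `train_dataloader`, and the dummy data below is only for the example:

import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer

# Hypothetical shared evaluation data, reused by both models
val_set = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
shared_val_loader = DataLoader(val_set, batch_size=16)

# ModelA and ModelB (not shown) each define train_dataloader internally,
# so only the shared validation loader is handed to fit()
Trainer().fit(ModelA(), val_dataloader=shared_val_loader)
Trainer().fit(ModelB(), val_dataloader=shared_val_loader)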
@SkafteNicki makes sense! let's do it.
@SkafteNicki we also need a test for this. Add it under the tests for trainer. Awesome addition!
@SkafteNicki mind finishing this over the next day or so? that way we can add it to the next release. awesome feature!
@williamFalcon I will try to look at this tomorrow, will hopefully get it all to work and implement the appropriate tests. Will let you know as soon as I am done.
ideally next week (we were on a cycle of every 6th), but we got behind in December
@SkafteNicki love this feature. can we please get this merged?
@SkafteNicki can you add a test and documentation for this please?
@williamFalcon test and documentation should be added now. The failing CI testing seems to be a Windows-specific problem not related to this PR. Two notes on changes since last time:
I really like this addition...
I am not very sure about some blocks, so could you please double-check them?
Hello @SkafteNicki! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-02-19 09:51:31 UTC
LGTM 🚀
@SkafteNicki add change to CHANGELOG.md
@Borda I have updated the code and CHANGELOG per your request
@SkafteNicki great job, Thx!
Before submitting
What does this PR do?
This is an initial solution to issue #662. It allows people to pass a `train_dataloader`, `val_dataloader` and `test_dataloader` to the `.fit()` method of `Trainer`. As mentioned in the issue, this is a more familiar interface for people coming from `keras` and `scikit-learn`.

What needs to be discussed is probably the case where a dataloader is defined both in the model and passed to `fit`. Right now the `fit` method will only use the dataloader defined in the model, and prompt the user that it is skipping the dataloader passed to `fit`. For `train_dataloader` there needs to be some kind of tiebreaker; however, since both `val_dataloader` and `test_dataloader` can be a list of multiple dataloaders, an option here would be to concatenate the dataloaders defined in the model with those passed to `fit`.
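A hedged usage sketch of the interface described above; `MyLightningModule` is a hypothetical module that defines no dataloader hooks of its own, and the dummy tensors are illustrative only:

import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer

# Illustrative dummy data; any DataLoader works the same way
train_loader = DataLoader(
    TensorDataset(torch.randn(256, 10), torch.randint(0, 2, (256,))), batch_size=32)
val_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))), batch_size=32)
test_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))), batch_size=32)

model = MyLightningModule()  # hypothetical module with no dataloader hooks defined
trainer = Trainer()

# Keras/scikit-learn style: data is supplied at fit time instead of on the model
trainer.fit(model,
            train_dataloader=train_loader,
            val_dataloader=val_loader,
            test_dataloader=test_loader)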
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
yes!