Using multiple dataloaders in the training_step? #2457
Comments
Hi! Thanks for your contribution! Great first issue!
Hi @christofer-f, I've actually prototyped this feature already (in #1959). If your dataset has a length, this already works (the failing tests are due to the case where your dataset does not have a defined length).
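To make the length condition concrete, here is an illustrative sketch (my own, not code from #1959) of a dataset that defines a length next to one that does not:

import torch
from torch.utils.data import Dataset, IterableDataset

class WithLength(Dataset):          # map-style: defines __len__, so it works
    def __getitem__(self, idx):
        return torch.tensor(idx)

    def __len__(self):
        return 100

class NoLength(IterableDataset):    # the failing case: no defined length
    def __iter__(self):
        while True:
            yield torch.randn(3)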
Hi! This looks very promising. In my case, I have several processes that create datasets on the fly. I will try to apply your code to the toy example in my original post. Br,
This is exactly what I wanted!!! Great job.
Great to hear that! So the feature should almost be ready to merge to master. There is also a trainer flag that controls how to deal with datasets of different lengths.
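In later PyTorch Lightning releases this control appears to have shipped as the multiple_trainloader_mode argument of the Trainer; the sketch below assumes that released API rather than the #1959 branch, where the flag may be named differently:

import pytorch_lightning as pl

# assumption: the flag from #1959 landed as `multiple_trainloader_mode`
# "max_size_cycle": shorter loaders restart (cycle) until the longest finishes
# "min_size": the epoch ends as soon as the shortest loader is exhausted
trainer = pl.Trainer(max_epochs=1, multiple_trainloader_mode="max_size_cycle")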
I'll close this. It is a very good and useful feature. I need to tinker a bit with this before I understand how it really works...
Thank you for this wonderful work @justusschock! I installed the branch with:

pip install -e git://github.com/PyTorchLightning/pytorch-lightning.git@train_loaders#egg=pytorch-lightning_train_loaders

import os
import torch
from torch.nn import functional as F
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST, FashionMNIST
from torchvision import transforms
import pytorch_lightning as pl
class FashionMNIST_and_MNISTModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # one linear classifier per dataset
        self.l_mnist = torch.nn.Linear(28 * 28, 10)
        self.l_fashion_mnist = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return torch.relu(self.l_mnist(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx, optimizer_idx):
        # `batch` is a dict keyed by the loader names from train_dataloader()
        if optimizer_idx == 0:
            print('mnist')
            x, y = batch['mnist']
            y_hat = self(x)
            loss_mnist = F.cross_entropy(y_hat, y)
            tensorboard_logs = {'train_loss': loss_mnist}
            return {'loss': loss_mnist, 'log': tensorboard_logs}
        if optimizer_idx == 1:
            print('fashion')
            x, y = batch['fashion_mnist']
            y_hat = torch.relu(self.l_fashion_mnist(x.view(x.size(0), -1)))
            loss_fashion_mnist = F.cross_entropy(y_hat, y)
            tensorboard_logs = {'train_loss': loss_fashion_mnist}
            return {'loss': loss_fashion_mnist, 'log': tensorboard_logs}

    def configure_optimizers(self):
        # one optimizer per head, so each dataset trains its own classifier
        opt_mnist = torch.optim.Adam(self.l_mnist.parameters(), lr=0.02)
        opt_fashion_mnist = torch.optim.Adam(self.l_fashion_mnist.parameters(), lr=0.02)
        return [opt_mnist, opt_fashion_mnist], []

    def train_dataloader(self):
        loader_mnist = DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=transforms.ToTensor()), batch_size=32)
        loader_fashion_mnist = DataLoader(FashionMNIST(os.getcwd(), train=True, download=True, transform=transforms.ToTensor()), batch_size=32)
        loaders = {"mnist": loader_mnist, "fashion_mnist": loader_fashion_mnist}
        return loaders
mnist_model = FashionMNIST_and_MNISTModel()
trainer = pl.Trainer(gpus=0, fast_dev_run=False, max_epochs=1)
trainer.fit(mnist_model)

And I got the following output:

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
  | Name            | Type   | Params
-------------------------------------------
0 | l_mnist         | Linear | 7 K
1 | l_fashion_mnist | Linear | 7 K

Epoch 1: 100% 2/2 [00:00<00:00, 30.03it/s, loss=2.272, v_num=16]
mnist
fashion
mnist
fashion
It seems there are only two steps in one epoch, judging by the print messages and the 2/2 progress bar. Thanks in advance!
Hi, I think @omiita is right.
I'll close this again... the problem has been identified.
Hi!
In the pseudo-code below I have two models that I want to fit with two different datasets.
I have tried to figure out whether this is possible by reading test_dataloaders.py, with no success...
In the documentation, it states:
Multiple training dataloaders
For training, the best way to use multiple dataloaders is to create a DataLoader class which wraps both your dataloaders. (This of course also works for testing and validation dataloaders.)
But that doesn't really help me...
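For reference, a minimal version of the wrapper pattern the docs describe might look like the sketch below; ZipLoader is a made-up name, and zip() simply stops at the shorter of the two loaders:

class ZipLoader:
    """Made-up wrapper: iterates several dataloaders in lockstep and
    yields each combined batch as a dict keyed by loader name."""

    def __init__(self, **loaders):
        self.loaders = loaders

    def __iter__(self):
        # zip() stops at the shortest loader, i.e. "min_size" behaviour
        for batches in zip(*self.loaders.values()):
            yield dict(zip(self.loaders.keys(), batches))

    def __len__(self):
        return min(len(loader) for loader in self.loaders.values())

# usage: return ZipLoader(mnist=loader_mnist, fashion_mnist=loader_fashion_mnist)
# from train_dataloader(), then read batch["mnist"] / batch["fashion_mnist"]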
I guess this has already been discussed in #1089.
And that I should study: https://gist.github.com/Dref360/2524e524244569ed47428f19c487f264
But it would be nice to have a dataloader_idx, just like the optimizer_idx parameter...
Or perhaps a batch could have a dictionary-like structure where you sample data into different "baskets",
so that I could write something like the hypothetical sketch below:
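# hypothetical sketch (the original snippet was not preserved in this thread):
# batches arrive as a dict keyed by loader name, and a dataloader_idx says
# which loader produced the batch; mnist_loss/fashion_loss are made-up helpers
def training_step(self, batch, batch_idx, dataloader_idx):
    if dataloader_idx == 0:
        x, y = batch['mnist']            # sample from the first "basket"
        return {'loss': self.mnist_loss(x, y)}
    if dataloader_idx == 1:
        x, y = batch['fashion_mnist']    # sample from the second "basket"
        return {'loss': self.fashion_loss(x, y)}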
//Christofer