[Fix] fix train example #1502

Merged: 4 commits, Nov 23, 2021
4 changes: 3 additions & 1 deletion examples/train.py
@@ -42,7 +42,9 @@ def train_step(self, data, optimizer):
 if __name__ == '__main__':
     model = Model()
     if torch.cuda.is_available():
-        model = MMDataParallel(model.cuda())
+        # only use gpu:0 to train
+        # Solved issue https://github.com/open-mmlab/mmcv/issues/1470
+        model = MMDataParallel(model.cuda(), device_ids=[0])
 
     # dataset and dataloader
     transform = transforms.Compose([
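For anyone adapting the fix outside this example, the sketch below (not part of the PR) shows the same single-GPU wrapping; ToyModel and its layer sizes are placeholder assumptions, while the MMDataParallel call and the CUDA_VISIBLE_DEVICES alternative come from the change above and the docstring warning below.

import torch
import torch.nn as nn
from mmcv.parallel import MMDataParallel

class ToyModel(nn.Module):
    # Placeholder standing in for the example's Model class.
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = ToyModel()
if torch.cuda.is_available():
    # Restrict the wrapper to GPU 0, mirroring the fixed example.
    # Exporting CUDA_VISIBLE_DEVICES=0 before launching has the same effect.
    model = MMDataParallel(model.cuda(), device_ids=[0])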
10 changes: 9 additions & 1 deletion mmcv/parallel/data_parallel.py
@@ -15,6 +15,14 @@ class MMDataParallel(DataParallel):
       flexible control of input data during both GPU and CPU inference.
     - It implement two more APIs ``train_step()`` and ``val_step()``.
 
+    .. warning::
+        MMDataParallel only supports single GPU training, if you need to
+        train with multiple GPUs, please use MMDistributedDataParallel
+        instead. If you have multiple GPUs and you just want to use
+        MMDataParallel, you can set the environment variable
+        ``CUDA_VISIBLE_DEVICES=0`` or instantiate ``MMDataParallel`` with
+        ``device_ids=[0]``.
+
     Args:
         module (:class:`nn.Module`): Module to be encapsulated.
         device_ids (list[int]): Device IDS of modules to be scattered to.
@@ -54,7 +62,7 @@ def train_step(self, *inputs, **kwargs):
         assert len(self.device_ids) == 1, \
             ('MMDataParallel only supports single GPU training, if you need to'
              ' train with multiple GPUs, please use MMDistributedDataParallel'
-             'instead.')
+             ' instead.')
 
         for t in chain(self.module.parameters(), self.module.buffers()):
             if t.device != self.src_device_obj:
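The assertion whose message is corrected here fires before any scattering happens, so the failure mode is easy to check on a multi-GPU machine; a hedged sketch, assuming at least two visible GPUs and using a plain nn.Linear as a stand-in module:

import torch
import torch.nn as nn
from mmcv.parallel import MMDataParallel

# Assumes at least two visible GPUs; nn.Linear is only a stand-in module.
wrapper = MMDataParallel(nn.Linear(4, 2).cuda(), device_ids=[0, 1])

try:
    # train_step() checks len(device_ids) == 1 before scattering inputs,
    # so this raises AssertionError with the corrected '... instead.' text.
    wrapper.train_step(torch.randn(2, 4))
except AssertionError as err:
    print(err)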