In fact, MMEngine automatically moves your model to the GPU, i.e. model = model.to(get_device()) in runner.py. Please double-check your custom implementation, for example whether it wrongly stores submodules in a plain Python list instead of nn.ModuleList.
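To illustrate the list vs. nn.ModuleList point, here is a minimal sketch in plain PyTorch. BrokenBackbone and FixedBackbone are hypothetical classes made up for this example, not the backbone from the issue. Layers kept in a plain Python list are not registered as submodules, so .to() and .cuda() skip them and their weights stay on the CPU.

```python
import torch
import torch.nn as nn


class BrokenBackbone(nn.Module):
    """Hypothetical backbone that keeps its layers in a plain Python list."""

    def __init__(self):
        super().__init__()
        # A plain list is invisible to nn.Module: these convs are never
        # registered, so .to(), .cuda() and .parameters() do not see them.
        self.layers = [nn.Conv2d(3, 8, 3), nn.Conv2d(8, 8, 3)]


class FixedBackbone(nn.Module):
    """Same layers, but registered through nn.ModuleList."""

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Conv2d(3, 8, 3), nn.Conv2d(8, 8, 3)])


# Falls back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
broken = BrokenBackbone().to(device)
fixed = FixedBackbone().to(device)

# Number of parameter tensors each model knows about.
print(len(list(broken.parameters())))  # 0: nothing was registered
print(len(list(fixed.parameters())))   # 4: weight and bias of two convs

# The unregistered convs were not moved by .to(device).
print(broken.layers[0].weight.device)
print(fixed.layers[0].weight.device)
```

On a CUDA machine, feeding a GPU tensor through broken.layers[0] should raise the same kind of input/weight type mismatch as in the report, because that convolution's weight is still a CPU tensor.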
Prerequisite
Task
I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.
Branch
master branch https://github.com/open-mmlab/mmdetection
Environment
sys.platform: linux
Python: 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3060
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.0, V12.0.140
GCC: x86_64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
PyTorch: 1.13.1+cu117
PyTorch compiling details: PyTorch built with:
TorchVision: 0.14.1+cu117
OpenCV: 4.7.0
MMCV: 1.7.1
MMCV Compiler: GCC 11.3
MMCV CUDA Compiler: not available
MMDetection: 2.28.1+
Reproduces the problem - code sample
RuntimeError Traceback (most recent call last)
Cell In[30], line 9
5 model = build_detector( cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
7 datasets = [build_dataset(cfg.data.train)]
----> 9 train_detector(model, datasets, cfg, distributed=False, validate=True)
File /usr/local/lib/python3.10/dist-packages/mmdet/apis/train.py:246, in train_detector(model, dataset, cfg, distributed, validate, timestamp, meta)
244 elif cfg.load_from:
245 runner.load_checkpoint(cfg.load_from)
--> 246 runner.run(data_loaders, cfg.workflow)
File ~/.local/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py:136, in EpochBasedRunner.run(self, data_loaders, workflow, max_epochs, **kwargs)
134 if mode == 'train' and self.epoch >= self._max_epochs:
135 break
--> 136 epoch_runner(data_loaders[i], **kwargs)
138 time.sleep(1) # wait for some hooks like loggers to finish
139 self.call_hook('after_run')
File ~/.local/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py:53, in EpochBasedRunner.train(self, data_loader, **kwargs)
51 self._inner_iter = i
52 self.call_hook('before_train_iter')
---> 53 self.run_iter(data_batch, train_mode=True, **kwargs)
54 self.call_hook('after_train_iter')
55 del self.data_batch
...
458 _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
460 self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
Reproduces the problem - command or script
model = build_detector( cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
datasets = [build_dataset(cfg.data.train)]
train_detector(model, datasets, cfg, distributed=False, validate=True)
Reproduces the problem - error message
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
Additional information
Hello, I am using the SSD300 model with a modified backbone on the COCO dataset.
In the config I set the device to CUDA:
cfg.gpu_ids = range(1)
cfg.device = 'cuda'
After that I build the detector and start training:
model = build_detector( cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
datasets = [build_dataset(cfg.data.train)]
train_detector(model, datasets, cfg, distributed=False, validate=True)
As soon as training starts, I get this error:
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
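One way to narrow down an error like this is to look for layers in the custom backbone that were never registered with PyTorch. The sketch below is only a diagnostic idea under that assumption. find_unregistered_modules and ToyBackbone are hypothetical names made up for this example and are not part of MMDetection or MMCV. The helper walks every registered submodule and reports attributes that hold nn.Module objects inside a plain list, tuple, or dict.

```python
import torch.nn as nn


def find_unregistered_modules(model):
    """Return (owner, attribute) pairs whose value is a plain list, tuple or
    dict that contains nn.Module objects (hypothetical helper)."""
    hits = []
    for name, sub in model.named_modules():
        for attr, value in vars(sub).items():
            if attr == "_modules":
                continue  # PyTorch's own registry of submodules
            if isinstance(value, dict):
                items = list(value.values())
            elif isinstance(value, (list, tuple)):
                items = list(value)
            else:
                continue
            if any(isinstance(item, nn.Module) for item in items):
                hits.append((name or "<root>", attr))
    return hits


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a modified backbone."""

    def __init__(self, use_module_list):
        super().__init__()
        convs = [nn.Conv2d(3, 8, 3), nn.Conv2d(8, 8, 3)]
        # With use_module_list=False the convs end up in a plain list.
        self.extra = nn.ModuleList(convs) if use_module_list else convs


print(find_unregistered_modules(ToyBackbone(use_module_list=False)))
print(find_unregistered_modules(ToyBackbone(use_module_list=True)))
```

Any attribute the helper reports is a candidate for conversion to nn.ModuleList (or nn.ModuleDict for dicts), so that moving the model to the GPU also moves those layers.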