Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same #9725

Open
3 tasks done
danilovabg opened this issue Feb 4, 2023 · 1 comment
Assignees

Comments

@danilovabg
Copy link

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmdetection

Environment

sys.platform: linux
Python: 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3060
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.0, V12.0.140
GCC: x86_64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
PyTorch: 1.13.1+cu117
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.7
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.5
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.14.1+cu117
OpenCV: 4.7.0
MMCV: 1.7.1
MMCV Compiler: GCC 11.3
MMCV CUDA Compiler: not available
MMDetection: 2.28.1+

Reproduces the problem - code sample

RuntimeError Traceback (most recent call last)
Cell In[30], line 9
5 model = build_detector( cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
7 datasets = [build_dataset(cfg.data.train)]
----> 9 train_detector(model, datasets, cfg, distributed=False, validate=True)

File /usr/local/lib/python3.10/dist-packages/mmdet/apis/train.py:246, in train_detector(model, dataset, cfg, distributed, validate, timestamp, meta)
244 elif cfg.load_from:
245 runner.load_checkpoint(cfg.load_from)
--> 246 runner.run(data_loaders, cfg.workflow)

File ~/.local/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py:136, in EpochBasedRunner.run(self, data_loaders, workflow, max_epochs, **kwargs)
134 if mode == 'train' and self.epoch >= self._max_epochs:
135 break
--> 136 epoch_runner(data_loaders[i], **kwargs)
138 time.sleep(1) # wait for some hooks like loggers to finish
139 self.call_hook('after_run')

File ~/.local/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py:53, in EpochBasedRunner.train(self, data_loader, **kwargs)
51 self._inner_iter = i
52 self.call_hook('before_train_iter')
---> 53 self.run_iter(data_batch, train_mode=True, **kwargs)
54 self.call_hook('after_train_iter')
55 del self.data_batch
...
458 _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
460 self.padding, self.dilation, self.groups)

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Reproduces the problem - command or script

model = build_detector( cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
datasets = [build_dataset(cfg.data.train)]
train_detector(model, datasets, cfg, distributed=False, validate=True)

Reproduces the problem - error message

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Additional information

hello I use ssd300 model with modified backbone, dataset coco
at config I marked that device is cuda
cfg.gpu_ids = range(1)
cfg.device = 'cuda'

after that I use build_detector

model = build_detector( cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
datasets = [build_dataset(cfg.data.train)]
train_detector(model, datasets, cfg, distributed=False, validate=True)

at at start train get error
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

@JamesYang568
Copy link

In fact, MMEngine can automatically change your module into cuda mode i.e. model = model.to(get_device()) in runner.py. Please double check your custom implementation like wrongly using list instead of modulelist for injection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants