RuntimeError: CUDA error: invalid device function (hard_voxelize_gpu at mmdet3d/ops/voxel/src/voxelization_cuda.cu:232) #188

Closed
hughlee815 opened this issue Nov 2, 2020 · 1 comment
Labels
installation/environment Installation and environment issues

Comments

@hughlee815

Describe the bug
Running tools/test.py with the PointPillars KITTI config crashes during voxelization:

RuntimeError: CUDA error: invalid device function (hard_voxelize_gpu at mmdet3d/ops/voxel/src/voxelization_cuda.cu:232)

The process then exits with "Segmentation fault (core dumped)". The full traceback is in the Error traceback section below.

Reproduction

  1. What command or script did you run?
    python tools/test.py /home/nio/pointdet/mmdetection3d/configs/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py /home/nio/Desktop/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class_20200620_230421-aa0f3adb.pth --show

  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    No modifications.

  3. What dataset did you use?
    KITTI.

Environment

  1. Please run python mmdet3d/utils/collect_env.py to collect necessary environment information and paste it here.
    sys.platform: linux
    Python: 3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]
    CUDA available: True
    GPU 0: GeForce GTX 1080 Ti
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 10.0, V10.0.130
    GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
    PyTorch: 1.4.0
    PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

    TorchVision: 0.5.0
    OpenCV: 4.4.0
    MMCV: 1.1.6
    MMCV Compiler: GCC 7.3
    MMCV CUDA Compiler: 10.0
    MMDetection: 2.6.0
    MMDetection3D: 0.7.0+37ce187

  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
      conda
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

File "tools/test.py", line 152, in <module>
    main()
  File "tools/test.py", line 130, in main
    outputs = single_gpu_test(model, data_loader, args.show, args.show_dir)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/apis/test.py", line 29, in single_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 42, in forward
    return super().forward(*inputs, **kwargs)
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
    return old_func(*args, **kwargs)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/models/detectors/base.py", line 61, in forward
    return self.forward_test(**kwargs)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/models/detectors/base.py", line 42, in forward_test
    return self.simple_test(points[0], img_metas[0], img[0], **kwargs)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/models/detectors/voxelnet.py", line 98, in simple_test
    x = self.extract_feat(points, img_metas)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/models/detectors/voxelnet.py", line 40, in extract_feat
    voxels, num_points, coors = self.voxelize(points)
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 164, in new_func
    return old_func(*args, **kwargs)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/models/detectors/voxelnet.py", line 55, in voxelize
    res_voxels, res_coors, res_num_points = self.voxel_layer(res)
  File "/home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/ops/voxel/voxelize.py", line 113, in forward
    self.max_num_points, max_voxels)
  File "/home/nio/pointdet/mmdetection3d/mmdet3d/ops/voxel/voxelize.py", line 53, in forward
    coors_range, max_points, max_voxels, 3)
RuntimeError: CUDA error: invalid device function (hard_voxelize_gpu at mmdet3d/ops/voxel/src/voxelization_cuda.cu:232)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7fe0603f8627 in /home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: voxelization::hard_voxelize_gpu(at::Tensor const&, at::Tensor&, at::Tensor&, at::Tensor&, std::vector<float, std::allocator<float> >, std::vector<float, std::allocator<float> >, int, int, int) + 0x819 (0x7fdffdac175b in /home/nio/pointdet/mmdetection3d/mmdet3d/ops/voxel/voxel_layer.cpython-37m-x86_64-linux-gnu.so)
frame #2: voxelization::hard_voxelize(at::Tensor const&, at::Tensor&, at::Tensor&, at::Tensor&, std::vector<float, std::allocator<float> >, std::vector<float, std::allocator<float> >, int, int, int) + 0x117 (0x7fdffda7fa77 in /home/nio/pointdet/mmdetection3d/mmdet3d/ops/voxel/voxel_layer.cpython-37m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x4371c (0x7fdffda8a71c in /home/nio/pointdet/mmdetection3d/mmdet3d/ops/voxel/voxel_layer.cpython-37m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x4396e (0x7fdffda8a96e in /home/nio/pointdet/mmdetection3d/mmdet3d/ops/voxel/voxel_layer.cpython-37m-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x3f651 (0x7fdffda86651 in /home/nio/pointdet/mmdetection3d/mmdet3d/ops/voxel/voxel_layer.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
frame #11: THPFunction_apply(_object*, _object*) + 0xa0f (0x7fe0929cda3f in /home/nio/anaconda3/envs/mm3d/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

Segmentation fault (core dumped)

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

@ZwwWayne
Collaborator

ZwwWayne commented Nov 5, 2020

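Please see the troubleshooting page here. An "invalid device function" error usually means that there is an incompatibility in your environment, for example CUDA ops that were compiled with a different CUDA version, or for a different GPU architecture, than the one they run on.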
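
As a rough illustration of the kind of incompatibility to look for, here is a minimal sketch that scans an excerpt of the collect_env output posted above. It compares the local NVCC release with the CUDA runtime PyTorch was built against, and lists the GPU architectures in PyTorch's own build flags. The helper names (parse_cuda_versions, parse_sm_archs, local_gpu_capability) are hypothetical and are not part of mmdet3d or PyTorch. Note the assumptions: the architecture flags describe the PyTorch binary, not the compiled mmdet3d voxel op, and nothing in this thread confirms that the 10.0 vs 10.1 difference is the actual cause.

```python
import re

# Excerpt copied from the collect_env output in this issue.
ENV_EXCERPT = """\
GPU 0: GeForce GTX 1080 Ti
NVCC: Cuda compilation tools, release 10.0, V10.0.130
PyTorch: 1.4.0
CUDA Runtime 10.1
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
"""


def parse_cuda_versions(env_text):
    """Return (local NVCC release, PyTorch CUDA runtime) as strings, or None if absent."""
    nvcc = re.search(r"NVCC:.*?release (\d+\.\d+)", env_text)
    runtime = re.search(r"CUDA Runtime (\d+\.\d+)", env_text)
    return (nvcc.group(1) if nvcc else None,
            runtime.group(1) if runtime else None)


def parse_sm_archs(env_text):
    """Return the native sm_XX targets listed in the NVCC architecture flags."""
    return {int(m) for m in re.findall(r"code=sm_(\d+)", env_text)}


def local_gpu_capability():
    """Compute capability of GPU 0 as an int such as 61 (needs PyTorch and a visible GPU)."""
    import torch  # imported lazily so the parsing helpers work without PyTorch
    major, minor = torch.cuda.get_device_capability(0)
    return major * 10 + minor


nvcc_release, torch_runtime = parse_cuda_versions(ENV_EXCERPT)
versions_match = nvcc_release == torch_runtime
pytorch_archs = parse_sm_archs(ENV_EXCERPT)

print("local NVCC:", nvcc_release, "| PyTorch CUDA runtime:", torch_runtime)
print("versions match:", versions_match)
print("PyTorch binary sm targets:", sorted(pytorch_archs))
```

On a machine with the GPU attached, local_gpu_capability() can be compared against the targets the extension was built for; a GTX 1080 Ti has compute capability 6.1. If the check points to a mismatch, the usual next step is to rebuild the compiled ops in an environment where the versions agree.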

ZwwWayne added the installation/environment (Installation and environment issues) label on Nov 5, 2020