Runtime Error: invalid device function, tsdf_volume.cu:76 (Kinfu bug on windows) #1113
Comments
I posted an answer on the other thread, but I am not sure it was the right place, so I am copying it here; feel free to tell me if one of them should be removed. Hi, in case it helps: after a long search I managed to run the kinfu application with CUDA 6.5. |
Thanks for your reply! But I don't think this fixes the underlying problem. Removing /GL explains why the debug build works and the release build doesn't (by default, /GL is a release only flag), but it doesn't explain the substantially reduced frame rate on maxwell GPUs. To reiterate... I can build the code for non-maxwell under CUDA 6.5 & MSVC 12 and have it work fine, but I can't build for maxwell under CUDA 6.5 & MSVC 12 in a way that the performance is comparable to building under CUDA 5.0 & MSVC 10 using the aforementioned flags. |
Hey guys! I'm quite new to CUDA issues, but I can report that I get the same cudaLastError message on Ubuntu 14.04, 64-bit, with a quite old GPU (GTX275, 1.3). I'm building the gpu/cuda modules for use on an Android device with a Tegra K1 (Kepler, 3.2; NVCC: -arch=sm_32). I tried CUDA 6.0/6.5 with GCC 4.6/4.8: the build is successful, but when I run the app I get the same error. I looked into ../cuda_compile.dir/cuda/cuda_compile_generated_tsdf_volume.cu.o.cmake and there is no GL flag in my case: set(CMAKE_HOST_FLAGS -fexceptions -frtti -Wno-psabi --sysroot=/home/con/NDK/android-ndk-r8b/platforms/android-8/arch-arm -fpic -funwind-tables -finline-limit=64 -fsigned-char -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -fdata-sections -ffunction-sections -Wa,--noexecstack -Wall -Wextra -Wabi -Wno-unknown-pragmas -Wconversion -Wold-style-cast -fno-strict-aliasing -Wno-format-extra-args -Wno-sign-compare ) The only common thing I see is that I used 6.5 as well. I'll give you an update. |
conconconconconconcon, your GTX275 is pre-Fermi, so it makes sense that kinfu would not run with that card. (The minimum arch pcl compiles for is 2.0, as per the settings in my initial comment above.) Technically the "invalid device function" error is supposed to occur when you have an architecture mismatch. I can't comment about tegra cross-compilation as I'm not knowledgeable about it, sorry. |
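The architecture-mismatch rule mentioned above can be sketched in plain C++. This is only a model of NVIDIA's documented compatibility rules (a cubin runs on devices of the same major revision with an equal or higher minor revision; PTX can be JIT-compiled for any equal or newer compute capability). The names `Capability`, `cubinRuns`, `ptxRuns`, and `deviceCovered` are hypothetical and not part of PCL or CUDA. It does not explain the failure reported in this issue, where 5.0 was present in both CUDA_ARCH_BIN and CUDA_ARCH_PTX.

```cpp
#include <cassert>
#include <vector>

// Compute capability as (major, minor), e.g. {5, 0} for first-generation Maxwell.
// Hypothetical type for illustration; not a PCL or CUDA type.
struct Capability {
    int major_version;
    int minor_version;
};

// A cubin built for sm_XY runs on devices with the same major revision
// and a minor revision >= Y.
bool cubinRuns(Capability built, Capability device) {
    assert(built.major_version > 0 && device.major_version > 0);
    return built.major_version == device.major_version &&
           built.minor_version <= device.minor_version;
}

// PTX built for compute_XY can be JIT-compiled by the driver for any
// device whose compute capability is >= X.Y.
bool ptxRuns(Capability built, Capability device) {
    assert(built.major_version > 0 && device.major_version > 0);
    if (built.major_version != device.major_version) {
        return built.major_version < device.major_version;
    }
    return built.minor_version <= device.minor_version;
}

// True if at least one embedded cubin or PTX image is usable on the device.
// If this returns false, a kernel launch is expected to fail with
// "invalid device function".
bool deviceCovered(const std::vector<Capability>& cubins,
                   const std::vector<Capability>& ptx,
                   Capability device) {
    for (const Capability& c : cubins) {
        if (cubinRuns(c, device)) return true;
    }
    for (const Capability& p : ptx) {
        if (ptxRuns(p, device)) return true;
    }
    return false;
}
```

Under these rules, a build with binaries for 2.0, 2.1, and 3.0 plus PTX 3.0 does not cover a GTX275 (1.3), but does cover a Maxwell card (5.0) through the PTX path, and a Tegra K1 (3.2) through either the 3.0 cubin or the PTX.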
You're right, I didn't give a proper description of what I'm doing: I'm just building on the machine with the GTX275 (pre-Fermi) and running on the tablet with Kepler, so my statement about Linux is quite irrelevant. Anything new on the cause of the problem? Are you sure the last CUDA error is caused by the initVolume kernel? I would love to try your workaround settings. |
You could try using the cmake settings CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0, CUDA_ARCH_PTX - 3.0 with your current version of CUDA and see if that works.

There are a few things to note about these settings. If I am not mistaken, the "2.0" would translate into "-gencode=arch=compute_20,code=sm_20", and the "2.1(2.0)" would translate into "-gencode=arch=compute_20,code=sm_21", with the idea being that we're compiling 2.0 code for the 2.1 architecture. Meanwhile the PTX setting translates into "-gencode=arch=compute_30,code=compute_30", which is designed to "future-proof" your build. Note that by default in PCL, the PTX setting is blank (and probably shouldn't be, as per below).

From http://docs.nvidia.com/cuda/maxwell-compatibility-guide/#axzz3TFvYAV8o: "Note that compute_XX refers to a PTX version and sm_XX refers to a cubin version. The arch= clause of the -gencode= command-line option to nvcc specifies the front-end compilation target and must always be a PTX version. The code= clause specifies the back-end compilation target and can either be cubin or PTX or both. Only the back-end target version(s) specified by the code= clause will be retained in the resulting binary; at least one should be PTX to provide compatibility with future architectures."

I notice that the default compilation settings don't include 3.2. You can either try including it in the CUDA_ARCH_BIN field as "3.2(3.0)" or make edits to the source similar to what was done for issue #880. However, theoretically 3.2 should be able to run PTX version 3.0.

As to my issue, I have tried compiling both on the machine with the GTX 570 and on the machine with the Maxwell GTX 860M, with the same result, so I don't think the GPU in the build machine is the issue, for me at least. I have also tried the settings "CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0, CUDA_ARCH_PTX - 3.0" under CUDA 6.5 and got the same result. But maybe results are different for you, conconconconconconcon.
If you're able to get this to work, then that further makes me suspect a windows-only bug either in CUDA or PCL. One further dumb suggestion: are you using the Tegra Android Development Pack? See https://developer.nvidia.com/tegra-android-development-pack In any case, good luck! This issue tracker probably isn't supposed to be used as a support forum, so I've probably overstepped a bit. |
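The CUDA_ARCH_BIN / CUDA_ARCH_PTX translation discussed above can be sketched as a small C++ helper. This is a minimal sketch, not PCL's actual CMake logic: `gencodeFlags` and `stripDot` are hypothetical names, and the sketch assumes that a parenthesised entry "X(Y)" means "build an sm_X binary from compute_Y PTX", which keeps the arch= clause a PTX version as the NVIDIA guide requires.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Turn a version token such as "2.1" into "21".
static std::string stripDot(const std::string& version) {
    assert(!version.empty() && "version token must not be empty");
    std::string out;
    for (char c : version) {
        if (c != '.') out += c;
    }
    return out;
}

// Hypothetical helper: expand space-separated CUDA_ARCH_BIN and
// CUDA_ARCH_PTX strings into nvcc -gencode flags.
// Assumption: "X(Y)" means "sm_X binary generated from compute_Y PTX".
std::vector<std::string> gencodeFlags(const std::string& archBin,
                                      const std::string& archPtx) {
    static const std::regex binWithPtx(R"(([0-9.]+)\(([0-9.]+)\))");
    std::vector<std::string> flags;
    std::string token;

    // Binary (cubin) targets: code=sm_XX.
    std::istringstream bins(archBin);
    while (bins >> token) {
        std::smatch m;
        if (std::regex_match(token, m, binWithPtx)) {
            flags.push_back("-gencode=arch=compute_" + stripDot(m[2].str()) +
                            ",code=sm_" + stripDot(m[1].str()));
        } else {
            const std::string v = stripDot(token);
            flags.push_back("-gencode=arch=compute_" + v + ",code=sm_" + v);
        }
    }

    // PTX targets: code=compute_XX, kept in the binary for JIT on newer GPUs.
    std::istringstream ptxs(archPtx);
    while (ptxs >> token) {
        const std::string v = stripDot(token);
        flags.push_back("-gencode=arch=compute_" + v + ",code=compute_" + v);
    }
    return flags;
}
```

Calling `gencodeFlags("2.0 2.1(2.0) 3.0", "3.0")` yields three binary targets followed by one PTX target; the PTX entry is what allows the driver to JIT-compile for architectures newer than any listed binary.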
cudaSafeCall has been removed in CUDA >= 5.0. Maxwell is compute 5.0, so it isn't there. |
This bug is still reproducible. @SteveSmithStyku, have you found a solution? |
my environment is: GT705, cuda7.5, vs2013, |
So far I've found a workaround: since the problem is caused by the '/GL' optimization flag in release mode, I unchecked the |
I have a GTX 1070 with CUDA 7.5 and am experiencing the same issue. I can't even get the octree example to run without hitting the invalid device function error. How exactly does one fix this? I did exactly as above, disabling the CMAKE_MSVC_CODE_LINK_OPTIMIZATION |
Marking this as stale due to 30 days of inactivity. It will be closed in 7 days if no further activity occurs. |
Same bug. And don't know how to solve this. |
Pcl is dead |
Kinfu sub-module is deprecated due to lack of a maintainer familiar with kinfu. GPU and CUDA modules will be receiving some update in GSoC, but the lack of tests on CI is hampering efforts to increase support there. Community help is welcome in these 2 modules (namely cuda and gpu). |
Marking this as stale due to 30 days of inactivity. It will be closed in 7 days if no further activity occurs. |
Closing this as it should be fixed by #4197. |
This is a follow-up to an issue reported in the comments of issue #880. Kinfu continues to fail on Maxwell GPUs when compiled on Windows with the latest MSVC and CUDA. I strongly suspect that this is a Windows-only issue, but maybe Linux users can confirm that for me?
In detail, this is what I used to compile pcl_gpu_containers and pcl_gpu_kinfu:
Microsoft Visual Studio 2013
CUDA 6.5.14
Boost 1.56
Eigen 3.0.5
FLANN 1.7.1
VTK 6.1.0
The only non-default settings in cmake are the following:
CUDA_ARCH_BIN: 2.0 2.1(2.0) 3.0 3.5 5.0
CUDA_ARCH_PTX: 5.0
Compilation succeeds, and Kinfu does work properly on non-maxwell GPUs (tested on GTX 570). However, Kinfu fails to work properly on two different Maxwell GPUs, the GTX 750Ti, and the Maxwell version of the GTX 860M. In debug mode, it does run, but at a significantly lower framerate than a non-maxwell GPU. In release mode, it encounters a runtime error and exits the application:
Error: invalid device function, tsdf_volume.cu:76
The function that fails is pcl::device::initVolume, and I suspect it fails on this line: initializeVolume<<<grid, block>>>(volume);
My current workaround is to compile Kinfu with Microsoft Visual Studio 2010 and CUDA 5.0.35 with the following cmake settings:
CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0
CUDA_ARCH_PTX - 3.0
I hope someone has an idea of how to resolve this issue! It's been persisting ever since the Maxwell GPUs first came out. Thanks everyone.