Runtime Error: invalid device function, tsdf_volume.cu:76 (Kinfu bug on windows) #1113

Closed
SteveSmithStyku opened this issue Jan 27, 2015 · 17 comments
Labels
needs: pr merge Specify why not closed/merged yet

Comments

@SteveSmithStyku

This is a follow-up to an issue reported in the comments of issue #880. Kinfu still fails on Maxwell GPUs when compiled on Windows with the latest MSVC and CUDA. I strongly suspect that this is a Windows-only issue, but maybe Linux users can confirm that for me?

In detail, this is what I used to compile pcl_gpu_containers and pcl_gpu_kinfu:
Microsoft Visual Studio 2013
CUDA 6.5.14
Boost 1.56
Eigen 3.0.5
FLANN 1.7.1
VTK 6.1.0

The only non-default settings in cmake are the following:
CUDA_ARCH_BIN: 2.0 2.1(2.0) 3.0 3.5 5.0
CUDA_ARCH_PTX: 5.0

Compilation succeeds, and Kinfu works properly on non-Maxwell GPUs (tested on a GTX 570). However, Kinfu fails on two different Maxwell GPUs: the GTX 750 Ti and the Maxwell version of the GTX 860M. In debug mode it does run, but at a significantly lower frame rate than on a non-Maxwell GPU. In release mode it hits a runtime error and the application exits:

Error: invalid device function, tsdf_volume.cu:76

The function that fails is pcl::device::initVolume, and I suspect it fails on this line: initializeVolume<<<grid, block>>>(volume);

My current workaround is to compile Kinfu with Microsoft Visual Studio 2010 and CUDA 5.0.35 with the following cmake settings:
CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0
CUDA_ARCH_PTX - 3.0

I hope someone has an idea of how to resolve this issue! It's been persisting ever since the Maxwell GPUs first came out. Thanks everyone.

@Grandgarfield

I posted an answer on the other thread, but I am not sure it was the right place, so I am copying it here. Feel free to tell me if one of the two should be removed.
My lead:

Hi,

If it helps: after a long search I managed to run the kinfu application with CUDA 6.5.
My OS is Windows 7, and I'm using a laptop with an NVIDIA 640M GPU.
I had the same error in release mode, with the application working fine in debug mode, as f2um2326 described.
The problem seems to come from the /GL flag during compilation.
To make it work, after generating the project files with CMake I manually removed the /GL flag from all .cmake files in $(BUILD_DIR)\gpu\kinfu\CMakeFiles\cuda_compile.dir\src\cuda. Each file has a line setting CMAKE_HOST_FLAGS_RELEASE; that is where I removed the flag.
This is not an optimal solution, since I guess the code is no longer fully optimized, but it may help some of you...
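
The manual edit described above could be scripted roughly as follows. This is only a sketch under assumptions: the helper names `strip_gl_flag` and `patch_build_dir` are hypothetical, the flag values in the usage note are made up, and the script assumes each generated file keeps the whole `set(CMAKE_HOST_FLAGS_RELEASE ...)` statement on a single line.

```python
import re
from pathlib import Path


def strip_gl_flag(cmake_text: str) -> str:
    """Remove a standalone /GL token from set(CMAKE_HOST_FLAGS_RELEASE ...) lines.

    Lines that set other variables are returned unchanged.
    """
    patched = []
    for line in cmake_text.splitlines(keepends=True):
        if line.lstrip().startswith("set(CMAKE_HOST_FLAGS_RELEASE"):
            # Drop /GL together with the whitespace in front of it, but only
            # when it is a whole token (followed by whitespace or the closing
            # parenthesis), so longer flags starting with /GL are kept.
            line = re.sub(r"\s/GL(?=[\s)])", "", line)
        patched.append(line)
    return "".join(patched)


def patch_build_dir(cuda_dir: str) -> int:
    """Apply strip_gl_flag to every .cmake file in cuda_dir.

    Returns the number of files that were actually modified.
    """
    changed = 0
    for path in Path(cuda_dir).glob("*.cmake"):
        original = path.read_text()
        patched = strip_gl_flag(original)
        if patched != original:
            path.write_text(patched)
            changed += 1
    return changed
```

For example, `strip_gl_flag("set(CMAKE_HOST_FLAGS_RELEASE /MD /O2 /GL /DNDEBUG)")` leaves the other (made-up) flags in place and drops only `/GL`. Since these .cmake files are generated, re-running CMake would presumably restore the flag, so the patch would have to be reapplied.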

@SteveSmithStyku
Author

Thanks for your reply! But I don't think this fixes the underlying problem. The /GL flag would explain why the debug build works and the release build doesn't (by default, /GL is a release-only flag), but it doesn't explain the substantially reduced frame rate on Maxwell GPUs.

To reiterate... I can build the code for non-Maxwell GPUs under CUDA 6.5 & MSVC 12 and have it work fine, but I can't build for Maxwell under CUDA 6.5 & MSVC 12 in a way that gives performance comparable to building under CUDA 5.0 & MSVC 10 with the aforementioned flags.

@oliveRudoll

Hey guys! I'm quite new to CUDA issues, but I can report that I get the same CUDA error message on Ubuntu 14.04, 64-bit, with a quite old GPU (GTX 275, compute capability 1.3). I'm building the gpu/cuda modules for use on an Android device with a Tegra K1 (Kepler, 3.2; NVCC: -arch=sm_32). I tried CUDA 6.0/6.5 with GCC 4.6/4.8. The build succeeds, but when I run the app I get the same error.

I looked into ../cuda_compile.dir/cuda/cuda_compile_generated_tsdf_volume.cu.o.cmake; there is no /GL flag in my case.

set(CMAKE_HOST_FLAGS -fexceptions -frtti -Wno-psabi --sysroot=/home/con/NDK/android-ndk-r8b/platforms/android-8/arch-arm -fpic -funwind-tables -finline-limit=64 -fsigned-char -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -fdata-sections -ffunction-sections -Wa,--noexecstack -Wall -Wextra -Wabi -Wno-unknown-pragmas -Wconversion -Wold-style-cast -fno-strict-aliasing -Wno-format-extra-args -Wno-sign-compare )
set(CMAKE_HOST_FLAGS_DEBUG -marm -fno-omit-frame-pointer -fno-strict-aliasing -O0 -g -DDEBUG -D_DEBUG)

The only common thing I see is that I used 6.5 as well. I'll give you an update.

@SteveSmithStyku
Author

conconconconconconcon, your GTX 275 is pre-Fermi, so it makes sense that kinfu would not run with that card. (The minimum arch PCL compiles for is 2.0, as per the settings in my initial comment above.) Technically, the "invalid device function" error is supposed to occur when you have an architecture mismatch.
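
To make the "architecture mismatch" idea concrete, here is a rough sketch of the usual compatibility rule: a cubin runs on devices of the same major compute capability with an equal or higher minor revision, and embedded PTX can be JIT-compiled for any device at least as new as the PTX version. The function name `can_run` and the constants are hypothetical, and the sketch ignores toolkit and driver limits.

```python
from typing import Iterable, Tuple

Capability = Tuple[int, int]  # (major, minor), e.g. (5, 0) for compute capability 5.0

# Settings from the first comment: CUDA_ARCH_BIN 2.0 2.1(2.0) 3.0 3.5 5.0, CUDA_ARCH_PTX 5.0
ISSUE_BIN = [(2, 0), (2, 1), (3, 0), (3, 5), (5, 0)]
ISSUE_PTX = [(5, 0)]


def can_run(device: Capability,
            cubin_targets: Iterable[Capability],
            ptx_targets: Iterable[Capability]) -> bool:
    """Return True if the binary holds at least one image the device can load."""
    # A cubin only matches the same major architecture, with minor <= device minor.
    cubin_ok = any(t[0] == device[0] and t[1] <= device[1] for t in cubin_targets)
    # PTX is forward compatible: any device at least as new can JIT-compile it.
    ptx_ok = any(t <= device for t in ptx_targets)
    return cubin_ok or ptx_ok


print(can_run((1, 3), ISSUE_BIN, ISSUE_PTX))  # GTX 275, pre-Fermi: False
print(can_run((2, 0), ISSUE_BIN, ISSUE_PTX))  # GTX 570: True
print(can_run((5, 0), ISSUE_BIN, ISSUE_PTX))  # GTX 750 Ti: True
```

Under this rule the Maxwell cards should find a matching image with the settings above, which is consistent with the suspicion that the failure reported here has another cause.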

I can't comment about tegra cross-compilation as I'm not knowledgeable about it, sorry.

@oliveRudoll

You're right, I didn't give a proper description of what I'm doing: I'm only building on the machine with the GTX 275 (pre-Fermi); I'm running on the tablet with Kepler. So my statement about Linux is quite irrelevant.

Anything new on the cause of the problem? Are you sure the last CUDA error is caused by the initVolume kernel?

I would love to try your workaround settings:
CUDA 5.0.35, CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0, CUDA_ARCH_PTX - 3.0
The problem is that CUDA 5.0.xx has no cross-compiling abilities, if I'm not wrong?!

@SteveSmithStyku
Author

You could try using the cmake settings CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0, CUDA_ARCH_PTX - 3.0 with your current version of CUDA and see if that works.

There are a few things to note about these settings. If I am not mistaken, the "2.0" would translate into "-gencode=arch=compute_20,code=sm_20", and the "2.1(2.0)" would translate into "-gencode=arch=compute_20,code=sm_21", the idea being that the sm_21 binary is generated from compute_20 PTX (there is no compute_21 PTX version). Meanwhile the PTX setting translates into "-gencode=arch=compute_30,code=compute_30", which is designed to "future-proof" your build. Note that by default in PCL the PTX setting is blank (and probably shouldn't be, as per below).

From http://docs.nvidia.com/cuda/maxwell-compatibility-guide/#axzz3TFvYAV8o "Note that compute_XX refers to a PTX version and sm_XX refers to a cubin version. The arch= clause of the -gencode= command-line option to nvcc specifies the front-end compilation target and must always be a PTX version. The code= clause specifies the back-end compilation target and can either be cubin or PTX or both. Only the back-end target version(s) specified by the code= clause will be retained in the resulting binary; at least one should be PTX to provide compatibility with future architectures."
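
A small sketch of how such settings strings could map onto nvcc flags, written to follow the rule quoted above that arch= must name a PTX (compute_XX) version. It assumes the number in parentheses is the PTX version used for that binary target; `gencode_flags` is a hypothetical helper and is not PCL's actual CMake logic.

```python
import re
from typing import List


def gencode_flags(arch_bin: str, arch_ptx: str) -> List[str]:
    """Translate CUDA_ARCH_BIN / CUDA_ARCH_PTX style strings into -gencode flags."""
    flags = []
    for entry in arch_bin.split():
        # Accept "3.0" or "2.1(2.0)"; the parenthesized part is optional.
        m = re.fullmatch(r"(\d)\.(\d)(?:\((\d)\.(\d)\))?", entry)
        if m is None:
            raise ValueError(f"unrecognized CUDA_ARCH_BIN entry: {entry!r}")
        sm = m.group(1) + m.group(2)
        # Assumption: without parentheses the PTX version equals the binary version.
        ptx = m.group(3) + m.group(4) if m.group(3) else sm
        flags.append(f"-gencode=arch=compute_{ptx},code=sm_{sm}")
    for entry in arch_ptx.split():
        m = re.fullmatch(r"(\d)\.(\d)", entry)
        if m is None:
            raise ValueError(f"unrecognized CUDA_ARCH_PTX entry: {entry!r}")
        version = m.group(1) + m.group(2)
        # Keeping PTX in the binary lets the driver JIT-compile for newer GPUs.
        flags.append(f"-gencode=arch=compute_{version},code=compute_{version}")
    return flags


for flag in gencode_flags("2.0 2.1(2.0) 3.0", "3.0"):
    print(flag)
```

With the workaround settings from this thread, the loop prints three cubin targets (sm_20, sm_21, sm_30) followed by one compute_30 PTX target.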

I notice that the default compilation settings don't include 3.2. You can either try including it in the CUDA_ARCH_BIN field as "3.2(3.0)" or make edits to the source similar to what was done for issue #880. However, theoretically 3.2 should be able to run PTX version 3.0.

As to my issue, I have tried compiling both on the machine with the GTX 570 and on the machine with the Maxwell GTX 860M, with the same result, so I don't think the GPU in the build machine matters, for me at least. I have also tried the settings "CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0, CUDA_ARCH_PTX - 3.0" under CUDA 6.5 and again got the same result. But maybe results are different for you, conconconconconconcon. If you're able to get this to work, that makes me further suspect a Windows-only bug in either CUDA or PCL.

One further dumb suggestion: are you using the Tegra Android Development Pack? See https://developer.nvidia.com/tegra-android-development-pack

In any case, good luck! This issue tracker probably isn't supposed to be used as a support forum, so I've probably overstepped a bit.

@benbennett
Contributor

cudaSafeCall has been removed in CUDA >= 5.0. Maxwell is compute 5.0, so it isn't there.
https://devtalk.nvidia.com/default/topic/525246/is-cudasafecall-no-longer-needed-/
The fix would be to add the function when compiling for/targeting 5.0 and above, or to remove it from the codebase.
I might have the wrong terminology.

@zhangxaochen
Contributor

This bug is still reproducible.
My environment is: GTX 750 Ti, CUDA 6.5, VS2013.
It seems to be an error caused by release optimization.
@benbennett I tried what your link said (using a custom CUDA_SAFE_CALL macro), but with no luck. It's the cudaGetLastError call that returned the error, after which the application exited.

@SteveSmithStyku have you found a solution?

@jiaxing-li

My environment is: GT 705, CUDA 7.5, VS2013.
I get the same error: ERROR: invalid device function, tsdf_volume.cu:
@zhangxaochen have you found a solution?

@zhangxaochen
Contributor

So far I've found a workaround: since the problem is caused by the '/GL' optimization flag in release mode, I unchecked the CMAKE_MSVC_CODE_LINK_OPTIMIZATION variable to disable the '/GL' and 'LTCG' flags.
Then kinfu_app and gpu_kinfu work well, though I'm not sure how much they (and the libs, the entire solution) are slowed down by disabling /GL.
I'm not sure if this is the proper way, and I wish some experts could give a detailed explanation and an official patch ;-).

@soulslicer

soulslicer commented Sep 6, 2016

I have a GTX 1070 with CUDA 7.5 and am experiencing the same issue. I can't even get the octree example to run without "invalid device function". How exactly does one fix this? I did exactly as above, disabling CMAKE_MSVC_CODE_LINK_OPTIMIZATION.

@stale

stale bot commented May 19, 2020

Marking this as stale due to 30 days of inactivity. It will be closed in 7 days if no further activity occurs.

@stale stale bot added the status: stale label May 19, 2020
@FishHe

FishHe commented May 25, 2020

Same bug, and I don't know how to solve it.

@stale stale bot removed the status: stale label May 25, 2020
@soulslicer

Pcl is dead

@kunaltyagi
Member

The Kinfu sub-module is deprecated due to the lack of a maintainer familiar with kinfu. The GPU and CUDA modules will be receiving some updates in GSoC, but the lack of tests on CI is hampering efforts to increase support there. Community help is welcome in these 2 modules (namely cuda and gpu).

@stale

stale bot commented Jun 26, 2020

Marking this as stale due to 30 days of inactivity. It will be closed in 7 days if no further activity occurs.

@stale stale bot added the status: stale label Jun 26, 2020
@kunaltyagi kunaltyagi added the needs: pr merge Specify why not closed/merged yet label Jun 26, 2020
@stale stale bot removed the status: stale label Jun 26, 2020
@larshg
Contributor

larshg commented Feb 28, 2021

Closing this as it should be fixed by #4197.

@larshg larshg closed this as completed Feb 28, 2021
10 participants