Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUDA cuDNN CMake try_compile tests fail w/ Xcode clang=8.0.0 #674

Closed
headupinclouds opened this issue Jul 2, 2017 · 18 comments
Closed

CUDA cuDNN CMake try_compile tests fail w/ Xcode clang=8.0.0 #674

headupinclouds opened this issue Jul 2, 2017 · 18 comments

Comments

@headupinclouds
Copy link
Contributor

I'm walking through dlib DNN CUDA and cuDNN setup for a Mac OS X 10.12 with a preferred Xcode 8.1 (or newer) toolchain.

clang --version:

Apple LLVM version 8.0.0 (clang-800.0.42.1)

Apparently, Xcode/clang doesn't like the CUDA and cuDNN CMake try_compile tests. In particular, the -std=c++11 CUDA_NVCC_FLAGS flag triggers an error. Removing the flag makes it happy.

An earlier post ( #356 ) suggests a clang-omp compiler can be used as a workaround on OS X, but the latest Xcode toolchain seems to work fine with a few mods. I'd prefer to work this into the current CMake if possible.

Here is a pointer to the relevant code in the WIP fork:
headupinclouds@1b8bb49

#list(APPEND CUDA_NVCC_FLAGS "-arch=sm_30;-std=c++11;-D__STRICT_ANSI__;-D_MWAITXINTRIN_H_INCLUDED;-D_FORCE_INLINES")
# -std=c++11 fails on OSX/Xcode clang=8.0.0
list(APPEND CUDA_NVCC_FLAGS "-arch=sm_30;-D__STRICT_ANSI__;-D_MWAITXINTRIN_H_INCLUDED;-D_FORCE_INLINES")

I think we can probably exclude the flag for detected XCODE builds:

if(NOT XCODE) # possibly more specific (AND CLANG_SOMETHING)
   list(APPEND CUDA_NVCC_FLAGS "-std=c++11")
endif()

After that, I ran into an issue related to a previously documented CUDA + OpenMP requirement, which doesn't seem to be the case for Xcode/clang. An initial workaround in my test fork is here:

headupinclouds@72b5464

Essentially adding NOT XCODE to omit openmp_libraries linking seems to be sufficient:

if (NOT openmp_libraries AND NOT MSVC AND NOT XCODE) 

If such changes would be merged upstream, I can dig a little deeper and put a revise pr.xcode.cuda.fix PR together for review and further discussion. Let me know if that makes sense. Thanks!

@davisking
Copy link
Owner

Sure, submit a PR if you can. However, definitely look deeper into the -std=c++11 thing since that sounds very wrong. Maybe the switch is appearing twice and removing it makes it appear once or something like that. But not giving -std=c++11 at all is crazy.

@headupinclouds
Copy link
Contributor Author

I'll investigate further. I did confirm that -std=c++11 doesn't already exist in CUDA_NVCC_FLAGS at the point cuda_add_library(cuda_test STATIC cuda_test.cu) is called.

@kino-dome
Copy link
Contributor

kino-dome commented Dec 6, 2017

I just wanted to confirm that the second commit mentioned here (which omits OpenMP libraries) just saved me from a hassle of building dlib with Cuda enabled. Was struggling with llvm, clang and OpenMP installations for 3 days when I found this post and tried the second commit. dlib built fine and dnn example worked without any problems. It seems at least on my OS and Cuda, the Xcode/clang toolchain has omitted the need for OpenMP check.

OS: macOS 10.12.6
Xcode: Version 8.3.3 (8E3004b)
Cuda Version: 9.0.197
cudnn: cudnn-9.0-osx-x64-v7

@davisking
Copy link
Owner

davisking commented Dec 6, 2017 via email

@kino-dome
Copy link
Contributor

Yeah sure, maybe @headupinclouds should do it because he did all the work, if he can't I'll do it gladly. Also I have not tested on other versions of macOS, Is it possible that this PR would ruin it for others with other OS versions?

@davisking
Copy link
Owner

davisking commented Dec 7, 2017 via email

@kino-dome
Copy link
Contributor

The change I made was the exact copy of @headupinclouds 's commit here. it changes the OpenMP check for Xcode by changing if (NOT openmp_libarires AND NOT MSVC) to if (NOT openmp_libraries AND NOT MSVC AND NOT XCODE). The fact that this was the way it was, makes me think that maybe on previous macOS/Xcode versions this was necessary and the change might mess up builds for people with previous versions. I could be totally wrong though.
I have one other system but unfortunately with the exact same version condition as the one I tested on.

@kino-dome
Copy link
Contributor

Nevertheless I can make the PR but you can merge it when we'll be sure. Is that OK? or should we first make sure and then make the PR?

@davisking
Copy link
Owner

It's probably fine. I added that openmp stuff because it was required by the linux version of cuda. Maybe it's not required at all by the macos version of cuda. What happens when you build these things outside xcode with just cmake and make?

@kino-dome
Copy link
Contributor

I'm already building using cmake in terminal in the report I mentioned. cmake -D DLIB_NO_GUI_SUPPORT=yes .. and then cmake --build . --config Release. When compiling dlib the compiler used is Apple Clang 8. Here's what I get in terminal:
dlibbuild

@davisking
Copy link
Owner

Cool. Seems good. Do remove that xcode specific message statement though.

@headupinclouds
Copy link
Contributor Author

Yeah sure, maybe @headupinclouds should do it because he did all the work

@kino-dome : Please feel free to send any of those changes in your own PR. It would be nice to resolve this.

In particular, the -std=c++11 CUDA_NVCC_FLAGS flag triggers an error

☝️ I recall there was still some weirdness related to the -std=c++11 flag breaking Cuda builds in Xcode/Clang (after fixing the OpenMP issue). From your comment above, it sounds like this might be resolved in the Xcode+Cuda+Cudnn combination you tested here:

OS: macOS 10.12.6; Xcode: Version 8.3.3 (8E3004b); Cuda Version: 9.0.197; cudnn: cudnn-9.0-osx-x64-v7

Thanks for providing the configuration details. It looks like that may be addressed by upgrading. I'm looking forward to trying it.

@kino-dome
Copy link
Contributor

@kino-dome : Please feel free to send any of those changes in your own PR. It would be nice to resolve this.

Thanks @headupinclouds. I'll take care of it then.

☝️ I recall there was still some weirdness related to the -std=c++11 flag breaking Cuda builds in Xcode/Clang (after fixing the OpenMP issue). From your comment above, it sounds like this might be resolved in the Xcode+Cuda+Cudnn combination you tested here:

I actually didn't use your first commit with the -std=c++11 flag. It was just the second commit that omits OpenMP.

kino-dome added a commit to kino-dome/dlib that referenced this issue Dec 7, 2017
davisking pushed a commit that referenced this issue Dec 7, 2017
reunanen pushed a commit to reunanen/dlib that referenced this issue Dec 25, 2017
@headupinclouds
Copy link
Contributor Author

I recall there was still some weirdness related to the -std=c++11 flag breaking CUDA builds

In case anyone else encounters this part of the issue...

I believe the core issue is due to a typo in FindCUDA.cmake

If I run the following patch on my FindCUDA.cmake the dlib try_compile test works fine:

sed -i .bk 's|MATCHES \"-std;|MATCHES \"-std=|g' /usr/local/Cellar/cmake/3.10.1/share/cmake/Modules/FindCUDA.cmake

This MATCH operation was failing to detect the current -std=c++11 flag and was adding a second one, which nvcc didn't like (note the ";"):

    if( NOT "${CUDA_NVCC_FLAGS}" MATCHES "-std;c\\+\\+11" ) # needs 's|;|=|'
      list(APPEND nvcc_flags --std c++11)
    endif()

This was tested w/ CUDA 9.0 on OSX and CMake 3.10.1. The actual error from the test (--debug-trycompile) was:

/usr/local/cuda/bin/nvcc -M -D__CUDACC__ /dlib/dlib/cmake_utils/test_for_cuda/cuda_test.cu -o /dlib/_builds/libcxx-Release/dlib/cuda_test_build/CMakeFiles/cuda_test.dir//cuda_test_generated_cuda_test.cu.o.NVCC-depend -ccbin /usr/bin/clang -m64 --std c++11 -DDLIB_USE_CUDA -Xcompiler ,\"-stdlib=libc++\",\"-g\" -arch=sm_30 -std=c++11 -D__STRICT_ANSI__ -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES -DNVCC -I/usr/local/cuda/include -I/dlib/dlib/cmake_utils/test_for_cuda/../../dnn
nvcc fatal   : redefinition of argument 'std'
CMake Error at cuda_test_generated_cuda_test.cu.o.cmake:219 (message):
  Error generating
  /dlib/_builds/libcxx-Release/dlib/cuda_test_build/CMakeFiles/cuda_test.dir//./cuda_test_generated_cuda_test.cu.o

see: https://gitlab.kitware.com/cmake/cmake/merge_requests/1628

If this PR is merged, then it should be resolved in a future CMake release.

A workaround would be to omit the -std=c++11 flag in CUDA_NVCC_FLAGS here:

list(APPEND CUDA_NVCC_FLAGS "-arch=sm_30;-std=c++11;-D__STRICT_ANSI__;-D_MWAITXINTRIN_H_INCLUDED;-D_FORCE_INLINES")

This seems to be a combination of nvcc's intolerance of duplicate flags (at least some versions) and a CMake FindCUDA typo. Given that the flags are applied inside cuda_add_library(), and it isn't easy to enumerate the platforms that will have the problem, the best and most future safe local dlib workaround might be to simply run try_compile a second time without -std=c++ if the first one fails. I can send a PR for that. Let me know if you have another preference.

@davisking
Copy link
Owner

Is the error still happening as far as anyone knows though? I was under the impression it was worked around in a previous PR.

@headupinclouds
Copy link
Contributor Author

headupinclouds commented Jan 3, 2018

Is the error still happening as far as anyone knows though

I'm not able to build with DLIB_USE_CUDA=ON unless I apply the patch above. The local FindCUDA.cmake patch is workable for me.

Since Apple hasn't built HW using NVIDIA since 2013 or so, I'm guessing the intersection of dlib dnn + OS X users who are likely to hit this issue is fairly small (Hackintosh, early eGPU, old MacBooks). Still, it does seem curious.

@kino-dome : You mentioned you are now able to build. You provided the following spec.

OS: macOS 10.12.6; Xcode: Version 8.3.3 (8E3004b); Cuda Version: 9.0.197; cudnn: cudnn-9.0-osx-x64-v7

This is the setup I'm using, except I'm on macOS 10.12.2, which I think is unrelated to the issue. The one thing you didn't share was your cmake -version (specifically your FindCUDA.cmake in case it is different). Any chance you can report that here? I would like to understand how you are able to build.


I also realized the other part of the issue if (NOT openmp_libraries AND NOT MSVC AND NOT XCODE) should really be if (NOT openmp_libraries AND NOT MSVC AND NOT APPLE). I hit the OpenMP issue on an OS X system using a pure (Apple) clang toolchain without Xcode. I believe it has to do with the OS X NVIDIA CUDA release, which, from CMake's perspective, is effectively APPLE ("Darwin") and is unrelated to XCODE. I can send a PR for that.

@davisking
Copy link
Owner

Sounds good. Please send a PR for this stuff then :)

@kino-dome
Copy link
Contributor

Hey @headupinclouds, sorry for the delay. I checked my cmake version and it's 3.3.1 . Also what you said about replacing NOT XCODE with NOT APPLE makes sense, I only tried the first variant and it worked for me but your logic is valid.

Is the error still happening as far as anyone knows though? I was under the impression it was worked around in a previous PR.

The PR I made was in relation to the OpenMP issue not the -std=c++11 issue headupinclouds mentioned later. Hope his PR solves this once and for all :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants