cuda 9.0 "error: more than one operator "==" matches these operands" #797

Open
dllehr81 opened this issue Aug 22, 2017 · 32 comments

Comments

@dllehr81

While attempting to build torch from master with cutorch against CUDA 9.0.103-1 on Ubuntu 16.04, I hit a compile error: the "==" and "!=" operators for half each have more than one matching overload.

Below is an example of the error I receive.

lib/THC/CMakeFiles/THC.dir/build.make:4243: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o] Error 1
/pkgbuild/torch/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(393): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

/pkgbuild/torch/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(414): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

I was able to track down the two operator overloads.
One is in
https://github.com/torch/cutorch/blob/master/lib/THC/THCTensorTypeUtils.cuh#L176

And the other is in
/usr/local/cuda-9.0/targets/ppc64le-linux/include/cuda_fp16.hpp

The operator in cuda_fp16.hpp is provided by the CUDA package, but it only covers __device__ code, not __host__ code. So we still need an overload of "==" for halves on the __host__ side; however, the code currently in cutorch fails at compile time.

It looks like @csarofeen worked on the initial port of cutorch to CUDA 9.0. Perhaps he can help explain what's going on here?

Is there any additional information you need from me? Thanks in advance!!

@csarofeen
Contributor

csarofeen commented Aug 23, 2017

Before you build try export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"

@dllehr81
Author

Hey @csarofeen, that did the trick! On a side note, this appears to disable the half operators in the CUDA code. Will that affect the performance of half variables when run on the device?

@csarofeen
Contributor

It will, for the better.

@betterjordache

@csarofeen this did it for me as well, thank you!

@ProGamerGov

ProGamerGov commented Oct 24, 2017

I had the same issue:

[  4%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCSleep.cu.o
[  5%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCStorage.cu.o
[  6%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCStorageCopy.cu.o
[  7%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensor.cu.o
[  8%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorCopy.cu.o
[ 10%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o
[ 11%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath2.cu.o
[ 12%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathBlas.cu.o
[ 13%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathMagma.cu.o
[ 14%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o
/home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(393): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

/home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(414): error: more than one operator "==" matches these operands:
            function "operator==(const __half &, const __half &)"
            function "operator==(half, half)"
            operand types are: half == half

[ 15%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathReduce.cu.o
2 errors detected in the compilation of "/tmp/tmpxft_00002141_00000000-4_THCTensorMath.cpp4.ii".
CMake Error at THC_generated_THCTensorMath.cu.o.cmake:267 (message):
  Error generating file
  /home/ubuntu/torch/extra/cutorch/build/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorMath.cu.o


lib/THC/CMakeFiles/THC.dir/build.make:112: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
^Clib/THC/CMakeFiles/THC.dir/build.make:105: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorCopy.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorCopy.cu.o] Interrupt
lib/THC/CMakeFiles/THC.dir/build.make:140: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMathPairwise.cu.o] Interrupt
CMakeFiles/Makefile2:172: recipe for target 'lib/THC/CMakeFiles/THC.dir/all' failed
make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Interrupt
Makefile:127: recipe for target 'all' failed
make: *** [all] Interrupt

Error: Build error: Failed building.
ubuntu@ip-Address:~/torch$

Running ./clean.sh and then using: export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__", before finally running ./install.sh worked!

I was using Ubuntu 16.04.
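
The sequence described above (clean, set the flag, rebuild) can be sketched as a small shell function. The name `rebuild_torch` and the `DRY_RUN` switch are hypothetical helpers for illustration and are not part of the Torch distro; `clean.sh` and `install.sh` are the scripts named in this thread.

```shell
# Hypothetical helper: rebuild Torch with the half operators from the CUDA
# header disabled. With DRY_RUN set, it only prints what it would run.
rebuild_torch() {
  torch_dir=$1
  if [ -n "${DRY_RUN:-}" ]; then
    echo "cd $torch_dir"
    echo "./clean.sh"
    echo "TORCH_NVCC_FLAGS=-D__CUDA_NO_HALF_OPERATORS__ ./install.sh"
    return 0
  fi
  # Run in a subshell so the caller's working directory is left untouched.
  ( cd "$torch_dir" &&
    ./clean.sh &&
    TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" ./install.sh )
}

# Print the plan without building anything.
DRY_RUN=1 rebuild_torch "$HOME/torch"
```

Running `./clean.sh` first matters because objects from the earlier failed build may otherwise be reused.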

@ProGamerGov

ProGamerGov commented Oct 24, 2017

@csarofeen If it's better to disable the half operators, then what are they used for? Why are they included in the cuda code? And what kind of performance boost are we talking about here?

@csarofeen
Contributor

CUDA 9 added half operators to the CUDA half header. Half operations in torch predate that, so they already existed in torch. The flag keeps the half definition from the CUDA header while not compiling its operators.
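
To see this guard on your own machine, you can search the CUDA half header for the macro name. The sketch below is only an illustration: `show_half_guard` is a hypothetical helper, and the demo runs on a small stand-in file rather than NVIDIA's real `cuda_fp16.hpp`, whose exact contents vary by CUDA version.

```shell
# Hypothetical helper: print the lines of a header that mention the guard macro.
# Usage on a real install (path is an assumption, adjust to your system):
#   show_half_guard /usr/local/cuda-9.0/include/cuda_fp16.hpp
show_half_guard() {
  grep -n '__CUDA_NO_HALF_OPERATORS__' "$1"
}

# Self-contained demo on a stand-in header (NOT the real NVIDIA file).
demo_header=$(mktemp)
cat > "$demo_header" <<'EOF'
struct __half { unsigned short __x; };
#if !defined(__CUDA_NO_HALF_OPERATORS__)
bool operator==(const __half &a, const __half &b);
#endif
EOF

guard_lines=$(show_half_guard "$demo_header")
echo "$guard_lines"
rm -f "$demo_header"
```

In the stand-in, the struct definition sits outside the `#if`, so defining the macro removes only the operator declaration, which is the behavior described above.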

@ProGamerGov

@csarofeen Do you have any other performance tips for Cuda and/or cuDNN with Torch7?

Because I've noticed that Cuda 9.0 and cuDNN v7 have even worse performance than Cuda 8.0 and cuDNN v5: jcjohnson/neural-style#429

@sfzyk

sfzyk commented Nov 15, 2017

Same issue, but export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" didn't work for me. How can I solve that?

@csarofeen
Contributor

@sfzyk Could you please explain all the steps you took to install CUDA, NCCL, cuDNN, and pytorch, and paste some of the error output here? It is very hard to assist when the only information provided is "didn't work".

@wuyun8210

Installing Torch 7 on Ubuntu 16.04 causes this error:
cuda 9.0: "more than one operator "==" matches these operands"
One possible solution:
1. Uninstall CUDA 9.0
Ref: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#package-manager-additional

(1) To uninstall the CUDA Toolkit, run the uninstallation script provided in the bin directory of the toolkit. By default, it is located in /usr/local/cuda-9.1/bin:
$ sudo /usr/local/cuda-9.1/bin/uninstall_cuda_9.1.pl
(2) To uninstall the NVIDIA Driver, run nvidia-uninstall (no need to uninstall):
$ sudo /usr/bin/nvidia-uninstall
(3) Reboot Ubuntu

2. Install CUDA 8.0. Download address: https://developer.nvidia.com/cuda-80-ga2-download-archive

(1) Install the 8.0 deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda

(2) Install patch 2
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda

3. Install Torch
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch
bash install-deps

If an earlier build failed, first run: sudo ./clean.sh
sudo ./install.sh export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"

source ~/.bashrc

@Amir-Arsalan

Amir-Arsalan commented Jan 17, 2018

@csarofeen I tried export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" before ./install.sh and I still cannot install Torch. The installation gets stalled at 81% while compiling the cutorch package. Before setting the environmental variable, the installation would crash. I have CUDA 9.1 and cuDNN 7.05 on a machine with GeForce 1080Ti GPU.

Now, I'm getting warnings like this:

[ 61%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMaskedLong.cu.o
[ 62%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorSortHalf.cu.o
/home/arsalans/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=half, AccT=float]" 
/home/arsalans/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/home/arsalans/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void renormRowsL1(T *, long, long) [with T=float]" 
/home/arsalans/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/home/arsalans/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=float, AccT=float]" 
/home/arsalans/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/home/arsalans/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): warning: specified alignment (8) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void renormRowsL1(T *, long, long) [with T=double]" 
/home/arsalans/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/home/arsalans/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (8) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=double, AccT=double]" 
/home/arsalans/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here
[ 63%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathCompareTHalf.cu.o
[ 64%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathPointwiseHalf.cu.o

@thompa2

thompa2 commented Feb 9, 2018

I had the same problem and it was driving me nuts.

This did not work:

./clean.sh
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
./install.sh

This did work:

./clean.sh
TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" ./install.sh

Hope that helps.
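
The thread does not establish why the exported form failed for some users; within a single shell session, both `export VAR=...` and the inline `VAR=... command` form normally reach the child process. As a hedged illustration of what does differ, the sketch below shows that a plain assignment without `export` is not passed to a child, while the inline and exported forms are. The `probe` and `seen_*` names are made up for the demo.

```shell
# Child command that prints TORCH_NVCC_FLAGS as the child sees it.
probe='printf %s "${TORCH_NVCC_FLAGS:-}"'
unset TORCH_NVCC_FLAGS

# 1. Plain assignment: visible in this shell only, NOT in child processes.
TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
seen_plain=$(sh -c "$probe")

# 2. Inline prefix: placed in the environment of that one command only.
unset TORCH_NVCC_FLAGS
seen_inline=$(TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" sh -c "$probe")

# 3. export: every later child of this shell sees it.
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
seen_export=$(sh -c "$probe")

echo "plain:  [$seen_plain]"
echo "inline: [$seen_inline]"
echo "export: [$seen_export]"
```

If the exported form fails for you, it may be worth checking that the export and `./install.sh` ran in the same shell session and that nothing in between (a new terminal, `sudo`, a wrapper script) dropped the variable.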

@hzxie

hzxie commented May 14, 2018

@thompa2
It works like a charm

@ricpruss

With CUDA 9.2 you need:

export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF2_OPERATORS__"
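
A later comment in this thread reports that CUDA 9.2 built only with both the HALF and HALF2 defines set. A minimal sketch for combining them without clobbering or duplicating existing flags follows; `add_nvcc_define` is a hypothetical helper, not something Torch provides, and the demo resets `TORCH_NVCC_FLAGS` to empty so its output is predictable.

```shell
# Hypothetical helper: append -D<name> to TORCH_NVCC_FLAGS unless already present.
add_nvcc_define() {
  case " ${TORCH_NVCC_FLAGS:-} " in
    *" -D$1 "*) ;;   # already set, nothing to do
    *) TORCH_NVCC_FLAGS="${TORCH_NVCC_FLAGS:+$TORCH_NVCC_FLAGS }-D$1" ;;
  esac
  export TORCH_NVCC_FLAGS
}

# Demo only: start from an empty value (drop this line to keep existing flags).
TORCH_NVCC_FLAGS=""
add_nvcc_define __CUDA_NO_HALF_OPERATORS__
add_nvcc_define __CUDA_NO_HALF2_OPERATORS__
add_nvcc_define __CUDA_NO_HALF_OPERATORS__   # repeated on purpose: no duplicate is added
echo "$TORCH_NVCC_FLAGS"
```

After this, run `./clean.sh` and `./install.sh` from the same shell so the build inherits the exported value.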

@Amir-Arsalan

@ricpruss I did export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF2_OPERATORS__" but still cannot build Torch with CUDA 9.2 and cudnn 7.x.x. Any ideas?

@ricpruss

ricpruss commented May 31, 2018

@Amir-Arsalan Are you still getting the operator overload errors?
Did you run clean.sh after the change?

@maximiliangoettgens

Same here with cuda 9.0 and cudnn 7. export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" does solve the operator issue, but I am still getting these errors for the __half class:

/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(173): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(173): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(177): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(177): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCNumerics.cuh(114): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCNumerics.cuh(115): error: class "__half" has no member "x"

6 errors detected in the compilation of "/tmp/tmpxft_0000615c_00000000-6_THCTensorCopy.cpp1.ii".
CMake Error at THC_generated_THCTensorCopy.cu.o.cmake:267 (message):
  Error generating file
  /home/max/torch/extra/cutorch/build/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorCopy.cu.o


lib/THC/CMakeFiles/THC.dir/build.make:105: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorCopy.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorCopy.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(173): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(173): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(177): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCTensorTypeUtils.cuh(177): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCNumerics.cuh(114): error: class "__half" has no member "x"

/home/max/torch/extra/cutorch/lib/THC/THCNumerics.cuh(115): error: class "__half" has no member "x"

6 errors detected in the compilation of "/tmp/tmpxft_00006176_00000000-6_THCTensorMath.cpp1.ii".
CMake Error at THC_generated_THCTensorMath.cu.o.cmake:267 (message):
  Error generating file
  /home/max/torch/extra/cutorch/build/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorMath.cu.o


lib/THC/CMakeFiles/THC.dir/build.make:112: recipe for target 'lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o' failed
make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath.cu.o] Error 1
CMakeFiles/Makefile2:172: recipe for target 'lib/THC/CMakeFiles/THC.dir/all' failed
make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Error: Build error: Failed building.

@Amir-Arsalan

@ricpruss I get these errors:

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void renormRowsL1(T *, long, long) [with T=float]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=float, AccT=float]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): warning: specified alignment (8) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void renormRowsL1(T *, long, long) [with T=double]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (8) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=double, AccT=double]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=half, AccT=float]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void renormRowsL1(T *, long, long) [with T=float]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (4) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=float, AccT=float]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): warning: specified alignment (8) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void renormRowsL1(T *, long, long) [with T=double]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): warning: specified alignment (8) is different from alignment (2) specified on a previous declaration
          detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=double, AccT=double]" 
/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

CMakeFiles/Makefile2:172: recipe for target 'lib/THC/CMakeFiles/THC.dir/all' failed
make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Error: Build error: Failed building.

@fredlemieux

@thompa2 AMAZING! Thank you! After hours of searching for a solution this worked. (well I'm at 20% now which is more than I've been able to get to until now) What a pain!

@fredlemieux

I lied, it failed again... but this time at 20%. That's progress, isn't it? I'm not sure; I'm thinking of giving up.

I've got MacBook Pro (13-inch, 2017)

This is the error message at 20%:

            ^

9 warnings generated.
[ 20%] Building NVCC (Device) object lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorRandom.cu.o
/Users/fredlemieux/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): error: specified alignment (4) is different from alignment (2) specified on a previous declaration
detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=half, AccT=float]"
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/Users/fredlemieux/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): error: specified alignment (4) is different from alignment (2) specified on a previous declaration
detected during instantiation of "void renormRowsL1(T *, long, long) [with T=float]"
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/Users/fredlemieux/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): error: specified alignment (4) is different from alignment (2) specified on a previous declaration
detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=float, AccT=float]"
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

/Users/fredlemieux/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(95): error: specified alignment (8) is different from alignment (2) specified on a previous declaration
detected during instantiation of "void renormRowsL1(T *, long, long) [with T=double]"
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(98): here

/Users/fredlemieux/torch/extra/cutorch/lib/THC/THCTensorRandom.cuh(156): error: specified alignment (8) is different from alignment (2) specified on a previous declaration
detected during instantiation of "void sampleMultinomialOnce<T,AccT>(long *, long, int, T *, T *) [with T=double, AccT=double]"
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorRandom.cu(169): here

5 errors detected in the compilation of "/tmp/tmpxft_00011634_00000000-11_THCTensorRandom.compute_61.cpp1.ii".
CMake Error at THC_generated_THCTensorRandom.cu.o.cmake:267 (message):
Error generating file
/Users/fredlemieux/torch/extra/cutorch/build/lib/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorRandom.cu.o

make[2]: *** [lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorRandom.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
9 warnings generated.
6 warnings generated.
/Users/fredlemieux/torch/extra/cutorch/lib/THC/THCHalf.h:24:17: warning: 'THC_float2half' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THC_float2half(float a);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCStorage.h:28:17: warning: 'THCudaHalfStorage_get' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfStorage_get(THCState * state, const THCudaHalfStorage *, ptrdiff_t);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensor.h:127:17: warning: 'THCudaHalfTensor_get1d' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_get1d(THCState * state, const THCudaHalfTensor * tensor, long x0);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensor.h:128:17: warning: 'THCudaHalfTensor_get2d' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_get2d(THCState * state, const THCudaHalfTensor * tensor, long x0, long x1);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensor.h:129:17: warning: 'THCudaHalfTensor_get3d' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_get3d(THCState * state, const THCudaHalfTensor * tensor, long x0, long x1, long x2);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensor.h:130:17: warning: 'THCudaHalfTensor_get4d' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_get4d(THCState * state, const THCudaHalfTensor * tensor, long x0, long x1, long x2, long x3);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorMathReduce.h:35:17: warning: 'THCudaHalfTensor_minall' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_minall(THCState * state, THCudaHalfTensor * self);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorMathReduce.h:36:17: warning: 'THCudaHalfTensor_maxall' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_maxall(THCState * state, THCudaHalfTensor * self);
^
/Users/fredlemieux/torch/extra/cutorch/lib/THC/generic/THCTensorMathReduce.h:37:17: warning: 'THCudaHalfTensor_medianall' has C-linkage specified, but returns user-defined type 'half' (aka '__half') which is incompatible with C [-Wreturn-type-c-linkage]
extern "C" half THCudaHalfTensor_medianall(THCState * state, THCudaHalfTensor * self);
^
9 warnings generated.
make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

@helloall1900

Same error here
OSX 10.13.5, Cuda 9.1, cudnn 7

@cag472 cag472 mentioned this issue Aug 13, 2018
@Yonv1943

Yonv1943 commented Nov 5, 2018

Before you build try export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"

It works on Ubuntu 18.04 too.

@Amir-Arsalan

Amir-Arsalan commented Nov 12, 2018

@ricpruss I know this has been a while but for some reason I need to compile Torch with CUDA 9.2. I remember I tried export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF2_OPERATORS__" and I could not compile Torch. I just added both -D__CUDA_NO_HALF2_OPERATORS__ and -D__CUDA_NO_HALF_OPERATORS__ to TORCH_NVCC_FLAGS and could compile Torch with CUDA 9.2 but when I do require 'cutorch' I get the following errors:

require 'cutorch';
THCudaCheck FAIL file=/torch/extra/cutorch/lib/THC/THCGeneral.c line=70 error=35 : CUDA driver version is insufficient for CUDA runtime version
/torch/install/share/lua/5.1/trepl/init.lua:389: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at /torch/extra/cutorch/lib/THC/THCGeneral.c:70
stack traceback:
	[C]: in function 'error'
	/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
	[string "require 'cutorch';"]:1: in main chunk
	[C]: in function 'xpcall'
	/torch/install/share/lua/5.1/trepl/init.lua:679: in function 'repl'
	/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:204: in main chunk
	[C]: at 0x00405d50

How did you resolve this?
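
CUDA runtime error 35 means the installed NVIDIA driver is older than the CUDA runtime requires, so it is a separate problem from the half operator flags. A rough way to check is to compare the driver version with the minimum for your toolkit. In the sketch below, `version_ge` is a hypothetical helper, and the 396.26 minimum for CUDA 9.2 on Linux is quoted from memory of NVIDIA's release notes, so treat it as an assumption and verify it there.

```shell
# Hypothetical helper: succeed when version $1 >= version $2
# (dot-separated numeric fields, up to three fields compared).
version_ge() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" |
    sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
  [ "$lowest" = "$2" ]
}

# Assumed minimum Linux driver for CUDA 9.2; check NVIDIA's release notes.
min_driver="396.26"

# On a real machine the driver version can be read with:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Sample values are used here so the sketch runs without a GPU.
for driver in 390.46 396.26 410.48; do
  if version_ge "$driver" "$min_driver"; then
    echo "$driver: new enough for the assumed minimum $min_driver"
  else
    echo "$driver: too old for the assumed minimum $min_driver"
  fi
done
```

If the driver turns out to be too old, the usual options are upgrading the driver or building against an older CUDA toolkit.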

@tjusxh

tjusxh commented May 22, 2019

I get the same issue on Windows 10 with pytorch 1.1.0 and VS 2017 with the version 15.4 toolset. Does anyone have a good fix?

E:/Program Files/Python35/lib/site-packages/torch/include\THC/THCNumerics.cuh(190): error: more than one operator "<" matches these operands:
built-in operator "arithmetic < arithmetic"
function "operator<(const __half &, const __half &)"
operand types are: c10::Half < c10::Half

E:/Program Files/Python35/lib/site-packages/torch/include\THC/THCNumerics.cuh(191): error: more than one operator "<=" matches these operands:
built-in operator "arithmetic <= arithmetic"
function "operator<=(const __half &, const __half &)"
operand types are: c10::Half <= c10::Half

E:/Program Files/Python35/lib/site-packages/torch/include\THC/THCNumerics.cuh(192): error: more than one operator ">" matches these operands:
built-in operator "arithmetic > arithmetic"
function "operator>(const __half &, const __half &)"
operand types are: c10::Half > c10::Half

E:/Program Files/Python35/lib/site-packages/torch/include\THC/THCNumerics.cuh(193): error: more than one operator ">=" matches these operands:
built-in operator "arithmetic >= arithmetic"
function "operator>=(const __half &, const __half &)"
operand types are: c10::Half >= c10::Half

E:/Program Files/Python35/lib/site-packages/torch/include\THC/THCNumerics.cuh(194): error: more than one operator "==" matches these operands:
built-in operator "arithmetic == arithmetic"
function "operator==(const __half &, const __half &)"
operand types are: c10::Half == c10::Half

E:/Program Files/Python35/lib/site-packages/torch/include\THC/THCNumerics.cuh(196): error: more than one operator "!=" matches these operands:
built-in operator "arithmetic != arithmetic"
function "operator!=(const __half &, const __half &)"
operand types are: c10::Half != c10::Half

@windskyl

I got the same error on Ubuntu 18.04 with cudnn 10.1 and cuda 7.5.
I tried all the methods above and made no progress,
but the command below let me run th successfully:
luarocks install cutorch
without any "export ...".

@chzhan

chzhan commented Jun 13, 2019

@tjusxh Me too. Have you solved it?

@YashasSamaga

@csarofeen You said earlier that disabling the CUDA built-in operators would change performance "for the better". Can you explain how?

@maokang94

This didn't work for me:
./clean.sh
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
./install.sh

Try this instead:
sudo TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" ./install.sh
With that, my project works.
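
One plausible reason the `sudo` form behaves differently is that many sudo configurations reset the environment, so a variable exported in your shell may not reach a command run under `sudo` unless it is passed on the command line. Whether that applies depends on the local sudoers policy, so this is an assumption, not a diagnosis. The sketch below imitates an environment reset with `env -i` instead of calling `sudo`; the `probe` and result variable names are made up for the demo.

```shell
# Child command that reports whether it can see TORCH_NVCC_FLAGS.
probe='printf %s "${TORCH_NVCC_FLAGS:-<unset>}"'

export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"

# Normal child process: inherits the exported variable.
inherited=$(/bin/sh -c "$probe")

# env -i starts the child with an empty environment, similar in spirit to
# a sudo policy that resets the environment (an analogy, not sudo itself).
scrubbed=$(env -i /bin/sh -c "$probe")

# Passing the assignment on the wrapper's command line restores it.
passed=$(env -i TORCH_NVCC_FLAGS="$TORCH_NVCC_FLAGS" /bin/sh -c "$probe")

echo "inherited: $inherited"
echo "scrubbed:  $scrubbed"
echo "passed:    $passed"
```

Note that building with `sudo` leaves root-owned files in the Torch directory, so later steps such as `./clean.sh` may then need `sudo` as well.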

@adem404

adem404 commented Oct 29, 2020

Windows users have to use:
SET NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
instead of
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"

@gemfield

gemfield commented Dec 2, 2020

This comes from PyTorch CMake files:

  if(CUDA_HAS_FP16 OR NOT ${CUDA_VERSION} LESS 7.5)
    message(STATUS "Found CUDA with FP16 support, compiling with torch.cuda.HalfTensor")
    list(APPEND CUDA_NVCC_FLAGS "-DCUDA_HAS_FP16=1" "-D__CUDA_NO_HALF_OPERATORS__" "-D__CUDA_NO_HALF_CONVERSIONS__"
      "-D__CUDA_NO_BFLOAT16_CONVERSIONS__" "-D__CUDA_NO_HALF2_OPERATORS__")
    add_compile_options(-DCUDA_HAS_FP16=1)
......

That is why a normal PyTorch build does not hit this error.
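
When building an extension or an older codebase outside of PyTorch's own CMake setup, it can help to check which of these guard defines your flags already carry. The sketch below is a hypothetical helper (`missing_defines` is not part of PyTorch or Torch); the list of macro names is copied from the CMake snippet above.

```shell
# Guard macros that the CMake snippet above passes to nvcc.
pytorch_defines="__CUDA_NO_HALF_OPERATORS__ __CUDA_NO_HALF_CONVERSIONS__ __CUDA_NO_BFLOAT16_CONVERSIONS__ __CUDA_NO_HALF2_OPERATORS__"

# Hypothetical helper: print the guard macros that are NOT defined in the
# flag string passed as $1 (space-separated nvcc flags).
missing_defines() {
  missing=""
  for name in $pytorch_defines; do
    case " $1 " in
      *" -D$name "*) ;;                           # present, skip
      *) missing="${missing:+$missing }$name" ;;  # absent, remember it
    esac
  done
  printf '%s\n' "$missing"
}

# Example: only the HALF_OPERATORS guard is set, so three names are reported.
missing_defines "-O2 -D__CUDA_NO_HALF_OPERATORS__"
```

Whether a given project needs all four defines depends on its code and CUDA version; the thread only confirms the two operator guards for cutorch.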

@ckddls1321

ckddls1321 commented Feb 26, 2021

The NVIDIA NGC Docker image targets several GPU architectures. For DeepSpeed, we have to set the arch list explicitly, starting at the Volta architecture.
I built it like this and it works:

TORCH_CUDA_ARCH_LIST="7.0 7.5 8.0" DS_BUILD_OPS=1 DS_BUILD_FUSED_LAMB=1 DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 DS_BUILD_SPARSE_ATTN=1 DS_BUILD_TRANSFORMER=1 DS_BUILD_UTILS=1 python3 setup.py install

atar13 added a commit to tritonuas/obcpp that referenced this issue Apr 10, 2024
not working due to an error compiling half operators in CUDA programs on
github actions torch/cutorch#797

https://github.com/tritonuas/obcpp/actions/runs/8637979133/job/23681389912#step:8:6390
Tyler-Lentz pushed a commit to tritonuas/obcpp that referenced this issue Apr 10, 2024
* nvidia docker image builds

also moved x86 dockerfile to docker folder. also also obcpp segfaults when run in the container. will look into soon

* cuda_check integration test

* docker dir makefile

* build all binaries

also add a comment for where base image came from

* fix build_jetson_image paths

* cuda_check builds and works

* github action to build jetson dockerfile

* pull_request

* github actions might work?

* jetson Docker: compile obcpp and cuda_check

* bespoke github env

* trying to free github runner space

* disable half operators?

* trying dusty-nv's torchvision install

* moved back to run make for torchvision

* jetson docker composes

* disable jetson docker build

not working due to an error compiling half operators in CUDA programs on
github actions torch/cutorch#797

https://github.com/tritonuas/obcpp/actions/runs/8637979133/job/23681389912#step:8:6390

* devcontianer use new x86 tag

* remove Dockerfile.nvidia in favor of Dockerfile.jetson

* instructions for building jetson image

---------

Co-authored-by: tuas-travis-ci <ucsdtuas@gmail.com>