
Error building V2 on Windows with CUDA 11.8 #395

Open
yarin177 opened this issue Jul 29, 2023 · 1 comment
Hey, I tried to install flash-attn with this command: pip install flash-attn --no-build-isolation, on Windows in a Conda env.

I ran into many errors; I uploaded some of the traceback here.
I did manage to successfully install flash-attn==0.2.8.
Preview of the traceback:

Collecting flash-attn
  Downloading flash_attn-2.0.2.tar.gz (4.2 MB)
     ---------------------------------------- 4.2/4.2 MB 2.3 MB/s eta 0:00:00
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Requirement already satisfied: torch in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from flash-attn) (2.0.1+cu118)
Collecting einops (from flash-attn)
  Downloading einops-0.6.1-py3-none-any.whl (42 kB)
     ---------------------------------------- 42.2/42.2 kB ? eta 0:00:00
Requirement already satisfied: packaging in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from flash-attn) (23.1)
Requirement already satisfied: ninja in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from flash-attn) (1.11.1)
Requirement already satisfied: filelock in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from torch->flash-attn) (3.9.0)
Requirement already satisfied: typing-extensions in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from torch->flash-attn) (4.4.0)
Requirement already satisfied: sympy in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from torch->flash-attn) (1.11.1)
Requirement already satisfied: networkx in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from torch->flash-attn) (3.0)
Requirement already satisfied: jinja2 in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from torch->flash-attn) (3.1.2)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from jinja2->torch->flash-attn) (2.1.2)
Requirement already satisfied: mpmath>=0.19 in c:\users\yarin\anaconda3\envs\llama-32k\lib\site-packages (from sympy->torch->flash-attn) (1.2.1)
Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py): started
  Building wheel for flash-attn (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  
  python setup.py bdist_wheel did not run successfully.
  exit code: 1
  
  [25223 lines of output]
  
  
  torch.__version__  = 2.0.1+cu118
  
  
  fatal: not a git repository (or any of the parent directories): .git
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-cpython-311
  creating build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\attention_kernl.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\bert_padding.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\fav2_interface.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attention.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_interface.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton_og.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton_single_query.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton_tmp.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton_tmp_og.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton_varlen.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_blocksparse_attention.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_blocksparse_attn_interface.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\fused_softmax.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\rotary.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\tmp.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn
  creating build\lib.win-amd64-cpython-311\flash_attn\layers
  copying flash_attn\layers\patch_embed.py -> build\lib.win-amd64-cpython-311\flash_attn\layers
  copying flash_attn\layers\rotary.py -> build\lib.win-amd64-cpython-311\flash_attn\layers
  copying flash_attn\layers\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\layers
  creating build\lib.win-amd64-cpython-311\flash_attn\losses
  copying flash_attn\losses\cross_entropy.py -> build\lib.win-amd64-cpython-311\flash_attn\losses
  copying flash_attn\losses\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\losses
  creating build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\bert.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\falcon.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\gpt.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\gptj.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\gpt_neox.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\llama.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\opt.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\vit.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  creating build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\block.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\embedding.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\mha.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\mlp.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  creating build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\activations.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\fused_dense.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\gelu_activation.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\layer_norm.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\rms_norm.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  creating build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\benchmark.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\distributed.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\generation.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\pretrained.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  running build_ext
  C:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\utils\cpp_extension.py:359: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified
    warnings.warn(f'Error checking compiler version for {compiler}: {error}')
  building 'flash_attn_2_cuda' extension
  creating C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311
  creating C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release
  creating C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\csrc
  creating C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\csrc\flash_attn
  creating C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\csrc\flash_attn\src
  Emitting ninja build file C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  [1/33] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj.d --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\flash_attn -IC:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\flash_attn\src -IC:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\cutlass\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include\TH -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include" -IC:\Users\Yarin\anaconda3\envs\llama-32k\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.36.32532\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.36.32532\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows 
Kits\10\\include\10.0.22000.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\flash_attn\src\flash_bwd_hdim128_fp16_sm80.cu -o C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  FAILED: C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/build/temp.win-amd64-cpython-311/Release/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj
  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\nvcc --generate-dependencies-with-compile --dependency-output C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj.d --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\flash_attn -IC:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\flash_attn\src -IC:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\cutlass\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include\TH -IC:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include" -IC:\Users\Yarin\anaconda3\envs\llama-32k\include -IC:\Users\Yarin\anaconda3\envs\llama-32k\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.36.32532\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.36.32532\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows 
Kits\10\\include\10.0.22000.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.22000.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -c C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\csrc\flash_attn\src\flash_bwd_hdim128_fp16_sm80.cu -o C:\Users\Yarin\AppData\Local\Temp\pip-install-8ikwb3vg\flash-attn_d30e4bdc739f4aed9856b2a5cde670a1\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
  flash_bwd_hdim128_fp16_sm80.cu
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_BFLOAT16_CONVERSIONS__' with '/U__CUDA_NO_BFLOAT16_CONVERSIONS__'
  flash_bwd_hdim128_fp16_sm80.cu
  C:/Users/Yarin/anaconda3/envs/llama-32k/Lib/site-packages/torch/include\c10/macros/Macros.h(138): warning C4067: unexpected tokens following preprocessor directive - expected a newline
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_BFLOAT16_CONVERSIONS__' with '/U__CUDA_NO_BFLOAT16_CONVERSIONS__'
  flash_bwd_hdim128_fp16_sm80.cu
  C:/Users/Yarin/anaconda3/envs/llama-32k/Lib/site-packages/torch/include\c10/macros/Macros.h(138): warning C4067: unexpected tokens following preprocessor directive - expected a newline
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF2_OPERATORS__' with '/U__CUDA_NO_HALF2_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_BFLOAT16_CONVERSIONS__' with '/U__CUDA_NO_BFLOAT16_CONVERSIONS__'
  flash_bwd_hdim128_fp16_sm80.cu
  C:/Users/Yarin/anaconda3/envs/llama-32k/Lib/site-packages/torch/include\c10/macros/Macros.h(138): warning C4067: unexpected tokens following preprocessor directive - expected a newline
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(299): error: identifier "not" is undefined
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(299): error: expected a ")"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(299): error: expected a "," or ">"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(299): error: the global scope has no "type"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(299): error: expected an identifier
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(300): error: identifier "__host__" is undefined
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(300): error: expected a ";"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/numeric/math.hpp(319): warning #12-D: parsing restarts here after previous syntax error
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(126): error: identifier "not" is undefined
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(126): error: expected a ")"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(126): error: expected a "," or ">"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(126): error: the global scope has no "type"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(126): error: expected an identifier
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(127): error: identifier "__host__" is undefined
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(127): error: expected a ";"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(149): warning #12-D: parsing restarts here after previous syntax error
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(177): error: identifier "not" is undefined
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(177): error: expected a ")"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(177): error: expected a "," or ">"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(177): error: the global scope has no "type"
  
  C:/Users/Yarin/AppData/Local/Temp/pip-install-8ikwb3vg/flash-attn_d30e4bdc739f4aed9856b2a5cde670a1/csrc/cutlass/include\cute/container/array_subbyte.hpp(177): error: expected an identifier
    File "C:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\setuptools\_distutils\command\build_ext.py", line 548, in build_extension
      objects = self.compiler.compile(
                ^^^^^^^^^^^^^^^^^^^^^^
    File "C:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\utils\cpp_extension.py", line 815, in win_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "C:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\utils\cpp_extension.py", line 1574, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "C:\Users\Yarin\anaconda3\envs\llama-32k\Lib\site-packages\torch\utils\cpp_extension.py", line 1909, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for flash-attn
  Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects

snykral commented Jul 31, 2023

#345

@grimulkan grimulkan mentioned this issue Sep 22, 2023