Rotary Emb compile fail under Ubuntu 22.04 with gcc/g++ v12 installed #484
Comments
Yeah idk, maybe the compiler version is too new. It's erroring on some pybind11 code :D |
I installed gcc-10 as per NVlabs/instant-ngp#119: $ sudo apt install gcc-10 g++-10. This worked for me |
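For anyone else landing here, a minimal sketch of that workaround, assuming a stock Ubuntu 22.04 box where gcc-12/g++-12 are the defaults. The update-alternatives priorities and the per-build CC/CXX override are illustrative choices of mine, not part of the linked instructions:

```bash
# Install GCC 10 alongside the system GCC 12.
sudo apt install gcc-10 g++-10

# Option A: make gcc-10/g++-10 the default compilers that nvcc picks up from PATH.
# The priority numbers (100 vs 50) are arbitrary; the higher one wins in auto mode.
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 50
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 100
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 50
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 100

# Option B: point only this build at GCC 10 via environment variables
# (whether the extension build honors CC/CXX depends on the build system).
CC=gcc-10 CXX=g++-10 MAX_JOBS=2 \
  pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
```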
FYI, this happens on the Lambda stack, which means anyone trying to use rotary on Lambda H100s is in for a bad time. I tried the above ^ and it didn't fix the problem for me (the ln commands didn't work because the files weren't where those instructions expected them to be). |
Yes, CUDA 12.0 and 12.1's nvcc compiler cannot compile pybind11 2.11.1. Specifically, it chokes on the cast_op helper in pybind11/detail/cast.h. The fix is simple (thanks @archibate): change the return statement

- return caster.operator typename make_caster<T>::template cast_op_type<T>();
+ return caster;

So, you just need to find the cast.h that your build actually includes (the exact path depends on where pybind11 lives in your environment), make that one-line change, then try compiling rotary-emb again:

MAX_JOBS=2 pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary

And voilà, it builds. |
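A small sketch of how one might locate the cast.h that actually gets included; these commands are my own suggestion rather than part of the comment above, and which copy matters (a pip-installed pybind11 vs. the headers bundled with torch) depends on your environment:

```bash
# Header directory of a pip-installed pybind11, if you have one.
python -c "import pybind11; print(pybind11.get_include())"

# PyTorch ships its own pybind11 headers; extension builds often use these.
python -c "import torch, os; print(os.path.join(os.path.dirname(torch.__file__), 'include'))"

# List every pybind11 cast.h under site-packages, then patch the copy your build uses.
find "$(python -c 'import site; print(site.getsitepackages()[0])')" -path '*pybind11/detail/cast.h'
```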
Thanks for that! Is this something you'd expect one of pybind11 or CUDA to fix at some point? |
@Birch-san thank you! fixed it for me as well, really appreciate it |
@andersonbcdefg yes, CUDA 12.2's nvcc compiler can now compile pybind11. |
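In other words, the patch should only be needed on CUDA 12.0/12.1 toolchains; a quick check (just reading the version string nvcc prints, nothing specific to this thread):

```bash
# CUDA 12.0/12.1 nvcc hits the pybind11 cast.h error; 12.2+ reportedly does not.
nvcc --version | grep release
```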
flash_attn core compiled correctly, but a runtime error asks me to compile the rotary module for Llama 2. However, that compilation fails on Ubuntu 22.04 with CUDA 12.1, the PyTorch nightly for CUDA 12.1, and gcc/g++ 12.
Thanks for any pointers. I am scratching my head on this one.
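A hedged sketch of how one might capture the failing build for debugging; the verbose flag and log capture are my own addition on top of the install command already quoted in this thread:

```bash
# Rerun the rotary build with verbose output so the underlying nvcc/pybind11
# error is visible, and keep a log to paste into bug reports.
MAX_JOBS=2 pip install -v \
  git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary 2>&1 | tee rotary_build.log
```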