[Installation]: vllm on NVIDIA jetson AGX orin #5640
Comments
I suggest you contact your support team. You are using a custom-built PyTorch, about which we can answer only very limited questions. |
Thanks, will do. But does anyone have a rough idea what might have caused this? |
Same issue here. Here is some software info:
|
Same problem here. Using To work around this, I attempted to install vllm-0.2.4, which only requires PyTorch 2.1.0. However, I encountered an |
How can this be solved? |
How do I install vllm on Jetson? |
Did you resolve that? |
@youkaichao Hey, I ran into the same problem. Could you tell me whether it is feasible to run vllm on Jetson AGX Orin (aarch64), even if I only call some classes (configuration classes and operators) from it? If so, how should I install it? |
@KungFuPandaPro @walker-ai |
Thank you for your time. I'm trying to install it using the method you mentioned here, but I encountered some errors. Could you share the environment you're using, such as the JetPack, PyTorch, and vllm versions? I'm afraid it's a version inconsistency. |
@walker-ai could you provide further details of the error? My project is already completed, and I do not have access to the environment anymore, but maybe your error was one of the ones I have also encountered along the way |
When I type:
# cd /workspace/vllm/
python setup.py develop
it shows:
Error: could not find CMAKE_PROJECT_NAME in Cache
Traceback (most recent call last):
File "setup.py", line 486, in <module>
setup(
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/__init__.py", line 117, in setup
return distutils.core.setup(**attrs)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 183, in setup
return run_commands(dist)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 199, in run_commands
dist.run_commands()
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
self.run_command(cmd)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/dist.py", line 950, in run_command
super().run_command(command)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/command/develop.py", line 35, in run
self.install_for_development()
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/command/develop.py", line 112, in install_for_development
self.run_command('build_ext')
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/dist.py", line 950, in run_command
super().run_command(command)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "setup.py", line 243, in run
super().run()
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 98, in run
_build_ext.run(self)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
self.build_extensions()
File "setup.py", line 217, in build_extensions
subprocess.check_call(["cmake", *build_args], cwd=self.build_temp)
File "/home/orin/tools/anaconda3/envs/tmp/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '-j=8', '--target=_moe_C', '--target=vllm_flash_attn_c', '--target=_C']' returned non-zero exit status 1.
Then I tried to get more detail from the cmake error log:
cmake --build . -j8 --target=_moe_C --verbose
It shows (in part):
[1/5] Building CUDA object CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o
FAILED: CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o
ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DPy_LIMITED_API=3 -DTORCH_EXTENSION_NAME=_moe_C -D_moe_C_EXPORTS -I/home/orin/tools/vllm/csrc -isystem /home/orin/tools/anaconda3/envs/tmp/include/python3.8 -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda/include -DONNX_NAMESPACE=onnx_c2 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O2 -g -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DENABLE_FP8 --threads=1 -DENABLE_SCALED_MM_C2X=1 -D_GLIBCXX_USE_CXX11_ABI=1 -gencode arch=compute_86,code=sm_86 -MD -MT CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o -MF CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o.d -x cu -c /home/orin/tools/vllm/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu -o CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4b8.cu.o
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: qualified name is not allowed
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: explicit type is missing ("int" assumed)
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: expected a ";"
...
[2/5] Building CUDA object CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o
FAILED: CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o
ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DPy_LIMITED_API=3 -DTORCH_EXTENSION_NAME=_moe_C -D_moe_C_EXPORTS -I/home/orin/tools/vllm/csrc -isystem /home/orin/tools/anaconda3/envs/tmp/include/python3.8 -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include -isystem /home/orin/tools/anaconda3/envs/tmp/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda/include -DONNX_NAMESPACE=onnx_c2 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O2 -g -DNDEBUG -std=c++17 -Xcompiler=-fPIC -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DENABLE_FP8 --threads=1 -DENABLE_SCALED_MM_C2X=1 -D_GLIBCXX_USE_CXX11_ABI=1 -gencode arch=compute_86,code=sm_86 -MD -MT CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o -MF CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o.d -x cu -c /home/orin/tools/vllm/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu -o CMakeFiles/_moe_C.dir/csrc/moe/marlin_kernels/marlin_moe_kernel_ku4.cu.o
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: qualified name is not allowed
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: explicit type is missing ("int" assumed)
/home/orin/tools/vllm/csrc/core/scalar_type.hpp(215): error: expected a ";"
|
Ok, I did not encounter this error.
But your error seems more like a cmake problem. I guess you already checked the Install Guide, which got updated a lot since I did my installation. Hope you can fix it 👍 |
Thank you for providing this information, I will try that :) |
I'm currently updating the nixpkgs build definition for vLLM from 0.5.3.post1 to 0.6.3.post1. The build works well on an x86_64 machine with an NVIDIA GPU, and I have successfully built it for an aarch64 Jetson Orin AGX with the correct dependency and CUDA library versions. Unfortunately, at runtime on the Jetson, vLLM fails to start the engine process:
Have traced this back to the root cause - newer versions of vLLM use NVML to detect CUDA support. I'd be happy to open a PR to add a fallback mode on CUDA platforms where NVML isn't supported, if the maintainers would be open to it? Otherwise I will have to maintain a patch downstream. |
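For readers following along, the fallback idea in the comment above could look roughly like the sketch below. It is not vLLM's actual detection code: the function name `detect_cuda_platform` and its return strings are hypothetical, and the only assumption about the environment is that `pynvml` and `torch` may or may not be importable. It probes NVML first and, if that fails for any reason, falls back to asking PyTorch.

```python
def detect_cuda_platform() -> str:
    """Return "nvml", "torch", or "none" depending on which probe succeeds.

    Hypothetical helper for illustration only; not part of vLLM.
    """
    # First probe: NVML through the pynvml bindings (may be missing or unsupported).
    try:
        import pynvml

        pynvml.nvmlInit()
        try:
            if pynvml.nvmlDeviceGetCount() > 0:
                return "nvml"
        finally:
            pynvml.nvmlShutdown()
    except Exception:
        # Import error, missing NVML library, or an NVML error: try the fallback.
        pass

    # Fallback probe: ask PyTorch whether it was built with CUDA and can see a device.
    try:
        import torch

        if torch.version.cuda is not None and torch.cuda.is_available():
            return "torch"
    except Exception:
        pass

    return "none"


if __name__ == "__main__":
    print("CUDA detected via:", detect_cuda_platform())
```

On a machine where NVML is not usable but PyTorch sees the GPU, this sketch would report `torch` instead of failing outright, which is the behaviour the proposed fallback mode is after.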
Your current environment
How you are installing vllm
Hi,
I'm trying to install vllm on my Jetson AGX Orin developer kit.
I'm using the following image: nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3
and I get this error when I run
pip install vllm
Note the error message
Unknown runtime environment
I figured out that this is thrown here https://github.com/vllm-project/vllm/blob/main/setup.py#L347 due to
torch.version.cuda
being None
However, when I start python3 and try verifying the CUDA availability,
Any help would be appreciated. Thanks
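A small diagnostic along these lines can help separate the two values discussed above: the CUDA version PyTorch was built against (`torch.version.cuda`) and whether a device is usable at runtime (`torch.cuda.is_available()`). This is only a sketch; the helper name `describe_torch_cuda` and the dictionary keys are made up for illustration, and it assumes nothing beyond PyTorch possibly being installed.

```python
import importlib.util


def describe_torch_cuda() -> dict:
    """Collect the PyTorch CUDA facts relevant to the setup.py check.

    Hypothetical helper for illustration only.
    """
    info = {
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,  # value of torch.version.cuda
        "cuda_available": None,   # value of torch.cuda.is_available()
    }
    if importlib.util.find_spec("torch") is None:
        return info
    try:
        import torch
    except Exception:
        # PyTorch is present on disk but cannot be imported.
        return info

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["built_with_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

Running it both in the interactive interpreter and in the environment that pip uses for the build may show whether the two are looking at different PyTorch installations.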