"Fast inference with vLLM (Llama 2 13B)" example is broken #463
Comments
irfansharif added a commit that referenced this issue on Oct 13, 2023:

Fixes #463. PyTorch 2.1.0 (https://github.com/pytorch/pytorch/releases/tag/v2.1.0) was released just last week, and it's built against CUDA 12.1. The image we're using is built on CUDA 11.8, as recommended by vLLM. Previously vLLM specified a dependency on torch>=2.0.0 and picked up this 2.1.0 version. That was pinned back to 2.0.1 in vllm-project/vllm#1290. When picking up that SHA, however, we ran into what vllm-project/vllm#1239 fixes. So for now, point to a temporary fork with that fix.
Merged
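For readers hitting the same mismatch, here is a minimal sketch of the kind of pin the commit message describes, using Modal's `Image.from_registry` and `pip_install` API. The CUDA tag, Python version, and vLLM install source below are illustrative assumptions, not the exact change made in the referenced commit.

```python
import modal

image = (
    # Base image built on CUDA 11.8, matching what vLLM recommended at the
    # time of this issue; the exact registry tag here is an assumption.
    modal.Image.from_registry(
        "nvidia/cuda:11.8.0-devel-ubuntu22.04", add_python="3.10"
    )
    # Pin torch to 2.0.1 up front so vLLM's torch>=2.0.0 constraint doesn't
    # later pull in 2.1.0, which is built against CUDA 12.1 and mismatches
    # the CUDA 11.8 base image above.
    .pip_install("torch==2.0.1")
    # Install vLLM from a source that carries the torch==2.0.1 pin (per the
    # commit message, a temporary fork); this URL is a placeholder.
    .pip_install("vllm @ git+https://github.com/vllm-project/vllm@main")
)
```

The key point is only that the torch wheel and the base image agree on the CUDA major version; once they do, the example's inference code runs unchanged.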
irfansharif added a commit that referenced this issue on Oct 14, 2023 (same commit message as above).
gongy pushed a commit that referenced this issue on Jan 5, 2024 (same commit message as above).
As per https://modal.com/docs/guide/ex/vllm_inference, I ran:
Here's the error that I got: