Issues: vllm-project/vllm
[Bug]: The output of Aria model is not correct · bug · #12241 · opened Jan 21, 2025 by xffxff
[Usage]: how to input messages as multi-message (a batch) instead of just one · usage · #12234 · opened Jan 21, 2025 by Hyfred
[Bug]: RuntimeError: Error in model execution: CUDA error: an illegal memory access was encountered · bug · #12233 · opened Jan 21, 2025 by Quang-elec44
[New Model]: Add support for DeepSeek R1 · new model · #12226 · opened Jan 20, 2025 by jorgeantonio21
[Usage]: Guided choice not working as expected · usage · #12225 · opened Jan 20, 2025 by srsingh24
[Usage]: Context window crashes web window when full · usage · #12221 · opened Jan 20, 2025 by seabastard
[Feature]: SwiftKV cache compression · feature request · #12220 · opened Jan 20, 2025 by arunpatala
[Bug]: ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected. Please check the logs for more details. · bug · #12219 · opened Jan 20, 2025 by walker-ai
[Usage]: BNB quantization not supported for Paligemma2 model · usage · #12216 · opened Jan 20, 2025 by ken2190
[Usage]: how to generate results and get the embeddings of the result · usage · #12213 · opened Jan 20, 2025 by daiwk
[Usage]: what is the most efficient way to do with a 72b model and 8 * A100 ? · usage · #12205 · opened Jan 20, 2025 by Chandler-Bing
[Usage]: Does vLLM support deploying the speculative model on a second device? · usage · #12200 · opened Jan 20, 2025 by CharlesRiggins
[Bug]: Inconsistent data received and sent using PyNcclPipe · bug · #12197 · opened Jan 20, 2025 by fanfanaaaa
[Bug]: CUDA initialization error with vLLM 0.5.4 and PyTorch 2.4.0+cu121 · bug · #12189 · opened Jan 19, 2025 by TaoShuchang
[Bug]: Fail to use beamsearch with llm.chat · bug · #12183 · opened Jan 18, 2025 by gystar
[Bug]: Multi-Node Online Inference on TPUs Failing · bug · #12179 · opened Jan 17, 2025 by BabyChouSr
[Bug]: AMD GPU docker image build No matching distribution found for torch==2.6.0.dev20241113+rocm6.2 · bug, rocm · #12178 · opened Jan 17, 2025 by samos123
[Bug]: Slow huggingface weights download. Sequential download · bug · #12177 · opened Jan 17, 2025 by NikolaBorisov
[RFC]: Distribute LoRA adapters across deployment · RFC · #12174 · opened Jan 17, 2025 by joerunde
[Feature]: Serve /metrics while a model is loading · feature request · #12173 · opened Jan 17, 2025 by xfalcox
[Bug]: Issue running the Granite-7b GGUF quantized model on multiple GPUs with vLLM due to a tensor size mismatch. · bug · #12170 · opened Jan 17, 2025 by tarukumar
[New Model]: openbmb/MiniCPM-o-2_6 · new model · #12162 · opened Jan 17, 2025 by myoss
[Usage]: Terminates without any error 30 seconds after a successful run. · usage · #12160 · opened Jan 17, 2025 by hznnnnnn
[Feature]: Any plan to support key features of nanoflow? · feature request · #12157 · opened Jan 17, 2025 by dwq370
[Bug]: After updating VLLM from 0.4.0.post1 to 0.6.4, the model loading time increased by one minute. · bug · #12155 · opened Jan 17, 2025 by 123qwe-ux