[Usage]: how to use openai compatible api to run GGUF model? #8401
Comments
Hey @weiminw, let me know if you are seeing something else.
Hi,
the GGUF format already contains all the metadata, doesn't it?
The same holds for me, as also described here. When trying to load a GGUF model, e.g. https://huggingface.co/bartowski/reader-lm-1.5b-GGUF , vLLM requires a config.json and fails with:
OSError: /reader-lm-1.5b-GGUF/ does not appear to have a file named config.json. Checkout 'https://huggingface.co//u01/app/mlo/models/reader-lm-1.5b-GGUF//tree/None' for available files.
Hey @paolovic, yes, this error occurs because vLLM is currently not looking for the config inside the GGUF file when you pass a directory; you have to point model at the .gguf file itself and pass a tokenizer separately. I have tested this with llm = LLM(model="/path/to/model.gguf", tokenizer="/path/to/tokenizer")
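Putting that workaround into a self-contained script, a minimal sketch for offline inference (the paths are placeholders; pointing the tokenizer at the original unquantized model is an assumption based on this thread):

```python
# Sketch of the workaround above: point vLLM at the .gguf file directly
# and supply a tokenizer separately, since vLLM does not find a
# config.json inside a GGUF directory. Paths are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/path/to/model.gguf",     # the .gguf file itself, not its directory
    tokenizer="/path/to/tokenizer",  # e.g. the original (unquantized) model
)

outputs = llm.generate(
    ["What is the GGUF format?"],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```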
ahhhh....ok, easy |
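For the OpenAI-compatible API asked about in the title, the same workaround should apply to the server: a hedged sketch, assuming the server is launched with `vllm serve` on the default port and that no --served-model-name or --api-key is set (all paths are placeholders):

```python
# Assumed server launch (shell), mirroring the offline workaround:
#   vllm serve /path/to/model.gguf --tokenizer /path/to/tokenizer
# The served model name defaults to the model path; it changes if you
# pass --served-model-name. Query the server with the OpenAI client:
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's default OpenAI-compatible endpoint
    api_key="EMPTY",                      # any value works unless --api-key is set
)

completion = client.chat.completions.create(
    model="/path/to/model.gguf",          # must match the served model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```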
Your current environment
How would you like to use vllm
I want to run inference of a [specific model](put link here). I don't know how to integrate it with vllm.
Before submitting a new issue...