Query vLLM OpenAI /models endpoint to get model name and context window #1632
Conversation
❌ Deploy Preview for continuedev failed.
sorry I messed something up, I only intended to include my one change in the OpenAI.ts file
@simon376 I think this might be a good enough reason to make a subclass of the OpenAI class (we do this with Deepseek, Groq, and others for example). Everything can be the same except that there doesn't need to be a URL check to find out whether it is an instance of vLLM
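A minimal sketch of what such a subclass could look like, following the provider pattern described above. The import path, field names, and registration mechanism are assumptions for illustration, not the actual implementation in the repo:

```typescript
// Hypothetical sketch only: a vLLM provider that reuses the OpenAI-compatible
// logic by subclassing. The import path, the providerName field, and the way
// providers are registered are assumptions and may not match the real codebase.
import OpenAI from "./OpenAI";

class Vllm extends OpenAI {
  // Assumed provider identifier; the real registration mechanism used by
  // Deepseek, Groq, etc. should be followed instead.
  static providerName = "vllm";

  // Since this class is only ever constructed for vLLM servers, the /models
  // query can run unconditionally, with no apiBase URL check.
}

export default Vllm;
```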
Hey @simon376 , any progress on this? Would love to get it shipped!
@sestinj I've created a new vLLM subclass and added documentation for it. Sorry for the whitespace changes in OpenAI.ts, couldn't figure out how to get rid of them
Great, no worries about the whitespace, this looks perfect
Hi @sestinj what version of the extension has this feature been integrated into? |
Description
Similar to #755, the OpenAI-compatible API server by vLLM exposes both the model name via id and the context length via max_model_len, which can be used to automatically set up the context length of the deployed LLM. I'm not sure how best to check whether the endpoint should be queried, e.g. by checking the API base URL, or by introducing a new apiType besides Azure, etc.
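For reference, a hedged sketch of what querying the endpoint could look like. The helper name, apiBase placeholder, and response typing are illustrative assumptions; max_model_len is a vLLM-specific field and is not part of the standard OpenAI /models schema:

```typescript
// Sketch of querying a vLLM server's OpenAI-compatible /v1/models endpoint
// to discover the served model name (id) and its context window
// (max_model_len). URLs and the helper name are placeholders.
interface VllmModelCard {
  id: string;
  max_model_len?: number; // vLLM-specific; absent on other OpenAI-compatible servers
}

async function fetchVllmModelInfo(
  apiBase: string,
  apiKey?: string,
): Promise<VllmModelCard | undefined> {
  const resp = await fetch(new URL("models", apiBase), {
    headers: apiKey ? { Authorization: `Bearer ${apiKey}` } : {},
  });
  if (!resp.ok) {
    return undefined;
  }
  const body = (await resp.json()) as { data?: VllmModelCard[] };
  // vLLM typically serves a single model, so take the first entry.
  return body.data?.[0];
}

// Example usage (URL is a placeholder, note the trailing slash):
// const info = await fetchVllmModelInfo("http://localhost:8000/v1/");
// if (info?.max_model_len) { /* use it as the context length */ }
```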
Checklist
- The base branch of this PR is dev, rather than main
- Tested using the pre-release (preview) version of the extension, as described in your contributing guidelines
References
The context length (max_model_len) is currently handled separately (see load.ts:244). So it may be better to adjust the listModels() interface to include model parameters in its response.
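A rough sketch of the kind of interface change this would imply. The type and field names here are hypothetical and do not reflect the project's actual API:

```typescript
// Hypothetical shape of a richer listModels() result that carries model
// parameters alongside the name, so callers could pick up the context
// window without a second request.
interface ListedModel {
  name: string;
  // Optional provider-reported parameters, e.g. vLLM's max_model_len.
  contextLength?: number;
}

interface ModelLister {
  // Conceptually this returns string[] today; returning objects instead
  // would let model parameters flow through to model setup.
  listModels(): Promise<ListedModel[]>;
}
```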