Query vLLM OpenAI /models endpoint to get model name and context window #1632
Conversation
❌ Deploy Preview for continuedev failed.
sorry I messed something up, I only intended to include my one change in the OpenAI.ts file
@simon376 I think this might be a good enough reason to make a subclass of the OpenAI class (we do this with Deepseek, Groq, and others for example). Everything can be the same except that there doesn't need to be a URL check to find out whether it is an instance of vLLM
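A minimal sketch of what such a subclass could look like, following the provider pattern described above. The import path, field names, and registration mechanism are assumptions for illustration, not the actual implementation in the repo:

```typescript
// Hypothetical sketch only: a vLLM provider that reuses the OpenAI-compatible
// logic by subclassing. The import path, the providerName field, and the way
// providers are registered are assumptions and may not match the real codebase.
import OpenAI from "./OpenAI";

class Vllm extends OpenAI {
  // Assumed provider identifier; the real registration mechanism used by
  // Deepseek, Groq, etc. should be followed instead.
  static providerName = "vllm";

  // Since this class is only ever constructed for vLLM servers, the /models
  // query can run unconditionally, with no apiBase URL check.
}

export default Vllm;
```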
Hey @simon376 , any progress on this? Would love to get it shipped!
@sestinj I've created a new vLLM subclass and added documentation for it. Sorry for the whitespace changes in OpenAI.ts, couldn't figure out how to get rid of them
Great, no worries about the whitespace, this looks perfect
Hi @sestinj what version of the extension has this feature been integrated into? |
Description
Similar to #755, the OpenAI-compatible API server by vLLM exposes both the model name via id and the context length via max_model_len, which can be used to automatically set up the context length of the deployed LLM. I'm not sure how best to check whether the endpoint should be queried, e.g. by checking the API base URL, or by introducing a new apiType besides Azure, etc.
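For reference, a hedged sketch of what querying the endpoint could look like. The helper name, apiBase placeholder, and response typing are illustrative assumptions; max_model_len is a vLLM-specific field and is not part of the standard OpenAI /models schema:

```typescript
// Sketch of querying a vLLM server's OpenAI-compatible /v1/models endpoint
// to discover the served model name (id) and its context window
// (max_model_len). URLs and the helper name are placeholders.
interface VllmModelCard {
  id: string;
  max_model_len?: number; // vLLM-specific; absent on other OpenAI-compatible servers
}

async function fetchVllmModelInfo(
  apiBase: string,
  apiKey?: string,
): Promise<VllmModelCard | undefined> {
  const resp = await fetch(new URL("models", apiBase), {
    headers: apiKey ? { Authorization: `Bearer ${apiKey}` } : {},
  });
  if (!resp.ok) {
    return undefined;
  }
  const body = (await resp.json()) as { data?: VllmModelCard[] };
  // vLLM typically serves a single model, so take the first entry.
  return body.data?.[0];
}

// Example usage (URL is a placeholder, note the trailing slash):
// const info = await fetchVllmModelInfo("http://localhost:8000/v1/");
// if (info?.max_model_len) { /* use it as the context length */ }
```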
Checklist
- The base branch of this PR is dev, rather than main
- Tested using the pre-release (preview) version of the extension, as described in your contributing guidelines
References
The context length (max_model_len) is currently handled separately (see load.ts:244). So it may be better to adjust the listModels() interface to include model parameters in its response.
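A rough sketch of the kind of interface change this would imply. The type and field names here are hypothetical and do not reflect the project's actual API:

```typescript
// Hypothetical shape of a richer listModels() result that carries model
// parameters alongside the name, so callers could pick up the context
// window without a second request.
interface ListedModel {
  name: string;
  // Optional provider-reported parameters, e.g. vLLM's max_model_len.
  contextLength?: number;
}

interface ModelLister {
  // Conceptually this returns string[] today; returning objects instead
  // would let model parameters flow through to model setup.
  listModels(): Promise<ListedModel[]>;
}
```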