Model: Llama-2-chat-hf
The current implementation of vLLM returns finish_reason as 'length', while the native model supports a context length of 4096 (and works well with the contexts we've tested it with). Is there an option to change the native context length supported by the vLLM instance?
I've retried the experiments with the latest release and the issue still persists.
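For reference, a minimal sketch of how we hit this with the offline LLM API; the model name, prompt, and max_tokens value below are placeholders rather than our exact settings:

```python
from vllm import LLM, SamplingParams

# Placeholder setup, not the exact experiment; it only illustrates where
# finish_reason comes back as 'length'.
llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")

# finish_reason == 'length' is reported when generation stops because the
# max_tokens budget (or the model's context window) is exhausted, rather
# than because an EOS/stop token was produced.
params = SamplingParams(temperature=0.7, max_tokens=512)

outputs = llm.generate(["<long prompt close to 4k tokens>"], params)
print(outputs[0].outputs[0].finish_reason)  # 'length' in the failing runs
print(outputs[0].outputs[0].text)
```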
YaRN model support was already merged in 0.2.1. There have also been recently merged PRs, such as #1510, for longer context length support; it is just not well documented yet.
I have previously tested YaRN models at up to 25k model length and they worked well.
Depending on your use case, you may be able to use models like Code Llama (16k) or Mistral (8k). I tested Code Llama up to the full 16k and it worked well, even for reasoning.
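If it helps, here is a rough sketch of how a longer-context model can be served; treat the model name and the 16384 value as examples only, and check that the max_model_len argument is available in your vLLM version:

```python
from vllm import LLM, SamplingParams

# Example only: a 16k Code Llama checkpoint with the per-request limit
# raised to match. max_model_len caps prompt + generated tokens together.
llm = LLM(
    model="codellama/CodeLlama-7b-Instruct-hf",
    max_model_len=16384,
)

params = SamplingParams(temperature=0.2, max_tokens=1024)
out = llm.generate(["<long prompt>"], params)
print(out[0].outputs[0].finish_reason)
```

If you are running the OpenAI-compatible server instead, the same limit can be passed on the command line as --max-model-len.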