Adjust the maximum input length / chunk inputs #11
Comments
Thank you for the kind words.
Nonetheless, I can add a MAXLENGTH env parameter.
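A minimal sketch of what such an environment override could look like, assuming the demo wraps a vLLM-style `LLM` class (the `MAXLENGTH` name, the model ID, and the `max_model_len` wiring here are illustrative, not the project's actual code):

```python
import os

from vllm import LLM  # assumed backend; the demo may wrap its own class

# Hypothetical: read the maximum input length from the environment,
# falling back to the current 2048-token limit.
MAXLENGTH = int(os.environ.get("MAXLENGTH", "2048"))

llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder model ID
    max_model_len=MAXLENGTH,                     # caps prompt + generated tokens
)
```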
Hey David, thanks for taking the time to respond. It may sound a little crazy at first, but my most common use case for RAG is code repos, which usually range from 12-32k tokens. I was keen on trying this out with them and comparing it to having the repo loaded completely into context. Thought it'd be an interesting experiment! I'm running it through Docker. |
Got it. I'll use this issue to add the new parameter. Obviously it will depend on having an LLM that accepts a context that size and enough GPU memory. |
Yeah, for sure. For most other AI/LLM projects I use either Ollama or ExLlamaV2 (via TabbyAPI) and quantise the k/v cache to q8_0, so I regularly run 8-22B models with 32-64K context sizes (1x 3090 + 2x A4000), which is incredibly useful! |
This has been added in. If you want to try it before the next Docker build, you can copy over the updated rag.py and rebuild the Docker image. |
How would you change the LLM to Llama, for example? I'm a bit confused, as it seems like the application has a built-in LLM. Under rag.py it does self.llm = LLM(...) and passes mistral7b here. Where is it getting this from?
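If that call is vLLM's `LLM` class (an assumption based on the `LLM(...)` signature), the argument is just a Hugging Face model ID that gets downloaded at startup, so switching to a Llama model could look roughly like this (the `LLM_MODEL` environment variable and the model IDs are hypothetical):

```python
import os

from vllm import LLM  # assuming the demo builds on vLLM


class RAG:
    def __init__(self) -> None:
        # Hypothetical LLM_MODEL override: any Hugging Face model ID works here,
        # e.g. "meta-llama/Meta-Llama-3-8B-Instruct" instead of the Mistral default.
        model_name = os.environ.get("LLM_MODEL", "mistralai/Mistral-7B-Instruct-v0.2")
        self.llm = LLM(model=model_name)
```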
I've found that when adding input data of any reasonable length, the demo errors out, and in the console I see that the input is limited to just 2048 tokens.
Is it possible to set the maximum input length at runtime, or perhaps chunk the input data into pieces of <= 2048 tokens?
Neat demo!
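To illustrate the chunking option raised above, here is a minimal token-based splitting sketch (the tokenizer choice and the 2048-token limit are assumptions, not how the demo actually processes input):

```python
from transformers import AutoTokenizer

# Placeholder tokenizer; substitute the tokenizer of whichever model the demo uses.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")


def chunk_text(text: str, max_tokens: int = 2048) -> list[str]:
    """Split text into consecutive pieces of at most max_tokens tokens."""
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    return [
        tokenizer.decode(token_ids[i : i + max_tokens])
        for i in range(0, len(token_ids), max_tokens)
    ]


# Example: each chunk can then be embedded and indexed separately.
chunks = chunk_text("your long repository text here " * 2000)
```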