Potential bug: ConversationSummaryBufferMemory not actually forcing length to be under maxTokenLimit #5044
ConnorLanglois started this conversation in General
Hi,
I was reading the LangChain docs and was curious about how ConversationSummaryBufferMemory works. Looking through the code, it appears that it prunes the oldest chat messages once the buffer exceeds the token limit, and then computes a new summary from the old summary plus those pruned messages (the prune logic from here).
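Roughly, the flow I'm describing looks like the sketch below. This is a paraphrase of my reading, not the actual library source; the property and helper names (`messages`, `movingSummaryBuffer`, `getNumTokens`, `predictNewSummary`) are my assumptions about the relevant pieces.

```typescript
// Paraphrased sketch of the pruning flow as I understand it -- not the actual
// library source. The property/helper names are assumptions for illustration.
interface MemorySketch {
  maxTokenLimit: number;
  movingSummaryBuffer: string;
  messages: string[];
  getNumTokens(text: string): Promise<number>;
  predictNewSummary(pruned: string[], existingSummary: string): Promise<string>;
}

async function pruneSketch(memory: MemorySketch): Promise<void> {
  const bufferText = () =>
    [memory.movingSummaryBuffer, ...memory.messages].join("\n");

  let bufferLength = await memory.getNumTokens(bufferText());
  if (bufferLength <= memory.maxTokenLimit) return;

  // Drop the oldest messages until the remaining buffer fits under the limit...
  const prunedMessages: string[] = [];
  while (bufferLength > memory.maxTokenLimit && memory.messages.length > 0) {
    prunedMessages.push(memory.messages.shift() as string);
    bufferLength = await memory.getNumTokens(bufferText());
  }

  // ...then fold the pruned messages into the running summary. Note that the
  // length of this *new* summary is never re-checked against maxTokenLimit.
  memory.movingSummaryBuffer = await memory.predictNewSummary(
    prunedMessages,
    memory.movingSummaryBuffer
  );
}
```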
However, what if the LLM produces a summary much longer than the old one? Say the pruned messages total length X and the old summary has length Y, and the new summary comes out to length Y + X + 123 (some arbitrary number). The total prompt is then longer than it would have been with the old summary and the original messages, so the next time the LLM is called it will error out for exceeding the maximum input token length (concrete numbers below).
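To make the arithmetic concrete, here is the scenario with made-up token counts:

```typescript
// Hypothetical token counts, purely for illustration.
const maxTokenLimit = 1000;
const oldSummary = 300;        // Y
const prunedMessages = 400;    // X
const remainingMessages = 550;

// The buffer was over the limit, which is what triggered the prune:
console.log(oldSummary + prunedMessages + remainingMessages); // 1250 > 1000

// Suppose the LLM returns a verbose new summary of Y + X + 123 tokens:
const newSummary = oldSummary + prunedMessages + 123; // 823

// The next prompt is now even longer than before the prune, and nothing
// re-checks it against maxTokenLimit:
console.log(newSummary + remainingMessages); // 1373 > 1250 > 1000
```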
Is my understanding correct here? Should this be fixed to forcibly slice the summary, or perhaps to run this function in a loop until the summary is under the required length?