Potential bug: ConversationSummaryBufferMemory not actually forcing length to be under maxTokenLimit #5044
ConnorLanglois started this conversation in General
Hi,
I was reading the LangChain docs and was curious about how ConversationSummaryBufferMemory works. Looking through the code, it appears that it prunes the oldest chat messages once the buffer exceeds the token limit, and then computes a new summary from the old summary plus those pruned messages (the prune logic from here).
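Roughly, the flow I'm describing looks like the sketch below. This is a paraphrase of my reading, not the actual library source; the property and helper names (`messages`, `movingSummaryBuffer`, `getNumTokens`, `predictNewSummary`) are my assumptions about the relevant pieces.

```typescript
// Paraphrased sketch of the pruning flow as I understand it -- not the actual
// library source. The property/helper names are assumptions for illustration.
interface MemorySketch {
  maxTokenLimit: number;
  movingSummaryBuffer: string;
  messages: string[];
  getNumTokens(text: string): Promise<number>;
  predictNewSummary(pruned: string[], existingSummary: string): Promise<string>;
}

async function pruneSketch(memory: MemorySketch): Promise<void> {
  const bufferText = () =>
    [memory.movingSummaryBuffer, ...memory.messages].join("\n");

  let bufferLength = await memory.getNumTokens(bufferText());
  if (bufferLength <= memory.maxTokenLimit) return;

  // Drop the oldest messages until the remaining buffer fits under the limit...
  const prunedMessages: string[] = [];
  while (bufferLength > memory.maxTokenLimit && memory.messages.length > 0) {
    prunedMessages.push(memory.messages.shift() as string);
    bufferLength = await memory.getNumTokens(bufferText());
  }

  // ...then fold the pruned messages into the running summary. Note that the
  // length of this *new* summary is never re-checked against maxTokenLimit.
  memory.movingSummaryBuffer = await memory.predictNewSummary(
    prunedMessages,
    memory.movingSummaryBuffer
  );
}
```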
However, what if the LLM produces a summary much longer than the old one? Say the pruned messages total length X and the old summary has length Y, and the new summary comes out to length Y + X + 123 (some arbitrary number). The total prompt is then longer than it would have been with the old summary and the original messages, so the next time the LLM is called it will error out for exceeding the maximum input token length (concrete numbers below).
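To make the arithmetic concrete, here is the scenario with made-up token counts:

```typescript
// Hypothetical token counts, purely for illustration.
const maxTokenLimit = 1000;
const oldSummary = 300;        // Y
const prunedMessages = 400;    // X
const remainingMessages = 550;

// The buffer was over the limit, which is what triggered the prune:
console.log(oldSummary + prunedMessages + remainingMessages); // 1250 > 1000

// Suppose the LLM returns a verbose new summary of Y + X + 123 tokens:
const newSummary = oldSummary + prunedMessages + 123; // 823

// The next prompt is now even longer than before the prune, and nothing
// re-checks it against maxTokenLimit:
console.log(newSummary + remainingMessages); // 1373 > 1250 > 1000
```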
Is my understanding correct here? Should this be fixed to forcibly slice the summary, or perhaps to run this function in a loop until the summary is under the required length?