
feat: enhance context handling by adding code context selection and implementing summary generation #1091

Merged: 10 commits merged into stackblitz-labs:main from context-selection on Jan 22, 2025

Conversation

@thecodacus (Collaborator) commented Jan 14, 2025

Dynamic LLM Context Optimization Implementation

Overview

This PR implements intelligent context management for LLM interactions by introducing two key optimizations:

  1. Code Context Optimization: Dynamically selects relevant files for the context buffer based on the current conversation
  2. Chat Context Optimization: Summarizes chat history to maintain context while reducing token usage

This PR attempts to solve:
#1041
#1042

Key Changes

1. Context Buffer Management

  • Added new system for selecting relevant files using LLM analysis
  • Implemented file filtering with ignore patterns similar to .gitignore (see the sketch after this list)
  • Limited context buffer to 5 files maximum for efficient token usage
  • Added support for relative and absolute file paths
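
The ignore-pattern filtering can be pictured with a minimal sketch in TypeScript (a hypothetical helper, not the exact code from this PR; `files` is assumed to be the FileMap passed into selectContext):

export function createFileFilter(ignorePatterns: string[]) {
  // Convert each glob-like pattern into a RegExp:
  // "*" matches within a path segment, "**" matches across segments.
  const regexes = ignorePatterns.map((pattern) => {
    const escaped = pattern
      .replace(/[.+^${}()|[\]\\]/g, '\\$&')
      .replace(/\*\*/g, '§§') // placeholder so the "*" rule doesn't clobber "**"
      .replace(/\*/g, '[^/]*')
      .replace(/§§/g, '.*');
    return new RegExp(`(^|/)${escaped}(/|$)`);
  });

  // Keep a file only if no ignore pattern matches its path.
  return (filePath: string) => !regexes.some((re) => re.test(filePath));
}

// Usage: keep only files not matched by the ignore list.
const keep = createFileFilter(['node_modules', 'dist', '*.lock']);
const candidateFiles = Object.keys(files).filter(keep);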

2. Chat Summarization

  • Implemented chat history summarization to compress context while preserving key information
  • Created progress tracking for multi-step context optimization process
  • Added annotation system for tracking context and summary metadata (sketched below)
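
The annotation shapes can be pictured roughly as follows (a sketch; the exact field names beyond `type` are illustrative assumptions, not the PR's definitions):

type ContextAnnotation =
  | { type: 'codeContext'; files: string[] }  // files selected into the context buffer
  | { type: 'chatSummary'; summary: string }; // compressed chat history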

Technical Details

Context Selection System

Key implementation details from select-context.ts:

export async function selectContext(props: {
  messages: Message[];
  env: Env;
  apiKeys?: Record<string, string>;
  files: FileMap;
  providerSettings?: Record<string, IProviderSetting>;
  promptId?: string;
  contextOptimization?: boolean;
  summary: string;
}) {
  // ... initialization (model/provider resolution, building summaryText and processedMessages) ...

  const resp = await generateText({
    // model and provider options omitted for brevity
    system: `
      You are a software engineer. You are working on a project. You need to select files that are relevant to the task from the list of files above.
    `,
    prompt: `
      ${summaryText}
      Users Question: ${processedMessages.filter((x) => x.role === 'user').pop()?.content}
    `,
  });

  // Parse the response for the file-selection block
  const updateContextBuffer = resp.text.match(/<updateContextBuffer>([\s\S]*?)<\/updateContextBuffer>/);
}
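
The matched block is then turned into concrete file paths. A minimal sketch of that step (the `<includeFile>` tag format shown here is an illustrative assumption, not necessarily the PR's exact markup):

// Extract file paths from the matched <updateContextBuffer> block.
const selectedFiles: string[] = [];
if (updateContextBuffer) {
  for (const m of updateContextBuffer[1].matchAll(/<includeFile\s+path="([^"]+)"\s*\/>/g)) {
    selectedFiles.push(m[1]);
  }
}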

Chat Summary Generation

From create-summary.ts:

export async function createSummary(props: {
  messages: Message[];
  env: Env;
  apiKeys?: Record<string, string>;
  providerSettings?: Record<string, IProviderSetting>;
}) {
  // ... initialization (slicedMessages = the recent window of `messages`) ...

  // Messages may carry either a plain string or an array of typed parts.
  const extractTextContent = (message: Message) =>
    Array.isArray(message.content)
      ? (message.content.find((item) => item.type === 'text')?.text as string) || ''
      : message.content;

  const resp = await generateText({
    // model and provider options omitted for brevity
    system: `
      You are a software engineer. You need to summarize the work till now and provide a summary of the chat.
    `,
    prompt: `
      please provide a summary of the chat till now.
      below is the latest chat:
      ${slicedMessages.map((x) => `[${x.role}] ${extractTextContent(x)}`).join('\n')}
    `,
  });

  return resp.text; // the generated summary string
}
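
In the chat route the two pieces compose: the summary is generated first, then fed into file selection. A rough sketch of the flow (argument shapes follow the signatures above; error handling omitted):

// 1. Compress the chat history into a summary.
const summary = await createSummary({ messages, env, apiKeys, providerSettings });

// 2. Use the summary plus the latest user question to pick relevant files.
const context = await selectContext({
  messages,
  env,
  apiKeys,
  files,
  providerSettings,
  contextOptimization: true,
  summary,
});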

Progress Tracking

Implementation of progress annotations in api.chat.ts:

type ProgressAnnotation = {
  type: 'progress';
  value: number;
  message: string;
};

// Usage in stream
dataStream.writeMessageAnnotation({
  type: 'progress',
  value: progressCounter++,
  message: 'Generating Chat Summary'
} as ProgressAnnotation);
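
On the client, progress annotations can then be read off each message to drive a status indicator. A hedged sketch (the annotation access pattern here is an assumption about the streamed message shape, not code from this PR):

// Pick the most recent progress annotation from a message's annotations.
function latestProgress(annotations: unknown[] | undefined): ProgressAnnotation | undefined {
  const progress = (annotations ?? []).filter(
    (a): a is ProgressAnnotation => (a as ProgressAnnotation)?.type === 'progress',
  );
  // The write with the highest counter is the current stage.
  return progress.sort((a, b) => b.value - a.value)[0];
}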

Migration Impact

  • Breaking changes to the chat API response format to include progress and context annotations
  • New type definitions for context and progress annotations
  • Updated stream handling to support multi-stage processing

Future Improvements

  • Add configurable context buffer size limits
  • Add support for custom ignore patterns
  • Add support for partial file content selection

@mrsimpson (Collaborator) left a comment

I have not reviewed the code in detail, but the nicely summarized general design looks like an interesting approach 👍

I wish we had a real context-management API that allows chaining multiple operations that manipulate the context. Your LLM-summarization calls could be steps in it.

I added some questions to the code, but honestly, I'm not really able to fully understand our codebase and which part does what now 😬
So this is mostly me trying to understand better. Hope you don't mind, @thecodacus

Resolved review threads:
  • app/components/chat/AssistantMessage.tsx
  • app/lib/.server/llm/constants.ts
  • app/lib/.server/llm/create-summary.ts (two threads)
  • app/lib/.server/llm/select-context.ts
@leex279 (Collaborator) commented Jan 15, 2025

Does this feature need to be toggled on/off to properly test this PR?
[screenshot]

@thecodacus (Collaborator, Author) commented Jan 15, 2025

Yes; I still made it an optional optimization feature.

@wonderwhy-er (Collaborator) commented

Sorry, I'm a bit out of things at the moment. Got a bit burned out juggling multiple things.

Taking a look now. On the surface it structurally looks good, close to how I was thinking this should work, and we can experiment with alternative approaches of filtering chat and selecting files afterwards.

So far I ran a test of making a snake game with context optimisation enabled and disabled.
Here is the gist with both:
https://gist.github.com/wonderwhy-er/a236bc73d7e19d93d154f5430793284b

What is weird:
With context optimisation it used 8327 tokens.
Without optimisation it used 8038.

So it was better without :D

But I have a suspicion it just did not work for me, even though I toggled that switch in settings on and off.
Should the difference be visible in chat history?

@wonderwhy-er (Collaborator) commented

I've run out of the time I have at the moment; I need to explore more but am a bit short on time.

@wonderwhy-er (Collaborator) commented

OK, actually I do see the chat summary in annotations in chat history.

@thecodacus (Collaborator, Author) commented Jan 17, 2025

> Sorry, I'm a bit out of things at the moment. Got a bit burned out juggling multiple things.

No worries 😄

> What is weird:
> With context optimisation it used 8327 tokens.
> Without optimisation it used 8038.
>
> So it was better without :D

This needs some additional tokens for summary generation and context selection, so for smaller chats and smaller projects it takes more tokens, but for larger chats and projects it should reverse 😄

Maybe we can dynamically switch it on when the context becomes large, and for smaller ones default to the regular approach (see the sketch below).
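
A minimal sketch of such a switch, assuming a simple character-based token estimate (the function names and the threshold value are hypothetical, not part of this PR):

// Rough token estimate: ~4 characters per token for English text and code.
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

function shouldOptimizeContext(messages: { content: string }[], thresholdTokens = 8000): boolean {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  // Only pay the summarization/selection overhead once the chat is large.
  return total > thresholdTokens;
}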

@leex279 (Collaborator) commented Jan 18, 2025

I tested it out with a bigger project, and it's not that much that is reduced in this case:
[screenshot]

Test project: https://github.com/leex279/task-list-advanced

@thecodacus (Collaborator, Author) commented Jan 20, 2025

> I tested it out with a bigger project, and it's not that much that is reduced in this case: [screenshot]
>
> Test project: https://github.com/leex279/task-list-advanced

I don't see that the optimization is in place.

When it is active you will see logs like this:
[screenshot]

@leex279 (Collaborator) commented Jan 20, 2025

@thecodacus thanks for the hint. Tested again and it looks fine (maybe it was a cache thing, or I mixed up the PRs, as I was testing several at the time :D).

Without:
[screenshot]

With optimization:
[screenshot]

@leex279 (Collaborator) commented Jan 20, 2025

@thecodacus it looks fine, but with Mistral + Codestral I see that the complete UI blocks and no streaming is visible as long as it is doing the implementation.

Don't know if this has to do with the PR.

@thecodacus (Collaborator, Author) commented

> Mistral + Codestral

Is it not happening on the main branch?

@thecodacus added this to the v0.0.6 milestone (Jan 21, 2025)
@leex279 (Collaborator) commented Jan 22, 2025

> Mistral + Codestral
>
> Is it not happening on the main branch?

It's also on main, so forget it here. It seems to be a problem with Mistral/Codestral.

@leex279 self-requested a review (January 22, 2025, 10:50) and previously approved these changes (Jan 22, 2025)
@leex279 (Collaborator) commented Jan 22, 2025

@thecodacus I think we should merge this now to main and do additional fixes in a new PR.

@thecodacus (Collaborator, Author) commented

@leex279 resolved the merge conflicts; you'll need to approve again.

@thecodacus added the stable-release label (Jan 22, 2025)
@thecodacus merged commit 3c56346 into stackblitz-labs:main (Jan 22, 2025); 4 checks passed
@thecodacus deleted the context-selection branch (January 30, 2025)
Labels: stable-release (Used In PR: tag to publish the changes from main to the stable branch)
Projects: Status: Done
4 participants