
feat: enhance context handling by adding code context selection and implementing summary generation #1091

Merged: 10 commits merged into stackblitz-labs:main from context-selection on Jan 22, 2025

Conversation

@thecodacus (Collaborator) commented Jan 14, 2025

Dynamic LLM Context Optimization Implementation

Overview

This PR implements intelligent context management for LLM interactions by introducing two key optimizations:

  1. Code Context Optimization: Dynamically selects relevant files for the context buffer based on the current conversation
  2. Chat Context Optimization: Summarizes chat history to maintain context while reducing token usage

This PR attempts to solve:
#1041
#1042

Key Changes

1. Context Buffer Management

  • Added new system for selecting relevant files using LLM analysis
  • Implemented file filtering with ignore patterns similar to .gitignore (see the sketch after this list)
  • Limited context buffer to 5 files maximum for efficient token usage
  • Added support for relative and absolute file paths
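
The ignore-pattern filtering can be pictured with a minimal sketch in TypeScript (a hypothetical helper, not the exact code from this PR; `files` is assumed to be the FileMap passed into selectContext):

export function createFileFilter(ignorePatterns: string[]) {
  // Convert each glob-like pattern into a RegExp:
  // "*" matches within a path segment, "**" matches across segments.
  const regexes = ignorePatterns.map((pattern) => {
    const escaped = pattern
      .replace(/[.+^${}()|[\]\\]/g, '\\$&')
      .replace(/\*\*/g, '§§') // placeholder so the "*" rule doesn't clobber "**"
      .replace(/\*/g, '[^/]*')
      .replace(/§§/g, '.*');
    return new RegExp(`(^|/)${escaped}(/|$)`);
  });

  // Keep a file only if no ignore pattern matches its path.
  return (filePath: string) => !regexes.some((re) => re.test(filePath));
}

// Usage: keep only files not matched by the ignore list.
const keep = createFileFilter(['node_modules', 'dist', '*.lock']);
const candidateFiles = Object.keys(files).filter(keep);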

2. Chat Summarization

  • Implemented chat history summarization to compress context while preserving key information
  • Created progress tracking for multi-step context optimization process
  • Added annotation system for tracking context and summary metadata (sketched below)
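
The annotation shapes can be pictured roughly as follows (a sketch; the exact field names beyond `type` are illustrative assumptions, not the PR's definitions):

type ContextAnnotation =
  | { type: 'codeContext'; files: string[] }  // files selected into the context buffer
  | { type: 'chatSummary'; summary: string }; // compressed chat history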

Technical Details

Context Selection System

Key implementation details from select-context.ts:

export async function selectContext(props: {
  messages: Message[];
  env: Env;
  apiKeys?: Record<string, string>;
  files: FileMap;
  providerSettings?: Record<string, IProviderSetting>;
  promptId?: string;
  contextOptimization?: boolean;
  summary: string;
}) {
  // ... initialization (model/provider resolution, building summaryText and processedMessages) ...

  const resp = await generateText({
    // model and provider options omitted for brevity
    system: `
      You are a software engineer. You are working on a project. You need to select files that are relevant to the task from the list of files above.
    `,
    prompt: `
      ${summaryText}
      Users Question: ${processedMessages.filter((x) => x.role === 'user').pop()?.content}
    `,
  });

  // Parse the response for the file-selection block
  const updateContextBuffer = resp.text.match(/<updateContextBuffer>([\s\S]*?)<\/updateContextBuffer>/);
}
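
The matched block is then turned into concrete file paths. A minimal sketch of that step (the `<includeFile>` tag format shown here is an illustrative assumption, not necessarily the PR's exact markup):

// Extract file paths from the matched <updateContextBuffer> block.
const selectedFiles: string[] = [];
if (updateContextBuffer) {
  for (const m of updateContextBuffer[1].matchAll(/<includeFile\s+path="([^"]+)"\s*\/>/g)) {
    selectedFiles.push(m[1]);
  }
}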

Chat Summary Generation

From create-summary.ts:

export async function createSummary(props: {
  messages: Message[];
  env: Env;
  apiKeys?: Record<string, string>;
  providerSettings?: Record<string, IProviderSetting>;
}) {
  // ... initialization (slicedMessages = the recent window of `messages`) ...

  // Messages may carry either a plain string or an array of typed parts.
  const extractTextContent = (message: Message) =>
    Array.isArray(message.content)
      ? (message.content.find((item) => item.type === 'text')?.text as string) || ''
      : message.content;

  const resp = await generateText({
    // model and provider options omitted for brevity
    system: `
      You are a software engineer. You need to summarize the work till now and provide a summary of the chat.
    `,
    prompt: `
      please provide a summary of the chat till now.
      below is the latest chat:
      ${slicedMessages.map((x) => `[${x.role}] ${extractTextContent(x)}`).join('\n')}
    `,
  });

  return resp.text; // the generated summary string
}
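
In the chat route the two pieces compose: the summary is generated first, then fed into file selection. A rough sketch of the flow (argument shapes follow the signatures above; error handling omitted):

// 1. Compress the chat history into a summary.
const summary = await createSummary({ messages, env, apiKeys, providerSettings });

// 2. Use the summary plus the latest user question to pick relevant files.
const context = await selectContext({
  messages,
  env,
  apiKeys,
  files,
  providerSettings,
  contextOptimization: true,
  summary,
});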

Progress Tracking

Implementation of progress annotations in api.chat.ts:

type ProgressAnnotation = {
  type: 'progress';
  value: number;
  message: string;
};

// Usage in stream
dataStream.writeMessageAnnotation({
  type: 'progress',
  value: progressCounter++,
  message: 'Generating Chat Summary'
} as ProgressAnnotation);
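
On the client, progress annotations can then be read off each message to drive a status indicator. A hedged sketch (the annotation access pattern here is an assumption about the streamed message shape, not code from this PR):

// Pick the most recent progress annotation from a message's annotations.
function latestProgress(annotations: unknown[] | undefined): ProgressAnnotation | undefined {
  const progress = (annotations ?? []).filter(
    (a): a is ProgressAnnotation => (a as ProgressAnnotation)?.type === 'progress',
  );
  // The write with the highest counter is the current stage.
  return progress.sort((a, b) => b.value - a.value)[0];
}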

Migration Impact

  • Breaking changes to the chat API response format to include progress and context annotations
  • New type definitions for context and progress annotations
  • Updated stream handling to support multi-stage processing

Future Improvements

  • Add configurable context buffer size limits
  • Add support for custom ignore patterns
  • Add support for partial file content selection

@mrsimpson (Collaborator) left a comment

I have not reviewed the code in detail, but the nicely summarized general design looks like an interesting approach 👍

I wish we had a real context-management API that allows chaining multiple operations that manipulate the context. Your LLM-summarization calls could be steps in it.

I added some questions to the code, but honestly, I'm not really able to fully understand our codebase and which part does what now 😬
So this is mostly me trying to understand better. Hope you don't mind, @thecodacus

Resolved review threads:
  • app/components/chat/AssistantMessage.tsx
  • app/lib/.server/llm/constants.ts
  • app/lib/.server/llm/create-summary.ts (two threads)
  • app/lib/.server/llm/select-context.ts
@leex279 (Collaborator) commented Jan 15, 2025

Does this feature need to be toggled on/off to properly test this PR?
[screenshot]

@thecodacus (Collaborator, Author) commented Jan 15, 2025

Yes; I still made it an optional optimization feature.

@wonderwhy-er (Collaborator) commented

Sorry, I'm a bit out of things at the moment. Got a bit burned out juggling multiple things.

Taking a look now. On the surface it structurally looks good, close to how I was thinking this should work, and we can experiment with alternative approaches of filtering chat and selecting files afterwards.

So far I ran a test of making a snake game with context optimisation enabled and disabled.
Here is the gist with both:
https://gist.github.com/wonderwhy-er/a236bc73d7e19d93d154f5430793284b

What is weird:
With context optimisation it used 8327 tokens.
Without optimisation it used 8038.

So it was better without :D

But I have a suspicion it just did not work for me, even though I toggled that switch in settings on and off.
Should the difference be visible in chat history?

@wonderwhy-er (Collaborator) commented

I've run out of the time I have at the moment; I need to explore more but am a bit short on time.

@wonderwhy-er (Collaborator) commented

OK, actually I do see the chat summary in annotations in chat history.

@thecodacus (Collaborator, Author) commented Jan 17, 2025

> Sorry, I'm a bit out of things at the moment. Got a bit burned out juggling multiple things.

No worries 😄

> What is weird:
> With context optimisation it used 8327 tokens.
> Without optimisation it used 8038.
>
> So it was better without :D

This needs some additional tokens for summary generation and context selection, so for smaller chats and smaller projects it takes more tokens, but for larger chats and projects it should reverse 😄

Maybe we can dynamically switch it on when the context becomes large, and for smaller ones default to the regular approach (see the sketch below).
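
A minimal sketch of such a switch, assuming a simple character-based token estimate (the function names and the threshold value are hypothetical, not part of this PR):

// Rough token estimate: ~4 characters per token for English text and code.
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

function shouldOptimizeContext(messages: { content: string }[], thresholdTokens = 8000): boolean {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  // Only pay the summarization/selection overhead once the chat is large.
  return total > thresholdTokens;
}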

@leex279 (Collaborator) commented Jan 18, 2025

I tested it out with a bigger project, and it's not that much that is reduced in this case:
[screenshot]

Test project: https://github.com/leex279/task-list-advanced

@thecodacus (Collaborator, Author) commented Jan 20, 2025

> I tested it out with a bigger project, and it's not that much that is reduced in this case: [screenshot]
>
> Test project: https://github.com/leex279/task-list-advanced

I don't see that the optimization is in place.

When it is active you will see logs like this:
[screenshot]

@leex279 (Collaborator) commented Jan 20, 2025

@thecodacus thanks for the hint. Tested again and it looks fine (maybe it was a cache thing, or I mixed up the PRs, as I was testing several at the time :D).

Without:
[screenshot]

With optimization:
[screenshot]

@leex279 (Collaborator) commented Jan 20, 2025

@thecodacus it looks fine, but with Mistral + Codestral I see that the complete UI blocks and no streaming is visible as long as it is doing the implementation.

Don't know if this has to do with the PR.

@thecodacus (Collaborator, Author) commented

> Mistral + Codestral

Is it not happening on the main branch?

@thecodacus added this to the v0.0.6 milestone (Jan 21, 2025)
@leex279 (Collaborator) commented Jan 22, 2025

> Mistral + Codestral
>
> Is it not happening on the main branch?

It's also on main, so forget it here. It seems to be a problem with Mistral/Codestral.

@leex279 self-requested a review (January 22, 2025, 10:50) and previously approved these changes (Jan 22, 2025)
@leex279 (Collaborator) commented Jan 22, 2025

@thecodacus I think we should merge this now to main and do additional fixes in a new PR.

@thecodacus (Collaborator, Author) commented

@leex279 resolved the merge conflicts; you'll need to approve again.

@thecodacus added the stable-release label (Jan 22, 2025)
@thecodacus merged commit 3c56346 into stackblitz-labs:main (Jan 22, 2025); 4 checks passed
@thecodacus deleted the context-selection branch (January 30, 2025)
Labels: stable-release (Used In PR: tag to publish the changes from main to the stable branch)
Projects: Status: Done
4 participants