gptme-rag

A powerful RAG (Retrieval-Augmented Generation) system that enhances AI interactions by providing relevant context from your local files. Built primarily for gptme, but can be used standalone.

RAG systems improve AI responses by retrieving and incorporating relevant information from a knowledge base into the generation process. This leads to more accurate, contextual, and factual responses.
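The core retrieve-then-generate loop can be illustrated with a minimal sketch. This is not gptme-rag's actual implementation (which uses ChromaDB and real embedding models); the toy 3-dimensional vectors and the `retrieve`/`build_prompt` helpers are illustrative stand-ins:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def retrieve(query_vec, index, n_results=2):
    """Rank indexed documents by similarity to the query embedding."""
    scored = [(cosine(query_vec, vec), doc) for doc, vec in index.items()]
    scored.sort(reverse=True)
    return [doc for _score, doc in scored[:n_results]]

def build_prompt(query, context_docs):
    """Prepend retrieved context to the user query for the generator."""
    context = "\n---\n".join(context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}"

# Toy "embeddings" stand in for a real embedding model.
index = {
    "arch.md: The system has three components.": [0.9, 0.1, 0.0],
    "setup.md: Install with pipx.":              [0.1, 0.9, 0.0],
}
docs = retrieve([1.0, 0.0, 0.0], index, n_results=1)
prompt = build_prompt("What is the architecture?", docs)
```

The generator (an LLM) then answers from the assembled prompt, which is what grounds its response in your local files.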


Features

  • πŸ“š Document indexing with ChromaDB
    • Fast and efficient vector storage
    • Semantic search capabilities
    • Persistent storage
  • πŸ” Semantic search with embeddings
    • Relevance scoring
    • Token-aware context assembly
    • Clean output formatting
  • πŸ“„ Smart document processing
    • Streaming large file handling
    • Automatic document chunking
    • Configurable chunk size/overlap
    • Document reconstruction
  • πŸ‘€ File watching and auto-indexing
    • Real-time index updates
    • Pattern-based file filtering
    • Efficient batch processing
    • Automatic persistence
  • πŸ› οΈ CLI interface for testing and development
    • Index management
    • Search functionality
    • Context assembly
    • File watching

Quick Start

# Install (requires Python 3.10+)
pipx install gptme-rag  # or: pip install gptme-rag

# Index your documents
gptme-rag index **/*.md

# Search
gptme-rag search "What is the architecture of the system?"

For development installation:

git clone https://github.com/ErikBjare/gptme-rag.git
cd gptme-rag
poetry install

Usage

Indexing Documents

# Index markdown files in a directory
gptme-rag index *.md

# Index with custom persist directory
gptme-rag index *.md --persist-dir ./index

Searching

# Basic search
gptme-rag search "your query here"

# Advanced search with options
gptme-rag search "your query" \
  --n-results 5 \
  --persist-dir ./index \
  --max-tokens 4000 \
  --show-context
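The `--max-tokens` option caps how much retrieved text is packed into the final context. The idea behind token-aware assembly can be sketched as a greedy fill; the function name and the word-count tokenizer below are illustrative stand-ins, not gptme-rag's actual API:

```python
def assemble_context(ranked_docs, max_tokens, count_tokens=lambda s: len(s.split())):
    """Greedily add the highest-ranked documents until the token budget is hit.

    `count_tokens` is a stand-in for a real tokenizer; here it just
    counts whitespace-separated words.
    """
    selected, used, truncated = [], 0, False
    for doc in ranked_docs:
        cost = count_tokens(doc)
        if used + cost > max_tokens:
            truncated = True
            break
        selected.append(doc)
        used += cost
    return selected, used, truncated

docs = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
selected, used, truncated = assemble_context(docs, max_tokens=5)
# The first two documents fit (3 + 2 = 5 tokens); the third would exceed the budget.
```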

File Watching

The watch command monitors directories for changes and automatically updates the index:

# Watch a directory with default settings
gptme-rag watch /path/to/documents

# Watch with custom pattern and ignore rules
gptme-rag watch /path/to/documents \
  --pattern "**/*.{md,py}" \
  --ignore-patterns "*.tmp" "*.log" \
  --persist-dir ./index

Features:

  • πŸ”„ Real-time index updates
  • 🎯 Pattern matching for file types
  • 🚫 Configurable ignore patterns
  • πŸ”‹ Efficient batch processing
  • πŸ’Ύ Automatic persistence

The watcher will:

  • Perform initial indexing of existing files
  • Monitor for file changes (create/modify/delete/move)
  • Update the index automatically
  • Handle rapid changes efficiently with debouncing
  • Continue running until interrupted (Ctrl+C)
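Debouncing means that a burst of rapid changes (e.g. an editor saving a file several times) triggers one re-index rather than many. A minimal sketch of the idea, with timestamps passed in explicitly so the logic is deterministic and testable (a real watcher would use `time.monotonic()` and a background thread; the `Debouncer` class here is hypothetical, not gptme-rag's API):

```python
class Debouncer:
    """Coalesce rapid file-change events into one batch per quiet period."""

    def __init__(self, quiet_seconds=0.5):
        self.quiet_seconds = quiet_seconds
        self.pending = {}      # path -> time of last event on that path
        self.last_event = None

    def record(self, path, now):
        """Note a change; repeated events on one path are collapsed."""
        self.pending[path] = now
        self.last_event = now

    def flush(self, now):
        """Return paths to re-index once the burst has settled, else []."""
        if not self.pending or now - self.last_event < self.quiet_seconds:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        return batch

d = Debouncer(quiet_seconds=0.5)
d.record("a.md", now=0.00)
d.record("a.md", now=0.10)   # rapid re-save, collapsed with the first event
d.record("b.md", now=0.20)
early = d.flush(now=0.30)    # [] -- the burst is still active
late = d.flush(now=0.80)     # ["a.md", "b.md"] -- quiet period elapsed
```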

Performance Benchmarking

The benchmark commands help measure and optimize performance:

# Benchmark document indexing
gptme-rag benchmark indexing /path/to/documents \
  --pattern "**/*.md" \
  --persist-dir ./benchmark_index

# Benchmark search performance
gptme-rag benchmark search /path/to/documents \
  --queries "python" "documentation" "example" \
  --n-results 10

# Benchmark file watching
gptme-rag benchmark watch-perf /path/to/documents \
  --duration 10 \
  --updates-per-second 5

Features:

  • πŸ“Š Comprehensive metrics
    • Operation duration
    • Memory usage
    • Throughput
    • Custom metrics per operation
  • πŸ”¬ Multiple benchmark types
    • Document indexing
    • Search operations
    • File watching
  • πŸ“ˆ Performance tracking
    • Memory efficiency
    • Processing speed
    • System resource usage
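The three core metrics (duration, memory, throughput) can be captured with the standard library alone. This is a generic sketch of the measurement approach, not the benchmark harness gptme-rag ships:

```python
import time
import tracemalloc

def benchmark(operation, n_items):
    """Run `operation` once; report duration, peak memory, and throughput."""
    tracemalloc.start()
    start = time.perf_counter()
    operation()
    duration = time.perf_counter() - start
    _current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "duration_s": duration,
        "peak_memory_mb": peak / 1024 / 1024,
        "throughput_per_s": n_items / duration if duration > 0 else float("inf"),
    }

# Stand-in workload; a real run would index files or execute queries.
metrics = benchmark(lambda: [x ** 2 for x in range(10_000)], n_items=10_000)
```

`tracemalloc` tracks only Python-level allocations, so peak memory here understates what a process monitor would report for native libraries like ChromaDB.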

Example benchmark output:

┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Operation      ┃ Duration(s) ┃ Memory(MB) ┃ Throughput ┃ Additional Metrics ┃
┑━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
β”‚ indexing       β”‚       0.523 β”‚      15.42 β”‚    19.12/s β”‚ files: 10          β”‚
β”‚ search         β”‚       0.128 β”‚       5.67 β”‚    23.44/s β”‚ queries: 3         β”‚
β”‚ file_watching  β”‚       5.012 β”‚       8.91 β”‚     4.99/s β”‚ updates: 25        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Document Chunking

The indexer supports automatic document chunking for efficient processing of large files:

# Index with custom chunk settings
gptme-rag index /path/to/documents \
  --chunk-size 1000 \
  --chunk-overlap 200

# Search with chunk grouping
gptme-rag search "your query" \
  --group-chunks \
  --n-results 5

Features:

  • πŸ”„ Streaming processing
    • Handles large files efficiently
    • Minimal memory usage
    • Progress reporting
  • πŸ“‘ Smart chunking
    • Configurable chunk size
    • Overlapping chunks for context
    • Token-aware splitting
  • πŸ” Enhanced search
    • Chunk-aware relevance
    • Result grouping by document
    • Full document reconstruction

Example Output:

Most Relevant Documents:

1. documentation.md#chunk2 (relevance: 0.85)
  Detailed section about configuration options, including chunk size and overlap settings.
  [Part of: documentation.md]

2. guide.md#chunk5 (relevance: 0.78)
  Example usage showing how to process large documents efficiently.
  [Part of: guide.md]

3. README.md#chunk1 (relevance: 0.72)
  Overview of the chunking system and its benefits for large document processing.
  [Part of: README.md]

Full Context:
Total tokens: 850
Documents included: 3 (from 3 source documents)
Truncated: False

The chunking system automatically:

  • Splits large documents into manageable pieces
  • Maintains context across chunk boundaries
  • Groups related chunks in search results
  • Provides document reconstruction when needed
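The effect of `--chunk-size` and `--chunk-overlap` can be sketched over a token list: each chunk starts `chunk_size - chunk_overlap` tokens after the previous one, so consecutive chunks share `chunk_overlap` tokens of context across the boundary. This is an illustrative sketch of the scheme, not gptme-rag's internal chunker:

```python
def chunk_document(tokens, chunk_size, chunk_overlap):
    """Split a token list into overlapping chunks."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks

tokens = list(range(10))
chunks = chunk_document(tokens, chunk_size=4, chunk_overlap=2)
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

The shared tokens at each boundary are what let a passage that straddles two chunks still be retrieved whole, at the cost of indexing some tokens twice.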

Development

Running Tests

# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov=gptme_rag

Project Structure

gptme_rag/
β”œβ”€β”€ __init__.py
β”œβ”€β”€ cli.py                    # CLI interface
β”œβ”€β”€ indexing/                 # Document indexing
β”‚   β”œβ”€β”€ document.py           # Document model
β”‚   └── indexer.py            # ChromaDB integration
β”œβ”€β”€ query/                    # Search functionality
β”‚   └── context_assembler.py  # Context assembly
└── utils/                    # Utility functions

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Run tests and linting
  6. Submit a pull request

Releases

Releases are automated through GitHub Actions. The process is:

  1. Update version in pyproject.toml
  2. Commit the change: git commit -am "chore: bump version to x.y.z"
  3. Create and push a tag: git tag vx.y.z && git push origin master vx.y.z
  4. Create a GitHub release (can be done with gh release create vx.y.z)
  5. The publish workflow will automatically:
    • Run tests
    • Build the package
    • Publish to PyPI

Integration with gptme

This package is designed to integrate with gptme to provide AI assistants with relevant context from your local files. When used with gptme, it:

  • Automatically indexes your project files
  • Enhances AI responses with relevant context
  • Provides semantic search across your codebase
  • Maintains a persistent knowledge base
  • Assembles context intelligently within token limits

To use with gptme, install both packages; gptme will automatically detect and use gptme-rag for context management.

License

MIT License. See LICENSE for details.