
LLM Reference Architecture using Redis & Google Cloud Platform

Open In Colab

☁️ Google's Vertex AI has expanded its capabilities by introducing Generative AI. This technology comes with a specialized in-console studio experience, a dedicated API, and a Python SDK designed for deploying and managing instances of Google's powerful PaLM language models. With a focus on text generation, summarization, chat completion, and embedding creation, PaLM models are reshaping the boundaries of natural language processing and machine learning.
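As a minimal sketch of what working with these models looks like from Python (the project ID, model names, and parameters below are illustrative assumptions; depending on SDK version, these classes may live under `vertexai.preview.language_models`):

```python
# Minimal sketch of calling PaLM models through the Vertex AI Python SDK.
# Assumes `google-cloud-aiplatform` is installed and GCP auth is configured;
# the project ID below is a placeholder.
import vertexai
from vertexai.language_models import ChatModel, TextGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")

# Text generation / summarization
text_model = TextGenerationModel.from_pretrained("text-bison@001")
response = text_model.predict(
    "Summarize: Redis Enterprise can serve as a low-latency vector database.",
    temperature=0.2,
    max_output_tokens=256,
)
print(response.text)

# Chat completion
chat_model = ChatModel.from_pretrained("chat-bison@001")
chat = chat_model.start_chat(context="You are a helpful cloud architect.")
print(chat.send_message("What is a vector database?").text)
```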

⚡ Redis Enterprise offers robust vector database features, with an API for index creation and management, distance metric selection, similarity search, and hybrid filtering. Coupled with its versatile data structures (lists, hashes, JSON, and sets), Redis Enterprise is a strong fit for building high-quality Large Language Model (LLM)-based applications. Its streamlined, "shared-nothing" architecture and strong SLAs make it well suited for production environments.
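For illustration, index creation and distance metric selection with the redis-py client might look like the sketch below; the index name, key prefix, and schema are assumptions (the 768-dimension vector field matches what PaLM's gecko embedding model returns):

```python
# Sketch: creating a vector search index with redis-py against Redis Enterprise.
# Index name, key prefix, and field names below are illustrative assumptions.
import redis
from redis.commands.search.field import TagField, TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis(host="localhost", port=6379)  # point at your Redis Enterprise endpoint

schema = (
    TextField("text"),                    # raw chunk text
    TagField("category"),                 # metadata usable for hybrid filtering
    VectorField(
        "embedding",
        "HNSW",                           # or "FLAT" for exact nearest neighbors
        {
            "TYPE": "FLOAT32",
            "DIM": 768,                   # PaLM gecko embeddings are 768-dim
            "DISTANCE_METRIC": "COSINE",  # distance metric selection
        },
    ),
)

r.ft("doc_idx").create_index(
    schema,
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)
```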

This repo serves as a foundational architecture for building LLM applications with Redis and GCP services.

Reference Architecture

Core Components

  1. Primary Storage >>> GCP BigQuery
  2. Foundation Models >>> GCP Vertex AI
    • PaLM API for text embedding creation
    • PaLM API for text generation
    • PaLM API for chat completion
  3. High-Performance Data Layer >>> Redis Enterprise
    • Vector database for semantic search + context retrieval
    • LLM Cache
    • LLM Memory for application chat history and session metadata (the cache and memory roles are sketched after this list)
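The cache and memory roles can be sketched roughly as follows. This shows simple exact-match caching with redis-py for brevity (the tutorial itself demonstrates semantic caching), and every key name and TTL here is an illustrative assumption:

```python
# Sketch: Redis as an LLM cache and chat-memory store (key names and TTLs
# are illustrative; the tutorial demonstrates *semantic* caching instead).
import hashlib
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def cached_llm_call(prompt: str, llm_fn, ttl: int = 3600) -> str:
    """Return a cached completion if this exact prompt was seen before."""
    key = "llmcache:" + hashlib.sha256(prompt.encode()).hexdigest()
    if (hit := r.get(key)) is not None:
        return hit.decode()
    response = llm_fn(prompt)        # e.g. a Vertex AI PaLM call
    r.set(key, response, ex=ttl)     # expire stale completions
    return response

def append_chat_turn(session_id: str, role: str, content: str) -> None:
    """Persist conversation history per session as a Redis list."""
    r.rpush(f"chat:{session_id}", json.dumps({"role": role, "content": content}))

def get_chat_history(session_id: str, last_n: int = 10) -> list[dict]:
    """Fetch the most recent turns for prompt construction."""
    return [json.loads(m) for m in r.lrange(f"chat:{session_id}", -last_n, -1)]
```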

Setup Workflow

  1. Load Libraries and Tools: Before building language modeling applications, we install the right Python libraries, connect to the proper datastores, and authenticate with GCP.
  2. Create BigQuery Table: Drawing on data from one or more sources, we populate an enriched table in BigQuery that holds the primary data for building language model applications. This could be a custom knowledge base, domain-specific proprietary data, or customer records; in general, any kind of data that has text fields.
  3. Generate Embeddings: Leveraging Google's Vertex AI PaLM API, we generate semantic text embeddings that represent "chunks" of the underlying text. These embeddings are vectors of numbers that capture meaning and context, enabling similarity search between a user's question or prompt and the source text.
  4. Load Embeddings: We store the embeddings in Redis Enterprise as an additional low-latency data layer on top of BigQuery (steps 3 and 4 are sketched after this list).
  5. Create Vector Index: We create a search index in Redis Enterprise that enables real-time semantic search, using a schema like the one sketched earlier. While BigQuery holds the primary data, Redis holds the embeddings.
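A rough sketch of steps 3 and 4, under the same assumptions as the index sketch above (the model name, key prefix, and sample records are illustrative):

```python
# Sketch: generate PaLM embeddings for text chunks and load them into Redis
# as FLOAT32 byte strings under the "doc:" prefix assumed by the index above.
import numpy as np
import redis
from vertexai.language_models import TextEmbeddingModel

r = redis.Redis(host="localhost", port=6379)
model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")

chunks = [  # in practice, rows pulled from the enriched BigQuery table
    {"id": "1", "text": "Redis Enterprise supports vector similarity search.", "category": "docs"},
    {"id": "2", "text": "BigQuery stores the primary enriched dataset.", "category": "docs"},
]

for chunk in chunks:
    vector = model.get_embeddings([chunk["text"]])[0].values  # 768 floats
    r.hset(
        f"doc:{chunk['id']}",
        mapping={
            "text": chunk["text"],
            "category": chunk["category"],
            "embedding": np.array(vector, dtype=np.float32).tobytes(),
        },
    )
```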

Potential Use Cases

This architecture contains many of the essential elements required to build real-world LLM applications that can enhance your business.

Tutorial

Open In Colab

Open the code tutorial in a Colab notebook (recommended) to get your hands dirty with LLMs on GCP. This guide is a step-by-step walkthrough of setting up the required data, services, and databases to build LLM applications, and it highlights a few Redis & LLM design principles, including: Semantic Search, Retrieval, Caching, and Memory storage.
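As a taste of the retrieval pattern covered there, the sketch below embeds a user question, runs a KNN vector search against the index assumed earlier, and feeds the retrieved context into a PaLM prompt; all index, field, and model names are assumptions carried over from the sketches above:

```python
# Sketch: semantic search + retrieval over the "doc_idx" index defined earlier,
# then grounding a PaLM completion in the retrieved context.
import numpy as np
import redis
from redis.commands.search.query import Query
from vertexai.language_models import TextEmbeddingModel, TextGenerationModel

r = redis.Redis(host="localhost", port=6379)
embed_model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")
llm = TextGenerationModel.from_pretrained("text-bison@001")

question = "How does this architecture use Redis?"
qvec = np.array(embed_model.get_embeddings([question])[0].values, dtype=np.float32)

# Pure KNN semantic search; for hybrid filtering, replace "*" with a filter
# expression such as "(@category:{docs})".
query = (
    Query("*=>[KNN 3 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("text", "score")
    .dialect(2)
)
results = r.ft("doc_idx").search(query, query_params={"vec": qvec.tobytes()})

context = "\n".join(doc.text for doc in results.docs)
answer = llm.predict(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer.text)
```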

