Integrate VoyageAI Vectorizer and Reranker class (#223)
This PR (thanks to @fzowl) integrates the VoyageAI vectorizer and
reranker into the RedisVL client, streamlining access for devs to embed
data and rerank search results from Redis.
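
For intuition, the rerank step can be pictured as "score every candidate against the query, sort by score, keep the top `limit`". The sketch below is a minimal, self-contained illustration of that idea and does not call VoyageAI or RedisVL: `overlap_score` is a hypothetical toy scorer standing in for the hosted reranking model, and `rank` is an illustrative helper that only mirrors the `(results, scores)` return shape used in the user guide notebook in this commit.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query tokens that also appear in the doc.

    Hypothetical stand-in for a reranking model; not part of RedisVL or VoyageAI.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rank(
    query: str,
    docs: List[str],
    limit: int = 3,
    score_fn: Callable[[str, str], float] = overlap_score,
) -> Tuple[List[str], List[float]]:
    """Score each doc against the query, sort descending, keep the top `limit`."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    top = scored[:limit]
    return [doc for doc, _ in top], [score for _, score in top]


docs = [
    "Carson City is the capital city of Nevada",
    "Washington, D.C. is the capital of the United States",
    "Bananas are yellow",
]
results, scores = rank("capital of the United States", docs, limit=2)
for result, score in zip(results, scores):
    print(score, " -- ", result)
```

With the actual `VoyageAIReranker`, the scores would come from the VoyageAI rerank API rather than token overlap; the sort-and-truncate behaviour shown here is an assumption made for illustration.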

---------

Co-authored-by: fzowl <zoltan@voyageai.com>
Co-authored-by: Justin Cechmanek <165097110+justin-cechmanek@users.noreply.github.com>
3 people authored Jan 14, 2025
1 parent 787f214 commit 18d1cfd
Showing 16 changed files with 1,637 additions and 289 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/run_tests.yml
@@ -64,6 +64,7 @@ jobs:
GCP_PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
AZURE_OPENAI_API_KEY: ${{secrets.AZURE_OPENAI_API_KEY}}
AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
@@ -86,6 +87,7 @@ jobs:
GCP_PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
AZURE_OPENAI_API_KEY: ${{secrets.AZURE_OPENAI_API_KEY}}
AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
59 changes: 30 additions & 29 deletions README.md
@@ -49,7 +49,7 @@ Install `redisvl` into your Python (>=3.8) environment using `pip`:
```bash
pip install redisvl
```
> For more detailed instructions, visit the [installation guide](https://docs.redisvl.com/en/latest/overview/installation.html).
> For more detailed instructions, visit the [installation guide](https://docs.redisvl.com/en/stable/overview/installation.html).
## Setting up Redis

@@ -71,7 +71,7 @@ Choose from multiple Redis deployment options:


## 🗃️ Redis Index Management
1. [Design a schema for your use case](https://docs.redisvl.com/en/latest/user_guide/getting_started_01.html#define-an-indexschema) that models your dataset with built-in Redis and indexable fields (*e.g. text, tags, numerics, geo, and vectors*). [Load a schema](https://docs.redisvl.com/en/latest/user_guide/getting_started_01.html#example-schema-creation) from a YAML file:
1. [Design a schema for your use case](https://docs.redisvl.com/en/stable/user_guide/getting_started_01.html#define-an-indexschema) that models your dataset with built-in Redis and indexable fields (*e.g. text, tags, numerics, geo, and vectors*). [Load a schema](https://docs.redisvl.com/en/stable/user_guide/getting_started_01.html#example-schema-creation) from a YAML file:
```yaml
index:
name: user-idx
@@ -121,7 +121,7 @@ Choose from multiple Redis deployment options:
})
```

2. [Create a SearchIndex](https://docs.redisvl.com/en/latest/user_guide/getting_started_01.html#create-a-searchindex) class with an input schema and client connection in order to perform admin and search operations on your index in Redis:
2. [Create a SearchIndex](https://docs.redisvl.com/en/stable/user_guide/getting_started_01.html#create-a-searchindex) class with an input schema and client connection in order to perform admin and search operations on your index in Redis:
```python
from redis import Redis
from redisvl.index import SearchIndex
@@ -133,10 +133,10 @@ Choose from multiple Redis deployment options:
# Create the index in Redis
index.create()
```
> Async compliant search index class also available: [AsyncSearchIndex](https://docs.redisvl.com/en/latest/api/searchindex.html#redisvl.index.AsyncSearchIndex).
> Async compliant search index class also available: [AsyncSearchIndex](https://docs.redisvl.com/en/stable/api/searchindex.html#redisvl.index.AsyncSearchIndex).

3. [Load](https://docs.redisvl.com/en/latest/user_guide/getting_started_01.html#load-data-to-searchindex)
and [fetch](https://docs.redisvl.com/en/latest/user_guide/getting_started_01.html#fetch-an-object-from-redis) data to/from your Redis instance:
3. [Load](https://docs.redisvl.com/en/stable/user_guide/getting_started_01.html#load-data-to-searchindex)
and [fetch](https://docs.redisvl.com/en/stable/user_guide/getting_started_01.html#fetch-an-object-from-redis) data to/from your Redis instance:
```python
data = {"user": "john", "credit_score": "high", "embedding": [0.23, 0.49, -0.18, 0.95]}
@@ -151,7 +151,7 @@ and [fetch](https://docs.redisvl.com/en/latest/user_guide/getting_started_01.htm

Define queries and perform advanced searches over your indices, including the combination of vectors, metadata filters, and more.

- [VectorQuery](https://docs.redisvl.com/en/latest/api/query.html#vectorquery) - Flexible vector queries with customizable filters enabling semantic search:
- [VectorQuery](https://docs.redisvl.com/en/stable/api/query.html#vectorquery) - Flexible vector queries with customizable filters enabling semantic search:

```python
from redisvl.query import VectorQuery
@@ -179,24 +179,25 @@ Define queries and perform advanced searches over your indices, including the co
results = index.query(query)
```

- [RangeQuery](https://docs.redisvl.com/en/latest/api/query.html#rangequery) - Vector search within a defined range paired with customizable filters
- [FilterQuery](https://docs.redisvl.com/en/latest/api/query.html#filterquery) - Standard search using filters and the full-text search
- [CountQuery](https://docs.redisvl.com/en/latest/api/query.html#countquery) - Count the number of indexed records given attributes
- [RangeQuery](https://docs.redisvl.com/en/stable/api/query.html#rangequery) - Vector search within a defined range paired with customizable filters
- [FilterQuery](https://docs.redisvl.com/en/stable/api/query.html#filterquery) - Standard search using filters and the full-text search
- [CountQuery](https://docs.redisvl.com/en/stable/api/query.html#countquery) - Count the number of indexed records given attributes

> Read more about building [advanced Redis queries](https://docs.redisvl.com/en/latest/user_guide/hybrid_queries_02.html).
> Read more about building [advanced Redis queries](https://docs.redisvl.com/en/stable/user_guide/hybrid_queries_02.html).


## 🔧 Utilities

### Vectorizers
Integrate with popular embedding providers to greatly simplify the process of vectorizing unstructured data for your index and queries:
- [AzureOpenAI](https://docs.redisvl.com/en/latest/api/vectorizer.html#azureopenaitextvectorizer)
- [Cohere](https://docs.redisvl.com/en/latest/api/vectorizer.html#coheretextvectorizer)
- [Custom](https://docs.redisvl.com/en/latest/api/vectorizer.html#customtextvectorizer)
- [GCP VertexAI](https://docs.redisvl.com/en/latest/api/vectorizer.html#vertexaitextvectorizer)
- [HuggingFace](https://docs.redisvl.com/en/latest/api/vectorizer.html#hftextvectorizer)
- [Mistral](https://docs.redisvl.com/en/latest/api/vectorizer/html#mistralaitextvectorizer)
- [OpenAI](https://docs.redisvl.com/en/latest/api/vectorizer.html#openaitextvectorizer)
- [AzureOpenAI](https://docs.redisvl.com/en/stable/api/vectorizer.html#azureopenaitextvectorizer)
- [Cohere](https://docs.redisvl.com/en/stable/api/vectorizer.html#coheretextvectorizer)
- [Custom](https://docs.redisvl.com/en/stable/api/vectorizer.html#customtextvectorizer)
- [GCP VertexAI](https://docs.redisvl.com/en/stable/api/vectorizer.html#vertexaitextvectorizer)
- [HuggingFace](https://docs.redisvl.com/en/stable/api/vectorizer.html#hftextvectorizer)
- [Mistral](https://docs.redisvl.com/en/stable/api/vectorizer/html#mistralaitextvectorizer)
- [OpenAI](https://docs.redisvl.com/en/stable/api/vectorizer.html#openaitextvectorizer)
- [VoyageAI](https://docs.redisvl.com/en/stable/api/vectorizer/html#voyageaitextvectorizer)

```python
from redisvl.utils.vectorize import CohereTextVectorizer
@@ -215,11 +216,11 @@ embeddings = co.embed_many(
)
```

> Learn more about using [vectorizers]((https://docs.redisvl.com/en/latest/user_guide/vectorizers_04.html)) in your embedding workflows.
> Learn more about using [vectorizers]((https://docs.redisvl.com/en/stable/user_guide/vectorizers_04.html)) in your embedding workflows.


### Rerankers
[Integrate with popular reranking providers](https://docs.redisvl.com/en/latest/user_guide/rerankers_06.html) to improve the relevancy of the initial search results from Redis
[Integrate with popular reranking providers](https://docs.redisvl.com/en/stable/user_guide/rerankers_06.html) to improve the relevancy of the initial search results from Redis



@@ -229,7 +230,7 @@ We're excited to announce the support for **RedisVL Extensions**. These modules
*Have an idea for another extension? Open a PR or reach out to us at applied.ai@redis.com. We're always open to feedback.*
### LLM Semantic Caching
Increase application throughput and reduce the cost of using LLM models in production by leveraging previously generated knowledge with the [`SemanticCache`](https://docs.redisvl.com/en/latest/api/cache.html#semanticcache).
Increase application throughput and reduce the cost of using LLM models in production by leveraging previously generated knowledge with the [`SemanticCache`](https://docs.redisvl.com/en/stable/api/cache.html#semanticcache).
```python
from redisvl.extensions.llmcache import SemanticCache
@@ -256,11 +257,11 @@ print(response[0]["response"])
>>> Paris
```
> Learn more about [semantic caching]((https://docs.redisvl.com/en/latest/user_guide/llmcache_03.html)) for LLMs.
> Learn more about [semantic caching]((https://docs.redisvl.com/en/stable/user_guide/llmcache_03.html)) for LLMs.
### LLM Session Management
Improve personalization and accuracy of LLM responses by providing user chat history as context. Manage access to the session data using recency or relevancy, *powered by vector search* with the [`SemanticSessionManager`](https://docs.redisvl.com/en/latest/api/session_manager.html).
Improve personalization and accuracy of LLM responses by providing user chat history as context. Manage access to the session data using recency or relevancy, *powered by vector search* with the [`SemanticSessionManager`](https://docs.redisvl.com/en/stable/api/session_manager.html).
```python
from redisvl.extensions.session_manager import SemanticSessionManager
@@ -292,7 +293,7 @@ session.get_relevant("weather", top_k=1)
```stdout
>>> [{"role": "user", "content": "what is the weather going to be today?"}]
```
> Learn more about [LLM session management]((https://docs.redisvl.com/en/latest/user_guide/session_manager_07.html)).
> Learn more about [LLM session management]((https://docs.redisvl.com/en/stable/user_guide/session_manager_07.html)).
### LLM Semantic Routing
@@ -329,7 +330,7 @@ router("Hi, good morning")
```stdout
>>> RouteMatch(name='greeting', distance=0.273891836405)
```
> Learn more about [semantic routing](https://docs.redisvl.com/en/latest/user_guide/semantic_router_08.html).
> Learn more about [semantic routing](https://docs.redisvl.com/en/stable/user_guide/semantic_router_08.html).
## 🖥️ Command Line Interface
Create, destroy, and manage Redis index configurations from a purpose-built CLI interface: `rvl`.
@@ -345,7 +346,7 @@ Commands:
stats Obtain statistics about an index
```
> Read more about [using the CLI](https://docs.redisvl.com/en/latest/user_guide/cli.html).
> Read more about [using the CLI](https://docs.redisvl.com/en/stable/user_guide/cli.html).
## 🚀 Why RedisVL?
@@ -359,9 +360,9 @@ The Redis Vector Library bridges the gap between the AI-native developer ecosyst
## 😁 Helpful Links

For additional help, check out the following resources:
- [Getting Started Guide](https://docs.redisvl.com/en/latest/user_guide/getting_started_01.html)
- [API Reference](https://docs.redisvl.com/en/latest/api/index.html)
- [Example Gallery](https://docs.redisvl.com/en/latest/examples/index.html)
- [Getting Started Guide](https://docs.redisvl.com/en/stable/user_guide/getting_started_01.html)
- [API Reference](https://docs.redisvl.com/en/stable/api/index.html)
- [Example Gallery](https://docs.redisvl.com/en/stable/examples/index.html)
- [Redis AI Recipes](https://github.com/redis-developer/redis-ai-resources)
- [Official Redis Vector API Docs](https://redis.io/docs/interact/search-and-query/advanced-concepts/vectors/)

18 changes: 15 additions & 3 deletions docs/api/reranker.rst
@@ -3,7 +3,7 @@ Rerankers
***********

CohereReranker
================
==============

.. _coherereranker_api:

@@ -15,12 +15,24 @@ CohereReranker


HFCrossEncoderReranker
========================
======================

.. _hfcrossencoderreranker_api:

.. currentmodule:: redisvl.utils.rerank.hf_cross_encoder

.. autoclass:: HFCrossEncoderReranker
:show-inheritance:
:members:
:members:


VoyageAIReranker
================

.. _voyageaireranker_api:

.. currentmodule:: redisvl.utils.rerank.voyageai

.. autoclass:: VoyageAIReranker
:show-inheritance:
:members:
16 changes: 15 additions & 1 deletion docs/api/vectorizer.rst
@@ -61,6 +61,7 @@ CohereTextVectorizer
:show-inheritance:
:members:


BedrockTextVectorizer
=====================

@@ -72,6 +73,7 @@ BedrockTextVectorizer
:show-inheritance:
:members:


CustomTextVectorizer
====================

@@ -81,4 +83,16 @@ CustomTextVectorizer

.. autoclass:: CustomTextVectorizer
:show-inheritance:
:members:
:members:


VoyageAITextVectorizer
======================

.. _voyageaitextvectorizer_api:

.. currentmodule:: redisvl.utils.vectorize.text.voyageai

.. autoclass:: VoyageAITextVectorizer
:show-inheritance:
:members:
96 changes: 95 additions & 1 deletion docs/user_guide/rerankers_06.ipynb
@@ -13,6 +13,7 @@
"\n",
"- A re-ranker that uses pre-trained [Cross-Encoders](https://sbert.net/examples/applications/cross-encoder/README.html) which can use models from [Hugging Face cross encoder models](https://huggingface.co/cross-encoder) or Hugging Face models that implement a cross encoder function ([example: BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)).\n",
"- The [Cohere /rerank API](https://docs.cohere.com/docs/rerank-2).\n",
"- The [VoyageAI /rerank API](https://docs.voyageai.com/docs/reranker).\n",
"\n",
"Before running this notebook, be sure to:\n",
"1. Have installed ``redisvl`` and have that environment active for this notebook.\n",
@@ -306,7 +307,100 @@
"for result, score in zip(results, scores):\n",
" print(score, \" -- \", result)"
]
}
},

{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using the VoyageAI Reranker\n",
"\n",
"To initialize the VoyageAI reranker, you'll need to install the `voyageai` library and provide a valid VoyageAI API key."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"metadata": {}
},
"outputs": [],
"source": [
"#!pip install voyageai"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"metadata": {}
},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"# setup the API Key\n",
"api_key = os.environ.get(\"VOYAGE_API_KEY\") or getpass.getpass(\"Enter your VoyageAI API key: \")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"metadata": {}
},
"outputs": [],
"source": [
"from redisvl.utils.rerank import VoyageAIReranker\n",
"\n",
"reranker = VoyageAIReranker(model=\"rerank-lite-1\", limit=3, api_config={\"api_key\": api_key})\n",
"# Please check the available models at https://docs.voyageai.com/docs/reranker"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Rerank documents with VoyageAIReranker\n",
"\n",
"Below we will use the `VoyageAIReranker` to rerank and also truncate the list of\n",
"documents above based on relevance to the initial query."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"metadata": {}
},
"outputs": [],
"source": [
"results, scores = reranker.rank(query=query, docs=docs)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"metadata": {}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.796875 -- Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.\n",
"0.578125 -- Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.\n",
"0.5625 -- Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.\n"
]
}
],
"source": [
"for result, score in zip(results, scores):\n",
" print(score, \" -- \", result)"
]
}

],
"metadata": {
"kernelspec": {