Skip to content

Commit

Permalink
feat(RAG): Introduce SentenceTransformer Reranker (#1810)
Browse files Browse the repository at this point in the history
  • Loading branch information
machatschek authored Apr 2, 2024
1 parent f83abff commit 83adc12
Show file tree
Hide file tree
Showing 7 changed files with 198 additions and 8 deletions.
2 changes: 2 additions & 0 deletions fern/docs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,8 @@ navigation:
contents:
- page: LLM Backends
path: ./docs/pages/manual/llms.mdx
- page: Reranking
path: ./docs/pages/manual/reranker.mdx
- section: User Interface
contents:
- page: User interface (Gradio) Manual
Expand Down
36 changes: 36 additions & 0 deletions fern/docs/pages/manual/reranker.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
## Enhancing Response Quality with Reranking

PrivateGPT offers a reranking feature aimed at optimizing response generation by filtering out irrelevant documents, potentially leading to faster response times and enhanced relevance of answers generated by the LLM.

### Enabling Reranking

Document reranking can significantly improve the efficiency and quality of the responses by pre-selecting the most relevant documents before generating an answer. To leverage this feature, ensure that it is enabled in the RAG settings and consider adjusting the parameters to best fit your use case.

#### Additional Requirements

Before enabling reranking, you must install additional dependencies:

```bash
poetry install --extras rerank-sentence-transformers
```

This command installs dependencies for the cross-encoder reranker from sentence-transformers, which is currently the only supported method by PrivateGPT for document reranking.

#### Configuration

To enable and configure reranking, adjust the `rag` section within the `settings.yaml` file. Here are the key settings to consider:

- `similarity_top_k`: Determines the number of documents to initially retrieve and consider for reranking. This value should be larger than `top_n`.
- `rerank`:
- `enabled`: Set to `true` to activate the reranking feature.
- `top_n`: Specifies the number of documents to use in the final answer generation process, chosen from the top-ranked documents provided by `similarity_top_k`.

Example configuration snippet:

```yaml
rag:
similarity_top_k: 10 # Number of documents to retrieve and consider for reranking
rerank:
enabled: true
top_n: 3 # Number of top-ranked documents to use for generating the answer
```
119 changes: 118 additions & 1 deletion poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

21 changes: 15 additions & 6 deletions private_gpt/server/chat/chat_service.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
from llama_index.core.indices.postprocessor import MetadataReplacementPostProcessor
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core.postprocessor import (
SentenceTransformerRerank,
SimilarityPostprocessor,
)
from llama_index.core.storage import StorageContext
Expand Down Expand Up @@ -113,16 +114,24 @@ def _chat_engine(
context_filter=context_filter,
similarity_top_k=self.settings.rag.similarity_top_k,
)
node_postprocessors = [
MetadataReplacementPostProcessor(target_metadata_key="window"),
SimilarityPostprocessor(
similarity_cutoff=settings.rag.similarity_value
),
]

if settings.rag.rerank.enabled:
rerank_postprocessor = SentenceTransformerRerank(
model=settings.rag.rerank.model, top_n=settings.rag.rerank.top_n
)
node_postprocessors.append(rerank_postprocessor)

return ContextChatEngine.from_defaults(
system_prompt=system_prompt,
retriever=vector_index_retriever,
llm=self.llm_component.llm, # Takes no effect at the moment
node_postprocessors=[
MetadataReplacementPostProcessor(target_metadata_key="window"),
SimilarityPostprocessor(
similarity_cutoff=settings.rag.similarity_value
),
],
node_postprocessors=node_postprocessors,
)
else:
return SimpleChatEngine.from_defaults(
Expand Down
18 changes: 17 additions & 1 deletion private_gpt/settings/settings.py
Original file line number Diff line number Diff line change
Expand Up @@ -284,15 +284,31 @@ class UISettings(BaseModel):
)


class RerankSettings(BaseModel):
enabled: bool = Field(
False,
description="This value controls whether a reranker should be included in the RAG pipeline.",
)
model: str = Field(
"cross-encoder/ms-marco-MiniLM-L-2-v2",
description="Rerank model to use. Limited to SentenceTransformer cross-encoder models.",
)
top_n: int = Field(
2,
description="This value controls the number of documents returned by the RAG pipeline.",
)


class RagSettings(BaseModel):
similarity_top_k: int = Field(
2,
description="This value controls the number of documents returned by the RAG pipeline",
description="This value controls the number of documents returned by the RAG pipeline or considered for reranking if enabled.",
)
similarity_value: float = Field(
None,
description="If set, any documents retrieved from the RAG must meet a certain match score. Acceptable values are between 0 and 1.",
)
rerank: RerankSettings


class PostgresSettings(BaseModel):
Expand Down
6 changes: 6 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,11 @@ asyncpg = {version="^0.29.0", optional = true}

# Optional Sagemaker dependency
boto3 = {version ="^1.34.51", optional = true}

# Optional Reranker dependencies
torch = {version ="^2.1.2", optional = true}
sentence-transformers = {version ="^2.6.1", optional = true}

# Optional UI
gradio = {version ="^4.19.2", optional = true}

Expand All @@ -57,6 +62,7 @@ vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
storage-nodestore-postgres = ["llama-index-storage-docstore-postgres","llama-index-storage-index-store-postgres","psycopg2-binary","asyncpg"]
rerank-sentence-transformers = ["torch", "sentence-transformers"]

[tool.poetry.group.dev.dependencies]
black = "^22"
Expand Down
4 changes: 4 additions & 0 deletions settings.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,10 @@ rag:
#This value controls how many "top" documents the RAG returns to use in the context.
#similarity_value: 0.45
#This value is disabled by default. If you enable this settings, the RAG will only use articles that meet a certain percentage score.
rerank:
enabled: false
model: cross-encoder/ms-marco-MiniLM-L-2-v2
top_n: 1

llamacpp:
prompt_style: "mistral"
Expand Down

0 comments on commit 83adc12

Please sign in to comment.