RAG: Reranking to improve results #39

Open
MrCsabaToth opened this issue Aug 19, 2024 · 4 comments
Labels
enhancement (New feature or request), RAG (Retrieval Augmented Generation related)

Comments

@MrCsabaToth
Member

The vector DB is now working (#7), and the ANN distance thresholds are configurable (#35), but proper RAG would benefit from re-ranking. Doing this with Gemini could mean many API calls. Maybe we could run the Gemma 2B model (FP16, int4, instruction-tuned) locally with MediaPipe or something similar? That is not a re-ranker model, though. And how would we do that with Flutter in a platform-independent way?
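To make the idea concrete, here is a minimal Python sketch of the two stages: filter ANN candidates by a distance threshold, then reorder the survivors with a pluggable scorer. The names `Candidate`, `overlap_score`, and `retrieve_then_rerank` are hypothetical, and the word-overlap scorer is only a stand-in for a real reranker; the app itself is Flutter, so this only shows the shape of the logic.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    text: str
    distance: float  # ANN distance from the vector DB; smaller means closer


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / max(len(query_words), 1)


def retrieve_then_rerank(
    query: str,
    candidates: List[Candidate],
    max_distance: float,
    scorer: Callable[[str, str], float] = overlap_score,
    top_k: int = 3,
) -> List[Candidate]:
    # Stage 1: keep only candidates within the configurable distance threshold.
    kept = [c for c in candidates if c.distance <= max_distance]
    # Stage 2: reorder the survivors by the scorer, most relevant first.
    kept.sort(key=lambda c: scorer(query, c.text), reverse=True)
    return kept[:top_k]
```

Swapping `scorer` for a cross-encoder or an LLM-based score would leave the rest of the pipeline unchanged.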

@MrCsabaToth MrCsabaToth added the "enhancement" label (New feature or request) Aug 19, 2024
@MrCsabaToth
Member Author

Google's MediaPipe GenAI Flutter package: https://pub.dev/packages/mediapipe_genai
Unfortunately, it is only at v0.0.1.

@MrCsabaToth
Member Author

An open reranker model that performs well (besides the closed-source Cohere reranker / embedding): mxbai, https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v1
See the reference post: https://www.rungalileo.io/blog/mastering-rag-how-to-select-a-reranking-model

From "A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE":

  1. Cross-Encoders vs. LLMs: Effective cross-encoders, when paired with strong retrievers, have shown the ability to outperform most LLMs in reranking tasks, except for GPT-4 on some datasets. Notably, cross-encoders offer this improved performance while being more efficient, making them an attractive option for reranking tasks.
  2. LLM-based Rerankers: Zero-shot LLM-based rerankers, including those based on OpenAI and open models, exhibit competitive effectiveness, with some even matching the performance of GPT-3.5 Turbo. However, the inefficiency and high cost associated with these models currently limit their practical use in retrieval systems, despite their promising performance.
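For the cross-encoder route, a rough Python sketch of calling the mxbai model through the `sentence-transformers` `CrossEncoder` class might look like the following. The helper names `load_reranker` and `rerank` are hypothetical. It assumes `sentence-transformers` is installed wherever this runs (most likely server side rather than in the Flutter app), and that the model weights are downloaded on first use.

```python
from typing import Any, List, Tuple

MXBAI_MODEL = "mixedbread-ai/mxbai-rerank-large-v1"


def load_reranker(model_name: str = MXBAI_MODEL) -> Any:
    # Imported lazily so the rest of this sketch works without the package.
    from sentence_transformers import CrossEncoder

    return CrossEncoder(model_name)


def rerank(model: Any, query: str, documents: List[str], top_k: int = 3) -> List[Tuple[str, float]]:
    # CrossEncoder.rank scores every (query, document) pair and returns
    # dicts sorted by score; return_documents=True adds the "text" field.
    results = model.rank(query, documents, return_documents=True, top_k=top_k)
    return [(r["text"], float(r["score"])) for r in results]
```

Usage would be along the lines of `rerank(load_reranker(), query, retrieved_chunks)`, where `retrieved_chunks` is whatever the vector DB returned.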

@MrCsabaToth MrCsabaToth changed the title from "Reranking for RAG" to "RAG: Reranking to improve results" Aug 21, 2024
@MrCsabaToth MrCsabaToth added the "RAG" label (Retrieval Augmented Generation related) Aug 21, 2024
@MrCsabaToth
Member Author

Potential reranking code on Vertex AI: https://cloud.google.com/generative-ai-app-builder/docs/ranking#rank_or_rerank_a_set_of_records_according_to_a_query

We'll potentially need a cloud function for this.
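As a rough idea of what such a cloud function would do, the Python sketch below builds the JSON request for the ranking endpoint and maps the scored response back to our texts. The URL path, the `semantic-ranker-512@latest` model name, and the field names (`topN`, `records`, `id`, `score`) are my reading of the linked documentation and should be verified against it. The helper names `build_rank_request` and `order_by_response` are hypothetical, and authentication and the HTTP call itself are left out.

```python
import json
from typing import Dict, List, Tuple


def build_rank_request(
    project_id: str,
    query: str,
    records: List[Dict[str, str]],
    top_n: int = 5,
    model: str = "semantic-ranker-512@latest",  # assumption: check the docs
) -> Tuple[str, str]:
    # Assumed endpoint shape for the default ranking config.
    url = (
        "https://discoveryengine.googleapis.com/v1/"
        f"projects/{project_id}/locations/global/"
        "rankingConfigs/default_ranking_config:rank"
    )
    body = {"model": model, "query": query, "topN": top_n, "records": records}
    return url, json.dumps(body)


def order_by_response(response_json: str, texts_by_id: Dict[str, str]) -> List[str]:
    # Assumes the response carries a "records" list with "id" and "score".
    ranked = json.loads(response_json).get("records", [])
    ranked.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return [texts_by_id[r["id"]] for r in ranked if r["id"] in texts_by_id]
```

The function would POST the body to the URL with an access token and hand the reordered texts back to the app.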

@MrCsabaToth
Member Author

Besides invoking a specialized reranking model via a cloud function, we can also rerank purely with an LLM call that uses a well-crafted (possibly few-shot) prompt.
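A minimal Python sketch of the prompt-based variant: number the passages, ask for an ordering in a single call, and parse the reply defensively. The `llm` parameter is a hypothetical callable that takes a prompt string and returns the model's text, and the prompt wording is only a starting point. Few-shot examples could be prepended to it.

```python
import re
from typing import Callable, List


def build_rerank_prompt(query: str, passages: List[str]) -> str:
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Rank the passages by how well they answer the query.\n"
        "Reply with passage numbers only, most relevant first, "
        "separated by commas (example: 2, 1, 3).\n\n"
        f"Query: {query}\n\nPassages:\n{numbered}\n\nRanking:"
    )


def parse_ranking(reply: str, count: int) -> List[int]:
    # Collect valid, non-repeated passage numbers as zero-based indices.
    order: List[int] = []
    for token in re.findall(r"\d+", reply):
        index = int(token) - 1
        if 0 <= index < count and index not in order:
            order.append(index)
    # Passages the model left out keep their original relative order.
    missing = [i for i in range(count) if i not in order]
    return order + missing


def llm_rerank(query: str, passages: List[str], llm: Callable[[str], str]) -> List[str]:
    reply = llm(build_rerank_prompt(query, passages))
    return [passages[i] for i in parse_ranking(reply, len(passages))]
```

Ranking all candidates in one listwise call keeps the number of LLM requests at one per query, at the cost of a longer prompt.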
