RAG: Reranking to improve results #39
Comments
MediaPipe GenAI Flutter package by Google: https://pub.dev/packages/mediapipe_genai
An open reranker model that performs well (besides the closed-source Cohere reranker / embedding) is mxbai: https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v1, see the reference post https://www.rungalileo.io/blog/mastering-rag-how-to-select-a-reranking-model and the paper "A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE".
Potential reranking code on Vertex AI: https://cloud.google.com/generative-ai-app-builder/docs/ranking#rank_or_rerank_a_set_of_records_according_to_a_query We'll probably need a cloud function for this.
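A rough sketch of what the core of such a cloud function could look like, assuming the client sends the query plus the candidate records and gets them back sorted by relevance. The payload shape, `rerank_handler`, and `overlap_score` are hypothetical names for illustration; the actual call to a reranking service would replace the toy scorer and is deliberately left out here.

```python
from typing import Callable

# Hypothetical scorer signature: (query, document_text) -> relevance score.
# In a real deployment this would wrap a call to a reranking model or service.
ScoreFn = Callable[[str, str], float]


def rerank_handler(payload: dict, score_fn: ScoreFn) -> dict:
    """Rerank payload["records"] against payload["query"], best match first.

    Assumed payload shape (not taken from any real API):
    {"query": str, "records": [{"id": str, "content": str}], "top_n": int?}
    """
    query = payload["query"]
    top_n = payload.get("top_n")
    scored = [
        {"id": r["id"], "content": r["content"],
         "score": score_fn(query, r["content"])}
        for r in payload["records"]
    ]
    # Higher score means more relevant, so sort descending.
    scored.sort(key=lambda r: r["score"], reverse=True)
    return {"records": scored[:top_n] if top_n else scored}


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in scorer: fraction of query words that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


example_payload = {
    "query": "flutter vector database",
    "records": [
        {"id": "a", "content": "cooking recipes for pasta"},
        {"id": "b", "content": "a vector database for flutter apps"},
        {"id": "c", "content": "flutter widgets"},
    ],
}
example_result = rerank_handler(example_payload, overlap_score)
```

Keeping the scorer pluggable means the Flutter client only talks to one endpoint, and the model behind it can be swapped without changing the app.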
…s & runs, still no effect tho #39
Besides invoking a specialized reranking model via a cloud function, we can also perform reranking purely with an LLM call that uses a well-crafted (possibly few-shot) prompt.
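To make the prompt-based idea concrete, here is a minimal sketch of listwise reranking with a single LLM call. Everything here is an assumption for illustration: the few-shot example text, the `[2] > [1]` answer format, and the `call_llm` callable (any function that takes a prompt string and returns the model's reply).

```python
import re
from typing import Callable, List

# Hypothetical few-shot example showing the model the expected answer format.
FEW_SHOT = (
    "Query: capital of France\n"
    "[1] Berlin is the capital of Germany.\n"
    "[2] Paris is the capital of France.\n"
    "Ranking: [2] > [1]\n\n"
)


def build_rerank_prompt(query: str, passages: List[str]) -> str:
    """Build one prompt asking the model to order all passages by relevance."""
    lines = [
        "Rank the passages by relevance to the query, most relevant first.",
        "Answer only with the passage numbers, e.g. [2] > [1].\n",
        FEW_SHOT + f"Query: {query}",
    ]
    lines += [f"[{i}] {p}" for i, p in enumerate(passages, start=1)]
    lines.append("Ranking:")
    return "\n".join(lines)


def parse_ranking(reply: str, n: int) -> List[int]:
    """Turn the reply into 0-based indices; tolerate junk in the output.

    Out-of-range numbers and duplicates are dropped, and any passage the
    model forgot is appended in its original order.
    """
    order: List[int] = []
    for token in re.findall(r"\d+", reply):
        i = int(token) - 1
        if 0 <= i < n and i not in order:
            order.append(i)
    order += [i for i in range(n) if i not in order]
    return order


def llm_rerank(query: str, passages: List[str],
               call_llm: Callable[[str], str]) -> List[str]:
    """Rerank passages with a single LLM call."""
    reply = call_llm(build_rerank_prompt(query, passages))
    return [passages[i] for i in parse_ranking(reply, len(passages))]
```

The defensive parsing matters because a general-purpose model is not guaranteed to follow the format, so a malformed reply degrades to the original retrieval order instead of failing.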
Right now the vector DB is working (#7) and we also made the ANN distance thresholds configurable (#35), but for proper RAG it'd be great to have re-ranking. Using Gemini this could mean many calls. Maybe we could leverage the Gemma 2B model (FP16, int4, instruction-tuned) locally with MediaPipe or something? That's not a re-ranker model though. And how would we do that with Flutter in a platform-independent way?
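One way to keep the number of model calls bounded is to apply the configurable distance threshold first and then cap how many candidates are passed on to the reranker. A small sketch, assuming hits arrive as `(text, distance)` pairs where a smaller distance means a closer match; `select_candidates` and its parameters are hypothetical names, not existing code in this project.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (chunk text, ANN distance); smaller = closer


def select_candidates(hits: List[Hit], max_distance: float,
                      max_candidates: int) -> List[Hit]:
    """Drop hits beyond the distance threshold, then keep the closest few.

    With a pointwise reranker (one model call per candidate), the number
    of calls per query is then at most max_candidates.
    """
    kept = [h for h in hits if h[1] <= max_distance]
    kept.sort(key=lambda h: h[1])
    return kept[:max_candidates]


example_hits = [("a", 0.9), ("b", 0.2), ("c", 0.5), ("d", 0.4)]
shortlist = select_candidates(example_hits, max_distance=0.6, max_candidates=2)
```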