
Please support reranker API #2018

Closed
thiner opened this issue Apr 12, 2024 · 12 comments · Fixed by #2121
Assignees
Labels
enhancement New feature or request roadmap

Comments

thiner (Contributor) commented Apr 12, 2024

Is your feature request related to a problem? Please describe.
Embedding + reranker is currently the state-of-the-art approach to improving the accuracy of RAG systems. We already have embedding API support in LocalAI; it would be a big step forward to support a reranker API as well.

Describe the solution you'd like
There are many reranker models out there; some well-known ones are bce-reranker-base_v1, CohereRerank, and bge-reranker-v2-m3. I think the Jina reranker API would be a good format to implement: https://jina.ai/reranker/#apiform
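For illustration, a Jina-style rerank request/response could look like the sketch below. The field names follow the public Jina reranker API docs; treat the exact schema as an assumption, not LocalAI's final API.

```python
# Hypothetical Jina-style rerank request payload (field names assumed
# from the Jina reranker API docs, not from LocalAI itself)
request = {
    "model": "bge-reranker-v2-m3",  # any reranker model name
    "query": "What is LocalAI?",
    "documents": [
        "LocalAI is a drop-in OpenAI API replacement.",
        "Bananas are rich in potassium.",
    ],
    "top_n": 1,
}

# The server scores each (query, document) pair and replies with the
# top_n results sorted by relevance_score; an illustrative response:
response = {
    "results": [
        {
            "index": 0,
            "relevance_score": 0.97,  # made-up score for illustration
            "document": request["documents"][0],
        },
    ],
}
```

The key point of the format is that the response refers back to documents by index, so clients can rerank without re-sending document text.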

Describe alternatives you've considered
n/a

Additional context
The benchmark regarding embedding + reranker for RAG: (benchmark chart attached in the original issue)

@thiner thiner added the enhancement New feature or request label Apr 12, 2024
thiner (Contributor, Author) commented Apr 12, 2024

I spent some time figuring out how to implement this. Below are my findings:

  1. Using a reranker model is as easy as using an embedding model. Below is an example with bce-reranker-base_v1:

```python
from sentence_transformers import CrossEncoder

# init the reranker model
model = CrossEncoder('maidalun1020/bce-reranker-base_v1', max_length=512)

# score (query, passage) pairs; sentence_pairs was left implicit in the
# original snippet, so a small example is filled in here
sentence_pairs = [
    ('what is a panda?', 'The giant panda is a bear species endemic to China.'),
    ('what is a panda?', 'Paris is the capital of France.'),
]
scores = model.predict(sentence_pairs)
```

  2. backend/python/sentencetransformers/sentencetransformers.py already supports embeddings; adding one more Rerank method built on the code above may be enough. The major work would be refactoring the protobuf definitions and the API component.
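The Rerank method suggested above can be sketched independently of the gRPC plumbing. This is a toy sketch with an injected scoring function; the function and parameter names are hypothetical, and a real backend would pass something like `CrossEncoder.predict` as `score_pairs`.

```python
from typing import Callable, List, Tuple

def rerank(query: str, documents: List[str],
           score_pairs: Callable[[List[Tuple[str, str]]], List[float]],
           top_n: int = 3) -> List[Tuple[int, float]]:
    """Score (query, document) pairs and return (index, score),
    sorted by descending relevance, truncated to top_n."""
    pairs = [(query, doc) for doc in documents]
    scores = score_pairs(pairs)
    ranked = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
    return ranked[:top_n]

# usage with a toy word-overlap scorer (stands in for a real cross-encoder)
toy = lambda pairs: [len(set(q.split()) & set(d.split())) for q, d in pairs]
print(rerank("blue cat", ["a blue dog", "the blue cat", "red car"], toy, top_n=2))
# → [(1, 2), (0, 1)]
```

Returning (index, score) pairs matches the index-based response shape used by hosted reranker APIs, so the gRPC layer only needs to serialize this list.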

@mudler mudler added the roadmap label Apr 12, 2024
mudler (Owner) commented Apr 12, 2024

That's definitely a good addition - adding it to our roadmap. Thanks also for pointing out the steps.

@mudler mudler added the up for grabs Tickets that no-one is currently working on label Apr 12, 2024
localai-bot (Contributor) commented

Thank you for your feature request, Thiner! We appreciate the details and the code example you provided to help illustrate the implementation. Adding reranker API support to LocalAI would indeed be a valuable improvement, especially considering the current state-of-the-art solutions for RAG systems.

To proceed with this feature request, we will evaluate the feasibility of incorporating a reranker API implementation similar to the example you provided, using a model like bce-reranker-base_v1 or other popular options. We will also research the best practices for integrating this functionality into LocalAI's existing architecture.

We will update the roadmap to include this feature request. Once we have completed our internal discussions and evaluations, we will provide an estimate of when this feature can be implemented, along with any additional details regarding the implementation.

Feel free to reach out if you have any further questions or concerns in the meantime. Thanks again for your suggestion and for helping us improve LocalAI!

shuther commented Apr 24, 2024

As a possibly more configurable approach, we may benefit from the https://github.com/AnswerDotAI/rerankers project.

@mudler mudler self-assigned this Apr 24, 2024
@mudler mudler removed the up for grabs Tickets that no-one is currently working on label Apr 24, 2024
mudler (Owner) commented Apr 24, 2024

Having a quick look at this - let's see if I can get something working before the weekend.

mudler (Owner) commented Apr 24, 2024

(screenshot attached: output)

OK, that was easy enough - going to open a PR soon.

mudler (Owner) commented Apr 24, 2024

@thiner @BradKML this will be part of the next release - if you want to test it out before then, the PR is at #2121

shuther commented Apr 25, 2024

That is very good news. Thank you very much!

thiner (Contributor, Author) commented Apr 28, 2024

@mudler How can I attach the reranker feature to the cublas-cuda12-core image? I tried to do so in the Dockerfile with `RUN make BUILD_TYPE=cublas -C backend/python/rerankers`, but it failed with the error below:

```
0.189 make: Entering directory '/build/backend/python/rerankers'
0.189 python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
0.189 make: Leaving directory '/build/backend/python/rerankers'
0.189 /opt/conda/bin/python3: Error while finding module specification for 'grpc_tools.protoc' (ModuleNotFoundError: No module named 'grpc_tools')
0.189 make: *** [Makefile:27: backend_pb2.py] Error 1
```

What should I do to fix it?

Please forgive my lazy thinking; the solution is quite straightforward: `pip install grpcio-tools`.

BradKML commented Apr 28, 2024

@mudler thank you for the cross-link references. I was mostly focused on LiteLLM and Ollama for maximizing compatibility, but knowing that LocalAI is "getting there" is quite a relief.

mudler (Owner) commented Apr 28, 2024

> @mudler How can I attach the reranker feature to the cublas-cuda12-core image? I tried to do so in the Dockerfile with `RUN make BUILD_TYPE=cublas -C backend/python/rerankers`, but it failed with the error below:
>
> ```
> 0.189 make: Entering directory '/build/backend/python/rerankers'
> 0.189 python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
> 0.189 make: Leaving directory '/build/backend/python/rerankers'
> 0.189 /opt/conda/bin/python3: Error while finding module specification for 'grpc_tools.protoc' (ModuleNotFoundError: No module named 'grpc_tools')
> 0.189 make: *** [Makefile:27: backend_pb2.py] Error 1
> ```
>
> What should I do to fix it?
>
> Please forgive my lazy thinking; the solution is quite straightforward: `pip install grpcio-tools`.

I'd suggest using the standard (non-core) images, since the core images do not ship the additional Python dependencies. If you still want to use the core images, you can either create a Dockerfile based on top of one and run the command to prepare the backend, or use EXTRA_BACKENDS, as outlined in the docs here:

https://localai.io/advanced/#extra-backends

So, for instance, you can use it like this:

```
docker run --env EXTRA_BACKENDS="backend/python/rerankers" quay.io/go-skynet/local-ai:master-ffmpeg-core
```

and that should bring up the needed Python dependencies on startup.
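For completeness, the Dockerfile route could look roughly like the sketch below. This is a hypothetical example: the base image tag is assumed, and the `make` invocation and `pip install grpcio-tools` fix are taken from the commands quoted earlier in this thread.

```dockerfile
# Hypothetical Dockerfile layering the rerankers backend on a core image;
# verify the base tag against the images you actually use.
FROM quay.io/go-skynet/local-ai:master-cublas-cuda12-core

# the protobuf codegen needs grpcio-tools, which core images omit
RUN pip install grpcio-tools

# prepare the rerankers backend inside the image
RUN make BUILD_TYPE=cublas -C backend/python/rerankers
```

The trade-off versus EXTRA_BACKENDS is startup time: baking the backend into the image does the preparation once at build time instead of on every container start.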

thiner (Contributor, Author) commented Apr 28, 2024

Yes, I did so. The Docker image builds successfully when specifying the extra backend, but I still get the gRPC error at runtime. Were there any changes to the gRPC module in the v2.13.0 release? I built the autogptq image with this Dockerfile previously, and that was working.
