
Pass kwargs to encoder #482

Open
RichaMax opened this issue Nov 28, 2024 · 4 comments
RichaMax commented Nov 28, 2024

Feature request

Models like https://huggingface.co/BAAI/bge-m3 and https://huggingface.co/jinaai/jina-embeddings-v3 can take extra kwargs as input to the encode function, such as task=... for Jina v3 or return_dense=True/False for bge-m3.

It would be great if we could pass these kwargs, either when using the async engine via the Python API:

engine.embed(sentences=[...], **kwargs)

or when sending requests to an endpoint created using your Docker image:

r = requests.post("http://0.0.0.0:7997/embeddings", json={"model":"test_model","input":["Two cute cats."], "task": "text-matching"})
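To make the request-side idea concrete, here is a minimal, hypothetical sketch of how a server could separate the standard embedding-request fields from any extra per-request encode kwargs. Only "model" and "input" come from the example above; the function name and the splitting scheme are illustrative assumptions, not Infinity's actual schema or code.

```python
# Hypothetical sketch: split standard OpenAI-style embedding fields from
# extra per-request kwargs. "model" and "input" mirror the request above;
# everything else is an assumption, not Infinity's actual request schema.
KNOWN_FIELDS = {"model", "input"}

def split_request(payload: dict) -> tuple[dict, dict]:
    """Return (standard_fields, extra_kwargs) from a JSON request body."""
    known = {k: v for k, v in payload.items() if k in KNOWN_FIELDS}
    extra = {k: v for k, v in payload.items() if k not in KNOWN_FIELDS}
    return known, extra
```

The extra dict could then, in principle, be forwarded to the model's encode call.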

Motivation

This could also be used to handle truncate_dim for Matryoshka embeddings.

might be linked to: #476
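For background, truncate_dim for Matryoshka embeddings amounts to keeping the first dim components of the vector and re-normalizing. A small illustrative sketch of that operation (not Infinity's API; the function name is made up):

```python
import math

def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components of a Matryoshka embedding and
    L2-renormalize; this is what `truncate_dim` amounts to."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0  # guard zero vectors
    return [x / norm for x in head]
```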

Your contribution

I could try to implement it in my free time, but I do not have much currently, plus I'm still navigating the code. Any pointers on where to start are welcome.

michaelfeil (Owner) commented Nov 28, 2024

Matryoshka embeddings are handled in another issue; please comment in #476.

Due to the dynamic batching nature, models such as Jina need to handle the prompt template at the instance level. Sentence Transformers implements this at the per-batch level, but in Infinity a single batch may span multiple tenants.
Therefore this request is not possible and has fairly high complexity. Also, the interface is not standardized; the one above is just for this Jina model family.
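To illustrate the constraint: with dynamic batching, one batch can mix requests from different users, so per-request kwargs would only work if the scheduler first grouped queued requests by identical kwargs. A rough sketch of such bucketing, purely illustrative and not Infinity's actual scheduler:

```python
from collections import defaultdict

def bucket_by_kwargs(requests):
    """Group (sentence, kwargs) pairs so each batch shares one kwargs set.

    `requests`: iterable of (sentence, dict) tuples. Returns a list of
    (kwargs_dict, sentences) batches. Purely illustrative sketch.
    """
    buckets = defaultdict(list)
    for sentence, kwargs in requests:
        key = tuple(sorted(kwargs.items()))  # hashable bucket key
        buckets[key].append(sentence)
    return [(dict(key), sents) for key, sents in buckets.items()]
```

The trade-off is that bucketing fragments batches, which works against the throughput gains dynamic batching is meant to provide.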

RichaMax (Author) commented Nov 29, 2024

Ok, my bad; thanks for the explanation. I was under the impression that, similarly to what has been done in vLLM, we could pass custom args to the encode method for each request.

So this means that if I want to use a model like Jina for a specific task such as retrieval.query, as in the example on the model card:

model = SentenceTransformer("jinaai/jina-embeddings-v3", trust_remote_code=True, model_kwargs={'default_task': 'retrieval.query'})

do I need to find a way to pass it through the EngineArgs to the AsyncEmbeddingEngine?

Thanks for the hard work on this lib!

s04 commented Dec 2, 2024

Hi, looking into this as well. Could this possibly be done by baking/hard-coding it into the model when passing it in, rather than through kwargs?

Thanks for all the hard work on the lib, it's a pleasure to use. Hoping to pick up the Matryoshka embeddings issue.

michaelfeil (Owner) commented

@s04 What do you mean by hard-coding? Currently it uses the default template / no template.
