Update pipelines.js to allow for token_embeddings as well #770

Merged: 3 commits, May 23, 2024

NikhilVerma
Contributor

In recent examples of Optimum pipeline exports, the feature-extraction pipelines name their output `token_embeddings` instead of `last_hidden_state`.

So we should support this output name as well.
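
To illustrate the request above, here is a minimal sketch of an output-name fallback. It is not the code from this PR: `pickTokenOutput` is a hypothetical helper, and plain arrays stand in for tensors. Only the output names `last_hidden_state`, `token_embeddings`, and `sentence_embedding` come from this conversation.

```javascript
// Hypothetical helper (not the library's API): return the per-token output
// from a model's named outputs, whichever of the two names the export used.
function pickTokenOutput(outputs) {
  const result = outputs.last_hidden_state ?? outputs.token_embeddings;
  if (result === undefined) {
    const names = Object.keys(outputs).join(", ");
    throw new Error(`No per-token output found; available outputs: ${names}`);
  }
  return result;
}

// Plain nested arrays stand in for tensors in this sketch.
const defaultExport = { last_hidden_state: [[0.1, 0.2]] };
const sentenceTransformersExport = {
  token_embeddings: [[0.3, 0.4]],
  sentence_embedding: [0.3, 0.4],
};

console.log(pickTokenOutput(defaultExport));
console.log(pickTokenOutput(sentenceTransformersExport));
```

Throwing with the list of available output names is a design choice for the sketch; it makes an unexpected export easier to diagnose than a silent `undefined`.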
Collaborator

xenova commented May 22, 2024

Thanks for this! I'm pretty sure this is only for sentence-transformers models exported via Optimum (see here), but a good update regardless!

Collaborator

xenova commented May 22, 2024

Can you give an example of a model which has this as an output name?

Contributor Author

NikhilVerma commented May 23, 2024

@xenova Sorry, I should have provided one. This is the command I ran:

optimum-cli export onnx --model sentence-transformers/all-MiniLM-L6-v2 ./sbert/

I am using pnpm's patch-package capability to apply this change, and it works quite well!

Collaborator

xenova commented May 23, 2024

Thanks! Since the exported model returns two outputs (`token_embeddings` and `sentence_embedding`), I think it would be a good idea to try to get the `sentence_embedding` if it is present and the user does not specify a pooling method. Otherwise, if pooling is set, we compute it the normal way. We should also allow the user to normalize their sentence embeddings with the `normalize` option.

Is this something you'd like to add to the PR?
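
The selection logic proposed above could look roughly like the following sketch. It is an illustration under stated assumptions, not the library's implementation: `embed`, `meanPool`, and `l2Normalize` are hypothetical names, plain arrays stand in for tensors, an unset `pooling` option is treated as "not specified", and only `"mean"` and `"cls"` pooling are handled.

```javascript
// Hypothetical sketch. token outputs are [tokens][dims]; sentence_embedding is [dims].

// Average the token vectors, skipping positions masked out as padding.
function meanPool(tokenEmbeddings, attentionMask) {
  const dims = tokenEmbeddings[0].length;
  const sum = new Array(dims).fill(0);
  let count = 0;
  tokenEmbeddings.forEach((vec, i) => {
    if (attentionMask[i] === 0) return; // padding token
    vec.forEach((v, d) => { sum[d] += v; });
    count += 1;
  });
  return sum.map((v) => v / Math.max(count, 1));
}

// Scale a vector to unit L2 length (zero vectors are returned unchanged).
function l2Normalize(vec) {
  const norm = Math.sqrt(vec.reduce((acc, v) => acc + v * v, 0));
  return norm === 0 ? vec.slice() : vec.map((v) => v / norm);
}

function embed(outputs, attentionMask, { pooling, normalize = false } = {}) {
  const tokens = outputs.last_hidden_state ?? outputs.token_embeddings;
  let result;
  if (pooling === undefined && outputs.sentence_embedding !== undefined) {
    // No pooling requested and the export already provides a pooled vector.
    result = outputs.sentence_embedding;
  } else if (pooling === undefined) {
    // Nothing to pool with: return per-token vectors.
    return normalize ? tokens.map(l2Normalize) : tokens;
  } else if (pooling === "mean") {
    result = meanPool(tokens, attentionMask);
  } else if (pooling === "cls") {
    result = tokens[0]; // first token's vector
  } else {
    throw new Error(`Unsupported pooling method: ${pooling}`);
  }
  return normalize ? l2Normalize(result) : result;
}

const exported = {
  token_embeddings: [[3, 4], [1, 0]],
  sentence_embedding: [3, 4],
};
const mask = [1, 1];

console.log(embed(exported, mask, { normalize: true })); // values 0.6 and 0.8
console.log(embed(exported, mask, { pooling: "mean" })); // values 2 and 2
```

An explicit `pooling` value always wins over the precomputed `sentence_embedding` in this sketch, which matches the "otherwise, if pooling is set, we compute it the normal way" wording of the comment.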

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

xenova commented May 23, 2024

^^ Will merge for now, but could be a good follow-up PR. Let me know if you're interested!

@xenova xenova merged commit 64b3da6 into huggingface:main May 23, 2024
1 of 4 checks passed