[Inference] Support Sentence Transformers CLIP #495
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
# Copied and adapted from https://github.com/huggingface/optimum/blob/d03ab100206cb9f0e62167a36ee6997424bb9bb5/optimum/utils/save_utils.py#L27
# To remove once we can bump to a transformers release including https://github.com/huggingface/transformers/pull/29169
def maybe_load_preprocessors(
```
Maybe add a check of the transformers version: if it's above the coming release, fail. That way we can actually catch it and remove the copy.
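Something like this minimal sketch of the suggested guard (the `4.39.0` threshold is a placeholder, since the release that will ship the upstream fix isn't known yet):

```python
from packaging import version

import transformers

# Fail loudly once transformers ships the upstream fix, so the copied
# maybe_load_preprocessors helper gets removed instead of lingering.
# "4.39.0" is a placeholder for whichever release includes PR 29169.
if version.parse(transformers.__version__) >= version.parse("4.39.0"):
    raise RuntimeError(
        "maybe_load_preprocessors was copied from optimum as a stopgap; "
        "this transformers release should include the upstream fix, so "
        "the copy can be removed."
    )
```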
Sounds good!
Just checked the PR I submitted to transformers: it's not merged yet, and I'm not 100% sure the transformers maintainers will accept it, so I don't know which transformers version to put here. Let's keep it this way; I'll keep an eye on how it goes...
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
LGTM
What does this PR do?
Follow-up to #408: support inference for Sentence Transformers CLIP models.
[Compilation]
```bash
optimum-cli export neuron -m sentence-transformers/clip-ViT-B-32 --sequence_length 64 --text_batch_size 3 --image_batch_size 1 --num_channels 3 --height 224 --width 224 --task feature-extraction --library-name sentence_transformers --subfolder 0_CLIPModel clip_emb/
```
[Inference]
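A minimal usage sketch, assuming the artifacts exported above live in `clip_emb/`. The output attribute names (`text_embeds`, `image_embeds`) and the image file name are assumptions for illustration, not the PR's verbatim example:

```python
from PIL import Image
from sentence_transformers import util
from transformers import CLIPProcessor

from optimum.neuron import NeuronModelForSentenceTransformers

# Load the Neuron-compiled model and its processor from the export directory.
model = NeuronModelForSentenceTransformers.from_pretrained("clip_emb/")
processor = CLIPProcessor.from_pretrained("clip_emb/")

# 3 texts and 1 image, matching the batch sizes used at compilation time.
inputs = processor(
    text=["Two dogs in the snow", "A cat on a table", "London at night"],
    images=Image.open("two_dogs_in_snow.jpg"),  # hypothetical local file
    return_tensors="pt",
    padding=True,
)
outputs = model(**inputs)

# Rank the texts against the image by cosine similarity.
cos_scores = util.cos_sim(outputs.image_embeds, outputs.text_embeds)
print(cos_scores)
```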
Caveat
Since compiled models with dynamic batch size enabled only accept tensors with the same batch size, we cannot set `dynamic_batch_size=True` if the input texts and images have different batch sizes. And since `NeuronModelForSentenceTransformers` pads the inputs to the batch size used during compilation, you can compile with a relatively large batch size for flexibility, at the cost of extra compute. E.g., if you want to encode 3, 4, or 5 texts and 1 image, you could set `text_batch_size = 5 = max(3, 4, 5)` and `image_batch_size = 1` during compilation, as in the sketch below.
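For illustration, a sketch of that setup, reusing the hypothetical `model` and `processor` from the inference example above (compiled here with `--text_batch_size 5 --image_batch_size 1`):

```python
# With a model compiled with text_batch_size=5 and image_batch_size=1,
# encoding only 3 texts still works: NeuronModelForSentenceTransformers
# pads the text batch from 3 up to the compiled size of 5 (the 2 padding
# rows are wasted compute), while the single image matches exactly.
inputs = processor(
    text=["first caption", "second caption", "third caption"],  # 3 <= 5
    images=Image.open("two_dogs_in_snow.jpg"),  # exactly 1 image
    return_tensors="pt",
    padding=True,
)
outputs = model(**inputs)
```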
Before submitting