[Inference] Support Sentence Transformers CLIP #495
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
# Copied and adapted from https://github.com/huggingface/optimum/blob/d03ab100206cb9f0e62167a36ee6997424bb9bb5/optimum/utils/save_utils.py#L27
# To remove once we can bump to a transformers release including https://github.com/huggingface/transformers/pull/29169
def maybe_load_preprocessors(
```
Maybe add a check of the transformers version: if it's above the coming release, fail. That way we can actually catch it and remove the copy.
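Something like this minimal sketch of the suggested guard (the `4.39.0` threshold is a placeholder, since the release that will ship the upstream fix isn't known yet):

```python
from packaging import version

import transformers

# Fail loudly once transformers ships the upstream fix, so the copied
# maybe_load_preprocessors helper gets removed instead of lingering.
# "4.39.0" is a placeholder for whichever release includes PR 29169.
if version.parse(transformers.__version__) >= version.parse("4.39.0"):
    raise RuntimeError(
        "maybe_load_preprocessors was copied from optimum as a stopgap; "
        "this transformers release should include the upstream fix, so "
        "the copy can be removed."
    )
```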
Sounds good!
Just checked the PR I submitted to transformers: it's not merged yet, and I'm not 100% sure the transformers maintainers will accept it, so I don't know which transformers version to put here. Let's keep it this way; I'll keep an eye on how it goes...
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
LGTM
What does this PR do?
Follow-up to #408: support inference for Sentence Transformers CLIP models.
[Compilation]
```bash
optimum-cli export neuron -m sentence-transformers/clip-ViT-B-32 --sequence_length 64 --text_batch_size 3 --image_batch_size 1 --num_channels 3 --height 224 --width 224 --task feature-extraction --library-name sentence_transformers --subfolder 0_CLIPModel clip_emb/
```
[Inference]
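A minimal usage sketch, assuming the artifacts exported above live in `clip_emb/`. The output attribute names (`text_embeds`, `image_embeds`) and the image file name are assumptions for illustration, not the PR's verbatim example:

```python
from PIL import Image
from sentence_transformers import util
from transformers import CLIPProcessor

from optimum.neuron import NeuronModelForSentenceTransformers

# Load the Neuron-compiled model and its processor from the export directory.
model = NeuronModelForSentenceTransformers.from_pretrained("clip_emb/")
processor = CLIPProcessor.from_pretrained("clip_emb/")

# 3 texts and 1 image, matching the batch sizes used at compilation time.
inputs = processor(
    text=["Two dogs in the snow", "A cat on a table", "London at night"],
    images=Image.open("two_dogs_in_snow.jpg"),  # hypothetical local file
    return_tensors="pt",
    padding=True,
)
outputs = model(**inputs)

# Rank the texts against the image by cosine similarity.
cos_scores = util.cos_sim(outputs.image_embeds, outputs.text_embeds)
print(cos_scores)
```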
Caveat
Since compiled models with dynamic batch size enabled only accept tensors with the same batch size, we cannot set `dynamic_batch_size=True` if the input texts and images have different batch sizes. And since `NeuronModelForSentenceTransformers` pads the inputs to the batch size used during compilation, you can compile with a relatively large batch size for flexibility, at the cost of extra compute. E.g., if you want to encode 3, 4, or 5 texts and 1 image, you could set `text_batch_size = 5 = max(3, 4, 5)` and `image_batch_size = 1` during compilation, as in the sketch below.
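For illustration, a sketch of that setup, reusing the hypothetical `model` and `processor` from the inference example above (compiled here with `--text_batch_size 5 --image_batch_size 1`):

```python
# With a model compiled with text_batch_size=5 and image_batch_size=1,
# encoding only 3 texts still works: NeuronModelForSentenceTransformers
# pads the text batch from 3 up to the compiled size of 5 (the 2 padding
# rows are wasted compute), while the single image matches exactly.
inputs = processor(
    text=["first caption", "second caption", "third caption"],  # 3 <= 5
    images=Image.open("two_dogs_in_snow.jpg"),  # exactly 1 image
    return_tensors="pt",
    padding=True,
)
outputs = model(**inputs)
```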
Before submitting