
Generate: move prepare_inputs_for_generation in encoder-decoder llms #34048

Merged: 6 commits merged into huggingface:main on Oct 11, 2024

Conversation

@gante (Member) commented Oct 9, 2024

What does this PR do?

Part of step 6 in #32685
Follow-up to #33870

This PR:

  • Adds a minor change to GenerationMixin.prepare_inputs_for_generation to use decoder_input_ids in encoder-decoder models (a rough sketch of the idea follows below this list)
  • Deletes almost all prepare_inputs_for_generation in encoder-decoder llms 🔪 😎
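
To illustrate the idea only (a minimal sketch, not the actual GenerationMixin code): a shared prepare_inputs_for_generation can key the growing sequence on decoder_input_ids whenever config.is_encoder_decoder is set, assuming a model_kwargs dict of generation-time keyword arguments and an optional past_key_values cache.

    # Hedged sketch of the gist of the change -- not the upstream implementation.
    def prepare_inputs_for_generation(self, input_ids, past_key_values=None, **model_kwargs):
        # Encoder-decoder models grow `decoder_input_ids`; decoder-only models grow `input_ids`.
        input_ids_key = "decoder_input_ids" if self.config.is_encoder_decoder else "input_ids"

        # With a cache, only the not-yet-processed (last) token needs to be forwarded.
        if past_key_values is not None:
            input_ids = input_ids[:, -1:]

        model_inputs = {input_ids_key: input_ids, "past_key_values": past_key_values}
        model_inputs.update(model_kwargs)  # e.g. encoder_outputs, decoder_attention_mask, use_cache
        return model_inputs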

@gante (Member, Author) commented Oct 9, 2024

@zucchini-nlp this PR may have a conflict with your encoder-decoder+compile PR 👀

@zucchini-nlp (Member) left a comment:

Thanks! I will update my PR when this one gets merged. Left a tiny question about Blip-2, overall LGTM as long as the tests don't complain

"past_key_values": past_key_values,
"encoder_hidden_states": model_kwargs.get("encoder_hidden_states", None),
"encoder_attention_mask": model_kwargs.get("encoder_attention_mask", None),
"is_decoder": True,
zucchini-nlp (Member) commented:

Is it okay that we're losing this? It seems like BLIP was forcefully passing this kwarg in order to set up the cache later?

I think we don't have tests for BlipText, nor for the VLM part, so we can't rely on tests for BLIP 😭 (I'll work on it soon; right now I'm working on the Idefics models and BLIP will be next)

gante (Member, Author) replied:

Hmm, perhaps -- is_decoder=True is the default everywhere (in forward, in the config), but the user could force it to False. Going to revert this part.

(I suspect this class is never used with is_decoder=True, but it's too late to fix that :D )
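
(For context, a condensed, hypothetical sketch of what keeping the BLIP override means, based on the snippet quoted above; the forced kwarg keeps forward on the decoder/cache path regardless of what the config says:)

    # Hypothetical, condensed sketch of a kept BLIP-style override -- not verbatim model code.
    def prepare_inputs_for_generation(self, input_ids, past_key_values=None, attention_mask=None, **model_kwargs):
        if past_key_values is not None:
            input_ids = input_ids[:, -1:]  # with a cache, only the last token is needed
        return {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "past_key_values": past_key_values,
            "encoder_hidden_states": model_kwargs.get("encoder_hidden_states", None),
            "encoder_attention_mask": model_kwargs.get("encoder_attention_mask", None),
            "is_decoder": True,  # forced here so `forward` always takes the decoder/cache path
        }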

zucchini-nlp (Member) replied:

Yeah, BLIP is a difficult case, better to keep it overridden hehe

@ArthurZucker (Collaborator) left a comment:

🧼 🧼 🧼 🧼 Very nice!

@gante force-pushed the encoder_decoder_prepare branch 2 times, most recently from ca46d3b to 40d6c34 on October 11, 2024 at 12:16
@gante force-pushed the encoder_decoder_prepare branch from 40d6c34 to 369b614 on October 11, 2024 at 13:44
@gante (Member, Author) commented Oct 11, 2024

Ran the following slow tests before merging:

  • Llama
  • BART
  • T5 (same failures as main)

@gante merged commit 37ac078 into huggingface:main on Oct 11, 2024
23 of 24 checks passed
@gante deleted the encoder_decoder_prepare branch on October 11, 2024 at 15:11
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this pull request Oct 14, 2024
Don't assume that past_key_values is part of the model_kwargs.

This fix is similar to huggingface#2140 but for encoder-decoder models. It became
necessary after huggingface/transformers#34048
was merged into transformers.
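
(A minimal sketch of the defensive pattern that commit describes, assuming a model_kwargs dict produced by prepare_inputs_for_generation; an illustration, not the actual PEFT code:)

    # After transformers#34048, encoder-decoder models may omit "past_key_values" from
    # model_kwargs, so read it with .get() instead of indexing to avoid a KeyError.
    past_key_values = model_kwargs.get("past_key_values", None)
    if past_key_values is not None:
        ...  # e.g. inspect or adjust the cache before calling the base model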
BenjaminBossan added a commit to huggingface/peft that referenced this pull request Oct 14, 2024
Don't assume that past_key_values is part of the model_kwargs.

This fix is similar to #2140 but for encoder-decoder models. It became
necessary after huggingface/transformers#34048
was merged into transformers.
yaswanth19 pushed a commit to yaswanth19/peft that referenced this pull request Oct 20, 2024
Don't assume that past_key_values is part of the model_kwargs.

This fix is similar to huggingface#2140 but for encoder-decoder models. It became
necessary after huggingface/transformers#34048
was merged into transformers.
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this pull request Oct 22, 2024
Don't assume that past_key_values is part of the model_kwargs.

This fix is similar to huggingface#2140 but for encoder-decoder models. It became
necessary after huggingface/transformers#34048
was merged into transformers.
@SabaPivot left a comment:

Could you please reply?
Much appreciated!

Comment on lines +3844 to +3875
def test_prepare_inputs_for_generation_encoder_decoder_llm(self):
    """
    Same as `test_prepare_inputs_for_generation_decoder_llm` but for encoder-decoder models. Main difference: we
    should look for `decoder_input_ids`, instead of `input_ids`.
    """
    model = AutoModelForSeq2SeqLM.from_pretrained("hf-internal-testing/tiny-random-t5")
    model = model.to(torch_device)

    # 1. Sanity check: the model's `prepare_inputs_for_generation` comes from `GenerationMixin`
    self.assertTrue("GenerationMixin" in str(model.prepare_inputs_for_generation))

    # 2. If we pass input ids by themselves, we should get back the same input ids -- with the encoder-decoder key
    decoder_input_ids = torch.tensor([[1, 2, 3], [4, 5, 6]]).to(torch_device)
    model_inputs = model.prepare_inputs_for_generation(decoder_input_ids)
    self.assertTrue(torch.all(model_inputs["decoder_input_ids"] == decoder_input_ids))

    # 3. If we pass the attention mask too, we will get back the attention mask. Encoder-decoder models usually
    # don't use `position_ids`
    decoder_attention_mask = torch.tensor([[1, 1, 1], [1, 1, 1]]).to(torch_device)
    model_inputs = model.prepare_inputs_for_generation(
        decoder_input_ids, decoder_attention_mask=decoder_attention_mask
    )
    self.assertTrue(torch.all(model_inputs["decoder_attention_mask"] == decoder_attention_mask))
    self.assertTrue("position_ids" not in model_inputs)

    # 4. `use_cache` (and other kwargs, like the encoder outputs) are forwarded
    self.assertFalse("use_cache" in model_inputs)  # From the previous input, there is no `use_cache`
    model_inputs = model.prepare_inputs_for_generation(decoder_input_ids, use_cache=True, encoder_outputs="foo")
    self.assertTrue(model_inputs["use_cache"] is True)
    self.assertTrue(model_inputs["encoder_outputs"] == "foo")
    # See the decoder-only test for more corner cases. The code is the same, so we don't repeat it here.

@SabaPivot asked: Should I add this to my AutoAdapterModel to generate in adapters using T5?

Member reply:

If you mean the tests, you should not need to add them anywhere, as they are run only to check the correctness of new modifications.

In general, it is advised to post questions in the forum if it is not a bug or feature request.

BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024