Fix ONNX Runtime cache usage for decoders, add relevant tests #756

fxmarty · 2023-02-07T11:28:53Z

This should fix #753, at least partially.

The prepare_inputs_for_generation method redefined in modeling_decoder.py and modeling_seq2seq.py may not be valid for some models that do special preprocessing before the model is called (the self(...) call) in generate().
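For readers unfamiliar with the hook, here is a minimal sketch of what a cache-aware prepare_inputs_for_generation typically does. It is not the code from this PR: it uses plain Python lists instead of tensors, and the exact set of returned keys is an assumption made for illustration.

```python
from typing import Any, Dict, List, Optional


def prepare_inputs_for_generation(
    input_ids: List[List[int]],
    past_key_values: Optional[Any] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Illustrative stand-in, not the optimum implementation."""
    # Once a cache exists, earlier tokens are already represented in
    # past_key_values, so only the newest token of each sequence is passed on.
    if past_key_values is not None:
        input_ids = [row[-1:] for row in input_ids]
    return {
        "input_ids": input_ids,
        "past_key_values": past_key_values,
        "use_cache": kwargs.get("use_cache"),
    }
```

A model that needs extra preprocessing before the forward call (for example, rebuilding other inputs from the full sequence) would not get it from a generic override like this one, which is the concern raised above.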

Tests are added to make sure reusing past key values is faster (on CPU).
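As a rough illustration of why reuse should be faster, the toy below counts how many tokens a decoder has to process with and without a cache. It is only a sketch: toy_forward and generate are hypothetical helpers, and the tests in this PR measure real generation on CPU rather than counting tokens.

```python
from typing import List, Optional, Tuple


def toy_forward(
    input_ids: List[int], past: Optional[List[int]]
) -> Tuple[int, List[int], int]:
    """Hypothetical decoder step: returns (next_token, cache, tokens_processed)."""
    cache = list(past) if past is not None else []
    cache.extend(input_ids)  # stands in for the new keys/values
    next_token = sum(cache) % 50  # depends on the full context
    return next_token, cache, len(input_ids)


def generate(
    prompt: List[int], max_new_tokens: int, use_cache: bool
) -> Tuple[List[int], int]:
    """Greedy loop that tracks the total number of tokens processed."""
    sequence = list(prompt)
    past: Optional[List[int]] = None
    work = 0
    for _ in range(max_new_tokens):
        if use_cache and past is not None:
            step_input = sequence[-1:]  # only the newest token
        else:
            step_input = sequence  # recompute everything
            past = None
        next_token, new_past, processed = toy_forward(step_input, past)
        work += processed
        if use_cache:
            past = new_past
        sequence.append(next_token)
    return sequence, work


cached_seq, cached_work = generate([1, 2, 3, 4], 5, use_cache=True)
plain_seq, plain_work = generate([1, 2, 3, 4], 5, use_cache=False)
```

Both runs produce the same sequence, but the cached run processes the prompt once plus one token per later step, while the uncached run reprocesses the whole growing sequence at every step.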

HuggingFaceDocBuilderDev · 2023-02-07T11:50:03Z

The documentation is not available anymore as the PR was closed or merged.

michaelbenayoun

LGTM

echarlaix

Thanks for the fix @fxmarty

fix onnxruntime cache usage

c460bd4

fxmarty requested review from michaelbenayoun, mht-sharma and JingyaHuang February 7, 2023 11:29

fix num beams

d656988

fix test

27edfd6

fxmarty mentioned this pull request Feb 7, 2023

Fix past key values usage huggingface/optimum-intel#187

Merged

michaelbenayoun approved these changes Feb 8, 2023

echarlaix approved these changes Feb 8, 2023

fxmarty merged commit 4d3ec82 into huggingface:main Feb 9, 2023

fxmarty mentioned this pull request Feb 11, 2023

Support bigbird ONNX export with attention_type == "block_sparse" #754

Open

4 tasks
