Support exporting text-generation models for sequence classification to ONNX #1527

Closed
dwyatte opened this issue Nov 10, 2023 · 3 comments · Fixed by #1679

Comments

@dwyatte
Contributor

dwyatte commented Nov 10, 2023

Feature request

Exporting text-generation models for sequence classification (e.g., LlamaForSequenceClassification) to ONNX was disabled in #1308.

According to https://arxiv.org/abs/2310.01208, these models can outperform typical encoder models for sequence classification (I can confirm this on my own datasets).

What would it take to support this in transformers? CC @fxmarty since it was mentioned specifically in the PR above

Motivation

I would like to export fine-tuned sequence classification models that use a decoder-only model as their base architecture to ONNX

Your contribution

Happy to submit a PR to transformers with guidance on what would be needed

@dwyatte
Contributor Author

dwyatte commented Nov 11, 2023

Ah, I see what's happening: ONNX doesn't support int64 inputs to argmax, which is how these models compute the sequence lengths for pooling. I will open a PR over in transformers and leave this issue open until we can enable export in this repo.
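
For readers unfamiliar with the pooling step mentioned here, the following is a rough plain-Python sketch of the index logic, not the actual transformers code. In the real models this is a tensor argmax over a pad-token mask, and the dtype of that mask is what matters for ONNX. The function name `last_non_pad_index`, the right-padding assumption, and the modulo used for rows without padding are all illustrative assumptions.

```python
from typing import List


def last_non_pad_index(input_ids: List[List[int]], pad_token_id: int) -> List[int]:
    """Return, per row, the index of the last token before right-padding starts.

    Hypothetical helper for illustration; assumes right-padded inputs.
    """
    indices = []
    for row in input_ids:
        # Integer mask of pad positions. An argmax over this mask yields the
        # position of the first pad token, or 0 when the row has no padding.
        is_pad = [int(tok == pad_token_id) for tok in row]
        first_pad = is_pad.index(1) if 1 in is_pad else 0
        # Step back one position to reach the last real token. The modulo maps
        # -1 (row without padding) to the final index of the row.
        indices.append((first_pad - 1) % len(row))
    return indices


batch = [
    [5, 6, 7, 0, 0],      # right-padded row
    [8, 9, 10, 11, 12],   # row without padding
]
positions = last_non_pad_index(batch, pad_token_id=0)
```

The classification head would then read the hidden state at each of these positions. With left-padded inputs this sketch would pick the wrong token, so it only applies under the right-padding assumption stated above.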

@fxmarty
Contributor

fxmarty commented Nov 14, 2023

Thank you for the fix!

@dwyatte
Contributor Author

dwyatte commented Dec 19, 2023

This requires one more fix in transformers to get output validation passing with the current dummy inputs: huggingface/transformers#28144
