Hello,
Thanks for trying our library.
We are working on adding official T5 support to the convert script so that you can do the conversion with a single command.
It will be added very soon (the ONNX conversion in particular, maybe in less than a week), but if you want, I can help you with the Triton configuration (it is a little bit complicated).
Hi there,
Thanks again for this library!
We're trying to convert a fine-tuned T5 model to ONNX and run it in Triton. We've managed to convert the model to ONNX and, following the T5 notebook guide, run it just fine in Python.
But getting it to run in Triton has been a challenge. In particular, we're not sure how to get `past_key_values` passed through in Triton. We have the decoder config as follows:
And when we do the following:
We get this error:
Any idea how we can fix this?
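For readers hitting the same question, here is a minimal sketch of one way to enumerate the cache tensors a T5 decoder expects and render them as `input` entries for a Triton `config.pbtxt`. It assumes the model is served with `max_batch_size: 0` (so the batch dimension is listed explicitly) and that each decoder layer carries four cached tensors (self-attention key/value and cross-attention key/value) of shape `[batch, num_heads, seq_len, head_dim]`. The tensor names used here (`past_key_values.<layer>.<decoder|encoder>.<key|value>`) and both helper functions are hypothetical; the names must be replaced with whatever input names the exported ONNX graph actually declares.

```python
from typing import List


def past_key_value_names(num_layers: int) -> List[str]:
    """List hypothetical cache tensor names, four per decoder layer.

    The naming scheme is an assumption; check the real input names
    of the exported ONNX decoder and adjust accordingly.
    """
    names = []
    for layer in range(num_layers):
        for attention in ("decoder", "encoder"):  # self- and cross-attention
            for kind in ("key", "value"):
                names.append(f"past_key_values.{layer}.{attention}.{kind}")
    return names


def triton_input_block(
    names: List[str],
    num_heads: int,
    head_dim: int,
    data_type: str = "TYPE_FP32",
) -> str:
    """Render an `input [...]` section in Triton's config.pbtxt text format.

    Assumes max_batch_size: 0, so dims are [batch, heads, seq_len, head_dim]
    with -1 marking the variable batch and sequence dimensions.
    """
    entries = []
    for name in names:
        entries.append(
            "  {\n"
            f'    name: "{name}"\n'
            f"    data_type: {data_type}\n"
            f"    dims: [ -1, {num_heads}, -1, {head_dim} ]\n"
            "  }"
        )
    return "input [\n" + ",\n".join(entries) + "\n]"


if __name__ == "__main__":
    # t5-small has 6 decoder layers, 8 attention heads, and a head size of 64.
    print(triton_input_block(past_key_value_names(6), num_heads=8, head_dim=64))
```

The generated section covers only the cache tensors. The decoder's other inputs (for example the token ids and the encoder hidden states) and its outputs still need their own entries, and every name in the config has to match the ONNX graph exactly for Triton to load the model.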