# quantized-onnx-models

Here is 1 public repository matching this topic...

Ki6an / fastT5

⚡ Boost inference speed of T5 models by 5x and reduce the model size by 3x.

python nlp fast translation deep-learning inference pytorch transformer question-answering quantization onnx t5 onnxruntime fastt5 quantized-onnx-models inference-speed

Updated Apr 24, 2023
Python
