⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
Inference speed / accuracy tradeoff on text classification with transformer models such as BERT, RoBERTa, DeBERTa, SqueezeBERT, MobileBERT, Funnel Transformer, etc.