Releases: ELS-RD/transformer-deploy
Add GPT-2 acceleration support
- add support for decoder-based models (GPT-2) on both ONNX Runtime and TensorRT
- refactor Triton configuration generation (simplification)
- add GPT-2 model documentation (notebook)
- fix the CPU quantization benchmark (it was not using the quantized model)
- fix a sentence-transformers bug
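The benchmark fix above comes down to making sure the quantized model, not the original one, is the model being timed. As a minimal sketch of what CPU dynamic quantization looks like in plain PyTorch (a toy network standing in for a transformer; this is not the library's actual benchmark code):

```python
import torch
import torch.nn as nn

# Tiny stand-in network; the actual benchmark targets transformer models.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)).eval()

# Replace Linear layers with dynamically quantized INT8 versions (CPU-only path).
quant_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The benchmark must run quant_model, not the original model --
# timing `model` here would measure the FP32 network instead.
x = torch.randn(1, 16)
with torch.no_grad():
    out = quant_model(x)
```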
Add CPU support and generic GPU quantization support
What's Changed
- Update requirements_gpu.txt by @sam-writer in #22
- refactoring by @pommedeterresautee in #27
- add CPU inference support by @pommedeterresautee in #28
- Add QAT support to more models by @pommedeterresautee in #29
Full Changelog: v0.2.0...v0.3.0
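The QAT (quantization-aware training) support mentioned in #29 builds on PyTorch's fake-quantization machinery. A minimal, generic PyTorch QAT sketch (the toy model and qconfig are illustrative, not the library's actual implementation):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Toy model with quant/dequant stubs marking the quantized region."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.fc = nn.Linear(8, 2)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyNet().train()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")

# Insert fake-quantization observers that simulate INT8 during training.
qat_model = torch.quantization.prepare_qat(model)

# Stand-in for fine-tuning: run data through so observers record ranges.
qat_model(torch.randn(32, 8))

# Convert fake-quant modules to real INT8 kernels for inference.
int8_model = torch.quantization.convert(qat_model.eval())
```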
Add GPU quantization support
- support INT-8 GPU quantization
- add an end-to-end quantization tutorial
- add the QDQRoberta model
- switch to ONNX opset 13
- refactor TensorRT engine creation
- fix bugs
- add auth token support (for private Hugging Face repositories)
What's Changed
- Update triton by @pommedeterresautee in #11
- fix README.md by @pommedeterresautee in #13
- Fix install errors by @sam-writer in #20
- Add auth token by @sam-writer in #19
- Support GPU INT-8 quantization by @pommedeterresautee in #15
New Contributors
- @sam-writer made their first contribution in #20
Full Changelog: v0.1.1...v0.2.0
Update Triton image to 21.11-py3
- update Docker image
- update documentation
From PoC to library
- switch from a proof of concept to a library
- add support for the TensorRT Python API (for best performance)
- improve documentation (separate the Hugging Face Infinity comparison from the main doc, add benchmarks, etc.)
- fix issues with mixed precision
- add license
- add tests, GitHub Actions, Makefile
- change the way the Docker image is built
First release
All the scripts needed to reproduce https://medium.com/p/e1be0057a51c