v0.0.19: AWS Neuron SDK 2.17.0, training cache system, TGI improved batching

JingyaHuang released this 19 Feb 15:48

What's Changed

Training

Integrate new cache system for training by @michaelbenayoun in #472

TGI

Support higher batch sizes using transformers-neuronx continuous batching by @dacorvo in #488
Lift max-concurrent-request limitation using TGI 1.4.1 by @dacorvo in #488

AMI

Add packer support for building AWS AMI by @shub-kris in #441
[AMI] Updates base ami to new id by @philschmid in #482

Major bugfixes

Fix sdxl inpaint pipeline for diffusers 0.26.* by @JingyaHuang in #458
TGI: update to controller version 1.4.0 & bug fixes by @dacorvo in #470
Fix optimum-cli export for inf1 by @JingyaHuang in #474

Other changes

Add TGI tests and CI workflow by @dacorvo in #355
Bump to optimum 1.17 - Adapt to optimum exporter refactoring by @JingyaHuang in #414
[Training] Support for Transformers 4.37 by @michaelbenayoun in #459
Add contribution guide for Neuron exporter by @JingyaHuang in #461
Fix path, update versions by @shub-kris in #462
Add issue and PR templates & build optimum env cli for Neuron by @JingyaHuang in #463
Fix trigger for actions by @philschmid in #468
TGI: bump rust version by @dacorvo in #477
[documentation] Add Container overview page. by @philschmid in #481
Bump to Neuron sdk 2.17.0 by @JingyaHuang in #487

New Contributors

@shub-kris made their first contribution in #441

Full Changelog: v0.0.18...v0.0.19

