v0.0.19: AWS Neuron SDK 2.17.0, training cache system, TGI improved batching
What's Changed
Training
- Integrate new cache system for training by @michaelbenayoun in #472
TGI
- Support higher batch sizes using transformers-neuronx continuous batching by @dacorvo in #488
- Lift max-concurrent-request limitation using TGI 1.4.1 by @dacorvo in #488
AMI
- Add packer support for building AWS AMI by @shub-kris in #441
- [AMI] Update base AMI to new ID by @philschmid in #482
Major bugfixes
- Fix sdxl inpaint pipeline for diffusers 0.26.* by @JingyaHuang in #458
- TGI: update to controller version 1.4.0 & bug fixes by @dacorvo in #470
- Fix optimum-cli export for inf1 by @JingyaHuang in #474
Other changes
- Add TGI tests and CI workflow by @dacorvo in #355
- Bump to optimum 1.17 - Adapt to optimum exporter refactoring by @JingyaHuang in #414
- [Training] Support for Transformers 4.37 by @michaelbenayoun in #459
- Add contribution guide for Neuron exporter by @JingyaHuang in #461
- Fix path, update versions by @shub-kris in #462
- Add issue and PR templates & build optimum env cli for Neuron by @JingyaHuang in #463
- Fix trigger for actions by @philschmid in #468
- TGI: bump rust version by @dacorvo in #477
- [documentation] Add Container overview page by @philschmid in #481
- Bump to Neuron SDK 2.17.0 by @JingyaHuang in #487
New Contributors
- @shub-kris made their first contribution in #441
Full Changelog: v0.0.18...v0.0.19