v0.0.17: AWS Neuron SDK 2.16, Mistral, sentence transformers, inference cache

@dacorvo released this 19 Jan 07:19

What's Changed

AWS SDK

  • Use AWS Neuron SDK 2.16 (#398)
  • Use the official serialization API for transformers_neuronx models instead of the beta one by @aws-yishanm (#387, #393)

Inference

  • Improve the support of sentence transformers by @JingyaHuang (#408)
  • Add Neuronx compile cache Hub proxy and use it for LLM decoder models by @dacorvo (#410)
  • Add support for Mistral models by @dacorvo (#411)
  • Do not upload Neuron LLM weights when they can be fetched from the Hub by @dacorvo (#413)

Training

Tutorials and doc improvement

Major bugfixes

Other changes

New Contributors

Full Changelog: v0.0.16...v0.0.17