v1.3.0

@ggerganov released this 15 Apr 14:41 · 1451 commits to master since this release · c23588c

Overview

This release should be considered Beta: I haven't done a lot of testing, so I may have broken something.
Overall, though, I believe both the performance and the quality are improved.

  • Added Core ML support (#566)
  • Restored decoding fallbacks with default size of 2 instead of 5 (f19e23f)
  • Pad the audio with zeros instead of the spectrogram (5108b30)
  • Added talk-llama example
  • Added whisper_state which allows parallel transcriptions with a single model in memory (#523)
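The padding change can be illustrated with a minimal sketch. It assumes 16 kHz mono float samples and a 30-second processing window, which are the values Whisper models work with. The helper name `pad_audio_with_zeros` and the constants are hypothetical, and this is not the code from commit 5108b30; it only shows the idea of appending zero samples to the raw audio before any spectrogram is computed, so that the padded region goes through the same log-mel pipeline as real silence would.

```cpp
#include <cstddef>
#include <vector>

// Assumed values: 16 kHz sample rate and a 30-second window.
constexpr std::size_t kSampleRate   = 16000;
constexpr std::size_t kChunkSeconds = 30;
constexpr std::size_t kChunkSamples = kSampleRate * kChunkSeconds;

// Hypothetical helper: returns a copy of `pcm` with zero samples appended
// so that its length is a non-zero multiple of `window`.
std::vector<float> pad_audio_with_zeros(const std::vector<float> & pcm,
                                        std::size_t window = kChunkSamples) {
    std::vector<float> padded = pcm;
    if (window == 0) {
        return padded; // nothing sensible to pad to
    }
    const std::size_t remainder = padded.size() % window;
    if (remainder != 0 || padded.empty()) {
        // Append silence (0.0f) in the sample domain, not in the spectrogram.
        padded.resize(padded.size() + (window - remainder), 0.0f);
    }
    return padded;
}
```

For example, passing one second of audio (16000 samples) returns a buffer of 480000 samples whose tail is all zeros, while a buffer that is already exactly one window long is returned unchanged.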

The C-style API has been extended significantly to support the new whisper_state, but in general it should remain backwards compatible.
The only breaking change is in the callback signatures.
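To make the callback change concrete, here is a hedged sketch of what migrating a new-segment callback might look like. The two typedefs reflect my understanding of the shape before and after this release (an extra `whisper_state *` parameter); please verify them against `whisper.h` before relying on them. `SegmentCounter`, `on_new_segment`, and `simulate_decoding` are hypothetical names used only for illustration, and the opaque struct declarations stand in for including the real header.

```cpp
// Opaque forward declarations so the sketch compiles without whisper.h.
// In a real program, include "whisper.h" instead of declaring these.
struct whisper_context;
struct whisper_state;

// Assumed pre-v1.3.0 shape: the callback receives only the context.
typedef void (*new_segment_callback_old)(whisper_context * ctx,
                                         int n_new, void * user_data);

// Assumed v1.3.0 shape: the state is passed alongside the context.
typedef void (*new_segment_callback_new)(whisper_context * ctx,
                                         whisper_state * state,
                                         int n_new, void * user_data);

// Hypothetical user data: a running count of segments seen so far.
struct SegmentCounter {
    int total = 0;
};

// Callback written against the assumed new signature.
void on_new_segment(whisper_context * /*ctx*/, whisper_state * /*state*/,
                    int n_new, void * user_data) {
    static_cast<SegmentCounter *>(user_data)->total += n_new;
}

// Hypothetical driver standing in for the library invoking the callback
// during decoding: it reports 2 new segments, then 1 more.
void simulate_decoding(new_segment_callback_new cb, void * user_data) {
    cb(nullptr, nullptr, 2, user_data);
    cb(nullptr, nullptr, 1, user_data);
}
```

Under these assumptions, existing callbacks only need the additional parameter added to their signature; code that ignores the state can leave it unused, as `on_new_segment` does here.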

Please provide feedback in the discussion if you observe any issues.

The next release, v1.4.0, will follow relatively soon and will add 4-bit integer quantization support.

Full Changelog: v1.2.1...v1.3.0