Release v1.3.0 · ggerganov/whisper.cpp

Overview

This release should be considered beta: I haven't done a lot of testing, so some regressions are possible.
Overall, though, I believe both the performance and the quality are improved.

Added Core ML support (#566)
Restored decoding fallbacks with default size of 2 instead of 5 (f19e23f)
Pad the audio with zeros instead of the spectrogram (5108b30)
Added talk-llama example
Added whisper_state which allows parallel transcriptions with a single model in memory (#523)

The C-style API has been extended significantly to support the new whisper_state, but in general should be backwards compatible.
The only breaking change is in the callback signatures.
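
For anyone updating existing callbacks, here is a minimal sketch of what the change looks like for the new-segment callback. It assumes the callback now receives a whisper_state pointer between the context and the segment count; the typedef below is reproduced by hand rather than included from whisper.h, so please check the header for the authoritative declaration. The segment_counter struct and the on_new_segment function are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Opaque forward declarations standing in for the real types in whisper.h.
struct whisper_context;
struct whisper_state;

// Assumed shape of the callback in this release: compared to v1.2.x,
// a whisper_state pointer is passed as an extra second argument.
typedef void (*whisper_new_segment_callback)(
    struct whisper_context * ctx,
    struct whisper_state   * state,
    int                      n_new,
    void                   * user_data);

// Hypothetical user data: counts how many segments have been reported.
struct segment_counter {
    int total = 0;
};

// A callback written against the new signature. Code that used the old
// (ctx, n_new, user_data) form needs the extra state parameter, even if
// it does not use it.
static void on_new_segment(whisper_context * /*ctx*/,
                           whisper_state   * /*state*/,
                           int               n_new,
                           void            * user_data) {
    assert(user_data != nullptr);
    auto * counter = static_cast<segment_counter *>(user_data);
    counter->total += n_new;
}
```

In a real program the function would be assigned to the new-segment callback field of the full-transcription parameters, with a pointer to the counter as the user data. Callbacks that ignore the state argument only need the extra parameter added.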

Please provide feedback in the discussion if you observe any issues.

The next release, v1.4.0, will follow relatively soon and will add 4-bit integer quantization support.

What's Changed

update csv output format to match OpenAI's Whisper dataframe output by @hykelvinlee42 in #552
Go binding: NewContext now returns a clean context by @polarmoon in #537
Added whisper state + default state on the whisper_context by @sandrohanea in #523
whisper.android: Enable fp16 intrinsics (FP16_VA) which is supported by ARMv8.2 or later. by @tinoue in #572
Add quality comparison helper by @venkr in #569
whisper.android: Support benchmark for Android example. by @tinoue in #542
Fix MUSL Linux build by @ggerganov in #576
Change default encoding to UTF-8 by @Kamilake in #605
Provide option for creating JSON output by @tuxpoldo in #615
readme : add react-native bindings by @jhen0409 in #619
Fixed language auto-detection for state provided processing. by @sandrohanea in #627
xcodeproj : add -O3 -DNDEBUG in release mode by @jhen0409 in #640
Nodejs Addon blocking main thread. Implemented Napi::AsyncWorker by @LucasZNK in #642
Include link to R wrapper in README by @jwijffels in #626
Add a cmake flag to disable F16C by @a5huynh in #628
Add talk-llama example by @ggerganov in #664
Add Alpaca support to talk-llama example by @ejones in #668
Update README.md by @razodactyl in #682
issue #470 - working 32-bit ARM by @clach04 in #486
whisper : add initial_prompt param by @jhen0409 in #645
fix typo in JSON output by @egorFiNE in #648
Fix shell script ./models/download-ggml-model.sh to handle spaces and special characters in paths by @be-next in #677
Fixed test to new async implementation by @LucasZNK in #686
Minor: fixing usage message for talk-llama by @InconsolableCellist in #687
Small typo by @ZiggerZZ in #688
feat: add progress callback by @pajowu in #600
ggml : fix q4_1 dot product types by @novag in #759
Exposed various parts to the Go Interface by @bmurray in #697
Adds shell command example for --print-colors by @bocytko in #710
Makefile: disable avx in case f16c is not available by @duthils in #706
Making the quick start instructions clearer. by @Onlyartist9 in #716
Add lrc output support by @WhichWho in #718
Corrects default speak.sh path in talk-llama by @mab122 in #720
Add msvc compiler args /utf-8 fix error C3688 by @WhichWho in #721
Changed convert-pt-to-ggml.py to use .tiktoken tokenizer files by @ivan-gorin in #725
talk/talk-llama: add basic example script for eleven-labs tts by @DGdev91 in #728
readme : add Unity3d bindings by @Macoron in #733
Update stream.cpp by @AliAlameh in #501
Fix typos in whisper.h by @GitAritron in #737
Update LICENSE by @masguit42 in #739
fix potential memory leaks by @baderouaich in #740
readme: Add alternate swift bindings by @exPHAT in #755
Fix the bug related to word splitting errors in the "tokenize" function. by @AfryMask in #760
Do not launch threads for log_mel_spectrogram when singlethreaded by @maxilevi in #763
Core ML support by @ggerganov in #566
ggml : fix build on whisper.android (ARM_NEON) by @jhen0409 in #764

New Contributors

@hykelvinlee42 made their first contribution in #552
@tinoue made their first contribution in #572
@venkr made their first contribution in #569
@Kamilake made their first contribution in #605
@tuxpoldo made their first contribution in #615
@jhen0409 made their first contribution in #619
@LucasZNK made their first contribution in #642
@jwijffels made their first contribution in #626
@a5huynh made their first contribution in #628
@ejones made their first contribution in #668
@razodactyl made their first contribution in #682
@clach04 made their first contribution in #486
@egorFiNE made their first contribution in #648
@be-next made their first contribution in #677
@InconsolableCellist made their first contribution in #687
@ZiggerZZ made their first contribution in #688
@pajowu made their first contribution in #600
@novag made their first contribution in #759
@bmurray made their first contribution in #697
@bocytko made their first contribution in #710
@duthils made their first contribution in #706
@Onlyartist9 made their first contribution in #716
@WhichWho made their first contribution in #718
@mab122 made their first contribution in #720
@ivan-gorin made their first contribution in #725
@DGdev91 made their first contribution in #728
@Macoron made their first contribution in #733
@AliAlameh made their first contribution in #501
@GitAritron made their first contribution in #737
@masguit42 made their first contribution in #739
@baderouaich made their first contribution in #740
@exPHAT made their first contribution in #755
@AfryMask made their first contribution in #760
@maxilevi made their first contribution in #763

Full Changelog: v1.2.1...v1.3.0
