v1.1.0
Pre-release
Overview
The major change in this pre-release is the improved decoding implementation in whisper.cpp:
- Support for average logprob and entropy based criteria for fallback
- Support for temperature T > 0
- Improved Greedy decoder via best_of parameter for T > 0
- Add beam search decoding (a.k.a. beam_size)
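To make the fallback criteria above more concrete, here is a minimal, self-contained sketch of how such a check could look. It is not the whisper.cpp implementation: the type `DecodedSequence`, the functions `average_logprob`, `token_entropy` and `needs_fallback`, and the idea of measuring entropy over the histogram of decoded token ids are all assumptions made for illustration. The threshold values are supplied by the caller and are not taken from the library.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <vector>

// Hypothetical container for one decoded candidate (not a whisper.cpp type).
struct DecodedSequence {
    std::vector<int>    tokens;    // sampled token ids
    std::vector<double> logprobs;  // log-probability of each sampled token
};

// Mean log-probability of the sampled tokens.
// A low value means the decoder was not confident in its output.
double average_logprob(const DecodedSequence & seq) {
    assert(!seq.logprobs.empty());
    double sum = 0.0;
    for (double lp : seq.logprobs) {
        sum += lp;
    }
    return sum / static_cast<double>(seq.logprobs.size());
}

// Shannon entropy (in nats) of the token-id histogram.
// A low value means the same few tokens repeat, which is typical of a
// decoder stuck in a loop.
double token_entropy(const DecodedSequence & seq) {
    const double n = static_cast<double>(seq.tokens.size());
    if (n == 0.0) {
        return 0.0;
    }
    std::map<int, int> counts;
    for (int t : seq.tokens) {
        counts[t]++;
    }
    double entropy = 0.0;
    for (const auto & kv : counts) {
        const double p = kv.second / n;
        entropy -= p * std::log(p);
    }
    return entropy;
}

// Returns true when the candidate should be rejected and decoding retried,
// for example at a higher temperature. Both thresholds are caller-chosen.
bool needs_fallback(const DecodedSequence & seq,
                    double logprob_thold,
                    double entropy_thold) {
    return average_logprob(seq) < logprob_thold ||
           token_entropy(seq)   < entropy_thold;
}
```

A caller would typically decode once at T = 0, run a check like `needs_fallback`, and retry at T > 0 only if the check fails, so that well-behaved segments keep deterministic output.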
More information about the decoding changes can be found in #291.
Additionally, there are a few performance improvements for Apple Silicon, WASM and non-F16C platforms.
Support for POWER9 architectures has been added.
The reason that this is a pre-release and not an official release is that the new implementation has not been sufficiently tested yet and the existing bindings for other languages have not been updated to support the API changes. The official release 1.1.x will be created when there is enough feedback about the new decoding implementation and when the bindings have been updated. So make sure to send your feedback in the discussion created for this pre-release. For now, the 1.0.4 release should be considered more stable.
What's Changed
Core ggml / whisper
- ggml: POWER9 support by @fitzsim in #320, #349, #369
- ggml: simplify the SIMD code by @ggerganov in #324
- ggml: add SSE3 and fp16 conversion lookup table by @abitofevrything in #368
- ggml: utilise Accelerate's vDSP for some computations d51fc3e
- ggml: speed-up softmax compute via Accelerate and loop unrolling d61d55c
- ggml: do not start extra threads when using BLAS d347a59
- whisper: do sample_to_timestamp calculation with 64 bit precision to avoid overflow by @boolemancer in #388
- whisper: various code clean-up and improvements by @asmaloney in #317 #318 #319 #322 etc
- whisper: improve decoding by @ggerganov in #291
- whisper: account for speed_up flag for short audio #405
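The sample_to_timestamp item above concerns integer overflow, which the following sketch illustrates. It assumes 16 kHz audio and timestamps counted in 10 ms units. The function names and the constant `kSampleRate` are hypothetical, and the 32-bit variant only emulates wrap-around arithmetic for demonstration, it is not code from the library.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: audio is sampled at 16 kHz.
constexpr int kSampleRate = 16000;

// Converts a sample index to a timestamp in 10 ms units using a 64-bit
// intermediate, so the multiplication by 100 cannot overflow.
int64_t sample_to_timestamp_64(int i_sample) {
    return (100LL * i_sample) / kSampleRate;
}

// Emulates the same computation with a 32-bit intermediate. The product is
// formed in unsigned arithmetic (well-defined wrap-around) and then
// reinterpreted as signed to show the wrong result a 32-bit multiply gives.
int32_t sample_to_timestamp_32(int i_sample) {
    const uint32_t product = 100u * static_cast<uint32_t>(i_sample);
    return static_cast<int32_t>(product) / kSampleRate;
}
```

With these assumptions, 100 * i_sample exceeds the 32-bit signed range once i_sample passes 21,474,836, which is a little over 22 minutes of audio, so longer recordings need the 64-bit form.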
C-style API
- Add loader class to allow loading from buffer and others by @prsyahmi in #353
- Add whisper_token_data::plog
- Add whisper_init_from_file()
- Add whisper_init_from_buffer()
- Change whisper_init()
- Remove whisper_sample_best()
- Remove whisper_sample_timestamp()
- Add whisper_n_audio_ctx()
- Add whisper_get_logits()
- Remove whisper_get_probs()
- Change struct whisper_full_params
Bindings
Examples
- whisper.android: remove android ABI constraint by @Digipom in #301
- whisper.swiftui: SwiftUI example by @Digipom in #308
- main: add -ocsv, aka --output-csv, for writing CSV file containing millisecond timestamps by @NielsMayer in #340
- command: refactor to split command list & general transcription modes by @asmaloney in #331
- command: always-prompt mode by @dnhkng in #383
- stream: fix data race on bool + avoid division-by-zero a466c34
- stream: fix a bug that inserted a lot of empty audio at the start a6dbd91
- bench.wasm: print system info fafd789
New Contributors
- @djthorpe made their first contribution in #287
- @0xmohit made their first contribution in #296
- @asmaloney made their first contribution in #298
- @fitzsim made their first contribution in #320
- @NielsMayer made their first contribution in #340
- @aviks made their first contribution in #345
- @eltociear made their first contribution in #346
- @abitofevrything made their first contribution in #368
- @Mike-Bell made their first contribution in #381
- @dnhkng made their first contribution in #383
- @prsyahmi made their first contribution in #353
- @ianb made their first contribution in #391
Full Changelog: v1.0.4...v1.1.0
Highlights
- Sample SwiftUI application example/whisper.swiftui