v1.7.3

@ggerganov ggerganov released this 18 Dec 16:15
· 12 commits to master since this release
3de9dee

Overview

  • Massive performance improvements for the Metal backend, especially for beams > 1 and for quantized models
  • Reduce hallucinations during silence by @jkarthic in #2629
  • Implement no_speech_thold by @jkarthic in #2625
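
To illustrate what a no-speech threshold typically does, here is a minimal sketch of the skip heuristic used by the reference OpenAI Whisper transcription loop: a segment is dropped as silence when its no-speech probability exceeds the threshold, unless the decoder's average log-probability is still high enough to trust the text. Whether #2625 follows exactly this rule is an assumption; `DecodeResult` and `should_skip_segment` are hypothetical names for illustration and are not part of the whisper.cpp API.

```python
from dataclasses import dataclass


@dataclass
class DecodeResult:
    """Hypothetical container for the two per-segment statistics used below."""
    no_speech_prob: float  # probability assigned to the no-speech token
    avg_logprob: float     # average log-probability of the decoded tokens


def should_skip_segment(result: DecodeResult,
                        no_speech_thold: float = 0.6,
                        logprob_thold: float = -1.0) -> bool:
    """Return True when the segment should be treated as silence.

    Assumption: this mirrors the reference Whisper heuristic; the exact
    condition implemented in #2625 may differ.
    """
    # Candidate for skipping: the model thinks this window is mostly silence.
    skip = result.no_speech_prob > no_speech_thold
    # Keep the segment anyway if the decoded text is confident enough.
    if skip and result.avg_logprob > logprob_thold:
        skip = False
    return skip
```

With these defaults, a window with `no_speech_prob = 0.9` and `avg_logprob = -1.5` is skipped, while the same window with `avg_logprob = -0.2` is kept.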

CPU       Config  Model                Th  FA    Enc.   Dec.  Bch5    PP  Commit
M2 Ultra  Metal   tiny                  1   1    7.90   1.26  0.35  0.01  ed733e8
M2 Ultra  Metal   tiny-q5_0             1   1    8.44   1.23  0.36  0.01  ed733e8
M2 Ultra  Metal   tiny-q5_1             1   1    8.26   1.27  0.37  0.01  ed733e8
M2 Ultra  Metal   tiny-q8_0             1   1    8.03   1.21  0.35  0.01  ed733e8
M2 Ultra  Metal   base                  1   1   13.77   1.80  0.42  0.02  ed733e8
M2 Ultra  Metal   base-q5_0             1   1   15.02   1.72  0.42  0.02  ed733e8
M2 Ultra  Metal   base-q5_1             1   1   14.93   1.74  0.42  0.02  ed733e8
M2 Ultra  Metal   base-q8_0             1   1   14.26   1.68  0.41  0.02  ed733e8
M2 Ultra  Metal   small                 1   1   39.76   3.54  0.85  0.05  ed733e8
M2 Ultra  Metal   small-q5_0            1   1   45.07   3.47  0.87  0.05  ed733e8
M2 Ultra  Metal   small-q5_1            1   1   44.82   3.49  0.87  0.05  ed733e8
M2 Ultra  Metal   small-q8_0            1   1   41.79   3.30  0.84  0.05  ed733e8
M2 Ultra  Metal   medium                1   1  106.73   7.28  1.78  0.11  ed733e8
M2 Ultra  Metal   medium-q5_0           1   1  124.43   6.63  1.83  0.12  ed733e8
M2 Ultra  Metal   medium-q5_1           1   1  124.19   6.70  1.84  0.12  ed733e8
M2 Ultra  Metal   medium-q8_0           1   1  113.88   6.52  1.75  0.11  ed733e8
M2 Ultra  Metal   medium-dis            1   1   94.97   0.97  0.22  0.01  ed733e8
M2 Ultra  Metal   large-v2              1   1  193.33  10.53  2.65  0.20  ed733e8
M2 Ultra  Metal   large-v2-q5_0         1   1  229.22   9.52  2.72  0.23  ed733e8
M2 Ultra  Metal   large-v2-q5_1         1   1  229.40   9.62  2.73  0.23  ed733e8
M2 Ultra  Metal   large-v2-q8_0         1   1  207.30   9.36  2.59  0.21  ed733e8
M2 Ultra  Metal   large-v2-dis          1   1  171.43   1.09  0.25  0.02  ed733e8
M2 Ultra  Metal   large-v3-turbo        1   1  173.45   1.73  0.41  0.03  ed733e8
M2 Ultra  Metal   large-v3-turbo-q5_0   1   1  205.52   1.52  0.42  0.04  ed733e8
M2 Ultra  Metal   large-v3-turbo-q8_0   1   1  185.90   1.48  0.40  0.03  ed733e8

What's Changed

New Contributors

Full Changelog: v1.7.2...v1.7.3