Releases: bnosac/audio.whisper

0.4.1

07 May 12:33

jwijffels

0.4.1 Latest

CHANGES IN audio.whisper VERSION 0.4.1

Added function predict.whisper_transcription, which assigns each transcription segment to the left or right channel based on voice activity detection

0.4

18 Mar 11:53

jwijffels

CHANGES IN audio.whisper VERSION 0.4

Allow passing multiple offsets/durations
Allow passing sections of the audio (e.g. detected with a voice activity detector) so that only these (voiced) parts are transcribed; the amount of time that was cut out is added back to the from/to timestamps, so the resulting timepoints in from/to are aligned to the original audio file
The data element of the predict.whisper output now includes a column called segment_offset indicating the offset of the provided sections or offsets
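
The timestamp realignment described for sections can be illustrated with a small base R sketch. It is not the package's implementation: the helper name `realign` is hypothetical, and the sketch assumes sections are given as a data.frame with `start` and `duration` columns in milliseconds, sorted by `start` and non-overlapping.

```r
# Hypothetical helper, not part of audio.whisper: map a timepoint (in ms)
# measured on the concatenated voiced audio back to the original timeline.
realign <- function(t, sections) {
  # Start of each section on the concatenated (cut) timeline
  cut_start <- cumsum(c(0, head(sections$duration, -1)))
  # Index of the section that contains each timepoint (assumes t >= 0)
  i <- findInterval(t, cut_start)
  # Add the time that was cut out in front of that section
  t + (sections$start[i] - cut_start[i])
}

# Two voiced sections: 1000-3000 ms and 5000-8000 ms of the original audio
sections <- data.frame(start = c(1000, 5000), duration = c(2000, 3000))
aligned <- realign(c(500, 2500), sections)
print(aligned)
```

A timepoint that falls exactly on a section boundary is attributed to the section that starts there, which is one possible convention among several.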

0.3.3

16 Mar 23:17

jwijffels

CHANGES IN audio.whisper VERSION 0.3.3

Fix typos in the documentation of functions
Add stereo.wav file
Allow diarization of audio with 2 channels by comparing the energy of the signal in each channel for each segment
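
The idea of comparing the energy per channel can be sketched in base R as follows. This is only an illustration under assumptions, not the package's code: `assign_channel` is a hypothetical helper, segments are given as sample indices, and energy is taken as the sum of squared samples.

```r
# Hypothetical helper, not part of audio.whisper: label each segment with
# the channel that carries the most energy (sum of squared samples).
assign_channel <- function(left, right, from, to) {
  mapply(function(a, b) {
    idx <- a:b
    energy_left  <- sum(left[idx]^2)
    energy_right <- sum(right[idx]^2)
    if (energy_left >= energy_right) "left" else "right"
  }, from, to)
}

# Synthetic stereo signal: the first second is loud on the left channel,
# the second one is loud on the right channel (16 kHz sampling assumed)
n    <- 16000
tone <- sin(2 * pi * 440 * seq_len(n) / 16000)
left  <- c(tone, 0.01 * tone)
right <- c(0.01 * tone, tone)

speakers <- assign_channel(left, right, from = c(1, n + 1), to = c(n, 2 * n))
print(speakers)
```

With real recordings, cross-talk between channels can make the two energies close, so a margin or a voice activity detector may give more robust labels.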

0.3.2

04 Mar 21:11

jwijffels

CHANGES IN audio.whisper VERSION 0.3.2

Documentation of arguments in predict.whisper
Add option to download quantised models
- tiny-q5_1, tiny.en-q5_1
- base-q5_1, base.en-q5_1
- small-q5_1, small.en-q5_1
- medium-q5_0, medium.en-q5_0
- large-v2-q5_0 and large-v3-q5_0
Allow disabling the printing of the transcription progress during prediction with the trace argument
Enable O3 optimisations by default
Allow speeding up transcriptions by compiling with cuBLAS against CUDA on Linux
- specify Sys.setenv(WHISPER_CUBLAS = "1") before installing the package if you have a GPU with CUDA
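
As a sketch, an installation with cuBLAS enabled could look like the snippet below. It assumes that the remotes package is available, that the package is installed from the bnosac/audio.whisper GitHub repository, and that CUDA is set up on the machine; adapt the install command to your own setup.

```r
# Request a cuBLAS build; the variable must be set before the package is compiled
Sys.setenv(WHISPER_CUBLAS = "1")

# Assumption: installing from GitHub with the remotes package
remotes::install_github("bnosac/audio.whisper")
```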

0.3.1

05 Feb 20:32

jwijffels

CHANGES IN audio.whisper VERSION 0.3.1

Makevars
- Added detection of AVX512F for adding compilation flags to PKG_CFLAGS/PKG_CPPFLAGS
- Enable Metal for speeding up transcriptions on the GPU on Mac
- Enable compiling with OpenBLAS to speed up the transcriptions
Add whisper_languages to get a data.frame of all languages the Whisper model can handle
whisper_download_model
- change the default timeout to 10 minutes if no timeout is set by the user
- rename the output list element 'download_failed' to 'download_success'
- model_dir now defaults to the directory set in the environment variable WHISPER_MODEL_DIR or, if this is not set, the current working directory
whisper
- Add option use_gpu to run the prediction on a GPU (e.g. Metal)
predict.whisper
- Add option to pass on an initial prompt
- Output of predict.whisper adds the audio duration of the wav file in seconds in the params list element
- Gains an extra argument indicating whether to transcribe or translate
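
The model_dir default described above for whisper_download_model amounts to an environment variable lookup with a fallback. A minimal base R sketch, where `default_model_dir` is a hypothetical name and not a function of the package:

```r
# Hypothetical helper, not part of audio.whisper: use WHISPER_MODEL_DIR when
# it is set, otherwise fall back to the current working directory.
default_model_dir <- function() {
  Sys.getenv("WHISPER_MODEL_DIR", unset = getwd())
}

Sys.setenv(WHISPER_MODEL_DIR = file.path(tempdir(), "whisper-models"))
dir_when_set <- default_model_dir()

Sys.unsetenv("WHISPER_MODEL_DIR")
dir_when_unset <- default_model_dir()
```

Setting the variable once, for example in .Renviron, is one way to keep downloaded models in a single place across projects.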

0.3

27 Jan 18:46

jwijffels

CHANGES IN audio.whisper VERSION 0.3

Upgrade to whisper.cpp version v1.5.4
whisper_download_model now supports downloading 'large-v1', 'large-v2' and 'large-v3'; model 'large' should no longer be used

0.2.2

27 Jan 18:07

jwijffels

CHANGES IN audio.whisper VERSION 0.2.2

Add option to pass on float entropy_thold (similar to compression_ratio_threshold), logprob_thold, beam_size, best_of, split_on_word, max_context when doing the prediction
Output of the predict.whisper function now includes an element called timing indicating how long the transcription took
whisper gains 2 arguments, model_dir and overwrite, which are passed directly to whisper_download_model
whisper_download_model
- gains an argument version which defaults to models for whisper.cpp version 1.2.1
- now gets the models from https://huggingface.co/ggerganov/whisper.cpp/resolve/80da2d8bfee42b0e836fc3a9890373e5defc00a6 instead of https://huggingface.co/ggerganov/whisper.cpp/resolve/main

0.2.1-1

22 Jul 21:21

jwijffels

CHANGES IN audio.whisper VERSION 0.2.1-1

whisper_download_model now deprecates downloading from https://ggml.ggerganov.com and uses URLs that download the models from Hugging Face instead (Issue #18)

0.2.1

19 Mar 13:39

jwijffels

CHANGES IN audio.whisper VERSION 0.2.1

Add option to compile with own PKG_CFLAGS by setting environment variable WHISPER_CFLAGS
Add option to compile with extra PKG_CPPFLAGS by setting environment variable WHISPER_CPPFLAGS

0.2.0

19 Mar 13:39

jwijffels

CHANGES IN audio.whisper VERSION 0.2.0

Incorporate whisper.cpp version v1.2.1
