Releases: bnosac/audio.whisper
0.4.1
0.4
CHANGES IN audio.whisper VERSION 0.4
- Allow passing multiple offsets/durations
- Allow specifying sections of the audio (e.g. detected with a voice activity detector) so that only these (voiced) sections are transcribed; the amount of time that was cut out is added back to the from/to timestamps so that they stay aligned with the original audio file
- The data element of the predict.whisper output now includes a column called segment_offset indicating the offset of the provided sections or offsets
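
The realignment of from/to timestamps can be illustrated with a minimal base-R sketch. It assumes that sections are given as a data.frame with start and duration columns in milliseconds and that the voiced sections are concatenated before transcription. The helper realign_timestamps and the column name shift are hypothetical and are not the package's implementation.

```r
# Hypothetical helper, not part of audio.whisper: map timestamps (ms) measured
# in the concatenated voiced audio back to positions in the original recording.
realign_timestamps <- function(t, sections) {
  n <- nrow(sections)
  # Position in the concatenated audio at which each section begins
  cum_before <- c(0, cumsum(sections$duration))[seq_len(n)]
  # Section that each timestamp falls into (clamped to a valid index)
  idx <- findInterval(t, cum_before)
  idx <- pmin(pmax(idx, 1L), n)
  # Amount of audio that was cut out before the section
  shift <- sections$start[idx] - cum_before[idx]
  data.frame(t = t, shift = shift, original = t + shift)
}

# Two voiced sections: 7s-13s and 60s-62s of the original audio
sections <- data.frame(start = c(7000, 60000), duration = c(6000, 2000))
out <- realign_timestamps(c(0, 2500, 6500), sections)
out$original  # 7000 9500 60500
```

A timestamp of 6500 ms in the concatenated audio lies 500 ms into the second section, so it maps to 60500 ms in the original file.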
0.3.3
0.3.2
CHANGES IN audio.whisper VERSION 0.3.2
- Documentation of arguments in predict.whisper
- Add option to download quantised models
  - tiny-q5_1, tiny.en-q5_1
  - base-q5_1, base.en-q5_1
  - small-q5_1, small.en-q5_1
  - medium-q5_0, medium.en-q5_0
  - large-v2-q5_0 and large-v3-q5_0
- Allow disabling the printing of the transcription progress during prediction with the trace argument
- Enable O3 optimisations by default
- Allow speeding up transcriptions by compiling with cuBLAS against CUDA on Linux
  - Specify Sys.setenv(WHISPER_CUBLAS = "1") before installing the package if you have a GPU with CUDA
0.3.1
CHANGES IN audio.whisper VERSION 0.3.1
- Makevars
  - Added detection of AVX512F for adding compilation flags to PKG_CFLAGS/PKG_CPPFLAGS
  - Enable Metal for speeding up transcriptions on the GPU on Mac
  - Enable compiling with OpenBLAS to speed up the transcriptions
- Add whisper_languages to get a data.frame of all languages the Whisper model can handle
- whisper_download_model
  - Change the default timeout to 10 minutes if no timeout is set by the user
  - Rename the output list element from 'download_failed' to 'download_success'
  - model_dir now defaults to the directory set in the environment variable WHISPER_MODEL_DIR and, if this is not set, to the current working directory
- whisper
  - Add option use_gpu to be able to run the prediction on a GPU (e.g. Metal)
- predict.whisper
  - Add option to pass on an initial prompt
  - The params list element of the output now includes the duration of the wav file in seconds
  - Gains an extra argument indicating whether to transcribe or translate
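
The model_dir fallback described for whisper_download_model can be sketched in base R as follows. The helper resolve_model_dir is hypothetical and only mirrors the documented behaviour; it is not the package's own code.

```r
# Hypothetical helper: use WHISPER_MODEL_DIR when it is set,
# otherwise fall back to the current working directory.
resolve_model_dir <- function() {
  Sys.getenv("WHISPER_MODEL_DIR", unset = getwd())
}

# With the variable set, that directory is used
Sys.setenv(WHISPER_MODEL_DIR = "/tmp/whisper-models")
dir_from_env <- resolve_model_dir()

# Without it, the working directory is used
Sys.unsetenv("WHISPER_MODEL_DIR")
dir_fallback <- resolve_model_dir()
```

Setting the variable once (for example in .Renviron) would presumably let all sessions share one model directory instead of downloading models into each project folder.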
0.3
0.2.2
CHANGES IN audio.whisper VERSION 0.2.2
- Add options to pass on entropy_thold (a float, similar to compression_ratio_threshold), logprob_thold, beam_size, best_of, split_on_word and max_context when doing the prediction
- Output of the predict.whisper function now includes an element called timing indicating how long the transcription took
- whisper gains 2 arguments, model_dir and overwrite, which are passed directly to whisper_download_model
- whisper_download_model
  - gains an argument version which defaults to models for whisper.cpp version 1.2.1
  - now gets the models from https://huggingface.co/ggerganov/whisper.cpp/resolve/80da2d8bfee42b0e836fc3a9890373e5defc00a6 instead of https://huggingface.co/ggerganov/whisper.cpp/resolve/main
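
The difference between the pinned commit and the main branch can be shown with a small base-R sketch. The helper model_url is hypothetical, and the file naming scheme ggml-<model>.bin is an assumption about the repository layout rather than something stated in these notes.

```r
# Hypothetical helper: build a download URL for a model at a given git ref.
# Assumption: model files are named "ggml-<model>.bin" in the repository.
model_url <- function(model, ref) {
  sprintf("https://huggingface.co/ggerganov/whisper.cpp/resolve/%s/ggml-%s.bin",
          ref, model)
}

# Pinned to a fixed commit: the file cannot change underneath the package
pinned <- model_url("tiny", "80da2d8bfee42b0e836fc3a9890373e5defc00a6")

# Tracking main: the file may be replaced when the repository is updated
rolling <- model_url("tiny", "main")
```

Pinning to a commit keeps the downloaded model format in step with the bundled whisper.cpp version.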