Audio transcription using mlx whisper and vad silence processing

mbotsu/mlx_speech2text

Abstract

Transcription for Apple Silicon.

The audio is first segmented into small chunks. For each chunk, the silent parts are removed using voice activity detection (VAD), and the remaining audio is transcribed to text with MLX Whisper.
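The chunk-then-trim idea can be sketched in a few lines of standard-library Python. This is only an illustration, not the code in `speech2text.py`: it swaps in a simple RMS energy threshold for the real VAD, leaves out the Whisper transcription step, and the function names, chunk length, frame length, and threshold are all assumptions chosen for the example.

```python
import math

SAMPLE_RATE = 16000  # assumed: matches the 16 kHz WAV used in the Run section


def split_into_chunks(samples, chunk_seconds=30.0, sample_rate=SAMPLE_RATE):
    """Cut a list of float samples into fixed-length chunks (hypothetical helper)."""
    size = int(chunk_seconds * sample_rate)
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def drop_silence(chunk, frame_ms=10, threshold=0.01, sample_rate=SAMPLE_RATE):
    """Keep only frames whose RMS energy reaches the threshold.

    An energy threshold is a stand-in here; a real VAD model is more robust.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    voiced = []
    for start in range(0, len(chunk), frame_len):
        frame = chunk[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            voiced.extend(frame)
    return voiced


# Synthetic input: 1 s of a 440 Hz tone, 1 s of silence, 1 s of tone.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
silence = [0.0] * SAMPLE_RATE
audio = tone + silence + tone

chunks = split_into_chunks(audio, chunk_seconds=1.5)
cleaned = [drop_silence(chunk) for chunk in chunks]

for index, (before, after) in enumerate(zip(chunks, cleaned)):
    print(f"chunk {index}: {len(before)} samples before, {len(after)} after trimming")
```

Each trimmed chunk would then be handed to the transcription model, so less time is spent decoding stretches that contain no speech.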

Install

$ git clone https://github.com/mbotsu/mlx_speech2text.git
$ cd mlx_speech2text
$ pip install -r requirements.txt

Run

# convert the input to a 16 kHz WAV file
$ ffmpeg -i input.mp4 -ar 16000 out.wav

# run the transcription
$ python speech2text.py -i out.wav -o track -v
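
Before running the script, it can help to confirm that the converted file really is sampled at 16 kHz. The following is a minimal sketch using Python's standard `wave` module; `check_wav` is a hypothetical helper and not part of this repository.

```python
import wave


def check_wav(path, expected_rate=16000):
    """Return basic properties of a WAV file and whether its rate matches."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        seconds = wav.getnframes() / rate
    return {
        "rate": rate,
        "channels": channels,
        "seconds": seconds,
        "ok": rate == expected_rate,
    }
```

For example, `check_wav("out.wav")["ok"]` should be true for a file produced by the `ffmpeg` command above.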
