Whisper ASR CLI is a command-line interface for automatic speech recognition (ASR) built on OpenAI's Whisper model. It lets users record audio, save it to a file, and transcribe the recording.
Speech recognition is a fundamental component of many applications, including voice assistants and transcription services. This project aims to provide a simple, user-friendly interface to OpenAI's Whisper model.
- Record audio from the command line.
- Save recorded audio to a specified file.
- Obtain transcriptions using the Whisper ASR model.
- Record Audio:
  - Press 'R' to start recording.
  - Press 'S' to stop recording.
  - Press 'Q' to quit the recording process.
- Save Audio:
  - The recorded audio is saved to a specified file using the soundfile library.
- Transcribe Audio:
  - The recorded audio file is transcribed using the Whisper ASR model.
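The record, save, and transcribe steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in main.py: the names `State`, `next_state`, `save_audio`, and `transcribe`, and the "base" model size, are hypothetical. The audio libraries are imported lazily, so the key handling can be tried without them installed.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    QUIT = "quit"


def next_state(state: State, key: str) -> State:
    """Return the state after a key press; unhandled keys leave it unchanged."""
    key = key.strip().lower()
    if key == "q":
        return State.QUIT
    if key == "r" and state is State.IDLE:
        return State.RECORDING
    if key == "s" and state is State.RECORDING:
        return State.IDLE
    return state


def save_audio(path, samples, sample_rate):
    """Write recorded samples to `path` with the soundfile library."""
    import soundfile as sf  # lazy import: only needed when saving

    sf.write(path, samples, sample_rate)


def transcribe(path, model_name="base"):
    """Transcribe an audio file with Whisper; the model size is an assumption."""
    import whisper  # lazy import: provided by the openai-whisper package

    model = whisper.load_model(model_name)
    return model.transcribe(path)["text"]
```

A session would then call `next_state` on each key press, start or stop capture when the state changes, and pass the saved file to `transcribe` once recording stops.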
- Python 3.6 or higher
- Dependencies: see requirements.txt
- Clone the repository:
    git clone https://github.com/alwaz-shahid/whisper-asr-cli.git
    cd whisper-asr-cli
- Install dependencies and run the CLI:
    pip install -r requirements.txt
    python main.py
- Configuration
  - The sample rate and maximum recording duration can be configured in asr/asr_manager.py.
  - The default output file path and recording duration are set in main.py.
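One way to picture these settings is as a small immutable record. The sketch below is hypothetical: `RecordingConfig`, its field names, and the default values are assumptions for illustration, not the actual contents of asr/asr_manager.py or main.py. The 16000 Hz default is chosen because Whisper models operate on 16 kHz audio.

```python
from typing import NamedTuple


class RecordingConfig(NamedTuple):
    """Hypothetical settings record; all defaults here are assumed values."""

    sample_rate: int = 16000             # samples per second
    max_duration_s: float = 30.0         # longest allowed recording, in seconds
    output_path: str = "recording.wav"   # where the recording is saved

    def max_frames(self) -> int:
        """Number of samples a recording of maximum length would contain."""
        return int(self.sample_rate * self.max_duration_s)


default = RecordingConfig()
short = default._replace(max_duration_s=5.0)  # override one setting

print(default.max_frames())  # 480000
print(short.max_frames())    # 80000
```

Keeping the values in one record makes it easy to see how the sample rate and duration limit together bound the size of the recording buffer.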
- Contributing
  - Contributions are welcome! Feel free to open issues, submit feature requests, or create pull requests.
- License
  - This project is licensed under the MIT License.
- Acknowledgments
  - OpenAI for the Whisper ASR model.
  - Developers of the libraries and tools used in this project.