Automatic Speech Recognition (ASR) / Speech-to-Text (STT) demonstration using the Whisper base model. The Python CLI application transcribes audio to text and works offline.


alwaz-shahid/whisper-asr-cli


Whisper ASR CLI

Whisper ASR CLI is a command-line interface for Automatic Speech Recognition (ASR) built on OpenAI's Whisper model. It lets users record audio, save it to a file, and transcribe the recording to text.

Motivation

Speech recognition is a fundamental component of many applications, including voice assistants, transcription services, and more. The goal of this project is to provide a simple and user-friendly interface for leveraging the power of OpenAI's Whisper ASR model.

Features

  • Record audio from the command line.
  • Save recorded audio to a specified file.
  • Obtain transcriptions using the Whisper ASR model.

How it Works

  1. Record Audio:

    • Press 'R' to start recording.
    • Press 'S' to stop recording.
    • Press 'Q' to quit the recording process.
  2. Save Audio:

    • The recorded audio is saved to a specified file using the soundfile library.
  3. Transcribe Audio:

    • The saved audio file is transcribed to text using the Whisper ASR model.

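The three steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the project's actual code: `RecorderState`, `save_audio`, and `transcribe_file` are hypothetical names, the 16000 Hz sample rate is an assumed default, and the sketch assumes the soundfile and openai-whisper packages are installed. The heavier imports are deferred into the functions so the key-handling logic can be tried on its own.

```python
from dataclasses import dataclass


@dataclass
class RecorderState:
    """Tracks the recording state driven by the R/S/Q keys (hypothetical helper)."""
    recording: bool = False
    finished: bool = False

    def handle_key(self, key: str) -> str:
        """Apply one key press and return a short status message."""
        key = key.strip().upper()
        if key == "R" and not self.recording:
            self.recording = True
            return "recording started"
        if key == "S" and self.recording:
            self.recording = False
            return "recording stopped"
        if key == "Q":
            self.recording = False
            self.finished = True
            return "quit"
        return "ignored"


def save_audio(path, samples, sample_rate=16000):
    """Write recorded samples (e.g. a NumPy array) to an audio file."""
    import soundfile as sf  # imported lazily; assumed to be installed
    sf.write(path, samples, sample_rate)


def transcribe_file(path, model_name="base"):
    """Transcribe an audio file with a Whisper model and return the text."""
    import whisper  # the openai-whisper package; assumed to be installed
    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"].strip()
```

For instance, `RecorderState().handle_key("r")` switches the state to recording, and once a file has been written with `save_audio`, calling `transcribe_file` on that path would return the transcribed text.
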
Prerequisites

  • Python 3.6 or higher
  • Dependencies: See requirements.txt

Getting Started

  1. Clone the repository:

    git clone https://github.com/alwaz-shahid/whisper-asr-cli.git
    cd whisper-asr-cli
    
    
  2. Install dependencies and run the CLI:

    pip install -r requirements.txt
    python main.py

Configuration

  • The sample rate and maximum recording duration can be configured in asr/asr_manager.py.
  • Default output file path and recording duration are set in main.py.
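
As an illustration of how such settings might fit together, the sketch below groups them in a small dataclass. The `RecordingConfig` name, its fields, and all default values are assumptions made for this example; they are not taken from asr/asr_manager.py or main.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordingConfig:
    """Hypothetical container for the recording settings described above."""
    sample_rate: int = 16000          # samples per second (assumed default)
    max_duration_s: float = 30.0      # maximum recording length in seconds (assumed)
    output_path: str = "output.wav"   # where the recording is saved (assumed)

    def max_samples(self) -> int:
        """Number of samples that fit in the maximum recording duration."""
        return int(self.sample_rate * self.max_duration_s)


config = RecordingConfig(max_duration_s=10.0)
print(config.max_samples())  # 16000 * 10.0 = 160000
```

Keeping the values in one immutable object makes it easy to see how a change to the sample rate or duration affects the size of the recording buffer.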

Contributing

  • Contributions are welcome! Feel free to open issues, submit feature requests, or create pull requests.

License

  • This project is licensed under the MIT License.

Acknowledgments

  • OpenAI for the Whisper ASR model.
  • Developers of the libraries and tools used in this project.
