
WhisperBox

A powerful command-line tool for transcribing and analyzing audio recordings with AI assistance. Record meetings, lectures, or any audio directly from your terminal and get instant transcriptions with summaries, sentiment analysis, and topic detection.

Available in two versions:

  • Free: Command-line interface (CLI) version - Open source and MIT licensed
  • Paid: GUI version with native desktop interface - One-time $10 purchase to support development

The GUI version offers the same powerful features in a user-friendly interface, perfect for those who prefer not to use the terminal. Your purchase helps support ongoing development of free and open source AI tools. Purchase WhisperBox GUI version here

Features

  • Live audio recording through terminal
  • Multiple transcription models via Whisper AI
  • AI-powered analysis including:
    • Meeting summaries and action items
    • Keynote presentation generation
    • Speech quality feedback
    • Voice input transformation for use in any application
  • Support for multiple AI providers:
    • Anthropic
    • OpenAI
    • Groq
    • Ollama (local models)
  • Export to Markdown
  • Rich terminal UI with color-coded output
  • Configurable audio settings and output formats

Prerequisites

  • Python 3.10 or higher
  • FFmpeg (required for audio processing)
  • Poetry (for dependency management)

Installation

  1. Install FFmpeg if not already installed:
# On macOS using Homebrew
brew install ffmpeg

# On Ubuntu/Debian
sudo apt-get install ffmpeg
  2. Install BlackHole for system audio capture (macOS only):
brew install blackhole-2ch
  3. Install PortAudio for audio capture:
brew install portaudio
  4. Install WhisperBox from pip:
pip install whisperbox

System Audio Capture Setup

macOS

  1. Install BlackHole: brew install blackhole-2ch
  2. Set up a Multi-Output Device in Audio MIDI Setup:
    • Open Audio MIDI Setup (using Spotlight or from Applications/Utilities)
    • Click the + in the bottom left and select "Create Multi-Output Device"
    • Check both your regular output and BlackHole 2ch
  3. Select the new Multi-Output Device as your system output
  4. In WhisperBox, select BlackHole 2ch as your system audio input

Linux and Windows

System audio capture requires additional setup that varies by distribution/version. When running wb --devices, WhisperBox will look for common system audio capture devices like:

  • Linux: PulseAudio monitors, JACK inputs, or audio loopback devices
  • Windows: Stereo Mix, VB-Cable, or other virtual audio devices

Please check your system's documentation for setting up audio capture devices.

Usage

Setup

The first time you run the app, you will go through the setup wizard.

wb

Next, select the Whisper model you want to use. Smaller models download faster and transcribe more quickly, but larger models are more accurate. Download times will vary depending on your internet speed.

Then select the AI provider you want to use. Ollama runs locally and does not require an API key.

Then select the specific model offered by that provider.

Finally, you will have the option to view the config file location so you can customize additional settings. The same directory also contains the Whisper models you downloaded and the data directory that holds all your recordings and transcriptions.

Basic Transcription

  1. Start recording:
wb
  2. Press Enter to stop recording when finished.

Advanced Options

  • Specify a profile:
wb --profile monologue_to_keynote
  • Configure audio devices:
wb --devices

This will launch an interactive device selection menu where you can configure both your microphone input and system audio capture devices. Useful for setting up or changing your recording sources.

  • Specify a Whisper model:
wb --model large
  • Enable full analysis (summary, sentiment, intent, topics):
wb --analyze
  • Enable verbose output:
wb --verbose

Configuration

The config.yaml file allows you to customize:

  • API settings for AI providers
  • Audio recording parameters
  • Transcription settings
  • Output formats and directories
  • Display preferences
  • AI prompt templates

See the example config.yaml for all available options.
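To give a sense of the shape of the file, a minimal fragment might look like the following. The key names here are illustrative assumptions, not the canonical schema; check the example config.yaml shipped with your installation for the real option names.

```yaml
# Illustrative config.yaml fragment -- key names are assumptions
ai:
  provider: ollama        # anthropic | openai | groq | ollama
  model: llama3           # model offered by the chosen provider
transcription:
  model: base             # Whisper model size (tiny/base/small/medium/large)
output:
  format: markdown
```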

Extending WhisperBox

WhisperBox can be customized to handle your recordings exactly how you want. There are two main ways to extend it:

  1. Profiles: Define what to do with your recordings
  2. Scripts: Create custom actions for your profiles

Creating Custom Profiles

A profile is a simple YAML file that tells WhisperBox:

  • What to do with your recording
  • How to process the transcript
  • Where to send the results

To create a profile:

  1. Create a new .yaml file in the profiles/ folder (e.g., my_profile.yaml)
  2. Add these three main sections:
# The name that appears in WhisperBox
name: my_profile

# Instructions for processing your recording
prompt: >
  Here's where you tell WhisperBox what to do with your recording.
  For example: "Create a summary with these key points..."

  The recording will appear here: {transcript}

# What to do with the results
actions:
  - script: output_to_markdown # Save as a file
  - script: copy_to_clipboard # Copy to clipboard
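The `{transcript}` placeholder in the prompt is a standard format field: before the prompt is sent to the AI provider, the transcript text is substituted into it. Conceptually the substitution is just the following (a sketch of the mechanism, not WhisperBox's actual code):

```python
# The profile's prompt, with a {transcript} placeholder
prompt_template = (
    "Create a summary with these key points...\n\n"
    "The recording will appear here: {transcript}"
)

# Substitute the transcript text into the prompt before sending it to the AI provider
prompt = prompt_template.format(transcript="We agreed to ship on Friday.")
print(prompt)
```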

Built-in Actions

WhisperBox comes with several ready-to-use actions:

  • output_to_markdown: Save as a Markdown file
  • copy_to_clipboard: Copy to your clipboard
  • output_to_terminal: Show in the terminal
  • send_post_request: Send to a webhook URL
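For instance, a profile that forwards results to a webhook might include an action like the one below. The `url` config key and the endpoint are assumptions for illustration; check your installed version for the exact option name.

```yaml
actions:
  - script: send_post_request
    config:
      url: https://example.com/webhook   # hypothetical endpoint
```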

You can use multiple actions in a single profile:

actions:
  # Save the file
  - script: output_to_markdown
    config:
      filename: meeting_notes.md

  # Also copy it to clipboard
  - script: copy_to_clipboard

Creating Custom Scripts

Want to do something more custom? You can create your own action scripts:

  1. Create a new Python file in the scripts/ folder (e.g., my_script.py)
  2. Add a run_action function that handles the text:
def run_action(text, config):
    """
    text: The processed recording
    config: Any settings from your profile
    """
    print("I got the text:", text)
    # Do whatever you want with the text here!
  3. Use it in your profile:
actions:
  - script: my_script
    config:
      any_setting: value
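As a slightly fuller sketch, here is a hypothetical custom script that appends each processed transcript to a dated log file. The `run_action(text, config)` signature follows the example above; the `log_dir` config key is an invented illustration, not a built-in setting.

```python
from datetime import date
from pathlib import Path


def run_action(text, config):
    """Append the processed transcript to a per-day Markdown log file.

    text: the processed recording text
    config: settings from the profile; here an optional "log_dir" key (assumed)
    """
    log_dir = Path(config.get("log_dir", "."))
    log_dir.mkdir(parents=True, exist_ok=True)

    # One file per day, e.g. 2024-05-01.md
    log_path = log_dir / f"{date.today().isoformat()}.md"
    with log_path.open("a", encoding="utf-8") as f:
        f.write(text + "\n")
    return str(log_path)
```

The profile would then reference it as `script: my_script` with `log_dir` under `config`, exactly as in the `any_setting` example above.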

Example: Meeting Summary Profile

Here's a complete example that creates meeting summaries:

name: meeting_summary
prompt: >
  Create a clear summary of this meeting with:
  1. Key topics and decisions
  2. Action items and who's responsible
  3. Important deadlines or dates

  Meeting transcript: {transcript}

actions:
  # Save as a file
  - script: output_to_markdown
    config:
      filename: summary.md

  # Also copy to clipboard
  - script: copy_to_clipboard

Tips

  • Test your profiles with short recordings first
  • Check the profiles/ folder for more examples
  • Use multiple actions to create powerful workflows
  • Keep your prompts clear and specific

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Whisper AI for the transcription models
  • Rich for the terminal UI
  • All the AI providers supported by this tool
