🪄 Dictate Wizard 🪄

Dictate Wizard is an open-source dictation tool. The goal is to make as much typing as possible obsolete and let you speak your emails, instant messages, etc. instead.

It supports local Whisper-based transcription (free, but slower or less accurate than paid API solutions) as well as multiple commercial providers such as OpenAI, Soniox and Conjecture (fast and highly accurate, but you need a paid API key). You can select one or more providers and compare their transcription results and processing times.
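
As a rough illustration of comparing providers side by side, the sketch below sends the same audio to several providers in parallel and records how long each one takes. It is not taken from main.py: the `providers` mapping (display name to a callable that takes audio bytes and returns text) and the function names are hypothetical, chosen only for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(name, transcribe, audio):
    """Run one provider and return (name, text, elapsed_seconds)."""
    start = time.perf_counter()
    text = transcribe(audio)
    return name, text, time.perf_counter() - start


def compare_providers(providers, audio, primary):
    """Send the same audio to every provider in parallel.

    `providers` maps a display name to a callable taking audio bytes and
    returning text. This interface is an assumption made for the example.
    Returns the primary provider's text plus all results for comparison.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        futures = [
            pool.submit(timed_call, name, fn, audio)
            for name, fn in providers.items()
        ]
        results = {}
        for future in futures:
            name, text, elapsed = future.result()
            results[name] = (text, elapsed)
    return results[primary][0], results


if __name__ == "__main__":
    # Stand-in providers; real ones would call Whisper or a remote API.
    fake = {
        "local": lambda audio: "hello world",
        "remote": lambda audio: "Hello, world.",
    }
    output, results = compare_providers(fake, b"\x00\x01", primary="remote")
    for name, (text, elapsed) in results.items():
        print(f"{name}: {text!r} in {elapsed * 1000:.1f} ms")
    print("typed output:", output)
```

Only the primary provider's text is returned for output; the other results are kept purely for comparison.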

It features an interactive GUI with options to update API provider keys, toggle active providers, and designate a primary provider for the transcription (the one used to output the text to the keyboard). Dictate Wizard also lets users customize their hotkey and modifier keys to activate recording.

This project is written in Python and uses Kivy for the GUI. It is intended to be cross-platform (I've tested it on macOS and Windows, but not Linux). It outputs text by copying the transcription to the clipboard and pasting it.
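
The clipboard-and-paste approach can be sketched roughly as follows. This is a minimal, library-agnostic illustration rather than the project's actual code: the clipboard and keystroke functions are passed in as parameters (all hypothetical names), and restoring the previous clipboard contents afterwards is an assumption added here, not something the project is known to do.

```python
import sys
import time


def paste_modifier():
    """Return the modifier used with 'v' to paste: Cmd on macOS, Ctrl elsewhere."""
    return "cmd" if sys.platform == "darwin" else "ctrl"


def paste_via_clipboard(text, get_clipboard, set_clipboard, send_paste, settle=0.05):
    """Put `text` on the clipboard, trigger a paste, then restore the old contents.

    The three callables are injected (hypothetical interface) so that any
    clipboard or keyboard library can be plugged in.
    """
    previous = get_clipboard()
    set_clipboard(text)
    try:
        send_paste()
        # Give the target application a moment to read the clipboard.
        time.sleep(settle)
    finally:
        # Assumption: restoring the user's clipboard is a courtesy added here.
        set_clipboard(previous)


if __name__ == "__main__":
    # In-memory stand-ins for a real clipboard and a real key press.
    clipboard = {"value": "previous contents"}
    paste_via_clipboard(
        "dictated text",
        get_clipboard=lambda: clipboard["value"],
        set_clipboard=lambda value: clipboard.update(value=value),
        send_paste=lambda: print(f"{paste_modifier()}+v pastes:", clipboard["value"]),
    )
```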

Providers

Suggestions for alternative providers to be added are welcome (please open an Issue). Currently it supports:

OpenAI https://platform.openai.com/docs/guides/speech-to-text/quickstart
Conjecture https://platform.conjecture.dev/transcriptions
Soniox https://soniox.com/products/speech-recognition-ai/

Soniox is the only provider supported in 'streaming' mode, i.e. transcription runs concurrently with the audio recording. It is therefore the fastest provider in the list to return output: local Whisper and the other providers only start processing once the recording has finished.
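
To make the latency difference concrete, here is a toy model of the delay between releasing the hotkey and receiving the transcript. It is a back-of-the-envelope sketch with made-up numbers, not a measurement of any provider: `speed_factor` (seconds of processing per second of audio) and `chunk_seconds` are hypothetical parameters, and the streaming case assumes the provider keeps up with the incoming audio.

```python
def delay_after_release(audio_seconds, speed_factor, chunk_seconds=None):
    """Estimate seconds between releasing the hotkey and having the transcript.

    audio_seconds: length of the recording.
    speed_factor:  processing seconds needed per second of audio (assumed).
    chunk_seconds: None for sequential processing, otherwise the streaming
                   chunk size; only the final chunk is still pending on release.
    """
    if chunk_seconds is None:
        # Sequential: the whole recording is processed after release.
        return audio_seconds * speed_factor
    # Streaming: everything but (at most) the last chunk is already done.
    return min(chunk_seconds, audio_seconds) * speed_factor


if __name__ == "__main__":
    # Illustrative numbers only.
    print("sequential:", delay_after_release(10, 0.2))
    print("streaming: ", delay_after_release(10, 0.2, chunk_seconds=0.5))
```

In this model the sequential delay grows with the length of the recording, while the streaming delay depends only on the chunk size.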

The local Whisper functionality is provided by https://github.com/guillaumekln/faster-whisper. It defaults to the base.en model, which is roughly competitive in transcription time with the API providers, but you can select a model of any size from the dropdown. Smaller models transcribe faster; larger models transcribe more accurately. Select an 'en' model if you will be transcribing English.
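
For reference, a minimal sketch of transcribing a file with faster-whisper outside the GUI might look like the following. It assumes the faster-whisper package is installed; the helper names (`model_name`, `join_segments`, `transcribe_file`) and the CPU/int8 settings are choices made for this example and are not necessarily what Dictate Wizard uses.

```python
from types import SimpleNamespace

# Whisper sizes that also ship an English-only '.en' variant.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def model_name(size="base", english_only=True):
    """Return a Whisper model name such as 'base.en' or 'small'."""
    if english_only and size in ENGLISH_ONLY_SIZES:
        return f"{size}.en"
    return size


def join_segments(segments):
    """Concatenate segment texts into a single transcript string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def transcribe_file(path, size="base", english_only=True):
    """Transcribe an audio file locally with faster-whisper."""
    # Imported lazily so the helpers above work without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name(size, english_only), device="cpu", compute_type="int8")
    # transcribe() returns a lazy iterator of segments plus an info object.
    segments, _info = model.transcribe(path)
    return join_segments(segments)


if __name__ == "__main__":
    # Demonstrate the helpers with stand-in segments (no model download needed).
    demo = [SimpleNamespace(text=" Hello"), SimpleNamespace(text=" world.")]
    print(model_name("small"), "->", join_segments(demo))
```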

Usage

Clone this repository:

git clone https://github.com/markgoodhead/dictate-wizard.git

Change to the project directory:

cd dictate-wizard

Install the prerequisites:

pip install -r requirements.txt

Run the app:

python main.py

Use the GUI to configure your API keys, select the providers you wish to use, and designate your hotkey and modifiers.
Activate recording by pressing and holding your selected hotkey combination (defaults to ctrl+alt+x). Speak into your microphone. Release the hotkey and the transcription will be output at your cursor position.
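
The press-and-hold behaviour can be modelled as a small state machine: recording starts once every key in the combination is held and stops as soon as one of them is released. The class below is an illustrative sketch with hypothetical names, independent of any keyboard library; a real listener would call `press` and `release` from its key event callbacks.

```python
class HoldToRecord:
    """Track held keys and fire callbacks when the hotkey combo starts or ends."""

    def __init__(self, combo, on_start, on_stop):
        self.combo = frozenset(combo)
        self.held = set()
        self.recording = False
        self.on_start = on_start
        self.on_stop = on_stop

    def press(self, key):
        self.held.add(key)
        # Start only once, even if the OS sends key-repeat events.
        if not self.recording and self.combo <= self.held:
            self.recording = True
            self.on_start()

    def release(self, key):
        self.held.discard(key)
        # Releasing any key of the combo ends the recording.
        if self.recording and key in self.combo:
            self.recording = False
            self.on_stop()


if __name__ == "__main__":
    hotkey = HoldToRecord(
        {"ctrl", "alt", "x"},
        on_start=lambda: print("recording started"),
        on_stop=lambda: print("recording stopped, transcribing"),
    )
    for key in ("ctrl", "alt", "x"):
        hotkey.press(key)
    hotkey.release("x")
```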

Known Issues

Windows:

Keyboard actions seem to take about 300 ms to process, which adds extra delay to the text output
The Wizard icon doesn't load

How to Contribute

Contributions are welcome! Please feel free to submit a pull request or open an issue. For major changes, please open an issue first to discuss what you would like to change.

Roadmap

Wishlist for improvements:

Support CoreML and GPUs for faster local Whisper inference
Support streaming for local Whisper inference
Test Linux version
Package Dictate Wizard as a distributable application, e.g. with PyInstaller

Feature requests are welcome too; please open an Issue or discuss in the Discord server.

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for more details.

Acknowledgements

We are grateful to all the transcription providers whose services make this project possible.
