openWakeWord is an open-source library for detecting common wake-words like "alexa", "hey mycroft", "hey jarvis", and other models. Rhasspy is an open-source voice assistant.
This project runs openWakeWord as a stand-alone service, receives audio from Rhasspy via UDP, detects when a wake-word is spoken, and notifies Rhasspy using the Hermes MQTT protocol.
I run Rhasspy in Base/Satellite mode. Currently each Satellite captures audio, does the wake-word detection locally, and streams audio to the Base, which does everything else. The Pi4 Satellites run the Rhasspy Docker container, launched with Compose. The Base Rhasspy container runs on a more powerful i7 machine that also runs other home automation software.
Running openWakeWord in Docker eases distribution and setup (Python dependencies) and lets openWakeWord develop at a separate pace from Rhasspy, instead of being bundled and released with it. A single instance of openWakeWord centralises configuration and gives lower-power satellites (e.g. ESP32s) richer wake-word options.
In the future I plan to add a web UI for configuration: which words to detect, thresholds, custom verifier models and maybe speaker identification. It could also include live visualisation for testing and diagnostics.
Using Docker CLI
docker run -d --name openwakeword -p 12202:12202/udp -v /path/to/config/:/config dalehumby/openwakeword-rhasspy
In docker-compose.yml (or a Docker Swarm stack file)
openwakeword:
  image: dalehumby/openwakeword-rhasspy
  restart: always
  ports:
    - "12202:12202/udp"
  volumes:
    - /path/to/config:/config
For testing and experimentation you can run this project locally:
- Clone the repo
git clone git@github.com:dalehumby/openWakeWord-rhasspy.git
- Create a Python virtual environment (optional)
python3 -m venv env
source env/bin/activate
- Install requirements
pip3 install -r requirements.txt
- After you've completed the Configuration below, run
python3 detect.py
- Create a file called config.yaml, for example: nano /path/to/config/config.yaml
- Paste the contents of config.yaml.example into config.yaml to get started
Rhasspy streams audio from its microphone to openWakeWord over the network using UDP. On each Rhasspy device that has a microphone attached (typically a Satellite), go to Rhasspy - Settings - Audio Recording and, in UDP Audio (Output), insert the IP address of the host that's running openWakeWord and choose a port number, usually starting at 12202. If you have multiple Rhasspy devices then each device needs its own port number: 12202, 12203, 12204, etc.
In openWakeWord's config.yaml, udp_ports has key:value pairs. The key is the siteId shown at the top of Rhasspy - Settings. It might be base, satellite, kitchen, bedroom, etc. The value is the port listed under Rhasspy - Settings - Audio Recording.
udp_ports:
  base: 12202
  kitchen: 12203
  bedroom: 12204
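To illustrate how a mapping like this could be used, here is a minimal Python sketch. It is not taken from detect.py: the helper names site_by_port and open_listener are hypothetical, and the dictionary simply mirrors the example above. It inverts the mapping so that audio arriving on a port can be attributed to a siteId, and rejects two sites sharing one port.

```python
import socket

# Mirrors the udp_ports section of config.yaml (siteId -> UDP port).
udp_ports = {"base": 12202, "kitchen": 12203, "bedroom": 12204}


def site_by_port(ports):
    """Invert siteId -> port into port -> siteId, rejecting shared ports."""
    inverted = {}
    for site_id, port in ports.items():
        if port in inverted:
            raise ValueError(
                f"port {port} is used by both {inverted[port]!r} and {site_id!r}"
            )
        inverted[port] = site_id
    return inverted


def open_listener(port, host="0.0.0.0"):
    """Bind a UDP socket on which one Rhasspy device's audio would arrive."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock
```

Calling site_by_port(udp_ports) gives a lookup from port to siteId, and open_listener(port) would be called once per entry.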
If you are using Docker you need to open the ports to allow UDP network traffic into the container.
Using Docker CLI
docker run -d --name openwakeword -p 12202:12202/udp -p 12203:12203/udp -p 12204:12204/udp -v /path/to/config/:/config dalehumby/openwakeword-rhasspy
Or in docker-compose.yml
openwakeword:
  image: dalehumby/openwakeword-rhasspy
  restart: always
  ports:
    - "12202:12202/udp"  # base
    - "12203:12203/udp"  # kitchen
    - "12204:12204/udp"  # bedroom
    # ... etc
  volumes:
    - /path/to/config:/config
openWakeWord notifies Rhasspy that a wake-word has been spoken using the Hermes MQTT protocol. The MQTT broker needs to be accessible by both Rhasspy and openWakeWord. Rhasspy's internal MQTT broker is not reachable from outside of Rhasspy, so you will need to run a shared broker, like Mosquitto.
Once the broker is running, go to Rhasspy - Settings - MQTT. Choose External broker, then set the Host (the IP address of the machine that the broker is running on), the Port number, and the Username/Password if required.
openWakeWord's config.yaml would then have:
mqtt:
  broker: 10.0.0.10
  port: 1883
  username: yourusername  # Delete row if not required
  password: yourpassword  # Delete row if not required
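For orientation, the sketch below builds the kind of message a wake-word service would publish to that broker. It assumes the Hermes hotword topic layout hermes/hotword/<wakewordId>/detected and the payload fields as I understand them from Rhasspy's MQTT reference; the function name hotword_detected_message is hypothetical and is not necessarily how detect.py does it. Publishing with an MQTT client library is deliberately left out.

```python
import json


def hotword_detected_message(wakeword_id, site_id, sensitivity=0.7):
    """Build the (topic, payload) pair for a Hermes hotword detection.

    Field names follow the Hermes hotword 'detected' message as assumed
    above; check Rhasspy's MQTT reference before relying on them.
    """
    topic = f"hermes/hotword/{wakeword_id}/detected"
    payload = json.dumps(
        {
            "modelId": wakeword_id,
            "modelVersion": "",
            "modelType": "universal",
            "currentSensitivity": sensitivity,
            "siteId": site_id,  # tells Rhasspy which device heard the wake-word
        }
    )
    return topic, payload
```

The siteId in the payload is what lets the Base know which Satellite was addressed, which is why it must match the keys used in udp_ports.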
On each Rhasspy, in Rhasspy - Settings - Wake Word, select Hermes MQTT.
openWakeWord listens for wake-words like "alexa", "hey mycroft", "hey jarvis", and others. Use model_names to specify which wake-words to listen for. (See the Pre-Trained Models documentation for which model_names to use.)
Delete any wake-words that you don't want to activate on, or remove the entire model_names section to use all pre-trained models.
oww:
  model_names:  # From https://github.com/dscripka/openWakeWord/blob/main/openwakeword/__init__.py
    - alexa  # Delete to ignore this wake-word
    - hey_mycroft
    - hey_jarvis
    - timer
    - weather
  activation_samples: 3  # Number of samples in moving average
  activation_threshold: 0.7  # Trigger wakeword when average above this threshold
  deactivation_threshold: 0.2  # Do not trigger again until average falls below this threshold
  # OWW config, see https://github.com/dscripka/openWakeWord#recommendations-for-usage
  vad_threshold: 0.5
  enable_speex_noise_suppression: false
The other oww settings ensure Rhasspy is only activated once per wake-word, and help reduce false activations.
In the example above, the latest 3 audio samples received over UDP are averaged together, and if the average confidence that a wake-word has been spoken is above 0.7 (70%), then Rhasspy is notified. Rhasspy will not be notified again until the average confidence drops below 0.2 (20%), i.e. the wake-word has ended.
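The behaviour described above is a moving average with hysteresis. A minimal Python sketch of that idea follows; the class name ActivationFilter is hypothetical, and the choice to start the window filled with zeros is my assumption, not necessarily what detect.py does.

```python
from collections import deque


class ActivationFilter:
    """Moving average with separate activation and deactivation thresholds."""

    def __init__(self, samples=3, activation_threshold=0.7,
                 deactivation_threshold=0.2):
        # Assumption: start with a window of zeros so that a single early
        # high score cannot trigger on its own.
        self.scores = deque([0.0] * samples, maxlen=samples)
        self.activation_threshold = activation_threshold
        self.deactivation_threshold = deactivation_threshold
        self.active = False

    def update(self, score):
        """Add one confidence score; return True only when Rhasspy should be notified."""
        self.scores.append(score)  # the oldest score drops out of the window
        average = sum(self.scores) / len(self.scores)
        if not self.active and average > self.activation_threshold:
            self.active = True
            return True  # rising edge: notify once
        if self.active and average < self.deactivation_threshold:
            self.active = False  # wake-word has ended; re-arm
        return False
```

Each wake-word model would get its own filter. update() returns True once when the average first exceeds the activation threshold, and not again until the average has fallen below the deactivation threshold.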
Settings for voice activity detection (VAD) and noise suppression are also provided. (See openWakeWord's Recommendations for Usage.)
Feel free to open an Issue if you have a problem, need help or have an idea. PRs always welcome.