Speech-to-Text based on silero-vad + whisper.cpp (GGML STT) for ROS 2

grupo-avispa/whisper_ros

 
 

whisper_ros

This repository provides a set of ROS 2 packages that integrate whisper.cpp into ROS 2 using audio_common. It also uses silero-vad for voice activity detection (VAD).

Installation

$ cd ~/ros2_ws/src
$ git clone https://github.com/mgonzs13/audio_common.git
$ git clone --recurse-submodules https://github.com/mgonzs13/whisper_ros.git
$ sudo apt install portaudio19-dev
$ pip3 install -r audio_common/requirements.txt
$ pip3 install -r whisper_ros/requirements.txt
$ cd ~/ros2_ws
$ colcon build

CUDA

To run whisper_ros with CUDA, first install the CUDA Toolkit. Then set the environment variable WHISPER_CUDA to on:

$ export WHISPER_CUDA="on"

Usage

Run Silero for VAD and Whisper for STT:

$ ros2 launch whisper_bringup whisper.launch.py

Send an action goal to start listening:

$ ros2 action send_goal /whisper/listen whisper_msgs/action/STT "{}"
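
The same goal can also be sent from a script. The following is a minimal sketch, not part of the packages, that wraps the command above with Python's standard subprocess module. The function names build_listen_command and send_listen_goal and the 60-second timeout are hypothetical choices made for this illustration. It assumes the ros2 CLI is on the PATH (that is, the workspace has been sourced) and that the launch file from the previous step is running.

```python
import shutil
import subprocess
from typing import List

# Action name and type taken from the command shown above.
ACTION_NAME = "/whisper/listen"
ACTION_TYPE = "whisper_msgs/action/STT"


def build_listen_command(goal_yaml: str = "{}") -> List[str]:
    """Build the argv list for `ros2 action send_goal` (hypothetical helper)."""
    return ["ros2", "action", "send_goal", ACTION_NAME, ACTION_TYPE, goal_yaml]


def send_listen_goal(goal_yaml: str = "{}", timeout_s: float = 60.0) -> str:
    """Run the CLI command and return whatever it prints to stdout.

    Assumption: the 60-second default timeout is arbitrary; adjust as needed.
    """
    if shutil.which("ros2") is None:
        raise RuntimeError("ros2 CLI not found; source your ROS 2 workspace first")
    completed = subprocess.run(
        build_listen_command(goal_yaml),
        capture_output=True,  # collect the CLI output instead of printing it
        text=True,            # decode the output as str
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout_s,    # raise TimeoutExpired if the CLI does not return
    )
    return completed.stdout
```

Calling print(send_listen_goal()) prints the raw output of the ros2 CLI. The sketch does not parse that output, since the fields of the action result are not described here.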

Or run the example whisper client:

$ ros2 run whisper_ros whisper_demo_node
