SoundWave: Real-Time Audio Processing and Speech-to-Text with Kafka and Flink

somanshurath/soundwave

SoundWave [⚒️ Work in Progress]

Real-Time Audio Processing and Speech-to-Text with Kafka and Flink

Overview

SoundWave is a real-time audio processing pipeline that ingests audio streams from multiple sources, performs audio processing tasks (including noise reduction, speech-to-text conversion, and signal mixing), and stores the processed, compressed audio along with a text transcript. Built using Apache Kafka and Apache Flink, SoundWave is optimized for fault tolerance, data redundancy, and efficient storage, using a distributed setup across three laptops.


Table of Contents

  1. Project Features
  2. Kafka Setup
  3. Flink Setup

Project Features

  • Real-Time Audio Ingestion: Captures audio streams from multiple sources.
  • Stream Processing: Leverages Kafka for message brokering and Flink for real-time data transformations.
  • Speech-to-Text: Converts audio streams to text transcripts.
  • Noise Reduction: Enhances audio quality by filtering background noise.
  • Signal Mixing: Merges audio from multiple producers into a unified stream.
  • Data Redundancy: Ensures reliable data storage with backup and fault tolerance.
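As an illustration of the signal mixing feature, here is a minimal sketch of one common way to merge several audio chunks: sum the samples and clip to the valid range. It assumes 16-bit little-endian mono PCM with the same sample rate on every input; the README does not state the actual format or mixing method, and `mix_pcm16` is a hypothetical helper name.

```python
import struct


def mix_pcm16(chunks):
    """Mix 16-bit little-endian PCM chunks by summing samples and clipping.

    Assumes every chunk has the same sample rate and channel layout.
    Chunks of unequal length are truncated to the shortest one.
    """
    if not chunks:
        return b""
    n = min(len(c) for c in chunks) // 2  # samples in the shortest chunk
    tracks = [struct.unpack(f"<{n}h", c[: n * 2]) for c in chunks]
    # Sum sample-by-sample, then clip to the int16 range to avoid wrap-around.
    mixed = [max(-32768, min(32767, sum(s))) for s in zip(*tracks)]
    return struct.pack(f"<{n}h", *mixed)
```

Clipping is the simplest way to handle overflow; scaling each input down before summing is an alternative when loud sources overlap often.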

Kafka Setup

Download a Kafka release from the Apache Kafka downloads page (the commands below assume kafka_2.13-3.8.1.tgz) and extract it:

tar -xzf kafka_2.13-3.8.1.tgz
cd kafka_2.13-3.8.1

Run the following commands in separate terminal sessions, in the order shown (ZooKeeper first, then the Kafka broker):

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

Your Kafka server is now up and running.
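To confirm that both services are accepting connections, a quick check like the following may help. It is a minimal sketch that assumes the ports from the stock config files (2181 for ZooKeeper, 9092 for the Kafka broker) and that both run on localhost; `port_open` is a hypothetical helper, not part of this repository.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed default ports; adjust if the config files were changed.
    for name, port in (("ZooKeeper", 2181), ("Kafka broker", 9092)):
        state = "listening" if port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

An open port only shows that the process is listening; it does not prove the broker is healthy.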

Flink Setup

(tbd)

Additional Documentation

Audio Consumer

This directory contains a Kafka consumer Python script that consumes raw audio data from a Kafka topic and saves it as a WAV file. For more details, refer to the Audio Consumer section.
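The consume-and-save flow described above could look roughly like the sketch below. It is not the repository's script. It assumes the kafka-python client, a hypothetical topic named `raw_audio`, and messages that carry raw 16-bit mono PCM at 44.1 kHz; none of these details are given in this README.

```python
import wave


def write_wav(path, chunks, sample_rate=44100, channels=1, sample_width=2):
    """Write raw PCM chunks to a WAV file and return the number of frames written."""
    frames = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav.setframerate(sample_rate)
        for chunk in chunks:
            wav.writeframes(chunk)
            frames += len(chunk) // (channels * sample_width)
    return frames


def kafka_audio_chunks(topic, bootstrap_servers="localhost:9092", idle_timeout_ms=5000):
    """Yield message payloads from a topic until it stays idle for idle_timeout_ms."""
    # Assumption: kafka-python is the client library. Imported lazily so that
    # write_wav can be used without it installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        auto_offset_reset="earliest",
        consumer_timeout_ms=idle_timeout_ms,  # stop iterating when no messages arrive
    )
    try:
        for record in consumer:
            yield record.value
    finally:
        consumer.close()


if __name__ == "__main__":
    # "raw_audio" and "output.wav" are hypothetical names.
    n = write_wav("output.wav", kafka_audio_chunks("raw_audio"))
    print(f"wrote {n} frames")
```

The WAV header parameters must match what the producer sends, otherwise the file plays back at the wrong speed or as noise.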

Audio Processor

This directory contains various scripts to process audio data using Kafka. One of the scripts, pitch_shift.py, reads raw audio data from a Kafka topic, performs pitch shifting using the librosa library, and sends the processed audio data to another Kafka topic. For more details, refer to the Audio Processor section.
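The read, pitch-shift, and republish loop described above might be sketched as follows. This is not the contents of pitch_shift.py. It assumes the kafka-python client, hypothetical topic names `raw_audio` and `processed_audio`, 16-bit little-endian mono PCM payloads, and a shift of four semitones. The PCM helpers use only the standard library so they can be tried without numpy or librosa installed.

```python
import struct


def pcm16_to_floats(data: bytes) -> list:
    """Decode 16-bit little-endian PCM into floats in [-1.0, 1.0)."""
    n = len(data) // 2
    return [s / 32768.0 for s in struct.unpack(f"<{n}h", data[: n * 2])]


def floats_to_pcm16(samples) -> bytes:
    """Encode floats as 16-bit little-endian PCM, clipping to [-1.0, 1.0]."""
    ints = [int(round(max(-1.0, min(1.0, float(x))) * 32767)) for x in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def pitch_shift_chunk(data: bytes, sample_rate=44100, n_steps=4.0) -> bytes:
    """Pitch-shift one PCM chunk by n_steps semitones using librosa."""
    import numpy as np  # imported lazily; only needed for the shift itself
    import librosa

    y = np.asarray(pcm16_to_floats(data), dtype=np.float32)
    shifted = librosa.effects.pitch_shift(y, sr=sample_rate, n_steps=n_steps)
    return floats_to_pcm16(shifted)


def run(in_topic="raw_audio", out_topic="processed_audio",
        bootstrap_servers="localhost:9092"):
    """Consume raw chunks, shift them, and publish the result (topic names are hypothetical)."""
    from kafka import KafkaConsumer, KafkaProducer  # assumption: kafka-python

    consumer = KafkaConsumer(in_topic, bootstrap_servers=bootstrap_servers)
    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    for record in consumer:
        producer.send(out_topic, value=pitch_shift_chunk(record.value))


if __name__ == "__main__":
    run()
```

Shifting each message independently is the simplest design, but very short chunks can produce audible artifacts at chunk boundaries. Buffering several messages before shifting is one way to reduce that.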

Audio Producer

This directory contains an audio producer Python script that captures real-time audio data and sends it to a Kafka topic. For more details, refer to the Audio Producer section.
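A capture-and-publish loop of the kind described above could be sketched like this. It is not the repository's script. It assumes the sounddevice library for microphone capture, the kafka-python client, a hypothetical topic named `raw_audio`, and 16-bit mono PCM at 44.1 kHz; the README does not say which capture library or format the actual producer uses.

```python
from itertools import islice


def microphone_chunks(sample_rate=44100, channels=1, frames_per_chunk=1024):
    """Yield raw 16-bit PCM chunks from the default input device, indefinitely."""
    import sounddevice as sd  # assumption: sounddevice is the capture library

    with sd.RawInputStream(samplerate=sample_rate, channels=channels,
                           dtype="int16") as stream:
        while True:
            data, _overflowed = stream.read(frames_per_chunk)
            yield bytes(data)


def send_chunks(producer, topic, chunks, max_chunks=None):
    """Send each chunk as one message; return how many were sent.

    `producer` is any object with send(topic, value=...) and flush(),
    for example kafka-python's KafkaProducer.
    """
    sent = 0
    for chunk in islice(chunks, max_chunks):  # max_chunks=None means no limit
        producer.send(topic, value=chunk)
        sent += 1
    producer.flush()  # make sure buffered messages reach the broker
    return sent


if __name__ == "__main__":
    from kafka import KafkaProducer  # assumption: kafka-python

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    # Roughly 10 seconds of audio at 44.1 kHz with 1024-frame chunks.
    send_chunks(producer, "raw_audio", microphone_chunks(), max_chunks=430)
```

Keeping the sender independent of the audio source makes it easy to swap the microphone for a file or a test generator.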
