Description

This is a sample implementation of an ASR WebSocket server in Python that receives audio through LiveKit Egress (WebRTC). It transcribes the microphone audio track published to LiveKit in 30-second chunks, and saves both the original audio file and the resampled audio file that is fed into faster_whisper.
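The 30-second chunking described above can be sketched as a small buffer that collects incoming WebSocket frames and hands back complete chunks. This is a minimal illustration, not the code in asr-server.py: the `ChunkBuffer` class and its constants are hypothetical names, and the byte arithmetic assumes the 48 kHz, 16-bit, 2-channel PCM format mentioned in the debugging notes.

```python
# Assumed input format: 16-bit PCM, 48 kHz, 2 channels (see the debugging notes).
SAMPLE_RATE = 48_000
CHANNELS = 2
BYTES_PER_SAMPLE = 2
CHUNK_SECONDS = 30
CHUNK_BYTES = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE * CHUNK_SECONDS


class ChunkBuffer:
    """Hypothetical helper: accumulates raw PCM bytes and yields fixed-size chunks."""

    def __init__(self, chunk_bytes: int = CHUNK_BYTES) -> None:
        self.chunk_bytes = chunk_bytes
        self._buf = bytearray()

    def feed(self, data: bytes) -> list[bytes]:
        """Append one incoming frame and return every chunk that is now complete."""
        self._buf.extend(data)
        chunks = []
        while len(self._buf) >= self.chunk_bytes:
            chunks.append(bytes(self._buf[: self.chunk_bytes]))
            del self._buf[: self.chunk_bytes]
        return chunks
```

Each chunk returned by `feed` would then be saved, resampled, and passed to the transcription step; leftover bytes stay in the buffer until the next frame arrives.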

Demo Video (Japanese Transcription)

Japanese.whisper.ASR.PoC.mp4

Preparation

livekit-server https://github.com/livekit
egress https://github.com/livekit/egress
faster whisper https://github.com/guillaumekln/faster-whisper
AWS g5.xlarge instance
Vultr VPS for livekit
ChatGPT (GPT-4) for troubleshooting any unknown issues

Setup

Set up config.yaml, then run egress with "docker run --rm -e EGRESS_CONFIG_FILE=/out/config.yaml --net=host -v ~/egress-test:/out livekit/egress".
Publish your audio track, then look up the audio track and room ID in the LiveKit console log.
Update request.json according to those values.
Start "asr-server.py", then run "livekit-cli start-track-egress --api-key your-livekit-api-key --api-secret your-livekit-secret-key --request request.json".
The transcription appears in the terminal. You can inspect the original and resampled audio files to diagnose audio issues.
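As an illustration of the request.json step, the sketch below writes a track egress request from Python. The field names (`room_name`, `track_id`, `websocket_url`) follow LiveKit's track egress request as I understand it and should be checked against the request.json in this repository; the function names, room name, track ID, and WebSocket URL are all placeholders, not values from this project.

```python
import json


def build_track_egress_request(room_name: str, track_id: str, websocket_url: str) -> dict:
    """Return a request body for "livekit-cli start-track-egress" (field names assumed)."""
    return {
        "room_name": room_name,          # room shown in the LiveKit console log
        "track_id": track_id,            # published audio track to export
        "websocket_url": websocket_url,  # where asr-server.py is listening
    }


def write_request(path: str, request: dict) -> None:
    """Write the request as pretty-printed JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(request, f, indent=2)


# Example with placeholder values:
# write_request("request.json",
#               build_track_egress_request("my-room", "TR_example", "ws://localhost:8765"))
```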

Issues I faced while debugging

The beginning of the audio was cut off when receiving the WebSocket audio stream from egress
CUDA error on the g5.xlarge instance (fixed by running "sudo apt install nvidia-driver-470")
Resampling from 16-bit PCM, 48 kHz, 2 channels to 16-bit PCM, 16 kHz, 1 channel
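The resampling step can be sketched with the standard library alone. This is not necessarily how asr-server.py does it: the function name is hypothetical, the input is assumed to be interleaved little-endian 16-bit stereo, and the 3:1 rate reduction uses a crude three-sample average in place of a proper low-pass filter, so a dedicated resampling library is likely to give better quality.

```python
import sys
from array import array


def stereo48k_to_mono16k(pcm: bytes) -> bytes:
    """Convert interleaved s16le stereo at 48 kHz to s16le mono at 16 kHz (rough sketch)."""
    samples = array("h")
    usable = len(pcm) - len(pcm) % 4  # keep whole stereo frames only (4 bytes each)
    samples.frombytes(pcm[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian; array uses native order

    # Downmix: average the left and right sample of each frame.
    mono = [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)]

    # 48 kHz -> 16 kHz: average each group of three mono samples (box filter).
    whole = len(mono) - len(mono) % 3
    out = array("h", (sum(mono[i:i + 3]) // 3 for i in range(0, whole, 3)))
    if sys.byteorder == "big":
        out.byteswap()  # emit little-endian output
    return out.tobytes()
```

One second of input (192,000 bytes) yields 32,000 bytes of output, which is 16,000 mono 16-bit samples.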


atyenoria/livekit-whisper-transcribe
