GitHub - Pantotone/poc-CousinEar

Cousinear

A proof-of-concept Discord bot that connects to a voice channel, separates each user's voice data, and runs it through a speech-to-text converter.
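
As a rough illustration of handling each user's voice data separately, the sketch below buffers audio chunks per Discord user ID so that each user's audio can later be sent to a speech-to-text backend on its own. The class name `SpeakerBuffers` and its methods are hypothetical and are not taken from this repository; the real bot may organize its audio differently.

```typescript
// Hypothetical helper (not from the repository): collects audio chunks
// per Discord user ID so each user can be transcribed separately.
class SpeakerBuffers {
  private readonly chunks = new Map<string, Uint8Array[]>();

  // Append one chunk of audio to the given user's buffer.
  push(userId: string, chunk: Uint8Array): void {
    const list = this.chunks.get(userId) ?? [];
    list.push(chunk);
    this.chunks.set(userId, list);
  }

  // Return the user's audio as one contiguous array and clear their buffer.
  flush(userId: string): Uint8Array {
    const list = this.chunks.get(userId) ?? [];
    this.chunks.delete(userId);
    const total = list.reduce((n, c) => n + c.length, 0);
    const out = new Uint8Array(total);
    let offset = 0;
    for (const c of list) {
      out.set(c, offset);
      offset += c.length;
    }
    return out;
  }

  // IDs of users that currently have buffered audio.
  speakers(): string[] {
    return Array.from(this.chunks.keys());
  }
}
```

In use, each incoming chunk would be passed to `push` with the ID of the user who produced it, and `flush` would be called once that user stops talking.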

Speech-to-text services available:
- (local, free) OpenAI's Whisper model (runs on the CLI)
- (local, free) whisper-ctranslate2 as an interface to Whisper (runs on the CLI; faster output than the original Whisper)
- (remote, paid) OpenAI Whisper API
- (remote, paid, with a free tier) Microsoft Azure STT AI (allows real-time decoding; fastest, but not as accurate)

Other components:
- Discord.js as the Discord interface
- (for local use) prism-media for media conversion (Opus to PCM)
- (for local use) ffmpeg for audio processing (PCM to MP3)
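
For the local path, the PCM-to-MP3 and transcription steps could be driven from the command line roughly as sketched below. This is an assumption-laden sketch, not the repository's code: the function names are hypothetical, the PCM is assumed to be signed 16-bit little-endian at Discord's 48 kHz stereo, and the model name and output directory are placeholders. The flags themselves (`-f`, `-ar`, `-ac`, `-i`, `-y` for ffmpeg; `--model`, `--output_format`, `--output_dir` for the Whisper CLI) are standard options, and whisper-ctranslate2 is assumed here to accept the same ones as the original CLI.

```typescript
// Assumed raw PCM layout: Discord voice is 48 kHz stereo; the s16le sample
// format is an assumption about how the Opus stream was decoded.
interface PcmFormat {
  sampleRate: number;
  channels: number;
}

const DISCORD_PCM: PcmFormat = { sampleRate: 48000, channels: 2 };

// Hypothetical helper: arguments for
//   ffmpeg -y -f s16le -ar 48000 -ac 2 -i in.pcm out.mp3
function ffmpegPcmToMp3Args(
  pcmPath: string,
  mp3Path: string,
  fmt: PcmFormat = DISCORD_PCM
): string[] {
  return [
    "-y",                         // overwrite the output file if it exists
    "-f", "s16le",                // raw input: signed 16-bit little-endian
    "-ar", String(fmt.sampleRate),
    "-ac", String(fmt.channels),
    "-i", pcmPath,
    mp3Path,
  ];
}

// The two local CLIs named in the list above.
type LocalWhisper = "whisper" | "whisper-ctranslate2";

// Hypothetical helper: command and arguments for a local transcription run.
function whisperCommand(
  cli: LocalWhisper,
  audioPath: string,
  model: string = "base",
  outputDir: string = "."
): { command: string; args: string[] } {
  return {
    command: cli,
    args: [audioPath, "--model", model, "--output_format", "txt", "--output_dir", outputDir],
  };
}
```

The returned arrays can be handed to Node's `child_process.spawn`, for example `spawn("ffmpeg", ffmpegPcmToMp3Args("in.pcm", "out.mp3"))`.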
