Chat with AI using whisper, LLMs, and TTS
```bash
git clone --recurse-submodules https://github.com/constellate-ai/voice-chat
```
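If you already cloned without `--recurse-submodules`, you can pull the llama.cpp and whisper.cpp submodules afterwards:

```bash
# Fetch and check out any submodules missed during the initial clone
git submodule update --init --recursive
```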
- This repo uses llama.cpp for LLM inference, which is optimized for Apple Silicon.
- If you have a CUDA device available to you, I recommend using vLLM for inference instead (see the sketch below).
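For example, vLLM can serve the model behind an OpenAI-compatible HTTP API. A minimal sketch, assuming vLLM is installed (`pip install vllm`); the model ID and port here are illustrative, so match them to whatever you actually use:

```bash
# Sketch: serve the full-precision Hermes 2 Pro Llama 3 8B with vLLM on a CUDA box.
# The model ID and port are assumptions -- substitute your own.
python -m vllm.entrypoints.openai.api_server \
    --model NousResearch/Hermes-2-Pro-Llama-3-8B \
    --host 0.0.0.0 \
    --port 9000
```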
Download a GGUF model from Hugging Face. The commands below assume Nous Research's Hermes 2 Pro Llama 3 8B model, but you can use whatever you'd like (for vLLM, you'll use the full-precision weights, or a BNB or AWQ quant, instead of a GGUF).
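For instance, with the `huggingface-cli` tool. A sketch only: the repo ID and quant filename below are assumptions, so check the model page for the exact file you want:

```bash
# Sketch: fetch a Q4_K_M quant of Hermes 2 Pro Llama 3 8B.
# Repo ID and filename are assumptions -- verify them on the Hugging Face model page.
huggingface-cli download \
    NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF \
    Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf \
    --local-dir ~/models
```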
This starts the llama.cpp server on port 9000; use whatever port you'd like:
```bash
cd llama.cpp
make

# -c:   model context length (8192 for Llama 3)
# -ngl: layers to offload to the GPU; -1 offloads everything, so change this
#       depending on your model size & available shared memory
# --chat-template: your model's chat template; see the llama.cpp repo for
#       supported formats
./server \
    --port 9000 \
    --host 0.0.0.0 \
    -m ~/path/to/model/probably-at-least-Q4.gguf \
    -cb \
    -c 8192 \
    -ngl -1 \
    --chat-template chatml
```
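Once it's up, you can sanity-check the server with a quick request. Recent llama.cpp server builds expose an OpenAI-style chat endpoint; if your checkout predates it, use its `/completion` endpoint instead:

```bash
# Smoke test against the llama.cpp server started above
curl http://localhost:9000/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"messages": [{"role": "user", "content": "Say hello in one sentence."}]}'
```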
Configure the whisper.cpp repo/submodule. This assumes an English-only model; choose a different one for your use case if you want.

```bash
cd whisper.cpp
make
```
Download the English model:

```bash
bash ./models/download-ggml-model.sh base.en
```
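To confirm the download worked, you can transcribe one of the bundled samples before starting the server, using the `main` example binary built by `make`:

```bash
# Transcribe a bundled sample WAV with the model we just downloaded
./main -m models/ggml-base.en.bin -f samples/jfk.wav
```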
Start the server to accept WAV files. It will automatically try to use the base.en model we downloaded in the previous step, but you can download and/or use a different one if you so choose.
```bash
cd whisper.cpp
# --prompt conditions the model toward proper English and punctuation
./server \
    --prompt 'User input is presented as text.' \
    --host 0.0.0.0 \
    --port 9001
```
Or, for better quality, use the medium English model:

```bash
make medium.en   # downloads the bigger, higher-quality model
./server \
    --prompt 'User input is presented as text.' \
    --host 0.0.0.0 \
    --port 9001 \
    --model models/ggml-medium.en.bin
```
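Either way, the server listens on port 9001 and accepts WAV uploads. A quick test with one of the bundled samples; the endpoint and form fields follow the whisper.cpp server example, so treat this as a sketch if your build differs:

```bash
# Send a sample WAV to the whisper.cpp server and ask for JSON back
curl http://localhost:9001/inference \
    -F file=@samples/jfk.wav \
    -F response_format=json
```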