
Commit

Merge pull request #285 from bolna-ai/feat/boodhi-integration
add an example to use fully open source stack
marmikcfc authored Jun 24, 2024
2 parents 72b0356 + 3e91c50 commit 594e989
Showing 20 changed files with 524 additions and 0 deletions.
21 changes: 21 additions & 0 deletions examples/whisper-melo-llama3/.env-sample
@@ -0,0 +1,21 @@
TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=
TWILIO_PHONE_NUMBER=

DEEPGRAM_AUTH_TOKEN=
DEEPGRAM_API_KEY=

ELEVENLABS_API_KEY=

OPENAI_API_KEY=
OPENAI_MODEL=gpt-3.5-turbo

ENVIRONMENT=local
WEBSOCKET_URL=
APP_CALLBACK_URL=

REDIS_URL=redis://redis:6379

WHISPER_URL=ws://whisper-app:9000

MELO_TTS=http://melo-app:8000/connection
156 changes: 156 additions & 0 deletions examples/whisper-melo-llama3/Readme.md
@@ -0,0 +1,156 @@
# Bolna With MeloTTS and WhisperASR
Introducing our Dockerized solution! Seamlessly merge [Bolna](https://github.com/bolna-ai/bolna) with [Whisper ASR](https://github.com/bolna-ai/streaming-whisper-server) and [Melo TTS](https://github.com/anshjoseph/MiloTTS-Server). We use Twilio as the telephony provider and ngrok for tunneling. With this Docker Compose setup you can host the Bolna server, Whisper ASR, and Melo TTS together in the cloud: just clone this repo and follow the simple steps below to deploy. Before you start, make sure you have [docker](https://docs.docker.com/engine/install/) and [docker compose](https://docs.docker.com/compose/install/) installed, create a .env file (refer to .env-sample), and put your ngrok auth token in the ngrok-config.yml file.
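
For example, a minimal setup could look like this (a sketch, assuming you have already cloned the bolna repository and are working from the directory added in this commit):

```shell
# move into the example added by this commit
cd examples/whisper-melo-llama3

# create your .env from the provided sample, then fill in the credentials
cp .env-sample .env

# open ngrok-config.yml and replace the <ngrok auth token> placeholder
# with your own ngrok auth token
```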


### Start Services
```shell
docker compose up -d
```
The output should look something like this:
![alt text](./img/docker_up.png "docker compose up -d")

Note: make sure that all of your services are running.
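
One way to confirm this is with the standard Compose status and log commands (service names come from the docker-compose.yml added in this commit):

```shell
# list the services and their current state
docker compose ps

# optionally tail the logs of a single service, e.g. the Bolna server
docker compose logs -f bolna-app
```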

`Let's assume your server IP is 192.168.1.10`

### Creating Agent
To create an agent, execute the command shown below:
```shell
curl --location 'http://192.168.1.10:5001/agent' \
--header 'Content-Type: application/json' \
--data '{
"agent_config": {
"agent_name": "Alfred",
"agent_type": "other",
"tasks": [
{
"task_type": "conversation",
"tools_config": {
"llm_agent": {
"model": "deepinfra/meta-llama/Meta-Llama-3-70B-Instruct",
"max_tokens": 123,
"agent_flow_type": "streaming",
"use_fallback": true,
"family": "llama",
"temperature": 0.1,
"request_json": true,
"provider":"deepinfra"
},
"synthesizer": {
"provider": "melotts",
"provider_config": {
"voice": "Casey",
"sample_rate": 8000,
"sdp_ratio" : 0.2,
"noise_scale" : 0.6,
"noise_scale_w" : 0.8,
"speed" : 1.0
},
"stream": true,
"buffer_size": 123,
"audio_format": "wav"
},
"transcriber": {
"encoding": "linear16",
"language": "en",
"model": "whisper",
"stream": true,
"task": "transcribe"
},
"input": {
"provider": "twilio",
"format": "wav"
},
"output": {
"provider": "twilio",
"format": "wav"
}
},
"toolchain": {
"execution": "parallel",
"pipelines": [
[
"transcriber",
"llm",
"synthesizer"
]
]
}
}
]
},
"agent_prompts": {
"task_1": {
"system_prompt": "What is the Ultimate Question of Life, the Universe, and Everything?"
}
}
}'

```
The response looks like this:
![alt text](./img/agent_res.png "agent response")
Copy the agent_id; we will use it in the next step.
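
If you prefer to capture it from the shell, a sketch like the one below works, assuming the response is JSON with an `agent_id` field (the exact response shape is only shown in the screenshot above) and that the request body from the previous step is saved in a hypothetical `agent_payload.json`:

```shell
# create the agent and keep only the returned agent_id (field name assumed)
AGENT_ID=$(curl --silent --location 'http://192.168.1.10:5001/agent' \
  --header 'Content-Type: application/json' \
  --data @agent_payload.json | jq -r '.agent_id')
echo "$AGENT_ID"
```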

If you want to change the voice, see [Change voice](#change-voice).

### Make call
```shell
curl --location 'http://192.168.1.10:8001/call' \
--header 'Content-Type: application/json' \
--data '{
"agent_id": "bf2a9e9c-6038-4104-85c4-b71a0d1478c9",
"recipient_phone_number": "+91XXXXXXXXXX"
}'
```
It returns `Done` on success.

Note: if you are using a Twilio trial account, use your registered (verified) phone number.

### Stop Services
```shell
docker compose down
```
![alt text](./img/docker_dw.png "docker compose down")


### Changing the MeloTTS voice
<a id="change-voice"></a>
By default we restrict Melo to English, but there are 5 voice options, listed below:
- ['EN-US'](./audio/audio_sample/EN_US.wav)
- ['EN-BR'](./audio/audio_sample/EN-BR.wav)
- ['EN-AU'](./audio/audio_sample/EN-AU.wav)
- ['EN-Default'](./audio/audio_sample/EN-Default.wav)
- ['EN_INDIA'](./audio/audio_sample/EN_INDIA.wav)

You just have to change the synthesizer section shown below:
```JSON
"synthesizer": {
"provider": "melo",
"provider_config": {
"voice": "<put your selected voice here>",
"sample_rate": 8000,
"sdp_ratio" : 0.2,
"noise_scale" : 0.6,
"noise_scale_w" : 0.8,
"speed" : 1.0
},
"stream": true,
"buffer_size": 123,
"audio_format": "pcm"
}
```
The rest of the config stays the same as above.
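
For instance, if you keep the original request body in a hypothetical `agent_payload.json`, you could swap the voice and create a new agent like this (a sketch, reusing the same `/agent` endpoint as above):

```shell
# point the synthesizer at a different voice, e.g. EN-US
jq '.agent_config.tasks[0].tools_config.synthesizer.provider_config.voice = "EN-US"' \
  agent_payload.json > agent_payload_en_us.json

# create the agent with the modified config
curl --location 'http://192.168.1.10:5001/agent' \
  --header 'Content-Type: application/json' \
  --data @agent_payload_en_us.json
```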

### Conversation Demo
This demo uses the prompt below for the LLM:
```json
"task_1": {
"system_prompt": "You are assistant at Dr. Sharma clinic you have to book an appointment"
}
```



[ChatGPT 3.5 turbo 16k demo](./audio/demo_audio.mp3)

You can change the prompt to suit your use case.
6 binary files not shown.
83 changes: 83 additions & 0 deletions examples/whisper-melo-llama3/docker-compose.yml
@@ -0,0 +1,83 @@
services:

# main bolna service
bolna-app:
image: bolna-app:latest
build:
context: .
dockerfile: dockerfiles/bolna_server.Dockerfile
ports:
- "5001:5001"
depends_on:
- redis
env_file:
- .env
volumes:
- ../agent_data:/app/agent_data
- $HOME/.aws/credentials:/root/.aws/credentials:ro
- $HOME/.aws/config:/root/.aws/config:ro

# redis service used as a persistent storage
redis:
image: redis:latest
ports:
- "6379:6379"

# ngrok for local tunneling
ngrok:
image: ngrok/ngrok:latest
restart: unless-stopped
command:
- "start"
- "--all"
- "--config"
- "/etc/ngrok.yml"
volumes:
- ./ngrok-config.yml:/etc/ngrok.yml
ports:
- 4040:4040

### Telephony servers ###
twilio-app:
image: twilio-app:latest
build:
context: .
dockerfile: dockerfiles/twilio_server.Dockerfile
ports:
- "8001:8001"
depends_on:
- redis
env_file:
- .env

### whisper servers ###
whisper-app:
image: whisper-app:latest
build:
context: .
dockerfile: dockerfiles/whisper_server.Dockerfile
ports:
- "9002:9000"
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
### Melo TTS ###
melo-app:
image: melo-app:latest
build:
context: .
dockerfile: dockerfiles/melo_server.Dockerfile
ports:
- "8002:8000"
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
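
Note that the `deploy.resources.reservations.devices` blocks for `whisper-app` and `melo-app` assume a host with an NVIDIA GPU and the NVIDIA Container Toolkit installed. A quick sanity check before running `docker compose up` might look like this (the CUDA image tag is only an example):

```shell
# confirm the driver can see the GPU on the host
nvidia-smi

# confirm Docker can pass the GPU through to a container
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```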

19 changes: 19 additions & 0 deletions examples/whisper-melo-llama3/dockerfiles/bolna_server.Dockerfile
@@ -0,0 +1,19 @@
FROM python:3.10.13-slim

WORKDIR /app
COPY ./requirements.txt /app
COPY ./quickstart_server.py /app

RUN apt-get update && apt-get install libgomp1 git -y
RUN apt-get -y update && apt-get -y upgrade && apt-get install -y --no-install-recommends ffmpeg
RUN pip install -r requirements.txt
RUN pip install --force-reinstall git+https://github.com/bolna-ai/bolna@MeloTTS
RUN pip install scipy==1.11.0
RUN pip install torch==2.0.1
RUN pip install torchaudio==2.0.1
RUN pip install pydub==0.25.1
RUN pip install ffprobe
RUN pip install aiofiles

EXPOSE 5001
CMD ["uvicorn", "quickstart_server:app", "--host", "0.0.0.0", "--port", "5001"]
13 changes: 13 additions & 0 deletions examples/whisper-melo-llama3/dockerfiles/melo_server.Dockerfile
@@ -0,0 +1,13 @@
FROM python:3.10.13-slim
WORKDIR /app

RUN apt-get update && apt-get install libgomp1 git -y
RUN apt-get -y update && apt-get -y upgrade && apt-get install -y --no-install-recommends ffmpeg
RUN git clone https://github.com/bolna-ai/MeloTTS
RUN pip install fastapi uvicorn torchaudio
RUN cp -a MeloTTS/. .
RUN python -m pip cache purge
RUN pip install --no-cache-dir txtsplit torch torchaudio cached_path transformers==4.27.4 mecab-python3==1.0.5 num2words==0.5.12 unidic_lite unidic mecab-python3==1.0.5 pykakasi==2.2.1 fugashi==1.3.0 g2p_en==2.1.0 anyascii==0.3.2 jamo==0.4.1 gruut[de,es,fr]==2.2.3 g2pkk>=0.1.1 librosa==0.9.1 pydub==0.25.1 eng_to_ipa==0.0.2 inflect==7.0.0 unidecode==1.3.7 pypinyin==0.50.0 cn2an==0.5.22 jieba==0.42.1 langid==1.1.6 tqdm tensorboard==2.16.2 loguru==0.7.2
RUN python -m unidic download
EXPOSE 8000
CMD ["python3", "Server.py"]
11 changes: 11 additions & 0 deletions examples/whisper-melo-llama3/dockerfiles/twilio_server.Dockerfile
@@ -0,0 +1,11 @@
FROM python:3.10.13-slim

WORKDIR /app
COPY ./requirements.txt /app
COPY ./telephony_server/twilio_api_server.py /app

RUN pip install --no-cache-dir -r requirements.txt

EXPOSE 8001

CMD ["uvicorn", "twilio_api_server:app", "--host", "0.0.0.0", "--port", "8001"]
16 changes: 16 additions & 0 deletions examples/whisper-melo-llama3/dockerfiles/whisper_server.Dockerfile
@@ -0,0 +1,16 @@
FROM python:3.10.13-slim

RUN apt-get update && apt-get install libgomp1 git -y
RUN apt-get -y update && apt-get -y upgrade && apt-get install -y --no-install-recommends ffmpeg
RUN apt-get -y install build-essential
RUN apt-get -y install portaudio19-dev
RUN git clone https://github.com/bolna-ai/streaming-whisper-server.git
WORKDIR streaming-whisper-server
RUN pip install -e .
RUN pip install git+https://github.com/SYSTRAN/faster-whisper.git
RUN pip install transformers

RUN ct2-transformers-converter --model openai/whisper-small --copy_files preprocessor_config.json --output_dir ./Server/ASR/whisper_small --quantization float16
WORKDIR Server
EXPOSE 9000
CMD ["python3", "Server.py", "-p", "9000"]
Binary file added examples/whisper-melo-llama3/img/agent_res.png
Binary file added examples/whisper-melo-llama3/img/docker_dw.png
Binary file added examples/whisper-melo-llama3/img/docker_up.png
10 changes: 10 additions & 0 deletions examples/whisper-melo-llama3/ngrok-config.yml
@@ -0,0 +1,10 @@
region: us
version: '2'
authtoken: <ngrok auth token>
tunnels:
twilio-app:
addr: twilio-app:8001
proto: http
bolna-app:
addr: bolna-app:5001
proto: http
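
Once the stack is up, the ngrok container exposes its local agent API on port 4040 (mapped in docker-compose.yml above), so you can look up the public tunnel URLs, for example to fill in `WEBSOCKET_URL` and `APP_CALLBACK_URL` in `.env` (which tunnel maps to which variable is an assumption here, not spelled out in this commit):

```shell
# list the active tunnels and their public URLs
curl --silent http://192.168.1.10:4040/api/tunnels | jq -r '.tunnels[] | "\(.name): \(.public_url)"'
```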