Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #285 from bolna-ai/feat/boodhi-integration
add an example to use fully open source stack
Showing 20 changed files with 524 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
TWILIO_ACCOUNT_SID= | ||
TWILIO_AUTH_TOKEN= | ||
TWILIO_PHONE_NUMBER= | ||
|
||
DEEPGRAM_AUTH_TOKEN= | ||
DEEPGRAM_API_KEY= | ||
|
||
ELEVENLABS_API_KEY= | ||
|
||
OPENAI_API_KEY= | ||
OPENAI_MODEL=gpt-3.5-turbo | ||
|
||
ENVIRONMENT=local | ||
WEBSOCKET_URL= | ||
APP_CALLBACK_URL= | ||
|
||
REDIS_URL=redis://redis:6379 | ||
|
||
WHISPER_URL=ws://whisper-app:9000 | ||
|
||
MELO_TTS=http://melo-app:8000/connection |
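Several keys in the sample above are left blank and have to be filled in before the stack can start. A minimal sketch of a pre-flight check follows, assuming the file uses plain `KEY=VALUE` lines; the helper names `parse_env` and `empty_keys` are hypothetical and not part of Bolna.

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def empty_keys(values):
    """Return the keys whose value is still empty, in file order."""
    return [key for key, value in values.items() if value == ""]
```

For example, `empty_keys(parse_env(open(".env").read()))` lists the variables that still need a value.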
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,156 @@ | ||
# Bolna With MeloTTS and WhisperASR | ||
This Dockerized example combines [Bolna](https://github.com/bolna-ai/bolna) with [Whisper ASR](https://github.com/bolna-ai/streaming-whisper-server) and [Melo TTS](https://github.com/anshjoseph/MiloTTS-Server). Twilio is the telephony provider and ngrok provides the tunneling. The Docker Compose file lets you host the Bolna server, Whisper ASR, and Melo TTS together in the cloud: clone this repo and follow the steps below. Before you start, make sure you have [Docker](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/) installed, create a `.env` file based on `.env-sample`, and put your ngrok auth token in the `ngrok-config.yml` file. | ||
|
||
|
||
### Start Services | ||
```shell | ||
docker compose up -d | ||
``` | ||
The output should look something like this: | ||
![alt text](./img/docker_up.png "docker compose up -d") | ||
|
||
Note: make sure that all of your services are running. | ||
|
||
The examples below assume that your server IP is `192.168.1.10`. | ||
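One way to confirm that the services are reachable is to probe the host ports published in the compose file (5001, 8001, 9002, 8002). The sketch below only tests that a TCP connection can be opened; it does not verify that a service is healthy. The helper names `port_open` and `check_services` are hypothetical.

```python
import socket

# Host ports as published in docker-compose.yml for this example.
SERVICES = {
    "bolna-app": 5001,
    "twilio-app": 8001,
    "whisper-app": 9002,
    "melo-app": 8002,
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host, services=SERVICES):
    """Map each service name to whether its published port accepts connections."""
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling `check_services("192.168.1.10")` returns a dict such as `{"bolna-app": True, ...}`; any `False` entry points at a container that is not up or a port that is blocked.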
|
||
### Creating Agent | ||
To create an agent, run the following command: | ||
```shell | ||
curl --location 'http://192.168.1.10:5001/agent' \ | ||
--header 'Content-Type: application/json' \ | ||
--data '{ | ||
"agent_config": { | ||
"agent_name": "Alfred", | ||
"agent_type": "other", | ||
"tasks": [ | ||
{ | ||
"task_type": "conversation", | ||
"tools_config": { | ||
"llm_agent": { | ||
"model": "deepinfra/meta-llama/Meta-Llama-3-70B-Instruct", | ||
"max_tokens": 123, | ||
"agent_flow_type": "streaming", | ||
"use_fallback": true, | ||
"family": "llama", | ||
"temperature": 0.1, | ||
"request_json": true, | ||
"provider":"deepinfra" | ||
}, | ||
"synthesizer": { | ||
"provider": "melotts", | ||
"provider_config": { | ||
"voice": "Casey", | ||
"sample_rate": 8000, | ||
"sdp_ratio" : 0.2, | ||
"noise_scale" : 0.6, | ||
"noise_scale_w" : 0.8, | ||
"speed" : 1.0 | ||
}, | ||
"stream": true, | ||
"buffer_size": 123, | ||
"audio_format": "wav" | ||
}, | ||
"transcriber": { | ||
"encoding": "linear16", | ||
"language": "en", | ||
"model": "whisper", | ||
"stream": true, | ||
"task": "transcribe" | ||
}, | ||
"input": { | ||
"provider": "twilio", | ||
"format": "wav" | ||
}, | ||
"output": { | ||
"provider": "twilio", | ||
"format": "wav" | ||
} | ||
}, | ||
"toolchain": { | ||
"execution": "parallel", | ||
"pipelines": [ | ||
[ | ||
"transcriber", | ||
"llm", | ||
"synthesizer" | ||
] | ||
] | ||
} | ||
} | ||
] | ||
}, | ||
"agent_prompts": { | ||
"task_1": { | ||
"system_prompt": "What is the Ultimate Question of Life, the Universe, and Everything?" | ||
} | ||
} | ||
}' | ||
|
||
``` | ||
The response is shown below: | ||
![alt text](./img/agent_res.png "agent response") | ||
Copy the `agent_id`; you will need it in the next step. | ||
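The same request can be scripted so that the `agent_id` is captured automatically. This is a minimal sketch using only the Python standard library. It assumes the response body is a JSON object with an `agent_id` field, which is inferred from the step above rather than from a documented schema. The function names are hypothetical.

```python
import json
import urllib.request


def build_agent_request(base_url, payload):
    """Build the POST request that mirrors the curl call to /agent."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/agent",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_agent_id(response_body):
    """Assumption: the response is a JSON object containing "agent_id"."""
    return json.loads(response_body)["agent_id"]


def create_agent(base_url, payload, timeout=30):
    """Send the request and return the agent_id from the response."""
    request = build_agent_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_agent_id(response.read().decode("utf-8"))
```

Here `payload` is the same JSON document passed to `--data` in the curl example, loaded as a Python dict, and `base_url` would be `http://192.168.1.10:5001`.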
|
||
To use a different voice, see [Changing the voice](#change-voice). | ||
|
||
### Make call | ||
```shell | ||
curl --location 'http://192.168.1.10:8001/call' \ | ||
--header 'Content-Type: application/json' \ | ||
--data '{ | ||
"agent_id": "bf2a9e9c-6038-4104-85c4-b71a0d1478c9", | ||
"recipient_phone_number": "+91XXXXXXXXXX" | ||
}' | ||
``` | ||
On success, the command outputs `Done`. | ||
|
||
Note: if you are using a trial account, use your registered phone number. | ||
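A malformed phone number is an easy mistake to make here, so it can help to validate the payload before sending it. The sketch below builds the same JSON body as the curl call. The E.164 check (a leading `+` followed by up to 15 digits) is an assumption about what the telephony provider expects, not something this example states. `build_call_payload` is a hypothetical helper.

```python
import json
import re

# Assumption: recipient numbers are expected in E.164 form, e.g. +14155550100.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def build_call_payload(agent_id, recipient_phone_number):
    """Return the JSON body for the /call endpoint, rejecting malformed numbers."""
    if not E164_PATTERN.match(recipient_phone_number):
        raise ValueError(f"not an E.164 number: {recipient_phone_number!r}")
    return json.dumps(
        {
            "agent_id": agent_id,
            "recipient_phone_number": recipient_phone_number,
        }
    )
```

The returned string can be passed to `--data` in the curl command or sent with any HTTP client.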
|
||
### Stop Services | ||
```shell | ||
docker compose down | ||
``` | ||
![alt text](./img/docker_dw.png "docker compose down") | ||
|
||
|
||
### Changing the MeloTTS voice | ||
<a id="change-voice"></a> | ||
By default, MeloTTS is restricted to English (EN), which offers the five voice options listed below: | ||
- ['EN-US'](./audio/audio_sample/EN_US.wav) | ||
- ['EN-BR'](./audio/audio_sample/EN-BR.wav) | ||
- ['EN-AU'](./audio/audio_sample/EN-AU.wav) | ||
- ['EN-Default'](./audio/audio_sample/EN-Default.wav) | ||
- ['EN_INDIA'](./audio/audio_sample/EN_INDIA.wav) | ||
|
||
To change the voice, update only the following section of the agent config: | ||
```JSON | ||
"synthesizer": { | ||
"provider": "melo", | ||
"provider_config": { | ||
"voice": "<put your selected voice here>", | ||
"sample_rate": 8000, | ||
"sdp_ratio" : 0.2, | ||
"noise_scale" : 0.6, | ||
"noise_scale_w" : 0.8, | ||
"speed" : 1.0 | ||
}, | ||
"stream": true, | ||
"buffer_size": 123, | ||
"audio_format": "pcm" | ||
} | ||
``` | ||
The rest of the config stays the same as shown above. | ||
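If the agent payload is kept as a Python dict, the voice can be swapped programmatically instead of editing the JSON by hand. The sketch below assumes the accepted voice strings are exactly the five labels listed above, which this example does not confirm. `with_voice` is a hypothetical helper.

```python
import copy

# Assumption: the voice identifiers match the labels in the list above.
VOICES = ("EN-US", "EN-BR", "EN-AU", "EN-Default", "EN_INDIA")


def with_voice(agent_payload, voice):
    """Return a copy of the agent payload with every synthesizer voice replaced."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {VOICES}")
    updated = copy.deepcopy(agent_payload)
    for task in updated["agent_config"]["tasks"]:
        synthesizer = task.get("tools_config", {}).get("synthesizer")
        if synthesizer is not None:
            synthesizer.setdefault("provider_config", {})["voice"] = voice
    return updated
```

The original dict is left untouched, so the same base payload can be reused to create agents with different voices.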
|
||
### Conversation Demo | ||
This demo uses the following prompt for the LLM: | ||
```json | ||
"task_1": { | ||
"system_prompt": "You are assistant at Dr. Sharma clinic you have to book an appointment" | ||
} | ||
``` | ||
|
||
|
||
|
||
[ChatGPT 3.5 Turbo 16k demo](./audio/demo_audio.mp3) | ||
|
||
You can adapt the prompt to your own use case. |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
services: | ||
|
||
# main bolna service | ||
bolna-app: | ||
image: bolna-app:latest | ||
build: | ||
context: . | ||
dockerfile: dockerfiles/bolna_server.Dockerfile | ||
ports: | ||
- "5001:5001" | ||
depends_on: | ||
- redis | ||
env_file: | ||
- .env | ||
volumes: | ||
- ../agent_data:/app/agent_data | ||
- $HOME/.aws/credentials:/root/.aws/credentials:ro | ||
- $HOME/.aws/config:/root/.aws/config:ro | ||
|
||
# redis service used as a persistent storage | ||
redis: | ||
image: redis:latest | ||
ports: | ||
- "6379:6379" | ||
|
||
# ngrok for local tunneling | ||
ngrok: | ||
image: ngrok/ngrok:latest | ||
restart: unless-stopped | ||
command: | ||
- "start" | ||
- "--all" | ||
- "--config" | ||
- "/etc/ngrok.yml" | ||
volumes: | ||
- ./ngrok-config.yml:/etc/ngrok.yml | ||
ports: | ||
- 4040:4040 | ||
|
||
### Telephony servers ### | ||
twilio-app: | ||
image: twilio-app:latest | ||
build: | ||
context: . | ||
dockerfile: dockerfiles/twilio_server.Dockerfile | ||
ports: | ||
- "8001:8001" | ||
depends_on: | ||
- redis | ||
env_file: | ||
- .env | ||
|
||
### whisper servers ### | ||
whisper-app: | ||
image: whisper-app:latest | ||
build: | ||
context: . | ||
dockerfile: dockerfiles/whisper_server.Dockerfile | ||
ports: | ||
- "9002:9000" | ||
deploy: | ||
resources: | ||
reservations: | ||
devices: | ||
- driver: nvidia | ||
count: 1 | ||
capabilities: [gpu] | ||
### Melo TTS ### | ||
melo-app: | ||
image: melo-app:latest | ||
build: | ||
context: . | ||
dockerfile: dockerfiles/melo_server.Dockerfile | ||
ports: | ||
- "8002:8000" | ||
deploy: | ||
resources: | ||
reservations: | ||
devices: | ||
- driver: nvidia | ||
count: 1 | ||
capabilities: [gpu] | ||
|
19 changes: 19 additions & 0 deletions
examples/whisper-melo-llama3/dockerfiles/bolna_server.Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
FROM python:3.10.13-slim | ||
|
||
WORKDIR /app | ||
COPY ./requirements.txt /app | ||
COPY ./quickstart_server.py /app | ||
|
||
RUN apt-get update && apt-get install libgomp1 git -y | ||
RUN apt-get -y update && apt-get -y upgrade && apt-get install -y --no-install-recommends ffmpeg | ||
RUN pip install -r requirements.txt | ||
RUN pip install --force-reinstall git+https://github.com/bolna-ai/bolna@MeloTTS | ||
RUN pip install scipy==1.11.0 | ||
RUN pip install torch==2.0.1 | ||
RUN pip install torchaudio==2.0.1 | ||
RUN pip install pydub==0.25.1 | ||
RUN pip install ffprobe | ||
RUN pip install aiofiles | ||
|
||
EXPOSE 5001 | ||
CMD ["uvicorn", "quickstart_server:app", "--host", "0.0.0.0", "--port", "5001"] |
13 changes: 13 additions & 0 deletions
examples/whisper-melo-llama3/dockerfiles/melo_server.Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
FROM python:3.10.13-slim | ||
WORKDIR /app | ||
|
||
RUN apt-get update && apt-get install libgomp1 git -y | ||
RUN apt-get -y update && apt-get -y upgrade && apt-get install -y --no-install-recommends ffmpeg | ||
RUN git clone https://github.com/bolna-ai/MeloTTS | ||
RUN pip install fastapi uvicorn torchaudio | ||
RUN cp -a MeloTTS/. . | ||
RUN python -m pip cache purge | ||
RUN pip install --no-cache-dir txtsplit torch torchaudio cached_path transformers==4.27.4 mecab-python3==1.0.5 num2words==0.5.12 unidic_lite unidic pykakasi==2.2.1 fugashi==1.3.0 g2p_en==2.1.0 anyascii==0.3.2 jamo==0.4.1 "gruut[de,es,fr]==2.2.3" "g2pkk>=0.1.1" librosa==0.9.1 pydub==0.25.1 eng_to_ipa==0.0.2 inflect==7.0.0 unidecode==1.3.7 pypinyin==0.50.0 cn2an==0.5.22 jieba==0.42.1 langid==1.1.6 tqdm tensorboard==2.16.2 loguru==0.7.2 | ||
RUN python -m unidic download | ||
EXPOSE 8000 | ||
CMD ["python3", "Server.py"] |
11 changes: 11 additions & 0 deletions
examples/whisper-melo-llama3/dockerfiles/twilio_server.Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
FROM python:3.10.13-slim | ||
|
||
WORKDIR /app | ||
COPY ./requirements.txt /app | ||
COPY ./telephony_server/twilio_api_server.py /app | ||
|
||
RUN pip install --no-cache-dir -r requirements.txt | ||
|
||
EXPOSE 8001 | ||
|
||
CMD ["uvicorn", "twilio_api_server:app", "--host", "0.0.0.0", "--port", "8001"] |
16 changes: 16 additions & 0 deletions
examples/whisper-melo-llama3/dockerfiles/whisper_server.Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
FROM python:3.10.13-slim | ||
|
||
RUN apt-get update && apt-get install libgomp1 git -y | ||
RUN apt-get -y update && apt-get -y upgrade && apt-get install -y --no-install-recommends ffmpeg | ||
RUN apt-get -y install build-essential | ||
RUN apt-get -y install portaudio19-dev | ||
RUN git clone https://github.com/bolna-ai/streaming-whisper-server.git | ||
WORKDIR streaming-whisper-server | ||
RUN pip install -e . | ||
RUN pip install git+https://github.com/SYSTRAN/faster-whisper.git | ||
RUN pip install transformers | ||
|
||
RUN ct2-transformers-converter --model openai/whisper-small --copy_files preprocessor_config.json --output_dir ./Server/ASR/whisper_small --quantization float16 | ||
WORKDIR Server | ||
EXPOSE 9000 | ||
CMD ["python3", "Server.py", "-p", "9000"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
region: us | ||
version: '2' | ||
authtoken: <ngrok auth token> | ||
tunnels: | ||
twilio-app: | ||
addr: twilio-app:8001 | ||
proto: http | ||
bolna-app: | ||
addr: bolna-app:5001 | ||
proto: http |
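Once the ngrok container is running, the public URLs for the two tunnels defined above can be looked up through the ngrok agent's local API, which the compose file publishes on port 4040. The sketch below assumes the `/api/tunnels` endpoint returns a JSON object with a `tunnels` list whose entries carry `name` and `public_url` fields. The function names are hypothetical.

```python
import json
import urllib.request


def public_urls(tunnels_response):
    """Map tunnel name to public URL from a parsed /api/tunnels response."""
    return {
        tunnel["name"]: tunnel["public_url"]
        for tunnel in tunnels_response.get("tunnels", [])
    }


def fetch_public_urls(api_base="http://localhost:4040", timeout=5):
    """Query the local ngrok agent API and return its tunnel URLs."""
    with urllib.request.urlopen(f"{api_base}/api/tunnels", timeout=timeout) as response:
        return public_urls(json.loads(response.read().decode("utf-8")))
```

With the config above, the returned dict would be expected to contain the keys `twilio-app` and `bolna-app`.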