AvatarChatbot - Adding files to deploy AvatarChatbot application on AMD GPU #1288

Open · wants to merge 89 commits into base `main`
Commits (89):

- `cb0958c`: AvatarChatbot - add files for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `3334190`, `7b2b803`: AvatarChatbot - fix Docker Compose file for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `0b4f045`, `215eeb8`: AvatarChatbot - fix README file for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `af72b8a`: [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Dec 24, 2024)
- `5597763`: Merge remote-tracking branch 'origin/feature/AvatarChatbot_ROCm_clear… (Dec 24, 2024)
- `26f230a`: Merge branch 'main' into feature/AvatarChatbot_ROCm_clear (chyundunovDatamonsters, Dec 27, 2024)
- `aa2825f`: Merge branch 'main' into feature/AvatarChatbot_ROCm_clear (chensuyue, Jan 7, 2025)
- `12da060`: AvatarChatbot - add files for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `b33719a`, `b5a3717`: AvatarChatbot - fix Docker Compose file for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `cc2c8a1`, `6887254`: AvatarChatbot - fix README file for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `496b770`: [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Dec 24, 2024)
- `f5c13f6`: Merge remote-tracking branch 'origin/feature/AvatarChatbot_ROCm_clear… (Jan 15, 2025)
- `df6fc16`: AvatarChatbot - fix README (Jan 15, 2025)
- `1c71186`: Update Code and README for GenAIComps Refactor (#1285) (chensuyue, Jan 2, 2025)
- `f09acb3`: Fix changed file detect issue (#1339) (chensuyue, Jan 3, 2025)
- `323bd47`: fix chatqna benchmark without rerank config issue (#1341) (chensuyue, Jan 6, 2025)
- `b81f795`: Rename streaming to stream to align with OpenAI API (#1332) (XinyaoWa, Jan 6, 2025)
- `36f9f8b`: Fix code owner list (#1352) (chensuyue, Jan 6, 2025)
- `3842942`: Check duplicated dockerfile (#1289) (ZePan110, Jan 6, 2025)
- `ae5b901`: refine agent directories. (#1353) (lkk12014402, Jan 6, 2025)
- `a2c4b84`: Exclude dockerfile under tests and exclude check Dockerfile under tes… (ZePan110, Jan 7, 2025)
- `95f70c5`: Update README.md for quick start guide (#1355) (yinghu5, Jan 7, 2025)
- `4f49222`: [pre-commit.ci] pre-commit autoupdate (#1356) (pre-commit-ci[bot], Jan 7, 2025)
- `5133389`: Update README.md for support matrix (#983) (yinghu5, Jan 7, 2025)
- `b792fbe`: [ChatQNA] Fix K8s Deployment for CPU/HPU (#1274) (theBeginner86, Jan 7, 2025)
- `050cea8`: Change license template from 2024 to 2025 (#1358) (ZePan110, Jan 7, 2025)
- `cc8d31b`: Disable GMC CI temporarily (#1359) (yongfengdu, Jan 8, 2025)
- `346d83c`: Adapt refactor comps (#1340) (WenjiaoYue, Jan 8, 2025)
- `bf57252`: remove chatqna-conversation-ui build in CI test (#1361) (chensuyue, Jan 8, 2025)
- `29195d3`: Add helm deployment instructions for codegen (#1351) (yongfengdu, Jan 8, 2025)
- `ac7dcce`: Adapt example code for guardrails refactor (#1360) (lvliang-intel, Jan 8, 2025)
- `104a77e`: Refactor web retrievers links (#1338) (Spycsh, Jan 8, 2025)
- `fc3c2de`: fixed build issue (#1367) (jaswanth8888, Jan 8, 2025)
- `f0db9b9`: Enable OpenTelemetry Tracing for ChatQnA TGI serving on Gaudi (#1316) (louie-tsai, Jan 9, 2025)
- `cd474dd`: Update docker file path for feedback management refactor (#1364) (lvliang-intel, Jan 9, 2025)
- `d6ef94a`: Update example code for prompt registry refactor (#1362) (lvliang-intel, Jan 9, 2025)
- `4d89b6d`: Update path for finetuning (#1306) (XinyuYe-Intel, Jan 9, 2025)
- `4d09d96`: Update dockerfile path for text2image (#1307) (XinyuYe-Intel, Jan 9, 2025)
- `e2b801d`: Update action token for CI (#1374) (chensuyue, Jan 9, 2025)
- `2897b51`: Add helm deployment instructions for GenAIExamples (#1373) (yongfengdu, Jan 10, 2025)
- `17caae0`: Fix for animation dockerfile path. (#1371) (yao531441, Jan 10, 2025)
- `765a528`: Update example code for embedding dependency moving to 3rd_party (#1368) (lvliang-intel, Jan 10, 2025)
- `ece61a9`: Update README.md for add K8S cluster link for Gaudi (#1380) (yinghu5, Jan 13, 2025)
- `b869f5d`: Refactor Faqgen (#1323) (XinyaoWa, Jan 13, 2025)
- `03c81b1`: Refactor lvm related examples (#1333) (Spycsh, Jan 13, 2025)
- `484c594`: Refactor docsum (#1336) (XinyaoWa, Jan 13, 2025)
- `cea6c1a`: minor bug fix for EC-RAG (#1378) (Yongbozzz, Jan 14, 2025)
- `5627db6`: Remove vllm hpu commit id limit (#1386) (XinyaoWa, Jan 14, 2025)
- `401d2ac`: Update check-online-doc-build.yml (#1390) (NeoZhangJianyu, Jan 15, 2025)
- `374c388`: Fix CI filter issue (#1393) (ZePan110, Jan 15, 2025)
- `d614719`, `44c3c24`: AvatarChatbot - add files for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `2774b7d`, `ec7a2a5`, `b5c3f71`, `a3efddb`: AvatarChatbot - fix Docker Compose file for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `eca354d`, `2054bf2`, `3ad4aa3`, `296089c`: AvatarChatbot - fix README file for deploy on AMD GPU (only TGI) (Dec 24, 2024)
- `f31f275`, `243d224`: [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Dec 24, 2024)
- `c6c23d0`: AvatarChatbot - fix README (Jan 15, 2025)
- `701d6c7`: Merge remote-tracking branch 'origin/feature/AvatarChatbot_ROCm_clear… (Jan 15, 2025)
- `d897625`, `05df654`, `8e53b1e`, `79130ea`, `6cc6e0d`, `9067a44`, `7d01870`, `f3c0dd6`, `420e484`, `6a6166a`, `018e02c`, `371a513`, `63ee25c`, `a79bec1`, `98c0206`, `5a01817`, `ff222d2`: AvatarChatbot - fix tests script (Jan 15, 2025)
- `7fea1e6`, `527903b`, `b9c217d`, `dfb11b2`: AvatarChatbot - fix tests script (Jan 16, 2025)
209 changes: 209 additions & 0 deletions AvatarChatbot/docker_compose/amd/gpu/rocm/README.md
@@ -0,0 +1,209 @@
# Build Mega Service of AvatarChatbot on AMD GPU

This document outlines the deployment process for the AvatarChatbot application, built on the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline, on a server with AMD GPUs (ROCm).

## 🚀 Build Docker images

### 1. Source Code install GenAIComps

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### 2. Build ASR Image

```bash
docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/integrations/dependency/whisper/Dockerfile .


docker build -t opea/asr:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/Dockerfile .
```

### 3. Build LLM Image

```bash
docker build --no-cache -t opea/llm-textgen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/src/text-generation/Dockerfile .
```

### 4. Build TTS Image

```bash
docker build -t opea/speecht5:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/src/integrations/dependency/speecht5/Dockerfile .

docker build -t opea/tts:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/tts/src/Dockerfile .
```

### 5. Build Animation Image

```bash
docker build -t opea/wav2lip:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/wav2lip/src/Dockerfile .

docker build -t opea/animation:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/animation/src/Dockerfile .
```
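Steps 2 through 5 all follow the same `docker build` pattern, so they can be generated from a small helper. This is a convenience sketch, not part of the repo; `build_cmd` only prints the commands (a dry run), and the image names and Dockerfile paths are exactly the ones listed above. The LLM and MegaService builds use extra flags and a different repo, so run those as written in their own steps.

```shell
# Dry-run helper: print the docker build command for one OPEA image.
# Pipe the combined output to `sh` to actually execute the builds.
build_cmd() {
  # $1: image name (without the opea/ prefix), $2: Dockerfile path
  printf 'docker build -t opea/%s:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f %s .\n' "$1" "$2"
}

{
  build_cmd whisper   comps/asr/src/integrations/dependency/whisper/Dockerfile
  build_cmd asr       comps/asr/src/Dockerfile
  build_cmd speecht5  comps/tts/src/integrations/dependency/speecht5/Dockerfile
  build_cmd tts       comps/tts/src/Dockerfile
  build_cmd wav2lip   comps/third_parties/wav2lip/src/Dockerfile
  build_cmd animation comps/animation/src/Dockerfile
}   # append `| sh` to run the builds from inside the GenAIComps checkout
```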

### 6. Build MegaService Docker Image

To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `avatarchatbot.py` Python script. Build the MegaService Docker image using the command below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/AvatarChatbot/
docker build --no-cache -t opea/avatarchatbot:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

Then run `docker images`; the following images should be available:

1. `opea/whisper:latest`
2. `opea/asr:latest`
3. `opea/llm-textgen:latest`
4. `opea/speecht5:latest`
5. `opea/tts:latest`
6. `opea/wav2lip:latest`
7. `opea/animation:latest`
8. `opea/avatarchatbot:latest`
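The list above can be checked automatically with a small helper (a hypothetical convenience, not part of the repo) that compares the expected repository names against the output of `docker images`:

```shell
# Expected image repositories from the build steps above.
EXPECTED_IMAGES="opea/whisper opea/asr opea/llm-textgen opea/speecht5 opea/tts opea/wav2lip opea/animation opea/avatarchatbot"

check_images() {
  # $1: newline-separated list of available repository names
  available="$1"; missing=""
  for img in $EXPECTED_IMAGES; do
    # grep -qx matches the whole line exactly
    printf '%s\n' "$available" | grep -qx "$img" || missing="$missing $img"
  done
  if [ -z "$missing" ]; then echo "all images present"; else echo "missing:$missing"; fi
}

# Usage against a live Docker daemon:
# check_images "$(docker images --format '{{.Repository}}')"
```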

## 🚀 Set the environment variables

Before starting the services with `docker compose`, make sure the following environment variables are set correctly for your deployment:

```bash
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export host_ip=$(hostname -I | awk '{print $1}')

export TGI_SERVICE_PORT=3006
export TGI_LLM_ENDPOINT=http://${host_ip}:${TGI_SERVICE_PORT}
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"

export ASR_ENDPOINT=http://${host_ip}:7066
export TTS_ENDPOINT=http://${host_ip}:7055
export WAV2LIP_ENDPOINT=http://${host_ip}:7860

export MEGA_SERVICE_HOST_IP=${host_ip}
export ASR_SERVICE_HOST_IP=${host_ip}
export TTS_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export ANIMATION_SERVICE_HOST_IP=${host_ip}

export MEGA_SERVICE_PORT=8888
export ASR_SERVICE_PORT=3001
export TTS_SERVICE_PORT=3002
export LLM_SERVICE_PORT=3007
export ANIMATION_SERVICE_PORT=3008

export DEVICE="cpu"
export WAV2LIP_PORT=7860
export INFERENCE_MODE='wav2lip+gfpgan'
export CHECKPOINT_PATH='/usr/local/lib/python3.11/site-packages/Wav2Lip/checkpoints/wav2lip_gan.pth'
export FACE="assets/img/avatar5.png"
# export AUDIO='assets/audio/eg3_ref.wav' # the audio file path is optional; if AUDIO is 'None', the base64 string in the POST request is used as input
export AUDIO='None'
export FACESIZE=96
export OUTFILE="/outputs/result.mp4"
export GFPGAN_MODEL_VERSION=1.4 # latest version, can roll back to v1.3 if needed
export UPSCALE_FACTOR=1
export FPS=10
```
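Missing or empty variables are a common source of silent failures at `docker compose up`, so it can help to fail fast with a small guard like the following (a sketch, not part of the repo; the variable names in the example are the ones exported above):

```shell
# Abort early if any required environment variable is unset or empty.
require_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable named in $name
    eval "val=\${$name:-}"
    [ -n "$val" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "ERROR: unset variables:$missing" >&2
    return 1
  fi
  return 0
}

# Example: check the most critical settings before starting the stack.
# require_vars HUGGINGFACEHUB_API_TOKEN host_ip LLM_MODEL_ID TGI_LLM_ENDPOINT || exit 1
```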

Warning: in this solution the Wav2Lip service runs on CPU only. To use AMD GPUs and reach operational performance, the Wav2Lip image would need to be adapted to AMD hardware and the ROCm framework.

## 🚀 Start the MegaService

```bash
cd GenAIExamples/AvatarChatbot/docker_compose/amd/gpu/rocm/
docker compose -f compose.yaml up -d
```

## 🚀 Test MicroServices

```bash
# whisper service
curl http://${host_ip}:7066/v1/asr \
-X POST \
-d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'

# asr microservice
curl http://${host_ip}:3001/v1/audio/transcriptions \
-X POST \
-d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-H 'Content-Type: application/json'

# tgi service
curl http://${host_ip}:3006/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'

# llm microservice
curl http://${host_ip}:3007/v1/chat/completions \
-X POST \
-d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":false}' \
-H 'Content-Type: application/json'

# speecht5 service
curl http://${host_ip}:7055/v1/tts \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'

# tts microservice
curl http://${host_ip}:3002/v1/audio/speech \
-X POST \
-d '{"text": "Who are you?"}' \
-H 'Content-Type: application/json'

# wav2lip service
cd ../../../..
curl http://${host_ip}:7860/v1/wav2lip \
-X POST \
-d @assets/audio/sample_minecraft.json \
-H 'Content-Type: application/json'

# animation microservice
curl http://${host_ip}:3008/v1/animation \
-X POST \
-d @assets/audio/sample_question.json \
-H "Content-Type: application/json"

```
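On first start, the model-serving containers may spend several minutes downloading weights, so the curl checks above can fail transiently. A generic polling helper (a sketch; the endpoint in the usage comment is illustrative) avoids hand-retrying:

```shell
# Poll a URL until it answers with a successful HTTP status, or give up.
wait_for_http() {
  # $1: URL, $2: max attempts (default 30), $3: delay between attempts in seconds (default 10)
  url="$1"; tries="${2:-30}"; delay="${3:-10}"; i=1
  while [ "$i" -le "$tries" ]; do
    if curl -sf --max-time 5 -o /dev/null "$url"; then
      echo "up after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up on $url" >&2
  return 1
}

# Example: wait up to 10 minutes for the whisper service to come up.
# wait_for_http "http://${host_ip}:7066/" 60 10
```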

## 🚀 Test MegaService

```bash
curl http://${host_ip}:3009/v1/avatarchatbot \
-X POST \
-d @assets/audio/sample_whoareyou.json \
-H 'Content-Type: application/json'
```

If the megaservice is running properly, you should see the following output:

```bash
"/outputs/result.mp4"
```

The output file will be saved in the current working directory, as `${PWD}` is mapped to `/outputs` inside the wav2lip-service Docker container.
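Since the response only returns the container-side path, a small helper (again a sketch, not part of the repo) can block until the mapped file actually appears on the host:

```shell
# Wait until a file exists and is non-empty, or time out.
wait_for_file() {
  # $1: file path, $2: timeout in seconds (default 60)
  path="$1"; timeout="${2:-60}"; waited=0
  while [ ! -s "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $path" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "found $path"
}

# wait_for_file "${PWD}/result.mp4" 300
```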

## Gradio UI

```bash
cd $WORKPATH/GenAIExamples/AvatarChatbot
python3 ui/gradio/app_gradio_demo_avatarchatbot.py
```

The UI can be viewed at `http://${host_ip}:7861`.
<img src="../../../../assets/img/UI.png" alt="UI Example" width="60%">
In the current version (v1.0), the avatar figure image/video and the DL model choice must be set via environment variables before starting the AvatarChatbot backend service and running the UI; only the audio question can be customized in the UI. (Changing the avatar figure between runs will be enabled in v2.0.)

## Troubleshooting

```bash
cd GenAIExamples/AvatarChatbot/tests
export IMAGE_REPO="opea"
export IMAGE_TAG="latest"
export HUGGINGFACEHUB_API_TOKEN=<your_hf_token>

bash test_avatarchatbot_on_rocm.sh
```
158 changes: 158 additions & 0 deletions AvatarChatbot/docker_compose/amd/gpu/rocm/compose.yaml
@@ -0,0 +1,158 @@

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
whisper-service:
image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
container_name: whisper-service
ports:
- "7066:7066"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
asr:
image: ${REGISTRY:-opea}/asr:${TAG:-latest}
container_name: asr-service
ports:
- "3001:9099"
ipc: host
environment:
ASR_ENDPOINT: ${ASR_ENDPOINT}
speecht5-service:
image: ${REGISTRY:-opea}/speecht5:${TAG:-latest}
container_name: speecht5-service
ports:
- "7055:7055"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
restart: unless-stopped
tts:
image: ${REGISTRY:-opea}/tts:${TAG:-latest}
container_name: tts-service
ports:
- "3002:9088"
ipc: host
environment:
TTS_ENDPOINT: ${TTS_ENDPOINT}
tgi-service:
image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
container_name: tgi-service
ports:
- "${TGI_SERVICE_PORT:-3006}:80"
volumes:
- "./data:/data"
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
shm_size: 1g
devices:
- /dev/kfd:/dev/kfd
- /dev/dri/:/dev/dri/
cap_add:
- SYS_PTRACE
group_add:
- video
security_opt:
- seccomp:unconfined
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192
llm:
image: ${REGISTRY:-opea}/llm-textgen:${TAG:-latest}
container_name: llm-tgi-server
depends_on:
- tgi-service
ports:
- "3007:9000"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
OPENAI_API_KEY: ${OPENAI_API_KEY}
restart: unless-stopped
wav2lip-service:
image: ${REGISTRY:-opea}/wav2lip:${TAG:-latest}
container_name: wav2lip-service
ports:
- "7860:7860"
ipc: host
volumes:
- ${PWD}:/outputs
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
DEVICE: ${DEVICE}
INFERENCE_MODE: ${INFERENCE_MODE}
CHECKPOINT_PATH: ${CHECKPOINT_PATH}
FACE: ${FACE}
AUDIO: ${AUDIO}
FACESIZE: ${FACESIZE}
OUTFILE: ${OUTFILE}
GFPGAN_MODEL_VERSION: ${GFPGAN_MODEL_VERSION}
UPSCALE_FACTOR: ${UPSCALE_FACTOR}
FPS: ${FPS}
WAV2LIP_PORT: ${WAV2LIP_PORT}
restart: unless-stopped
animation:
image: ${REGISTRY:-opea}/animation:${TAG:-latest}
container_name: animation-server
ports:
- "3008:9066"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
WAV2LIP_ENDPOINT: ${WAV2LIP_ENDPOINT}
restart: unless-stopped
avatarchatbot-backend-server:
image: ${REGISTRY:-opea}/avatarchatbot:${TAG:-latest}
container_name: avatarchatbot-backend-server
depends_on:
- asr
- llm
- tts
- animation
ports:
- "3009:8888"
environment:
no_proxy: ${no_proxy}
https_proxy: ${https_proxy}
http_proxy: ${http_proxy}
MEGA_SERVICE_HOST_IP: ${MEGA_SERVICE_HOST_IP}
MEGA_SERVICE_PORT: ${MEGA_SERVICE_PORT}
ASR_SERVICE_HOST_IP: ${ASR_SERVICE_HOST_IP}
ASR_SERVICE_PORT: ${ASR_SERVICE_PORT}
LLM_SERVICE_HOST_IP: ${LLM_SERVICE_HOST_IP}
LLM_SERVICE_PORT: ${LLM_SERVICE_PORT}
LLM_SERVER_HOST_IP: ${LLM_SERVICE_HOST_IP}
LLM_SERVER_PORT: ${LLM_SERVICE_PORT}
TTS_SERVICE_HOST_IP: ${TTS_SERVICE_HOST_IP}
TTS_SERVICE_PORT: ${TTS_SERVICE_PORT}
ANIMATION_SERVICE_HOST_IP: ${ANIMATION_SERVICE_HOST_IP}
ANIMATION_SERVICE_PORT: ${ANIMATION_SERVICE_PORT}
WHISPER_SERVER_HOST_IP: ${WHISPER_SERVER_HOST_IP}
WHISPER_SERVER_PORT: ${WHISPER_SERVER_PORT}
SPEECHT5_SERVER_HOST_IP: ${SPEECHT5_SERVER_HOST_IP}
SPEECHT5_SERVER_PORT: ${SPEECHT5_SERVER_PORT}
ipc: host
restart: always

networks:
default:
driver: bridge