[NeuralChat] Integrate photoai backend into restful API (#478)
letonghan authored Oct 30, 2023
1 parent d9d0a32 commit d7a1d8d
Showing 19 changed files with 1,702 additions and 15 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/script/unitTest/run_unit_test_neuralchat.sh
@@ -73,6 +73,10 @@ function main() {
     apt-get update
     apt-get install ffmpeg -y
     apt-get install lsof
+    apt-get install libgl1
+    apt-get install -y libgl1-mesa-glx
+    apt-get install -y libgl1-mesa-dev
+    apt-get install libsm6 libxext6 -y
     wget http://nz2.archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb
     dpkg -i libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb
     python -m pip install --upgrade --force-reinstall torch
@@ -0,0 +1,8 @@
Welcome to Photo AI! This example introduces how to deploy the Photo AI system and guides you through setting up both the backend and frontend components. You can deploy this chatbot on various platforms, including Intel Xeon Scalable Processors, Habana's Gaudi processors (HPU), Intel Data Center and Client GPUs, and NVIDIA Data Center and Client GPUs.

| Section | Link |
| ---------------------| --------------------------|
| Backend Setup | [Backend README](./backend/README.md) |
| Frontend Setup | [Frontend README](./frontend/README.md) |


@@ -0,0 +1,99 @@
This README is intended to guide you through setting up the backend for a Photo AI demo using the NeuralChat framework. You can deploy it on various platforms, including Intel Xeon Scalable Processors, Habana's Gaudi processors (HPU), Intel Data Center and Client GPUs, and NVIDIA Data Center and Client GPUs.


# Setup Environment


## Setup Conda

First, you need to install and configure the Conda environment:

```shell
# Download and install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
```
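
The demo does not mandate a particular environment, but isolating its dependencies is safer. A minimal sketch, assuming an environment called `photoai` and Python 3.9 (both arbitrary choices, not required by this README):

```shell
# create and activate a dedicated environment (name and Python version are assumptions)
conda create -n photoai python=3.9 -y
conda activate photoai
```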

## Install numactl

Next, install the numactl library:

```shell
sudo apt install numactl
```

## Install Python dependencies

Install the following Python dependencies using Conda:

```shell
conda install astunparse ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses -y
conda install jemalloc gperftools -c conda-forge -y
conda install git-lfs -y
# install libGL.so.1 for opencv
sudo apt-get update
sudo apt-get install -y libgl1-mesa-glx
```
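
The packages above should provide the `libiomp5.so` and `libtcmalloc.so` libraries that `run.sh` later preloads via `LD_PRELOAD`. A quick sanity check, assuming your Conda environment is active:

```shell
# confirm the libraries referenced by run.sh exist in the active environment
ls ${CONDA_PREFIX}/lib/libiomp5.so ${CONDA_PREFIX}/lib/libtcmalloc.so
```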

Install other dependencies using pip:

```bash
pip install -r ../../../requirements.txt
```

## Install Models
```shell
git-lfs install
# download llama-2 model for NER plugin
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
# download spacy model for NER post process
python -m spacy download en_core_web_lg
```
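
Note that the `meta-llama` checkpoints are gated on Hugging Face, so the clone above may fail with an authentication error until you have requested access to the model and logged in:

```shell
# authenticate before cloning the gated Llama-2 repository (requires granted access)
huggingface-cli login
```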


# Setup Database
## Install MySQL
```shell
# install mysql
sudo apt-get install -y mysql-server
# start the mysql server and verify it is running
sudo systemctl start mysql
systemctl status mysql
```
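
The server reads its MySQL credentials from environment variables exported in `run.sh` below, which assumes user `root` with password `root`. If your installation uses socket authentication or a different password, align it first; a sketch, assuming the `root`/`root` pair from `run.sh`:

```shell
# set the root password that run.sh expects (assumption: root/root)
sudo mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'root'; FLUSH PRIVILEGES;"
```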

## Create Tables
```shell
cd ../../../utils/database/
# log in to the mysql shell
mysql
# inside the mysql shell, create the tables
source ./init_db_ai_photos.sql
```
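
`run.sh` points the server at a database named `ai_photos`, so you can confirm the schema was created there:

```shell
# list the tables created by init_db_ai_photos.sql
mysql -e "USE ai_photos; SHOW TABLES;"
```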

## Create Image Database
```shell
# create the directory where uploaded images are stored
mkdir -p /home/nfs_images
# expose the image server IP
export IMAGE_SERVER_IP="your.server.ip"
```

# Configure photoai.yaml

You can customize the configuration file `photoai.yaml` to match your environment. The following table describes the configurable options:

| Item | Value |
| ------------------- | ---------------------------------------|
| host | 127.0.0.1 |
| port | 9000 |
| model_name_or_path | "./Llama-2-7b-chat-hf" |
| device | "auto" |
| asr.enable | true |
| tts.enable | true |
| ner.enable | true |
| tasks_list | ['voicechat', 'photoai'] |


# Run the PhotoAI server
To start the PhotoAI server, use the following command:

```shell
nohup bash run.sh &
```
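
`run.sh` tees the server output to `run.log`, and `photoai.yaml` binds the service to port 9000, so startup can be verified with:

```shell
# follow the server log
tail -f run.log
# in another terminal, confirm the service is listening on port 9000
lsof -i :9000
```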
@@ -0,0 +1,29 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
from intel_extension_for_transformers.neural_chat import NeuralChatServerExecutor

def main():
    server_executor = NeuralChatServerExecutor()
    server_executor(
        config_file="./photoai.yaml",
        log_file="./photoai.log")


if __name__ == "__main__":
    main()
@@ -0,0 +1,53 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This is the parameter configuration file for NeuralChat Serving.

#################################################################################
# SERVER SETTING #
#################################################################################
host: 0.0.0.0
port: 9000

model_name_or_path: "meta-llama/Llama-2-7b-chat-hf"
device: "auto"

asr:
    enable: true
    args:
        device: "cpu"
        model_name_or_path: "openai/whisper-small"
        bf16: false

tts:
    enable: true
    args:
        device: "cpu"
        voice: "default"
        stream_mode: true
        output_audio_path: "./output_audio"

ner:
    enable: true
    args:
        device: "cpu"
        model_path: "./Llama-2-7b-chat-hf"
        spacy_model: "en_core_web_lg"
        bf16: true


tasks_list: ['voicechat', 'photoai']
@@ -0,0 +1,38 @@
#!/usr/bin/env bash
#
# Copyright (c) 2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Kill any existing photoai process and re-run
ps -ef | grep 'photoai' | awk '{print $2}' | xargs kill -9

# KMP
export KMP_BLOCKTIME=1
export KMP_SETTINGS=1
export KMP_AFFINITY=granularity=fine,compact,1,0

# OMP
export OMP_NUM_THREADS=56
export LD_PRELOAD=${CONDA_PREFIX}/lib/libiomp5.so

# tc malloc
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so

# environment variables
export MYSQL_PASSWORD="root"
export MYSQL_HOST="127.0.0.1"
export MYSQL_DB="ai_photos"

numactl -l -C 0-55 python -m photoai 2>&1 | tee run.log
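
The `OMP_NUM_THREADS=56` and `numactl -C 0-55` settings above assume a machine with 56 cores available on the local NUMA node; check your topology and adjust both values accordingly:

```shell
# inspect core count and NUMA layout before editing run.sh
lscpu | grep -E "^CPU\(s\)|NUMA"
```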
@@ -26,7 +26,6 @@
     TextIteratorStreamer,
     AutoConfig,
 )
-import intel_extension_for_pytorch as intel_ipex
 from .utils.utils import (
     enforce_stop_tokens,
     get_current_time
@@ -41,11 +40,17 @@ class NamedEntityRecognition():
     Set bf16=True if you want to inference with bf16 model.
     """

-    def __init__(self, model_path="./Llama-2-7b-chat-hf/", spacy_model="en_core_web_lg", bf16: bool=False) -> None:
+    def __init__(self,
+                 model_path="meta-llama/Llama-2-7b-chat-hf",
+                 spacy_model="en_core_web_lg",
+                 bf16: bool=False,
+                 device="cpu") -> None:
         # initialize tokenizer and models
         self.nlp = spacy.load(spacy_model)
         config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
         config.init_device = 'cuda:0' if torch.cuda.is_available() else "cpu"
+        self.device = device
+        self.bf16 = False
         self.tokenizer = AutoTokenizer.from_pretrained(
             model_path,
             use_fast=False if (re.search("llama", model_path, re.IGNORECASE)
@@ -59,9 +64,15 @@ def __init__(self, model_path="./Llama-2-7b-chat-hf/", spacy_model="en_core_web_lg", bf16: bool=False) -> None:
             device_map="auto",
             trust_remote_code=True
         )
-        self.bf16 = bf16
+        # make sure ipex is available on current server
+        try:
+            import intel_extension_for_pytorch as intel_ipex
+            self.is_ipex_available = True
+        except ImportError:
+            self.is_ipex_available = False
         # optimize model with ipex if bf16
-        if bf16:
+        if bf16 and self.is_ipex_available:
+            self.bf16 = bf16
             self.model = intel_ipex.optimize(
                 self.model.eval(),
                 dtype=torch.bfloat16,
@@ -38,10 +38,11 @@ class NamedEntityRecognitionINT():
     """

     def __init__(self,
-                 model_path="/home/tme/Llama-2-7b-chat-hf/",
+                 model_path="meta-llama/Llama-2-7b-chat-hf",
                  spacy_model="en_core_web_lg",
                  compute_dtype="fp32",
-                 weight_dtype="int8") -> None:
+                 weight_dtype="int8",
+                 device="cpu") -> None:
         self.nlp = spacy.load(spacy_model)
         config = WeightOnlyQuantConfig(compute_dtype=compute_dtype, weight_dtype=weight_dtype)
         self.tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
3 changes: 3 additions & 0 deletions intel_extension_for_transformers/neural_chat/requirements.txt
@@ -39,6 +39,9 @@ tiktoken==0.4.0
 lm_eval
 accelerate
 cchardet
+pymysql
+deepface
+exifread
 spacy
 neural-compressor==2.3.1
 pymysql
@@ -42,3 +42,6 @@ torchaudio==2.1.0
 spacy
 neural-compressor==2.3.1
 pymysql
+deepface
+exifread
+
@@ -35,3 +35,8 @@ tiktoken==0.4.0
 lm_eval
 spacy
 neural-compressor==2.3.1
+intel_extension_for_pytorch
+pymysql
+deepface
+exifread
+
21 changes: 15 additions & 6 deletions intel_extension_for_transformers/neural_chat/server/README.md
@@ -10,7 +10,7 @@ neuralchat_server help
 ### Start the server
 - Command Line (Recommended)

-NeuralChat provides a default chatbot configuration in `./conf/neuralchat.yaml`. Users can customize the chatbot's behavior by modifying the values of these fields in the configuration file to specify which LLM model and plugins to use.
+NeuralChat provides a default chatbot configuration in `./config/neuralchat.yaml`. Users can customize the chatbot's behavior by modifying the values of these fields in the configuration file to specify which LLM model and plugins to use.

 | Fields                    | Sub Fields               | Default Values                          | Possible Values                    |
 | ------------------------- | ------------------------ | --------------------------------------- | --------------------------------- |
@@ -42,18 +42,27 @@ NeuralChat provides a default chatbot configuration in `./conf/neuralchat.yaml`.
 | | args.process | false | true, false |
 | cache | enable | false | true, false |
 | | args.config_dir | "../../pipeline/plugins/caching/cache_config.yaml" | A valid directory path |
 | | args.embedding_model_dir | "hkunlp/instructor-large" | A valid directory path |
 | safety_checker | enable | false | true, false |
-| tasks_list | | ['textchat', 'retrieval'] | List of task names, including 'textchat', 'voicechat', 'retrieval', 'text2image', 'finetune' |
+| ner | enable | false | true, false |
+| | args.model_path | "meta-llama/Llama-2-7b-chat-hf" | A valid directory path of llm model |
+| | args.spacy_model | "en_core_web_lg" | A valid name of downloaded spacy model |
+| | args.bf16 | false | true, false |
+| ner_int | enable | false | true, false |
+| | args.model_path | "meta-llama/Llama-2-7b-chat-hf" | A valid directory path of llm model |
+| | args.spacy_model | "en_core_web_lg" | A valid name of downloaded spacy model |
+| | args.compute_dtype | "fp32" | "fp32", "int8" |
+| | args.weight_dtype | "int8" | "int8", "int4" |
+| tasks_list | | ['textchat', 'retrieval'] | List of task names, including 'textchat', 'voicechat', 'retrieval', 'text2image', 'finetune', 'photoai' |


-First set the service-related configuration parameters, similar to `./conf/neuralchat.yaml`. Set `tasks_list`, which represents the supported tasks included in the service to be started.
+First set the service-related configuration parameters, similar to `./config/neuralchat.yaml`. Set `tasks_list`, which represents the supported tasks included in the service to be started.
 **Note:** If the service can be started normally in the container, but the client access IP is unreachable, you can try to replace the `host` address in the configuration file with the local IP address.

 Then start the service:
 ```bash
-neuralchat_server start --config_file ./server/conf/neuralchat.yaml
+neuralchat_server start --config_file ./server/config/neuralchat.yaml
 ```
- Python API
@@ -62,7 +71,7 @@ from neuralchat.server.neuralchat_server import NeuralChatServerExecutor

 server_executor = NeuralChatServerExecutor()
 server_executor(
-    config_file="./conf/neuralchat.yaml",
+    config_file="./config/neuralchat.yaml",
     log_file="./log/neuralchat.log")
 ```

@@ -77,17 +77,19 @@ safety_checker:
 ner:
     enable: false
     args:
+        device: "cpu"
         model_path: "meta-llama/Llama-2-7b-chat-hf"
         spacy_model: "en_core_web_lg"
         bf16: False

 ner_int:
     enable: false
     args:
+        device: "cpu"
         model_path: "meta-llama/Llama-2-7b-chat-hf"
         spacy_model: "en_core_web_lg"
         compute_dtype: "fp32"
         weight_dtype: "int8"

-# task choices = ['textchat', 'voicechat', 'retrieval', 'text2image', 'finetune']
+# task choices = ['textchat', 'voicechat', 'retrieval', 'text2image', 'finetune', 'photoai']
 tasks_list: ['textchat', 'retrieval']