Merge pull request #459 from deeppavlov/dev
Release v1.4.0
dilyararimovna authored May 17, 2023
2 parents 8106de6 + 27d9174 commit c28c38c
Showing 26 changed files with 317 additions and 66 deletions.
16 changes: 9 additions & 7 deletions MODELS.md
@@ -2,10 +2,12 @@

Here you may find a list of models that are currently available for use in Generative Assistants.

| model name | container name | model link | open-source? | size (billion parameters) | GPU usage | max tokens (prompt + response) | description |
|--------------------------|--------------------------|---------------------------------------------------------------------|--------------------------|---------------------------|---------------------------|--------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| BLOOMZ 7B | transformers-lm-bloomz7b | [link](https://huggingface.co/bigscience/bloomz-7b1) | yes | 7.1B | 33GB | 2,048 tokens | An open-source multilingual instruction-based large language model (46 languages). NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-J 6B | transformers-lm-gptj | [link](https://huggingface.co/EleutherAI/gpt-j-6b) | yes | 6B | 25GB | 2,048 tokens | An open-source English-only large language model which is NOT fine-tuned for instruction following and NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-3.5                  | openai-api-davinci3      | [link](https://platform.openai.com/docs/models/gpt-3-5)             | no (paid access via API) | supposedly, 175B          | - (cannot be run locally) | 4,097 tokens                   | A multilingual instruction-based large language model which is capable of code generation. Unlike ChatGPT, not optimised for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage.                  |
| ChatGPT | openai-api-chatgpt | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,096 tokens | Based on gpt-3.5-turbo -- the most capable of the entire GPT-3/GPT-3.5 models family. Optimized for chat. Able to understand and generate code. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
| Open-Assistant SFT-1 12B | transformers-lm-oasst12b | [link](https://huggingface.co/OpenAssistant/oasst-sft-1-pythia-12b) | yes | 12B | 26GB (half-precision) | 5,120 tokens | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. |
| model name | container name | model link | open-source? | size (billion parameters) | GPU usage | max tokens (prompt + response) | description |
|---------------------------|--------------------------|-------------------------------------------------------------------------|--------------------------|---------------------------|---------------------------|--------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| BLOOMZ 7B | transformers-lm-bloomz7b | [link](https://huggingface.co/bigscience/bloomz-7b1) | yes | 7.1B | 33GB | 2,048 tokens | An open-source multilingual instruction-based large language model (46 languages). NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-J 6B | transformers-lm-gptj | [link](https://huggingface.co/EleutherAI/gpt-j-6b) | yes | 6B | 25GB | 2,048 tokens | An open-source English-only large language model which is NOT fine-tuned for instruction following and NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-3.5                   | openai-api-davinci3      | [link](https://platform.openai.com/docs/models/gpt-3-5)                 | no (paid access via API) | supposedly, 175B          | - (cannot be run locally) | 4,097 tokens                   | A multilingual instruction-based large language model which is capable of code generation. Unlike ChatGPT, not optimised for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage.                  |
| ChatGPT | openai-api-chatgpt | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,096 tokens | Based on gpt-3.5-turbo -- the most capable of the entire GPT-3/GPT-3.5 models family. Optimized for chat. Able to understand and generate code. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
| Open-Assistant Pythia 12B | transformers-lm-oasst12b | [link](https://huggingface.co/OpenAssistant/pythia-12b-sft-v8-7k-steps) | yes | 12B | 26GB (half-precision) | 5,120 tokens | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-4 | openai-api-gpt4 | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 8,192 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
| GPT-4 32K                 | openai-api-gpt4-32k      | [link](https://platform.openai.com/docs/models/gpt-4)                   | no (paid access via API) | supposedly, 175B          | - (cannot be run locally) | 32,768 tokens                  | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. Same capabilities as the base gpt-4 model but with 4x the context length. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage.                       |
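As an aside (not part of the diff): a hedged sketch of how a client might call one of these containerized LM services. The field names below are taken from `services/openai_api_lm/server.py` further down in this diff; the exact request schema is an assumption and may differ for other services or versions.

```python
import json

def build_respond_payload(contexts, prompt, api_key, config=None):
    """Build a request body for an openai_api_lm-style /respond endpoint.

    Each entry in `contexts` is one dialog (a list of utterances); the
    server expects parallel lists for prompts, configs, and API keys.
    """
    n = len(contexts)
    return {
        "dialog_contexts": contexts,          # list of dialogs
        "prompts": [prompt] * n,              # one system prompt per dialog
        "configs": [config] * n,              # None -> server falls back to its default config
        "openai_api_keys": [api_key] * n,     # per-dialog API keys
        "openai_api_organizations": None,     # optional; server fills [None] * n
    }

payload = build_respond_payload(
    contexts=[["Hi!", "Hello! How can I help?", "Write a haiku about GPUs."]],
    prompt="You are a helpful assistant.",
    api_key="sk-demo",  # placeholder, not a real key
)
print(json.dumps(payload, indent=2))
```

In deployment the payload would be POSTed to a service URL such as `http://openai-api-gpt4:8159/respond` (the port mappings appear in the compose files below).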
@@ -107,7 +107,7 @@ services:
args:
SERVICE_PORT: 8158
SERVICE_NAME: transformers_lm_oasst12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/oasst-sft-1-pythia-12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/pythia-12b-sft-v8-7k-steps
HALF_PRECISION: 1
context: .
dockerfile: ./services/transformers_lm/Dockerfile
@@ -107,7 +107,7 @@ services:
args:
SERVICE_PORT: 8158
SERVICE_NAME: transformers_lm_oasst12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/oasst-sft-1-pythia-12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/pythia-12b-sft-v8-7k-steps
HALF_PRECISION: 1
context: .
dockerfile: ./services/transformers_lm/Dockerfile
@@ -107,7 +107,7 @@ services:
args:
SERVICE_PORT: 8158
SERVICE_NAME: transformers_lm_oasst12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/oasst-sft-1-pythia-12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/pythia-12b-sft-v8-7k-steps
HALF_PRECISION: 1
context: .
dockerfile: ./services/transformers_lm/Dockerfile
12 changes: 12 additions & 0 deletions assistant_dists/universal_prompted_assistant/dev.yml
@@ -54,6 +54,18 @@ services:
- "./common:/src/common"
ports:
- 8131:8131
openai-api-gpt4:
volumes:
- "./services/openai_api_lm:/src"
- "./common:/src/common"
ports:
- 8159:8159
openai-api-gpt4-32k:
volumes:
- "./services/openai_api_lm:/src"
- "./common:/src/common"
ports:
- 8160:8160
dff-universal-prompted-skill:
volumes:
- "./skills/dff_universal_prompted_skill:/src"
@@ -5,6 +5,7 @@ services:
WAIT_HOSTS: "sentseg:8011, ranking-based-response-selector:8002, combined-classification:8087,
sentence-ranker:8128,
transformers-lm-gptj:8130, transformers-lm-oasst12b:8158, openai-api-chatgpt:8145, openai-api-davinci3:8131,
openai-api-gpt4:8159, openai-api-gpt4-32k:8160,
dff-universal-prompted-skill:8147"
WAIT_HOSTS_TIMEOUT: ${WAIT_TIMEOUT:-1000}

@@ -109,7 +110,7 @@ services:
args:
SERVICE_PORT: 8158
SERVICE_NAME: transformers_lm_oasst12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/oasst-sft-1-pythia-12b
PRETRAINED_MODEL_NAME_OR_PATH: OpenAssistant/pythia-12b-sft-v8-7k-steps
HALF_PRECISION: 1
context: .
dockerfile: ./services/transformers_lm/Dockerfile
@@ -164,6 +165,46 @@ services:
reservations:
memory: 100M

openai-api-gpt4:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8159
SERVICE_NAME: openai_api_gpt4
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8159
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M

openai-api-gpt4-32k:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8160
SERVICE_NAME: openai_api_gpt4_32k
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8160
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M

dff-universal-prompted-skill:
env_file: [ .env ]
build:
@@ -172,6 +213,8 @@
SERVICE_NAME: dff_universal_prompted_skill
GENERATIVE_TIMEOUT: 20
N_UTTERANCES_CONTEXT: 7
DEFAULT_LM_SERVICE_URL: http://transformers-lm-oasst12b:8158/respond
DEFAULT_LM_SERVICE_CONFIG: default_generative_config.json
context: .
dockerfile: ./skills/dff_universal_prompted_skill/Dockerfile
command: gunicorn --workers=1 server:app -b 0.0.0.0:8147 --reload
28 changes: 28 additions & 0 deletions components/jkdhfgkhgodfiugpojwrnkjnlg.yml
@@ -0,0 +1,28 @@
name: openai_api_gpt4
display_name: GPT-4
component_type: Generative
model_type: NN-based
is_customizable: false
author: publisher@deeppavlov.ai
description: A multilingual instruction-based large language model
which is capable of code generation and other complex tasks.
More capable than any GPT-3.5 model, able to do more complex tasks,
and optimized for chat. Paid.
You must provide your OpenAI API key to use the model.
Your OpenAI account will be charged according to your usage.
ram_usage: 100M
gpu_usage: null
group: services
connector:
protocol: http
timeout: 20.0
url: http://openai-api-gpt4:8159/respond
dialog_formatter: null
response_formatter: null
previous_services: null
required_previous_services: null
state_manager_method: null
tags: null
endpoint: respond
service: services/openai_api_lm/service_configs/openai-api-gpt4
date_created: '2023-04-16T09:45:32'
27 changes: 27 additions & 0 deletions components/oinfjkrbnfmhkfsjdhfsd.yml
@@ -0,0 +1,27 @@
name: openai_api_gpt4_32k
display_name: GPT-4 32k
component_type: Generative
model_type: NN-based
is_customizable: false
author: publisher@deeppavlov.ai
description: A multilingual instruction-based large language model
which is capable of code generation and other complex tasks.
Same capabilities as the base gpt-4 model but with 4x the context length.
Paid. You must provide your OpenAI API key to use the model.
Your OpenAI account will be charged according to your usage.
ram_usage: 100M
gpu_usage: null
group: services
connector:
protocol: http
timeout: 20.0
url: http://openai-api-gpt4-32k:8160/respond
dialog_formatter: null
response_formatter: null
previous_services: null
required_previous_services: null
state_manager_method: null
tags: null
endpoint: respond
service: services/openai_api_lm/service_configs/openai-api-gpt4-32k
date_created: '2023-04-16T09:45:32'
2 changes: 1 addition & 1 deletion components/sdkajfhsidhf8wfjh2ornfkle.yml
@@ -1,5 +1,5 @@
name: transformers_lm_oasst12b
display_name: Open-Assistant SFT-1 12B
display_name: Open-Assistant Pythia 12B
component_type: Generative
model_type: NN-based
is_customizable: false
2 changes: 1 addition & 1 deletion response_selectors/llm_based_response_selector/server.py
@@ -35,7 +35,7 @@
)
ENVVARS_TO_SEND = getenv("ENVVARS_TO_SEND", None)
ENVVARS_TO_SEND = [] if ENVVARS_TO_SEND is None else ENVVARS_TO_SEND.split(",")
sending_variables = {f"{var}_list": [getenv(var, None)] for var in ENVVARS_TO_SEND}
sending_variables = {f"{var}s": [getenv(var, None)] for var in ENVVARS_TO_SEND}
# check if at least one of the env variables is not None
if len(sending_variables.keys()) > 0 and all([var_value is None for var_value in sending_variables.values()]):
raise NotImplementedError(
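A minimal sketch (not part of the diff) of the `ENVVARS_TO_SEND` pattern after this change: each forwarded variable `VAR` now lands under the pluralized key `VARs` (previously `VAR_list`), holding a one-element list of its value. The variable names here are illustrative.

```python
import os

# Illustration only: seed one variable and ensure the other is unset.
os.environ["OPENAI_API_KEY"] = "sk-demo"        # not a real key
os.environ.pop("OPENAI_ORGANIZATION", None)

ENVVARS_TO_SEND = "OPENAI_API_KEY,OPENAI_ORGANIZATION".split(",")
# Mirrors the changed line above: pluralized keys, one-element value lists.
sending_variables = {f"{var}s": [os.getenv(var, None)] for var in ENVVARS_TO_SEND}
print(sending_variables)
```

An unset variable is forwarded as `[None]`, which is what the `all(... is None ...)` check two lines below guards against.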
7 changes: 7 additions & 0 deletions services/openai_api_lm/generative_configs/openai-chatgpt.json
@@ -0,0 +1,7 @@
{
"max_tokens": 64,
"temperature": 0.4,
"top_p": 1.0,
"frequency_penalty": 0,
"presence_penalty": 0
}
@@ -0,0 +1,7 @@
{
"max_tokens": 256,
"temperature": 0.4,
"top_p": 1.0,
"frequency_penalty": 0,
"presence_penalty": 0
}
@@ -0,0 +1,7 @@
{
"max_tokens": 64,
"temperature": 0.4,
"top_p": 1.0,
"frequency_penalty": 0,
"presence_penalty": 0
}
2 changes: 1 addition & 1 deletion services/openai_api_lm/requirements.txt
@@ -6,4 +6,4 @@ sentry-sdk[flask]==0.14.1
healthcheck==1.3.3
jinja2<=3.0.3
Werkzeug<=2.0.3
openai==0.27.0
openai==0.27.6
19 changes: 14 additions & 5 deletions services/openai_api_lm/server.py
@@ -1,3 +1,4 @@
import json
import logging
import os
import time
@@ -22,6 +23,13 @@

app = Flask(__name__)
logging.getLogger("werkzeug").setLevel("WARNING")
DEFAULT_CONFIGS = {
"text-davinci-003": json.load(open("generative_configs/openai-text-davinci-003.json", "r")),
"gpt-3.5-turbo": json.load(open("generative_configs/openai-chatgpt.json", "r")),
"gpt-4": json.load(open("generative_configs/openai-chatgpt.json", "r")),
"gpt-4-32k": json.load(open("generative_configs/openai-chatgpt.json", "r")),
}
CHAT_COMPLETION_MODELS = ["gpt-3.5-turbo", "gpt-4", "gpt-4-32k"]


def generate_responses(context, openai_api_key, openai_org, prompt, generation_params, continue_last_uttr=False):
@@ -31,8 +39,8 @@ def generate_responses(context, openai_api_key, openai_org, prompt, generation_p
openai.api_key = openai_api_key
openai.organization = openai_org if openai_org else None

if PRETRAINED_MODEL_NAME_OR_PATH == "gpt-3.5-turbo":
logger.info("model=gpt-3.5-turbo, use special chat completion endpoint")
if PRETRAINED_MODEL_NAME_OR_PATH in CHAT_COMPLETION_MODELS:
logger.info("Use special chat completion endpoint")
s = len(context) % 2
messages = [
{"role": "system", "content": prompt},
@@ -71,7 +79,7 @@ def generate_responses(context, openai_api_key, openai_org, prompt, generation_p
elif isinstance(response, str):
outputs = [response.strip()]

if PRETRAINED_MODEL_NAME_OR_PATH != "gpt-3.5-turbo":
if PRETRAINED_MODEL_NAME_OR_PATH not in CHAT_COMPLETION_MODELS:
    # post-processing of the responses by all models except the chat completion models
outputs = [GENERATIVE_ROBOT_TEMPLATE.sub("\n", resp).strip() for resp in outputs]
return outputs
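A hedged sketch (not part of the diff) of the chat-completion message assembly hinted at in the hunk above. The diff shows only the first lines (`s = len(context) % 2` and the system message); the alternating user/assistant mapping below is an assumption about the elided remainder, chosen so that the last context utterance is always sent as the `user` turn.

```python
def build_messages(prompt, context):
    """Assemble a chat-completion messages list from a prompt and dialog context.

    The parity trick: s = len(context) % 2, so indices with i % 2 == s get the
    "assistant" role and the final utterance always gets "user".
    """
    s = len(context) % 2
    messages = [{"role": "system", "content": prompt}]
    for i, utt in enumerate(context):
        role = "assistant" if i % 2 == s else "user"
        messages.append({"role": role, "content": utt})
    return messages

msgs = build_messages(
    "You are a haiku bot.",
    ["Hi!", "Hello!", "Write a haiku about GPUs."],
)
print(msgs)
```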
@@ -88,10 +96,11 @@ def respond():
contexts = request.json.get("dialog_contexts", [])
prompts = request.json.get("prompts", [])
configs = request.json.get("configs", [])
configs = [DEFAULT_CONFIGS[PRETRAINED_MODEL_NAME_OR_PATH] if el is None else el for el in configs]
if len(contexts) > 0 and len(prompts) == 0:
prompts = [""] * len(contexts)
openai_api_keys = request.json.get("OPENAI_API_KEY_list", [])
openai_orgs = request.json.get("OPENAI_ORGANIZATION_list", None)
openai_api_keys = request.json.get("openai_api_keys", [])
openai_orgs = request.json.get("openai_api_organizations", None)
openai_orgs = [None] * len(contexts) if openai_orgs is None else openai_orgs

try:
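A small sketch (not part of the diff) of the per-request config fallback added above: any `None` entry in `configs` is replaced by the model's default generative config. The default values mirror `services/openai_api_lm/generative_configs/openai-chatgpt.json` from this diff; loading them inline here is for illustration only.

```python
# Inline stand-in for DEFAULT_CONFIGS, which the server loads from JSON files.
DEFAULT_CONFIGS = {
    "gpt-4": {
        "max_tokens": 64,
        "temperature": 0.4,
        "top_p": 1.0,
        "frequency_penalty": 0,
        "presence_penalty": 0,
    },
}
PRETRAINED_MODEL_NAME_OR_PATH = "gpt-4"

# One request overrides the config, the other relies on the default.
configs = [None, {"max_tokens": 128}]
configs = [
    DEFAULT_CONFIGS[PRETRAINED_MODEL_NAME_OR_PATH] if el is None else el
    for el in configs
]
print(configs)
```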
@@ -0,0 +1,5 @@
SERVICE_PORT: 8160
SERVICE_NAME: openai_api_gpt4_32k
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
@@ -0,0 +1,31 @@
name: openai-api-gpt4-32k
endpoints:
- respond
compose:
env_file:
- .env
build:
args:
SERVICE_PORT: 8160
SERVICE_NAME: openai_api_gpt4_32k
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8160
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M
volumes:
- ./services/openai_api_lm:/src
- ./common:/src/common
ports:
- 8160:8160
proxy: null
@@ -0,0 +1,5 @@
SERVICE_PORT: 8159
SERVICE_NAME: openai_api_gpt4
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
31 changes: 31 additions & 0 deletions services/openai_api_lm/service_configs/openai-api-gpt4/service.yml
@@ -0,0 +1,31 @@
name: openai-api-gpt4
endpoints:
- respond
compose:
env_file:
- .env
build:
args:
SERVICE_PORT: 8159
SERVICE_NAME: openai_api_gpt4
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8159
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M
volumes:
- ./services/openai_api_lm:/src
- ./common:/src/common
ports:
- 8159:8159
proxy: null
