google-api-skill utilizing LangChain's google api wrapper #377

Merged
merged 28 commits into from
Jun 5, 2023
Changes from 3 commits
88b7264
add: distribution with google api skill
Kpetyxova Apr 4, 2023
97a2662
component.yml, pipeline.yml, codestyle
Kpetyxova Apr 5, 2023
e4a0476
fix: codestyle
Kpetyxova Apr 5, 2023
45288a3
fix: yml files
Kpetyxova Apr 7, 2023
1a40769
Merge branch 'dev' into add/google_api_skill
dilyararimovna Apr 12, 2023
9f9817f
fix: remove readme, fix combined
dilyararimovna Apr 12, 2023
c776579
fix: description
dilyararimovna Apr 12, 2023
4801f9e
fix: dist name
dilyararimovna Apr 12, 2023
f6dff1e
fix: other fixes
dilyararimovna Apr 12, 2023
eba1f58
fix: empty tests
dilyararimovna Apr 12, 2023
9069f46
deleted readme, fixed dist info, fixed state_formatter
Kpetyxova Apr 17, 2023
bfb2a8b
Merge branch 'add/google_api_skill' of https://github.com/deepmipt/dr…
Kpetyxova Apr 24, 2023
5b830ba
fix: debian
Kpetyxova Apr 24, 2023
c24ab9d
Merge branch 'dev' of https://github.com/deeppavlov/dream into add/go…
dilyararimovna Apr 25, 2023
aee9cb9
Merge branch 'add/google_api_skill' of https://github.com/deeppavlov/…
dilyararimovna Apr 25, 2023
bb235bc
Merge branch 'dev' into add/google_api_skill
Kpetyxova May 18, 2023
a2ecf29
added new config files
Kpetyxova May 18, 2023
975fb57
fix: ports, timeouts, parameters
Kpetyxova May 23, 2023
330ef51
Merge branch 'dev' into add/google_api_skill
dilyararimovna May 29, 2023
1d25163
refactor: remove extra components
dilyararimovna May 29, 2023
ba40212
refactor: rename
dilyararimovna May 29, 2023
e6772e2
Merge branch 'dev' into add/google_api_skill
dilyararimovna Jun 2, 2023
852eaed
fix: ports
dilyararimovna Jun 2, 2023
e379ebc
fix: turn on all not prompted skills
dilyararimovna Jun 2, 2023
89dcb14
fix: pipeline and combined
dilyararimovna Jun 2, 2023
de07f73
fix: remove old cards
dilyararimovna Jun 2, 2023
eac0bd5
feat: add google api skill to dream persona openai
dilyararimovna Jun 2, 2023
ec48ba3
fix: remove unused formatter
dilyararimovna Jun 5, 2023
125 changes: 125 additions & 0 deletions assistant_dists/dream_google_api/README.md
@@ -0,0 +1,125 @@
# Dream Prompted Distribution

**_One may consider this distribution as a TEMPLATE for a prompt-based distribution which may contain any number of
prompt-based skills each of which is conditioned on a single prompt during the whole conversation_**

**Note!** Each Prompt-based Skill utilizes the **same prompt during the whole dialog**!

# What is Dream Prompted Distribution

Dream Prompted distribution is an example of a prompt-based dialogue system. It contains one prompt-based skill
whose prompt is a persona description.

Dream Prompted distribution contains the following skills:
* Dummy Skill (`dummy_skill`) is a fallback skill (it is part of the agent container, so no separate container is required)
* DFF Dream Persona Prompted Skill (`dff_dream_persona_prompted_skill`) is a skill created via DFF (Dialog Flow Framework)
which generates a response to the current dialogue context taking into account the given prompt, e.g., the bot's persona description.

### DFF Dream Persona Prompted Skill

The **DFF Dream Persona Prompted Skill** is a lightweight container that sends requests to a generative service
which uses a neural network for prompt-based generation.
DFF Dream Persona Prompted Skill accepts three main environment variables:
* `PROMPT_FILE` contains a path to a JSON file containing a dictionary with the prompt,
* `GENERATIVE_SERVICE_URL` contains the URL of the generative service to be used.
The service must use the same input-output format as Transformers-LM (`transformers_lm`).
* `N_UTTERANCES_CONTEXT` contains the length of the considered context as a number of dialogue utterances.
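
A minimal sketch of how a skill could read the three variables above. The `"prompt"` key inside the JSON file, the default of 3 utterances, and the example path and URL are assumptions made for illustration; they are not taken from the skill's actual code.

```python
import json
import os


def load_prompt(prompt_file):
    """Read the prompt text from the JSON file named by PROMPT_FILE."""
    with open(prompt_file, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the dictionary stores the text under a "prompt" key.
    return data["prompt"]


def read_settings(env=os.environ):
    """Collect the three environment variables described above."""
    return {
        "prompt_file": env["PROMPT_FILE"],
        "generative_service_url": env["GENERATIVE_SERVICE_URL"],
        # Assumption: fall back to 3 utterances when the variable is unset.
        "n_utterances_context": int(env.get("N_UTTERANCES_CONTEXT", "3")),
    }


# Illustrative values only; the path and the endpoint are placeholders.
example_env = {
    "PROMPT_FILE": "common/prompts/dream_persona.json",
    "GENERATIVE_SERVICE_URL": "http://transformers-lm-gptj:8130/respond",
    "N_UTTERANCES_CONTEXT": "3",
}
settings = read_settings(example_env)
print(settings["n_utterances_context"])
```

Passing the environment as an argument keeps the function easy to test without touching the real process environment.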

**Note!** DFF Dream Persona Prompted Skill uses a special universal template `skills/dff_template_prompted_skill`
which does not require creating a new skill directory. To create a new skill,
use the same template folder but specify a different prompt file, service port, and container name.

### Prompt Selector

The distribution may contain **several Prompt-based skills.** Therefore, the **Prompt Selector** component is provided.
The Prompt Selector is also a lightweight container that uses the **Sentence Ranker** component
(its URL is given in the `.env` file as `SENTENCE_RANKER_SERVICE_URL`) to select the `N_SENTENCES_TO_RETURN`
most relevant prompts (precisely, it returns an ordered list of prompt names) among the given ones.
The comma-separated list of prompt names to be considered is given in the environment variable `PROMPTS_TO_CONSIDER`.
Each considered prompt should be located at `dream/common/prompts/<prompt_name>.json`.
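
The selection logic described above can be sketched as follows. This is an illustration only: `token_overlap` is a toy stand-in for the Sentence Ranker service (which is not reproduced here), and the prompt texts are invented for the example.

```python
def parse_prompt_names(raw):
    """Split the comma-separated PROMPTS_TO_CONSIDER value into prompt names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def token_overlap(a, b):
    """Toy relevance score: Jaccard overlap of lowercase tokens.

    Hypothetical stand-in for the Sentence Ranker service.
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def select_prompts(utterance, prompts, n_to_return, score=token_overlap):
    """Return the names of the n_to_return most relevant prompts, best first."""
    ranked = sorted(prompts, key=lambda name: score(utterance, prompts[name]), reverse=True)
    return ranked[:n_to_return]


# Invented prompt texts; real ones live in dream/common/prompts/<prompt_name>.json.
prompt_texts = {
    "dream_persona": "You are Dream, the friendly chatbot that loves talking about science.",
    "pizza": "You take orders for pizza and suggest toppings.",
}
names = parse_prompt_names("dream_persona,pizza")
considered = {name: prompt_texts[name] for name in names}
print(select_prompts("I want to order a pizza", considered, n_to_return=1))
```

The scoring function is a parameter so that the toy score can be swapped for a call to a real ranking service.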

**Note!** In the Dream Persona Prompted Distribution we give a list of prompts to the Prompt Selector: `dream_persona,pizza`,
separated with a comma, just to demonstrate the input format of `PROMPTS_TO_CONSIDER`. In fact, the
Dream Persona Prompted Distribution contains only one prompted skill, which uses the Dream Persona prompt.

### Skill Selector

You do not need to change anything in the Skill Selector: it automatically calls all the skills with the most relevant prompts
according to the Prompt Selector. If the user utterance carries Prompt Selector annotations,
the Skill Selector turns on the skills named `dff_<prompt_name>_prompted_skill` for each `prompt_name` among the
`N_SENTENCES_TO_RETURN` most relevant prompts returned by the Prompt Selector.
Therefore, a prompt name can contain `'_'` but not `'-'`.
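
The naming convention above can be sketched as a small helper. The function names are hypothetical; the allowed-character rule (letters, numbers, underscores) follows the naming requirement stated later in this README.

```python
import re

# Letters, digits and underscores only; dashes are rejected.
VALID_PROMPT_NAME = re.compile(r"[A-Za-z0-9_]+")


def prompted_skill_name(prompt_name):
    """Build the skill name the Skill Selector looks for."""
    if not VALID_PROMPT_NAME.fullmatch(prompt_name):
        raise ValueError(f"invalid prompt name: {prompt_name!r}")
    return f"dff_{prompt_name}_prompted_skill"


def skills_to_turn_on(selected_prompts):
    """Map the prompts returned by the Prompt Selector to skill names."""
    return [prompted_skill_name(name) for name in selected_prompts]


print(skills_to_turn_on(["dream_persona", "pizza"]))
# ['dff_dream_persona_prompted_skill', 'dff_pizza_prompted_skill']
```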

**Note!** You may pass prompt names to the Prompt Selector
even if the corresponding skills are not present in the distribution. For example, if you specify 5 prompt names
while your distribution contains only 2 prompted skills, and you set the number of returned most relevant prompts
(`N_SENTENCES_TO_RETURN`) to 3, the Prompt Selector may choose only prompts for which
you do not have skills. In that case the response on that step will be provided by other skills present in the distribution
(in particular, by Dummy Skill for the current version of Dream Prompted distribution).
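
The situation in the note can be illustrated with a simplified sketch. It assumes that only skills present in the distribution can respond and that `dummy_skill` answers when none of the selected prompts has a skill; the prompt names `weather`, `recipes`, and `travel` are hypothetical.

```python
def responding_skills(selected_prompts, available_skills, fallback="dummy_skill"):
    """Return the prompted skills that can answer, or the fallback if none exists."""
    wanted = [f"dff_{name}_prompted_skill" for name in selected_prompts]
    active = [skill for skill in wanted if skill in available_skills]
    return active if active else [fallback]


# The distribution contains only two prompted skills.
available = {"dff_dream_persona_prompted_skill", "dff_pizza_prompted_skill"}

# All three selected prompts lack a skill, so only the fallback responds.
print(responding_skills(["weather", "recipes", "travel"], available))
# ['dummy_skill']

# One of the three selected prompts has a skill.
print(responding_skills(["pizza", "weather", "travel"], available))
# ['dff_pizza_prompted_skill']
```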

# How to Create a New Prompted Distribution

If one wants to create a new prompted distribution (distribution containing prompt-based skill(s)), one should:

1. Copy the `dream/assistant_dists/dream_persona_prompted` directory to `dream/assistant_dists/dream_custom_prompted`
(the name is an example!).
2. **For each prompt-based skill, one needs to**:
    1. create a `dream/common/prompts/<prompt_name>.json` file containing a prompt.
    **Important!** `<prompt_name>` should only contain letters, numbers and underscores (`_`) but no dashes (`-`)!
    2. in the `dream/assistant_dists/dream_custom_prompted/` folder, in the files `docker-compose.override.yml` and `dev.yml`,
    copy the description of the `dream-persona` container and replace the strings `dream-persona` with `<prompt-name>`
    (container names use dashes) and
    `dream_persona` with `<prompt_name>` (component names use underscores).
    If your prompt name is written as a single word
    (for example, `spacexfaq` not `spacex_faq`), replace both `dream-persona` and `dream_persona` with your prompt name.
    3. for each new container (a new container for each new DFF skill) change the `SERVICE_PORT`
    to an unused one.
3. Choose the generative service to be used. For that one needs to:
    1. in the `dream/assistant_dists/dream_custom_prompted/` folder, in the files `docker-compose.override.yml` and `dev.yml`,
    replace the `transformers-lm-gptj` container description with a new one.
    In particular, one may replace the Language Model (LM) `GPT-J` given in the `PRETRAINED_MODEL_NAME_OR_PATH` parameter
    with another one from the `Transformers` library.
    Please change the port (`8130` for `transformers-lm-gptj`) to an unused one.
    2. in the container descriptions of all prompted skills, change `GENERATIVE_SERVICE_URL` to point to your generative model.
    Take into account that the service URL is constructed as `http://<container-name>:<port>/<endpoint>`.
4. For each prompted skill, one needs to create an input state formatter. To do that, one needs to:
    1. in `dream/dp_formatters/state_formatters.py` duplicate the function:
        ```python
        def dff_dream_persona_prompted_skill_formatter(dialog):
            return utils.dff_formatter(
                dialog, "dff_dream_persona_prompted_skill",
                types_utterances=["human_utterances", "bot_utterances", "utterances"]
            )
        ```
    2. replace the string `dream_persona` with `<prompt_name>` (component names use underscores) in each duplicated function.
5. In `dream/assistant_dists/dream_custom_prompted/pipeline_conf.json` for each prompt-based skill, one needs to:
    1. copy the description of DFF Dream Persona Prompted Skill:
        ```json
        "dff_dream_persona_prompted_skill": {
            "connector": {
                "protocol": "http",
                "timeout": 4.5,
                "url": "http://dff-dream-persona-gpt-j-prompted-skill:8134/respond"
            },
            "dialog_formatter": "state_formatters.dp_formatters:dff_dream_persona_prompted_skill_formatter",
            "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service",
            "previous_services": [
                "skill_selectors"
            ],
            "state_manager_method": "add_hypothesis"
        },
        ```
    2. replace the strings `dream-persona` with `<prompt-name>` (container names use dashes) and
    `dream_persona` with `<prompt_name>` (component names use underscores). This changes the container name,
    skill name, and formatter name.
    3. replace the port (`8134` in the example) with the one assigned in
    `dream/assistant_dists/dream_custom_prompted/docker-compose.override.yml`.
6. If one does not want to keep DFF Dream Persona Prompted Skill in their distribution, one should remove all mentions
of the DFF Dream Persona Prompted Skill container from the `yml` configs and `pipeline_conf.json` files.
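
The string replacements from step 5 can be sketched as a small script. This is only an illustration of the renaming rules, not a tool shipped with Dream: `pipeline_entry` is a hypothetical helper, the template is copied from the step 5 example, and the port `8140` is an arbitrary placeholder.

```python
import json
import re

# Pipeline description copied from the step 5 example.
TEMPLATE = {
    "dff_dream_persona_prompted_skill": {
        "connector": {
            "protocol": "http",
            "timeout": 4.5,
            "url": "http://dff-dream-persona-gpt-j-prompted-skill:8134/respond",
        },
        "dialog_formatter": "state_formatters.dp_formatters:dff_dream_persona_prompted_skill_formatter",
        "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service",
        "previous_services": ["skill_selectors"],
        "state_manager_method": "add_hypothesis",
    }
}


def pipeline_entry(prompt_name, port):
    """Apply the renaming rules to the template for a new prompt name."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", prompt_name):
        raise ValueError("prompt name may contain only letters, digits and underscores")
    text = json.dumps(TEMPLATE)
    # Container names use dashes, component names use underscores.
    text = text.replace("dream-persona", prompt_name.replace("_", "-"))
    text = text.replace("dream_persona", prompt_name)
    # Swap the example port for the one assigned to the new container.
    text = text.replace(":8134/", f":{port}/")
    return json.loads(text)


entry = pipeline_entry("spacex_faq", 8140)
print(json.dumps(entry, indent=4))
```

Generating the entry from the template keeps the skill name, container name, and formatter name consistent, which is what the Skill Selector relies on.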

**Note!** Naming each skill after its `<prompt_name>` as described above is essential: it lets the Skill Selector
automatically turn on the prompt-based skills whose prompts are returned among the
`N_SENTENCES_TO_RETURN` most relevant ones.



14 changes: 14 additions & 0 deletions assistant_dists/dream_google_api/cpu.yml
@@ -0,0 +1,14 @@
version: '3.7'
services:
  combined-classification:
    environment:
      DEVICE: cpu
      CUDA_VISIBLE_DEVICES: ""
  sentence-ranker:
    environment:
      DEVICE: cpu
      CUDA_VISIBLE_DEVICES: ""
  transformers-lm-gptj:
    environment:
      DEVICE: cpu
      CUDA_VISIBLE_DEVICES: ""
6 changes: 6 additions & 0 deletions assistant_dists/dream_google_api/db_conf.json
@@ -0,0 +1,6 @@
{
    "host": "DB_HOST",
    "port": "DB_PORT",
    "name": "DB_NAME",
    "env": true
}
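
A hedged sketch of how such a file might be interpreted, assuming that `"env": true` means the other values are names of environment variables rather than literal settings. `resolve_db_conf` is a hypothetical helper, and the example values are placeholders.

```python
import os


def resolve_db_conf(conf, env=os.environ):
    """Replace variable names with their values when the "env" flag is set."""
    conf = dict(conf)  # do not mutate the caller's dictionary
    if conf.pop("env", False):
        return {key: env[name] for key, name in conf.items()}
    return conf


db_conf = {"host": "DB_HOST", "port": "DB_PORT", "name": "DB_NAME", "env": True}
# Placeholder environment for the example.
example_env = {"DB_HOST": "mongo", "DB_PORT": "27017", "DB_NAME": "dp_agent"}
print(resolve_db_conf(db_conf, example_env))
# {'host': 'mongo', 'port': '27017', 'name': 'dp_agent'}
```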
56 changes: 56 additions & 0 deletions assistant_dists/dream_google_api/dev.yml
@@ -0,0 +1,56 @@
# These volumes make debugging convenient: there is no need to rebuild the container every time the code changes
services:
  agent:
    volumes:
      - ".:/dp-agent"
    ports:
      - 4242:4242
  sentseg:
    volumes:
      - "./annotators/SentSeg:/src"
    ports:
      - 8011:8011
  convers-evaluation-no-scripts-selector:
    volumes:
      - "./response_selectors/convers_evaluation_based_selector:/src"
      - "./common:/src/common"
    ports:
      - 8009:8009
  badlisted-words:
    volumes:
      - "./annotators/BadlistedWordsDetector:/src"
      - "./common:/src/common"
    ports:
      - 8018:8018
  spelling-preprocessing:
    volumes:
      - "./annotators/spelling_preprocessing:/src"
    ports:
      - 8074:8074
  combined-classification:
    volumes:
      - "./common:/src/common"
      - "./annotators/combined_classification:/src"
    ports:
      - 8087:8087
  sentence-ranker:
    volumes:
      - "./services/sentence_ranker:/src"
      - "~/.deeppavlov/cache:/root/.cache"
    ports:
      - 8128:8128
  dialogpt:
    volumes:
      - "./services/dialogpt:/src"
      - "./common:/src/common"
      - "~/.deeppavlov/cache:/root/.cache"
    ports:
      - 8125:8125
  dff-google-api-skill:
    volumes:
      - "./skills/dff_google_api_skill:/src"
      - "./common:/src/common"
    ports:
      # host:container; the service listens on 8156 inside the container
      - 8156:8156

version: "3.7"
167 changes: 167 additions & 0 deletions assistant_dists/dream_google_api/docker-compose.override.yml
@@ -0,0 +1,167 @@
services:
  agent:
    command: sh -c 'bin/wait && python -m deeppavlov_agent.run agent.pipeline_config=assistant_dists/dream_persona_prompted/pipeline_conf.json'
    environment:
      WAIT_HOSTS: "sentseg:8011, convers-evaluation-no-scripts-selector:8009, badlisted-words:8018, combined-classification:8087,
        spelling-preprocessing:8074, sentence-ranker:8128, dialogpt:8125, dff-google-api-skill:8156"
      WAIT_HOSTS_TIMEOUT: ${WAIT_TIMEOUT:-1000}

  sentseg:
    env_file: [ .env ]
    build:
      context: ./annotators/SentSeg/
    command: flask run -h 0.0.0.0 -p 8011
    environment:
      - FLASK_APP=server
    deploy:
      resources:
        limits:
          memory: 1.5G
        reservations:
          memory: 1.5G

  combined-classification:
    env_file: [ .env ]
    build:
      args:
        CONFIG: combined_classifier.json
        SERVICE_PORT: 8087
      context: .
      dockerfile: ./annotators/combined_classification/Dockerfile
    environment:
      - CUDA_VISIBLE_DEVICES=0
    deploy:
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 2G

  convers-evaluation-no-scripts-selector:
    env_file: [ .env ]
    build:
      args:
        TAG_BASED_SELECTION: 1
        CALL_BY_NAME_PROBABILITY: 0.5
        PROMPT_PROBA: 0.1
        ACKNOWLEDGEMENT_PROBA: 0.3
        PRIORITIZE_WITH_REQUIRED_ACT: 0
        PRIORITIZE_NO_DIALOG_BREAKDOWN: 0
        PRIORITIZE_WITH_SAME_TOPIC_ENTITY: 0
        IGNORE_DISLIKED_SKILLS: 0
        GREETING_FIRST: 1
        RESTRICTION_FOR_SENSITIVE_CASE: 1
        PRIORITIZE_PROMTS_WHEN_NO_SCRIPTS: 0
        MAX_TURNS_WITHOUT_SCRIPTS: 7
        ADD_ACKNOWLEDGMENTS_IF_POSSIBLE: 1
        PRIORITIZE_SCRIPTED_SKILLS: 0
        CONFIDENCE_STRENGTH: 0.8
        CONV_EVAL_STRENGTH: 0.4
        PRIORITIZE_HUMAN_INITIATIVE: 1
        QUESTION_TO_QUESTION_DOWNSCORE_COEF: 0.8
      context: .
      dockerfile: ./response_selectors/convers_evaluation_based_selector/Dockerfile
    command: flask run -h 0.0.0.0 -p 8009
    environment:
      - FLASK_APP=server
    deploy:
      resources:
        limits:
          memory: 100M
        reservations:
          memory: 100M

  badlisted-words:
    env_file: [ .env ]
    build:
      args:
        SERVICE_PORT: 8018
        SERVICE_NAME: badlisted_words
      context: annotators/BadlistedWordsDetector/
    command: flask run -h 0.0.0.0 -p 8018
    environment:
      - FLASK_APP=server
    deploy:
      resources:
        limits:
          memory: 256M
        reservations:
          memory: 256M

  spelling-preprocessing:
    env_file: [ .env ]
    build:
      args:
        SERVICE_PORT: 8074
        SERVICE_NAME: spelling_preprocessing
      context: ./annotators/spelling_preprocessing/
    command: flask run -h 0.0.0.0 -p 8074
    environment:
      - FLASK_APP=server
    deploy:
      resources:
        limits:
          memory: 100M
        reservations:
          memory: 100M

  sentence-ranker:
    env_file: [ .env ]
    build:
      args:
        SERVICE_PORT: 8128
        SERVICE_NAME: sentence_ranker
        PRETRAINED_MODEL_NAME_OR_PATH: sentence-transformers/bert-base-nli-mean-tokens
      context: ./services/sentence_ranker/
    command: flask run -h 0.0.0.0 -p 8128
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - FLASK_APP=server
    deploy:
      resources:
        limits:
          memory: 3G
        reservations:
          memory: 3G

  dialogpt:
    env_file: [ .env ]
    build:
      args:
        SERVICE_PORT: 8125
        SERVICE_NAME: dialogpt
        PRETRAINED_MODEL_NAME_OR_PATH: microsoft/DialoGPT-medium
        N_HYPOTHESES_TO_GENERATE: 5
        CONFIG_NAME: dialogpt_en.json
        MAX_HISTORY_DEPTH: 2
      context: .
      dockerfile: ./services/dialogpt/Dockerfile
    command: flask run -h 0.0.0.0 -p 8125
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - FLASK_APP=server
    deploy:
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 2G

  dff-google-api-skill:
    env_file: [ .env,.env_secret ]
    build:
      args:
        SERVICE_PORT: 8156
        SERVICE_NAME: dff_google_api_skill
        ENVVARS_TO_SEND: OPENAI_API_KEY,GOOGLE_CSE_ID,GOOGLE_API_KEY
      context: .
      dockerfile: ./skills/dff_google_api_skill/Dockerfile
    command: gunicorn --workers=1 server:app -b 0.0.0.0:8156 --reload
    deploy:
      resources:
        limits:
          memory: 128M
        reservations:
          memory: 128M

version: '3.7'
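
Because the agent waits on every `host:port` pair in `WAIT_HOSTS`, a mismatch with the port a service actually listens on would keep the agent waiting until the timeout. A small check like the one below can catch that; it is a sketch, not part of the distribution. The helper names are hypothetical, and the port table was transcribed by hand from the `flask run -p`, `gunicorn -b`, and `SERVICE_PORT` values in the file above.

```python
def parse_wait_hosts(value):
    """Parse a comma-separated WAIT_HOSTS string into a {host: port} dictionary."""
    hosts = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, _, port = item.rpartition(":")
        hosts[host] = int(port)
    return hosts


def find_port_mismatches(wait_hosts, service_ports):
    """Return {service: (port in WAIT_HOSTS, declared port)} for every disagreement."""
    return {
        name: (wait_hosts.get(name), port)
        for name, port in service_ports.items()
        if wait_hosts.get(name) != port
    }


WAIT_HOSTS = (
    "sentseg:8011, convers-evaluation-no-scripts-selector:8009, badlisted-words:8018, "
    "combined-classification:8087, spelling-preprocessing:8074, sentence-ranker:8128, "
    "dialogpt:8125, dff-google-api-skill:8156"
)
# Ports each service is configured to listen on (transcribed by hand).
SERVICE_PORTS = {
    "sentseg": 8011,
    "convers-evaluation-no-scripts-selector": 8009,
    "badlisted-words": 8018,
    "combined-classification": 8087,
    "spelling-preprocessing": 8074,
    "sentence-ranker": 8128,
    "dialogpt": 8125,
    "dff-google-api-skill": 8156,
}
print(find_port_mismatches(parse_wait_hosts(WAIT_HOSTS), SERVICE_PORTS))
# {}
```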
25 changes: 25 additions & 0 deletions assistant_dists/dream_google_api/gpu1.yml
@@ -0,0 +1,25 @@
services:
  agent:
    restart: unless-stopped
    volumes:
      - "/cephfs/home/ignatov/artifacts:/output"
      - ".:/dp-agent"
    ports:
      - ${AGENT_PORT}:4242
  combined-classification:
    restart: unless-stopped
    environment:
      - CUDA_VISIBLE_DEVICES=1
  mongo:
    restart: unless-stopped
    command: mongod
    image: mongo:4.0.0
  sentence-ranker:
    restart: unless-stopped
    environment:
      - CUDA_VISIBLE_DEVICES=1
  transformers-lm-gptj:
    restart: unless-stopped
    environment:
      - CUDA_VISIBLE_DEVICES=0
version: '3.7'