diff --git a/README.md b/README.md
index 92d4a641c4..0ba1699756 100644
--- a/README.md
+++ b/README.md
@@ -72,6 +72,13 @@ and the provided information will be used in LLM-powered reply generation as a p
 
 # Quick Start
 
+### System Requirements
+
+- Operating System: Ubuntu 18.04+, Windows 10+ (via WSL & WSL2), macOS Big Sur;
+- `docker` version 20 or above;
+- `docker-compose` v1.29.2;
+- RAM: at least 2 GB (when using proxy containers), at least 4 GB (for LLM-based prompted distributions), and at least 20 GB (for old scripted distributions).
+
 ### Clone the repo
 
 ```
diff --git a/README_ru.md b/README_ru.md
index 40597b01be..3a43648748 100644
--- a/README_ru.md
+++ b/README_ru.md
@@ -65,6 +65,15 @@ Deepy GoBot Base содержит аннотатор исправления оп
 
 # Quick Start
 
+
+### System Requirements
+
+- Операционная система Ubuntu 18.04+, Windows 10+ (через WSL & WSL2), macOS Big Sur;
+- Версия docker от 20 и выше;
+- Версия docker-compose v1.29.2;
+- Оперативная память от 2 гигабайт (при использовании прокси контейнеров), от 4 гигабайт (при использовании дистрибутивов на основе БЯМ) и от 20 гигабайт (при использовании сценарных дистрибутивов).
+
+
 ### Склонируйте репозиторий
 
 ```
@@ -189,33 +198,35 @@ docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.ove
 
 ## Annotators
 
-| Name | Requirements | Description |
-|------------------------|------------------------|-------------|
-| Badlisted Words | 50 MB RAM | detects obscene Russian words from the badlist |
-| Entity Detection | 5.5 GB RAM | extracts entities and their types from utterances |
-| Entity Linking | 400 MB RAM | finds Wikidata entity ids for the entities detected with Entity Detection |
-| Fact Retrieval | 6.5 GiB RAM, 1 GiB GPU | Аннотатор извлечения параграфов Википедии, релевантных истории диалога. 
|
-| Intent Catcher | 900 MB RAM | classifies user utterances into a number of predefined intents which are trained on a set of phrases and regexps |
-| NER | 1.7 GB RAM, 4.9 GB GPU | extracts person names, names of locations, organizations from uncased text using ruBert-based (pyTorch) model |
-| Sentseg | 2.4 GB RAM, 4.9 GB GPU | recovers punctuation using ruBert-based (pyTorch) model and splits into sentences |
-| Spacy Annotator | 250 MB RAM | token-wise annotations by Spacy |
-| Spelling Preprocessing | 8 GB RAM | Russian Levenshtein correction model |
-| Toxic Classification | 3.5 GB RAM, 3 GB GPU | Toxic classification model from Transformers specified as PRETRAINED_MODEL_NAME_OR_PATH |
-| Wiki Parser | 100 MB RAM | extracts Wikidata triplets for the entities detected with Entity Linking |
-| DialogRPT | 3.8 GB RAM, 2 GB GPU | DialogRPT model which is based on [Russian DialoGPT by DeepPavlov](https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2) and fine-tuned on Russian Pikabu Comment sequences |
+| Name | Requirements | Description |
+|----------------------------|------------------------|-------------|
+| Badlisted Words | 50 MB RAM | detects obscene Russian words from the badlist |
+| Entity Detection | 5.5 GB RAM | extracts entities and their types from utterances |
+| Entity Linking | 400 MB RAM | finds Wikidata entity ids for the entities detected with Entity Detection |
+| Fact Retrieval | 6.5 GiB RAM, 1 GiB GPU | Аннотатор извлечения параграфов Википедии, релевантных истории диалога. |
+| Intent Catcher | 900 MB RAM | classifies user utterances into a number of predefined intents which are trained on a set of phrases and regexps |
+| NER | 1.7 GB RAM, 4.9 GB GPU | extracts person names, names of locations, organizations from uncased text using ruBert-based (pyTorch) model |
+| Relative Persona Extractor | 50 MB RAM | annotator that utilizes Sentence Ranker to rank persona sentences and select the `N_SENTENCES_TO_RETURN` most relevant ones |
+| Sentseg | 2.4 GB RAM, 4.9 GB GPU | recovers punctuation using ruBert-based (pyTorch) model and splits into sentences |
+| Spacy Annotator | 250 MB RAM | token-wise annotations by Spacy |
+| Spelling Preprocessing | 8 GB RAM | Russian Levenshtein correction model |
+| Toxic Classification | 3.5 GB RAM, 3 GB GPU | Toxic classification model from Transformers specified as PRETRAINED_MODEL_NAME_OR_PATH |
+| Wiki Parser | 100 MB RAM | extracts Wikidata triplets for the entities detected with Entity Linking |
+| DialogRPT | 3.8 GB RAM, 2 GB GPU | DialogRPT model which is based on [Russian DialoGPT by DeepPavlov](https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2) and fine-tuned on Russian Pikabu Comment sequences |
 
 ## Skills & Services
 
-| Name | Requirements | Description |
-|----------------------|--------------------------|-------------|
-| DialoGPT | 2.8 GB RAM, 2 GB GPU | [Russian DialoGPT by DeepPavlov](https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2) |
-| Dummy Skill | | a fallback skill with multiple non-toxic candidate responses and random Russian questions |
-| Personal Info Skill | 40 MB RAM | queries and stores user's name, birthplace, and location |
-| DFF Generative Skill | 
50 MB RAM | **[New DFF version]** generative skill which uses DialoGPT service to generate 3 different hypotheses |
-| DFF Intent Responder | 50 MB RAM | provides template-based replies for some of the intents detected by Intent Catcher annotator |
-| DFF Program Y Skill | 80 MB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot |
-| DFF Friendship Skill | 70 MB RAM | **[New DFF version]** DFF-based skill to greet the user in the beginning of the dialog, and forward the user to some scripted skill |
-| DFF Template Skill | 50 MB RAM | **[New DFF version]** DFF-based skill that provides an example of DFF usage |
-| Text QA | 3.8 GiB RAM, 5.2 GiB GPU | Навык для ответа на вопросы по тексту. |
+| Name | Requirements | Description |
+|-----------------------|--------------------------|-------------|
+| DialoGPT | 2.8 GB RAM, 2 GB GPU | [Russian DialoGPT by DeepPavlov](https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2) |
+| Dummy Skill | | a fallback skill with multiple non-toxic candidate responses and random Russian questions |
+| Personal Info Skill | 40 MB RAM | queries and stores user's name, birthplace, and location |
+| DFF Generative Skill | 50 MB RAM | **[New DFF version]** generative skill which uses DialoGPT service to generate 3 different hypotheses |
+| DFF Intent Responder | 50 MB RAM | provides template-based replies for some of the intents detected by Intent Catcher annotator |
+| DFF Program Y Skill | 80 MB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot |
+| DFF Friendship Skill | 70 MB RAM | **[New DFF version]** DFF-based skill to greet the user in the beginning of the dialog, and forward the user to some scripted skill |
+| DFF Template Skill | 50 MB RAM | **[New DFF version]** DFF-based skill that provides an example of DFF usage |
+| Seq2seq Persona-based | 1.5 GB RAM, 1.5 GB GPU | generative service based on a Transformers seq2seq model; the model was pre-trained on the PersonaChat dataset to generate a response conditioned on several sentences of the socialbot's persona |
+| Text QA | 3.8 GiB RAM, 5.2 GiB GPU | Навык для ответа на вопросы по тексту. |
diff --git a/annotators/IntentCatcherTransformers/README.md b/annotators/IntentCatcherTransformers/README.md
index 543948977c..96d4bc1bc4 100644
--- a/annotators/IntentCatcherTransformers/README.md
+++ b/annotators/IntentCatcherTransformers/README.md
@@ -1,5 +1,7 @@
-## IntentCatcher based on Transformers
+## Intent Catcher based on Transformers
 
+The Intent Catcher Annotator makes it possible to adapt the dialog system to particular tasks.
+It detects user intents, which are then addressed by the DFF Intent Responder Skill.
 English version was trained on `intent_phrases.json` dataset using `DeepPavlov` library via command:
 ```
diff --git a/annotators/relative_persona_extractor/README.txt b/annotators/relative_persona_extractor/README.txt
index 87b3e77ef9..46941ff9f5 100644
--- a/annotators/relative_persona_extractor/README.txt
+++ b/annotators/relative_persona_extractor/README.txt
@@ -3,3 +3,5 @@
 An annotator that utilizes Sentence Ranker to find the most relevant to the current context
 sentences from the bot's persona description. 
 The number of returned sentences is given as an environmental variable using `N_SENTENCES_TO_RETURN` in `docker-compose.yml`.
+
+This annotator makes it possible to adapt the dialog system to a particular system persona.
diff --git a/services/seq2seq_persona_based/README.md b/services/seq2seq_persona_based/README.md
index adfa543d41..e8d598650e 100644
--- a/services/seq2seq_persona_based/README.md
+++ b/services/seq2seq_persona_based/README.md
@@ -1,2 +1,8 @@
-### List models
-- [bart persona based](./bart_persona_based/)
\ No newline at end of file
+# Sequence-to-sequence Persona-based Skill
+
+## Description
+
+The Sequence-to-sequence Persona-based Skill generates responses that take into account
+the system's persona description extracted by the Relative Persona Extractor Annotator.
+
+This skill makes it possible to adapt the dialog system to a particular system persona.
diff --git a/skills/dff_intent_responder_skill/README.md b/skills/dff_intent_responder_skill/README.md
index 6b2d5a4367..7889e20298 100644
--- a/skills/dff_intent_responder_skill/README.md
+++ b/skills/dff_intent_responder_skill/README.md
@@ -1,44 +1,7 @@
-# dff_template_skill
+# DFF Intent Responder Skill
 
 ## Description
 
-**dff_template_skill** is a skill to exit the dialogue. There are only answers here, phrases for leaving the dialogue are detected in the ** IntentCatcher ** annotator.
+The DFF Intent Responder Skill provides template-based responses to the intents detected by the Intent Catcher Annotator.
 
-## Quickstart from docker
-
-```bash
-# create local.yml
-python utils/create_local_yml.py -d assistant_dists/dream/ -s dff-template-skill
-# build service
-docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/local.yml up -d --build dff-template-skill
-# run tests
-docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/local.yml exec dff-template-skill bash test.sh
-# check logs
-docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/local.yml logs -f dff-template-skill
-# run a dialog with the agent
-docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/local.yml exec agent python -m deeppavlov_agent.run
-```
-
-## Quickstart without docker
-
-```bash
-pip install -r requirements.txt
-gunicorn --workers=1 server:app -b 0.0.0.0:${SERVICE_PORT}
-```
-
-## Resources
-
-* Execution time: 46 ms
-* Starting time: 1.5 sec
-* RAM: 45 MB
-
-## Change history
-### Jan 8, 2022
-The dialogue skill **skills\dff-intent-responder-skill** was created based on **skills\IntentResponder** service to refactor old code with the usage of the new dff framework. The new service repeats the previous service logic which is based on the intention detection from the payload of the inbound HTTP request. The intention from the latest human_utterances element with "detected"=1 and the highest confidence is selected and the appropriate response is created; the confidence value is sent to output without change. If no input intention is detected, then a default response with 'dont_understand' logic is sent.
-### Jan 15, 2022
-Tests for all input intentions including a default case are added.
-### Jan 21, 2022
-The dialogue skill **skills\dff-intent-responder-skill** is moved to **skills\dff_template_skill**. Code review changes applied, tests are recreated in microservice environment. 
-
-
-## Depencencies
+This skill makes it possible to adapt the dialog system to particular tasks.
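+
+## Quickstart from docker
+
+Below is a minimal sketch of how to build and test the skill locally. It assumes the skill is registered as `dff-intent-responder-skill` in the Dream distribution's compose files; the distribution path may need adjusting for other distributions:
+
+```bash
+# generate local.yml for the selected components
+python utils/create_local_yml.py -d assistant_dists/dream/ -s dff-intent-responder-skill
+# build and start the skill's container
+docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/local.yml up -d --build dff-intent-responder-skill
+# run the skill's tests inside the container
+docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/local.yml exec dff-intent-responder-skill bash test.sh
+```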