AirDialogue

AirDialogue is a benchmark dataset for goal-oriented dialogue generation research. This Python library contains a collection of toolkits that come with the dataset.

AirDialogue paper
AirDialogue dataset
Reference implementation: AirDialogue Model

What's New

Jul 13, 2020: Fixed a bug in BLEU evaluation; the current version gives higher BLEU scores. Added support for evaluating different roles and a KL-divergence metric (see --infer_metrics).
Jul 12, 2020: Updated the AirDialogue dataset to version v1.1, fixing typos and misalignment between the KB file and the dialogue file. Please download and use the new data.

Prerequisites

General

Python (verified on 3.7)
wget

Python Packages

tensorflow (tested on 1.15.0)
tqdm
nltk
flask (for visualization)

Install

To install the pre-built version from pip, use

pip install airdialogue-essentials

To install the bleeding-edge version from GitHub, use

python setup.py install

Quick Start

Scoring

The official scoring function evaluates the predictions of a trained model and compares them to the AirDialogue dataset.

airdialogue score --true_data PATH_TO_DATA_FILE --true_kb PATH_TO_KB_FILE \
    --infer_metrics bleu

Context Generation

The context generator generates a valid context-action pair without conversation history.

airdialogue contextgen \
    --output_data PATH_TO_OUTPUT_DATA_FILE \
    --output_kb PATH_TO_OUTPUT_KB_FILE \
    --num_samples 100

Preprocessing

The AirDialogue preprocessing toolkit tokenizes dialogues. Preprocessing the AirDialogue data requires 50GB of RAM. The job_type parameter is a set of 5 bits separated by |, which represents train|eval|infer|sp-train|sp-eval. The input_type parameter can be either context for context-only data or dialogue for dialogue data with full history.

airdialogue prepro \
  --data_file PATH_TO_DATA_FILE \
  --kb_file PATH_TO_KB_FILE \
  --output_dir "./data/airdialogue/" \
  --output_prefix 'train' --job_type '0|0|0|1|0' --input_type context
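
The job_type string can be assembled programmatically rather than typed by hand. The following is a minimal sketch, not part of the airdialogue library: the helper name make_job_type and the constant JOB_ORDER are hypothetical, and it assumes that a 1 in a given position enables the corresponding job and a 0 disables it.

```python
# Hypothetical helper, not part of the airdialogue package.
# Bit order as described above: train|eval|infer|sp-train|sp-eval
JOB_ORDER = ("train", "eval", "infer", "sp-train", "sp-eval")


def make_job_type(*enabled):
    """Build a job_type string such as '0|0|0|1|0'.

    Assumption: a '1' bit enables the job at that position.
    """
    unknown = set(enabled) - set(JOB_ORDER)
    if unknown:
        raise ValueError(f"unknown job(s): {sorted(unknown)}")
    # Emit one bit per job, in the fixed order, joined by '|'.
    return "|".join("1" if name in enabled else "0" for name in JOB_ORDER)


if __name__ == "__main__":
    # Reproduces the value used in the command above.
    print(make_job_type("sp-train"))
```

The resulting string can then be passed as the value of --job_type. Rejecting unknown job names early avoids silently producing an all-zero string from a misspelled name.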

Simulator

The simulator is built on top of the context generator. It provides not only a context-action pair but also a full conversation history generated by two templated chatbot agents.

airdialogue sim \
    --output_data PATH_TO_OUTPUT_DATA_FILE \
    --output_kb PATH_TO_OUTPUT_KB_FILE \
    --num_samples 100

Visualization

The visualization tool displays the content of the raw JSON file.

airdialogue vis --data_path ./data/airdialogue/json/
