The E2E Challenge Dataset

Authors: Jekaterina Novikova, Ondrej Dusek and Verena Rieser

Download Link

Download the full release of the E2E dataset here (ZIP)

Description

The E2E dataset is a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain. It is ten times bigger than existing, frequently used datasets in this area.

The E2E dataset poses new challenges:

- its human reference texts show more lexical richness and syntactic variation, including discourse phenomena;
- generating from this set requires content selection.

As such, learning from this dataset promises more natural, varied and less template-like system utterances.

The E2E set was used in the E2E NLG Challenge, which provides an extensive list of results achieved on this data.

Please refer to our SIGDIAL 2017 paper for a detailed description of the dataset.
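As a rough illustration of working with the data, the sketch below reads meaning representation / reference text pairs from CSV. It rests on assumptions not stated in this section: that each CSV file has two columns named `mr` and `ref`, and that an MR is a comma-separated list of `slot[value]` items. The sample row is made up for the example, and `parse_mr` and `read_pairs` are hypothetical helper names.

```python
import csv
import io
import re

# Made-up sample in the ASSUMED two-column layout (mr, ref).
# Check the header of the released CSV files before relying on these names.
SAMPLE_CSV = '''mr,ref
"name[The Vaults], eatType[pub], priceRange[cheap], near[The Portland Arms]","The Vaults is a cheap pub near The Portland Arms."
'''

# One slot[value] item; slot names contain no brackets or commas (assumption).
SLOT_PATTERN = re.compile(r"\s*([^\[\],]+?)\s*\[([^\]]*)\]")


def parse_mr(mr):
    """Turn 'slot[value], slot[value]' into a dict of slot -> value."""
    return {m.group(1): m.group(2) for m in SLOT_PATTERN.finditer(mr)}


def read_pairs(handle):
    """Read (parsed MR, reference text) pairs from an open CSV handle."""
    reader = csv.DictReader(handle)
    return [(parse_mr(row["mr"]), row["ref"]) for row in reader]


pairs = read_pairs(io.StringIO(SAMPLE_CSV))
for slots, ref in pairs:
    print(slots, "->", ref)
```

To read a real file instead of the sample, pass an open file handle, for example `open(path, encoding="utf-8", newline="")`, to `read_pairs`.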

Citing

If you use this dataset in your work, please cite the following paper:

@inproceedings{novikova2017e2e,
  title={The {E2E} Dataset: New Challenges for End-to-End Generation},
  author={Novikova, Jekaterina and Du{\v{s}}ek, Ondrej and Rieser, Verena},
  booktitle={Proceedings of the 18th Annual Meeting of the Special Interest 
             Group on Discourse and Dialogue},
  address={Saarbr\"ucken, Germany},
  year={2017},
  note={arXiv:1706.09254},
  url={https://arxiv.org/abs/1706.09254},
}

License

Distributed under the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0).

Acknowledgements

This research received funding from the EPSRC projects DILiGENt (EP/M005429/1) and MaDrIgAL (EP/N017536/1).
