A model for transforming questions + short answers into full answer sentences.
The dataset and the models are described in the following paper:
Demszky, D., Guu, K., & Liang, P. (2018). Transforming Question Answering Datasets Into Natural Language Inference Datasets. arXiv preprint arXiv:1809.02922.
This repo contains the code and examples for both the rule-based model and the neural model.
The data is available on CodaLab.
We illustrate how to use the rule-based model in the designated Jupyter notebook. The input sentences have to be dependency parsed. We created our example file in the following manner:
- Save your tokenized (space-separated) questions and short answers in a file, such as examples.txt, where each sentence is a line, one example after the other (i.e. question 1 <line-break> short answer 1 <line-break> question 2 <line-break> short answer 2 <line-break> ... question N <line-break> short answer N).
- Convert this file into CoNLL-U format, examples.conllu, with the tags and labels left empty (_).
- POS tag the file. For ours, we used the parser by Dozat et al. (2017), which can be used as a tagger as well.
- Dependency parse the file. We parsed ours with Dozat et al. (2017).
- Use the resulting tagged and parsed examples.conllu file as input for the model, as shown in the Jupyter notebook Rule-based Example.ipynb.
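The conversion step (tokenized text to CoNLL-U with empty tags and labels) could look roughly like the sketch below. It is not part of this repo: the function names `sentence_to_conllu`, `text_to_conllu` and `convert_file` are hypothetical, and it assumes the standard 10-column CoNLL-U layout (ID, FORM, then eight fields left as `_`) with a blank line after each sentence. Check the output against what your tagger and parser expect.

```python
def sentence_to_conllu(tokens):
    """Render one tokenized sentence as CoNLL-U rows with empty annotations."""
    rows = []
    for index, token in enumerate(tokens, start=1):
        # Columns: ID, FORM, then LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC as "_"
        rows.append("\t".join([str(index), token] + ["_"] * 8))
    return "\n".join(rows)


def text_to_conllu(text):
    """Convert text with one space-separated sentence per line to CoNLL-U.

    Questions and short answers alternate line by line in the input, so each
    one simply becomes its own sentence block in the output.
    """
    sentences = [line.split() for line in text.splitlines() if line.strip()]
    if not sentences:
        return ""
    # Every sentence block, including the last, is followed by a blank line.
    return "\n\n".join(sentence_to_conllu(s) for s in sentences) + "\n\n"


def convert_file(source_path, target_path):
    """Read a plain-text examples file and write the CoNLL-U version."""
    with open(source_path, encoding="utf-8") as source:
        converted = text_to_conllu(source.read())
    with open(target_path, "w", encoding="utf-8") as target:
        target.write(converted)
```

Calling `convert_file("examples.txt", "examples.conllu")` would then produce the untagged file that the tagging and parsing steps fill in.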
Coming soon.