This repository contains code for the following paper:

Xinyu Hua and Lu Wang. *Sentence-Level Content Planning and Style Specification for Neural Text Generation*. EMNLP 2019.
If you find our work useful, please cite:
```
@inproceedings{hua-wang-2019-sentence,
    title = "Sentence-Level Content Planning and Style Specification for Neural Text Generation",
    author = "Hua, Xinyu and
      Wang, Lu",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
}
```
- download link: link
- arggen test set with target arguments: link
- arggen untokenized datasets: link
task | avg. # tokens per target | avg. # keyphrases | source
---|---|---|---
arggen | 54.87 | 55.80 | changemyview
wikigen | 70.57/48.60 | 23.56 | Normal/Simple Wikipedia
absgen | 141.34 | 12.23 | AGENDA
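The token and keyphrase columns above are per-sample averages over each dataset. As an illustration only (the JSON-lines layout and the `target`/`keyphrases` field names below are assumptions for the sketch, not the repository's actual data schema), such statistics can be computed like this:

```python
import json
from io import StringIO

# Two hypothetical JSON-lines records standing in for a dataset file;
# the field names `target` and `keyphrases` are assumptions.
sample = StringIO("\n".join(json.dumps(r) for r in [
    {"target": "a b c d", "keyphrases": ["x", "y"]},
    {"target": "a b", "keyphrases": ["x"]},
]))

n = tok_total = kp_total = 0
for line in sample:
    rec = json.loads(line)
    n += 1
    tok_total += len(rec["target"].split())  # whitespace token count
    kp_total += len(rec["keyphrases"])       # keyphrases per sample

avg_tokens = tok_total / n
avg_keyphrases = kp_total / n
print(f"avg tokens: {avg_tokens:.2f}, avg keyphrases: {avg_keyphrases:.2f}")
```

In the real data the averages come out to the values in the table (e.g. 54.87 target tokens for arggen).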
Errata: prior to March 04, 2020, the vocabulary file for absgen was incorrect. If you are using this model, please replace the old `vocab.txt` with the new one.
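Swapping in the corrected file is a single overwrite. A minimal sketch, with placeholder paths rather than the repository's actual layout:

```python
import os
import shutil
import tempfile

# Placeholder setup: paths below are assumptions, not the repo's layout.
workdir = tempfile.mkdtemp()
stale = os.path.join(workdir, "vocab.txt")      # pre-March-2020 vocabulary
fixed = os.path.join(workdir, "vocab_new.txt")  # re-downloaded, corrected file

with open(stale, "w") as f:
    f.write("old-entry\n")
with open(fixed, "w") as f:
    f.write("new-entry\n")

# Overwrite the stale vocabulary with the corrected one in place.
shutil.copyfile(fixed, stale)
```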
Note: all commands below assume `src/` is the working directory.
To train an argument generation model:

```
python main.py --mode=train \
    --exp_name=arggen_exp \
    --encode_passage \
    --type_conditional_lm \
    --task=arggen \
    --batch_size=30 \
    --num_train_epochs=30 \
    --max_src_words=500 \
    --max_passage_words=400 \
    --max_sent_num=10 \
    --max_bank_size=70 \
    --logging_freq=2
```
To train an abstract generation model, which has no sentence-level style labels:

```
python main.py --mode=train \
    --exp_name=absgen_exp \
    --task=absgen \
    --batch_size=30 \
    --num_train_epochs=30 \
    --max_src_words=1000 \
    --max_bank_size=30 \
    --logging_freq=2
```
To train a Wikipedia generation model:

```
python main.py --mode=train \
    --exp_name=wikigen_exp \
    --type_conditional_lm \
    --task=wikigen \
    --batch_size=30 \
    --max_bank_size=30 \
    --num_train_epochs=30 \
    --max_src_words=1000 \
    --logging_freq=2
```
See the LICENSE file for details.