TitleStylist

Source code for our "TitleStylist" paper at ACL 2020: Di Jin, Zhijing Jin, Joey Tianyi Zhou, Lisa Orii, and Peter Szolovits. "Hooks in the Headline: Learning to Generate Headlines with Controlled Styles." ACL (2020). If you use the code, please cite the paper:

@inproceedings{jin2020hooks,
  author    = {Di Jin and Zhijing Jin and Joey Tianyi Zhou and Lisa Orii and Peter Szolovits},
  title     = {Hooks in the Headline: Learning to Generate Headlines with Controlled
               Styles},
  booktitle = {Proceedings of the 58th Annual Meeting of the Association for Computational
               Linguistics, {ACL} 2020, Online, July 5-10, 2020},
  pages     = {5082--5093},
  publisher = {Association for Computational Linguistics},
  year      = {2020},
  url       = {https://www.aclweb.org/anthology/2020.acl-main.456/}
}

Here is a talk that introduces our work.

Requirements

Python packages

PyTorch
fairseq
blingfire

To install them, run:

pip install -r requirements.txt

Bash commands

To evaluate the generated headlines with ROUGE scores, you need to install the "files2rouge" package. To do so, run the following commands (provided by this repository):

pip install -U git+https://github.com/pltrdy/pyrouge
git clone https://github.com/pltrdy/files2rouge.git
cd files2rouge
python setup_rouge.py
python setup.py install
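
After installation, it can be useful to confirm that the command-line tool is reachable. The snippet below is a minimal sketch, not part of this repository: `have_tool` is a hypothetical helper name, and it assumes the installation exposes a `files2rouge` command on your PATH.

```shell
#!/usr/bin/env bash
# Sketch only: have_tool is a hypothetical helper, not part of the repository.

# Succeed if the named command can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Assumption: the install steps above put a `files2rouge` command on PATH.
if have_tool files2rouge; then
  echo "files2rouge found"
else
  echo "files2rouge not on PATH; re-check the installation steps above" >&2
fi
```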

Usage

All data, including the combined CNN and NYT article-headline pairs and the three style-specific corpora (humor, romance, and clickbait) mentioned in the paper, are in the folder "data".
Please download the pretrained model parameters of MASS from this link, unzip them, and put the unzipped files into the folder "pretrained_model/MASS".
To train a headline generation model that can simultaneously generate a factual and a stylistic headline, run the following command:

./train_mix_CNN_NYT_X.sh --style YOUR_TARGET_STYLE

Here the argument YOUR_TARGET_STYLE specifies the style you would like to have. In this paper, we provide three options: humor, romance, and clickbait.

After running this command, the trained model parameters will be saved into the folder "tmp/exp".
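
To train one model per style in turn, a small wrapper loop may help. The following is a sketch under stated assumptions rather than a script shipped with this repository: `train_cmd` and `DRY_RUN` are hypothetical names, and the loop assumes it is run from the repository root. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: train_cmd and DRY_RUN are hypothetical names, not part of the repository.

# Print the training command for one of the three styles provided in the paper.
train_cmd() {
  case "$1" in
    humor|romance|clickbait)
      echo "./train_mix_CNN_NYT_X.sh --style $1" ;;
    *)
      echo "unknown style: $1" >&2
      return 1 ;;
  esac
}

# By default only print each command; set DRY_RUN=0 to actually run it.
for style in humor romance clickbait; do
  cmd="$(train_cmd "$style")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
done
```

Note that each run writes to "tmp/exp"; whether successive runs overwrite each other there is not checked by this sketch, so you may want to move the output aside between styles.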

If you want to evaluate the trained model and generate headlines (both factual and stylistic) using this model, please run the following command:

./evaluate_mix_CNN_NYT_X.sh --style YOUR_TARGET_STYLE --model_dir MODEL_STORED_DIRCTORY

In this command, the argument MODEL_STORED_DIRCTORY specifies the directory which stores the trained model.
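
Because evaluation depends on a directory produced by training, it can help to check that the directory exists before launching the script. The sketch below assumes it is run from the repository root; `eval_cmd` and `MODEL_DIR` are hypothetical names, and the default of `tmp/exp` is only a guess at where your trained model lives, so adjust it to your own run.

```shell
#!/usr/bin/env bash
# Sketch only: eval_cmd and MODEL_DIR are hypothetical names, not part of the repository.

# Print the evaluation command if the model directory exists; fail otherwise.
eval_cmd() {
  style="$1"
  model_dir="$2"
  if [ ! -d "$model_dir" ]; then
    echo "model directory not found: $model_dir" >&2
    return 1
  fi
  echo "./evaluate_mix_CNN_NYT_X.sh --style $style --model_dir $model_dir"
}

# Assumption: the trained model sits under tmp/exp; override MODEL_DIR if it does not.
MODEL_DIR="${MODEL_DIR:-tmp/exp}"
eval_cmd humor "$MODEL_DIR" || echo "train a model first, or set MODEL_DIR" >&2
```

The function only prints the command; paste the printed line into your shell to run the evaluation.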

If you want to train and evaluate the headline generation model for more than one style, run the following commands:

./train_mix_CNN_NYT_multiX.sh
./evaluate_mix_CNN_NYT_multiX.sh --model_dir MODEL_STORED_DIRCTORY

Extension

For the humorous style, we used humorous novels, but you can also try the following datasets:

16000 One-Liners (16K humorous)
Pun of the Day (16K humorous)
Short Jokes (231K humorous)
Plaintext Jokes (208K humorous)

We suggest trying the large Short Jokes dataset, which is likely to produce good headlines.
