This is our PyTorch implementation of CoCon.
CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021)
Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu
https://arxiv.org/abs/2006.03535
TL;DR: We propose CoCon to control the content of text generation from LMs by conditioning on content inputs at an interleave layer.
Requirements:
- Python 3.7.6 on Linux
- PyTorch 1.4
Install dependencies with:

```sh
pip install -r requirements.txt
```
- Download CoCon's training data from https://github.com/openai/gpt-2-output-dataset
- Place the `medium-345M-k40.${split}.jsonl` files inside the `data/gpt2output/` folder, as sketched below.
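For illustration, a minimal sketch of putting the files in place, assuming they have already been downloaded (the source path below is a placeholder):

```sh
# Create the folder the training script expects and move the dataset splits into it.
mkdir -p data/gpt2output
mv /path/to/downloads/medium-345M-k40.*.jsonl data/gpt2output/
```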
Train CoCon with a GPT-2 language model, using the hyperparameters reported in the paper:

```sh
sh train_cocon.sh
```

After training, the CoCon block's weights will be saved as `models/COCON/cocon_block_pytorch_model.bin`.
Key training arguments (a usage sketch follows the list):

- `--do_train`: whether to train CoCon
- `--output_dir`: directory for the CoCon weights
- `--model_name_or_path`: type of language model to train CoCon with
- `--output_hidden_for_cocon_after_block_ind`: index of the transformer block whose hidden states are used as CoCon's input for content conditioning. This is 6 for the results reported in the paper, i.e., the output of GPT-2's 7th transformer block is used as the CoCon block's input.
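A hypothetical sketch of how these arguments fit together; the entry-point script name and the argument values here are placeholders, and `train_cocon.sh` contains the actual invocation with the paper's full hyperparameters:

```sh
# Sketch only: the script name and values below are placeholders;
# see train_cocon.sh for the real entry point and hyperparameters.
python traininfer_cocon.py \
    --do_train \
    --output_dir models/COCON \
    --model_name_or_path gpt2-medium \
    --output_hidden_for_cocon_after_block_ind 6
```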
You can download CoCon's pretrained weights here and save them in `models/COCON/` to start generating with CoCon.
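For example, assuming the weights file has been downloaded into the current directory:

```sh
# Place the pretrained CoCon block weights where the generation scripts expect them.
mkdir -p models/COCON
mv cocon_block_pytorch_model.bin models/COCON/
```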
Sample script on how to generate CoCon sentiment-controlled text:

```sh
sh generation/generate_cocon_sentiments.sh
```

Sample script on how to generate CoCon topic-controlled text:

```sh
sh generation/generate_cocon_topics.sh
```

CoCon-generated texts correspond to the `cocon_output` key in the output `.jsonl` files and to `Cocon AR output` in the output `.txt` files.
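For a quick look at the samples, assuming `jq` is installed and the output file is named `cocon_output.jsonl` (a placeholder), the generated texts can be extracted from the `.jsonl` file like this:

```sh
# Print the CoCon-generated text of each sample, one per line of the .jsonl file.
jq -r '.cocon_output' cocon_output.jsonl
```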
Key generation arguments (a usage sketch follows the list):

- `--do_cocon_compute`: whether to do CoCon generation
- `--output_dir`: directory of the CoCon block's weights
- `--model_name_or_path`: type of language model
- `--cocon_output_filename`: path of the saved generation samples
- `--cocon_compute_history_source_data_file`: filename of the text file containing prompt texts for generation
- `--cocon_compute_context_source_data_file`: filename of the text file containing target content for generation
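As with training, the sketch below only illustrates how these arguments combine; the script name and file paths are placeholders, and `generation/generate_cocon_sentiments.sh` contains the actual invocation:

```sh
# Sketch only: the script name and file paths are placeholders;
# see the generation/*.sh scripts for the real invocation.
python traininfer_cocon.py \
    --do_cocon_compute \
    --output_dir models/COCON \
    --model_name_or_path gpt2-medium \
    --cocon_output_filename cocon_output.jsonl \
    --cocon_compute_history_source_data_file prompts/sentiment_prompts.txt \
    --cocon_compute_context_source_data_file attr_markers/positive_markers.txt
```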
- `transformers/`: code for models and optimizers
- `transformers/modeling_gpt2.py`: code for the CoCon block and GPT-2 language model
- `BOW/`: target content tokens used for CoCon topic control
- `attr_markers/`: target content tokens used for CoCon sentiment control
- `prompts/`: prompt texts used for text generation
If you find our repository useful, please consider citing our paper:
```
@inproceedings{
  chan2021cocon,
  title={CoCon: A Self-Supervised Approach for Controlled Text Generation},
  author={Alvin Chan and Yew-Soon Ong and Bill Pung and Aston Zhang and Jie Fu},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=VD_ozqvBy4W}
}
```
Code is based largely on the HuggingFace `transformers` repository (https://github.com/huggingface/transformers).