Official PyTorch implementation of our EMNLP 2021 paper: Graph Based Network with Contextualized Representations of Turns in Dialogue (TUCORE-GCN)
- python (3.8.3)
- cuda (11.0)
- Ubuntu-18.04.5
- dgl-cu110 (0.5.3)
- torch (1.7.0)
- numpy (1.19.2)
- scikit-learn
- regex
- packaging
- tqdm
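If you are setting up the environment from scratch, the packages above can be installed roughly as follows (a sketch assuming CUDA 11.0 wheels; adjust to your own setup):

```
# Pinned versions from the list above; the +cu110 wheel matches CUDA 11.0.
pip install torch==1.7.0+cu110 -f https://download.pytorch.org/whl/torch_stable.html
pip install dgl-cu110==0.5.3
pip install numpy==1.19.2 scikit-learn regex packaging tqdm
```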
PS: If you use Docker, you can download the Docker image we used in our experiments here.
- `run_classifier.py`: Code to train and evaluate the model
- `data.py`: Code defining the datasets / dataloaders for TUCORE-GCN
- `evaluate.py`: Code to evaluate the model on DialogRE
- `models/BERT`: Directory containing the BERT version of TUCORE-GCN
- `models/RoBERTa`: Directory containing the RoBERTa version of TUCORE-GCN
- `datasets/MELD/MELD4TUCOREGCN.py`: Code to convert MELD to the DialogRE style suggested in the paper
- `datasets/EmoryNLP/EMORY4TUCOREGCN.py`: Code to convert EmoryNLP to the DialogRE style suggested in the paper
- `datasets/DailyDialog/DailyDialog4TUCOREGCN.py`: Code to convert DailyDialog to the DialogRE style suggested in the paper
- Download the data from here
- Put `train.json`, `dev.json`, and `test.json` from `data_v2/en/data/` into the directory `datasets/DialogRE/`
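For example, assuming the DialogRE release was downloaded to a hypothetical `/path/to/dialogre/`:

```
# Copy the English v2 splits into this repository (source path is hypothetical).
cp /path/to/dialogre/data_v2/en/data/{train,dev,test}.json datasets/DialogRE/
```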
- Download the data from here
- Put `train_sent_emo.csv`, `dev_sent_emo.csv`, and `test_sent_emo.csv` from `data/MELD/` into the directory `datasets/MELD/`
- In `datasets/MELD/`, execute `python MELD4TUCOREGCN.py`
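Concretely, the MELD steps might look like this (the source path is hypothetical):

```
# Copy the MELD CSV splits, then run the converter from inside datasets/MELD.
cp /path/to/MELD/data/MELD/{train,dev,test}_sent_emo.csv datasets/MELD/
cd datasets/MELD && python MELD4TUCOREGCN.py
```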
- Download the data from here
- Put `emotion-detection-trn.json`, `emotion-detection-dev.json`, and `emotion-detection-tst.json` from `json/` into the directory `datasets/EmoryNLP/`
- In `datasets/EmoryNLP/`, execute `python EMORY4TUCOREGCN.py`
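As with MELD, the EmoryNLP preparation can be scripted (source path hypothetical):

```
# Copy the EmoryNLP splits, then run the converter from inside datasets/EmoryNLP.
cp /path/to/emotion-detection/json/emotion-detection-{trn,dev,tst}.json datasets/EmoryNLP/
cd datasets/EmoryNLP && python EMORY4TUCOREGCN.py
```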
- Download and unzip the data from here
- Put `train.zip`, `validation.zip`, and `test.zip` from `ijcnlp_dailydialog/` into the directory `datasets/DailyDialog/` and unzip them there
- In `datasets/DailyDialog/`, execute `python DailyDialog4TUCOREGCN.py`
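And the DailyDialog preparation (source path hypothetical):

```
# Copy the DailyDialog archives, unzip them in place, then run the converter.
cp /path/to/ijcnlp_dailydialog/{train,validation,test}.zip datasets/DailyDialog/
cd datasets/DailyDialog
unzip train.zip && unzip validation.zip && unzip test.zip
python DailyDialog4TUCOREGCN.py
```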
- Download and unzip BERT-Base Uncased from here, and copy the files into the directory `pre-trained_model/BERT/`
- Set the environment variable for BERT: `export BERT_BASE_DIR=/PATH/TO/BERT/DIR`
- In `pre-trained_model`, execute `python convert_tf_checkpoint_to_pytorch_BERT.py --tf_checkpoint_path=$BERT_BASE_DIR/bert_model.ckpt --bert_config_file=$BERT_BASE_DIR/bert_config.json --pytorch_dump_path=$BERT_BASE_DIR/pytorch_model.bin`
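After the conversion, `$BERT_BASE_DIR` should contain the converted `pytorch_model.bin` alongside the original files; a quick sanity check (file names as in the standard BERT-Base Uncased release):

```
ls $BERT_BASE_DIR
# Expect roughly: bert_config.json  bert_model.ckpt.*  vocab.txt  pytorch_model.bin
```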
- Download and unzip RoBERTa-large from here, and copy the files into the directory `pre-trained_model/RoBERTa/`
- Download `merges.txt` and `vocab.json` from here and put them into the directory `pre-trained_model/RoBERTa/`
- Set the environment variable for RoBERTa: `export RoBERTa_LARGE_DIR=/PATH/TO/RoBERTa/DIR`
- In `pre-trained_model`, execute `python convert_roberta_original_pytorch_checkpoint_to_pytorch.py --roberta_checkpoint_path=$RoBERTa_LARGE_DIR --pytorch_dump_folder_path=$RoBERTa_LARGE_DIR`
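Likewise, after conversion `$RoBERTa_LARGE_DIR` should contain everything the RoBERTa commands below reference:

```
ls $RoBERTa_LARGE_DIR
# Expect roughly: config.json  merges.txt  vocab.json  pytorch_model.bin
```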
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type BERT --data_dir datasets/DialogRE --data_name DialogRE --vocab_file $BERT_BASE_DIR/vocab.txt --config_file $BERT_BASE_DIR/bert_config.json --init_checkpoint $BERT_BASE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 3e-5 --num_train_epochs 20.0 --output_dir TUCOREGCN_BERT_DialogRE --gradient_accumulation_steps 2
rm TUCOREGCN_BERT_DialogRE/model_best.pt
python evaluate.py --dev datasets/DialogRE/dev.json --test datasets/DialogRE/test.json --f1dev TUCOREGCN_BERT_DialogRE/logits_dev.txt --f1test TUCOREGCN_BERT_DialogRE/logits_test.txt --f1cdev TUCOREGCN_BERT_DialogRE/logits_devc.txt --f1ctest TUCOREGCN_BERT_DialogRE/logits_testc.txt --result_path TUCOREGCN_BERT_DialogRE/result.txt
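If training and evaluation complete, the output directory should contain the logit files that `evaluate.py` consumes plus the final scores (names taken from the flags above; `model_best.pt` is deleted, presumably just to save disk space):

```
ls TUCOREGCN_BERT_DialogRE
# Expect roughly: logits_dev.txt  logits_test.txt  logits_devc.txt  logits_testc.txt  result.txt
```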
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type BERT --data_dir datasets/MELD --data_name MELD --vocab_file $BERT_BASE_DIR/vocab.txt --config_file $BERT_BASE_DIR/bert_config.json --init_checkpoint $BERT_BASE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 3e-5 --num_train_epochs 10.0 --output_dir TUCOREGCN_BERT_MELD --gradient_accumulation_steps 2
rm TUCOREGCN_BERT_MELD/model_best.pt
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type BERT --data_dir datasets/EmoryNLP --data_name EmoryNLP --vocab_file $BERT_BASE_DIR/vocab.txt --config_file $BERT_BASE_DIR/bert_config.json --init_checkpoint $BERT_BASE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 3e-5 --num_train_epochs 10.0 --output_dir TUCOREGCN_BERT_EmoryNLP --gradient_accumulation_steps 2
rm TUCOREGCN_BERT_EmoryNLP/model_best.pt
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type BERT --data_dir datasets/DailyDialog --data_name DailyDialog --vocab_file $BERT_BASE_DIR/vocab.txt --config_file $BERT_BASE_DIR/bert_config.json --init_checkpoint $BERT_BASE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 3e-5 --num_train_epochs 10.0 --output_dir TUCOREGCN_BERT_DailyDialog --gradient_accumulation_steps 2
rm TUCOREGCN_BERT_DailyDialog/model_best.pt
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type RoBERTa --data_dir datasets/DialogRE --data_name DialogRE --vocab_file $RoBERTa_LARGE_DIR/vocab.json --merges_file $RoBERTa_LARGE_DIR/merges.txt --config_file $RoBERTa_LARGE_DIR/config.json --init_checkpoint $RoBERTa_LARGE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 5e-6 --num_train_epochs 30.0 --output_dir TUCOREGCN_RoBERTa_DialogRE --gradient_accumulation_steps 2
rm TUCOREGCN_RoBERTa_DialogRE/model_best.pt
python evaluate.py --dev datasets/DialogRE/dev.json --test datasets/DialogRE/test.json --f1dev TUCOREGCN_RoBERTa_DialogRE/logits_dev.txt --f1test TUCOREGCN_RoBERTa_DialogRE/logits_test.txt --f1cdev TUCOREGCN_RoBERTa_DialogRE/logits_devc.txt --f1ctest TUCOREGCN_RoBERTa_DialogRE/logits_testc.txt --result_path TUCOREGCN_RoBERTa_DialogRE/result.txt
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type RoBERTa --data_dir datasets/MELD --data_name MELD --vocab_file $RoBERTa_LARGE_DIR/vocab.json --merges_file $RoBERTa_LARGE_DIR/merges.txt --config_file $RoBERTa_LARGE_DIR/config.json --init_checkpoint $RoBERTa_LARGE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 5e-6 --num_train_epochs 10.0 --output_dir TUCOREGCN_RoBERTa_MELD --gradient_accumulation_steps 2
rm TUCOREGCN_RoBERTa_MELD/model_best.pt
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type RoBERTa --data_dir datasets/EmoryNLP --data_name EmoryNLP --vocab_file $RoBERTa_LARGE_DIR/vocab.json --merges_file $RoBERTa_LARGE_DIR/merges.txt --config_file $RoBERTa_LARGE_DIR/config.json --init_checkpoint $RoBERTa_LARGE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 5e-6 --num_train_epochs 10.0 --output_dir TUCOREGCN_RoBERTa_EmoryNLP --gradient_accumulation_steps 2
rm TUCOREGCN_RoBERTa_EmoryNLP/model_best.pt
- In `TUCORE-GCN`, execute the following commands:
python run_classifier.py --do_train --do_eval --encoder_type RoBERTa --data_dir datasets/DailyDialog --data_name DailyDialog --vocab_file $RoBERTa_LARGE_DIR/vocab.json --merges_file $RoBERTa_LARGE_DIR/merges.txt --config_file $RoBERTa_LARGE_DIR/config.json --init_checkpoint $RoBERTa_LARGE_DIR/pytorch_model.bin --max_seq_length 512 --train_batch_size 12 --learning_rate 5e-6 --num_train_epochs 10.0 --output_dir TUCOREGCN_RoBERTa_DailyDialog --gradient_accumulation_steps 2
rm TUCOREGCN_RoBERTa_DailyDialog/model_best.pt
If you use this code, please cite our paper:

@inproceedings{lee-choi-2021-graph,
title = "Graph Based Network with Contextualized Representations of Turns in Dialogue",
author = "Lee, Bongseok and
Choi, Yong Suk",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.36",
pages = "443--455",
abstract = "Dialogue-based relation extraction (RE) aims to extract relation(s) between two arguments that appear in a dialogue. Because dialogues have the characteristics of high personal pronoun occurrences and low information density, and since most relational facts in dialogues are not supported by any single sentence, dialogue-based relation extraction requires a comprehensive understanding of dialogue. In this paper, we propose the TUrn COntext awaRE Graph Convolutional Network (TUCORE-GCN) modeled by paying attention to the way people understand dialogues. In addition, we propose a novel approach which treats the task of emotion recognition in conversations (ERC) as a dialogue-based RE. Experiments on a dialogue-based RE dataset and three ERC datasets demonstrate that our model is very effective in various dialogue-based natural language understanding tasks. In these experiments, TUCORE-GCN outperforms the state-of-the-art models on most of the benchmark datasets. Our code is available at https://github.com/BlackNoodle/TUCORE-GCN.",
}