This repository adapts the official mt-dnn repository to multi-dataset stance detection and robustness experiments. Models are trained on ten stance detection datasets from different domains, using both single-dataset and multi-dataset learning. Three adversarial attacks have been added to probe and compare the robustness of both settings. The framework can easily be adapted to include more datasets and adversarial attack sets.
The BERT and MT-DNN weights, fine-tuned on all ten stance detection datasets, can be downloaded here. Place the models in the mt_dnn_models folder and modify the run scripts to use them (see below).
Further details can be found in our publication Stance Detection Benchmark: How Robust Is Your Stance Detection?.
The repository requires Python 3.6+.
git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py/
python3 setup.py install
pip install -r requirements.txt
bash download.sh
Note: three of the datasets have to be retrieved manually for licensing reasons. Please follow the instructions at the bottom of the download.sh file.
python prepro.py
Note: see parse_args() in prepro.py for additional parameters, e.g.:
- for additional low-resource experiment ratios, add ratios to (or remove them from) the LOW_RESOURCE_DATA_RATES variable
- to create the data for adversarial test sets, add --generate_adversarial 1 --opennmt_gpu 0
- please note that the paraphrasing can take several hours to complete for all datasets
- use a GPU to speed up translation (export CUDA_VISIBLE_DEVICES=0)
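As a rough illustration of what a low-resource ratio means, the following sketch draws a random subset of a training set for each ratio. The ratio values, the subsample() helper, and the sample layout are assumptions made for this example; the actual logic in prepro.py may differ.

```python
import random

# Hypothetical ratios; the real values live in LOW_RESOURCE_DATA_RATES in prepro.py.
LOW_RESOURCE_DATA_RATES = [0.1, 0.3, 0.7]


def subsample(samples, ratio, seed=0):
    """Return a random subset holding roughly `ratio` of the samples."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    k = min(len(samples), max(1, round(len(samples) * ratio)))
    return rng.sample(samples, k)


# Toy training set with 100 samples (assumed layout).
train = [{"uid": i, "label": i % 2} for i in range(100)]
subsets = {ratio: subsample(train, ratio) for ratio in LOW_RESOURCE_DATA_RATES}

for ratio, subset in subsets.items():
    print(ratio, len(subset))
```

Fixing the seed per ratio makes runs comparable, since every model sees the same reduced training set.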
All scripts can be found in the scripts folder.
# Train on all stance detection datasets in a multi-dataset fashion
bash run_mt_dnn_all_stance_MTL_seed_loop.sh (see bash file for parameters)
E.g.: bash run_mt_dnn_all_stance_MTL_seed_loop.sh 0 mt_dnn_large 100
# Train single datasets
bash run_mt_dnn_ST_seed_loop.sh (see bash file for parameters)
E.g.: bash run_mt_dnn_ST_seed_loop.sh 1 bert_model_base ibmcs 30
# For the multi-dataset model
bash evaluate_mt_dnn_all_stance_MTL_seed_loop.sh (see bash file for parameters)
# For the single-dataset model
bash evaluate_mt_dnn_ST_seed_loop.sh (see bash file for parameters)
E.g.: bash evaluate_mt_dnn_ST_seed_loop.sh 1 bert_model_base ibmcs 2019-12-19T2001 30
Note:
- adapt the configuration in the bash file to your needs, e.g. set the stress_tests variable if adversarial attack sets have been created.
- the timestamp can be found in the checkpoint folder for the specific model (folder name)
The predictions for each seed can be found in the checkpoints folder. The scores for each seed can be found in the results folder. To average the results over all seeds, use the following script:
python eval_results/eval_results.py
Note: Adapt the parameters in the script to your needs. A new file for the test set and for each adversarial attack set will be created in the specified model folder in results. Each file has an additional key all_seed_avg with the averaged results.
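The averaging step can be pictured with the minimal sketch below. It assumes the per-seed scores are available as a dictionary that maps a seed name to its metrics; the seed names, metric names, and the add_seed_average() helper are hypothetical and not taken from eval_results.py. Only the key all_seed_avg comes from the description above.

```python
from statistics import mean


def add_seed_average(results):
    """Return a copy of `results` with an `all_seed_avg` entry added.

    `results` maps a seed name to a dict of metric name -> score.
    """
    seeds = [key for key in results if key != "all_seed_avg"]
    metrics = results[seeds[0]].keys()
    averaged = dict(results)
    averaged["all_seed_avg"] = {
        metric: mean(results[seed][metric] for seed in seeds) for metric in metrics
    }
    return averaged


# Hypothetical per-seed scores for one test set.
scores = {
    "seed_1": {"f1_macro": 0.50, "accuracy": 0.25},
    "seed_2": {"f1_macro": 0.75, "accuracy": 0.75},
}
with_avg = add_seed_average(scores)
print(with_avg["all_seed_avg"])
```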
For more information on the underlying framework itself, please refer to the official mt-dnn repository.
For all of the following steps, please use the existing dataset entries for guidance.
- Add your dataset file to the data folder
- Add an entry for your dataset in prepro.py
- Add your dataset configuration to the dictionaries in data_utils/label_map.py, as well as to dump_result_files() in train.py and predict.py.
- Add a dataset reader in data_utils/glue_utils.py
- Add your dataset key (e.g. like "snopes") to scripts/run_mt_dnn_all_stance_MTL_seed_loop.sh and execute the script
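To give a feel for the label map and reader steps above, here is a minimal sketch for a made-up dataset called "mydataset". The file layout (tab-separated topic, text, and label), the label names, the load_mydataset() function, and the keys of the returned dictionaries are all assumptions; check the existing readers in data_utils/glue_utils.py for the format the framework actually expects.

```python
import csv
import io

# Hypothetical label map entry for a new dataset named "mydataset".
LABEL_MAP = {"mydataset": {"against": 0, "for": 1, "neutral": 2}}


def load_mydataset(handle, label_map=None):
    """Read tab-separated (topic, text, label) rows into sample dicts."""
    label_map = label_map or LABEL_MAP["mydataset"]
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for uid, (topic, text, label) in enumerate(reader):
        rows.append(
            {"uid": uid, "premise": topic, "hypothesis": text, "label": label_map[label]}
        )
    return rows


# In-memory stand-in for a file in the data folder.
example = io.StringIO(
    "nuclear energy\tIt is clean and reliable.\tfor\n"
    "nuclear energy\tThe waste problem is unsolved.\tagainst\n"
)
samples = load_mydataset(example)
print(samples[0])
```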
For all of the following steps, please use the existing adversarial attacks for guidance.
- Add your function in data_utils/glue_utils.py (e.g. like create_adversarial_negation())
- Add the function call to sample_handling() in data_utils/glue_utils.py
- Add your additional adversarial attack as an additional returned parameter for the datasets in prepro.py
- Pass the adversarial attack data into build_handler() in prepro.py and add another entry for your attack in this function
Note: If the attack modifies the length of the original sentences, please consider this for the cutoff that takes place in functions build_data() and build_data_single() in prepro.py in order to avoid information loss.
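The sketch below shows why the cutoff matters for attacks that lengthen the input. It uses a made-up attack that appends a fixed suffix, plus plain whitespace tokenization. The function names, the suffix, and the tokenization are assumptions for illustration and do not reproduce the code in data_utils/glue_utils.py or prepro.py.

```python
# Illustrative suffix; not necessarily what the repository's attacks use.
SUFFIX = "and false is not true"


def create_adversarial_suffix(text, suffix=SUFFIX):
    """Hypothetical attack: append a fixed suffix to the sentence."""
    return f"{text.rstrip()} {suffix}"


def truncate(tokens, max_len):
    """Keep at most `max_len` tokens, mimicking a length cutoff."""
    return tokens[:max_len]


original = "Nuclear energy is clean and reliable"
attacked = create_adversarial_suffix(original)

max_len = len(original.split())  # cutoff tuned to the original sentence
extra = len(SUFFIX.split())      # tokens added by the attack

# With the unchanged cutoff the attack is silently removed again.
cut = truncate(attacked.split(), max_len)
# Raising the cutoff by the added length preserves the attack.
kept = truncate(attacked.split(), max_len + extra)

print(cut)
print(kept)
```

With a real subword tokenizer the number of added tokens will differ from the word count, so the cutoff should be adjusted based on the tokenized length.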
If you find this work helpful, please cite our publication Stance Detection Benchmark: How Robust Is Your Stance Detection?:
@article{schiller2021stance,
author = {Schiller, Benjamin and Daxenberger, Johannes and Gurevych, Iryna},
year = {2021},
month = {03},
title = {Stance Detection Benchmark: How Robust is Your Stance Detection?},
journal = {KI - Künstliche Intelligenz},
doi = {10.1007/s13218-021-00714-w}
}
Contact person: Benjamin Schiller