Source code and data for the ACL 2021 paper A DQN-based Approach to Finding Precise Evidences for Fact Verification.
More information about the FEVER 1.0 shared task can be found on this website.
- python 3.6.10
- pytorch 1.3.1
- transformers 2.5.1
- prettytable
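A quick way to confirm the environment matches the pins above is to compare installed versions against them. The following is only a minimal sketch: the helper names `installed_versions` and `find_mismatches` are hypothetical and not part of this repository, and it assumes each package exposes a `__version__` attribute.

```python
import importlib

# Pins copied from the list above; "torch" is the import name of pytorch.
PINNED = {"torch": "1.3.1", "transformers": "2.5.1"}


def installed_versions(names):
    """Return {name: version or None}; None means the module cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", None)
    return found


def find_mismatches(pinned, installed):
    """Return {name: (expected, found)} for every package that is off its pin."""
    mismatches = {}
    for name, expected in pinned.items():
        found = installed.get(name)
        # Ignore local build suffixes such as "1.3.1+cu92".
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(PINNED, installed_versions(PINNED))` inside the training environment should return an empty dict when both pins are met.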
The structure of the data folder is as follows:
├── data
│ ├── bert
│ │ └── roberta-large
│ ├── dqn
│ ├── fever
│ ├── glue
│ └── retrieved
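If the folders do not exist yet, the skeleton above can be created in one step. This is a small sketch rather than a script shipped with the repository; `make_data_dirs` is a hypothetical helper, and it assumes you run it from the repository root (or pass the root explicitly).

```python
import os

# Sub-folders copied from the tree above, relative to the repository root.
DATA_SUBDIRS = [
    os.path.join("data", "bert", "roberta-large"),
    os.path.join("data", "dqn"),
    os.path.join("data", "fever"),
    os.path.join("data", "glue"),
    os.path.join("data", "retrieved"),
]


def make_data_dirs(root="."):
    """Create every expected data sub-folder under `root` and return the paths."""
    created = []
    for sub in DATA_SUBDIRS:
        path = os.path.join(root, sub)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created
```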
To replicate the experiments, download the data as described below, or obtain it directly from Google Drive.
Note: because of its large size, fever.db should be downloaded separately with the following command and placed in fever:
# Download the fever database
wget -O data/fever/fever.db https://s3-eu-west-1.amazonaws.com/fever.public/wiki_index/fever.db
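Since the file is large, it can be worth checking that the download is intact before going further. The sketch below assumes fever.db is a SQLite database and makes no assumption about its schema; it simply lists the tables it finds. `list_tables` is a hypothetical helper, not part of this repository.

```python
import os
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at `db_path`."""
    # sqlite3.connect() would silently create an empty file, so check first.
    if not os.path.isfile(db_path):
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

A truncated or corrupted download will typically raise `sqlite3.DatabaseError` here instead of returning a list.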
- bert: download the pre-trained RoBERTa model with the following commands and put the files into bert/roberta-large.
wget -O pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-pytorch_model.bin
wget -O vocab.json https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json
wget -O merges.txt https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt
wget -O config.json https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json
- fever: download train.jsonl, shared_task_dev.jsonl, and shared_task_test.jsonl from the website and fever.db from GEAR, then put them in fever.
- retrieved: following GEAR, we use the document retrieval results from Athene UKP TU Darmstadt and the sentence selection results from GEAR.
- dqn: first prepare the data in retrieved, then run sh data_propress.sh to obtain the data in dqn.
- glue: first prepare the data in retrieved, then run sh data_process_for_pretrained.sh to obtain the data in glue.
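Before running the processing scripts, you may want to confirm that the downloaded files are where the steps above expect them. The following sketch only covers the files named explicitly in this README (the contents of retrieved, dqn, and glue are not listed here, so they are not checked). `missing_files` is a hypothetical helper, not part of this repository.

```python
import os

# File names taken from the download steps above.
EXPECTED_FILES = {
    os.path.join("data", "bert", "roberta-large"): [
        "pytorch_model.bin",
        "vocab.json",
        "merges.txt",
        "config.json",
    ],
    os.path.join("data", "fever"): [
        "train.jsonl",
        "shared_task_dev.jsonl",
        "shared_task_test.jsonl",
        "fever.db",
    ],
}


def missing_files(root="."):
    """Return the paths of all expected files that are absent under `root`."""
    missing = []
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            path = os.path.join(root, folder, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

An empty list means every file named above is in place.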
Before training, you need to fine-tune the sentence encoding module (i.e., RoBERTa).
Run sh pretrained.sh to fine-tune RoBERTa, then replace pytorch_model.bin in data/bert/roberta-large with the pytorch_model.bin from the best checkpoint.
You can also download our fine-tuned version directly from Google Drive.
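The replacement step can be done by hand, but a small helper makes it harder to lose the original weights. This is a sketch under stated assumptions: `install_checkpoint` is a hypothetical helper, and the checkpoint folder is whatever directory pretrained.sh wrote its best checkpoint to, which this README does not name.

```python
import os
import shutil


def install_checkpoint(checkpoint_dir, model_dir, filename="pytorch_model.bin"):
    """Copy `filename` from `checkpoint_dir` into `model_dir`, keeping a backup."""
    src = os.path.join(checkpoint_dir, filename)
    dst = os.path.join(model_dir, filename)
    backup = dst + ".orig"
    if not os.path.isfile(src):
        raise FileNotFoundError(src)
    # Back up the pre-trained weights once; later runs leave the backup alone.
    if os.path.isfile(dst) and not os.path.isfile(backup):
        shutil.copyfile(dst, backup)
    shutil.copyfile(src, dst)
    return dst
```

For example, `install_checkpoint(best_checkpoint_dir, "data/bert/roberta-large")`, where `best_checkpoint_dir` is a placeholder for your own checkpoint path.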
Run sh train.sh to train our DQN-based model. All checkpoints of our DQN-based model can be found at Google Drive.
The first time you train the model, the sentence encoding module may take a long time (about 1 day on our machine) to encode the sentences into their semantic representations. Because of its large size, we have not uploaded the preprocessed data to the cloud. You can email wanhai@mail.sysu.edu.cn to obtain it.
Note: the following commands in train.sh set the version of our DQN-based model. Please choose one before training.
## T-T
export DQN_MODE=transformer # context sub-module
export AGGREGATE=transformer # aggregation sub-module
export ID=TT
## T-A
export DQN_MODE=transformer
export AGGREGATE=attention
export ID=TA
## BiLSTM-T
export DQN_MODE=lstm
export AGGREGATE=transformer
export ID=LT
## BiLSTM-A
export DQN_MODE=lstm
export AGGREGATE=attention
export ID=LA
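The four variants above are the combinations of two context sub-modules and two aggregation sub-modules. As an illustration only, the same table can be written as a lookup that regenerates the export lines for a chosen ID; `VARIANTS` and `export_lines` are hypothetical names, not part of train.sh.

```python
# Settings copied from the four blocks above, keyed by the ID value.
VARIANTS = {
    "TT": {"DQN_MODE": "transformer", "AGGREGATE": "transformer"},  # T-T
    "TA": {"DQN_MODE": "transformer", "AGGREGATE": "attention"},    # T-A
    "LT": {"DQN_MODE": "lstm", "AGGREGATE": "transformer"},         # BiLSTM-T
    "LA": {"DQN_MODE": "lstm", "AGGREGATE": "attention"},           # BiLSTM-A
}


def export_lines(variant_id):
    """Return the export lines for one variant, in the order used above."""
    settings = VARIANTS[variant_id]  # raises KeyError for an unknown ID
    return [
        "export DQN_MODE={}".format(settings["DQN_MODE"]),
        "export AGGREGATE={}".format(settings["AGGREGATE"]),
        "export ID={}".format(variant_id),
    ]
```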
Run sh dev.sh or sh test.sh to evaluate our approach on the DEV or TEST set.
After evaluating on TEST, submit test_precise_with/without_post_processing.jsonl to CodaLab to view the blind-test results.
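Before uploading, it may help to check the prediction file line by line. The sketch below assumes the file follows the standard FEVER submission layout, i.e. one JSON object per line with `id`, `predicted_label`, and `predicted_evidence` (a list of `[page_title, sentence_id]` pairs); this README does not confirm that layout for the files written by test.sh, so treat it as an assumption. `check_prediction` and `check_file` are hypothetical helpers.

```python
import json

# The three FEVER verdict labels.
LABELS = {"SUPPORTS", "REFUTES", "NOT ENOUGH INFO"}


def check_prediction(record):
    """Return a list of problems found in one prediction record (empty if none)."""
    problems = []
    if "id" not in record:
        problems.append("missing id")
    if record.get("predicted_label") not in LABELS:
        problems.append("bad predicted_label")
    evidence = record.get("predicted_evidence")
    if not isinstance(evidence, list):
        problems.append("predicted_evidence is not a list")
    else:
        for item in evidence:
            ok = (
                isinstance(item, list)
                and len(item) == 2
                and isinstance(item[0], str)
                and isinstance(item[1], int)
            )
            if not ok:
                problems.append("bad evidence item: {!r}".format(item))
    return problems


def check_file(path):
    """Map 1-based line numbers to problems for every invalid line of a .jsonl file."""
    report = {}
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            problems = check_prediction(json.loads(line))
            if problems:
                report[number] = problems
    return report
```

An empty dict from `check_file` means every line passed these structural checks; it says nothing about whether the predictions are correct.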
Note: the following commands in dev.sh and test.sh set the version of our DQN-based model. The CHECKPOINT in the script must correspond to the same version.
# context sub-module
export DQN_MODE=transformer
export DQN_MODE=lstm
# aggregation sub-module
export AGGREGATE=transformer
export AGGREGATE=attention
If you use the code, please cite our paper:
@inproceedings{
title={A DQN-based Approach to Finding Precise Evidences for Fact Verification},
author={Wan, Hai and Chen, Haicheng and Du, Jianfeng and Luo, Weilin and Ye, Rongzhen},
booktitle={Proceedings of ACL},
year={2021}
}
If you have questions, suggestions, or bug reports, please email:
wanhai@mail.sysu.edu.cn