This is the official repo for the following paper:
- Towards Robustifying NLI Models Against Lexical Dataset Biases, Xiang Zhou and Mohit Bansal, ACL 2020 (arxiv)
This code requires Python 3.4 and TensorFlow 1.12.0.
All the datasets (train/eval) can be downloaded here. For a detailed description of the datasets, please check the README in the downloaded file.
- Download the datasets and put them under the `data` folder.
- Download the GloVe embeddings and put them under the `data` folder.
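Before training, it can help to confirm that the downloads landed where the scripts expect them. The sketch below is a hypothetical helper (`check_data_dir` is not part of this repo). It only checks that the `data` folder exists and is not empty; the exact file names depend on the downloaded archives, so they are not checked here.

```shell
# Hypothetical preflight helper, not part of this repo.
# Usage: check_data_dir [DIR]   (DIR defaults to "data")
check_data_dir() {
  dir="${1:-data}"
  if [ ! -d "$dir" ]; then
    echo "missing folder: $dir" >&2
    return 1
  fi
  # ls -A lists everything except . and .., so empty output means an empty folder
  if [ -z "$(ls -A "$dir")" ]; then
    echo "folder is empty: $dir" >&2
    return 1
  fi
  echo "ok: $dir contains files"
}
```

Running `check_data_dir` from the repository root returns a non-zero status if the folder is missing or empty.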
- First train the baseline BiLSTM model by running
bash scripts/baseline.sh
- Train the debiased model by running
bash scripts/hex.sh
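The two training steps above can be chained so that the debiased model is only trained once the baseline run has finished without error. This is a minimal sketch: `train_all` is a hypothetical wrapper, not part of this repo, and whether `scripts/hex.sh` actually depends on the baseline's output is not stated here, so the ordering simply follows the steps as listed.

```shell
# Hypothetical wrapper, not part of this repo: runs the two training
# stages in the order listed above and stops at the first failure.
train_all() {
  for script in scripts/baseline.sh scripts/hex.sh; do
    if [ ! -f "$script" ]; then
      echo "not found: $script (run from the repository root)" >&2
      return 1
    fi
    echo "running $script"
    bash "$script" || return 1
  done
}
```

Call `train_all` from the repository root; it returns a non-zero status if either script is missing or exits with an error.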
The HEX implementation is adapted from https://github.com/HaohanWang/HEX.
The evaluation script is `evaluation.py`. Before running evaluation, first change `TESTING_DATASETS` in that file. Then run `python evaluation.py scripts/TRAININGSCRIPT`, where `TRAININGSCRIPT` is the training script you used. This automatically generates and runs the testing scripts that correspond to your training script.
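As a concrete illustration of the evaluation call, the sketch below wraps it in a small function that first checks that the training script exists. `evaluate_model` and the `PYTHON` override are hypothetical additions, not part of this repo, and the sketch assumes `TESTING_DATASETS` in `evaluation.py` has already been edited by hand.

```shell
# Hypothetical wrapper, not part of this repo.
# Usage: evaluate_model scripts/TRAININGSCRIPT
# Assumes TESTING_DATASETS in evaluation.py was already edited by hand.
evaluate_model() {
  training_script="${1:-}"
  if [ ! -f "$training_script" ]; then
    echo "training script not found: $training_script" >&2
    return 1
  fi
  # PYTHON can be overridden to pick a specific interpreter
  "${PYTHON:-python}" evaluation.py "$training_script"
}
```

For example, `evaluate_model scripts/hex.sh` would evaluate the debiased model, assuming `scripts/hex.sh` is the training script that was used.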
More code, model checkpoints, and documentation will be added soon.