This is the state-of-the-art model on the TabFact dataset. We adapt the idea proposed in NumGNN to the encoding of tabular data.
- HuggingFace Transformers 2.6.0
- tensorboardX
- pandas 0.25.1
- AllenNLP 0.9.0
- Apply cross-attention between the table and the statement to obtain the representation
- Construct the greater/less masks for the table's numeric columns
- Use the dense greater/less connections to propagate information across cells
- Obtain the graph representation from the NumGNN
- Finally, use this representation for two-way classification.
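The mask-construction step can be illustrated with a minimal sketch. It assumes a numeric column has already been parsed into a list of floats; the function name `build_comparison_masks` and the list-of-lists mask layout are hypothetical and chosen for illustration, not taken from this repository's code.

```python
from typing import List, Tuple


def build_comparison_masks(values: List[float]) -> Tuple[List[List[int]], List[List[int]]]:
    """Build pairwise greater/less masks for one numeric column (illustrative only).

    greater[i][j] is 1 when values[i] > values[j], else 0.
    less[i][j] is 1 when values[i] < values[j], else 0.
    """
    n = len(values)
    greater = [[int(values[i] > values[j]) for j in range(n)] for i in range(n)]
    less = [[int(values[i] < values[j]) for j in range(n)] for i in range(n)]
    return greater, less


# Hypothetical column with three cells.
column = [3.0, 1.0, 2.0]
greater, less = build_comparison_masks(column)

for row in greater:
    print(row)
```

In a graph network, masks of this kind would typically restrict which cells exchange messages, so that each cell aggregates information only from cells that are greater (or smaller) than it. Equal cells get 0 in both masks.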
Our results are as follows:
| Model | Dev | Test |
|---|---|---|
| TableBERT | 66.1 | 65.1 |
| GNN | 72.1 | 72.2 |
Creating a folder for saving the model
mkdir models
Downloading the pre-trained model from Amazon S3 and linking the all_csv folder from the TabFact dataset
ln -s TABFACT/data/all_csv .
cd models
Load the trained GNN model and reproduce our results:
CUDA_VISIBLE_DEVICES=0 python --model bert-base-multilingual-uncased --do_test --encoding gnn --load_from models/gnn_fp16_numeric/
Retrain your own GNN Model on TabFact:
CUDA_VISIBLE_DEVICES=0 python --model bert-base-multilingual-uncased --do_train --encoding gnn --output_dir models/gnn_fp16_numeric_test --attention cross --lr_default 5e-6