TablERT-CNN: Joint Entity and Relation Extraction Based on Table Labeling Using Convolutional Neural Networks
This is the PyTorch code for the paper 'Joint Entity and Relation Extraction Based on Table Labeling Using Convolutional Neural Networks' accepted at SPNLP2022, an ACL 2022 workshop.
This software is implemented based on SpERT [1] and TablERT [2].
Required
- Python 3.5+ (tested with version 3.7.13)
- PyTorch 1.1.0+ (tested with version 1.11.1)
- transformers 2.2.0+ (tested with version 4.17.0)
- scikit-learn (tested with version 0.21.3)
- tqdm (tested with version 4.64.0)
- numpy (tested with version 1.17.4)
Optional
- jinja2 (tested with version 3.1.1) - if installed, used to export relation extraction examples
- tensorboardX (tested with version 1.6) - if installed, used to save training process to tensorboard
The details are listed in requirements.txt
. It is recommended to prepare the enviroment via the following command to reproduce the results.
pip install -r requirements.txt
We have provided the data template in the folder data/datasets/conll04/
.
We further process the dataset used in SpERT[1]. To obtain the data, first obtain processed SpERT data, then run the pre-processing script as below.
python3 preprocessing/spert2tabcnn.py $SPERT_DATA_DIR data/datasets/conll04/
We have provided the data template in the folder data/datasets/ace05/
.
We further process the dataset used in DyGIE++[5]. To obtain the data, follow the instructions provided by DyGIE++. Then, run the pre-processing script as below.
python3 preprocessing/dygiepp2tabcnn.py $SPERT_DATA_DIR data/datasets/ace05/
We have provided the data template in the folder data/datasets/ade/
.
We further process the dataset used in SpERT[1]. To obtain the data, first obtain processed SpERT data, then run the pre-processing script as below.
python3 preprocessing/spert2tabcnn.py $SPERT_DATA_DIR data/datasets/ade/
Note that the converted dataset has no overlapping entities.
Here we provide examples using the CoNLL04 dataset. Experiments on ACE05 and ADE can be carried out analogously, after putting the pre-processed dataset in corresponding folders.
To train a model on CoNLL04 train set and evaluate on CoNLL04 development set, run
python run.py train --config configs/train_conll04.conf
To evalute a model on CoNLL04 test set, fill in the field model_path
in configs/eval_conll04.conf
with the directory of the model. Then, run
python run.py eval --config config/eval_conll04.conf
If you use the provided code in your work, please cite the following paper:
@inproceedings{Ma:SPNLP2022,
title = "Joint Entity and Relation Extraction Based on Table Labeling Using Convolutional Neural Networks",
author={Youmi Ma and Tatsuya Hiraoka and Naoaki Okazaki},
booktitle = "6th Workshop on Structured Prediction for NLP (SPNLP)",
month = may,
year = "2022",
pages = "(to appear)",
address = "Dublin",
publisher = "Association for Computational Linguistics",
}
[1]Markus Eberts and Adrian Ulges, 2020, 'Span-based joint entity and relation extraction with transformerpre-training' In 24th European Conference on Artifi-cial Intelligence (ECAI).
[2]Youmi Ma, Tatsuya Hiraoka, and Naoaki Okazaki. Named Entity Recognition and Relation Extraction Using Enhanced Table Filling by Contextualized Representations. 自然言語処理, 29(1):187–223, March 2022. (doi: 10.5715/jnlp.29.187).
[3]Dan Roth and Wen-tau Yih, 2004, ‘A Linear Programming Formulation forGlobal Inference in Natural Language Tasks’, in Proc. of CoNLL 2004 at HLT-NAACL 2004, pp. 1–8.
[4]Pankaj Gupta, Hinrich Schütze, and Bernt Andrassy, 2016, ‘Table Filling Multi-Task Recurrent Neural Network for Joint Entity and Relation Extraction’, in Proc. of COLING 2016, pp. 2537–2547.
[5]David Wadden, Ulme Wennberg, Yi Luan, and Hannaneh Hajishirzi. 2019. Entity, relation, and event extraction with contextualized span representations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5784–5789.