A NeuroSymbolic AI technique for extracting relations from documents.

kracr/document-level-relation-extraction

RecLink

Revisiting Document-Level Relation Extraction with Context-Guided Knowledge Graph Link Prediction

Acknowledgments

This repository contains code adapted from the following research papers for the purpose of document-level relation extraction. We extend our gratitude to the authors for generously sharing their clean and valuable code implementations.

1. Discriminative Reasoning for Document-level Relation Extraction

Code implementation can be found here: GitHub - xwjim/DRN

2. Modeling Relational Data with Graph Convolutional Networks

Code implementation can be found here: GitHub - JinheonBaek/RGCN

3. Multi-Hop Knowledge Graph Reasoning with Reward Shaping

Code implementation can be found here: GitHub - kingsaint/InductiveExplainableLinkPrediction

Environment

This code was last tested with:

  • Python 3.7.x
  • PyTorch 1.7.x
  • CUDA (11.0)
  • torch_geometric 1.7.x, with torch_scatter 2.0.6 and torch_sparse 0.6.9

Libraries

  • numpy (1.19.1)
  • matplotlib (3.3.1)
  • torch (1.7.1)
  • transformers (4.1.1)
  • scikit-learn (0.23.2)
  • wandb (0.10.12)
  • tqdm (4.9.0)
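
The pinned versions above can be compared against the active environment before training. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the version strings are copied as-is from the list above. On Python 3.7 the `importlib_metadata` backport would be needed, since `importlib.metadata` only joined the standard library in Python 3.8.

```python
try:
    from importlib import metadata  # standard library from Python 3.8
except ImportError:  # Python 3.7: needs the importlib_metadata backport
    import importlib_metadata as metadata

# Pins copied from the "Libraries" list (PyPI distribution names).
EXPECTED = {
    "numpy": "1.19.1",
    "matplotlib": "3.3.1",
    "torch": "1.7.1",
    "transformers": "4.1.1",
    "scikit-learn": "0.23.2",
    "wandb": "0.10.12",
    "tqdm": "4.9.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison can be exercised without touching the real environment. An exact string match is deliberately strict; nearby patch releases may well work but are untested here.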

Directory structure

The repository contains a folder named "Reclink" with a subdirectory "code". The "code" directory contains the following files and subdirectories:

  • link prediction: Files for link prediction module
  • reasoning_path: Files for generating explanations.
  • Context: Files for creating context.
  • checkpoint/: Directory to store model checkpoints.
  • logs/: Directory to store logs related to the code.
  • models/: Directory to store trained models.
  • config.py: Configuration file for the code.
  • data.py: File containing code related to data processing.
  • test.py: File for testing the code.
  • train.py: File for training the code.
  • utils.py: Utility functions used in the code.

Datasets

This project utilizes the following datasets:

  • DocRED Dataset: The DocRED dataset can be accessed here. Place it in the data/docred directory.

  • Re-DocRED Dataset: The Re-DocRED dataset is available here. Place it in the data/REDocred directory.

  • DWIE Dataset: The DWIE dataset's repository can be found here. Place it in the data/DWIE directory.

    To convert the DWIE dataset into DocRED style, follow the guidelines in this repository: https://github.com/rudongyu/LogiRE

    Please make sure to review the terms and conditions of each dataset before use.
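
For readers unfamiliar with the DocRED-style layout that the other datasets are converted into, the sketch below reads relation triples from one document. It assumes the field names of the public DocRED release (`title`, `sents`, `vertexSet`, `labels`); whether data.py in this repository consumes exactly these fields is an assumption. The function name `iter_triples` and the toy document are illustrative only.

```python
def iter_triples(doc):
    """Yield (head_name, relation_id, tail_name) for each label in a DocRED-style document."""
    entities = doc["vertexSet"]
    # Test splits carry no "labels" key, so default to an empty list.
    for label in doc.get("labels", []):
        head = entities[label["h"]][0]["name"]  # use the first mention as the entity name
        tail = entities[label["t"]][0]["name"]
        yield head, label["r"], tail


# Hand-written toy document, not taken from any dataset.
sample_doc = {
    "title": "Toy example",
    "sents": [["Alice", "was", "born", "in", "Paris", "."]],
    "vertexSet": [
        [{"name": "Alice", "sent_id": 0, "pos": [0, 1], "type": "PER"}],
        [{"name": "Paris", "sent_id": 0, "pos": [4, 5], "type": "LOC"}],
    ],
    "labels": [{"h": 0, "t": 1, "r": "P19", "evidence": [0]}],
}

print(list(iter_triples(sample_doc)))  # prints [('Alice', 'P19', 'Paris')]
```

Each entry in `vertexSet` is a list of mentions of one entity, and `h` and `t` index into that list, which is why the sketch looks up the first mention to get a readable name.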

Training

Follow the steps below to start the training process:

  1. Train the reasoning module: navigate to the code directory and run the training script, replacing gpu_id with the ID of the GPU to use:

    cd code
    ./runBERT.sh gpu_id

  2. Train the link prediction module:

    python3 RGCN/main.py

Testing

Navigate to the code directory and run the evaluation script:

    cd code
    ./evalBERT.sh gpu_id

Explanation

Navigate to the reasoning_path folder and download the respective dataset models from: https://drive.google.com/drive/folders/1j0ArOF9mJDvsYCgLmr022VRb51YxJ9np

Get the explanations using the following commands.

For the DocRED dataset:

    ./experiment-rs.sh configs/DOCRED-rs.sh --inference --save_beam_search_paths

For the DWIE dataset:

    ./experiment-rs.sh configs/DWIE-rs.sh --inference --save_beam_search_paths
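
When explanations are needed for several datasets, the commands above can be assembled programmatically. This is a small sketch rather than a script shipped with the repository: `explanation_command` and `run_explanation` are hypothetical helper names, and the default `workdir` assumes the process starts in the parent of the reasoning_path folder. The script name, config paths, and flags are copied from the commands above.

```python
import subprocess

# Config paths copied from the commands in this section.
CONFIGS = {
    "DocRED": "configs/DOCRED-rs.sh",
    "DWIE": "configs/DWIE-rs.sh",
}


def explanation_command(dataset):
    """Build the inference command that saves beam-search paths for one dataset."""
    if dataset not in CONFIGS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {sorted(CONFIGS)}")
    return ["./experiment-rs.sh", CONFIGS[dataset], "--inference", "--save_beam_search_paths"]


def run_explanation(dataset, workdir="reasoning_path"):
    """Run the command inside workdir; raises CalledProcessError if the script fails."""
    # workdir is an assumption about where this helper is launched from.
    subprocess.run(explanation_command(dataset), cwd=workdir, check=True)
```

Calling `run_explanation("DWIE")` would then run the DWIE command; the downloaded models still have to be in place beforehand.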

Context creation

Navigate to the context directory and select the appropriate datasets and files:

  1. Create REBEL triples using REBEL.py.

  2. Create entity-context triples using entity_context.py (provide the entity file here).

  3. Create context-path triples using context_path.py (provide both entities to calculate the path).

  4. Check that these files align with the dataset using similarity.py, and remove noise.
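
To give a sense of what the noise-removal step might look like, the sketch below keeps only triples whose head and tail resemble some entity name from the dataset. The metric that similarity.py actually uses is not described here, so this example swaps in a plain token-level Jaccard score purely for illustration; the function names, the 0.5 threshold, and the toy triples are all assumptions.

```python
def jaccard(a, b):
    """Token-level Jaccard similarity between two strings (case-insensitive)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def filter_triples(triples, entity_names, threshold=0.5):
    """Keep (head, relation, tail) triples whose head and tail each match a dataset entity."""
    def matches(text):
        return any(jaccard(text, name) >= threshold for name in entity_names)

    return [t for t in triples if matches(t[0]) and matches(t[2])]


# Toy data, hand-written for illustration.
triples = [
    ("Barack Obama", "born in", "Honolulu"),
    ("the weather", "is", "nice"),
]
entity_names = ["Barack Obama", "Honolulu", "Hawaii"]

kept = filter_triples(triples, entity_names)
print(kept)  # prints [('Barack Obama', 'born in', 'Honolulu')]
```

A stricter threshold removes more noise at the cost of dropping triples whose surface form differs from the dataset's entity names, so the value is worth tuning per dataset.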

Citation

Please cite:

@article{Jain_Mutharaju_Kavuluru_Singh_2024,
    title={Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction},
    volume={38},
    url={https://ojs.aaai.org/index.php/AAAI/article/view/29792},
    DOI={10.1609/aaai.v38i16.29792},
    number={16},
    journal={Proceedings of the AAAI Conference on Artificial Intelligence},
    author={Jain, Monika and Mutharaju, Raghava and Kavuluru, Ramakanth and Singh, Kuldeep},
    year={2024},
    month={Mar.},
    pages={18327-18335}
}
    
