Code for our SIGIR 2023 paper. EXPLAIGNN provides a pipeline for conversational question answering (ConvQA) over heterogeneous sources, and code for iterative graph neural networks (GNNs). Such iterative GNNs can help to causally explain GNN outputs.


EXPLAIGNN

Explainable Conversational Question Answering over Heterogeneous Sources via Iterative Graph Neural Networks

Description

This repository contains the code and data for our SIGIR 2023 paper on "Explainable Conversational Question Answering over Heterogeneous Sources via Iterative Graph Neural Networks", and builds upon the CONVINSE code for our SIGIR 2022 paper.

Our new approach, EXPLAIGNN, follows this general pipeline (a rough code sketch follows the list):

  1. Question Understanding (QU) -- creating an intent-explicit structured representation of a question and its conversational context
  2. Evidence Retrieval (ER) -- harnessing this frame-like representation to uniformly capture relevant evidences from different sources
  3. Heterogeneous Answering (HA) -- deriving the answer from this set of evidences from heterogeneous sources
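
As a rough sketch, the three stages can be pictured as functions composed into one pipeline. This is an illustrative Python outline; all names (question_understanding, evidence_retrieval, heterogeneous_answering, run_pipeline) are hypothetical placeholders, not the actual interfaces of this repository:

    # Illustrative pipeline outline; all names here are hypothetical
    # placeholders, not the repo's actual modules.

    def question_understanding(question, history):
        """QU: derive an intent-explicit structured representation (SR)."""
        return {"question": question, "history": list(history)}

    def evidence_retrieval(sr, sources=("kb", "text", "table", "info")):
        """ER: retrieve evidences uniformly from the selected sources."""
        return [f"evidence from {source}" for source in sources]

    def heterogeneous_answering(evidences):
        """HA: derive the answer (and supporting evidences) from the set."""
        return evidences[0] if evidences else None

    def run_pipeline(question, history=()):
        sr = question_understanding(question, history)
        evidences = evidence_retrieval(sr)
        return heterogeneous_answering(evidences)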

The focus of this work is on the answering phase. In this stage, a heterogeneous graph is constructed from the retrieved evidences and corresponding entities, as the basis for applying graph neural networks (GNNs). The GNNs are applied iteratively, computing the best answers and supporting evidences within a small number of steps.
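
Schematically, this iterative answering step amounts to repeated scoring and shrinking of the evidence graph. The following is a minimal sketch in plain Python; score_fn stands in for one GNN inference pass and build_graph is a hypothetical helper, neither being the repo's actual implementation:

    # Minimal sketch of iterative graph shrinking; all names hypothetical.

    def build_graph(evidences, entities):
        # connect each evidence to the entities it mentions
        return {ev: [e for e in entities if e in ev] for ev in evidences}

    def iterative_answering(evidences, entities, score_fn, sizes=(20, 5)):
        graph = build_graph(evidences, entities)
        for k in sizes:  # e.g. shrink 100 -> 20 -> 5 evidences
            ranked = sorted(evidences, key=lambda ev: score_fn(ev, graph), reverse=True)
            evidences = ranked[:k]
            graph = build_graph(evidences, entities)  # smaller graph for next pass
        # the surviving evidences double as the explanation for the answer
        return evidences, graph

Because each iteration operates on a smaller graph, the final pass runs over only a handful of evidences, which is what keeps the supporting evidences human-inspectable.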

Further details can be found on the EXPLAIGNN website and in the corresponding paper pre-print. An interactive demo will also follow soon!

If you use this code, please cite:

@inproceedings{christmann2023explainable,
  title={Explainable Conversational Question Answering over Heterogeneous Sources via Iterative Graph Neural Networks},
  author={Christmann, Philipp and Roy, Rishiraj Saha and Weikum, Gerhard},
  booktitle={SIGIR},
  year={2023}
}

Code

System requirements

All code was tested on Linux only.

Installation

We recommend installation via conda; the corresponding environment file is provided in conda-explaignn.yml. Clone the repo and set up the environment via:

    git clone https://github.com/PhilippChr/EXPLAIGNN.git
    cd EXPLAIGNN/
    conda env create --file conda-explaignn.yml
    conda activate explaignn
    pip install -e .

Alternatively, you can install the requirements via pip, using the requirements.txt file (not tested). In this case, further packages might be required for running the code on a GPU.

Install dependencies

EXPLAIGNN makes use of CLOCQ for retrieving relevant evidences. CLOCQ can be conveniently integrated via the publicly available API, using the client from the repo. If efficiency is a primary concern, it is recommended to directly run the CLOCQ code on the local machine (details are given in the repo).
In either case, it can be installed via:

    make install_clocq

Optional: If you want to use or compare with QuReTeC or FiD, please follow the installation guides in the CONVINSE repo.

To initialize the repo (download data, benchmark, models), run:

    bash scripts/initialize.sh

Reproduce paper results

If you would like to reproduce the results of EXPLAIGNN for all sources (Table 1 in the SIGIR 2023 paper), or a specific source combination, run:

    bash scripts/pipeline.sh --gold-answers config/convmix/explaignn.yml kb_text_table_info

The last parameter (kb_text_table_info) specifies the sources to be used, separated by underscores. For example, "kb_text_info" would evaluate EXPLAIGNN using evidences from KB, text, and infoboxes. Note that EXPLAIGNN retrieves evidences on-the-fly by default; given that the evidences in the information sources can change quickly, the results may deviate slightly from the numbers reported in the paper.
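
For illustration, the underscore-separated source string could be parsed as follows (a hypothetical helper, not part of the repo's CLI):

    # Hypothetical illustration of the sources argument; not repo code.
    def parse_sources(arg):
        valid = {"kb", "text", "table", "info"}
        sources = arg.split("_")
        unknown = [s for s in sources if s not in valid]
        if unknown:
            raise ValueError(f"unknown sources: {unknown}")
        return sources

    print(parse_sources("kb_text_info"))  # ['kb', 'text', 'info']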