GTA_ACLED_TKG

This repository contains the code for the paper Dynamic Representations of Global Crises: A Temporal Knowledge Graph For Conflicts, Trade and Value Networks.

Please cite our paper: Julia Gastinger, Timo Sztyler, Nils Steinert, Sabine Gruender-Fahrer, Michael Martin, Anett Schuelke, Heiner Stuckenschmidt. Dynamic Representations of Global Crises: A Temporal Knowledge Graph For Conflicts, Trade and Value Networks. Proceedings of the Third Learning on Graphs Conference (LoG 2024), PMLR 269, Virtual Event, November 26–29, 2024. Link

Authors: Julia Gastinger (julia.gastinger (at) uni-mannheim.de), Timo Sztyler (timo.sztyler (at) neclab.eu), Nils Steinert, Sabine Gruender-Fahrer, Michael Martin, Anett Schuelke, Heiner Stuckenschmidt

Below we describe the steps needed to reproduce our results. They are split into two parts, 1. Dataset Preprocessing and 2. TKG Forecasting, followed by 3. Result Evaluation. Re-running the Dataset Preprocessing steps is not required: we provide their output in /data/crisis2023, and these files can be used directly for TKG Forecasting.

1. Dataset Preprocessing

SPARQL Queries:

Please see README.md in folder queries

Timestep Assignment:

Run python3 ./data_preprocessing/ts_assignment_gta_star.py and python3 ./data_preprocessing/ts_assignment_acled.py to read the .nt files and assign timesteps.
This requires the rdflib package, which can be downloaded from https://github.com/XuguangSong98/rdflib and must be placed in the data_preprocessing folder. Processing the data with this package is very slow and can take hours to days.
The outputs are CSV files that can be found in /data/gta and /data/acled, respectively.
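To illustrate what assigning timesteps means, here is a minimal sketch that maps a calendar date to a 0-based timestep index. It is not taken from the repository scripts: the function name `assign_timestep`, the `days_per_step` parameter, and the daily granularity are assumptions made for illustration only.

```python
from datetime import date


def assign_timestep(event_date: date, start: date, days_per_step: int = 1) -> int:
    """Map a calendar date to a 0-based timestep index relative to `start`.

    Hypothetical helper: the granularity (one step per day by default)
    is an assumption, not necessarily what the repository scripts use.
    """
    if days_per_step < 1:
        raise ValueError("days_per_step must be >= 1")
    offset = (event_date - start).days
    if offset < 0:
        raise ValueError(f"{event_date} lies before the start date {start}")
    return offset // days_per_step


if __name__ == "__main__":
    start = date(2023, 1, 1)
    print(assign_timestep(date(2023, 1, 1), start))    # 0
    print(assign_timestep(date(2023, 12, 31), start))  # 364
```

With daily steps, the year 2023 would span timesteps 0 to 364 under this assumption.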

Merge the two datasets:

Run python3 ./data_preprocessing/merke-tkg-from-gta-acled.py to merge both subsets and create train.txt, valid.txt, and test.txt.
What it does:
- Restricts the data to a specified time range of interest. In our case this is 2023-01-01 to 2023-12-31.
- Splits the dataset based on timesteps, using a specified train/valid/test split. In our case it is 80/10/10.
- Automatically stores the resulting files in /data/crisis2023.
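The timestep-based split described above can be sketched as follows. This is an illustrative sketch, not the repository's implementation: the function name `split_by_timestep` is hypothetical, and it assumes the 80/10/10 ratio is applied to the sorted list of distinct timesteps (rather than to the number of quadruples), so that all quadruples sharing a timestep end up in the same subset.

```python
def split_by_timestep(quads, train_frac=0.8, valid_frac=0.1):
    """Split quadruples chronologically by their timestep (index 3).

    Assumption: the fractions apply to the number of distinct timesteps,
    and each timestep is assigned to exactly one subset.
    """
    timesteps = sorted({q[3] for q in quads})
    n = len(timesteps)
    train_end = int(n * train_frac)
    valid_end = int(n * (train_frac + valid_frac))

    train_ts = set(timesteps[:train_end])
    valid_ts = set(timesteps[train_end:valid_end])

    train = [q for q in quads if q[3] in train_ts]
    valid = [q for q in quads if q[3] in valid_ts]
    # Everything after the validation window goes to the test set.
    test = [q for q in quads if q[3] not in train_ts and q[3] not in valid_ts]
    return train, valid, test


if __name__ == "__main__":
    # Toy data: (subject_id, relation_id, object_id, timestamp, original_dataset_id)
    toy = [(0, 0, 1, t, 0) for t in range(10)]
    train, valid, test = split_by_timestep(toy)
    print(len(train), len(valid), len(test))
```

Splitting on timesteps instead of on shuffled quadruples keeps the test period strictly after the training period, which is what a forecasting setup needs.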
It produces various files:
- train.txt, valid.txt, test.txt: one line per quadruple, with the fields subject_id, relation_id, object_id, timestamp (from 0 to num_timesteps), original_dataset_id (0: gta, 1: acled)
- train_names.txt, valid_names.txt, test_names.txt: one line per quadruple, with a string description for each node and relation: subject_string, relation_string, object_string, original_dataset_id, timestamp (from 0 to num_timesteps)
- id_to_node.json and id_to_rel.json: contain dicts with mappings from "node_id" to node_string, and from "relation_id" to relation_string.
- node_to_id.json and rel_to_id.json: contain dicts with mappings from node_string to "node_id", and from relation_string to "relation_id".
- stat.txt: two entries, number of nodes, number of distinct relations
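As a rough sketch of how the files listed above might be read, the following helpers load the quadruple files and the id-to-string mappings. The function names are hypothetical, and the code assumes that the fields in train.txt, valid.txt, and test.txt are whitespace-separated integers; adjust the parsing if the actual delimiter differs.

```python
import json
from pathlib import Path


def load_quadruples(path):
    """Read one quadruple per line as a tuple of ints.

    Assumption: fields are separated by whitespace (tabs or spaces).
    """
    rows = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.strip():  # skip empty lines
            rows.append(tuple(int(field) for field in line.split()))
    return rows


def load_id_map(path):
    """Load an id -> string mapping such as id_to_node.json.

    JSON object keys are always strings, so they are converted to int here.
    """
    with open(path, encoding="utf-8") as f:
        return {int(key): value for key, value in json.load(f).items()}


def describe(quad, id_to_node, id_to_rel):
    """Turn the id-based fields of a quadruple into readable strings."""
    subject_id, relation_id, object_id, timestamp = quad[:4]
    return (id_to_node[subject_id], id_to_rel[relation_id],
            id_to_node[object_id], timestamp)
```

For example, `load_quadruples("data/crisis2023/train.txt")` together with `load_id_map("data/crisis2023/id_to_node.json")` would, under the assumptions above, allow printing human-readable triples with their timesteps via `describe`.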

2. TKG Forecasting:

All models for TKG Forecasting are in the folder models. Follow the instructions in the respective README.md.

3. Result Evaluation:

The code for evaluating the TKG Forecasting results is in the folder result_evaluation. Follow the instructions in the respective README.md.
