Merge pull request #574 from KnowledgeCaptureAndDiscovery/dev
Dev
Showing 7 changed files with 298 additions and 11 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
<img src="img/combine.png" style="zoom:100%;" /> | ||
|
||
<p align="center"><a href="https://github.com/THUDM/SelfKG/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/github/license/THUDM/SelfKG" /></a></p> | ||
|
||
# SelfKG: Self-Supervised Entity Alignment in Knowledge Graphs | ||
|
||
Original implementation for paper SelfKG: Self-Supervised Entity Alignment in Knowledge Graphs. | ||
|
||
This paper was accepted and **nominated as a best paper** by [The Web Conference 2022](https://www2022.thewebconf.org/)! :satisfied: | ||
|
||
SelfKG is the **first** **self-supervised** entity alignment method **without label supervision**, and it **matches or comes close to state-of-the-art supervised baselines**. The performance of SelfKG suggests that self-supervised learning offers great potential for entity alignment in Knowledge Graphs. | ||
|
||
[SelfKG: Self-Supervised Entity Alignment in Knowledge Graphs](https://arxiv.org/abs/2203.01044) | ||
|
||
https://doi.org/10.1145/3485447.3511945 | ||
|
||
- [Installation](#installation) | ||
- [Requirements](#requirements) | ||
- [Quick Start](#quick-start) | ||
- [Data Preparation](#data-preparation) | ||
- :star:[Run Experiments](#run-experiments) | ||
- [❗ Common Issues](#-common-issues) | ||
- [Citing SelfKG](#citing-selfkg) | ||
|
||
## Installation | ||
|
||
### Requirements | ||
|
||
```txt | ||
torch==1.9.0 | ||
faiss-cpu==1.7.1 | ||
numpy==1.19.2 | ||
pandas==1.0.5 | ||
tqdm==4.61.1 | ||
transformers==4.8.2 | ||
torchtext==0.10.0 | ||
``` | ||
|
||
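To check whether an existing environment already matches these pins, something along the following lines may help. This is a minimal sketch and not part of the repository: the helper names `installed_version` and `check_pins` are hypothetical, and only `importlib.metadata` from the standard library (Python 3.8+) is used. Local build suffixes such as `+cu111` are ignored when comparing, which is an assumption about what counts as a match.

```python
from importlib import metadata

# Version pins copied from the requirements list above.
PINS = {
    "torch": "1.9.0",
    "faiss-cpu": "1.7.1",
    "numpy": "1.19.2",
    "pandas": "1.0.5",
    "tqdm": "4.61.1",
    "transformers": "4.8.2",
    "torchtext": "0.10.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every package that misses its pin."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Drop a local build suffix such as "+cu111" before comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script prints one line per package that is missing or has a different version, and prints nothing when everything matches.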
You can use [`setup.sh`](https://github.com/THUDM/SelfKG/blob/main/setup.sh) to set up your Anaconda environment by running: | ||
|
||
```bash | ||
bash setup.sh | ||
``` | ||
|
||
|
||
|
||
## Quick Start | ||
|
||
### Data Preparation | ||
|
||
You can download our data from [here](https://zenodo.org/record/6326870#.YiI2K6tBxPY). The final structure of the project should be: | ||
|
||
```bash | ||
├── data | ||
│   ├── DBP15K | ||
│   │   ├── fr_en | ||
│   │   ├── ja_en | ||
│   │   └── zh_en | ||
│   ├── DWY100K | ||
│   │   ├── dbp_wd | ||
│   │   └── dbp_yg | ||
│   ├── LaBSE | ||
│   │   ├── bert_config.json | ||
│   │   ├── bert_model.ckpt.index | ||
│   │   ├── checkpoint | ||
│   │   ├── config.json | ||
│   │   ├── pytorch_model.bin | ||
│   │   └── vocab.txt | ||
│   └── getdata.sh | ||
├── loader | ||
├── model | ||
├── run.sh # Please use this script to run the experiments! | ||
├── run_DWY_LaBSE_neighbor.py # SelfKG on DWY100k | ||
├── run_LaBSE_neighbor.py # SelfKG on DBP15k | ||
... # run_LaBSE_*.py # Ablation code will be available soon | ||
├── script | ||
│   └── preprocess | ||
├── settings.py | ||
└── setup.sh # Can be used to set up your Anaconda environment | ||
``` | ||
|
||
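Before launching an experiment, it can be useful to confirm that the downloaded data matches the layout above. The following is a minimal sketch under stated assumptions, not code from the repository: `EXPECTED` and `missing_paths` are hypothetical names, and the list covers only a subset of the entries shown in the tree.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the project root.
EXPECTED = [
    "data/DBP15K/fr_en",
    "data/DBP15K/ja_en",
    "data/DBP15K/zh_en",
    "data/DWY100K/dbp_wd",
    "data/DWY100K/dbp_yg",
    "data/LaBSE/config.json",
    "data/LaBSE/pytorch_model.bin",
    "data/LaBSE/vocab.txt",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("Data layout looks complete.")
```

Run it from the project root. Any path it reports is a likely cause of a later "file not found" error.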
You can also use the following scripts to download the datasets directly: | ||
|
||
```bash | ||
cd data | ||
bash getdata.sh # Download speed depends on your network connection. If it is too slow, download the datasets directly from the website mentioned above. | ||
``` | ||
|
||
### :star:Run Experiments | ||
|
||
**Please use** | ||
|
||
**`bash run.sh`** | ||
|
||
to reproduce our experimental results. For more details, please refer to [`run.sh`](https://github.com/THUDM/SelfKG/blob/main/run.sh) and our code. | ||
|
||
## ❗ Common Issues | ||
|
||
<details> | ||
<summary> | ||
"XXX file not found" | ||
</summary> | ||
<br/> | ||
Please make sure you have downloaded all the datasets as described in the Data Preparation section of this README. | ||
</details> | ||
|
||
|
||
to be continued ... | ||
|
||
|
||
## Citing SelfKG | ||
|
||
If you use SelfKG in your research or wish to refer to the baseline results, please use the following BibTeX. | ||
|
||
``` | ||
@article{DBLP:journals/corr/abs-2203-01044, | ||
author = {Xiao Liu and | ||
Haoyun Hong and | ||
Xinghao Wang and | ||
Zeyi Chen and | ||
Evgeny Kharlamov and | ||
Yuxiao Dong and | ||
Jie Tang}, | ||
title = {SelfKG: Self-Supervised Entity Alignment in Knowledge Graphs}, | ||
journal = {CoRR}, | ||
volume = {abs/2203.01044}, | ||
year = {2022}, | ||
url = {https://arxiv.org/abs/2203.01044}, | ||
eprinttype = {arXiv}, | ||
eprint = {2203.01044}, | ||
timestamp = {Mon, 07 Mar 2022 16:29:57 +0100}, | ||
biburl = {https://dblp.org/rec/journals/corr/abs-2203-01044.bib}, | ||
bibsource = {dblp computer science bibliography, https://dblp.org} | ||
} | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
# MultiDepth | ||
|
||
Source code for MultiDepth, our single-image depth estimation method based on joint regression and classification in a multi-task setup. | ||
This work was presented at the IEEE Intelligent Transportation Systems Conference (ITSC) 2019. | ||
|
||
If you make use of our code or approach, please consider citing [our paper](https://arxiv.org/abs/1907.11111) as: | ||
|
||
@InProceedings{, | ||
author = {Lukas Liebel and Marco K\"orner}, | ||
title = {{MultiDepth}: Single-Image Depth Estimation via Multi-Task Regression and Classification}, | ||
booktitle = {IEEE Intelligent Transportation Systems Conference (ITSC)}, | ||
year = {2019} | ||
} | ||
|
||
Check out the [KITTI leaderboard](http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction) for exemplary results I got using this concept. | ||
|
||
> I'm confident that you should be able to achieve better results with more training and some minor tweaks. | ||
|
||
This implementation is heavily based on [pytorch-semseg](https://github.com/meetshah1995/pytorch-semseg), a brilliant project maintained by Meet Pragnesh Shah (released under [MIT license](https://github.com/meetshah1995/pytorch-semseg/blob/master/LICENSE)). | ||
Please check out and contribute to their project and feel free to ignore certain parts in my code that are just unused parts of the *ptsemseg* codebase. | ||
|
||
|
||
## Step-by-step Instructions | ||
|
||
### A Word of Warning | ||
|
||
I originally wrote this code for a different project. | ||
While most of the unnecessary (and some of the confusing) pieces have already been removed, it might still contain some cryptic lines. | ||
Just ignore them and you should be fine ;) | ||
|
||
> *Sorry for the mess :D If you are/were/know a PhD student you know the drill...* | ||
|
||
### Docker Container | ||
|
||
This repository comes with a Dockerfile allowing you to build an image that can be used to optionally run training inside a container. | ||
Just skip the respective steps in the following instructions if you do not wish to use docker. | ||
|
||
> Please note that I highly recommend using docker and never tried to run the provided code outside of a container. | ||
|
||
0. *(optional)* Adjust the [Dockerfile](docker/Dockerfile) if needed (e.g., add helpful utils, such as tmux, htop, etc.). | ||
> To change this later just stop running containers, re-build the image and restart the container. | ||
|
||
1. Go to the [docker dir](docker). | ||
Build the MultiDepth docker image by running the respective [script](docker/build_image.sh): `./build_image.sh` | ||
|
||
2. Adjust the mount parameters of your container in the provided [script](docker/start_container.sh), such that the directories containing your training data are mounted to `/root/data/kitti/rgb` and `/root/data/kitti/depth`. | ||
>Feel free to change this if you want to use a different dir tree. | ||
Keep in mind that it will be necessary to adjust the paths in other places accordingly. | ||
|
||
You can also mount an external directory to `/root/logs` in order to save tensorboard logs and checkpoints outside of the container. | ||
|
||
3. Start your container by running the [script](docker/start_container.sh): `./start_container.sh` | ||
|
||
4. Connect to the running container, e.g., by running `docker exec -it multidepth bash` or by simply calling the provided minimal [script](docker/connect_to_container.sh): `./connect_to_container.sh` | ||
|
||
5. To stop the container simply disconnect from the container (e.g., by pressing [Ctrl] + [D]) and kill it: `docker kill multidepth`. | ||
|
||
> If you are familiar with docker, you probably know better ways of starting and stopping containers as well as running scripts within them :) | ||
|
||
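The mounts described in step 2 can be sketched as a `docker run` invocation assembled in Python. This is an illustration only: the contents of `start_container.sh` are not shown here, so the image name `multidepth`, the helper `docker_run_command`, and the host paths are assumptions. The container name `multidepth` matches the one used in step 4. Flags for GPU access depend on your Docker setup and are left out.

```python
import shlex


def docker_run_command(rgb_dir, depth_dir, log_dir=None,
                       image="multidepth", name="multidepth"):
    """Build a `docker run` argument list with the mounts described above.

    The image name is an assumption; the container name is the one used by
    `docker exec -it multidepth bash`.
    """
    cmd = [
        "docker", "run", "--rm", "-d", "-it", "--name", name,
        "-v", f"{rgb_dir}:/root/data/kitti/rgb",
        "-v", f"{depth_dir}:/root/data/kitti/depth",
    ]
    if log_dir is not None:
        # Optional mount that keeps logs and checkpoints outside the container.
        cmd += ["-v", f"{log_dir}:/root/logs"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Hypothetical host paths; replace them with your own directories.
    example = docker_run_command("/data/kitti/rgb", "/data/kitti/depth", "/data/logs")
    print(shlex.join(example))
```

The function only builds and prints the command, so you can inspect it before passing the list to `subprocess.run`.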
|
||
### Set Training Parameters | ||
|
||
You can adjust training behavior and numerous other options using a [YAML configuration file](configs/example_config.yml). | ||
Most of the parameters in the example configuration should be self-explanatory, and they are already set to useful values. | ||
|
||
> I might add a more detailed explanation in the future. | ||
Until then, feel free to message me if you have trouble understanding their effect and I will update this section accordingly. | ||
|
||
|
||
### Run Training | ||
|
||
Run the [main training script](train.py), which expects a single parameter `--config` specifying the path to a configuration file, e.g.: `python train.py --config configs/example_config.yml`. | ||
|
||
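For readers unfamiliar with this pattern, the sketch below shows how a script might accept a single `--config` option and read a YAML file. It is not the code in `train.py`: the function names `parse_args` and `load_config` are hypothetical, and PyYAML is an assumed dependency.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the single `--config` option described above."""
    parser = argparse.ArgumentParser(description="Training entry point (illustrative)")
    parser.add_argument("--config", type=Path, required=True,
                        help="path to a YAML configuration file")
    return parser.parse_args(argv)


def load_config(path):
    """Read the YAML file into a Python object. Requires PyYAML (assumed)."""
    import yaml  # imported lazily so argument parsing works without PyYAML
    with open(path) as fh:
        return yaml.safe_load(fh)
```

A script built this way would call `parse_args()` at startup and pass `args.config` to `load_config`. Omitting `--config` makes `argparse` exit with a usage message.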
> Note that it might take a while for the actual training process to start depending on the size of your dataset. | ||
|
||
|
||
### Visualize Training Progress | ||
|
||
The training script will write Tensorboard logs to the directory specified in the [config file](configs/example_config.yml). | ||
Display the results by starting Tensorboard and directing it to the respective log dir. | ||
|
||
You could do this by starting another docker container with TensorFlow: `docker run --rm -it -p 6006:6006 -v ~/path/to/my/logs:/root/logs tensorflow/tensorflow` | ||
|
||
Make sure to mount the correct data dir and map a different port if necessary (6006 is Tensorboard's standard port). | ||
This will allow you to access the web interface of Tensorboard running on a server from your local machine. | ||
|
||
> This works for me in certain settings, but your mileage may vary depending on your network configuration! | ||
|
||
Start tensorboard: `tensorboard --logdir /root/logs` and navigate to [your server's ip/localhost]:6006 to access the web-interface in your favorite web browser. | ||
|
||
|
||
### Evaluate Results | ||
|
||
Validation is carried out periodically during training, at the interval set in your [config file](configs/example_config.yml). | ||
|
||
|
||
## Hardware Requirements | ||
|
||
Even though a CUDA-capable GPU is not strictly required to run this training script, it is highly recommended for obvious reasons. | ||
Adjust the batch size if you run out of memory. | ||
Successfully tested on GeForce GTX 1080 and GTX 1080 Ti GPUs. | ||
**Multi-GPU training with batch-splitting will be used if you provide multiple GPUs!** | ||
|
||
|
||
## Contribute | ||
|
||
If you encounter any errors or unexpected behavior feel free to message me. | ||
You are also welcome to file pull requests if you want to help to improve or fix any part of this. | ||
|
||
**Thank you!** |