Code for the PASSLEAF paper.
The implementation is a fork of UKGE.
- python3 (>=3.6)
- tensorflow-gpu (==1.14)
For other dependencies, please see requirements.txt. (Some of them may be UNNECESSARY!)
The base datasets are from UKGE.
Datasets for PASSLEAF models:
- PPI5K (ppi5k_no_psl)
- NL27K (nl27k_no_psl)
- CN15K (cn15k_no_psl)
The datasets are located in the data folder.
In the following script examples, we use a batch size of 512, embedding dimension of 512, and save the checkpoint every 40 epochs.
- ComplEx_m5_4: Uncertain ComplEx
- RotatE_m5: Uncertain RotatE
- RotatE_m3_3: Simplified Uncertain RotatE
- UKGE_logi_m2: UKGE + pool-based semi-supervised learning
Scripts:
- Train:
  python3 run/run.py --data [dataset] --batch_size 512 --epoch 2000 -d 512 --no_psl --models_dir [base path to save model] -m [model name] --semisupervised_v2 --save_freq 40
- Test:
  python3 run/test.py --data [dataset] --batch_size 512 -d 512 --no_psl --resume_model_path [saved model directory path] -m [model name] --start [starting epoch] --to [ending epoch] --step 40
- Example:
  python3 run/run.py --data cn15k_no_psl --batch_size 512 --epoch 3000 -d 512 --no_psl --models_dir ./trained_model_batch512_dim_512_semisupervised_v2 -m ComplEx_m5_4 --semisupervised_v2 --save_freq 40
  python3 run/test.py --data cn15k_no_psl --batch_size 512 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/cn15k_no_psl/ComplEx_m5_4_0930 -m ComplEx_m5_4 --start 10 --to 3000 --step 20
* The name of the saved model directory varies according to the current date.
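When running many dataset and model combinations, it can help to assemble the commands above programmatically. The following is a minimal sketch, not part of the repository: the helper names (`build_train_cmd`, `build_test_cmd`, `to_shell`) are hypothetical, and the flags are copied from the commands in this README rather than checked against the scripts' argument parsers.

```python
import shlex
from typing import List


def build_train_cmd(dataset: str, model: str, models_dir: str,
                    epochs: int = 2000, batch_size: int = 512, dim: int = 512,
                    save_freq: int = 40, semisupervised: bool = True) -> List[str]:
    """Assemble the run/run.py training command as an argv list.

    Flags mirror the README command; `semisupervised` toggles
    the --semisupervised_v2 flag.
    """
    cmd = ["python3", "run/run.py",
           "--data", dataset,
           "--batch_size", str(batch_size),
           "--epoch", str(epochs),
           "-d", str(dim),
           "--no_psl",
           "--models_dir", models_dir,
           "-m", model]
    if semisupervised:
        cmd.append("--semisupervised_v2")
    cmd += ["--save_freq", str(save_freq)]
    return cmd


def build_test_cmd(dataset: str, model: str, resume_model_path: str,
                   start: int, to: int, step: int,
                   batch_size: int = 512, dim: int = 512) -> List[str]:
    """Assemble the run/test.py evaluation command as an argv list."""
    return ["python3", "run/test.py",
            "--data", dataset,
            "--batch_size", str(batch_size),
            "-d", str(dim),
            "--no_psl",
            "--resume_model_path", resume_model_path,
            "-m", model,
            "--start", str(start),
            "--to", str(to),
            "--step", str(step)]


def to_shell(cmd: List[str]) -> str:
    """Render an argv list as a single shell-quoted command line."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

For example, `to_shell(build_train_cmd("cn15k_no_psl", "ComplEx_m5_4", "./trained_model_batch512_dim_512_semisupervised_v2", epochs=3000))` reproduces the training command in the example above. The argv lists can also be passed directly to `subprocess.run`.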
- ComplEx_m5_1: Uncertain ComplEx (no SS)
- RotatE_m5_1: Uncertain RotatE (no SS)
- RotatE_m3_1: Simplified Uncertain RotatE (no SS)
- logi: UKGE
Scripts:
- Train: same as above but without the --semisupervised_v2 flag:
  python3 run/run.py --data [dataset] --batch_size 512 --epoch 2000 -d 512 --no_psl --models_dir [base path to save model] -m [model name] --save_freq 40
- Test: same as above.
TODO. Not available in the repo yet
The PASSLEAF paper. Chen, Z.-M., Yeh, M.-Y., & Kuo, T.-W. (2021). PASSLEAF: A Pool-bAsed Semi-Supervised LEArning Framework for Uncertain Knowledge Graph Embedding. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4019-4026.
@article{Chen_Yeh_Kuo_2021,
title={PASSLEAF: A Pool-bAsed Semi-Supervised LEArning Framework for Uncertain Knowledge Graph Embedding},
volume={35},
url={https://ojs.aaai.org/index.php/AAAI/article/view/16522},
number={5},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
author={Chen, Zhu-Mu and Yeh, Mi-Yen and Kuo, Tei-Wei},
year={2021},
month={May},
pages={4019-4026}
}
By default, run/test.py generates the following files based on saved models per training step (checkpoints):
Validation:
- val%s_mean_rank_accurate.csv: mean rank (TEP) and hit@K (TEP) for each specified training step.
- val_loss_accurate.csv: MSE (CSP) and nDCG (TEP) for each specified training step.
- val_detail_[epoch].csv: detailed predictions. One file per training step.
Testing: for each best checkpoint according to the validation MSE, nDCG (linear), and Hit@20:
- test_mean_rank_accurate.csv: mean rank (TEP) and hit@K (TEP) on the testing set.
- test_loss_accurate.csv: MSE (CSP) and nDCG (TEP) on the testing set.
- test_test_only_detail_[epoch].csv: detailed predictions on the testing set.
- test_mean_rank_training_included.csv: mean rank (TEP) and hit@K (TEP) on the testing set. (training set candidates INCLUDED)
- test_loss_training_included.csv: MSE (CSP) and nDCG (TEP) on the testing set. (training set candidates INCLUDED)
- test_detail_[epoch]_training_included.csv: detailed predictions on the testing set. (training set candidates INCLUDED)
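To inspect the validation results by hand, the per-checkpoint rows of a file such as val_loss_accurate.csv can be scanned for the best value of one metric. The sketch below is illustrative only: the function name `best_checkpoint` is hypothetical, and it assumes the CSV has a header row. The actual column names are not documented in this README, so they are passed in as arguments and should be checked against the generated file.

```python
import csv
from typing import Dict


def best_checkpoint(csv_path: str, metric_col: str,
                    higher_is_better: bool = False) -> Dict[str, str]:
    """Return the CSV row with the best value in `metric_col`.

    Assumes the file has a header row (an assumption, not verified
    against the files written by run/test.py). Use
    higher_is_better=False for an error such as MSE and True for a
    score such as nDCG or hit@K.
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        raise ValueError("no rows found in " + csv_path)

    def metric(row: Dict[str, str]) -> float:
        return float(row[metric_col])

    return max(rows, key=metric) if higher_is_better else min(rows, key=metric)
```

The returned row is a dictionary keyed by the header names, so the epoch of the selected checkpoint can be read from whichever column the file uses for it.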
To automate the analysis, use the autorecord_openke+ukge.py script:
python3 autorecord_openke+ukge.py > ./records.csv
Please see the script for details about the options.
Pretrained model files are currently missing. We are working on restoring them.
python3 run/test.py --data ppi5k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/ppi5k_no_psl/RotatE_m3_3_0306 -m RotatE_m3_3 --start 10 --to 2000 --step 40
python3 run/test.py --data nl27k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/nl27k_no_psl/RotatE_m3_3_0304 -m RotatE_m3_3 --start 10 --to 2000 --step 40
python3 run/test.py --data cn15k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/cn15k_no_psl/RotatE_m3_3_0306 -m RotatE_m3_3 --start 10 --to 3000 --step 40
python3 run/test.py --data ppi5k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/ppi5k_no_psl/UKGE_logi_m2_0306 -m UKGE_logi_m2 --start 10 --to 2000 --step 40
python3 run/test.py --data nl27k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/nl27k_no_psl/UKGE_logi_m2_0301 -m UKGE_logi_m2 --start 10 --to 2000 --step 40
python3 run/test.py --data cn15k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/cn15k_no_psl/UKGE_logi_m2_0301 -m UKGE_logi_m2 --start 10 --to 3000 --step 20
python3 run/test.py --data ppi5k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/ppi5k_no_psl/ComplEx_m5_4_0306 -m ComplEx_m5_4 --start 10 --to 2000 --step 40
python3 run/test.py --data nl27k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/nl27k_no_psl/ComplEx_m5_4_0301 -m ComplEx_m5_4 --start 10 --to 2000 --step 40
python3 run/test.py --data cn15k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512_semisupervised_v2/cn15k_no_psl/ComplEx_m5_4_0301 -m ComplEx_m5_4 --start 10 --to 3000 --step 20
python3 run/test.py --data ppi5k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/ppi5k_no_psl/RotatE_m3_1_0116 -m RotatE_m3_1 --start 10 --to 2000 --step 40
python3 run/test.py --data nl27k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/nl27k_no_psl/RotatE_m3_1_0116 -m RotatE_m3_1 --start 10 --to 2000 --step 40
python3 run/test.py --data cn15k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/cn15k_no_psl/RotatE_m3_1_0116 -m RotatE_m3_1 --start 10 --to 2000 --step 40
python3 run/test.py --data cn15k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/cn15k_no_psl/ComplEx_m5_1_0117 -m ComplEx_m5_1 --start 10 --to 2000 --step 40
python3 run/test.py --data nl27k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/nl27k_no_psl/ComplEx_m5_1_0117 -m ComplEx_m5_1 --start 10 --to 2000 --step 40
python3 run/test.py --data ppi5k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/ppi5k_no_psl/ComplEx_m5_1_0117 -m ComplEx_m5_1 --start 10 --to 2000 --step 40
python3 run/test.py --data cn15k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/cn15k_no_psl/logi_0117 -m logi --start 10 --to 2000 --step 40
python3 run/test.py --data nl27k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/nl27k_no_psl/logi_0117 -m logi --start 10 --to 2000 --step 40
python3 run/test.py --data ppi5k_no_psl --batch_size 512 --epoch 400 -d 512 --no_psl --resume_model_path ./trained_model_batch512_dim_512/ppi5k_no_psl/logi_0117 -m logi --start 10 --to 2000 --step 40
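The directory names in the commands above end in a suffix such as _0117 or _0306 that depends on when the model was trained, so the paths will differ on other runs. A small lookup like the one below can resolve them. This is a sketch under stated assumptions: the helper `find_model_dirs` is hypothetical, and it assumes the layout [models_dir]/[dataset]/[model name]_[four digits] seen in the paths above.

```python
import os
import re
from typing import List


def find_model_dirs(base_dir: str, dataset: str, model: str) -> List[str]:
    """List directories named <model>_<4 digits> under base_dir/dataset.

    The four-digit suffix is assumed from the paths in this README.
    A full match is required so that "RotatE_m5" does not also pick
    up "RotatE_m5_1_0116".
    """
    parent = os.path.join(base_dir, dataset)
    if not os.path.isdir(parent):
        return []
    pattern = re.compile(re.escape(model) + r"_\d{4}")
    return sorted(
        os.path.join(parent, name)
        for name in os.listdir(parent)
        if pattern.fullmatch(name) and os.path.isdir(os.path.join(parent, name))
    )
```

Each returned path can be used as the value of --resume_model_path. If a model was trained more than once, several directories are returned and the caller has to choose among them.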