PyTorch implementation of the paper "Less Can be More: Unsupervised Graph Pruning for Large-scale Dynamic Graphs".
- torch == 1.8.1
- pytorch-lightning == 1.6.4
- torch_scatter == 2.0.8
- scikit-learn == 1.0.2
- scipy == 1.7.3
Create a folder 'dataset' to store data file.
We use the data processing method of the reference TGAT, repo.
We use the dense npy format to save the features in binary format. If edge features or nodes features are absent, it will be replaced by a vector of zeros.
python build_dataset_graph.py --data wikipedia --bipartite
python build_dataset_graph.py --data reddit --bipartite
Training the Graph pruning network based on an unsupervised task.
python train_gsn.py --data_set wikipedia --prior_ratio 0.5 --learning_rate 1e-3
Pruning the edge data in the database inductively according to the trained Graph pruning network above.
python edge_pruning.py --data_set wikipedia --output_edge_txt ./result/edge_pred.txt --ckpt_file ./lightning_logs_gsn/lightning_logs/version_0/checkpoints/epoch=10.ckpt
Using a gnn to evaluate the performance of graph pruning.(this requires a trained gnn model from the supervised task, eg. runing the following commands on dynamic node classification).
python train_gnn.py --mode origin --data_set wikipedia
python eval_gnn.py --data_set wikipedia --mode gsn --pruning_ratio 0.5 --mask_edge --output_edge_txt ./result/edge_pred.txt --ckpt_file ./lightning_logs_gnn/lightning_logs/version_0/checkpoints/epoch=10.ckpt