mrhhyu/Graph-Embedding_vs_Linkbased-Measures

Graph Embedding Methods vs Link-based Similarity Measures in Task of Similarity Computation of Nodes in Graphs

This repository provides:

  1. Python implementations of the following similarity measures:
  2. Datasets:
  • BlogCatalog
  • Cora
  • Wikipedia

The following packages are required:

Python       >= 3.8
networkx     =2.6.*
numpy        =1.21.*
scikit-learn =1.0.*

Notes

  1. All code is implemented in Python 3.7 using Eclipse PyDev.
  2. The code can easily be migrated to other Python IDEs, and with small changes it can also be run from the command line.
  3. The implementations of link-based similarity measures are based on their matrix forms, which are significantly faster than their component forms.
  4. The provided codes for link-based similarity measures can be applied to both directed and undirected graphs.
  5. The Cosine implementation is based on a matrix/vector multiplication technique, which is significantly faster than its conventional implementation.
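
Notes 3 and 5 can be illustrated with a minimal NumPy sketch. It is not the repository's own code: the function names `cosine_similarity_matrix` and `simrank_matrix`, the decay factor `C`, and the iteration count are assumptions made for illustration, and SimRank is used only as a familiar example of a link-based measure that has a matrix form.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """All-pairs cosine similarity between the rows of X via one matrix product."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Rows with zero norm stay as zero vectors instead of causing a division by zero.
    X_unit = np.divide(X, norms, out=np.zeros_like(X), where=norms > 0)
    return X_unit @ X_unit.T


def simrank_matrix(A, C=0.8, iterations=10):
    """Matrix-form SimRank sketch (hypothetical helper, not the repository's API).

    A[i, j] = 1 means there is an edge i -> j, so column j lists the
    in-neighbors of node j. C and iterations are illustrative defaults.
    """
    A = np.asarray(A, dtype=float)
    in_degree = A.sum(axis=0, keepdims=True)
    # Column-normalize so that W[i, j] = 1 / |I(j)| for every in-neighbor i of j.
    W = np.divide(A, in_degree, out=np.zeros_like(A), where=in_degree > 0)
    S = np.eye(A.shape[0])
    for _ in range(iterations):
        # One matrix update replaces the nested loops of the component form.
        S = C * (W.T @ S @ W)
        np.fill_diagonal(S, 1.0)  # every node is maximally similar to itself
    return S
```

Both functions accept the adjacency matrix of a directed or an undirected graph; for an undirected graph the matrix is simply symmetric.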

Datasets and Graph Structure:

  1. Each dataset has a “ground_truth” folder containing one text file per label; each line of a file is the id of a node that carries that label.
  2. A graph is stored as a text file in edge list format: each line corresponds to one edge, the two node ids are separated by a tab, and node indices start from 0.
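
The file layout described above could be read with a short standard-library sketch such as the following. The helper names `load_edge_list` and `load_ground_truth` are hypothetical, and using each file name (without its extension) as the label is an assumption, since the naming scheme of the label files is not stated here.

```python
import os


def load_edge_list(path):
    """Read a tab-separated edge list; returns (edges, num_nodes)."""
    edges = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            fields = line.split("\t")
            edges.append((int(fields[0]), int(fields[1])))
    # Node ids start from 0, so the node count is the largest id plus one.
    # Isolated nodes with larger ids would not be counted by this estimate.
    num_nodes = 1 + max(max(edge) for edge in edges) if edges else 0
    return edges, num_nodes


def load_ground_truth(folder):
    """Map each label to the set of node ids listed in its text file."""
    labels = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        label = os.path.splitext(name)[0]  # assumption: file name encodes the label
        with open(path, encoding="utf-8") as handle:
            labels[label] = {int(line) for line in handle if line.strip()}
    return labels
```

The returned edge list can then be turned into an adjacency matrix or passed to a graph library of choice.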

Citing:

If you find the provided source code and datasets useful for your research, please consider citing the following paper:

Hamedani, M.R.; Kim, S-W. On Investigating Both Effectiveness and Efficiency of Embedding Methods in Task of Similarity Computation of Nodes in Graphs. Applied Sciences. 2021, 11, 162. DOI: https://dx.doi.org/10.3390/app11010162