Skip to content

A python wrapper for the ParSeMiS project (Parallel and Sequential Graph Mining Suite)

License

Notifications You must be signed in to change notification settings

ArkasDev/parsemis-python-wrapper

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ParSeMiS Python Wrapper

This script is a wrapper for the frequent subgraph mining library ParSeMiS.

The original project can be found: https://www2.informatik.uni-erlangen.de/EN/research/zold/ParSeMiS/index.html

The source code for the java lib can be found: https://github.com/tomkdickinson/parsemis_wrapper

Instructions

The script includes a class called Parsemis, which you can pass the same parameters too if you were using ParSeMiS on the command line. The input for the graph method are graphs built using NetworkX.

Requriements

  • NetworkX

We use NetworkX due to it's large number of input formats that it can read and write to.

See: http://networkx.readthedocs.io/en/networkx-1.11/reference/readwrite.html

Example

An example of this wrapper in action can be found in example.py, with a sample dataset in example_dataset/.

The dataset was collected from IMDB and contains all classic episodes of Doctor Who (every file is an episode), with the characters who appeared in that episode.

Running the example will generate a list of the frequent subgraphs in the dataset, including cooccuring characters.

The example is ran with CloseGraph to remove any subgraphs of a subgraph that has the same frequency.

The top 20 results are below. As can be seen, the most frequent nodes are different aliases for Dr Who, followed by groups of companions of the Doctor.

count Nodes\Edges
681 ['episode']
297 [('episode', 'Dr._Who')]
254 [('episode', 'Doctor_Who')]
131 [('episode', 'The_Doctor')]
83 [('episode', 'Sarah_Jane_Smith')]
81 [('Sarah_Jane_Smith', 'episode'), ('episode', 'Doctor_Who')]
78 [('Jo_Grant', 'episode')]
77 [('Ian_Chesterton', 'episode'), ('episode', 'Dr._Who')]
77 [('episode', 'Brigadier_Lethbridge_Stewart')]
76 [('Barbara_Wright', 'episode'), ('Ian_Chesterton', 'episode'), ('episode', 'Dr._Who')]
72 [('Guard', 'episode')]
69 [('episode', 'Romana')]
68 [('episode', 'Jamie_McCrimmon')]
67 [('episode', 'Romana'), ('episode', 'Doctor_Who')]
66 [('episode', 'Dr._Who'), ('episode', 'Jamie_McCrimmon')]
65 [('episode', 'Tegan')]
61 [('episode', 'Tegan'), ('episode', 'The_Doctor')]
54 [('episode', 'Doctor_Who'), ('episode', 'Brigadier_Lethbridge_Stewart')]
52 [('Susan_Foreman', 'episode')]
51 [('Barbara_Wright', 'episode'), ('Susan_Foreman', 'episode'), ('episode', 'Dr._Who'), ('episode', 'Ian_Chesterton')]

About

A python wrapper for the ParSeMiS project (Parallel and Sequential Graph Mining Suite)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%