Skip to content
View unexpatterns's full-sized avatar

Block or report unexpatterns

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Please don't include any personal information such as legal names or email addresses. Maximum 100 characters, markdown supported. This note will be visible to only you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
unexpatterns/README.md

Unexpected Attributed Subgraphs: a Mining Algorithm

UnexPatterns is a an algorithm that mines unexpected patterns from large attributed graphs.

It relies on the combination of information-theoretic based measure of Unexpectedness [1] as an interestingness measure to select relevant patterns and pruning techniques to reduce the search space.

[1] Dessalles, J.L.: Algorithmic simplicity and relevance. Algorithmic Probability and Friends - LNAI 7070, pp. 119–130, Springer (2013).

Install

Clone this project and install dependencies:

git clone
cd UnexPatterns
pip install -r requirements.txt

Data

The five real-world datasets used in our work are provided either in data/ directory or accessible through Netset:

  • 'wikivitals'
  • 'wikivitals-fr'
  • 'wikischools'
  • 'sanFranciscoCrimes'
  • 'ingredients'

Pattern mining

Parameter file

Use parameters.txt to specify the dataset(s) and parameter(s) value. As an example:

datasets: wikivitals, wikivitals-fr
s: 8
beta: 4
delta: 0
patterns_path: output

where:

  • datasets: Dataset name (if several datasets, names are coma-separated)
  • s: Minimum number of nodes in pattern
  • beta: Minimum number of attributes in pattern
  • delta: Minimum amount of unexpectedness difference between two patterns
  • patterns_path: Path to output directory

Usage

To mine patterns according to parameters.txt, use the following command:

python unexpatterns.py parameters.txt

Output format

Patterns are stored in binary format, within output/patterns/ directory. To extract them, you can use:

with open(f'output/patterns/<patterns>.bin', 'rb') as data:
    patterns = pickle.load(data)

The naming convention of each file is the following: result_<dataset>_<beta>_<s>_<delta>.bin.

The variable patterns then contains a list of patterns, each in the form of a tuple (Xi,Qi) where Xi is the set of node indexes and Qi the set of attribute indexes of the ith pattern.

Experiments

Code for experiments and comparisons with baseline and state-of-the-art aglorithms are provided in experiments/ directory.

Popular repositories Loading

  1. UnexPatterns UnexPatterns Public

    [ASONAM23] Mine unexpected patterns from attributed graphs.

    Python 1