Extracting Interpretable Rules from Bayesian Networks

Based on the 2010 paper: "Bayesian rule learning for biomedical data mining" by Vanathi Gopalakrishnan, Jonathan L. Lustgarten, Shyam Visweswaran, and Gregory F. Cooper.

Notebook Demos

Some demos are implemented as Jupyter notebooks:

Notebook	Colab Link	View on GitHub
Mitchell Tennis Dataset		`tennis.ipynb`
Adult Dataset		`adult.ipynb`

Overview

Given a Bayesian network's structure and its parameters (in the form of conditional probability tables), Gopalakrishnan et al. proposed extracting if/then rules from the edges of the network and scoring each rule with a certainty factor (CF). From Fig. 3 of the paper: "CF is expressed as the likelihood ratio of the conditional probability of the target value given the value of its parent variables."

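For example, this structure (CPTs not shown):

[Figure: Bayesian network in which Outlook and Wind are the parents of PlayTennis, and Temperature is the parent of Humidity]

can be turned into the following rules: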

Probabilities:
- Outlook
  P( Outlook = sunny ) = 0.36
  P( Outlook = overcast ) = 0.29
  P( Outlook = rain ) = 0.36
- Temperature
  P( Temperature = hot ) = 0.29
  P( Temperature = mild ) = 0.43
  P( Temperature = cool ) = 0.29
- Wind
  P( Wind = weak ) = 0.57
  P( Wind = strong ) = 0.43

IF (Outlook = overcast ^ Wind = strong) THEN (PlayTennis = yes)
	CF = inf
IF (Outlook = overcast ^ Wind = weak) THEN (PlayTennis = yes)
	CF = inf
IF (Outlook = rain ^ Wind = strong) THEN (PlayTennis = no)
	CF = inf
IF (Outlook = rain ^ Wind = weak) THEN (PlayTennis = yes)
	CF = inf
IF (Outlook = sunny ^ Wind = strong) THEN (PlayTennis = no)
	CF = 1.00
IF (Outlook = sunny ^ Wind = strong) THEN (PlayTennis = yes)
	CF = 1.00
IF (Outlook = sunny ^ Wind = weak) THEN (PlayTennis = no)
	CF = 2.00
IF (Temperature = cool) THEN (Humidity = normal)
	CF = inf
IF (Temperature = hot) THEN (Humidity = high)
	CF = 3.00
IF (Temperature = mild) THEN (Humidity = high)
	CF = 2.00
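
The CF values listed above can be checked by hand. The following is a minimal sketch, not the package's implementation: it assumes the likelihood ratio is taken as the conditional probability of the target value divided by the probability of its complement, which agrees with the values shown for the two binary targets in this example. The function name `certainty_factor` is hypothetical.

```python
import math


def certainty_factor(p_target: float) -> float:
    """Ratio of P(target | parents) to 1 - P(target | parents).

    Assumption: this odds-style ratio is what the printed CF reports.
    """
    if not 0.0 <= p_target <= 1.0:
        raise ValueError("p_target must be a probability in [0, 1]")
    if p_target == 1.0:
        # The target value is certain given the parents, so the ratio is unbounded.
        return math.inf
    return p_target / (1.0 - p_target)


# A conditional probability of 2/3 gives a CF of about 2, 3/4 gives 3,
# 1/2 gives 1, and a probability of 1 gives an infinite CF.
for p in (2 / 3, 3 / 4, 1 / 2, 1.0):
    print(f"P = {p:.2f}  ->  CF = {certainty_factor(p):.2f}")
```

Under this reading, a CF of 1.00 means the two outcomes are equally likely given the parents, which is why both the "yes" and the "no" rule appear for Outlook = sunny and Wind = strong.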

Getting Started

Working with the Python package

Install the package and its requirements:

pip install git+https://github.com/hayesall/bn-rule-extraction.git

The bayes_rule_extraction package exposes two functions: print_rules and ordinal_encode.

Here's a minimal working example:

from bayes_rule_extraction import ordinal_encode, print_rules
from pomegranate import BayesianNetwork
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/hayesall/bn-rule-extraction/main/toy_decision.csv")

encoded, mapping = ordinal_encode(df.columns, df)

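# Forbid every edge (0, i), so that column 0 ("PlayTennis") cannot be the parent of any other node.
excluded_edges = [(0, i) for i in range(1, len(df.columns))]

# from_samples is a class method; this assumes a pre-1.0 pomegranate release that provides it.
model = BayesianNetwork.from_samples(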
    encoded,
    exclude_edges=excluded_edges,
    state_names=df.columns,
)

print_rules(model, df.columns, mapping)
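
The example above hard-codes index 0 as the node whose outgoing edges are excluded, which presumes that PlayTennis is the first column of the data. As a sketch under that reading, the same list of (parent, child) pairs can be built by looking the column up by name. The helper `exclude_outgoing_edges` and the column order below are assumptions for illustration and are not part of the package.

```python
from typing import List, Sequence, Tuple


def exclude_outgoing_edges(target: str, columns: Sequence[str]) -> List[Tuple[int, int]]:
    """Return (parent, child) index pairs forbidding `target` as a parent of any other column."""
    names = list(columns)
    target_index = names.index(target)  # raises ValueError if the column is missing
    return [(target_index, i) for i in range(len(names)) if i != target_index]


# Assumed column order, for illustration only.
columns = ["PlayTennis", "Outlook", "Temperature", "Humidity", "Wind"]
excluded_edges = exclude_outgoing_edges("PlayTennis", columns)
print(excluded_edges)  # [(0, 1), (0, 2), (0, 3), (0, 4)]
```

Looking the index up by name keeps the constraint correct if the columns are reordered or a different outcome variable is chosen.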

Notes

This is implemented as an "explanation method" to help explain a Bayesian Network. It's not currently possible to use the extracted rules directly for classification.
Gopalakrishnan et al. (2010) used a modified version of K2 for structure learning.
The include_edges parameter in the pomegranate.BayesianNetwork.from_samples method seems to be required to learn "interesting" or "useful" rules, especially if there is a specific outcome variable (like PlayTennis) you are interested in. This might be explained by differences in structure learning methods: the variable ordering in K2 provides some control over which variables can act as parents and which as children.
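
One way to approximate the control that a K2 variable ordering provides is to forbid every edge that points from a later variable to an earlier one. This is an illustrative sketch, not something the package or the paper prescribes. The helper `ordering_to_excluded_edges` is hypothetical, and it assumes edges are written as (parent, child) index pairs, as in the example under "Working with the Python package".

```python
from typing import List, Sequence, Tuple


def ordering_to_excluded_edges(order: Sequence[int]) -> List[Tuple[int, int]]:
    """Forbid edges from any node to a node that precedes it in `order`.

    In a K2-style ordering, a node may only take parents that come before it,
    so every (later, earlier) pair is excluded.
    """
    excluded = []
    for position, child in enumerate(order):
        for later in order[position + 1:]:
            excluded.append((later, child))
    return excluded


# Node 0 comes last in the ordering, so it can never be a parent of node 1 or 2.
order = [1, 2, 0]
print(ordering_to_excluded_edges(order))  # [(2, 1), (0, 1), (0, 2)]
```

Placing the outcome variable last in the ordering reproduces the constraint that it cannot be the parent of any other node.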

Acknowledgements

The Toy Decision data set is lifted from Tom Mitchell's Machine Learning book, see section 3.4.2 (page 59 in my edition).

BibTeX

@article{gopalakrishnan2010bayesian,
  author = {Gopalakrishnan, Vanathi and Lustgarten, Jonathan L. and Visweswaran, Shyam and Cooper, Gregory F.},
  title = "{Bayesian rule learning for biomedical data mining}",
  journal = {Bioinformatics},
  volume = {26},
  number = {5},
  pages = {668-675},
  year = {2010},
  month = {01},
  issn = {1367-4803},
  doi = {10.1093/bioinformatics/btq005},
  url = {https://doi.org/10.1093/bioinformatics/btq005},
  eprint = {https://academic.oup.com/bioinformatics/article-pdf/26/5/668/16897540/btq005.pdf},
}
