RecursiveCausalDiscovery.jl

A Julia implementation of Recursive Causal Discovery algorithms. Recursive Causal Discovery (RCD) is an efficient approach for causal discovery (i.e., learning a causal graph from data).

⚠️ This package is still under development! ⚠️

Overview

Comparison of RCD with the PC algorithm implemented in CausalInference.jl

The following plots show the F1 score (computed against the true causal graph) and the number of conditional independence (CI) tests performed.

[Figure: F1 score and number of CI tests of RSL versus PC]
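As a rough illustration of the F1 metric mentioned above, here is a minimal sketch that scores a learned skeleton (undirected edges only) against the true one. The function name `skeleton_f1` and the Bool adjacency-matrix representation are assumptions made for this example; the plots may have been produced with a different edge-matching convention.

```julia
# Sketch: F1 score of a learned skeleton against the true skeleton.
# Inputs are Bool adjacency matrices (hypothetical representation);
# an edge counts as present if it appears in either direction.
function skeleton_f1(true_adj::AbstractMatrix{Bool}, learned_adj::AbstractMatrix{Bool})
    n = size(true_adj, 1)
    tp = fp = fn = 0
    for i in 1:n-1, j in i+1:n            # visit each unordered pair once
        t = true_adj[i, j] || true_adj[j, i]
        l = learned_adj[i, j] || learned_adj[j, i]
        tp += t && l                      # edge in both graphs
        fp += !t && l                     # edge only in the learned graph
        fn += t && !l                     # edge only in the true graph
    end
    denom = 2tp + fp + fn
    return denom == 0 ? 1.0 : 2tp / denom # two empty graphs agree perfectly
end

truth = falses(3, 3); truth[1, 2] = truth[2, 3] = true
learned = falses(3, 3); learned[1, 2] = learned[1, 3] = true
skeleton_f1(truth, learned)               # 0.5 (one TP, one FP, one FN)
```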

Implemented algorithms

  • Recursive Structure Learning (RSL)
    • Learning v-structures
    • Meek's rules
  • MArkov boundary-based Recursive Variable ELimination (MARVEL)
  • Latent MARVEL (L-MARVEL)
  • Removable Order Learning (ROL)

Installation

Requires Julia 1.10 or later.

julia> ]add RecursiveCausalDiscovery

How to use

So far, the package implements only one algorithm: RSL-D, which is called through the rsld function. The function takes the data matrix or table and a conditional independence test function as input, and returns the completed partially directed acyclic graph (CPDAG) as output.

Simple example

In this example, a CSV file named data.csv is loaded, and the RSL-D algorithm is used to learn the CPDAG. The conditional independence test is Fisher's Z-test.

using RecursiveCausalDiscovery
using CSV
using Tables

# load data (columns are variables and rows are samples)
data = CSV.read("data.csv", Tables.matrix)

# use a Gaussian conditional independence test
sig_level = 0.01
ci_test = (x, y, cond_vec, data) -> fisher_z(x, y, cond_vec, data, sig_level)

# learn the oriented causal graph using RSL
cpdag = rsld(data, ci_test)

See examples/rsl_example_wo_data.jl for a complete example.
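To make the role of the conditional independence test more concrete, here is a minimal sketch of a Fisher z test written from scratch. It is not the package's `fisher_z`. The name `fisher_z_sketch`, the `z_crit` keyword, and the convention that the function returns `true` when independence is not rejected are all assumptions made for this illustration.

```julia
using LinearAlgebra
using Statistics

# Sketch of a Fisher z conditional independence test (not the package's code).
# Returns true when independence of columns x and y given the columns in
# cond_vec is NOT rejected (assumed convention).
# z_crit = 2.5758 is the two-sided normal critical value for level 0.01.
function fisher_z_sketch(x::Int, y::Int, cond_vec::Vector{Int},
                         data::AbstractMatrix{<:Real}; z_crit::Float64 = 2.5758)
    n = size(data, 1)
    idx = vcat(x, y, cond_vec)
    C = cor(data[:, idx])                     # correlations of the involved columns
    P = inv(C)                                # precision matrix
    r = -P[1, 2] / sqrt(P[1, 1] * P[2, 2])    # partial correlation of x and y
    r = clamp(r, -0.999999, 0.999999)         # keep atanh finite
    stat = sqrt(n - length(cond_vec) - 3) * abs(atanh(r))
    return stat <= z_crit
end
```

Under these assumptions, such a function could be wrapped in the same way as above, for example `(x, y, cond_vec, data) -> fisher_z_sketch(x, y, cond_vec, data)`.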

Generating random data from DAG and learning from it

See examples/rsl_example_wo_data.jl for an example of how to generate a random DAG using the Erdős-Rényi model, generate random Gaussian data from it, and learn the CPDAG using RSL-D.
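For readers who want to see the idea without opening the example script, the following is a minimal sketch of that data-generation step using only the standard library. The helpers `random_dag` and `sample_linear_gaussian`, the edge-weight range, and the unit-variance noise are assumptions made for illustration and may differ from what the example script does.

```julia
using LinearAlgebra
using Random

# Sketch: sample an Erdős-Rényi style DAG (hypothetical helper).
# Each edge i -> j with i < j is included independently with probability p,
# so the adjacency matrix is strictly upper triangular and the graph is acyclic.
function random_dag(n::Int, p::Float64; rng = Random.default_rng())
    adj = falses(n, n)
    for i in 1:n-1, j in i+1:n
        adj[i, j] = rand(rng) < p
    end
    return adj
end

# Sketch: sample from a linear Gaussian model on that DAG (hypothetical helper).
# Rows are samples and columns are variables.
function sample_linear_gaussian(adj::AbstractMatrix{Bool}, n_samples::Int;
                                rng = Random.default_rng())
    n = size(adj, 1)
    W = adj .* (0.5 .+ rand(rng, n, n))    # weights in [0.5, 1.5), zero where no edge
    data = randn(rng, n_samples, n)        # independent standard normal noise
    for j in 1:n                           # columns are already in topological order
        data[:, j] .+= data[:, 1:j-1] * W[1:j-1, j]   # add weighted parent values
    end
    return data
end

rng = MersenneTwister(0)
dag = random_dag(10, 0.3; rng = rng)
data = sample_linear_gaussian(dag, 1000; rng = rng)
```

The resulting `data` matrix has the same layout as the matrix loaded from data.csv in the simple example, so it could be passed to `rsld` together with a conditional independence test.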

Citation

If you use this package, please cite our work:

@misc{mokhtarian2024rcd,
      title={Recursive Causal Discovery}, 
      author={Ehsan Mokhtarian and Sepehr Elahi and Sina Akbari and Negar Kiyavash},
      year={2024},
      eprint={2403.09300},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2403.09300}, 
}

Acknowledgement

Thanks to Felix Wechsler for helping me with Julia and pushing me to implement RCD in Julia.