MRCIEU/epigraphdb

EpiGraphDB: A database and data mining platform for health data science


EpiGraphDB is an analytical platform and database to support data mining in epidemiology. The platform incorporates a graph of causal estimates generated by systematically applying Mendelian randomization to a wide array of phenotypes, and augments this with a wealth of additional data from other bioinformatic sources. EpiGraphDB aims to support appropriate application and interpretation of causal inference in systematic automated analyses of many phenotypes.

This repository contains example use cases, written as Jupyter notebooks in Python, that demonstrate the functionality of EpiGraphDB through its API. The following table lists the main components of EpiGraphDB and their sources.

| EpiGraphDB component | Source |
| --- | --- |
| Data integration pipeline | GitHub |
| Web UI | GitHub |
| API | GitHub |
| R package | GitHub |
| Example use cases (this repo) | GitHub (this repo) |

*We will gradually open source our components over the coming months.

Example use cases

Below are the example use case notebooks. You can run them online with either Google Colab or Binder, or clone the repo and set up a local JupyterLab environment.

- Case studies for the EpiGraphDB paper
  - Case study 1: Distinguishing vertical and horizontal pleiotropy for SNP-protein associations
  - Case study 2: Identification of potential drug targets
  - Case study 3: Triangulating causal estimates with literature evidence
- General examples
  - Getting started with EpiGraphDB in Python
  - Functionalities in getting metadata, entity search, and using Cypher queries

The example notebooks above are written in Python and use the EpiGraphDB API. R users can visit the package vignettes for equivalent functionality using the epigraphdb R package.
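As a rough illustration of what the notebooks do, the sketch below builds and sends a GET request to the API using only the Python standard library. The base URL `https://api.epigraphdb.org`, the `/mr` endpoint, and the `exposure_trait` / `outcome_trait` parameter names are assumptions here and should be checked against the API documentation; `build_url` and `query_api` are hypothetical helper names, not part of any EpiGraphDB package.

```python
import json
import urllib.parse
import urllib.request

# Assumption: root URL of the public EpiGraphDB API.
API_URL = "https://api.epigraphdb.org"


def build_url(endpoint: str, params: dict) -> str:
    """Return the full GET URL for an endpoint and its query parameters."""
    query = urllib.parse.urlencode(params)
    return f"{API_URL}{endpoint}" + (f"?{query}" if query else "")


def query_api(endpoint: str, params: dict, timeout: float = 30.0) -> dict:
    """Send a GET request and decode the JSON response body."""
    url = build_url(endpoint, params)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Assumed endpoint and parameters: MR evidence between two traits.
mr_url = build_url(
    "/mr",
    {
        "exposure_trait": "Body mass index",
        "outcome_trait": "Coronary heart disease",
    },
)
print(mr_url)
```

Calling `query_api("/mr", {...})` with the same parameters would perform the request itself; it is kept separate from `build_url` so the URL can be inspected without network access.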

Set up

We provide a conda environment configuration file for readers to run the notebooks locally.

To do this, first install conda. Then follow the steps below to set up the conda environment.

# Bootstrap the environment
conda env create -f environment.yml

# Activate the environment in your shell session.
conda activate epigraphdb-notebooks

# Open Jupyter lab and you should be able to run the code examples!
jupyter lab
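If the notebooks fail to start, a quick check from Python can confirm that the expected environment is active. This is a minimal sketch, assuming conda sets the `CONDA_DEFAULT_ENV` variable on activation and that the environment provides the `jupyterlab` module; `env_is_ready` and `has_module` are hypothetical helper names.

```python
import importlib.util
import os

# Environment name taken from the `conda activate` step above.
EXPECTED_ENV = "epigraphdb-notebooks"


def env_is_ready(environ=os.environ, expected=EXPECTED_ENV) -> bool:
    """Return True if the active conda environment has the expected name."""
    return environ.get("CONDA_DEFAULT_ENV") == expected


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


print("conda environment active:", env_is_ready())
print("jupyterlab available:", has_module("jupyterlab"))
```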

Citation

Please cite EpiGraphDB as:

Yi Liu, Benjamin Elsworth, Pau Erola, Valeriia Haberland, Gibran Hemani, Matt Lyon, Jie Zheng, Oliver Lloyd, Marina Vabistsevits, Tom R Gaunt, EpiGraphDB: a database and data mining platform for health data science, Bioinformatics, btaa961, https://doi.org/10.1093/bioinformatics/btaa961

@article{epigraphdb2020bioinformatics,
    author = {Liu, Yi and Elsworth, Benjamin and Erola, Pau and Haberland, Valeriia and Hemani, Gibran and Lyon, Matt and Zheng, Jie and Lloyd, Oliver and Vabistsevits, Marina and Gaunt, Tom R},
    title = {{EpiGraphDB}: a database and data mining platform for health data science},
    journal = {Bioinformatics},
    year = {2020},
    month = {11},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btaa961},
    url = {https://doi.org/10.1093/bioinformatics/btaa961},
    note = {btaa961},
    eprint = {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btaa961/34178613/btaa961.pdf}
}

Contact

Please get in touch with us for issues, comments, suggestions, etc. via the following methods: