A tool to import SnpEff annotated files to a Neo4j Graph database

COMBAT-TB/vcf2neo

vcf2neo

A tool to import SnpEff annotated VCF files and map them to the COMBAT-TB NeoDB graph database.

Prerequisites:

Usage

Clone repository:

$ git clone https://github.com/SANBI-SA/vcf2neo.git
...
$ cd vcf2neo

Build COMBAT-TB NeoDB:

$ docker-compose up --build -d
...

Install and run vcf2neo:

  • Using pip
$ pip install -i https://test.pypi.org/simple/ vcf2neo
...
  • or via setup.py in a virtualenv
$ virtualenv envname
...
$ source envname/bin/activate
$ pip install -r requirements.txt
$ python setup.py install

Import and map SnpEff annotated VCF files to genes and drugs in NeoDB:

You can change the default database location (localhost) by setting the DATABASE_URL environment variable to the address of a remote database.
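As a minimal sketch of the override described above, the variable can be exported in the shell before running vcf2neo. The host name neo4j.example.org is hypothetical, and the assumption that DATABASE_URL takes a bare host name rather than a full URL is not confirmed here; check the format your deployment expects.

```shell
# Hypothetical remote host; replace with the address of your own Neo4j server.
# Assumption: DATABASE_URL holds a host name, mirroring the "localhost" default.
export DATABASE_URL="neo4j.example.org"

# Show which database location later vcf2neo commands in this shell will see.
echo "Database location: ${DATABASE_URL:-localhost}"
```

Unsetting the variable (unset DATABASE_URL) should return vcf2neo to the default, localhost.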

$ vcf2neo load_vcf --help
Usage: vcf2neo load_vcf [OPTIONS] VCF_DIR

  Load SnpEff annotated VCF files to genes and drugs in NeoDb.

Options:
  --owner TEXT                    Specify owner.  [default: $USER; required]
  -p, --phenotype [XDR|MDR|SUSCEPTIBLE|UNKNOWN]
                                  Specify phenotype.  [required]
  -a, --antibiotic TEXT           Specify antibiotic. E.g. Rifampicin
                                  [required]
  --help                          Show this message and exit.
$ vcf2neo load_vcf -p UNKNOWN -a UNKNOWN PATH/TO/VCF_DIR
...
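The options listed in the help text can be combined as in the sketch below, which checks the phenotype against the four accepted values before assembling the command. The values MDR and Rifampicin are illustrative only, PATH/TO/VCF_DIR remains a placeholder, and the command is printed rather than executed so the sketch runs without vcf2neo installed.

```shell
# Illustrative values; the phenotype must be one of the choices from --help.
phenotype="MDR"
antibiotic="Rifampicin"
vcf_dir="PATH/TO/VCF_DIR"   # placeholder, replace with a real directory

case "$phenotype" in
  XDR|MDR|SUSCEPTIBLE|UNKNOWN)
    # --owner is omitted, so the documented default ($USER) applies.
    load_cmd="vcf2neo load_vcf -p $phenotype -a $antibiotic $vcf_dir"
    echo "$load_cmd"        # print only; run the printed command to load
    ;;
  *)
    echo "Unsupported phenotype: $phenotype" >&2
    load_cmd=""
    ;;
esac
```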

Exploring variant data:

Point your browser to localhost:7474 to access the Neo4j browser.

To view the schema, run:

call db.schema.visualization

Sample Cypher query:

MATCH(g:Gene)--(v:Variant)--(cs:CallSet)
RETURN g.name as gene, v.consequence as variant, cs.name as file
LIMIT 25
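The same query can also be run outside the browser. The sketch below saves it to a file and, if the cypher-shell client happens to be installed, pipes it to the database. The Bolt address bolt://localhost:7687 is Neo4j's usual default and is an assumption about this setup, as is the absence of authentication; the file name sample_query.cypher is arbitrary.

```shell
# Save the sample query; cypher-shell requires a terminating semicolon.
cat > sample_query.cypher <<'EOF'
MATCH (g:Gene)--(v:Variant)--(cs:CallSet)
RETURN g.name AS gene, v.consequence AS variant, cs.name AS file
LIMIT 25;
EOF

# Run the query only when cypher-shell is available on this machine.
if command -v cypher-shell >/dev/null 2>&1; then
  # Assumed address; add -u and -p if your database requires credentials.
  cypher-shell -a bolt://localhost:7687 < sample_query.cypher \
    || echo "Query failed: check the address and credentials." >&2
else
  echo "cypher-shell not found; paste sample_query.cypher into the Neo4j browser."
fi
```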