This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

What's this?

This repository contains scripts that download the files needed for day 3 of the vg tutorial (https://github.com/Pfern/PANGenomics). Once you have all the necessary files, you can play around with a bacterial pan-genome.

Requirements

How to start

  1. Go to scripts/ and run ./fetch_data.sh to download complete E. coli genomes, gene tables, one FASTQ file, and the minia assembler.
  2. Run run_minia.sh to assemble the FASTQ reads into a contig FASTA file.
  3. Use extract_gene.ipynb with Jupyter Notebook as needed. It helps you extract the genome regions corresponding to a given gene from multiple genomes.
  4. From here you are free to do whatever you want. Check the original tutorial and have fun.
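
The kind of extraction described in step 3 can be sketched in plain Python. This is a minimal illustration, not the code in extract_gene.ipynb: the helper names (read_fasta, extract_region), the toy sequences, and the coordinate convention (1-based, inclusive start and end, plus a strand) are all assumptions made for the example. The real gene tables fetched by fetch_data.sh may use different columns.

```python
import io

def read_fasta(handle):
    """Parse FASTA text into a dict mapping record ID to sequence."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]  # ID is the first word of the header
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def extract_region(sequence, start, end, strand="+"):
    """Return the region [start, end] (assumed 1-based, inclusive).

    Regions on the '-' strand are reverse-complemented.
    """
    region = sequence[start - 1:end]
    if strand == "-":
        region = region.translate(_COMPLEMENT)[::-1]
    return region

# Toy stand-ins for the downloaded genomes (hypothetical data).
toy_fasta = io.StringIO(
    ">genomeA toy record\nAAAACCATGCCC\n"
    ">genomeB toy record\nTTTCATGGTAAA\n"
)
genomes = read_fasta(toy_fasta)

# Hypothetical gene table: genome ID -> (start, end, strand) for one gene.
gene_table = {
    "genomeA": (4, 9, "+"),
    "genomeB": (4, 9, "-"),
}

regions = {
    genome_id: extract_region(genomes[genome_id], start, end, strand)
    for genome_id, (start, end, strand) in gene_table.items()
}

# Write the extracted regions as FASTA records.
for genome_id, seq in regions.items():
    print(f">{genome_id}_geneX\n{seq}")
```

Collecting the same gene from every genome into one multi-FASTA file like this gives a convenient input for building a small graph from that region in the tutorial.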