This collection of scripts is used to build 1KGenomes project-informed major-allele reference genomes for Bowtie 1 and Bowtie 2. The current scripts build linear references with major alleles. We provide SNP-only and SNP-and-indel indexes for both Bowtie versions. Including indels changes the genomic coordinate system, so we recommend using levioSAM to perform accurate and scalable lift-over for alignments against the SNP-and-indel indexes.
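To illustrate why indels change the coordinate system, here is a minimal sketch in shell and `awk`. It maps a position on the original reference to the corresponding position on an indel-containing reference by summing the length changes of all upstream indels. The indel list and the `shift_pos` helper are made up for illustration, and positions that fall inside a deleted segment are not handled. This is not how levioSAM is implemented; it only shows the kind of offset a lift-over tool has to account for.

```shell
#!/bin/sh
# Toy indel list (made-up values): 1-based position on the original
# reference, then the length change (+n for an insertion of n bases,
# -n for a deletion of n bases).
indels='100 2
250 -3'

# Shift a position on the original reference onto the indel-containing
# reference by summing the length changes of all indels located before it.
# Simplification: positions inside a deleted segment are not handled.
shift_pos() {
    printf '%s\n' "$indels" |
        awk -v p="$1" '$1 < p { off += $2 } END { print p + off }'
}

shift_pos 50     # before any indel: unchanged
shift_pos 200    # after the 2-base insertion at 100
shift_pos 300    # after both the insertion and the 3-base deletion
```

Every position downstream of an indel moves, which is why alignments against the SNP-and-indel indexes need to be lifted over before they are compared with annotations on the original reference.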
Pre-built major-allele-SNP reference indexes are available:
Aligner | Reference | Index zip |
---|---|---|
Bowtie and Bowtie 2 | GRCh38 + major SNPs | https |
Bowtie and Bowtie 2 | hg19 + major SNPs | https |
Pre-built major-allele SNP-and-indel reference indexes are available below.
The pre-built levioSAM index (`.lft`) is also included.
We provide a quick example and a detailed tutorial on using levioSAM in a major-allele alignment workflow.
Aligner | Reference | Index zip |
---|---|---|
Bowtie and Bowtie 2 | GRCh38 + major SNP-and-indels | https |
Bowtie and Bowtie 2 | hg19 + major SNP-and-indels | https |
Those links will also appear on the Bowtie web page and Bowtie 2 web page in the right-hand sidebar.
The FASTA files with major-allele SNPs inserted are also available:
Reference | FASTA file | LevioSAM index |
---|---|---|
GRCh38 + major SNPs | https | N/A |
GRCh38 + major SNP-and-indels | https | https |
hg19 + major SNPs | https | N/A |
hg19 + major SNP-and-indels | https | https |
Workflow:
- `cd` to appropriate subdir
- `./buildXX.sh` to build index
  - Use `sbatch buildXX_marcc.sh` if on JHU MARCC cluster
- `./testXX.sh` to test
  - You may see a limited number of warnings, usually due to VCF formatting issues
  - Use `sbatch testXX_marcc.sh` if on JHU MARCC cluster
- `./index_bt_marcc.sh` and `./index_bt2_marcc.sh` to build Bowtie and Bowtie 2 indexes
  - Use `sbatch` if on JHU MARCC cluster
- `./zip_bt.sh` and `./zip_bt2.sh` to zip indexes
  - Archives all include `README.md`
- `./scp_bt.sh` and `./scp_bt2.sh` to copy over to FTP server
  - Assumes you're on MARCC or other JHU cluster with access to `gwln1`
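The workflow steps above can be sketched as a dry run that prints the commands in order without executing anything. The `print_workflow` helper and the `my_subdir` name are hypothetical, and `XX` is the same placeholder used in the script names above; substitute the names found in your subdirectory.

```shell
#!/bin/sh
# Dry run: print the workflow commands in order instead of executing them.
# "XX" is the placeholder from the workflow list; the subdirectory name
# passed in is hypothetical.
print_workflow() {
    subdir="$1"        # subdirectory holding the scripts (hypothetical name)
    use_sbatch="$2"    # "yes" when on the JHU MARCC cluster

    echo "cd $subdir"
    if [ "$use_sbatch" = "yes" ]; then
        # Cluster variants are submitted through sbatch.
        echo "sbatch buildXX_marcc.sh"
        echo "sbatch testXX_marcc.sh"
        echo "sbatch index_bt_marcc.sh"
        echo "sbatch index_bt2_marcc.sh"
    else
        echo "./buildXX.sh"
        echo "./testXX.sh"
        echo "./index_bt_marcc.sh"
        echo "./index_bt2_marcc.sh"
    fi
    # Zipping and copying to the FTP server are the same in both cases.
    echo "./zip_bt.sh"
    echo "./zip_bt2.sh"
    echo "./scp_bt.sh"
    echo "./scp_bt2.sh"
}

print_workflow my_subdir no
```

Note that `sbatch` returns as soon as a job is submitted, so on the cluster each step should be allowed to finish before the next one is submitted.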
Requirements for building major-allele FASTAs (first 3 steps above):
Requirements for building genome indexes (4th step above):
Requirement for building major-allele FASTAs including indels (2nd step above):
- Nae-Chyun Chen
- Ben Langmead