http://bahlolab.github.io/moimix/
- Estimate multiplicity of infection from massively parallel sequencing data
- Estimate heterozygosity and within-isolate diversity directly from read counts
- Call major alleles within isolates from B-allele frequencies
- Prepare SNP barcode data for use in COIL
- Simulate single nucleotide variant data with known multiplicity of infection
There are plans to submit moimix to Bioconductor. For now it is only available as a development version from GitHub:
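# install BiocManager, which also resolves the Bioconductor dependencies
install.packages("BiocManager")
# BiocManager relies on the remotes package to install from GitHub
install.packages("remotes")
BiocManager::install("bahlolab/moimix", build_vignettes = TRUE)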
moimix uses the Genomic Data Structure (GDS) format, as implemented in the Bioconductor package SeqArray, to provide fast access to VCF files in R.
To convert a VCF file to GDS format:
library(SeqArray)
seqVCF2GDS("isolate_snps.vcf.gz", "isolate_snps.gds")
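Once the conversion has run, it can be worth checking that the GDS file opens and contains the expected isolates. The following is a minimal sketch using only SeqArray functions. It uses the example VCF shipped with SeqArray as a stand-in for your own file, and the variable names are illustrative rather than anything moimix requires.

```r
library(SeqArray)

# Stand-in input: the example VCF bundled with SeqArray (replace with your own file)
vcf_file <- seqExampleFileName("vcf")
gds_file <- tempfile(fileext = ".gds")

# Convert the VCF to GDS without printing progress messages
seqVCF2GDS(vcf_file, gds_file, verbose = FALSE)

# Open the GDS file, print a summary, and list the sample identifiers
isolates <- seqOpen(gds_file)
seqSummary(isolates)
sample_ids <- seqGetData(isolates, "sample.id")
head(sample_ids)

# Close the file handle when finished
seqClose(isolates)
```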
MOI can also be estimated from a two-column matrix of read counts: the first column holds the number of reads supporting the reference allele and the second column the number of reads supporting the alternate allele.
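As an illustration of that layout, the sketch below builds such a matrix in base R from made-up counts. It assumes one row per SNP, and it takes the alternate-allele read fraction as the B-allele frequency. Both are assumptions made for this example, and the column and row names are hypothetical. How the matrix is passed to moimix is covered in the vignette rather than here.

```r
# Made-up read counts for five SNPs in a single isolate (illustrative only)
counts <- cbind(
  ref = c(40, 0, 25, 60, 12),  # reads supporting the reference allele
  alt = c(0, 35, 25, 20, 36)   # reads supporting the alternate allele
)
rownames(counts) <- paste0("snp", 1:5)

# Total coverage at each SNP
depth <- rowSums(counts)

# Fraction of reads supporting the alternate allele at each SNP
baf <- counts[, "alt"] / depth

print(counts)
print(round(baf, 2))
```

Values of `baf` close to 0 or 1 indicate that nearly all reads support a single allele at that SNP, while intermediate values indicate a mixture of alleles in the reads.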
See the introduction vignette for usage examples. If the package was installed with vignettes built, browseVignettes("moimix") lists them.