Single Nucleotide Variant Filtering
- Filter on MQ
- Select chromosomes you want to analyze
- Possibility to remove indels
- Filter on maximum number of alleles per site
Individual releases can be downloaded from:
Alternatively use git clone:
SNVFI is configured with two files: a config file that describes the installation, and an ini file that describes a single filtering run. In most scenarios you'll create the config file once and write a new ini file for each filtering run.
SNVFI_ROOT=<path to SNVFI install directory>
BIOVCF_PREFIX=<path to bio-vcf executable>
TABIX_PREFIX=<path to tabix executable>
VCFTOOLS_PREFIX=<path to vcftools executable>
R_PREFIX=<path to R executable>
RSCRIPT=<path to SNVFI_filtering_R.R R-script>
MAX_THREADS=<maximum number of threads used by SNVFI>
SGE=<YES|NO> #Use Sun Grid Engine yes or no
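An incomplete config file is easier to catch before a run than during one. The following is a minimal sketch and not part of SNVFI: the `missing_keys` helper and the `CONFIG_KEYS` variable are hypothetical names introduced here, and the check only looks for a non-empty `KEY=value` line for each config key listed above.

```shell
# Hypothetical helper (not shipped with SNVFI): print every key that has
# no non-empty KEY=value line in the given file.
missing_keys() {
    file=$1
    shift
    for key in "$@"; do
        # "..*" requires at least one character after the equals sign.
        grep -q "^${key}=..*" "$file" || printf '%s\n' "$key"
    done
}

# The config keys documented above.
CONFIG_KEYS="SNVFI_ROOT BIOVCF_PREFIX TABIX_PREFIX VCFTOOLS_PREFIX R_PREFIX RSCRIPT MAX_THREADS SGE"

# Usage (prints nothing when the config is complete):
#   missing_keys /full/path/to/my.config $CONFIG_KEYS
```

The same helper can be pointed at an ini file by passing the ini keys instead.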
SNV=<Path to input vcf>
SUB=<Subject column in vcf>
CON=<Control column in vcf>
OUT_DIR=<Output directory>
SUB_GQ=<Minimum Genotype Quality in subject sample>
CON_GQ=<Minimum Genotype Quality in control sample>
QUAL=<Minimum quality threshold>
COV=<Minimum coverage threshold>
FILTER=<Select either ALL variants or only PASS>
VAF=<Variant Allele Frequency threshold>
MQ=<Minimum mapping quality (MQ)>
CHR="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X" #Specify chromosomes, separated by spaces
CHR_NAM=<Name for the chosen selection of chromosomes. For example 'autosomal'>
SNV_only=<YES | NO>
max_alleles=<Maximum number of alleles that can be present at a given site>
MAIL=<Mail address for qsub>
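Several ini values are restricted to a small set of choices or to numbers, so a short sanity check can save a failed run. The sketch below is not part of SNVFI. It assumes the ini file is plain shell `KEY=value` syntax that can be sourced, that `FILTER` accepts `ALL` or `PASS` and `SNV_only` accepts `YES` or `NO` as described above, and that the thresholds are non-negative numbers. The function names `is_one_of`, `is_number` and `check_ini` are hypothetical.

```shell
# Hypothetical checks (not shipped with SNVFI).

# is_one_of VALUE CHOICE...: succeed if VALUE equals one of the choices.
is_one_of() {
    value=$1
    shift
    for choice in "$@"; do
        [ "$value" = "$choice" ] && return 0
    done
    return 1
}

# is_number VALUE: succeed for a non-negative integer or decimal (10, 0.3).
is_number() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?$'
}

# check_ini FILE: source the ini file in a subshell and print one line
# per problem found. Prints nothing when all checked values look sane.
check_ini() {
    (
        . "$1" || exit 1
        is_one_of "${FILTER:-}" ALL PASS || echo "FILTER must be ALL or PASS"
        is_one_of "${SNV_only:-}" YES NO || echo "SNV_only must be YES or NO"
        for key in SUB_GQ CON_GQ QUAL COV VAF MQ max_alleles; do
            eval "val=\${$key:-}"
            is_number "$val" || echo "$key is not a number: '$val'"
        done
    )
}

# Usage:
#   check_ini /full/path/to/run.ini
```

Running the check in a subshell keeps the ini variables out of the calling shell.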
sh <config> <ini>
The full path needs to be given for both the config and ini file.
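If the config and ini files are usually referred to by relative paths, they can be expanded to full paths first. This is a minimal sketch using only standard shell utilities; `abs_path` is a hypothetical helper name, and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper: print the full path of a file given a relative or
# absolute path. The directory part must exist; the cd runs inside a
# command substitution, so the caller's working directory is unchanged.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Usage: build the two arguments for the command shown above.
#   CONFIG=$(abs_path my.config)
#   INI=$(abs_path runs/sample1.ini)
```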
- GNU/Linux (tested on CentOS Linux release 7.6.1810)
- (optional) Sun Grid Engine (tested on SGE 8.1.9)
- R 3.5.0
- bio-vcf 0.9.2
- htslib 1.8
- vcftools 0.1.15
- zgrep, grep 3.1
- VariantAnnotation 1.26.1
- ggplot2 3.0.0
- reshape2 1.4.3
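To see which of the command-line dependencies are reachable, a small check along these lines may help. It is a sketch, not part of SNVFI: `missing_tools` is a hypothetical helper, and it only searches `PATH`, whereas SNVFI itself takes the tool locations from the config file.

```shell
# Hypothetical helper (not shipped with SNVFI): print each named tool
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (tabix is provided by htslib):
#   missing_tools R bio-vcf tabix vcftools zgrep grep
#
# Installed versions of the R packages can be printed from R, for example:
#   Rscript -e 'packageVersion("VariantAnnotation")'
```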