2.1. Bioinformatics tools and commands used

Natasha Pavlovikj edited this page Feb 21, 2021 · 1 revision

ProkEvo depends on many well-established bioinformatics tools and databases. The table below lists each tool, the exact version ProkEvo uses, and the command it runs.

| Program | Version | Description | Link | Command |
|---|---|---|---|---|
| parallel-fastq-dump | 0.6 | Parallel wrapper for SRA Toolkit | https://github.com/rvalieris/parallel-fastq-dump | `prefetch <sra_id> && parallel-fastq-dump --sra-id <sra_id> --threads 1 --split-3` |
| Trimmomatic | 0.38 | Trimming tool for Illumina NGS reads | https://github.com/timflutre/trimmomatic | `trimmomatic PE -threads 1 <input_read_pair_1.fq> <input_read_pair_2.fq> <input_read_pair_1_trimmed.fastq> <input_read_unpair_1_trimmed.fastq> <input_read_pair_2_trimmed.fastq> <input_read_unpair_2_trimmed.fastq> HEADCROP:15 CROP:200 LEADING:10 TRAILING:10 SLIDINGWINDOW:5:20 MINLEN:50` |
| FastQC | 0.11 | Quality control tool for sequencing data | https://github.com/s-andrews/FastQC | `fastqc <input_read_pair_1_trimmed.fastq> <input_read_pair_2_trimmed.fastq> --extract` |
| SPAdes | 3.13 | Genome assembler | https://github.com/ablab/spades | `spades.py -t 1 -1 <input_read_pair_1_trimmed.fastq> -2 <input_read_pair_2_trimmed.fastq> --careful --cov-cutoff auto -o <spades_output> --phred-offset 33` |
| QUAST | 5.0 | Evaluation tool for genome assembly | https://github.com/ablab/quast | `quast --fast -o <quast_output> <spades_output/contigs.fasta>` |
| PlasmidFinder | 2.0 | Tool for detection and characterization of plasmid sequences | https://bitbucket.org/genomicepidemiology/plasmidfinder/src/master/ | `plasmidfinder.py -p $PLASMID_DB -i <spades_output/contigs.fasta> -o <plasmidfinder_output>` |
| SISTR | 1.0 | Tool for Salmonella In Silico Typing | https://github.com/phac-nml/sistr_cmd | `sistr --qc -vv --alleles-output <allele_results.json> --novel-alleles <novel_alleles.fasta> --cgmlst-profiles <cgmlst_profiles.csv> -f csv -o <sistr_output.csv> <spades_output/contigs.fasta>` |
| Prokka | 1.13 | Prokaryotic genome annotation tool | https://github.com/tseemann/prokka | `prokka --kingdom Bacteria --locustag <sra_id> --outdir <prokka_output> --prefix <sra_id> --force <spades_output/contigs.fasta>` |
| Roary | 3.12 | Pan-genome and core-genome alignment tool | https://github.com/sanger-pathogens/Roary | `roary -s -e --mafft -p 4 -cd 99.0 -i 95 -f <prokka_output/sra_id.gff>` |
| fastbaps | 1.0 | Improved version of the BAPS clustering method | https://github.com/gtonkinhill/fastbaps | `Rscript fastbaps.R <roary_output/core_gene_alignment.aln> <fastbaps_output.csv>` |
| MLST | 2.16 | Tool for multilocus sequence typing | https://github.com/tseemann/mlst | `mlst --legacy --scheme senterica --csv <spades_output/contigs.fasta>` |
| ABRicate | 1.0 | Tool for screening contigs for AMR and virulence genes | https://github.com/tseemann/abricate | `abricate --db <abricate_db> --csv <spades_output/contigs.fasta>` |
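
To show how the placeholders in the first five rows relate to each other, here is a minimal dry-run sketch that prints the download, trimming, quality control, assembly, and assembly evaluation commands for a single SRA ID. It only echoes the commands and runs none of the tools. The function name `print_sample_commands`, the default ID `SRR0000000`, and every file and directory name derived from the ID are illustrative assumptions and are not taken from ProkEvo itself; the tool options are copied from the table.

```shell
#!/bin/sh
# Dry run: print, in order, the per-sample commands from the table for one SRA ID.
# Nothing is executed. All file names derived from the ID are assumptions
# made for illustration, not names that ProkEvo is known to use.
set -eu

print_sample_commands() (
  id="$1"
  # Assumed names of the raw paired reads for this ID.
  r1="${id}_1.fastq"
  r2="${id}_2.fastq"
  # Hypothetical names for Trimmomatic's paired and unpaired outputs.
  p1="${id}_1_trimmed.fastq"
  u1="${id}_1_unpaired.fastq"
  p2="${id}_2_trimmed.fastq"
  u2="${id}_2_unpaired.fastq"
  # Hypothetical SPAdes output directory.
  asm="${id}_spades"

  echo "prefetch $id && parallel-fastq-dump --sra-id $id --threads 1 --split-3"
  echo "trimmomatic PE -threads 1 $r1 $r2 $p1 $u1 $p2 $u2 HEADCROP:15 CROP:200 LEADING:10 TRAILING:10 SLIDINGWINDOW:5:20 MINLEN:50"
  echo "fastqc $p1 $p2 --extract"
  echo "spades.py -t 1 -1 $p1 -2 $p2 --careful --cov-cutoff auto -o $asm --phred-offset 33"
  echo "quast --fast -o ${id}_quast $asm/contigs.fasta"
)

# Use the first script argument as the SRA ID, or a placeholder ID if none is given.
print_sample_commands "${1:-SRR0000000}"
```

The remaining rows (PlasmidFinder, SISTR, Prokka, MLST, ABRicate) all take the same `contigs.fasta` as input, so they could be added to the sketch in the same way.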