Skip to content

Latest commit

 

History

History
executable file
·
220 lines (140 loc) · 15.7 KB

programs.md

File metadata and controls

executable file
·
220 lines (140 loc) · 15.7 KB

Programs installed locally on Maria's account: 😄

packages:

Using R version 3.2.2 installed locally: export PATH=/home/armita/prog/R/R-3.2.2/bin:${PATH} and libraries stored in export R_LIBS=/home/sobczm/R/x86_64-pc-linux-gnu-library/3.2:$R_LIBS

- If aiming to use those libraries, append the path to them in the following way:
R
.libPaths( c( .libPaths(), "/home/sobczm/R/x86_64-pc-linux-gnu-library/3.2") )
.libPaths( c( .libPaths(), "/home/armita/prog/R/R-3.2.2/library") )

Programs:

Source file with all dependencies for the programs below. If in doubt, load all of them into your current shell instance prior to execution of any pipeline by adding the line below to the top of your script:

source /home/sobczm/bin/marias_profile

Alternatively, export each dependency to PATH individually by hand

Type nano ~/.profile to start editing your BASH profile. Press Alt and / to navigate until the end of the file and paste the export command on a new line, for instance: export PATH=/home/sobczm/bin/mcl-14-137/bin:${PATH}. Save changes and exit by pressing Ctrl and x followed by y and finally Return (i.e. Enter). For the changes to take place, either type source ~/.profile or close and re-open the terminal window.

ade4: Analysis of Ecological Data : Exploratory and Euclidean Methods in Environmental Sciences

adegenet: a R package for the multivariate analysis of genetic markers

ggplot2: graphing package implemented on top of the R statistical package

PCAdapt: pcadapt implements a genome scan for detecting genes involved in local adaptation

pegas: Population and Evolutionary Genetics Analysis System

PopGenome: An Efficient Swiss Army Knife for Population Genomic Analyses

poppr: Genetic Analysis of Populations with Mixed Reproduction

SNPRelate: Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data

vcfR: Manipulate and Visualize VCF Data

WhopGenome: High-Speed Processing of VCF, FASTA and Alignment Data

Standalone

4P: 4P (Parallel Processing of Polymorphism Panels) is a software for computing population genetics statistics from large SNPs dataset. /home/sobczm/bin/4p/bin

ABRA ver. 0.97: Improved coding indel detection via assembly-based realignment /home/sobczm/bin/abra/bin

BayeScan ver. 2.1: detecting natural selection from population-based genetic data /home/sobczm/bin/bayescan2.1/binaries/bayescan_2.1

Beagle ver. 4.1 Beagle is a software package that performs genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-descent segment detection. /home/sobczm/bin/beagle

BEAST requires Java 8 - downloaded it to a local directory and changed path for default Java in my profile:

export JAVA_HOME=/home/sobczm/bin/jre1.8.0_101
export PATH="$JAVA_HOME/bin:$PATH"

BEAST ver. 1.8.3 package - Bayesian analysis of molecular sequences using MCMC. Includes: BEAST, BEAUti, LogCombiner, TreeAnnotator. /home/sobczm/bin/beast/BEASTv1.8.3/bin

BEAST ver. 2.4.2 package. Includes: BEAST, BEAUti, LogCombiner, TreeAnnotator, DensiTree. /home/sobczm/bin/beast/BEASTv2.4.2/bin

BEASTGen ver. 1.0.2 Creates BEAST XML input files. /home/sobczm/bin/beast/BEASTGenv1.0.2/bin

bioawk BWK awk modified for biological data. /home/sobczm/bin/bioawk

BUSCO ver 2.0: Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs /home/sobczm/bin/BUSCO_v2

Dependencies:

export PATH=/home/sobczm/bin/hmmer-3.1b2/binaries:${PATH}
export PATH=/home/armita/prog/python3/Python-3.3.5/bin:${PATH}
export PATH=/home/armita/prog/ncbi-rmblastn-2.2.28/bin:${PATH}
export AUGUSTUS_CONFIG_PATH=/home/sobczm/bin/augustus-3.1/config
export PATH=/home/sobczm/bin/augustus-3.1/bin:${PATH}
export PATH=/home/sobczm/bin/augustus-3.1/scripts:${PATH}
export PATH=/home/armita/prog/emboss/EMBOSS-4.0.0/bin:${PATH}

CodonW ver.1.3 Multivariate analysis (correspondence analysis) of codon and amino acid usage. /home/sobczm/bin/codonW

CViT ver. 1.2.1 CViT - Chromosome Viewing Tool /home/sobczm/bin/cvit.1.2.1

DAGchainer DAGchainer: Computing Chains of Syntenic Genes in Complete Genomes /home/sobczm/bin/DAGCHAINER

DendroPy ver. 4.1.0 Python library for phylogenetic computing. /home/sobczm/bin/DendroPy *Need to export it to PYTHONPATH

DivStat: A User-Friendly Tool for Single Nucleotide Polymorphism Analysis of Genomic Diversity /home/sobczm/bin/DivStat

DuoHMM ver. 0.1.7: duoHMM is a software package for post-processing haplotypes estimated by SHAPEIT. It incorporates pedigree information into the haplotype estimates in a post-hoc manner. /home/sobczm/bin/duohmm_v0.1.7

EIGENSOFT ver. 6.1.2 A set of population structure dectection methods. /home/sobczm/bin/EIG/bin

FALCON Falcon: a set of tools for fast aligning long reads for consensus and assembly /home/sobczm/bin/FALCON-integrate on Triticum

FastTree ver. 2.1.9 Approximately Maximum-Likelihood Trees for Large Alignments /home/sobczm/bin/FastTree2.1.9

FigTree ver. 1.4.2 Viewing of phylogenetic trees and production of publication-ready figures. /home/sobczm/bin/FigTree_v1.4.2/bin

freebayes ver. v1.0.2 Bayesian haplotype-based polymorphism discovery and genotyping. /home/sobczm/bin/freebayes/bin

GATK ver. 3.6 Genome Analysis Toolkit - Variant Discovery in High-Throughput Sequencing Data. /home/sobczm/bin/GenomeAnalysisTK-3.6

GeneProteinViz ver. 1.2.8 Dynamic visualization of genomic regions and variants affecting protein domains. /home/sobczm/bin/GPViz

gffread The program gffread can be used to validate, filter, convert and perform various other operations on GFF files /home/sobczm/bin/gffread/gffread/

LMAP ver. 1.0 A collection of perl scripts to automate PAML use. Requires a number of dependencies /home/sobczm/bin/LMAPv1.0.0/LMAP

LoRMA ver. 0.4 LoRMA is a tool for correcting sequencing errors in long reads such those produced by Pacific Biosciences and Oxford Nanopore sequencing machines /home/sobczm/bin/LoRMA-0.4

LUMPY A probabilistic framework for structural variant discovery /home/sobczm/bin/lumpy-sv/bin

MAFFT ver. 7.222 Rapid multiple sequence alignment based on fast Fourier transform /home/sobczm/bin/mafft-7.222/bin

MEGA ver. 7 Sophisticated and user-friendly software suite for analyzing DNA and protein sequence data from species and populations. /home/sobczm/bin/mega

MEME Suite ver. 4.11.2 MEME SUITE: tools for motif discovery and searching /home/sobczm/bin/meme_4.11.2/bin

MinorSeq Minor Variant Calling and Phasing Tools for PacBio reads /home/sobczm/bin/minorseq on Triticum

NGMLR ver. 0.2.3 Ngmlr is a long-read mapper designed to align PacBilo or Oxford Nanopore to a reference genome with a focus on reads that span structural variations. Generates read mappings used by PacBio/Nanopore SV caller sniffles. /home/sobczm/bin/ngmlr/bin

NLR-Parser A tool to rapidly annotate the NLR complement from sequenced plant genomes. home/sobczm/bin/NLR-Parser Requires meme 4.9.1 in /home/sobczm/bin/meme_4.9.1/bin

OrthoFinder v1.0.7 OrthoFinder: Accurate inference of orthogroups, orthologues, gene trees and rooted species tree made easy /home/sobczm/bin/OrthoFinder-1.0.7/orthofinder

Dependencies:

export PATH=/home/sobczm/bin/mcl-14-137/bin:${PATH}
export PATH=/home/sobczm/bin/fastme-2.1.5/bin:${PATH}
export PATH=/home/sobczm/bin/dlcpar-1.0/bin:${PATH}
export PATH=/home/sobczm/bin/mafft-7.222/bin:${PATH}
export PATH=/home/sobczm/bin/FastTree2.1.9:${PATH}

PAL2NAL: Robust conversion of protein sequence alignments into the corresponding codon alignments /home/sobczm/bin/pal2nal.v14

PAML ver. 4.8 A package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood. /home/sobczm/bin/paml4.8/bin

PartitionFinder ver. 1.1.1 A program to select best-fit partitioning schemes and models of molecular evolution for phylogenetic analyses /home/sobczm/bin/PartitionFinder1.1.1 *Needs to be run with Anaconda Python distribution installed in /home/sobczm/bin/anaconda2/bin

PGDSpider ver. 2.1.0.3 An automated data conversion tool for connecting population genetics and genomics programs /home/sobczm/bin/PGDSpider_2.1.0.3

PicardTools ver. 2.5.0 A set of command line tools for manipulating formats such as SAM/BAM/CRAM and VCF. /home/sobczm/bin/picard-tools-2.5.0

PhyloNet ver. 3.5.5 & [PhyloNetHMM ver. 0.1] (http://bioinfo.cs.rice.edu/software/phmm) Bayesian inference of reticulate phylogenies under the multispecies network coalescent /home/sobczm/bin/phmm

Phyutility ver. 2.6.6 Phyutility provides a set of phyloinformatics tools for summarizing and manipulating phylogenetic trees, manipulating molecular data and retrieving data from NCBI. /home/sobczm/bin/phyutility

popoolation ver. 1.2.2 PoPoolation is a pipeline for analysing pooled next generation sequencing data. home/sobczm/bin/popoolation_1.2.2

RAxML ver. 8.2.9: a ML a tool for phylogenetic analysis and post-analysis of large phylogenies /home/sobczm/bin/RAxML8.2.9

RGAugury: A pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants. Multiple dependencies, including own copy of Phobius. Phobius output not saved correctly when running via qsub /home/sobczm/bin/rgaugury

Sambamba ver. 0.6.5 Sambamba is a high performance modern robust and fast tool (and library) for working with SAM and BAM files. /home/sobczm/bin/sambamba/

Sniffles Sniffles is a structural variation caller using third generation sequencing (PacBio or Oxford Nanopore). /home/sobczm/bin/Sniffles/bin

snpEff & snpSift ver. 4.3 Genetic variant annotation, effect prediction, VCF filtering and manipulation toolbox. /home/sobczm/bin/snpEff

SplitsTree ver. 4 Program for computing unrooted phylogenetic networks from molecular sequence data. /home/sobczm/bin/splitstree4

STRUCTURE ver. 2.3.4 A package for using multi-locus genotype data to investigate population structure /home/sobczm/bin/structure

structure Harvester ver. 0.6.93 Downstream processing of STRUCTURE results to calculate Evanno’s Δk value and prepares input file for CLUMPP /home/sobczm/bin/structureHarvester

CLUMPP ver. 1.1.2 Permutes the clusters output by independent runs of STRUCTURE, so that they match up as closely as possible. /home/sobczm/bin/CLUMPP_Linux64.1.1.2

distruct ver. 1.1 A program to graphically display results produced by STRUCTURE or by other similar programs. /home/sobczm/bin/distruct1.1

Stampy ver. 1.0.29 Sensitive mapping of Illumina reads. /home/sobczm/bin/stampy-1.0.29

Tracer ver. 1.6 A program for analysing the trace files generated by Bayesian MCMC runs. /home/sobczm/bin/beast/Tracer_v1.6/bin

transAlign ver. 1.2 An open-source Perl script that aligns protein-coding DNA sequences via their amino-acid translations /home/sobczm/bin/transalign Dependency: /home/sobczm/bin/clustalw1.83

Treemix ver. 1.12: estimation of population trees with admixture. /home/sobczm/bin/treemix-1.12

Trinity ver 2.2: RNA-Seq assembly /home/sobczm/bin/trinityrnaseq-2.2.0

vawk An awk-like VCF parser /home/sobczm/bin/vawk

vcflib: a simple C++ library for parsing and manipulating VCF files, + many command-line utilities. /home/sobczm/bin/vcflib/bin

VCFtools: another set of C++ and Perl libraries for analysing VCF files. /home/sobczm/bin/vcftools/bin

#SET PERL PATH
export PERL5LIB=/home/sobczm/bin/vcftools/share/perl/5.14.2

PyVCF A VCF v. 4.0 and 4.1 parser for Python. /home/sobczm/bin/PyVCF/bin

#SET PYTHON PATH
export PYTHONPATH="$PYTHONPATH:/home/sobczm/bin/PyVCF/lib/python2.7/site-packages/"

USEARCH v. 9.0 High-throughput search and clustering /home/sobczm/bin/usearch

Weeder ver. 2.0 Discovery of transcription factor binding sites in a set of sequences from co-regulated genes /home/sobczm/bin/weeder

Nanopore sequencing

poretools ver. 0.6 A toolkit for working with nanopore sequencing data from Oxford Nanopore /home/sobczm/bin/poretools/poretools Usage: python ./poretools

marginAlign The marginAlign package can be used to align reads to a reference genome and call single nucleotide variations (SNVs). It is specifically tailored for Oxford Nanopore Reads. /home/sobczm/bin/marginAlign

NanoOK Flexible, multi-reference software for pre- and post-alignment analysis of nanopore sequencing data, quality and error profiles /home/sobczm/bin/NanoOK/bin

minion A small utility program to infer or test the presence of 3' adapter sequence in sequencing data. /home/sobczm/bin/minion