This directory contains workflows used by the Spang team for their phylogenetic analyses.
Currently, two workflows are available:
- A workflow for the 48 marker proteins selected by Zaremba-Niedzwiedzka et al., 2017 (Zaremba-Niedzwiedzka K, Caceres EF, Saw JH et al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature 2017;541:353–8.).
- A manually curated marker set specifically designed to place Undinarchaeota (members of the DPANN archaea) into the archaeal tree by Dombrowski et al., 2020 (Dombrowski N, Williams TA, Sun J et al. Undinarchaeota illuminate DPANN phylogeny and the impact of gene transfer on archaeal evolution. Nature Communications 2020;11:3939.).
Inside each workflow folder you will find the following:
- The unrendered Rmd file
- The rendered HTML file, which can be found in the docs folder
- The files folder, which contains files that might be needed to run the pipeline
The larger databases are not uploaded here. If you are interested in running the pipeline, you can find the HMM profiles for the 51 marker selection here: https://zenodo.org/record/3839790#.YCOX5DHPylN. For the 48 markers, you can run an HMM search against the arCOG database (contact nd.microbiota@gmail.com or get the sequences here: ftp://ftp.ncbi.nih.gov/pub/wolf/COGs/arCOG/).
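
As a rough illustration of the HMM search step mentioned above, the sketch below builds an `hmmsearch` (HMMER) command that writes a tabular `--tblout` report and then keeps the highest-scoring sequence per profile. It is a minimal sketch and not the workflow's own code: the file names, the E-value cutoff of 1e-5, and the helper names `build_hmmsearch_command` and `best_hits` are all assumptions chosen for illustration.

```python
def build_hmmsearch_command(hmm_file, proteins_faa, tblout, evalue=1e-5):
    """Return the argument list for an hmmsearch run with tabular output.

    The E-value cutoff is an illustrative assumption, not a value taken
    from the workflows in this directory.
    """
    return [
        "hmmsearch",
        "--tblout", str(tblout),   # per-sequence hits as a whitespace-delimited table
        "-E", str(evalue),         # report sequences with E-value <= this threshold
        str(hmm_file),             # HMM profiles (the queries)
        str(proteins_faa),         # protein sequences (the targets)
    ]


def best_hits(tblout_lines):
    """Keep the highest-scoring target sequence for each HMM profile.

    In hmmsearch --tblout output, column 1 is the target (sequence) name,
    column 3 is the query (profile) name, and column 6 is the
    full-sequence bit score. Lines starting with '#' are comments.
    """
    best = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, query, score = fields[0], fields[2], float(fields[5])
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return best
```

The command list can be passed to `subprocess.run(cmd, check=True)` once HMMER is installed, and the resulting table can be read with `best_hits(open("hits.tbl"))`, where `hits.tbl` is a hypothetical output path.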