Functionality

trvinh edited this page Feb 12, 2018 · 10 revisions

Capability

PhyloProfile can dynamically visualize and explore multi-layered phylogenetic profiles.

Two additional layers of information can be integrated into a presence/absence phylogenetic profile. Each layer can be any comparable value between a seed protein and its ortholog, e.g. sequence similarity, domain architecture similarity, semantic similarity of Gene Ontology terms, taxonomic distance, 3D structure similarity, etc.

Dynamic visualization

Users can:

  • dynamically change the resolution of the analysis from individual species to entire classes or phyla by collapsing the input taxa into a higher systematic rank (*).
  • dynamically filter data by applying different thresholds to the integrated information.
  • dynamically modify the appearance of the profile with diverse plot configuration options.

(*) PhyloProfile can represent co-orthologs (in-paralogs) if the working taxonomic rank is the lowest rank found in the input taxa.
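The rank collapsing and threshold filtering described above can be sketched as follows. This is a minimal illustration in Python, not PhyloProfile's own implementation: the record layout, the `SPECIES_TO_CLASS` mapping, and the `collapse` function are all hypothetical names chosen for the example. It assumes that a supertaxon is summarized by the fraction of its input species that have an ortholog passing the filter.

```python
from collections import defaultdict

# Hypothetical mapping from input species to a higher rank (illustrative only).
SPECIES_TO_CLASS = {
    "Homo sapiens": "Mammalia",
    "Mus musculus": "Mammalia",
    "Gallus gallus": "Aves",
}

# Hypothetical long-format records: (gene, species, value of one information layer).
RECORDS = [
    ("geneA", "Homo sapiens", 0.9),
    ("geneA", "Mus musculus", 0.4),
    ("geneA", "Gallus gallus", 0.8),
    ("geneB", "Homo sapiens", 0.7),
]


def collapse(records, species_to_rank, min_value=0.0):
    """Summarize orthologs at a higher rank.

    Returns {(gene, supertaxon): fraction of that supertaxon's input species
    with an ortholog whose layer value is >= min_value}.
    """
    # Count how many input species belong to each supertaxon.
    species_per_rank = defaultdict(set)
    for species, rank in species_to_rank.items():
        species_per_rank[rank].add(species)

    # Collect the species that pass the threshold, per gene and supertaxon.
    present = defaultdict(set)
    for gene, species, value in records:
        if value >= min_value:
            present[(gene, species_to_rank[species])].add(species)

    return {
        key: len(species) / len(species_per_rank[key[1]])
        for key, species in present.items()
    }


unfiltered = collapse(RECORDS, SPECIES_TO_CLASS)
filtered = collapse(RECORDS, SPECIES_TO_CLASS, min_value=0.5)
```

Raising `min_value` removes the weak mouse ortholog of `geneA`, so the Mammalia cell for that gene drops from full to half coverage while the Aves cell is unchanged.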

Users can visualize the complete profile (Main profile) or only a subset of genes and taxa for a detailed study (Customized profile).

In addition, PhyloProfile's UI adapts automatically to the user's input files, for example by showing the names of the two additional information layers or the list of input taxa.

Dynamic analysis functions

PhyloProfile provides several interactive functions for analyzing phylogenetic profiles.

  1. Profile clustering: cluster genes according to the distance between their phylogenetic profiles in order to bring similar profiles together. Similar profiles can indicate a novel functional relationship between proteins.

  2. Gene age estimation: estimate the evolutionary age of genes using an LCA algorithm, i.e. the last common ancestor of the two most distantly related species in the ortholog group serves as the minimal gene age of that group.

  3. Core gene identification: find genes that are shared by all selected taxa. The core gene set can be used for e.g. phylogenetic tree reconstruction.

  4. Distribution analysis: from the distribution of the values of the two integrated information layers and the percentage of taxa summarized at the chosen taxonomic rank, users can decide on a reasonable filtering threshold.
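Gene age estimation and core gene identification from the list above can be illustrated with a short Python sketch. It is an assumption-laden example, not PhyloProfile's code: the `LINEAGES` and `PROFILES` data are simplified and hypothetical, and `gene_age` and `core_genes` are names invented here. The sketch relies on the fact that the last common ancestor of all species in an ortholog group is the same node as the last common ancestor of its two most distantly related species.

```python
# Hypothetical, simplified lineages listed from the root down to the species.
LINEAGES = {
    "Homo sapiens": ["cellular organisms", "Eukaryota", "Metazoa", "Mammalia", "Homo sapiens"],
    "Mus musculus": ["cellular organisms", "Eukaryota", "Metazoa", "Mammalia", "Mus musculus"],
    "Saccharomyces cerevisiae": ["cellular organisms", "Eukaryota", "Fungi", "Saccharomyces cerevisiae"],
    "Escherichia coli": ["cellular organisms", "Bacteria", "Escherichia coli"],
}

# Hypothetical presence/absence profiles: gene -> set of species with an ortholog.
PROFILES = {
    "geneA": {"Homo sapiens", "Mus musculus"},
    "geneB": {"Homo sapiens", "Mus musculus", "Saccharomyces cerevisiae"},
    "geneC": set(LINEAGES),
}


def gene_age(taxa, lineages):
    """Return the last common ancestor of all taxa as the minimal gene age.

    Walks the lineages in parallel from the root and stops at the first
    level where they disagree. Returns None for an empty group.
    """
    paths = [lineages[taxon] for taxon in taxa]
    if not paths:
        return None
    lca = None
    for ranks in zip(*paths):
        if len(set(ranks)) != 1:
            break
        lca = ranks[0]
    return lca


def core_genes(profiles, selected_taxa):
    """Return the genes that are present in every selected taxon."""
    wanted = set(selected_taxa)
    return sorted(gene for gene, taxa in profiles.items() if wanted <= taxa)


ages = {gene: gene_age(taxa, LINEAGES) for gene, taxa in PROFILES.items()}
core = core_genes(PROFILES, ["Homo sapiens", "Saccharomyces cerevisiae"])
```

With this toy data, a gene found only in human and mouse is dated to Mammalia, while a gene that is also present in a bacterium is dated to the root of the simplified tree.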

The phylogenetic profile plots (Main profile & Customized profile) and the analysis functions interact with each other. The filtering thresholds applied to the Main profile affect the results of the Core gene identification and the Distribution analysis. A set of similar genes chosen from Profile clustering, all genes of the same evolutionary age selected from Gene age estimation, or the core genes of a list of taxa of interest can be submitted directly to the Customized profile for detailed analysis.
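Selecting a set of genes with similar profiles, as one might do before sending them to the Customized profile, can be sketched as below. The source does not state which distance measure or clustering method PhyloProfile uses, so the Jaccard distance on presence/absence sets is an assumption made for illustration, and `jaccard_distance`, `similar_genes`, and the `PROFILES` data are hypothetical.

```python
# Hypothetical presence/absence profiles: gene -> set of taxa with an ortholog.
PROFILES = {
    "geneA": {"Homo sapiens", "Mus musculus"},
    "geneB": {"Homo sapiens", "Mus musculus", "Saccharomyces cerevisiae"},
    "geneC": {"Escherichia coli"},
}


def jaccard_distance(a, b):
    """Distance between two presence sets: 0.0 if identical, 1.0 if disjoint."""
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)


def similar_genes(profiles, query, max_distance):
    """Return the genes whose profile lies within max_distance of the query gene."""
    reference = profiles[query]
    return sorted(
        gene
        for gene, taxa in profiles.items()
        if gene != query and jaccard_distance(reference, taxa) <= max_distance
    )


subset = similar_genes(PROFILES, "geneA", max_distance=0.5)
```

The resulting gene list, together with the query gene, is the kind of subset that would then be inspected in more detail.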

Optional data representation

In addition to displaying the basic information of a protein, including the protein ID, the taxon it belongs to, and the values of the two integrated information layers, PhyloProfile can show its FASTA sequence as well as a plot (Domain architecture plot) comparing the feature architectures of two proteins (seed and ortholog) (*).

(*) If only the architecture of the selected protein is available, the Domain architecture plot shows only the domain annotation for that protein.

Interoperable output

All plots generated in PhyloProfile can be exported as PDF files.

The filtered data of the Main profile and the Customized profile can be downloaded for downstream analysis, e.g. phylogenomic tree reconstruction or metabolic pathway reconstruction.

More

Read the walkthrough slides to explore the full functionality of PhyloProfile.