MetaEvoMining

What is MetaEvoMining?

MetaEvoMining is a tool for exploring unknown enzymes using an evolutionary approach. Through sequence homology, MetaEvoMining detects genes that have undergone expansion and identifies potential candidates for enzymes recruited through natural selection into novel biosynthetic pathways. MetaEvoMining searches for differences among homologous sequences from organisms that share a common evolutionary lineage.

Quick Start

Installation

This package is hosted on GitHub.

Before installing it, you need to install devtools:


install.packages("devtools")

Once devtools is installed, load it:


library(devtools)

Then you can install the MetaEvoMining package with install_github:


install_github("andrespan/MetaEvoMining")

Module 1 (optional)

|Get a central Database| We can create a database in two steps. The first is a search for orthologous sequences with get_homologues.


#run_get_homologs

The second step searches the resulting directories for sequences that are present in more than half of the database and groups them into a database ready to use in the program.

The csv_matrix is the pangenome matrix produced by get_homologues.
The path is the alg_intersection output directory of get_homologues.


search_shell_enzymes_DB("pangenome_matrix_t0.tr.csv",path)
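The "present in more than half of the database" criterion can be illustrated with a minimal sketch. This is not the implementation of search_shell_enzymes_DB; the helper clusters_in_majority and the toy matrix are hypothetical, and the layout (rows are gene clusters, columns are genomes) is an assumption about the transposed pangenome matrix.

```r
# Toy copy-count matrix. Assumed layout: rows = gene clusters, columns = genomes.
pangenome <- matrix(
  c(1, 1, 1, 0,
    1, 0, 0, 0,
    2, 1, 0, 1,
    0, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cluster", 1:4), paste0("genome", 1:4))
)

# Hypothetical helper (not part of MetaEvoMining): return the clusters
# that occur in strictly more than half of the genomes.
clusters_in_majority <- function(mat) {
  occupancy <- rowMeans(mat > 0)  # fraction of genomes containing each cluster
  rownames(mat)[occupancy > 0.5]
}

shell_clusters <- clusters_in_majority(pangenome)
print(shell_clusters)
```

In this toy example a cluster found in exactly half of the genomes is excluded, because the threshold is strict.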

Module 2

|Get the EvoMining files| This module takes protein assemblies and generates a functional annotation table and sequence file to run EvoMining.

The annotation_dirpath is the path of annotation output directory created by KofamScan.
The genome_dirpath is the path of all FASTA protein files of all the bins.
The gtdbK_report is a TSV file created by the GTDB-Tk program.


make_EvoFiles(annotation_dirpath,
                          genome_dirpath,
                          IDs_table)
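Before calling make_EvoFiles, it can help to confirm that the input directory actually contains protein FASTA files. The sketch below only uses base R; the directory name bins_example and the .faa extension are assumptions made for illustration, not requirements stated by the package.

```r
# Build a throwaway directory with placeholder files (illustration only).
genome_dirpath <- file.path(tempdir(), "bins_example")
dir.create(genome_dirpath, showWarnings = FALSE)
invisible(file.create(file.path(genome_dirpath,
                                c("bin1.faa", "bin2.faa", "notes.txt"))))

# Collect the protein FASTA files; the ".faa" extension is an assumption.
fasta_files <- list.files(genome_dirpath, pattern = "\\.faa$", full.names = TRUE)

# Stop early if the directory is missing or holds no FASTA files.
stopifnot(dir.exists(genome_dirpath), length(fasta_files) > 0)
print(basename(fasta_files))
```

The same check can be repeated for annotation_dirpath with whatever file pattern your KofamScan output uses.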

Module 3

|Run EvoMining trees| This module runs EvoMining with the files resulting from the previous module. You can use the central database generated in module 1 or any database created with the EvoMining guidelines.

This function generates a table that reports the copy counts in the enzyme families.


#run_EvoMining

The generated table can be filtered with the following function. This function searches the EvoMining table and looks for columns (enzymes) where the counts in the input genomes are above the mode. It reports those columns in a list to run the trees.

The EvoMinining_heat_table is a copy count table where the columns are the enzyme families and the rows are the input genomes.


filter_interest_families(EvoMinining_heat_table)
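To make the "above the mode" idea concrete, here is a minimal sketch of one possible reading: a family is kept when at least one genome has a copy count higher than the most frequent count in that column. This is not the code of filter_interest_families; stat_mode, families_above_mode, and the toy table are hypothetical.

```r
# Toy copy-count table: rows = input genomes, columns = enzyme families.
heat_table <- data.frame(
  enzA = c(1, 1, 1, 3),
  enzB = c(2, 2, 2, 2),
  enzC = c(0, 1, 1, 2),
  row.names = paste0("genome", 1:4)
)

# Base R has no statistical mode function, so define one.
# Ties are resolved in favor of the smallest value.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Hypothetical filter: keep families where some genome exceeds the column mode.
families_above_mode <- function(tbl) {
  keep <- vapply(tbl, function(col) any(col > stat_mode(col)), logical(1))
  names(tbl)[keep]
}

interest_families <- families_above_mode(heat_table)
print(interest_families)
```

Here enzB is dropped because every genome carries the modal copy number, so there is no sign of expansion in that family.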

After that, you can run the selected trees in the list and inspect the predictions. Each branch is labeled with the ID of its sequence, so you can look it up.


#run_trees
