bugphyzzExports

bugphyzz is a database that harmonizes physiological and other microbial trait annotations from different sources using a controlled vocabulary and ontology terms. Furthermore, these annotations are propagated to uncharacterized microbes through Ancestral State Reconstruction (ASR).

You can learn more about this project here.

This repository contains the code for resolving conflicting annotations and running the ASR step. It also contains the devel version of the annotations (before they are released on Zenodo), distributed across several text files. The *.csv files contain the data in tabular format and are imported through the bugphyzz::importBugphyzz function in R. The *.gmt files contain lists of microbial signatures in GMT format, created with the bugphyzz::makeSignatures function.
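
As an illustration of the GMT layout, here is a minimal base R sketch that parses GMT-formatted text into a named list. It assumes the usual GMT convention of one signature per line with tab-separated fields (name, description, then members). The helper name `read_gmt_lines` and the example lines are made up for illustration and are not part of bugphyzz.

```r
# Parse GMT text: one signature per line, tab-separated as
#   name <TAB> description <TAB> member_1 <TAB> member_2 ...
# `read_gmt_lines` is a hypothetical helper, not a bugphyzz function.
read_gmt_lines <- function(lines) {
    lines <- lines[nzchar(lines)]                  # skip empty lines
    fields <- strsplit(lines, "\t", fixed = TRUE)  # split each line on tabs
    sets <- lapply(fields, function(x) x[-(1:2)])  # drop name and description
    names(sets) <- vapply(fields, function(x) x[1], character(1))
    sets
}

# Made-up example lines (not taken from the real files)
example_lines <- c(
    "signature_A\tdescription A\t562\t1280",
    "signature_B\tdescription B\t1423"
)
sigs <- read_gmt_lines(example_lines)
lengths(sigs)  # number of members per signature
```

To try it on one of the exported files, something like `read_gmt_lines(readLines("bugphyzz-genus-NCBI_ID.gmt"))` should work from the main directory, provided the file follows the layout assumed above.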

The data schema is described here

The devel files are generated weekly.

File creation

If desired, anyone can generate the *.csv and *.gmt files.

1. Install required R packages

The first step is downloading the repo:

git clone https://github.com/waldronlab/bugphyzzExports.git
cd bugphyzzExports

The R packages listed in the snippet below need to be installed in the R environment. This can be done, for example, with:

## Inside an R session
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
dependencies <- c(
    "waldronlab/bugphyzz",
    "bugsigdbr",
    "castor",
    "dplyr",
    "logr",
    "phytools",
    "purrr",
    "rlang",
    "sessioninfo",
    "stringr",
    "waldronlab/taxPPro",
    "tibble",
    "tidyr"
)
BiocManager::install(dependencies)

Alternatively, run devtools::install_deps(dependencies = TRUE) in an R session started in the main directory of the repository.

2. Run the inst/scripts/export_bugphyzz.R script

Running the script writes the output files to the directory from which it is run, so it is preferable to run it from the main directory of the project.

On a Linux-like terminal:

Rscript inst/scripts/export_bugphyzz.R

On supermicro (for internal use):

/usr/bin/Rscript --vanilla inst/scripts/export_bugphyzz.R
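
After the script finishes, a quick check like the one below can confirm that the tabular exports were written. This is only a sketch: the three file names are the *.csv files currently present in this repository, and the helper `missing_exports` is a hypothetical name introduced here, not part of the export script.

```r
# File names of the tabular exports, as they appear in this repository
expected_files <- c(
    "bugphyzz_binary.csv",
    "bugphyzz_multistate.csv",
    "bugphyzz_numeric.csv"
)

# Hypothetical helper: return the expected files that are NOT found in `dir`
missing_exports <- function(dir = ".", files = expected_files) {
    files[!file.exists(file.path(dir, files))]
}

# An empty character vector means all expected files are present
missing_exports()
```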

License

The files are available under Creative Commons Attribution 4.0 International.

Zenodo

Find this dataset on Zenodo (latest release version): https://zenodo.org/doi/10.5281/zenodo.10980653

Versioning (recommended)

Some recommendations about versioning for releases.

Format: x.y.z
Example: 1.0.2

The third digit (z) should be used to fix typos or any other minor trouble with the annotations. Essentially these are the same annotations, but with minor adjustments.

The second digit (y) should be used for major adjustments such as fixing the way conflicting annotations are handled or adjusting ASR methods/parameters, say choosing a different phylogenetic tree or using a different package for running ASR.

The first digit (x) should be reserved for major changes, such as adding new datasets or using a completely different approach for propagating annotations, etc.
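
The scheme above can be sketched as a small R helper. The function name `bump_version` is hypothetical, and resetting the lower components to zero when a higher one is incremented is an assumption borrowed from common versioning practice; the recommendations above do not state it.

```r
# Hypothetical helper: increment one component of an "x.y.z" version string.
# Assumption: lower components are reset to 0 when a higher one is bumped.
bump_version <- function(version, part = c("z", "y", "x")) {
    part <- match.arg(part)
    v <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
    stopifnot(length(v) == 3L, !anyNA(v))
    if (part == "z") {
        v[3] <- v[3] + 1L       # typo fixes and other minor corrections
    } else if (part == "y") {
        v[2] <- v[2] + 1L       # e.g. changed conflict handling or ASR parameters
        v[3] <- 0L
    } else {
        v[1] <- v[1] + 1L       # e.g. new datasets or a new propagation approach
        v[2:3] <- 0L
    }
    paste(v, collapse = ".")
}

bump_version("1.0.2", "z")  # "1.0.3"
bump_version("1.0.2", "y")  # "1.1.0"
bump_version("1.0.2", "x")  # "2.0.0"
```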

Validation of the ASR method

A 10-fold cross-validation approach was used to estimate how well our ASR method performed for each attribute/physiology in the dataset. These validation results are not really part of the annotations, so they're not provided here. You can find them at https://github.com/waldronlab/taxPProValidation/ and use them to select the attributes with the best results. The validation values are also attached when importing the files with the bugphyzz package in R.
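
For readers unfamiliar with the idea, the sketch below shows how a generic 10-fold split can be set up in base R. It is not the code used in taxPProValidation: the taxon names are invented, the helper `make_folds` is hypothetical, and the actual validation may assign folds differently.

```r
# Hypothetical helper: randomly assign n items to k folds of near-equal size
make_folds <- function(n, k = 10) {
    sample(rep(seq_len(k), length.out = n))
}

set.seed(1)                        # for a reproducible example
taxa <- paste0("taxon_", 1:25)     # invented taxon names
folds <- make_folds(length(taxa), k = 10)

# For each fold, the held-out taxa form the test set and the rest the training set
splits <- lapply(seq_len(10), function(i) {
    list(train = taxa[folds != i], test = taxa[folds == i])
})

# Every taxon is held out exactly once across the 10 folds
length(unlist(lapply(splits, `[[`, "test"))) == length(taxa)
```

In a validation of this kind, the annotations of the held-out taxa would be hidden, predicted from the training taxa, and then compared with the known values.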
