create new generic converter #506

Closed
wants to merge 66 commits into from
Commits
1bc1d14
new generic function convert
thpral Mar 18, 2024
8d1f351
keep the original functions in their original files
thpral Mar 18, 2024
efeb240
up
thpral Mar 19, 2024
fca1fbd
up
thpral Mar 19, 2024
10d353b
convert documentation
thpral Mar 20, 2024
2c27585
up
thpral Mar 20, 2024
24366d7
up
thpral Mar 20, 2024
a5898c6
correct documentation
thpral Mar 21, 2024
bb48891
correct internal functions naming
thpral Mar 21, 2024
7eb4616
up
thpral Mar 21, 2024
ef58f1b
up
thpral Mar 21, 2024
813f6e6
up
thpral Mar 21, 2024
ecd88df
up
thpral Mar 21, 2024
06b7dfa
up
thpral Mar 21, 2024
6fcfcc0
up
thpral Mar 22, 2024
4a937fc
correct convert man page
thpral Mar 25, 2024
6afdc7e
correct BiocCheck warnings
thpral Mar 26, 2024
3f31882
up
thpral Mar 26, 2024
5c57a91
Merge branch 'devel' into draft_converters
TuomasBorman Apr 3, 2024
580ad1b
change param tree_name to tree.name for convert
thpral Apr 3, 2024
2f7db75
up
thpral Apr 3, 2024
1f7cf4b
Merge branch 'draft_converters' of https://github.com/microbiome/mia …
thpral Apr 3, 2024
c8b6242
new parameter for choosing output type when converting SE and TSE
thpral Apr 3, 2024
08277cb
up
thpral Apr 4, 2024
c5847e4
merge devel and resolve conflicts
thpral Apr 4, 2024
ee4d240
up
TuomasBorman Apr 10, 2024
84d4976
changes after review
thpral Apr 10, 2024
837c4a7
merge remote (resolve conflicts)
thpral Apr 10, 2024
d368db2
merge devel and resolve conflicts
thpral Apr 10, 2024
d285bf9
correct arguments loadFromBiom
thpral Apr 10, 2024
c4f6265
correct deprecate and NAMESPACE file
thpral Apr 15, 2024
204b774
merge devel
thpral Apr 17, 2024
fca7a9c
correct importBIOM default arguments after merging devel
thpral Apr 17, 2024
5599459
move loadFromBiom arguments (doc)
thpral Apr 17, 2024
5e95935
export loadFromBiom
thpral Apr 25, 2024
ea0e066
export converters in deprecate file
thpral Apr 25, 2024
ccb0bec
update DESCRIPTION
thpral Apr 25, 2024
fc9d5e7
merge devel
thpral Apr 25, 2024
86ea4ab
rename man page loadFromBiom to importBIOM to avoid duplicated aliases
thpral Apr 26, 2024
9e193d0
resolve conflicts and merge devel
thpral May 6, 2024
26d1a4b
add import tags for phyloseq and dada classes
thpral May 6, 2024
c9956d0
merge devel
thpral May 9, 2024
2d415a1
merge devel and remove import tags for dada and phyloseq classes
thpral May 28, 2024
167cb37
fix lost braces
thpral May 28, 2024
b833e14
try to fix R CMD check note
thpral May 29, 2024
4cdfc9c
up
thpral May 29, 2024
97f40d2
reverse last 2 commits
thpral May 29, 2024
70016dd
Revert "fix lost braces"
thpral Jun 3, 2024
b977a1c
Merge branch 'draft_converters' of https://github.com/microbiome/mia …
thpral Jun 3, 2024
332acfa
merge devel
thpral Jun 3, 2024
7129b67
replace convert with new names in deprecate, biom and phyloseq
thpral Jun 5, 2024
67db5f2
merge devel
thpral Jun 10, 2024
99ec6cd
replace convert with new functions
thpral Jun 10, 2024
f400ab4
rename converters files
thpral Jun 10, 2024
93afb81
rearrange convert man page
thpral Jun 10, 2024
383adea
Merge branch 'devel' into draft_converters
thpral Jun 10, 2024
da81e8d
fix errors and warnings
thpral Jun 12, 2024
d360e0e
Merge branch 'draft_converters' of https://github.com/microbiome/mia …
thpral Jun 12, 2024
49a2a26
fix convertToPhyloseq
thpral Jun 12, 2024
e0590b8
try to fix R CMD check note
thpral Jun 12, 2024
dd43f9a
merge devel
thpral Jun 19, 2024
9cc92a2
merge devel
thpral Jun 19, 2024
d236775
small fix
thpral Jun 19, 2024
fdbd8af
renaming files
thpral Jun 20, 2024
a30c9fd
renaming files
thpral Jun 20, 2024
2f906a1
resolve R CMD check warning
thpral Jun 20, 2024
5 changes: 5 additions & 0 deletions NAMESPACE
@@ -29,6 +29,10 @@ export(calculateOverlap)
export(calculateRDA)
export(calculateUnifrac)
export(cluster)
export(convertFromBIOM)
export(convertFromDADA2)
export(convertFromPhyloseq)
export(convertToPhyloseq)
export(countDominantFeatures)
export(countDominantTaxa)
export(estimateDivergence)
@@ -158,6 +162,7 @@ exportMethods(calculateOverlap)
exportMethods(calculateUnifrac)
exportMethods(checkTaxonomy)
exportMethods(cluster)
exportMethods(convertToPhyloseq)
exportMethods(countDominantFeatures)
exportMethods(countDominantTaxa)
exportMethods(estimateDivergence)
5 changes: 4 additions & 1 deletion NEWS
@@ -135,7 +135,10 @@ Changes in version 1.13.x
includes the newly added subsampled assay.
+ Fix bug in mergeFeaturesByPrevalence
+ new aliases calculateDPCoA to getDPCoA, calculateNMDS to getNMDS, calculateRDA to getRDA,
calculateCCA to getCCA
calculateCCA to getCCA
+ add informative error message in rarefyAssay on assays with strictly-negative values
Review comment on lines +138 to +139 (Contributor):

Problem with NEWS: the same line is added twice.

+ add informative error message in rarefyAssay on assays with strictly-negative values
+ Use rbiom package in unifrac implementation
+ Updated parameter names to follow naming convention "parameter.name"
+ rename makeTreeSEFrom* and makePhyloseqFromTreeSE functions and combine their
documentation under new manual page convert
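
A rename like the one in this NEWS entry is usually paired with thin wrappers that keep the old names working while warning users. The following is a minimal base R sketch of that pattern, not the code from this PR: both `_sketch` function names and the toy return value are hypothetical, and only `.Deprecated()` and the condition-handling functions are real base R.

```r
# Hypothetical stand-in for the new converter; the real one builds a TreeSE.
convertFromBIOM_sketch <- function(obj, ...) {
    list(source = obj, class = "TreeSE-like")
}

# Hypothetical deprecated wrapper: warn first, then forward to the new name.
makeTreeSEFromBiom_sketch <- function(x, ...) {
    .Deprecated(new = "convertFromBIOM_sketch",
                old = "makeTreeSEFromBiom_sketch")
    convertFromBIOM_sketch(x, ...)
}

# The old name still returns a result, but signals a "deprecatedWarning",
# which is caught and muffled here so that the result can be inspected.
res <- withCallingHandlers(
    makeTreeSEFromBiom_sketch("biom-object"),
    deprecatedWarning = function(w) {
        message("caught: ", conditionMessage(w))
        invokeRestart("muffleWarning")
    }
)
```

Keeping the wrapper body to a single forwarding call means the old and new names cannot drift apart while the deprecation period lasts.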
84 changes: 45 additions & 39 deletions R/makeTreeSummarizedExperimentFromBiom.R → R/convertFromBIOM.R
@@ -1,10 +1,16 @@
#' Loading a biom file
#' Converters
#'
#' For convenience a few functions are available to convert data from a
#' \sQuote{biom} file or object into a
#' For convenience a few functions are available to convert BIOM, DADA2 and
#' phyloseq objects to
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#' objects and
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#' objects to phyloseq objects.
#'
#' @param file biom file location
#' @param file BIOM file location
#'
#' @param obj BIOM object to be converted to a
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#'
#' @param prefix.rm \code{TRUE} or \code{FALSE}: Should
#' taxonomic prefixes be removed? The prefixes are removed only from detected
@@ -27,22 +33,31 @@
#'
#' @param ... additional arguments
#' \itemize{
#' \item \code{patter}: \code{character} value specifying artifacts
#' \item{\code{pattern}}{\code{character} value specifying artifacts
#' to be removed. If \code{pattern = "auto"}, special characters
#' are removed. (default: \code{pattern = "auto"})
#' are removed. (default: \code{pattern = "auto"})
#' }
#' }
#'
#' @return An object of class
#' @details
#' \code{convertFromBIOM} coerces a BIOM object to a
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#' object.
#'
#' \code{importBIOM} loads a BIOM file and creates a
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#' object from the BIOM object contained in the loaded file.
#' @return
#' \code{importBIOM} returns an object of class
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#'
#' \code{convertFromBIOM} returns an object of class
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#'
#' @name makeTreeSEFromBiom
#' @seealso
#' \code{\link[=makeTreeSEFromPhyloseq]{makeTreeSEFromPhyloseq}}
#' \code{\link[=makeTreeSEFromDADA2]{makeTreeSEFromDADA2}}
#' \code{\link[=importQIIME2]{importQIIME2}}
#' \code{\link[=importMothur]{importMothur}}
#' @name convert
#'
#' @examples
#' ### Load and convert BIOM results to a TreeSE
#' # Load biom file
#' library(biomformat)
#' biom_file <- system.file("extdata", "rich_dense_otu_table.biom",
@@ -53,7 +68,7 @@
#'
#' # Make TreeSE from biom object
#' biom_object <- biomformat::read_biom(biom_file)
#' tse <- makeTreeSEFromBiom(biom_object)
#' tse <- convertFromBIOM(biom_object)
#'
#' # Get taxonomyRanks from prefixes and remove prefixes
#' tse <- importBIOM(biom_file,
@@ -69,31 +84,30 @@
#' artifact.rm = TRUE)
NULL

#' @rdname makeTreeSEFromBiom
#'
#' \code{importBIOM} loads a BIOM file and creates a TreeSE from the BIOM object
#' contained in the loaded file.
#' @rdname convert
#' @export
importBIOM <- function(file, ...) {
.require_package("biomformat")
biom <- biomformat::read_biom(file)
makeTreeSEFromBiom(biom, ...)
convertFromBIOM(biom, ...)
}

#' @rdname makeTreeSEFromBiom
#'
#' @param x object of type \code{\link[biomformat:read_biom]{biom}}
#'
#' @export
#' \code{convertFromBIOM} creates a TreeSE object from a BIOM object.
#' @importFrom S4Vectors make_zero_col_DFrame DataFrame
#' @importFrom dplyr %>% bind_rows
makeTreeSEFromBiom <- function(
x, prefix.rm = removeTaxaPrefixes,
#' @rdname convert
#' @export
convertFromBIOM <- function(
obj, prefix.rm = removeTaxaPrefixes,
removeTaxaPrefixes = FALSE, rank.from.prefix = rankFromPrefix,
rankFromPrefix = FALSE,
artifact.rm = remove.artifacts, remove.artifacts = FALSE, ...){
# input check
.require_package("biomformat")
if(!is(x,"biom")){
stop("'x' must be a 'biom' object", call. = FALSE)
Review comment (Contributor):

The object name is changed from obj to x, which indicates that your branch is out of sync.

if(!is(obj,"biom")){
stop("'obj' must be a 'biom' object", call. = FALSE)
}
if( !.is_a_bool(prefix.rm) ){
stop("'prefix.rm' must be TRUE or FALSE.", call. = FALSE)
@@ -105,9 +119,9 @@ makeTreeSEFromBiom <- function(
stop("'artifact.rm' must be TRUE or FALSE.", call. = FALSE)
}
#
counts <- as(biomformat::biom_data(x), "matrix")
sample_data <- biomformat::sample_metadata(x)
feature_data <- biomformat::observation_metadata(x)
counts <- as(biomformat::biom_data(obj), "matrix")
sample_data <- biomformat::sample_metadata(obj)
feature_data <- biomformat::observation_metadata(obj)

# colData is initialized with empty tables with rownames if it is NULL
if( is.null(sample_data) ){
@@ -185,8 +199,8 @@ makeTreeSEFromBiom <- function(
}

# Adjust row and colnames
rownames(counts) <- rownames(feature_data) <- biomformat::rownames(x)
colnames(counts) <- rownames(sample_data) <- biomformat::colnames(x)
rownames(counts) <- rownames(feature_data) <- biomformat::rownames(obj)
colnames(counts) <- rownames(sample_data) <- biomformat::colnames(obj)

# Convert into DataFrame
sample_data <- DataFrame(sample_data)
@@ -202,14 +216,6 @@ makeTreeSEFromBiom <- function(
return(tse)
}

####################### makeTreeSummarizedExperimentFromBiom ###################
#' @param x object of type \code{\link[biomformat:read_biom]{biom}}
#' @rdname makeTreeSEFromBiom
#' @export
makeTreeSummarizedExperimentFromBiom <- function(x, ...){
makeTreeSEFromBiom(x, ...)
}

################################ HELP FUNCTIONS ################################
# This function removes prefixes from taxonomy names
.remove_prefixes_from_taxa <- function(
Expand Down
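
The `convertFromBIOM` signature above pairs each new argument with its old name, for example `prefix.rm = removeTaxaPrefixes, removeTaxaPrefixes = FALSE`. This relies on R evaluating default arguments lazily inside the function. A minimal base R sketch of that behaviour follows; `convert_sketch` is a hypothetical function that only echoes the resolved value and is not part of the package.

```r
# Hypothetical function mirroring the argument pattern in convertFromBIOM():
# the new name defaults to the old name, and the old name defaults to FALSE.
convert_sketch <- function(prefix.rm = removeTaxaPrefixes,
                           removeTaxaPrefixes = FALSE) {
    # Defaults are evaluated lazily inside the function, so 'prefix.rm'
    # picks up whatever was passed as 'removeTaxaPrefixes'.
    prefix.rm
}

none_given <- convert_sketch()
old_only   <- convert_sketch(removeTaxaPrefixes = TRUE)
new_only   <- convert_sketch(prefix.rm = TRUE)
both_given <- convert_sketch(prefix.rm = FALSE, removeTaxaPrefixes = TRUE)
```

With neither argument given the result is `FALSE`; either name alone switches it to `TRUE`; when both are supplied, the explicitly passed new name takes precedence because its default is never evaluated.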
46 changes: 17 additions & 29 deletions R/makeTreeSummarizedExperimentFromDADA2.R → R/convertFromDADA2.R
@@ -1,44 +1,41 @@
#' Coerce \sQuote{DADA2} results to \code{TreeSummarizedExperiment}
#'
#' \code{makeTreeSEFromDADA2} is a wrapper for the
#' \code{mergePairs} function from the \code{dada2} package.
#'
#' @param ... See \code{mergePairs} function for
#' more details.
#' @param ... Additional arguments. For \code{convertFromDADA2}, see
#' \code{mergePairs} function for more details.
#'
#' @details
#' A count matrix is constructed via \code{makeSequenceTable(mergePairs(...))}
#' and rownames are dynamically created as \code{ASV(N)} with \code{N} from
#' 1 to \code{nrow} of the count tables. The colnames and rownames from the
#' output of \code{makeSequenceTable} are stored as \code{colnames} and in the
#' \code{referenceSeq} slot of the \code{TreeSummarizedExperiment},
#' respectively.
#' \code{convertFromDADA2} is a wrapper for the
#' \code{mergePairs} function from the \code{dada2} package.
#' A count matrix is constructed via
#' \code{makeSequenceTable(mergePairs(...))} and rownames are dynamically
#' created as \code{ASV(N)} with \code{N} from 1 to \code{nrow} of the count
#' tables. The colnames and rownames from the output of \code{makeSequenceTable}
#' are stored as \code{colnames} and in the \code{referenceSeq} slot of the
#' \code{TreeSummarizedExperiment}, respectively.
#'
#' @return An object of class \code{TreeSummarizedExperiment}
#' @return \code{convertFromDADA2} returns an object of class
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#'
#' @importFrom S4Vectors SimpleList
#' @importFrom Biostrings DNAStringSet
#'
#' @name makeTreeSEFromDADA2
#' @seealso
#' \code{\link[=makeTreeSEFromPhyloseq]{makeTreeSEFromPhyloseq}}
#' \code{\link[=makeTreeSEFromBiom]{makeTreeSEFromBiom}}
#' \code{\link[=importQIIME2]{importQIIME2}}
#' \code{\link[=importMothur]{importMothur}}
#' @rdname convert
#'
#' @export
#'
#' @examples
#'
#' ### Coerce DADA2 results to a TreeSE object
#' if(requireNamespace("dada2")) {
#' fnF <- system.file("extdata", "sam1F.fastq.gz", package="dada2")
#' fnR <- system.file("extdata", "sam1R.fastq.gz", package="dada2")
#' dadaF <- dada2::dada(fnF, selfConsist=TRUE)
#' dadaR <- dada2::dada(fnR, selfConsist=TRUE)
#'
#' tse <- makeTreeSEFromDADA2(dadaF, fnF, dadaR, fnR)
#' tse <- convertFromDADA2(dadaF, fnF, dadaR, fnR)
#' tse
#' }
makeTreeSEFromDADA2 <- function(...) {
convertFromDADA2 <- function(...) {
# input checks
.require_package("dada2")
.require_package("stringr")
@@ -62,12 +59,3 @@ makeTreeSEFromDADA2 <- function(...) {
rownames(output) <- rName
output
}

#################### makeTreeSummarizedExperimentFromDADA2 #####################
#' @param ... See \code{mergePairs} function for
#' more details.
#' @name makeTreeSEFromDADA2
#' @export
makeTreeSummarizedExperimentFromDADA2 <- function(...) {
makeTreeSEFromDADA2(...)
}
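
The `@details` text for `convertFromDADA2` describes generating `ASV(N)` row names while keeping the original sequences as reference sequences. The base R sketch below illustrates that idea on a toy table. It is an assumption-laden illustration, not the function body: the matrix `seq_tab` is made-up data shaped like a samples-by-sequences table, and the real implementation may format the names differently (for instance with zero padding) and stores sequences as a `DNAStringSet`.

```r
# Toy samples-by-sequences count table (made-up data for illustration).
seq_tab <- matrix(
    c(10L, 0L, 3L, 7L, 5L, 2L),
    nrow = 2,
    dimnames = list(c("sam1", "sam2"), c("ACGT", "ACGA", "TTGA"))
)

# Put sequences in rows so that each row is one feature.
counts <- t(seq_tab)

# Keep the sequences, then replace the row names with generated ASV labels.
ref_seq <- rownames(counts)
rownames(counts) <- paste0("ASV", seq_len(nrow(counts)))
names(ref_seq) <- rownames(counts)
```

After this step `counts` has short, stable row names, and `ref_seq` maps each label back to its sequence.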
77 changes: 77 additions & 0 deletions R/convertFromPhyloseq.R
@@ -0,0 +1,77 @@
#' @param phy a \code{phyloseq} object
#'
#' @details
#' \code{convertFromPhyloseq} converts \code{phyloseq}
#' objects into
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}} objects.
#' All data stored in a \code{phyloseq} object is transferred.
#'
#' @return
#' \code{convertFromPhyloseq} returns an object of class
#' \code{\link[TreeSummarizedExperiment:TreeSummarizedExperiment-class]{TreeSummarizedExperiment}}
#'
#' @importFrom S4Vectors SimpleList DataFrame make_zero_col_DFrame
#' @importFrom SummarizedExperiment colData colData<-
#'
#' @export
#'
#' @rdname convert
#' @seealso
#' \code{\link[=importQIIME2]{importQIIME2}}
#' \code{\link[=importMothur]{importMothur}}
#'
#' @examples
#'
#' ### Coerce a phyloseq object to a TreeSE object
#' if (requireNamespace("phyloseq")) {
#' data(GlobalPatterns, package="phyloseq")
#' convertFromPhyloseq(GlobalPatterns)
#' data(enterotype, package="phyloseq")
#' convertFromPhyloseq(enterotype)
#' data(esophagus, package="phyloseq")
#' convertFromPhyloseq(esophagus)
#' }
convertFromPhyloseq <- function(phy, ...) {
# input check
.require_package("phyloseq")
if(!is(phy,"phyloseq")){
stop("'phy' must be a 'phyloseq' object")
}
#
# Get the assay
counts <- phy@otu_table@.Data
# Check the orientation, and transpose if necessary
if( !phy@otu_table@taxa_are_rows ){
counts <- t(counts)
}
# Create a list of assays
assays <- SimpleList(counts = counts)

if(!is.null(phy@tax_table@.Data)){
rowData <- DataFrame(data.frame(phy@tax_table@.Data))
} else{
rowData <- S4Vectors::make_zero_col_DFrame(nrow(assays$counts))
rownames(rowData) <- rownames(assays$counts)
}
if(!is.null(phy@sam_data)){
colData <- DataFrame(data.frame(phy@sam_data))
} else{
colData <- S4Vectors::make_zero_col_DFrame(ncol(assays$counts))
rownames(colData) <- colnames(assays$counts)
}
if(!is.null(phy@phy_tree)){
rowTree <- phy@phy_tree
} else {
rowTree <- NULL
}
if (!is.null(phy@refseq)) {
referenceSeq <- phy@refseq
} else {
referenceSeq <- NULL
}
TreeSummarizedExperiment(assays = assays,
rowData = rowData,
colData = colData,
rowTree = rowTree,
referenceSeq = referenceSeq)
}
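
Two steps in `convertFromPhyloseq` above are easy to get wrong: orienting the count table so that taxa are in rows, and falling back to an empty table when metadata is missing. The sketch below reproduces both steps in base R on toy data. The helper names `orient_counts` and `metadata_or_empty` are hypothetical, and a plain `data.frame` stands in for the `DataFrame` built with `make_zero_col_DFrame` in the real code.

```r
# Hypothetical helper: return counts with taxa in rows.
orient_counts <- function(mat, taxa_are_rows) {
    if (!taxa_are_rows) {
        mat <- t(mat)
    }
    mat
}

# Hypothetical helper: use the metadata if present, otherwise an empty
# table that still carries the expected row names.
metadata_or_empty <- function(meta, ids) {
    if (!is.null(meta)) {
        return(as.data.frame(meta))
    }
    data.frame(row.names = ids)
}

# Toy table stored with samples in rows (made-up data).
otu <- matrix(1:6, nrow = 2,
              dimnames = list(c("s1", "s2"), c("t1", "t2", "t3")))
counts <- orient_counts(otu, taxa_are_rows = FALSE)

# No taxonomy available: zero-column table with one row per taxon.
row_data <- metadata_or_empty(NULL, rownames(counts))

# Sample metadata available: used as given.
col_data <- metadata_or_empty(
    data.frame(site = c("gut", "skin"), row.names = c("s1", "s2")),
    colnames(counts)
)
```

The empty fallback keeps row names aligned with the assay, which is what lets the constructor accept an object that has counts but no taxonomy or sample table.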