enhancement request: streamline multiblock functions by removing scheme and init args #336

Closed
4 of 5 tasks
evaham1 opened this issue Oct 29, 2024 · 1 comment · Fixed by #339
Labels
feature-request Can be implemented if there's enough interest

evaham1 commented Oct 29, 2024

Is your feature request related to a problem? Please describe.

In the multiblock functions you can choose the scheme, and for the horst scheme you can also choose the init (svd.single or svd). As discussed with Kim-Anh today, users only use the horst scheme, so let's hardcode this scheme into the functions. init defaults to svd.single, but let's hardcode whichever option runs faster.

This will streamline these functions and prevent confusion, since these additional arguments are not explained in detail in the documentation.

Functions that this affects:
block.pls()
block.spls()
block.plsda()
block.splsda() == wrapper.sgccda()
mint.block.pls()
mint.block.spls()
mint.block.plsda()
mint.block.splsda()
wrapper.rgcca()
wrapper.sgcca()

Process:

✅ Check whether svd.single or svd runs faster for each function

  • Hardcode scheme = horst and init = svd.single
  • Update docs
  • Update unit tests
  • Propagate changes through other functions e.g. perf() and tune()
  • Check if there is anything in website/book about the schemes and update there
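
One way the hardcoding step could look is sketched below. This is a hypothetical illustration only, not the mixOmics implementation: the function name block_fit_sketch, its return value, and the choice to warn (rather than error) when a removed argument is passed are all assumptions made for the example.

```r
# Hypothetical sketch: NOT the mixOmics code. It only illustrates removing
# `scheme` and `init` from a function signature while fixing them internally.
block_fit_sketch <- function(X, ncomp = 2, ...) {
  extra <- list(...)
  # Detect callers that still pass the removed arguments.
  removed <- intersect(names(extra), c("scheme", "init"))
  if (length(removed) > 0) {
    warning(
      "Ignoring removed argument(s): ", paste(removed, collapse = ", "),
      "; scheme = 'horst' and init = 'svd.single' are now fixed.",
      call. = FALSE
    )
  }
  # Values are fixed here instead of being exposed to the user.
  scheme <- "horst"
  init <- "svd.single"
  # Placeholder return value so the sketch runs end to end.
  list(n_blocks = length(X), ncomp = ncomp, scheme = scheme, init = init)
}

# A caller that still passes `scheme` gets a warning and the fixed value.
res <- suppressWarnings(
  block_fit_sketch(list(a = matrix(1), b = matrix(2)), scheme = "centroid")
)
res$scheme
```

Whether old calls should warn or fail outright is a design decision for the maintainers; the sketch picks a warning only to keep existing scripts running.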
evaham1 added the feature-request label Oct 29, 2024
evaham1 self-assigned this Oct 29, 2024
evaham1 commented Nov 8, 2024

Comparing run time of svd versus svd.single

The timings below show very little difference in runtime, so we will stick with svd.single.
Note that the default was svd.single for block.pls and block.spls, and svd for block.plsda and block.splsda.

library(mixOmics)
library(microbenchmark)

## block.pls
data("breast.TCGA")
data = list(mrna = breast.TCGA$data.train$mrna, mirna = breast.TCGA$data.train$mirna)
design = matrix(1, ncol = length(data), nrow = length(data),
                dimnames = list(names(data), names(data)))
diag(design) =  0
design

system.time(
  block.pls(X = data, Y = breast.TCGA$data.train$protein,
            ncomp = 2, design = design, init = "svd.single")
)
# user  system elapsed 
# 0.013   0.001   0.013 

system.time(
  block.pls(X = data, Y = breast.TCGA$data.train$protein,
            ncomp = 2, design = design, init = "svd")
)
# user  system elapsed 
# 0.020   0.001   0.021 

## block.plsda
data(nutrimouse)
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid, Y = nutrimouse$diet)
design = matrix(c(0,1,1,1,0,1,1,1,0), ncol = 3, nrow = 3,
                byrow = TRUE, dimnames = list(names(data), names(data)))

system.time(
  block.plsda(X = data, indY = 3, init = "svd.single")
)
# user  system elapsed 
# 0.006   0.000   0.006 
system.time(
  block.plsda(X = data, indY = 3, init = "svd")
)
# user  system elapsed 
# 0.005   0.000   0.005 

## mint.block.pls
study <- c(rep("study1", 100), rep("study2", 50))
data <- list(mrna = breast.TCGA$data.train$mrna, mirna = breast.TCGA$data.train$mirna)
design <- matrix(1, ncol = length(data), nrow = length(data), dimnames = list(names(data), names(data)))
diag(design) <-  0

system.time(
  mint.block.pls(data, Y = breast.TCGA$data.train$protein, study = study, ncomp = 2, init = "svd.single")
)
# user  system elapsed 
# 0.014   0.001   0.015 
system.time(
  mint.block.pls(data, Y = breast.TCGA$data.train$protein, study = study, ncomp = 2, init = "svd")
)
# user  system elapsed 
# 0.022   0.001   0.022 

## mint.block.plsda
study <- c(rep("study1",150), rep("study2",70))
mrna <- rbind(breast.TCGA$data.train$mrna, breast.TCGA$data.test$mrna)
mirna <- rbind(breast.TCGA$data.train$mirna, breast.TCGA$data.test$mirna)
Y <- c(breast.TCGA$data.train$subtype, breast.TCGA$data.test$subtype)
data <- list(mrna = mrna, mirna = mirna)

system.time(
  mint.block.plsda(data, Y, study = study, ncomp = 2, init = "svd.single")
)
# user  system elapsed 
# 0.013   0.000   0.013 
system.time(
  mint.block.plsda(data, Y, study = study, ncomp = 2, init = "svd")
)
# user  system elapsed 
# 0.011   0.001   0.011 

## session info
sessionInfo()
# R version 4.4.1 (2024-06-14)
# Platform: aarch64-apple-darwin20
# Running under: macOS 15.1
# 
# Matrix products: default
# BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
# LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
# 
# locale:
#   [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# time zone: Australia/Melbourne
# tzcode source: internal
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] mixOmics_6.29.3 ggplot2_3.5.1   lattice_0.22-6  MASS_7.3-61    
# 
# loaded via a namespace (and not attached):
#   [1] Matrix_1.7-1        gtable_0.3.6        dplyr_1.1.4         compiler_4.4.1      tidyselect_1.2.1    Rcpp_1.0.13        
# [7] ellipse_0.5.0       stringr_1.5.1       parallel_4.4.1      gridExtra_2.3       tidyr_1.3.1         rARPACK_0.11-0     
# [13] scales_1.3.0        BiocParallel_1.39.0 plyr_1.8.9          R6_2.5.1            generics_0.1.3      igraph_2.1.1       
# [19] ggrepel_0.9.6       tibble_3.2.1        munsell_0.5.1       pillar_1.9.0        RColorBrewer_1.1-3  rlang_1.1.4        
# [25] utf8_1.2.4          stringi_1.8.4       pkgload_1.4.0       cli_3.6.3           withr_3.0.2         magrittr_2.0.3     
# [31] grid_4.4.1          rstudioapi_0.17.1   lifecycle_1.0.4     vctrs_0.6.5         RSpectra_0.16-2     glue_1.8.0         
# [37] corpcor_1.6.10      codetools_0.2-20    fansi_1.0.6         colorspace_2.1-1    purrr_1.0.2         reshape2_1.4.4     
# [43] matrixStats_1.4.1   tools_4.4.1         pkgconfig_2.0.3    
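
The timings above come from single runs in the millisecond range, where run-to-run noise can be as large as the differences being compared. A repeated-run version might look like the sketch below. The helper time_repeated is a hypothetical name introduced for this example and uses base R only; the mixOmics part assumes an installed version in which block.pls() still accepts init (such as 6.29.3 from the session info) and is skipped otherwise.

```r
# Hypothetical helper: run `fun` several times and summarise elapsed seconds.
time_repeated <- function(fun, times = 20L) {
  stopifnot(is.function(fun), times >= 1L)
  elapsed <- vapply(
    seq_len(times),
    function(i) system.time(fun())[["elapsed"]],
    numeric(1)
  )
  c(min = min(elapsed), median = stats::median(elapsed), max = max(elapsed))
}

# Compare the two init options for block.pls, but only if mixOmics is
# installed and its block.pls() still has an `init` argument.
if (requireNamespace("mixOmics", quietly = TRUE) &&
    "init" %in% names(formals(mixOmics::block.pls))) {
  data("breast.TCGA", package = "mixOmics")
  X <- list(mrna = breast.TCGA$data.train$mrna,
            mirna = breast.TCGA$data.train$mirna)
  Y <- breast.TCGA$data.train$protein
  for (init in c("svd.single", "svd")) {
    cat("init =", init, "\n")
    print(time_repeated(
      function() mixOmics::block.pls(X = X, Y = Y, ncomp = 2, init = init)
    ))
  }
}
```

Reporting the median alongside the minimum and maximum makes it easier to judge whether a gap of a few milliseconds is consistent or just noise.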

evaham1 linked a pull request Nov 11, 2024 that will close this issue