Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow admm_MADMMplasso() under ADMMplasso() #17

Closed
wleoncio opened this issue Dec 4, 2023 · 21 comments · Fixed by #22 or #44
Closed

Slow admm_MADMMplasso() under ADMMplasso() #17

wleoncio opened this issue Dec 4, 2023 · 21 comments · Fixed by #22 or #44
Assignees
Labels
bug Something isn't working

Comments

@wleoncio
Copy link
Member

wleoncio commented Dec 4, 2023

It seems the C++ is now very slower than the R when I run with the example on github (setting p=500 and nlambda=50). I don't know wether it is due to the back and forth between C++ and R for each lambda.
You can check this by calling the MADMMplasso function and set legacy=T or F (I have included this in the call).

Originally posted by @Theo-qua in #16 (comment)

@wleoncio
Copy link
Member Author

wleoncio commented Dec 4, 2023

Can reproduce. See example below:

# Train the model
# generate some data
library(MADMMplasso)
set.seed(1235)
N <- 100
p <- 50
nz <- 4
K <- nz
X <- matrix(rnorm(n = N * p), nrow = N, ncol = p)
mx <- colMeans(X)
sx <- sqrt(apply(X, 2, var))
X <- scale(X, mx, sx)
X <- matrix(as.numeric(X), N, p)
Z <- matrix(rnorm(N * nz), N, nz)
mz <- colMeans(Z)
sz <- sqrt(apply(Z, 2, var))
Z <- scale(Z, mz, sz)
beta_1 <- rep(x = 0, times = p)
beta_2 <- rep(x = 0, times = p)
beta_3 <- rep(x = 0, times = p)
beta_4 <- rep(x = 0, times = p)
beta_5 <- rep(x = 0, times = p)
beta_6 <- rep(x = 0, times = p)

beta_1[1:5] <- c(2, 2, 2, 2, 2)
beta_2[1:5] <- c(2, 2, 2, 2, 2)
beta_3[6:10] <- c(2, 2, 2, -2, -2)
beta_4[6:10] <- c(2, 2, 2, -2, -2)
beta_5[11:15] <- c(-2, -2, -2, -2, -2)
beta_6[11:15] <- c(-2, -2, -2, -2, -2)

Beta <- cbind(beta_1, beta_2, beta_3, beta_4, beta_5, beta_6)
colnames(Beta) <- c(1:6)

theta <- array(0, c(p, K, 6))
theta[1, 1, 1] <- 2
theta[3, 2, 1] <- 2
theta[4, 3, 1] <- -2
theta[5, 4, 1] <- -2
theta[1, 1, 2] <- 2
theta[3, 2, 2] <- 2
theta[4, 3, 2] <- -2
theta[5, 4, 2] <- -2
theta[6, 1, 3] <- 2
theta[8, 2, 3] <- 2
theta[9, 3, 3] <- -2
theta[10, 4, 3] <- -2
theta[6, 1, 4] <- 2
theta[8, 2, 4] <- 2
theta[9, 3, 4] <- -2
theta[10, 4, 4] <- -2
theta[11, 1, 5] <- 2
theta[13, 2, 5] <- 2
theta[14, 3, 5] <- -2
theta[15, 4, 5] <- -2
theta[11, 1, 6] <- 2
theta[13, 2, 6] <- 2
theta[14, 3, 6] <- -2
theta[15, 4, 6] <- -2

library(MASS)
pliable <- matrix(0, N, 6)
for (e in 1:6) {
  pliable[, e] <- compute_pliable(X, Z, theta[, , e])
}

esd <- diag(6)
e <- MASS::mvrnorm(N, mu = rep(0, 6), Sigma = esd)
y_train <- X %*% Beta + pliable + e
y <- y_train

colnames(y) <- c(paste("y", 1:(ncol(y)), sep = ""))
TT <- tree_parms(y)
plot(TT$h_clust)

gg1 <- matrix(0, 2, 2)
gg1[1, ] <- c(0.02, 0.02)
gg1[2, ] <- c(0.02, 0.02)

nlambda <- 1
e.abs <- 1E-4
e.rel <- 1E-2
alpha <- .2
tol <- 1E-3

fit <- MADMMplasso(
  X, Z, y,
  alpha = alpha, my_lambda = matrix(rep(0.2, dim(y)[2]), 1),
  lambda_min = 0.001, max_it = 5000, e.abs = e.abs, e.rel = e.rel, maxgrid = nlambda,
  nlambda = nlambda, rho = 5, tree = TT, my_print = FALSE, alph = 1, parallel = FALSE,
  pal = 1, gg = gg1, tol = tol, cl = 6, legacy = FALSE
)
#> Using C++
#> Time difference of 2.135376 secs
#> [1]  1.000000 14.000000 29.000000  3.108636
fit_2 <- MADMMplasso(
  X, Z, y,
  alpha = alpha, my_lambda = matrix(rep(0.2, dim(y)[2]), 1),
  lambda_min = 0.001, max_it = 5000, e.abs = e.abs, e.rel = e.rel, maxgrid = nlambda,
  nlambda = nlambda, rho = 5, tree = TT, my_print = FALSE, alph = 1, parallel = FALSE,
  pal = 1, gg = gg1, tol = tol, cl = 6, legacy = TRUE
)
#> Warning in admm_MADMMplasso(beta0, theta0, beta, beta_hat, theta, rho1, : Using
#> legacy R code for MADMMplasso.This functionality will be removed in a future
#> release.Please consider using legacy = FALSE instead.
#> Convergence reached after 37 iterations
#> Time difference of 1.288508 secs
#> [1]  1.000000 14.000000 33.000000  1.417865
print(packageVersion("MADMMplasso"))
#> [1] '0.0.0.9008'

Created on 2023-12-04 with reprex v2.0.2

Since one call of the C++ version is much faster than the legacy, I agree with your suspicion that the back-and-forth is causing the slowdown. I'll perform some profiling and work on this ASAP.

@Theo-qua
Copy link
Collaborator

Theo-qua commented Dec 4, 2023 via email

@Theo-qua
Copy link
Collaborator

Theo-qua commented Dec 20, 2023 via email

@wleoncio
Copy link
Member Author

Start by translating this loop into C++:

MADMMplasso/R/MADMMplasso.R

Lines 233 to 307 in 062fd9e

while (hh <= nlambda) {
res_dual <- 0 # dual residual
res_pri <- 0 # primal residual
lambda <- lam[hh, ]
start_time <- Sys.time()
if (pal == 1) {
my_values <- admm_MADMMplasso(
beta0, theta0, beta, beta_hat, theta, rho1, X, Z, max_it, my_W_hat, XtY,
y, N, e.abs, e.rel, alpha, lambda, alph, svd.w, tree, my_print, invmat,
gg[hh, ],legacy
)
beta <- my_values$beta
theta <- my_values$theta
converge <- my_values$converge
my_obj[[hh]] <- list(my_values$obj)
beta0 <- my_values$beta0
theta0 <- my_values$theta0 ### iteration
beta_hat <- my_values$beta_hat
y_hat <- my_values$y_hat
}
cost_time <- Sys.time() - start_time
print(cost_time)
if (parallel == T & pal == 0) {
beta <- my_values[hh, ]$beta
theta <- my_values[hh, ]$theta
converge <- my_values[hh, ]$converge
my_obj[[hh]] <- list(my_values[hh, ]$obj)
beta0 <- my_values[hh, ]$beta0
theta0 <- my_values[hh, ]$theta0 ### iteration
beta_hat <- my_values[hh, ]$beta_hat
y_hat <- my_values[hh, ]$y_hat
} else if (parallel == F & pal == 0) {
beta <- my_values[[hh]]$beta
theta <- my_values[[hh]]$theta
converge <- my_values[[hh]]$converge
my_obj[[hh]] <- list(my_values[[hh]]$obj)
beta0 <- my_values[[hh]]$beta0
theta0 <- my_values[[hh]]$theta0 ### iteration
beta_hat <- my_values[[hh]]$beta_hat
y_hat <- my_values[[hh]]$y_hat
}
beta1 <- as(beta * (abs(beta) > tol), "sparseMatrix")
theta1 <- as.sparse3Darray(theta * (abs(theta) > tol))
beta_hat1 <- as(beta_hat * (abs(beta_hat) > tol), "sparseMatrix")
n_interaction_terms <- count_nonzero_a((theta1))
n_main_terms <- (c(n_main_terms, count_nonzero_a((beta1))))
obj1 <- (sum(as.vector((y - y_hat)^2))) / (D * N)
obj <- c(obj, obj1)
non_zero_theta <- (c(non_zero_theta, n_interaction_terms))
lam_list <- (c(lam_list, lambda))
BETA0[[hh]] <- beta0
THETA0[[hh]] <- theta0
BETA[[hh]] <- as(beta1, "sparseMatrix")
BETA_hat[[hh]] <- as(beta_hat1, "sparseMatrix")
Y_HAT[[hh]] <- y_hat
THETA[[hh]] <- as.sparse3Darray(theta1)
if (hh == 1) {
print(c(hh, (n_main_terms[hh]), non_zero_theta[hh], obj1))
} else {
print(c(hh, (n_main_terms[hh]), non_zero_theta[hh], obj[hh - 1], obj1))
}
hh <- hh + 1
} ### lambda

@wleoncio wleoncio self-assigned this Jan 12, 2024
@wleoncio
Copy link
Member Author

This was accidentally closed because I incorrectly linked PR #22 to this issue instead of #19. Reopening and starting work on it.

@wleoncio wleoncio reopened this Jan 26, 2024
@wleoncio wleoncio added the bug Something isn't working label Jan 26, 2024
wleoncio added a commit that referenced this issue Jan 26, 2024
Regression introduced by me on 072b64a.
wleoncio added a commit that referenced this issue Jan 26, 2024
wleoncio added a commit that referenced this issue Jan 26, 2024
Because `admm_MADMMplasso_cpp()` is a subfunction of `hh_nlambda_loop_cpp()`.
wleoncio added a commit that referenced this issue Jan 26, 2024
wleoncio added a commit that referenced this issue Jan 26, 2024
Because it is only recreated if `pal = 1`
wleoncio added a commit that referenced this issue Jan 26, 2024
This reverts commit 92eda0139517ad26680072f172dcdb7f3ad16a44.
wleoncio added a commit that referenced this issue Jan 26, 2024
wleoncio added a commit that referenced this issue Jan 26, 2024
wleoncio added a commit that referenced this issue Jan 26, 2024
`[]` is technically possible, but has no bounds check. See <https://arma.sourceforge.net/docs.html#element_access> for more details.
wleoncio added a commit that referenced this issue Jan 26, 2024
wleoncio added a commit that referenced this issue Feb 6, 2024
This should streamline processing through `hh_nlambda_loop_cpp()`.
wleoncio added a commit that referenced this issue Feb 6, 2024
wleoncio added a commit that referenced this issue Feb 14, 2024
The only missing part now are bits related to and dependend on the translation of `count_nonzero_a()`.
@wleoncio
Copy link
Member Author

Hi Theo,

I've done a bit more work on this, and I think I've finally reached a point where the C++ code is outperforming R for matrices of any size. It's not a huge improvement, but it's an important milestone.

Using the GDSC example (see above), these are the results I am getting now:

Benchmark X_mini
Unit: milliseconds
 expr       min        lq     mean   median       uq      max neval cld
    R 1397.2303 1564.5367 1732.632 1614.797 1931.378 2561.068    30  a 
  C++  979.6887  999.8709 1135.751 1044.650 1159.363 1707.716    30   b

Benchmark X
Unit: seconds
 expr      min       lq     mean   median       uq      max neval cld
    R 4.100515 4.223840 4.372278 4.422617 4.475534 4.543463    10  a 
  C++ 3.610225 3.618488 3.650130 3.634111 3.657458 3.790086    10   b

Can you observe similar results? Here's how to install the package version with the changes:

remotes::install_github("ocbe-uio/MADMMplasso@issue-17")

Before testing, make sure you're running MADMMplasso version 0.0.0.9017-1716458578.

@Theo-qua
Copy link
Collaborator

Theo-qua commented May 24, 2024 via email

@Theo-qua
Copy link
Collaborator

Dear Waldir,
Sorry for delaying in getting back to you. I have compared the two and yes, the C++ is now outperforming the R code. This is a good progress and I think we should be ready to submit it to CRAN since we have seen an improvement.

I have noticed something from the plot function when using the call with C++. The one with the R did not give any error but I got an error after calling for the plot(fit). I will go through the plot function again to check reason but I wanted to draw your attention to it.

Best regards,
Theo

@Theo-qua
Copy link
Collaborator

Hello,
There are two other things that need to be checked. The parallel call in MADMMplasso produces the following error for the C++ ;
Error in { : task 1 failed - "argument "CW" is missing, with no default"

And it produces the following for the R ;
Error in MADMMplasso(X, Z, y, alpha = alpha, my_lambda = NULL, lambda_min = 0.01, :
object 'my_values' not found

I can see that the error in the R code is from line 219-223. The my_values was not predefined before line 219.

Regarding the C++, I think it is becuse of how you call the admm_MADMMplasso_cpp. Could you please compare that to the call of hh_nlambda_loop_cpp

Best regards,
Theo

@wleoncio
Copy link
Member Author

wleoncio commented Jun 3, 2024

Thanks for checking, Theo!

he parallel call in MADMMplasso produces the following error for the C++ ;
Error in { : task 1 failed - "argument "CW" is missing, with no default" And it produces the following for the R ;
Error in MADMMplasso(X, Z, y, alpha = alpha, my_lambda = NULL, lambda_min = 0.01, :
object 'my_values' not found

I'll take a look at this, haven't really tested the parallel option in the C++ code.

I have noticed something from the plot function when using the call with C++. The one with the R did not give any error but I got an error after calling for the plot(fit). I will go through the plot function again to check reason but I wanted to draw your attention to it.

This looks like a separate issue, so I think I'll fix the first one, issue a PR for the merge, and handle this separately (after the merge).

@Theo-qua
Copy link
Collaborator

Theo-qua commented Jun 3, 2024

Thank you!

wleoncio added a commit that referenced this issue Jun 24, 2024
wleoncio added a commit that referenced this issue Jun 24, 2024
wleoncio added a commit that referenced this issue Jun 24, 2024
Just noticed `reg()` is now broken, no idea why.
@wleoncio
Copy link
Member Author

OK, so I fixed the R code for all combinations of parallel and pal (this should close #34 once merged). Also fixed the issues mentioned in the previous message regarding the C++ code. There are still two things to tackle:

  1. The C++ results when parallel or pal are true don't match the ones when they're both false (in R, they all match)
  2. Unit tests are now failing for reg(). No idea why, but needs to be investigated.

I haven't checked the plot() issue mentioned yet.

@Theo-qua
Copy link
Collaborator

Hello Waldir,
Thank you for the update. Could you please tell me which version is this. I just tried the issue 17 but run into error.

Also, would it be possible for you to schedule a meeting so that we both go through points 1 and 2 together. I think that would help us identify the problem quickly.

Thank you again,

Theo

@wleoncio
Copy link
Member Author

wleoncio commented Jun 25, 2024

Hi Theo,

The latest version on the issue-17 branch is numbered 0.0.0.9017-1719289199. To install it:

remotes::install_github("ocbe-uio/MADMMplasso@issue-17")

(curiously enough, the parallel versions are taking forever on my home machine to run (they were quite fast on the work PC, which is much weaker. Let me know how it goes there)

The following test file can be run to reproduce point 1:

source("tests/testthat/test-parallel.R")

Glad to meet, I'll send you an e-mail about it.

@wleoncio
Copy link
Member Author

Update: the version number should match now (I had forgotten to push the version number update this morning). The codebase is otherwise identical, so the test files from my previous post should behave exactly the same.

wleoncio added a commit that referenced this issue Jun 25, 2024
wleoncio added a commit that referenced this issue Jun 25, 2024
@wleoncio wleoncio linked a pull request Jun 25, 2024 that will close this issue
Theo-qua added a commit that referenced this issue Jun 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants