Slow admm_MADMMplasso() under ADMMplasso() #17
Can reproduce. See example below:
# Train the model
# generate some data
library(MADMMplasso)
set.seed(1235)
N <- 100
p <- 50
nz <- 4
K <- nz
X <- matrix(rnorm(n = N * p), nrow = N, ncol = p)
mx <- colMeans(X)
sx <- sqrt(apply(X, 2, var))
X <- scale(X, mx, sx)
X <- matrix(as.numeric(X), N, p)
Z <- matrix(rnorm(N * nz), N, nz)
mz <- colMeans(Z)
sz <- sqrt(apply(Z, 2, var))
Z <- scale(Z, mz, sz)
beta_1 <- rep(x = 0, times = p)
beta_2 <- rep(x = 0, times = p)
beta_3 <- rep(x = 0, times = p)
beta_4 <- rep(x = 0, times = p)
beta_5 <- rep(x = 0, times = p)
beta_6 <- rep(x = 0, times = p)
beta_1[1:5] <- c(2, 2, 2, 2, 2)
beta_2[1:5] <- c(2, 2, 2, 2, 2)
beta_3[6:10] <- c(2, 2, 2, -2, -2)
beta_4[6:10] <- c(2, 2, 2, -2, -2)
beta_5[11:15] <- c(-2, -2, -2, -2, -2)
beta_6[11:15] <- c(-2, -2, -2, -2, -2)
Beta <- cbind(beta_1, beta_2, beta_3, beta_4, beta_5, beta_6)
colnames(Beta) <- c(1:6)
theta <- array(0, c(p, K, 6))
theta[1, 1, 1] <- 2
theta[3, 2, 1] <- 2
theta[4, 3, 1] <- -2
theta[5, 4, 1] <- -2
theta[1, 1, 2] <- 2
theta[3, 2, 2] <- 2
theta[4, 3, 2] <- -2
theta[5, 4, 2] <- -2
theta[6, 1, 3] <- 2
theta[8, 2, 3] <- 2
theta[9, 3, 3] <- -2
theta[10, 4, 3] <- -2
theta[6, 1, 4] <- 2
theta[8, 2, 4] <- 2
theta[9, 3, 4] <- -2
theta[10, 4, 4] <- -2
theta[11, 1, 5] <- 2
theta[13, 2, 5] <- 2
theta[14, 3, 5] <- -2
theta[15, 4, 5] <- -2
theta[11, 1, 6] <- 2
theta[13, 2, 6] <- 2
theta[14, 3, 6] <- -2
theta[15, 4, 6] <- -2
library(MASS)
pliable <- matrix(0, N, 6)
for (e in 1:6) {
pliable[, e] <- compute_pliable(X, Z, theta[, , e])
}
esd <- diag(6)
e <- MASS::mvrnorm(N, mu = rep(0, 6), Sigma = esd)
y_train <- X %*% Beta + pliable + e
y <- y_train
colnames(y) <- c(paste("y", 1:(ncol(y)), sep = ""))
TT <- tree_parms(y)
plot(TT$h_clust)
gg1 <- matrix(0, 2, 2)
gg1[1, ] <- c(0.02, 0.02)
gg1[2, ] <- c(0.02, 0.02)
nlambda <- 1
e.abs <- 1E-4
e.rel <- 1E-2
alpha <- .2
tol <- 1E-3
fit <- MADMMplasso(
X, Z, y,
alpha = alpha, my_lambda = matrix(rep(0.2, dim(y)[2]), 1),
lambda_min = 0.001, max_it = 5000, e.abs = e.abs, e.rel = e.rel, maxgrid = nlambda,
nlambda = nlambda, rho = 5, tree = TT, my_print = FALSE, alph = 1, parallel = FALSE,
pal = 1, gg = gg1, tol = tol, cl = 6, legacy = FALSE
)
#> Using C++
#> Time difference of 2.135376 secs
#> [1] 1.000000 14.000000 29.000000 3.108636
fit_2 <- MADMMplasso(
X, Z, y,
alpha = alpha, my_lambda = matrix(rep(0.2, dim(y)[2]), 1),
lambda_min = 0.001, max_it = 5000, e.abs = e.abs, e.rel = e.rel, maxgrid = nlambda,
nlambda = nlambda, rho = 5, tree = TT, my_print = FALSE, alph = 1, parallel = FALSE,
pal = 1, gg = gg1, tol = tol, cl = 6, legacy = TRUE
)
#> Warning in admm_MADMMplasso(beta0, theta0, beta, beta_hat, theta, rho1, : Using
#> legacy R code for MADMMplasso.This functionality will be removed in a future
#> release.Please consider using legacy = FALSE instead.
#> Convergence reached after 37 iterations
#> Time difference of 1.288508 secs
#> [1] 1.000000 14.000000 33.000000 1.417865
print(packageVersion("MADMMplasso"))
#> [1] '0.0.0.9008'
Created on 2023-12-04 with reprex v2.0.2
Since one call of the C++ version is much faster than the legacy one, I agree with your suspicion that the back-and-forth is causing the slowdown. I'll perform some profiling and work on this ASAP.
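As a first step, the profiling could look something like this (a sketch using the profvis package, assuming the reprex objects above are still in the workspace):

```r
# Profile the C++-backed fit from the reprex above; time spent in the R
# wrapper around each lambda hints at R/C++ back-and-forth overhead
library(profvis)
profvis(
  MADMMplasso(
    X, Z, y,
    alpha = alpha, my_lambda = matrix(rep(0.2, dim(y)[2]), 1),
    lambda_min = 0.001, max_it = 5000, e.abs = e.abs, e.rel = e.rel,
    maxgrid = nlambda, nlambda = nlambda, rho = 5, tree = TT,
    my_print = FALSE, alph = 1, parallel = FALSE, pal = 1, gg = gg1,
    tol = tol, cl = 6, legacy = FALSE
  )
)
```
|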
Dear Waldir,
I think another factor could be how the C++ code deals with the task when p > N. Could you run the same example you sent with p = 500 and my_lambda = matrix(rep(2.3, dim(y)[2]), 1), and check the time difference? I think the problem occurs when p > N. I had the following results:
system.time(
  fit <- MADMMplasso(
    X, Z, y,
    alpha = alpha, my_lambda = matrix(rep(2.3, dim(y)[2]), 1),
    lambda_min = 0.01, max_it = 5000, e.abs = e.abs, e.rel = e.rel,
    maxgrid = 1, nlambda = 1, rho = 5, tree = TT, my_print = FALSE,
    alph = 1, parallel = FALSE, pal = 1, gg = gg1, tol = tol, cl = 4,
    legacy = TRUE
  )
)
Convergence reached after 41 iterations
Time difference of 13.84166 secs
[1] 1.00000 14.00000 1.00000 20.21003
user system elapsed
12.599 0.425 13.965
system.time(
  fit <- MADMMplasso(
    X, Z, y,
    alpha = alpha, my_lambda = matrix(rep(2.3, dim(y)[2]), 1),
    lambda_min = 0.01, max_it = 5000, e.abs = e.abs, e.rel = e.rel,
    maxgrid = 1, nlambda = 1, rho = 5, tree = TT, my_print = FALSE,
    alph = 1, parallel = FALSE, pal = 1, gg = gg1, tol = tol, cl = 4,
    legacy = FALSE
  )
)
Time difference of 2.758352 mins
[1] 1.00000 11.00000 3.00000 20.62792
user system elapsed
163.759 0.619 165.662
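To probe the p > N hypothesis more systematically, both backends could be timed over a grid of p values. A sketch, assuming Z, y, TT, gg1, e.abs, e.rel, alpha, and tol from the reprex above are still in the workspace (only X is regenerated):

```r
library(MADMMplasso)
N <- 100
for (p_i in c(50, 200, 500)) {  # covers both p < N and p > N
  X_i <- matrix(as.numeric(scale(matrix(rnorm(N * p_i), N, p_i))), N, p_i)
  for (leg in c(TRUE, FALSE)) {
    t_i <- system.time(
      MADMMplasso(
        X_i, Z, y,
        alpha = alpha, my_lambda = matrix(rep(2.3, dim(y)[2]), 1),
        lambda_min = 0.01, max_it = 5000, e.abs = e.abs, e.rel = e.rel,
        maxgrid = 1, nlambda = 1, rho = 5, tree = TT, my_print = FALSE,
        alph = 1, parallel = FALSE, pal = 1, gg = gg1, tol = tol, cl = 4,
        legacy = leg
      )
    )
    cat("p =", p_i, "legacy =", leg, "elapsed:", t_i["elapsed"], "s\n")
  }
}
```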
We can look into it. I am also already working on the documentation.
Best regards,
Theo
|
Dear Waldir and Chi,
I have completed the documentation for your consideration. I am sure you are already on vacation; please send me your feedback when you get time to check it. Thank you!
Best wishes,
Theo
|
Start by translating this loop into C++ (lines 233 to 307 at commit 062fd9e).
|
Regression introduced by me in 072b64a.
`[]` is technically possible, but has no bounds check. See <https://arma.sourceforge.net/docs.html#element_access> for more details.
This should streamline processing through `hh_nlambda_loop_cpp()`.
The only missing parts now are the bits related to, and dependent on, the translation of `count_nonzero_a()`.
Hi Theo, I've done a bit more work on this, and I think I've finally reached a point where the C++ code is outperforming R for matrices of any size. It's not a huge improvement, but it's an important milestone. Using the GDSC example (see above), these are the results I am getting now:

Benchmark X_mini
Unit: milliseconds
 expr       min        lq     mean   median       uq      max neval cld
    R 1397.2303 1564.5367 1732.632 1614.797 1931.378 2561.068    30  a
  C++  979.6887  999.8709 1135.751 1044.650 1159.363 1707.716    30   b

Benchmark X
Unit: seconds
 expr      min       lq     mean   median       uq      max neval cld
    R 4.100515 4.223840 4.372278 4.422617 4.475534 4.543463    10  a
  C++ 3.610225 3.618488 3.650130 3.634111 3.657458 3.790086    10   b

Can you observe similar results? Here's how to install the package version with the changes:
remotes::install_github("ocbe-uio/MADMMplasso@issue-17")
Before testing, make sure you're running MADMMplasso version 0.0.0.9017-1716458578.
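For reference, output in that shape comes from the microbenchmark package; the comparison could be reproduced along these lines (a sketch using the reprex objects from earlier in this thread rather than the actual GDSC matrices, with a hypothetical run_fit() helper):

```r
library(microbenchmark)
# Hypothetical helper: the MADMMplasso call from the reprex above, with only
# the backend switched
run_fit <- function(legacy) {
  MADMMplasso(
    X, Z, y,
    alpha = alpha, my_lambda = matrix(rep(0.2, dim(y)[2]), 1),
    lambda_min = 0.001, max_it = 5000, e.abs = e.abs, e.rel = e.rel,
    maxgrid = 1, nlambda = 1, rho = 5, tree = TT, my_print = FALSE,
    alph = 1, parallel = FALSE, pal = 1, gg = gg1, tol = tol, cl = 6,
    legacy = legacy
  )
}
microbenchmark("R" = run_fit(TRUE), "C++" = run_fit(FALSE), times = 30)
```
|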
Hello,
This sounds good! I will check this today and give you my feedback.
Best regards,
Theo
|
Dear Waldir, I have noticed something about the plot function when using the C++ call. The R version did not give any error, but I got an error after calling plot(fit). I will go through the plot function again to check the reason, but I wanted to draw your attention to it. Best regards, |
Hello,
The R version also produces an error; I can see that the error in the R code comes from lines 219-223: `my_values` was not defined before it is used on line 219. Regarding the C++ version, I think the problem is how you call `admm_MADMMplasso_cpp()`. Could you please compare that to the call of `hh_nlambda_loop_cpp()`?
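A minimal, hypothetical illustration of the `my_values` problem (not the actual `plot()` source): assigning into an object inside a loop fails unless the object is defined first.

```r
n <- 5
# Without the next line, my_values[i] <- ... inside the loop fails with
# "object 'my_values' not found"
my_values <- numeric(n)  # pre-define (and pre-allocate) before the loop
for (i in seq_len(n)) {
  my_values[i] <- i^2
}
my_values
#> [1]  1  4  9 16 25
```
Best regards,
|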
Thanks for checking, Theo!
I'll take a look at this; I haven't really tested the parallel option in the C++ code.
This looks like a separate issue, so I think I'll fix the first one, issue a PR for the merge, and handle this separately (after the merge). |
Thank you! |
Just noticed `reg()` is now broken, no idea why.
OK, so I fixed the R code for all combinations of
I haven't checked the |
Hello Waldir, Also, would it be possible for you to schedule a meeting so that we can both go through points 1 and 2 together? I think that would help us identify the problem quickly. Thank you again, Theo |
Hi Theo,
The latest version on the issue-17 branch is numbered 0.0.0.9017-1719289199. To install it:
remotes::install_github("ocbe-uio/MADMMplasso@issue-17")
(Curiously enough, the parallel versions are taking forever to run on my home machine, although they were quite fast on the work PC, which is much weaker. Let me know how it goes there.)
Glad to meet; I'll send you an e-mail about it.
The following test file can be run to reproduce point 1:
source("tests/testthat/test-parallel.R")
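Putting the install, version check, and test together (a sketch; the stopifnot() check is just one way to assert the version, not something prescribed above):

```r
remotes::install_github("ocbe-uio/MADMMplasso@issue-17")
# Guard against testing a stale build
stopifnot(packageVersion("MADMMplasso") == "0.0.0.9017-1719289199")
source("tests/testthat/test-parallel.R")  # reproduces point 1
```
|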
Update: the version number should match now (I had forgotten to push the version number update this morning). The codebase is otherwise identical, so the test files from my previous post should behave exactly the same. |
It seems the C++ version is now much slower than the R one when I run the example on GitHub (setting p = 500 and nlambda = 50). I don't know whether it is due to the back and forth between C++ and R for each lambda.
You can check this by calling the MADMMplasso function and setting legacy = TRUE or FALSE (I have included this in the call).
Originally posted by @Theo-qua in #16 (comment)