Cholesky solver #38
Comments
Thanks for reporting, I will take a look. Actually I expect the problem is the same as in #35: the loss with Cholesky should be lower, but the hit rate might be worse since we don't optimize it directly...
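For context, "hit rate" here refers to a top-k ranking metric along these lines; the sketch below is a generic definition in R, not rsparse's evaluation API, and the function name and inputs are only illustrative:

```r
# Generic HR@k for one user: did any held-out item land in the top-k recommendations?
# (Illustrative helper, not part of rsparse.)
hr_at_k <- function(scores, holdout_items, k = 5) {
  top_k <- order(scores, decreasing = TRUE)[seq_len(k)]
  as.numeric(any(holdout_items %in% top_k))
}

# Example: scores for 10 items, items 3 and 7 held out
hr_at_k(scores = runif(10), holdout_items = c(3, 7), k = 5)
```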
@david-cortes you could set the logger threshold to debug and run something like this:

```r
library(rsparse)
library(lgr)

lg = get_logger('rsparse')
lg$set_threshold('debug')

data('movielens100k')
train = movielens100k
train = log1p(train)

for (seed in c(1, 2, 3)) {
  for (solver in c('cholesky', 'conjugate_gradient')) {
    set.seed(seed)
    model = WRMF$new(rank = 50, lambda = 1, solver = solver)
    message(glue::glue("seed {seed}, solver {solver}"))
    user_emb = model$fit_transform(train, n_iter = 10, convergence_tol = -1, cg_steps = 3)
  }
}
```
But that loss is not the same function that is being minimized. The solver minimizes (or is supposed to minimize) the weighted error over all entries, whereas the logged loss is calculated only over the present entries.
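To make that distinction concrete, here is a small dense toy in R contrasting the two quantities, using the usual implicit-feedback confidence weighting; alpha, lambda, and the inclusion of the regularization term are assumptions here, not rsparse's exact internal loss code:

```r
set.seed(1)
n_u <- 5; n_i <- 6; k <- 2; alpha <- 10; lambda <- 1
ratings <- matrix(rpois(n_u * n_i, 0.5), n_u, n_i)   # raw interactions, mostly zeros
X <- matrix(rnorm(n_u * k, sd = 0.1), n_u, k)        # user factors
Y <- matrix(rnorm(n_i * k, sd = 0.1), n_i, k)        # item factors

P    <- (ratings > 0) * 1        # binary preference
Conf <- 1 + alpha * ratings      # confidence; equals 1 for unobserved entries
Err  <- P - X %*% t(Y)

# objective the ALS / Cholesky solver targets: weighted error over ALL entries
loss_all <- sum(Conf * Err^2) + lambda * (sum(X^2) + sum(Y^2))

# loss computed only over the PRESENT (non-zero) entries
obs <- ratings > 0
loss_present <- sum((Conf * Err^2)[obs]) + lambda * (sum(X^2) + sum(Y^2))

c(all_entries = loss_all, present_only = loss_present)
```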
@david-cortes you are right! Starting to forget all the details...
Well, I did some experiments calculating the loss over all entries and they do indeed end up with similar values. Weird that the result from Cholesky then performs worse on other metrics, and that it takes so little time.
Did some more experiments, this time with the same MovieLens data. After running for 10 iterations with seed = 1, I get these losses:
Whereas with the other package I was comparing against, I get these:
Wonder where the difference comes from...
And if you remove
then the loss does end up lower than for CG, even though on a test set it didn't improve HR@5 or AUC. I guess technically it is working as it should - that is, it minimizes the function it is intended to - so I'll close this.
I'm getting some pretty bad results using WRMF with the Cholesky solver. I tried doing a reproducible experiment as follows: fit a WRMF model with k=40 and lambda=1 for 10 iterations. And got the following results:
Which is certainly not what I'd expect, as Cholesky is a more exact method and should lead to better results. Using different random seeds did not result in any significant change.
I additionally see some strange behavior with the timings:
According to the reference paper, "Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering", the Cholesky solver in this case should take close to 3x longer than the CG one. I did the same experiment with this other package and got the expected results: Cholesky takes 3x longer and gets a better end result:
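As an aside, the two per-user solves being contrasted can be sketched like this; a self-contained R toy, not the package's code, with the dimensions, lambda, and number of CG steps made up. The real CG solver additionally avoids forming the full per-user matrix, which is where most of the savings in the paper come from:

```r
set.seed(42)
k <- 50
Y <- matrix(rnorm(1000 * k), 1000, k)   # stand-in for item factors
A <- crossprod(Y) + diag(1, k)          # YtY + lambda * I (SPD, k x k)
b <- rnorm(k)

# Exact per-user solve via Cholesky: O(k^3) per user
R_chol <- chol(A)
x_chol <- backsolve(R_chol, forwardsolve(t(R_chol), b))

# A few conjugate-gradient steps: cheaper, but only approximate
cg_solve <- function(A, b, steps = 3L) {
  x <- numeric(length(b))
  r <- b - drop(A %*% x)
  p <- r
  rs_old <- sum(r * r)
  for (i in seq_len(steps)) {
    Ap <- drop(A %*% p)
    alpha <- rs_old / sum(p * Ap)
    x <- x + alpha * p
    r <- r - alpha * Ap
    rs_new <- sum(r * r)
    p <- r + (rs_new / rs_old) * p
    rs_old <- rs_new
  }
  x
}
x_cg <- cg_solve(A, b, steps = 3L)
max(abs(x_cg - x_chol))   # small, but CG with few steps is not exact
```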
Tried playing with the Armadillo option arma::solve_opts::fast, changing it to something more exact, but it didn't make any difference in the HR@5 metric. I'm not familiar with Armadillo, but I suspect this line is incorrect:

```cpp
arma::Mat<T> inv = XtX + X_nnz.each_row() % (confidence.t() - 1) * X_nnz.t();
```

Since it's doing an operation per row, whereas it should be doing rank-1 updates.
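For what it's worth, if I read the Armadillo expression right, .each_row() % (confidence.t() - 1) scales column j of X_nnz by (c_j - 1), and multiplying by X_nnz.t() then gives X_nnz * diag(c - 1) * t(X_nnz), which is the same matrix as the sum of rank-1 updates. A toy check in R, with made-up dimensions and confidence values, assuming X_nnz holds one item-factor column per interacted item:

```r
set.seed(1)
k <- 4; nnz <- 3
X_nnz <- matrix(rnorm(k * nnz), k, nnz)  # one column per item the user interacted with
conf  <- c(5, 2, 9)                      # made-up confidence values for those items

# column-scaling form: X_nnz %*% diag(conf - 1) %*% t(X_nnz)
A1 <- (X_nnz %*% diag(conf - 1)) %*% t(X_nnz)

# explicit sum of rank-1 updates: sum_j (conf_j - 1) * x_j %*% t(x_j)
A2 <- Reduce(`+`, lapply(seq_len(nnz), function(j) (conf[j] - 1) * tcrossprod(X_nnz[, j])))

all.equal(A1, A2)   # TRUE: both forms give the same k x k matrix
```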