Explicit solver in C++ #41
Conversation
Never mind the comment about R6 attributes, I now noticed that …
Codecov Report

```
@@           Coverage Diff            @@
##           master      #41    +/-   ##
==========================================
- Coverage   73.76%   73.14%   -0.63%
==========================================
  Files          28       28
  Lines        1845     2022    +177
==========================================
+ Hits         1361     1479    +118
- Misses        484      543     +59
```

Continue to review the full report at Codecov.
Thanks for the PR. That will be a great addition. The approach with … makes sense. Do you have real-life examples of sub-optimal-quality solutions? Will it make any difference (hit@k / ndcg@k) when making recommendations using lastfm360k?

Eventually I would like to move the solvers to a standalone header-only C++ library (with arma as the only dependency). So one day there might be a Python interface to the solvers in rsparse (I wanted to try https://github.com/RUrlus/carma, but the Python build system is still a bit mysterious to me). I had this in mind for some time but didn't have time to start. So …
I'm not sure how much of a difference a sub-optimal solution makes when computing recommendations, but here's an example showing that the difference can be rather big:

```r
library(Matrix)
library(rsparse)
data("movielens100k")
X <- movielens100k

set.seed(123)
m <- WRMF$new(rank=100, lambda=1,
              feedback="explicit",
              solver="conjugate_gradient")
factors_fit <- m$fit_transform(X)
factors_transf <- m$transform(X)
difference = factors_fit[10, ] - factors_transf[10, ]
cat("||diff same||: ", sqrt(difference %*% difference), "\n")
diff_rnd = factors_fit[10, ] - factors_fit[11, ]
cat("||diff other||: ", sqrt(diff_rnd %*% diff_rnd), "\n")

set.seed(123)
m2 <- WRMF$new(rank=100, lambda=1,
               feedback="explicit",
               solver="cholesky")
factors_fit2 <- m2$fit_transform(X)
factors_transf2 <- m2$transform(X)
difference2 = factors_fit2[10, ] - factors_transf2[10, ]
cat("||diff same||: ", sqrt(difference2 %*% difference2), "\n")
diff_rnd2 = factors_fit2[10, ] - factors_fit2[11, ]
cat("||diff other||: ", sqrt(diff_rnd2 %*% diff_rnd2), "\n")
```

With CG:
With Cholesky:
And if you inspect the vector of differences, it doesn't look small at all:

```r
difference
```
Okay, you've spotted an interesting issue. Results from …
But even if you do it the other way around, you'll still see such a difference. Here's an example using the package cmfrec, which already does it the other way around:

```r
library(Matrix)
library(rsparse)
library(cmfrec)
data("movielens100k")
X <- movielens100k
X <- as(X, "TsparseMatrix")

set.seed(123)
m <- CMF(X, k=100, lambda=1, use_cg=TRUE, finalize_chol=FALSE)
factors_fit <- t(m$matrices$A)
factors_transf <- factors(m, X)
difference = factors_fit[10, ] - factors_transf[10, ]
cat("||diff same||: ", sqrt(difference %*% difference), "\n")
diff_rnd = factors_fit[10, ] - factors_fit[11, ]
cat("||diff other||: ", sqrt(diff_rnd %*% diff_rnd), "\n")

set.seed(123)
m2 <- CMF(X, k=100, lambda=1, use_cg=FALSE)
factors_fit2 <- t(m2$matrices$A)
factors_transf2 <- factors(m2, X)
difference2 = factors_fit2[10, ] - factors_transf2[10, ]
cat("||diff same||: ", sqrt(difference2 %*% difference2), "\n")
diff_rnd2 = factors_fit2[10, ] - factors_fit2[11, ]
cat("||diff other||: ", sqrt(diff_rnd2 %*% diff_rnd2), "\n")
```

CG:

Cholesky:
Added functionality to calculate biases. This has two problems, however: …

The procedure for initializing the biases could be as follows: …

But I wouldn't know how to do that in armadillo either. Also, it would be better if you could make the default …
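The enumerated steps above didn't survive in this thread, but for concreteness, here is one common scheme for what such an initialization could look like: a few rounds of alternating regularized mean updates, written against a dense toy matrix rather than the actual `dMappedCSC` structures (`init_biases`, `R`, and `M` are hypothetical names for illustration, not rsparse code):

```cpp
#include <armadillo>

// Hypothetical sketch, not the actual rsparse code: alternate a few rounds of
// regularized mean updates for item and user biases over the observed entries.
// R is a dense toy ratings matrix, M is a 0/1 mask of observed cells.
void init_biases(const arma::mat& R, const arma::umat& M,
                 double glob_mean, double lambda,
                 arma::vec& user_bias, arma::vec& item_bias) {
  user_bias.zeros(R.n_rows);
  item_bias.zeros(R.n_cols);
  for (int iter = 0; iter < 5; ++iter) {
    // item biases: regularized mean of (r - global mean - user bias) per column
    for (arma::uword j = 0; j < R.n_cols; ++j) {
      arma::uvec obs = arma::find(M.col(j) == 1);
      if (obs.is_empty()) continue;
      arma::vec resid = R.col(j) - glob_mean - user_bias;
      item_bias(j) = arma::accu(resid(obs)) / (obs.n_elem + lambda);
    }
    // user biases: regularized mean of (r - global mean - item bias) per row
    for (arma::uword i = 0; i < R.n_rows; ++i) {
      arma::uvec obs = arma::find(M.row(i).t() == 1);
      if (obs.is_empty()) continue;
      arma::vec resid = R.row(i).t() - glob_mean - item_bias;
      user_bias(i) = arma::accu(resid(obs)) / (obs.n_elem + lambda);
    }
  }
}
```

The real implementation would iterate over the sparse CSC arrays directly instead of a dense mask, but the update rule would be the same.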
Correction to the comment above: even with random bias initializations, they still seem to provide a small lift in performance. Got these numbers for the ML10M (rank=40 plus biases, lambda=35, 10 iterations): …
src/SolverAlsWrmf.cpp (outdated)

```cpp
arma::Col<T> scd_nnls_solver_explicit(const arma::Mat<T> &X_nnz,
                                      const arma::Col<T> &confidence,
                                      T regularization) {
  const arma::Mat<T> inv = X_nnz * X_nnz.t()
```
```cpp
inv.diag() += regularization;
```

should work.
That doesn't compile, unfortunately.
That's because of the `const`. I've updated your branch.
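For reference, a minimal self-contained illustration of what was going on (toy values, not the actual PR code): `.diag() += …` modifies the matrix in place, so it cannot be applied to a matrix declared `const`; dropping the qualifier makes the suggested one-liner compile, and it avoids building an explicit identity matrix:

```cpp
#include <armadillo>

int main() {
  arma::mat X_nnz(4, 10, arma::fill::randu);  // toy rank x nnz factor block
  double regularization = 0.01;

  // const arma::mat inv = X_nnz * X_nnz.t();
  // inv.diag() += regularization;  // error: in-place edit of a const matrix

  arma::mat XtX = X_nnz * X_nnz.t();  // no const, so in-place edits are allowed
  XtX.diag() += regularization;       // add lambda to the diagonal directly
  arma::mat solution = arma::inv_sympd(XtX);
  solution.print("inverse:");
  return 0;
}
```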
- minor formatting
…avid-cortes-explicit_cpp
@david-cortes it will take a couple more days to review.
Force-pushed from 9126dba to b6541b1.
- optional biases for explicit feedback related to #41
@david-cortes I've started to review and eventually modified a big part of the WRMF solver (pushed to master now). Now it is much easier to read and reason about. It will take a bit more time to incorporate your additions and merge this PR. Also, I haven't noticed that different user and item bias initializations have a significant impact on the loss.
@david-cortes I've resolved conflicts and merged. Thanks a lot for your work! However, I've commented out the code related to the bias init and global mean (as I don't yet fully understand how it works). Would you be able to try the current version and add this logic in a new PR? TODO from my side: …
Ok, will re-add them. A small reminder just in case: if you decide to add biases for the implicit-feedback version, you'd have to do it through …
```cpp
}

template <class T>
void initialize_biases(const dMappedCSC& ConfCSC,
```
@david-cortes I believe it will be easier/more efficient to pass `global_mean` here and to the ALS solver rather than subtracting it from `c_ui` and `c_iu`.
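A rough sketch of that suggestion, with simplified, hypothetical signatures rather than the real solver interface: instead of materializing a centered copy of the sparse values, pass `global_mean` down and subtract it only where the ratings are consumed:

```cpp
#include <armadillo>

// Hypothetical per-user solve: X_nnz holds the item factors for the items this
// user rated (rank x nnz), ratings holds the raw (uncentered) rating values.
template <class T>
arma::Col<T> solve_one_user(const arma::Mat<T>& X_nnz,
                            const arma::Col<T>& ratings,
                            T global_mean, T lambda) {
  arma::Mat<T> XtX = X_nnz * X_nnz.t();
  XtX.diag() += lambda;
  // centering happens here, so the sparse matrix itself is never modified
  return arma::solve(XtX, X_nnz * (ratings - global_mean));
}
```

The observed data stays read-only this way, and there is no extra pass over (or copy of) the nonzeros before each factorization run.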
This PR adds a copycat version of `als_implicit_fun` with minor changes to make it work in the explicit-feedback case. This is my first time using armadillo, so I'm not sure I'm doing things the best way here. In particular, I didn't figure out how to add values to the diagonal of a matrix (what's written in their docs doesn't compile), so this line can perhaps be done better:

```cpp
const arma::Mat<T> inv = X_nnz * X_nnz.t() + regularization * arma::Mat<T>(X_nnz.n_rows, X_nnz.n_rows, arma::fill::eye);
```

In addition, the PR also performs mean centering when the inputs are not constrained to be non-negative. However, I wasn't sure how to add an extra public attribute to an R6 class, so I added `glob_mean` to the list of public attribute definitions, but R complains that it's not documented. If I remove the line that adds it, it will throw an error when trying to add it dynamically as an attribute: … Even though it somehow doesn't throw such an error when adding `self$components`, which was not declared beforehand.

As a next step, I also wanted to add biases to the model, but I'm not sure about the best way of going about it from a usability point of view. I think the best way would be to create a matrix for the components having dimension rank+2, and to also output a matrix of dimension rank+2 when calling transform, with one row/column in each one set to ones to match the other (see the sketch after the list below). What do you think about it?

Also wanted to make other potential changes:
- In `transform`, I think it'd be a better idea to always use the Cholesky solver, as the CG one usually doesn't reach the optimal solution, especially when starting from zero, which is what happens in `transform`.
- … `./src` instead.
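As an illustration of that rank+2 layout (toy code with hypothetical conventions, not what ended up being merged): one row of the user factors is pinned to ones so it picks up the item biases, one row of the item factors is pinned to ones so it picks up the user biases, and a plain matrix product then yields bias-aware predictions:

```cpp
#include <armadillo>

int main() {
  const arma::uword rank = 4, n_users = 3, n_items = 5;
  // columns are factor vectors of length rank+2:
  // U.col(u) = [1, b_u, p_u], I.col(i) = [b_i, 1, q_i]
  arma::mat U(rank + 2, n_users, arma::fill::randn);
  arma::mat I(rank + 2, n_items, arma::fill::randn);
  U.row(0).ones();  // multiplies the item biases stored in I.row(0)
  I.row(1).ones();  // multiplies the user biases stored in U.row(1)

  // dot(U.col(u), I.col(i)) = b_i + b_u + p_u . q_i
  arma::mat scores = U.t() * I;  // n_users x n_items predictions with biases
  scores.print("scores:");
  return 0;
}
```

The solver then only needs to skip updating the pinned rows (and regularize the bias rows appropriately) for `transform` to return factors that already carry the biases.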