Add option to co-factor implicit data in explicit feedback #52
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master      #52      +/-   ##
==========================================
- Coverage   73.97%   71.83%     -2.15%
==========================================
  Files          27       27
  Lines        2079     2201      +122
==========================================
+ Hits         1538     1581       +43
- Misses        541      620       +79
Continue to review full report at Codecov.
It will take some time to review/understand the underlying co-factor model. It would be great if you could add some description of when it is worth using the co-factor model, and what kind of accuracy gains one can expect.
src/SolverAlsWrmf.cpp
Outdated
Y_new = cg_solver_explicit<T>(X_nnz, confidence, init, lambda_use, cg_steps);
if (!with_implicit_features) {
  if (with_biases) {
    auto init = drop_row<T>(Y.col(i), !is_x_bias_last_row);
I'm a bit afraid of using auto with arma objects. I had some segfaults and don't use auto with arma structures anymore. Better to use const arma::Mat<T> here. In fact, the arma docs discourage using auto: "Use of C++11 auto is not recommended with Armadillo objects and expressions."
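For illustration only, here is a minimal self-contained sketch of the kind of pitfall meant here (not code from this PR; the names are made up): with Armadillo's expression templates, auto can deduce a lazy expression type that merely references its operands, whereas naming the concrete type forces evaluation into an owning object.

#include <armadillo>

int main() {
  arma::mat Y(5, 3, arma::fill::randu);

  // Risky: 'auto' deduces a lazy expression-template type that only
  // references Y and is evaluated later; with temporaries on the
  // right-hand side this can dangle and segfault.
  auto row_sums_lazy = arma::sum(Y, 1);
  (void)row_sums_lazy;  // unused; shown only to illustrate the deduced type

  // Safer: naming the concrete type (arma::vec here, or const arma::Mat<T>
  // in the templated solver code) forces immediate evaluation.
  const arma::vec row_sums = arma::sum(Y, 1);

  row_sums.print("row sums:");
  return 0;
}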
src/SolverAlsWrmf.cpp
Outdated
}
} else {
  if (with_biases) {
    auto init = drop_row<T>(Y.col(i), !is_x_bias_last_row);
Same here: better to use const arma::Mat<T>.
73d745c to 0bb7129
Tried switching everything to armadillo, but now I find that it actually produces worse results compared to not using implicit features. Works quite fine in
This PR adds a very simplified version of the collective/co-factoring technique used in e.g. this and similar papers for the explicit feedback model.
What it basically does is factorize both the explicit-feedback matrix and an implicit-feedback version of it containing only unweighted binary entries, sharing the same components between the two factorizations.
The implicit entries are not counted towards the loss, because adding them would not be computationally efficient: it would require iterating over every entry of the ratings matrix, whether it is missing or not.
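As a rough illustration of the shared-components idea described above (this is not the solver code in the PR; it is a dense, unweighted sketch restricted to observed entries, and names such as update_user_factor, Y and Z are hypothetical): the user vector is obtained from a single ridge-regression system that stacks the explicit ratings and the binary implicit entries, with separate item-side factors for each part but one shared user factor.

#include <armadillo>

// Regularized least-squares update for one user's factor vector when the
// user factors are shared between (a) the explicit ratings and (b) a binary
// implicit-feedback copy of the same entries, each part with its own
// item-side factors (Y for ratings, Z for the binary part).
arma::vec update_user_factor(const arma::mat& Y,       // rank x n_items, explicit item factors
                             const arma::mat& Z,       // rank x n_items, implicit item factors
                             const arma::vec& ratings, // this user's observed ratings
                             const arma::uvec& items,  // indices of the rated items
                             double lambda) {
  const arma::mat Y_nnz = Y.cols(items);
  const arma::mat Z_nnz = Z.cols(items);
  const arma::vec binary = arma::ones<arma::vec>(items.n_elem); // unweighted 0/1 implicit entries

  // Stack both parts into one ridge-regression system sharing the user vector.
  const arma::mat A = Y_nnz * Y_nnz.t() + Z_nnz * Z_nnz.t()
                      + lambda * arma::eye(Y.n_rows, Y.n_rows);
  const arma::vec b = Y_nnz * ratings + Z_nnz * binary;
  return arma::solve(A, b);
}

int main() {
  arma::arma_rng::set_seed(1);
  const arma::mat Y(8, 100, arma::fill::randu), Z(8, 100, arma::fill::randu);
  const arma::uvec items = {3, 17, 42};
  const arma::vec ratings = {4.0, 2.5, 5.0};
  update_user_factor(Y, Z, ratings, items, 0.1).print("x_u:");
  return 0;
}

The real solver additionally handles confidence weights and biases (see cg_solver_explicit and the with_biases branches above), which this sketch omits.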
The procedure is, however, not very efficient, especially when using floats, since the solver is done in R and needs to cast at each iteration. I wasn't sure how to do it in armadillo, so perhaps there is a better option for floats, or it could be done directly with BLAS and LAPACK.
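Purely as a hedged sketch of one possible direction (not code from this PR), Armadillo does support single precision directly via arma::Mat<float> / arma::fmat together with arma::solve, and arma::conv_to for casts, which might allow keeping the float path in C++ instead of casting for an R-side solver at each iteration:

#include <armadillo>

int main() {
  // Hypothetical data; in the package this would come from the solver state.
  arma::fmat A(10, 10, arma::fill::randu);
  A = A * A.t() + arma::eye<arma::fmat>(10, 10);  // make it symmetric positive definite
  const arma::fvec b(10, arma::fill::randu);

  // Solve directly in single precision; no round-trip through R / double.
  const arma::fvec x = arma::solve(A, b);

  // If a double-precision routine is required, the cast stays inside Armadillo:
  const arma::mat A_d = arma::conv_to<arma::mat>::from(A);
  (void)A_d;

  x.print("x:");
  return 0;
}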
This PR includes the commits from the previous one, so let's merge the earlier one first.