
Add option to co-factor implicit data in explicit feedback #52

Closed · wants to merge 11 commits

Conversation


@david-cortes (Contributor) commented Dec 6, 2020

This PR adds a very simplified version of the collective matrix factorization / co-factoring technique used in this and similar papers, applied to the explicit-feedback model.

It factorizes both the explicit-feedback matrix and an implicit-feedback version of it (the same matrix binarized, with no weights), sharing the same latent components between the two factorizations.
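One plausible formalization of this shared-components setup (my notation, not taken from the PR): let R be the ratings matrix with observed set Ω, B its binarized copy (b_ij = 1 iff (i,j) ∈ Ω, else 0), P the shared user factors, and Q, Z the item factors for the explicit and implicit parts respectively. The joint problem is then roughly:

```latex
\min_{P,Q,Z}\;
\sum_{(i,j)\in\Omega} \bigl(r_{ij} - p_i^{\top} q_j\bigr)^2
\;+\; \sum_{i,j} \bigl(b_{ij} - p_i^{\top} z_j\bigr)^2
\;+\; \lambda\bigl(\lVert P\rVert_F^2 + \lVert Q\rVert_F^2 + \lVert Z\rVert_F^2\bigr)
```

Note that the second sum runs over every (i, j) pair, observed or not, which is what makes it expensive to evaluate.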

The implicit-feedback terms are not counted towards the reported loss: adding them would require iterating over every entry of the ratings matrix, whether missing or not, which would not be computationally efficient.
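A minimal NumPy sketch of this scheme (a hypothetical toy implementation, not the PR's actual R/C++ code): each user's update stacks the item factors of that user's observed explicit ratings on top of the full implicit item factors, and solves a single ridge least-squares problem, so the explicit and implicit parts share the same user factors.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_items, k, lam = 20, 15, 4, 0.1
R = np.zeros((n_users, n_items))
mask = rng.random((n_users, n_items)) < 0.3          # observed entries
R[mask] = rng.uniform(1, 5, mask.sum())
B = mask.astype(float)                               # binarized implicit copy

P = rng.normal(scale=0.1, size=(n_users, k))         # shared user factors
Q = rng.normal(scale=0.1, size=(n_items, k))         # explicit item factors
Z = rng.normal(scale=0.1, size=(n_items, k))         # implicit item factors

def loss():
    err_expl = ((R - P @ Q.T)[mask] ** 2).sum()      # observed entries only
    err_impl = ((B - P @ Z.T) ** 2).sum()            # every entry, incl. zeros
    return err_expl + err_impl + lam * sum((M ** 2).sum() for M in (P, Q, Z))

def solve_users():
    # Stack explicit rows (observed items only) on the full binary row,
    # then solve one ridge problem per user: (A'A + lam*I) p = A'y.
    for i in range(n_users):
        obs = mask[i]
        A = np.vstack([Q[obs], Z])
        y = np.concatenate([R[i, obs], B[i]])
        P[i] = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ y)

def solve_items():
    # Q sees only observed users; Z sees every user (dense binary column).
    for j in range(n_items):
        obs = mask[:, j]
        Q[j] = np.linalg.solve(P[obs].T @ P[obs] + lam * np.eye(k),
                               P[obs].T @ R[obs, j])
        Z[j] = np.linalg.solve(P.T @ P + lam * np.eye(k), P.T @ B[:, j])

before = loss()
for _ in range(5):
    solve_users()
    solve_items()
after = loss()
assert after < before    # exact alternating updates decrease the joint loss
```

Since every alternating step here minimizes the joint objective exactly over one block of factors, the loss is non-increasing; the sketch only checks that property rather than any particular accuracy figure.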

The procedure is not very efficient, however, especially with single-precision floats, since the solver is done in R and needs to cast at each iteration. I wasn't sure how to do it in Armadillo, so perhaps there is a better option for floats, or it could be done directly with BLAS and LAPACK.

This PR includes the commits from the previous one, so the earlier one should be merged first.


codecov bot commented Dec 6, 2020

Codecov Report

Merging #52 (73d745c) into master (9c8b2c5) will decrease coverage by 2.14%.
The diff coverage is 41.04%.


@@            Coverage Diff             @@
##           master      #52      +/-   ##
==========================================
- Coverage   73.97%   71.83%   -2.15%     
==========================================
  Files          27       27              
  Lines        2079     2201     +122     
==========================================
+ Hits         1538     1581      +43     
- Misses        541      620      +79     
Impacted Files Coverage Δ
src/SolverAlsWrmf.cpp 78.13% <24.59%> (-17.70%) ⬇️
R/model_WRMF.R 70.77% <53.62%> (-6.15%) ⬇️
src/utils.cpp 78.26% <75.00%> (-0.15%) ⬇️


@dselivanov (Owner) left a comment


It will take some time to review and understand the underlying co-factor model. It would be great if you could add some description of when it is worth using the co-factor model, and what kind of accuracy gains one can expect.

Y_new = cg_solver_explicit<T>(X_nnz, confidence, init, lambda_use, cg_steps);
if (!with_implicit_features) {
  if (with_biases) {
    auto init = drop_row<T>(Y.col(i), !is_x_bias_last_row);
@dselivanov (Owner):

I'm a bit afraid of using auto with arma objects. I had some segfaults, and I don't use auto with arma structures anymore. Better to use const arma::Mat<T> here. In fact, the arma docs discourage using auto:

Use of C++11 auto is not recommended with Armadillo objects and expressions.

  }
} else {
  if (with_biases) {
    auto init = drop_row<T>(Y.col(i), !is_x_bias_last_row);
@dselivanov (Owner):

Same here: use const arma::Mat<T>.


@david-cortes (Author) commented Dec 28, 2020

I tried switching everything to Armadillo, but now I find that it actually produces worse results than not using implicit features at all. It works quite fine in cmfrec, though. I will close this, as I have no idea how to "fix" it; I guess it might be an issue with numeric precision due to the order of operations.
