
Re-add mean centering and bias init #45

Merged (2 commits), Nov 30, 2020

Conversation

david-cortes (Contributor)

This PR re-adds mean centering and better bias initialization for the explicit-feedback model.

However, I see that the loss calculation takes the full row of all-ones in the factors into account, which should be excluded (that is, the matrix currently being optimized should contribute one more row to the regularized loss than the other one).
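As a rough sketch of the intended regularization term (assuming, as in the diff further down in this conversation, that row 0 of each factor matrix holds the fixed ones/bias entries; the helper name is hypothetical and not code from the PR):

#include <armadillo>

// illustrative helper: L2 penalty over the latent factors only,
// skipping row 0 which holds the fixed ones / bias entries
template <class T>
T penalty_no_bias(const arma::Mat<T>& X, const arma::Mat<T>& Y, T lambda) {
  auto X_no_bias = X(arma::span(1, X.n_rows - 1), arma::span::all);
  auto Y_no_bias = Y(arma::span(1, Y.n_rows - 1), arma::span::all);
  return lambda * (arma::accu(arma::square(X_no_bias)) + arma::accu(arma::square(Y_no_bias)));
}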

codecov bot commented Nov 29, 2020

Codecov Report

Merging #45 (96c848f) into master (35247b4) will increase coverage by 1.50%.
The diff coverage is 75.60%.


@@            Coverage Diff             @@
##           master      #45      +/-   ##
==========================================
+ Coverage   71.84%   73.35%   +1.50%     
==========================================
  Files          28       28              
  Lines        1911     1944      +33     
==========================================
+ Hits         1373     1426      +53     
+ Misses        538      518      -20     
Impacted Files          Coverage Δ
R/model_WRMF.R          72.22% <74.07%> (+0.26%) ⬆️
src/SolverAlsWrmf.cpp   91.78% <78.57%> (+19.63%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

double glob_mean = 0;
for (size_t ix = 0; ix < ConfCSC.nnz; ix++)
glob_mean += (ConfCSC.values[ix] - glob_mean) / (double)(ix+1);
#pragma omp simd
dselivanov (Owner)

we need to add conditional #ifdef _OPENMP as it won't compile on certain platforms
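A minimal sketch of the guard being asked for; the loop under the pragma is illustrative, not the exact PR code:

#ifdef _OPENMP
#pragma omp simd
#endif
for (size_t ix = 0; ix < ConfCSC.nnz; ix++)
  ConfCSC.values[ix] -= glob_mean;  // e.g. centering the stored values in place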

@@ -168,7 +168,7 @@ T als_explicit(const dMappedCSC& Conf,
   if(lambda > 0) {
     if (with_biases) {
       auto X_no_bias = X(arma::span(1, X.n_rows - 1), arma::span::all);
-      auto Y_no_bias = X(arma::span(1, Y.n_rows - 1), arma::span::all);
+      auto Y_no_bias = Y(arma::span(1, Y.n_rows - 1), arma::span::all);
dselivanov (Owner)

nice catch!

arma::Col<T>& user_bias,
arma::Col<T>& item_bias,
T lambda, bool non_negative) {
double initialize_biases(const dMappedCSC& ConfCSC,
dselivanov (Owner)

as we modify ConfCSC and ConfCSR in-place we need to remove const
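A minimal sketch of the requested signature change (the parameter list is reconstructed from the snippet above plus this comment, not copied verbatim from the PR):

// const dropped on both matrices so their values can be mean-centered in place
template <class T>
double initialize_biases(dMappedCSC& ConfCSC,
                         dMappedCSC& ConfCSR,
                         arma::Col<T>& user_bias,
                         arma::Col<T>& item_bias,
                         T lambda, bool non_negative);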

arma::Col<T>& user_bias,
arma::Col<T>& item_bias,
T lambda, bool non_negative) {
/* Robust mean calculation */
dselivanov (Owner) commented Nov 30, 2020

Where can I get more information about "Robust mean"? Why is this better than just standard mean?

david-cortes (Contributor, Author)

Here's some discussion: https://stackoverflow.com/questions/7552443/whats-the-numerically-best-way-to-calculate-the-average

It gives a mean estimate that is correct to higher numerical precision than sum(x)/n. But it's not better than R's mean function (which uses compensated summation), if that's what you mean to ask.

In this case the sum is expected to be over a potentially very large number of elements, so a simpler mean calculation can lose precision.
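A standalone toy example (not code from this PR) showing why the incremental update can matter when averaging many values; a single-precision accumulator is used for the naive sum to make the effect visible at a modest input size:

#include <cstddef>
#include <cstdio>

int main() {
  const size_t n = 10000000;   // ten million observations, all equal to 0.1
  const float value = 0.1f;

  // naive mean: the float accumulator grows to ~1e6, so each 0.1f addition
  // is rounded to the accumulator's coarser resolution and error builds up
  float naive_sum = 0.0f;
  for (size_t i = 0; i < n; i++) naive_sum += value;
  float naive_mean = naive_sum / n;

  // incremental running mean (same update as the glob_mean loop above):
  // the accumulator stays near 0.1, so no large-plus-small additions occur
  double running_mean = 0;
  for (size_t i = 0; i < n; i++)
    running_mean += (value - running_mean) / (double)(i + 1);

  std::printf("naive mean:   %.7f\n", naive_mean);
  std::printf("running mean: %.7f\n", running_mean);
  return 0;
}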

dselivanov merged commit f12366e into dselivanov:master on Nov 30, 2020