
Re-add mean centering and bias init #45

Merged (2 commits), Nov 30, 2020

Conversation

david-cortes (Contributor)

This PR re-adds mean centering and better bias initialization for the explicit-feedback model.

However, I see that the loss calculation takes the full row of all-ones in the factors into account, which should be excluded (that is, the matrix currently being optimized should contribute one more row to the regularized loss than the other one).
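As a rough sketch of the intended regularization term (assuming, as in the diff further down in this conversation, that row 0 of each factor matrix holds the fixed ones/bias entries; the helper name is hypothetical and not code from the PR):

#include <armadillo>

// illustrative helper: L2 penalty over the latent factors only,
// skipping row 0 which holds the fixed ones / bias entries
template <class T>
T penalty_no_bias(const arma::Mat<T>& X, const arma::Mat<T>& Y, T lambda) {
  auto X_no_bias = X(arma::span(1, X.n_rows - 1), arma::span::all);
  auto Y_no_bias = Y(arma::span(1, Y.n_rows - 1), arma::span::all);
  return lambda * (arma::accu(arma::square(X_no_bias)) + arma::accu(arma::square(Y_no_bias)));
}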

codecov bot commented Nov 29, 2020

Codecov Report

Merging #45 (96c848f) into master (35247b4) will increase coverage by 1.50%.
The diff coverage is 75.60%.


@@            Coverage Diff             @@
##           master      #45      +/-   ##
==========================================
+ Coverage   71.84%   73.35%   +1.50%     
==========================================
  Files          28       28              
  Lines        1911     1944      +33     
==========================================
+ Hits         1373     1426      +53     
+ Misses        538      518      -20     
Impacted Files          Coverage Δ
R/model_WRMF.R          72.22% <74.07%> (+0.26%) ⬆️
src/SolverAlsWrmf.cpp   91.78% <78.57%> (+19.63%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

double glob_mean = 0;
for (size_t ix = 0; ix < ConfCSC.nnz; ix++)
glob_mean += (ConfCSC.values[ix] - glob_mean) / (double)(ix+1);
#pragma omp simd
dselivanov (Owner)

we need to add conditional #ifdef _OPENMP as it won't compile on certain platforms
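A minimal sketch of the guard being asked for; the loop under the pragma is illustrative, not the exact PR code:

#ifdef _OPENMP
#pragma omp simd
#endif
for (size_t ix = 0; ix < ConfCSC.nnz; ix++)
  ConfCSC.values[ix] -= glob_mean;  // e.g. centering the stored values in place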

@@ -168,7 +168,7 @@ T als_explicit(const dMappedCSC& Conf,
   if(lambda > 0) {
     if (with_biases) {
       auto X_no_bias = X(arma::span(1, X.n_rows - 1), arma::span::all);
-      auto Y_no_bias = X(arma::span(1, Y.n_rows - 1), arma::span::all);
+      auto Y_no_bias = Y(arma::span(1, Y.n_rows - 1), arma::span::all);
dselivanov (Owner)

nice catch!

arma::Col<T>& user_bias,
arma::Col<T>& item_bias,
T lambda, bool non_negative) {
double initialize_biases(const dMappedCSC& ConfCSC,
dselivanov (Owner)

as we modify ConfCSC and ConfCSR in-place we need to remove const
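A minimal sketch of the requested signature change (the parameter list is reconstructed from the snippet above plus this comment, not copied verbatim from the PR):

// const dropped on both matrices so their values can be mean-centered in place
template <class T>
double initialize_biases(dMappedCSC& ConfCSC,
                         dMappedCSC& ConfCSR,
                         arma::Col<T>& user_bias,
                         arma::Col<T>& item_bias,
                         T lambda, bool non_negative);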

arma::Col<T>& user_bias,
arma::Col<T>& item_bias,
T lambda, bool non_negative) {
/* Robust mean calculation */
dselivanov (Owner) commented Nov 30, 2020

Where can I get more information about "Robust mean"? Why is this better than just standard mean?

david-cortes (Contributor, Author)

Here's some discussion: https://stackoverflow.com/questions/7552443/whats-the-numerically-best-way-to-calculate-the-average

It gives a mean estimate that is correct to higher numerical precision than sum(x)/n. But it's not better than R's mean function (which uses compensated summation), if that's what you mean to ask.

In this case the sum is expected to be over a potentially very large number of elements, so a simpler mean calculation can lose precision.
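A standalone toy example (not code from this PR) showing why the incremental update can matter when averaging many values; a single-precision accumulator is used for the naive sum to make the effect visible at a modest input size:

#include <cstddef>
#include <cstdio>

int main() {
  const size_t n = 10000000;   // ten million observations, all equal to 0.1
  const float value = 0.1f;

  // naive mean: the float accumulator grows to ~1e6, so each 0.1f addition
  // is rounded to the accumulator's coarser resolution and error builds up
  float naive_sum = 0.0f;
  for (size_t i = 0; i < n; i++) naive_sum += value;
  float naive_mean = naive_sum / n;

  // incremental running mean (same update as the glob_mean loop above):
  // the accumulator stays near 0.1, so no large-plus-small additions occur
  double running_mean = 0;
  for (size_t i = 0; i < n; i++)
    running_mean += (value - running_mean) / (double)(i + 1);

  std::printf("naive mean:   %.7f\n", naive_mean);
  std::printf("running mean: %.7f\n", running_mean);
  return 0;
}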

dselivanov merged commit f12366e into dselivanov:master on Nov 30, 2020