Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: refresh hyps check + use data available in json + refresh hyps + upper constraints fix when higher than mean #974

Merged
merged 13 commits into from
Jun 17, 2024

Conversation

laresbernardo
Copy link
Collaborator

Fix on #960

@laresbernardo laresbernardo added the bug Something isn't working label May 13, 2024
@laresbernardo laresbernardo requested a review from gufengzhou May 13, 2024 07:36
@laresbernardo laresbernardo self-assigned this May 13, 2024
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 13, 2024
@gufengzhou
Copy link
Contributor

Thanks for updating @laresbernardo . Would you please test the current change vs the main branch on refresh each to see if everything is identical? It'd be identical when you compare the JSONs and the list "hyper_values" are having the same values. Also when testing refresh, make sure to test 2 separate refresh steps (rf1 & rf2) to see if the whole chaining etc work properly. Thanks!

@laresbernardo
Copy link
Collaborator Author

Hey @gufengzhou

Yes, they give the same hyper_values. The decimals are rounded somewhere, so we get 0.5769212583 vs 0.5769213, but looking good.

The issue here is that Robyn$listInit$OutputCollect$hyper_updated is not available when you start building the first refresh model. That is why the user was having issues and crashed. We only have Robyn$listInit$OutputCollect$hyper_fixed which is TRUE and the attr(,"hypParamSamName") has the names of the updated hyperparameters (including penalties, etc). Changing the code a little bit I was able to get around this error but the following question arises: should the penalties/lambda hyps be fixed or should be allow for the model to find new ones within the (unexisting) bounds freedom? If they are fixed I can commit a change that will fix it all; if not, then I have to fix it so that maybe we use the fixed value +- the bounds freedom in those specific hyps. Let me know your thoughts...

@gufengzhou
Copy link
Contributor

Thanks for digging. Theoretically refresh should also iterate through penalty. we just didn't have users going this far so far.

@laresbernardo
Copy link
Collaborator Author

Alright, did some tweaks all around and I feel more confident on refreshing results. I've:

  • feat: created a new bounds_freedom parameter so that users can overwrite the flexibility on hyperparameters bounds for refreshed models (default remains the same)
  • feat: more flexibility on chain logic so original file may be elsewhere
  • fix: fixed hyperparameters for lambda, penalties, train_test are now flexible, using the refresh flexibility bounds. Results are way more consistent and similar to original model
  • fix: issues with refresh plots when chain is not followed correctly

@laresbernardo
Copy link
Collaborator Author

laresbernardo commented May 28, 2024

@gufengzhou can you please check the Fitted vs actual calculations when refreshing? For some reason, the baseline is a large number and the predictions are small numbers (as if they haven't been scaled - check rowSums(xDecompVec)); so you get almost a straight line. Funny thing is that we haven't specifically changed anything there... unless I'm getting terrible fit because of the hyps constraints(?)

@amanrai2508
Copy link

Hi @laresbernardo

Can you check this issue as well , I am getting way different result when I am doing data refresh (for 2 weeks of data)
: #985

laresbernardo and others added 4 commits June 3, 2024 18:01
- negative trend is not interpretable for MMM
- force negative coef when trend is negative to get positive decomp
…dstock

feat: instead of Inf, use channel_constr_up, which by default is 10 for target_efficiency
@laresbernardo laresbernardo changed the title fix: refresh hyps check + use data available in json fix: refresh hyps check + use data available in json + refresh hyps + upper constraints fix when higher than mean Jun 11, 2024
@gufengzhou
Copy link
Contributor

gufengzhou commented Jun 13, 2024

@gufengzhou can you please check the Fitted vs actual calculations when refreshing? For some reason, the baseline is a large number and the predictions are small numbers (as if they haven't been scaled - check rowSums(xDecompVec)); so you get almost a straight line. Funny thing is that we haven't specifically changed anything there... unless I'm getting terrible fit because of the hyps constraints(?)

If you check report_aggregated.csv, you'll see that the intercept are wrong. It was dropped in the initial model but not dropped in the refresh. The dropping of intercept happens in the refit function within robyn_mmm @laresbernardo

image

@amanrai2508
Copy link

Hi @laresbernardo @gufengzhou
This is the same issue that I am facing as well : #985
If you need any help with respect to testing, I can help you with this. (Robyn refresh is flawed currently because of this)

@gufengzhou can you please check the Fitted vs actual calculations when refreshing? For some reason, the baseline is a large number and the predictions are small numbers (as if they haven't been scaled - check rowSums(xDecompVec)); so you get almost a straight line. Funny thing is that we haven't specifically changed anything there... unless I'm getting terrible fit because of the hyps constraints(?)

gufengzhou and others added 2 commits June 15, 2024 12:27
The refactoring of initBounds & listOutputPrev in refresh_hyps was wrong in 774c18d
@laresbernardo
Copy link
Collaborator Author

Ok. It looks good now. @amanrai2508 would you mind updating Robyn to this branch (Robyn::robyn_update(ref = "bl02") + refresh R session) and testing the refresh functionality? Before we merge I'd like to hear from you. Thanks!

@amanrai2508
Copy link

amanrai2508 commented Jun 16, 2024

Hi @laresbernardo
Now the refresh looks good. (For some data it is off but for most of the data it is working)
Just wanted to know why we are plotting onepager of a solutionid where topsolution is False ( Though the results are similar in this solution as well)
image

image

@laresbernardo
Copy link
Collaborator Author

Could you please check if model 1_142_5 is the solution with the lowest DECOMP.RSSD error across all Pareto-front models? The top solutions are the minimum combined error models per cluster, which may not match.

@amanrai2508
Copy link

Yeah, it has the lowest DECOMP.RSSD.
It looks fine on my end @laresbernardo ,now the values are more credible.

@laresbernardo laresbernardo merged commit 3bc8422 into main Jun 17, 2024
1 check passed
laresbernardo added a commit that referenced this pull request Jun 19, 2024
* build(deps): bump braces from 3.0.2 to 3.0.3 in /website (#997)

* fix: refresh hyps check + use data available in json + refresh hyps + upper constraints fix when higher than mean (#974)

* fix: refresh hyps check #960 + use data available in json

* fix: update based on gz's comments

* fix: fixed penalties and other fixed hyps on refreshing models

* fix: refresh plot when chain is broken + feat: new bounds_freedom parameter to overwrite default calculation

* fix: import and store original model when not in original plot_dir

* recode: applied styler::tidyverse_style() to clean code for CRAN

* fix: paid_media_total calc

* fix: print ExportedModel only when available

* fix: deal with negative trend

- negative trend is not interpretable for MMM
- force negative coef when trend is negative to get positive decomp

* fix: upper constraint issue on BA for target_efficiency and weibull adstock

feat: instead of Inf, use channel_constr_up, which by default is 10 for target_efficiency

* fix: reverse wrong bounds update in refresh_hyps

The refactoring of initBounds & listOutputPrev in refresh_hyps was wrong in 774c18d

* recode: apply styler::tidyverse_style()

---------

Co-authored-by: gufengzhou <gufengzhou@gmail.com>

---------

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: gufengzhou <gufengzhou@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants