Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to lift test calibration notebook #1000

Merged
merged 19 commits into from
Sep 10, 2024
Merged

Conversation

drbenvincent
Copy link
Contributor

@drbenvincent drbenvincent commented Sep 4, 2024

This issue closes #999 in that it provides various updates and clarifications to the lift test calibration notebook.

Checklist

Modules affected

  • MMM
  • CLV

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pymc-marketing--1000.org.readthedocs.build/en/1000/

@drbenvincent drbenvincent added docs Improvements or additions to documentation MMM labels Sep 4, 2024
@drbenvincent drbenvincent marked this pull request as draft September 4, 2024 14:12
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@drbenvincent drbenvincent marked this pull request as ready for review September 5, 2024 13:57
Copy link

review-notebook-app bot commented Sep 5, 2024

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2024-09-05T18:38:56Z
----------------------------------------------------------------

Quick idea here: On the second point, I think it may be misleading to mention that we do not account lag. May be better to mention that even considering the lagging (Adstock) function it can be difficult to find these relationships. Otherwise, users might assume that increasing the Max Lagging parameter would solve this problem, because the model would consider large lags.


juanitorduz commented on 2024-09-05T20:11:30Z
----------------------------------------------------------------

+1

drbenvincent commented on 2024-09-09T14:56:47Z
----------------------------------------------------------------

Good point - I now explicitly mention the adstock function. See what you think.

Copy link

review-notebook-app bot commented Sep 5, 2024

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2024-09-05T18:38:57Z
----------------------------------------------------------------

Opinion: In an attempt to get closer to real life, I would lower the X2 values, just dividing by a constant. The channels would still be perfectly correlated, but would make the image more understandable by showing two lines moving in an identical manner.


drbenvincent commented on 2024-09-09T15:00:33Z
----------------------------------------------------------------

Good idea, have done something along these lines, but retaining the max abs transformation

juanitorduz commented on 2024-09-09T17:09:37Z
----------------------------------------------------------------

nice!

Copy link

review-notebook-app bot commented Sep 5, 2024

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2024-09-05T18:38:58Z
----------------------------------------------------------------

Opinion: As in the first notebook, a time series with the actual contribution v contribution recovered by the initial model would be nice, I think!


drbenvincent commented on 2024-09-09T16:08:03Z
----------------------------------------------------------------

I've given this a go. Can you double check the line y=true_params["saturation_beta"][i] * df[f"x{i+1}_adstock_saturated"], # ?????? in the utility function plot_channel_contributions please? I think it's ok, but would appreciate a sanity check.

Copy link

review-notebook-app bot commented Sep 5, 2024

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2024-09-05T18:38:58Z
----------------------------------------------------------------

Opinion: Here I see another opportunity to place recovered and real contribution.


drbenvincent commented on 2024-09-09T16:08:20Z
----------------------------------------------------------------

done

@cetagostini
Copy link
Contributor

I like the improvement in the doc, I leave some comments with ideas and opinions!

Copy link
Collaborator

+1


View entire conversation on ReviewNB

Copy link

review-notebook-app bot commented Sep 5, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-09-05T20:14:27Z
----------------------------------------------------------------

Line #7.    %config InlineBackend.figure_format = 'retina'

"retina" (double ")


drbenvincent commented on 2024-09-09T14:54:22Z
----------------------------------------------------------------

Done in an upcoming commit

@juanitorduz
Copy link
Collaborator

@drbenvincent I love the changes 💪 ! Think about Carlo's suggestions and when you are done ping me for a quick last change ant merge !

* Adding causal language

* Rephrasing!

* changes requested by Juanito

---------

Co-authored-by: Juan Orduz <juanitorduz@gmail.com>
@ulfaslak ulfaslak force-pushed the main branch 2 times, most recently from 891191c to b003606 Compare September 8, 2024 09:29
Copy link
Contributor Author

Done in an upcoming commit


View entire conversation on ReviewNB

Copy link
Contributor Author

Good point - I now explicitly mention the adstock function. See what you think.


View entire conversation on ReviewNB

Copy link
Contributor Author

Good idea, have done something along these lines, but retaining the max abs transformation


View entire conversation on ReviewNB

Copy link
Contributor Author

I've given this a go. Can you double check the line y=true_params["saturation_beta"][i] * df[f"x{i+1}_adstock_saturated"], # ?????? in the utility function plot_channel_contributions please? I think it's ok, but would appreciate a sanity check.


View entire conversation on ReviewNB

Copy link
Contributor Author

done


View entire conversation on ReviewNB

@drbenvincent
Copy link
Contributor Author

Thanks for comments @carlosagostini. I think I've addressed your comments. I did drop in a question for you to double check I'm calculating the true channel contributions correctly.

Copy link
Collaborator

nice!


View entire conversation on ReviewNB

@juanitorduz
Copy link
Collaborator

@carlosagostini if you have time can you please double check @drbenvincent 's las questions :)?

@cetagostini
Copy link
Contributor

cetagostini commented Sep 9, 2024

Hey @drbenvincent and @juanitorduz the implementation is correct but I'll recommend do the following:

# Save the true contribution for each channel
df[["true_contribution_x1", "true_contribution_x2"]] = pm.draw(fixed_model["channel_contributions"], random_seed=rng)

Then replace in the function by:

def plot_channel_contributions(mmm, df):
    channels_contribution_original_scale = (
        mmm.compute_channel_contribution_original_scale()
    )
    channels_contribution_original_scale_hdi = az.hdi(
        ary=channels_contribution_original_scale
    )

    get_mean_contributions_over_time_df = mmm.compute_mean_contributions_over_time(
        original_scale=True
    )

    fig, ax = plt.subplots(
        nrows=2,
        figsize=(15, 8),
        ncols=1,
        sharex=True,
        sharey=False,
        layout="constrained",
    )

    for i, x in enumerate(["channel 1", "channel 2"]):
        # Estimate true contribution in the original scale from the data generating process
        sns.lineplot(
            x=df["date"],
            y=df[f"true_contribution_x{i+1}"], ## Change everything here.
            color="black",
            label=f"{x} true contribution",
            ax=ax[i],
        )
        # HDI estimated contribution in the original scale
        ax[i].fill_between(
            x=df["date"],
            y1=channels_contribution_original_scale_hdi.sel(channel=x)["x"][:, 0],
            y2=channels_contribution_original_scale_hdi.sel(channel=x)["x"][:, 1],
            color=f"C{i}",
            label=rf"{x} $94\%$ HDI contribution",
            alpha=0.4,
        )
        # Mean estimated contribution in the original scale
        sns.lineplot(
            x=df["date"],
            y=get_mean_contributions_over_time_df[x].to_numpy(),
            color=f"C{i}",
            label=f"{x} posterior mean contribution",
            alpha=0.8,
            ax=ax[i],
        )
        ax[i].legend(loc="center left", bbox_to_anchor=(1, 0.5))
        ax[i].set(title=x, ylim=(0, 1))

If we are using pm.draw already, then we can keep it simple estimating the true contribution from the start. What do you think?

This is optional, but not a blocker. I'll approve!

@cetagostini
Copy link
Contributor

@juanitorduz @drbenvincent

Another comment, we are adding numpyro to all our notebooks but it is not a dependency, neither is Jax. So a new user will have problems running our notebooks, sometimes I forget, for example now, trying this I got errors.

I would recommend not using numpyro in the examples, otherwise we should add it to the dependencies. What do you think? I had removed numpyro in a PR but apparently it came back 😅

@juanitorduz
Copy link
Collaborator

juanitorduz commented Sep 10, 2024

Good point @cetagostini ! We need to be more explicit about the samplers available.

@juanitorduz juanitorduz added this to the 0.9.0 milestone Sep 10, 2024
@juanitorduz juanitorduz merged commit 35dc7f1 into main Sep 10, 2024
7 checks passed
@juanitorduz juanitorduz deleted the update-lift-notebook branch September 10, 2024 11:33
@juanitorduz
Copy link
Collaborator

Thanks @drbenvincent !

@drbenvincent
Copy link
Contributor Author

drbenvincent commented Sep 10, 2024

Thanks. I'll get on these changes today/tomorrow.

Oh, merged already 👍🏻

twiecki pushed a commit that referenced this pull request Sep 10, 2024
* minor fixes

* shorter forest plots + update first set of saturation curve plots

* jazz up the triangle plot + edit associated text

* add some additional explanations, clarifications

* additional explanation

* add link to quote by Box

* various tweaks and clarifications

* add examples of when lift tests are useful

* tweaks + set one lift test to _decrease_ channel spend

* add brief mathematical description of what add_lift_test_measurements does

* minor fixes

* Adding causal language (#1003)

* Adding causal language

* Rephrasing!

* changes requested by Juanito

---------

Co-authored-by: Juan Orduz <juanitorduz@gmail.com>

* single quote -> double quote not picked up by ruff

* update 2nd example in intro

* add contribution plots

---------

Co-authored-by: Carlos Trujillo <59846724+cetagostini@users.noreply.github.com>
Co-authored-by: Juan Orduz <juanitorduz@gmail.com>
@wd60622 wd60622 added MMM docs Improvements or additions to documentation and removed docs Improvements or additions to documentation MMM labels Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Improvements or additions to documentation MMM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suggested edits to Lift Test Calibration notebook
4 participants