[715] Adding Mr P docs #716

NathanielF · 2023-09-03T20:33:32Z

Mister P

This is a PR trying to explain and illustrate the technique and value of Multilevel regression and post-stratification techniques using bambi. It is related to issue 715.

The structure of the notebook is seperated in two parts: (a) a refresher on how to think about the properties of the regression modelling as a method for generating an appropriately weighted average across the conditional strata in your in X matrix. We demonstrate this in some detail because MrP can be viewed as a correction to the effects of regression weighting when regression is applied to non-representative data.

We then replicate the kind of analysis achieved with R here

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

review-notebook-app · 2023-09-03T20:33:36Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

NathanielF · 2023-09-03T21:01:21Z

Quarto site seems to render ok too:

NathanielF · 2023-09-03T21:03:57Z

Issue with new contrast function:

NathanielF · 2023-09-03T21:05:18Z

issue with new predictions function:

tomicapretto · 2023-09-04T20:06:42Z

@NathanielF thanks a lot for proactively writing this notebook! I'm adding some minor comments.

docs/notebooks/mister_p.ipynb

tomicapretto · 2023-09-04T20:10:47Z

issue with new predictions function:

I'll try to review this ASAP

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

NathanielF · 2023-09-09T22:07:45Z

I've added some usages of the new plot comparisons and comparisons functionality to this notebook. I specified a base model without the (1 | edu) syntax and demonstrated using this new functionality how the variables of race and education interact within states and used this to justify the complexity of our final model. I think i'm happy enough with the notebook as it stands so i'm marking this as ready for review.

GStechschulte · 2023-09-10T09:48:54Z

issue with new predictions function:

I'll try to review this ASAP

I am back from holidays and should be able to review this issue this week.

NathanielF · 2023-09-11T09:28:32Z

Sorry @GStechschulte I must have deleted my earlier comment. Let me know if you have any questions about the errors i found. They only seemed to occur in my "complex" model where i had allot of group specfic intercept terms (1 | group).

tomicapretto · 2023-09-12T08:29:16Z

issue with new predictions function:

I'll try to review this ASAP

I am back from holidays and should be able to review this issue this week.

@GStechschulte I did a dirty and quick check (translation: added a print before .predict() is called inside predictions()) and saw the dataframe that is passed to predict does not contain the variable state.

GStechschulte · 2023-09-13T07:36:51Z

The problem is that no default values are being computed for model terms specified as group level effects. This is because group level effects do not have a components attribute, and thus never satisfy the logic in the set_default_values() function of ../interpret/utils.py.

For example, the output of the for loop in utils.py with print statements added:

# ../interpret/utils.py
    for term in terms.values():
        print(f"term variable names: {term.var_names}")
        if hasattr(term, "components"):
            print(f"component variable names: {term.var_names}")
            for component in term.components:
                # If the component is a function call, use the argument names
                if isinstance(component, Call):
                    names = [arg.name for arg in component.call.args]
                else:
                    names = [component.name]
                for name in names:
                    if name not in data_dict:
                        # For numeric predictors, select the mean.
                        if component.kind == "numeric":
                            data_dict[name] = np.mean(model.data[name])
                        # For categoric predictors, select the most frequent level.
                        elif component.kind == "categoric":
                            data_dict[name] = mode(model.data[name])

term variable names: set()
term variable names: {'male'}
component variable names: {'male'}
term variable names: {'repvote'}
component variable names: {'repvote'}
term variable names: {'state'}
term variable names: {'eth'}
term variable names: {'edu'}
term variable names: {'male', 'eth'}
term variable names: {'age', 'edu'}
term variable names: {'edu', 'eth'}

The created cap_data indeed includes the user provided predictors and repvote and male. However, no default values are computed for terms that do not have the attribute component.

Group specific effects do have attributes such as expr, factor, and groups that could be used to identify such terms. However, this is still problematic since it is not a component, the kind attribute (used to determine the data type) is intercept or slope.

Furthermore, it is possible to specify interaction terms within the group level effect. This would require us to find the unique set of term names defined by the modeller to ensure default values are not computed more than once.

I am currently attempting to find a solution.

GStechschulte · 2023-09-15T07:10:31Z

predictions, comparisons, and slopes now works for models with common and group specific effects as the bug for computing default values for group specific effects is resolved.

Now, to compute default values, all unique common and group specific effects variable names are identified and put into a list. From there, the data type is determined by identifying the data types used to fit the model.

However, with group specific effects, when the user (or default values are computed) passes values, it is possible these values are new unseen data. Thus, I leverage PR #693 and add the arg sample_new_groups to the functions predictions, comparisons, and slopes (and the associated plotting functions) to allow plotting of new groups.

The code and plots below are for the hierarchical model with group specific effects.

bmb.interpret.plot_predictions(
    model=model_hierarchical,
    idata=result_hierarchical,
    covariates=["edu", "eth"],
    sample_new_groups=True,
    pps=False,
    legend=False,
    fig_kwargs={"figsize": (12, 5), "sharey": True}
)

fig, ax = bmb.interpret.plot_comparisons(
    model=model_hierarchical,
    idata=result_hierarchical,
    contrast={"eth": ["Black", "White"]},
    conditional=["age", "edu"],
    comparison_type="diff",
    sample_new_groups=True,
    subplot_kwargs={"main": "age", "group": "edu"},
    fig_kwargs={"figsize": (7, 3), "sharey": True},
    legend=True,
)

@NathanielF I will create a separate PR today, and then you can pull these changes into your branch 👍🏼

NathanielF · 2023-09-15T08:40:18Z

Great!

GStechschulte · 2023-09-15T12:10:13Z

Great!

Changes are now up in PR #721

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

NathanielF · 2023-09-17T20:16:51Z

I've tightened some of the writing and updated some of the plots. The broad thrust of the article is where I want it to be.

Moving from a discussion of regression as a means for automating stratum specific effect modification into a discussion of MrP as a technique for recovering the correct modification effects.

Throughout, I make use of the new model interepretability functions to emphasise the variation of response across different strata. I think the themes of the piece tie together well with a demonstration of these new functions.

On reflection I think the way I've plotted the output of the final model doesn't need the new plotting functions because in a sense the plots I need require I adjust the model, not plot the predictions as they are.

I'm happy enough with it now as it stands, but would like a review of the content - maybe in particular with an emphasis on what aspects of the new functionality I could add to tie things together better... @tomicapretto , @GStechschulte

tomicapretto · 2023-09-18T16:45:30Z

The error in the test is because the 1.26 release of NumPy. See pymc-devs/pytensor#441

tomicapretto · 2023-09-18T17:18:51Z

docs/notebooks/mister_p.ipynb

@@ -0,0 +1,8615 @@
+{


nit: warnings is part of the standard library, can you import like this?

import warnings

import arviz as az
import bambi as bmb
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm

Reply via ReviewNB

tomicapretto · 2023-09-18T17:18:51Z

docs/notebooks/mister_p.ipynb

@@ -0,0 +1,8615 @@
+{


can it explain the coding of the data? For example, what is the meaning of Risk_Strata when it's 0 or 1. Similar for Outcome and Treatment (this last one is easier to infer)

Reply via ReviewNB

Added above.

looks great!

tomicapretto · 2023-09-18T17:18:51Z

docs/notebooks/mister_p.ipynb

@@ -0,0 +1,8615 @@
+{


others may have the same question as me: where does the "0.25" come from?

Reply via ReviewNB

Separated out the calculation of mean outcome from weighted average calc to make this clearer

tomicapretto · 2023-09-18T17:19:36Z

@NathanielF thanks a lot for this great example. Couldn't finish the review now but I'll try to do it later today!

tomicapretto · 2023-09-19T00:38:01Z

docs/notebooks/mister_p.ipynb

@@ -0,0 +1,8615 @@
+{


Can I ask for one or two sentences explaining what you're computing here and what's the information in the table? I can follow it because I'm familiar with what's going on, but I think it would be better if it was more digested for users.

Reply via ReviewNB

tomicapretto · 2023-09-19T00:38:01Z

docs/notebooks/mister_p.ipynb

@@ -0,0 +1,8615 @@
+{


Can we remove the %%capture so users see how much time it took?

Reply via ReviewNB

docs/notebooks/mister_p.ipynb

tomicapretto · 2023-09-19T00:39:42Z

@NathanielF the notebook is in great shape. It's a lot of great work. I'm still requesting some changes. Many of them are minor. In most cases I'm asking to expand a bit on the explanations/interpretations. Once that is done this will be ready to be merged. Thanks a lot!

NathanielF · 2023-09-19T08:44:39Z

Great to hear. Will adjust today

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

tomicapretto · 2023-09-19T14:15:08Z

@NathanielF thanks for all the updates! If you confirm you're not modifying it anymore, I can merge.

NathanielF · 2023-09-19T14:16:40Z

Thank you! Confirmed

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

[715] Adding Mr P docs

591ec65

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

NathanielF mentioned this pull request Sep 3, 2023

Adding Mister P Docs #715

Closed

tomicapretto reviewed Sep 4, 2023

View reviewed changes

docs/notebooks/mister_p.ipynb Show resolved Hide resolved

docs/notebooks/mister_p.ipynb Show resolved Hide resolved

docs/notebooks/mister_p.ipynb Show resolved Hide resolved

NathanielF added 2 commits September 4, 2023 23:10

[715] updating black formatting and suppressing numpyro messages

dea38ee

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

[715] adding plot_comparisons checks on basic model

0c2132f

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

NathanielF marked this pull request as ready for review September 9, 2023 22:07

GStechschulte mentioned this pull request Sep 15, 2023

interpret support group effects #721

Merged

1 task

[715] updated writing and plots

631d3c6

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

tomicapretto reviewed Sep 18, 2023

View reviewed changes

tomicapretto reviewed Sep 19, 2023

View reviewed changes

[715] added clarifications based on Review

a2bd772

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

[715] updated with black formatting

f65d7f8

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

[715] tiny correction

ba04cf0

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>

tomicapretto merged commit bbc2cb1 into bambinos:main Sep 19, 2023

GStechschulte pushed a commit to GStechschulte/bambi that referenced this pull request Oct 3, 2023

[715] Adding Mr P docs (bambinos#716)

0707b37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[715] Adding Mr P docs #716

[715] Adding Mr P docs #716

NathanielF commented Sep 3, 2023 •

edited

Loading

review-notebook-app bot commented Sep 3, 2023

NathanielF commented Sep 3, 2023

NathanielF commented Sep 3, 2023

NathanielF commented Sep 3, 2023

tomicapretto commented Sep 4, 2023

tomicapretto commented Sep 4, 2023

NathanielF commented Sep 9, 2023

GStechschulte commented Sep 10, 2023

NathanielF commented Sep 11, 2023

tomicapretto commented Sep 12, 2023

GStechschulte commented Sep 13, 2023 •

edited

Loading

GStechschulte commented Sep 15, 2023 •

edited

Loading

NathanielF commented Sep 15, 2023

GStechschulte commented Sep 15, 2023

NathanielF commented Sep 17, 2023

tomicapretto commented Sep 18, 2023

tomicapretto Sep 18, 2023 •

edited

Loading

tomicapretto Sep 18, 2023 •

edited

Loading

NathanielF Sep 19, 2023

tomicapretto Sep 19, 2023

tomicapretto Sep 18, 2023 •

edited

Loading

NathanielF Sep 19, 2023

tomicapretto commented Sep 18, 2023

tomicapretto Sep 19, 2023 •

edited

Loading

tomicapretto Sep 19, 2023 •

edited

Loading

tomicapretto commented Sep 19, 2023

NathanielF commented Sep 19, 2023

tomicapretto commented Sep 19, 2023

NathanielF commented Sep 19, 2023 •

edited

Loading

[715] Adding Mr P docs #716

[715] Adding Mr P docs #716

Conversation

NathanielF commented Sep 3, 2023 • edited Loading

Mister P

review-notebook-app bot commented Sep 3, 2023

NathanielF commented Sep 3, 2023

NathanielF commented Sep 3, 2023

NathanielF commented Sep 3, 2023

tomicapretto commented Sep 4, 2023

tomicapretto commented Sep 4, 2023

NathanielF commented Sep 9, 2023

GStechschulte commented Sep 10, 2023

NathanielF commented Sep 11, 2023

tomicapretto commented Sep 12, 2023

GStechschulte commented Sep 13, 2023 • edited Loading

GStechschulte commented Sep 15, 2023 • edited Loading

NathanielF commented Sep 15, 2023

GStechschulte commented Sep 15, 2023

NathanielF commented Sep 17, 2023

tomicapretto commented Sep 18, 2023

tomicapretto Sep 18, 2023 • edited Loading

Choose a reason for hiding this comment

tomicapretto Sep 18, 2023 • edited Loading

Choose a reason for hiding this comment

NathanielF Sep 19, 2023

Choose a reason for hiding this comment

tomicapretto Sep 19, 2023

Choose a reason for hiding this comment

tomicapretto Sep 18, 2023 • edited Loading

Choose a reason for hiding this comment

NathanielF Sep 19, 2023

Choose a reason for hiding this comment

tomicapretto commented Sep 18, 2023

tomicapretto Sep 19, 2023 • edited Loading

Choose a reason for hiding this comment

tomicapretto Sep 19, 2023 • edited Loading

Choose a reason for hiding this comment

tomicapretto commented Sep 19, 2023

NathanielF commented Sep 19, 2023

tomicapretto commented Sep 19, 2023

NathanielF commented Sep 19, 2023 • edited Loading

NathanielF commented Sep 3, 2023 •

edited

Loading

GStechschulte commented Sep 13, 2023 •

edited

Loading

GStechschulte commented Sep 15, 2023 •

edited

Loading

tomicapretto Sep 18, 2023 •

edited

Loading

tomicapretto Sep 18, 2023 •

edited

Loading

tomicapretto Sep 18, 2023 •

edited

Loading

tomicapretto Sep 19, 2023 •

edited

Loading

tomicapretto Sep 19, 2023 •

edited

Loading

NathanielF commented Sep 19, 2023 •

edited

Loading