-
-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[715] Adding Mr P docs #716
Conversation
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
@NathanielF thanks a lot for proactively writing this notebook! I'm adding some minor comments. |
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
I've added some usages of the new plot comparisons and comparisons functionality to this notebook. I specified a base model without the |
Sorry @GStechschulte I must have deleted my earlier comment. Let me know if you have any questions about the errors i found. They only seemed to occur in my "complex" model where i had allot of group specfic intercept terms |
@GStechschulte I did a dirty and quick check (translation: added a print before .predict() is called inside |
The problem is that no default values are being computed for model terms specified as group level effects. This is because group level effects do not have a For example, the output of the for loop in # ../interpret/utils.py
for term in terms.values():
print(f"term variable names: {term.var_names}")
if hasattr(term, "components"):
print(f"component variable names: {term.var_names}")
for component in term.components:
# If the component is a function call, use the argument names
if isinstance(component, Call):
names = [arg.name for arg in component.call.args]
else:
names = [component.name]
for name in names:
if name not in data_dict:
# For numeric predictors, select the mean.
if component.kind == "numeric":
data_dict[name] = np.mean(model.data[name])
# For categoric predictors, select the most frequent level.
elif component.kind == "categoric":
data_dict[name] = mode(model.data[name])
The created Group specific effects do have attributes such as Furthermore, it is possible to specify interaction terms within the group level effect. This would require us to find the unique set of term names defined by the modeller to ensure default values are not computed more than once. I am currently attempting to find a solution. |
Now, to compute default values, all unique common and group specific effects variable names are identified and put into a list. From there, the data type is determined by identifying the data types used to fit the model. However, with group specific effects, when the user (or default values are computed) passes values, it is possible these values are new unseen data. Thus, I leverage PR #693 and add the arg The code and plots below are for the hierarchical model with group specific effects. bmb.interpret.plot_predictions(
model=model_hierarchical,
idata=result_hierarchical,
covariates=["edu", "eth"],
sample_new_groups=True,
pps=False,
legend=False,
fig_kwargs={"figsize": (12, 5), "sharey": True}
) fig, ax = bmb.interpret.plot_comparisons(
model=model_hierarchical,
idata=result_hierarchical,
contrast={"eth": ["Black", "White"]},
conditional=["age", "edu"],
comparison_type="diff",
sample_new_groups=True,
subplot_kwargs={"main": "age", "group": "edu"},
fig_kwargs={"figsize": (7, 3), "sharey": True},
legend=True,
) @NathanielF I will create a separate PR today, and then you can pull these changes into your branch 👍🏼 |
Great! |
Changes are now up in PR #721 |
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
I've tightened some of the writing and updated some of the plots. The broad thrust of the article is where I want it to be. Moving from a discussion of regression as a means for automating stratum specific effect modification into a discussion of MrP as a technique for recovering the correct modification effects. Throughout, I make use of the new model interepretability functions to emphasise the variation of response across different strata. I think the themes of the piece tie together well with a demonstration of these new functions. On reflection I think the way I've plotted the output of the final model doesn't need the new plotting functions because in a sense the plots I need require I adjust the model, not plot the predictions as they are. I'm happy enough with it now as it stands, but would like a review of the content - maybe in particular with an emphasis on what aspects of the new functionality I could add to tie things together better... @tomicapretto , @GStechschulte |
The error in the test is because the 1.26 release of NumPy. See pymc-devs/pytensor#441 |
@@ -0,0 +1,8615 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: warnings is part of the standard library, can you import like this?
import warnings
import arviz as az
import bambi as bmb
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
Reply via ReviewNB
@@ -0,0 +1,8615 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can it explain the coding of the data? For example, what is the meaning of Risk_Strata
when it's 0 or 1. Similar for Outcome
and Treatment
(this last one is easier to infer)
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great!
@@ -0,0 +1,8615 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separated out the calculation of mean outcome from weighted average calc to make this clearer
@NathanielF thanks a lot for this great example. Couldn't finish the review now but I'll try to do it later today! |
@@ -0,0 +1,8615 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask for one or two sentences explaining what you're computing here and what's the information in the table? I can follow it because I'm familiar with what's going on, but I think it would be better if it was more digested for users.
Reply via ReviewNB
@@ -0,0 +1,8615 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NathanielF the notebook is in great shape. It's a lot of great work. I'm still requesting some changes. Many of them are minor. In most cases I'm asking to expand a bit on the explanations/interpretations. Once that is done this will be ready to be merged. Thanks a lot! |
Great to hear. Will adjust today |
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF thanks for all the updates! If you confirm you're not modifying it anymore, I can merge. |
Thank you! Confirmed |
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Mister P
The structure of the notebook is seperated in two parts: (a) a refresher on how to think about the properties of the regression modelling as a method for generating an appropriately weighted average across the conditional strata in your in X matrix. We demonstrate this in some detail because MrP can be viewed as a correction to the effects of regression weighting when regression is applied to non-representative data.
We then replicate the kind of analysis achieved with R here