ATT/ATC incorrectly updating A #66

Open
Larsvanderlaan opened this issue Oct 5, 2020 · 4 comments

@Larsvanderlaan

The ATT parameter targets both the A node and the Y node. However, in the spec, the likelihood factor for "A" is of type "likelihood", while the likelihood factor for "Y" is of type "mean". As a result, when "A" is targeted with the logistic submodel, the observed likelihood P(A|W) is (incorrectly) updated rather than the conditional mean P(A=1|W): epsilon is fit with a logistic submodel whose offset is logit P(A|W). This will generally give an incorrect update that does not solve the necessary score equation. The current ATT test is very simple and does not detect the difference between using P(A=1|W) and P(A|W). With a more complex simulation, you can see that the current approach does not necessarily increase the likelihood relative to the initial fit, nor does it solve the A-specific score equation as well as the correct method. With either of the two approaches, I couldn't get the results to match the classic tmle results very well, so this might be worth looking into.

This simulation code is what I used:

library(simcausal)
library(data.table)

# W and W1 are baseline covariates, A is the binary treatment, Y the binary outcome.
# g1 stores the true P(A=1|W); EY1 stores the true E[Y|A=1,W].
D <- DAG.empty()
D <- D +
  node("W", distr = "runif", min = -0.8, max = 0.8) +
  node("W1", distr = "runif", min = -1, max = 1) +
  node("A", distr = "rbinom", size = 1, prob = plogis(W1)) +
  node("g1", distr = "rconst", const = plogis(W1)) +
  node("Y", distr = "rbinom", size = 1, prob = plogis(-1.5 + 1 + A + W - W1 / 2)) +
  node("EY1", distr = "rconst", const = plogis(-1.5 + 1 + 1 + W - W1 / 2))
setD <- set.DAG(D)
data <- sim(setD, n = 1000)
data <- as.data.table(data)

@Larsvanderlaan
Author

Also, as a heads-up, the data structure/node_list from the cpp data:
node_list <- list(
W = c("sexn"),
A = "parity01",
Y = "haz01"
)

has virtually no signal between W and either A or Y, so all the estimates are pretty much what Lrnr_mean would give. I would recommend making a simulated dataset of sufficient complexity.

@jeremyrcoyle
Collaborator

I was under the impression that updates were always fit using the observed outcomes and the corresponding likelihood values (so here, the observed A and P(A=a|W)). The update is then applied to both the observed and counterfactual P(A=1|W). Is that not the case for the ATT?

@Larsvanderlaan
Author

If you want to update both the observed and counterfactual likelihoods (and not just P(A=1|W)), then you need to use a different submodel. The logistic submodel is really only for updating E[A|W] or E[Y|W]. Of course, an update of P(A=1|W) also gives an update of P(A=0|W) = 1 - P(A=1|W). The score of A is something like HA * (A - P(A=1|W)), which means we can update/target P(A=1|W) with a logistic submodel. But currently we are updating as P(A=a|W)* = plogis(qlogis(P(A=a|W)) + eps * HA), where the offset now depends on the observed outcome A, which isn't what we want. We wouldn't use type = "likelihood" for the Y node, and for the same reason we don't want it for the A node.
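
To spell out why the offset matters, here is a short derivation based only on the formulas quoted in this comment. The notation is an assumption for illustration: g(W) stands for P(A=1|W), g(A|W) for the observed likelihood, H_A for the clever covariate (HA above), and expit/logit for plogis/qlogis.

```latex
\begin{align*}
g(W) &= P(A = 1 \mid W), \qquad
g(A \mid W) = g(W)^{A}\,\{1 - g(W)\}^{1 - A} \\
D_A &= H_A(W)\,\{A - g(W)\} \\
\text{intended:}\quad
g_\epsilon(W) &= \operatorname{expit}\{\operatorname{logit} g(W) + \epsilon H_A(W)\} \\
\text{likelihood offset, } A = 0:\quad
g_\epsilon(0 \mid W) &= \operatorname{expit}\{\operatorname{logit}(1 - g(W)) + \epsilon H_A(W)\} \\
&= 1 - \operatorname{expit}\{\operatorname{logit} g(W) - \epsilon H_A(W)\}
\end{align*}
```

So for observations with A = 0, the implied P(A=1|W) is shifted by -eps * H_A rather than +eps * H_A. The rows with A = 1 and A = 0 then follow different fluctuations of the same conditional mean, which is one way to see why the score equation above is generally not solved.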

If we would like to continue representing the A node as a true likelihood and not a conditional mean, then
the submodel would be A * plogis(qlogis(ifelse(A == 1, pA, 1 - pA)) + eps * H) + (1 - A) * (1 - plogis(qlogis(ifelse(A == 1, pA, 1 - pA)) + eps * H)), where pA = P(A=a|W) is the observed likelihood, so that ifelse(A == 1, pA, 1 - pA) recovers P(A=1|W). The epsilon is fit by running a logistic regression of A on the covariate H with offset qlogis(P(A=1|W)).
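
A minimal sketch of that fitting step in plain R, using only base `glm`. Nothing here is tmle3 code: the variable names (`g1_init`, `H`, `g1_star`, `pA_star`) are hypothetical, the clever covariate is a placeholder rather than the real ATT covariate, and the initial estimate is deliberately misspecified so the update has something to do.

```r
# Illustration only: not the tmle3 implementation; all names are hypothetical.
set.seed(1)
n  <- 2000
W1 <- runif(n, -1, 1)
A  <- rbinom(n, 1, plogis(W1))

# Deliberately misspecified initial estimate of P(A = 1 | W).
g1_init <- rep(mean(A), n)
# Placeholder clever covariate (assumption); the real ATT covariate depends on Q and g.
H <- W1

# Fit epsilon: logistic regression of A on H, no intercept,
# with offset logit P(A = 1 | W) (not logit P(A = a | W)).
fit <- glm(A ~ -1 + H + offset(qlogis(g1_init)), family = binomial())
eps <- unname(coef(fit))

# Updated conditional mean, then mapped back to the observed-data likelihood.
g1_star <- plogis(qlogis(g1_init) + eps * H)
pA_star <- A * g1_star + (1 - A) * (1 - g1_star)

# Empirical mean of the A-specific score H * (A - P(A = 1 | W)) after the update.
score <- mean(H * (A - g1_star))

# Log-likelihood before and after the update.
loglik_init <- sum(log(A * g1_init + (1 - A) * (1 - g1_init)))
loglik_star <- sum(log(pA_star))
```

Because eps is the maximum likelihood estimate along a submodel that contains the initial fit at eps = 0, `score` should be zero up to numerical tolerance and `loglik_star` should be no smaller than `loglik_init`. These are the two properties the issue says the current update does not reliably have.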

@Larsvanderlaan
Author

ATT and ATC are missing the W component of the EIF. Also, I think we can avoid the need for iterative targeting by estimating with the empirical mean over (A,W). (I think Nima used this trick for the shift intervention as well.)

2 participants