Do I need to scale/normalize the features (input variables)? #193
Replies: 3 comments
-
Dear @ferrenlove , thanks for opening this discussion. Generally, you don't have to normalize the data before using DoubleML. Many learners (like cv.glmnet() or its mlr3 interface) normalize the data internally. One example is the CV Lasso in Python (scikit-learn) and R (glmnet), where glmnet normalizes by default but scikit-learn doesn't. You can have a look at the learners in the 401(k) example with the Python package <https://docs.doubleml.org/stable/examples/py_double_ml_pension.html> and the R package <https://docs.doubleml.org/stable/examples/R_double_ml_pension.html>. In Python, we had to define a pipeline that implements the normalization for the lasso.
Regarding your second question: I think it's very difficult to give general advice on this. I'd recommend you have a look at the predictive performance of each of these learners, i.e., the first-stage prediction errors for the nuisance components ml_l and ml_m (see the sketches after this comment). Our experience tells us that the choice and the performance of the learners play an important role for the resulting causal estimate. See for example our working paper for more information: https://arxiv.org/abs/2402.04674
Maybe that helps you to better understand which learner works well and which doesn't in your setting. Also, you might consider the hyperparameter choice of your learner. Whereas the CV Lasso adjusts the value for $\lambda$ internally, you'd have to tune the parameters for the gradient boosting learner. We experienced that the default parameters of XGBoost do not seem appropriate in several settings, and tuning changes XGBoost's performance quite a bit.
I hope that helps you. Feel free to give an update on your causal modeling task :)
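[Editor's sketch] As a minimal illustration of the glmnet default mentioned above (the toy data here are made up for the example): cv.glmnet() standardizes the design matrix internally before applying the penalty, so no manual scaling is needed on the R side:
library(glmnet)
# Toy data: 100 observations, 10 features, one deliberately mis-scaled
set.seed(123)
n <- 100; p <- 10
x <- matrix(rnorm(n * p), n, p)
x[, 2] <- 1000 * x[, 2]  # feature on a much larger scale
y <- x[, 1] + rnorm(n)
# standardize = TRUE is already the default in cv.glmnet(),
# so the lasso penalty acts on a common scale across features
fit <- cv.glmnet(x, y, alpha = 1, standardize = TRUE)
fit$lambda.min  # the CV-selected penalty level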
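[Editor's sketch] For checking the first-stage prediction errors of ml_l and ml_m, something along the following lines may help. It assumes a recent DoubleML version where fit() accepts store_predictions, that data_ml is an existing DoubleMLData object, and that the outcome is continuous; the XGBoost hyperparameters are illustrative, not a recommendation:
library(DoubleML)
library(mlr3)
library(mlr3learners)
# XGBoost with non-default hyperparameters (defaults are often not appropriate)
learner_xgb <- lrn("regr.xgboost", nrounds = 300, eta = 0.05, max_depth = 3)
dml_plr <- DoubleMLPLR$new(data_ml, ml_l = learner_xgb, ml_m = learner_xgb)
dml_plr$fit(store_predictions = TRUE)
# Out-of-sample RMSE for the outcome nuisance ml_l (predicting Y from X);
# the treatment nuisance ml_m can be checked analogously
y <- data_ml$data[[data_ml$y_col]]
rmse_ml_l <- sqrt(mean((y - dml_plr$predictions$ml_l)^2))
rmse_ml_l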
-
Hi PhilippBach:
Thank you for your reply! It is super helpful! I did check your training material, but it seems I only have time for the Europe and Asia sessions, not for Pacific time. Will there be any recordings available? Regarding the contents, will you cover something like causal forests? I am interested in that as well.
Secondly, thanks for the detailed explanation. I have a follow-up question. I found that using ml_l = lm, lasso, or random forest gives me relatively low MAE (or RMSE). However, ml_l = xgb usually produces a model that is about 10%-15% larger than the other models. For the second step, with ml_m = xgb, it produces the largest treatment effect coefficient, which can be a few times that from ml_l = lm, lasso, or random forest. What is your opinion?
--
Sincerely,
Xue Liu (Sheryl)
-
Hi @ferrenlove , thanks for following up on this...
Regarding the estimation with XGBoost etc.: I'm not sure if I fully understand. Do you refer to a 10%-15% larger coefficient or a 10%-15% larger RMSE for the predictive tasks? Generally, I'd recommend you double-check the predictive performance, e.g., by repeating the estimation several times (a sketch of one way to do this follows below).
Regarding the trainings: We generally don't record the sessions... Sorry to hear that you won't be able to join the Pacific-time training in June. Hopefully it works out on another occasion. We do not cover causal forests in much detail, but we do have a session on heterogeneous treatment effects and also some materials on the relation of DML for CATEs & GATEs to causal forests.
I hope this helps! Best, Philipp
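[Editor's sketch] One way to repeat the estimation is the n_rep argument of the DoubleML model classes, which repeats the cross-fitting sample splits; a sketch, assuming data_ml and learner_xgb are defined as in the question below:
# Repeat the sample splitting and estimation 5 times and check
# how stable the coefficient and standard error are across splits
dml_plr_rep <- DoubleMLPLR$new(data_ml, ml_l = learner_xgb, ml_m = learner_xgb,
                               n_rep = 5)
dml_plr_rep$fit()
dml_plr_rep$coef
dml_plr_rep$se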
-
I am a data scientist who does some research without A/B tests. I am trying DoubleML in R and have a few questions:
XGBoost
library(DoubleML)
library(mlr3)
library(mlr3learners)
# data_ml is assumed to be a DoubleMLData object built from the data
learner_xgb <- lrn("regr.xgboost", objective = "reg:squarederror")
dml_plr_xgb <- DoubleMLPLR$new(data_ml, ml_l = learner_xgb, ml_m = learner_xgb)
dml_plr_xgb$fit()
GLMNET
# Cross-validated lasso; glmnet standardizes the features internally
learner_glmnet <- lrn("regr.cv_glmnet")
dml_plr_glmnet <- DoubleMLPLR$new(data_ml, ml_l = learner_glmnet, ml_m = learner_glmnet)
dml_plr_glmnet$fit()
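[Editor's sketch] A minimal way to compare the two fits afterwards, assuming both calls above ran through (summary(), coef, and se are available on fitted DoubleML objects):
# Estimated treatment effect, standard error, and significance
dml_plr_xgb$summary()
dml_plr_glmnet$summary()
# Direct comparison of the coefficients
dml_plr_xgb$coef
dml_plr_glmnet$coef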
Hope these questions make sense to you!
Thank you!