[python] Reproduce the result of cross validation #5000
Comments
Thanks for your interest in LightGBM and excellent issue write-up! I haven't had a chance to run your code yet, but I can say briefly that I see one major way that your code differs from lgb.cv().
To ensure that all subsets used in CV have the same bin boundaries, lgb.cv() takes its folds as subsets of a single Dataset; see LightGBM/python-package/lightgbm/engine.py, lines 348 to 349 at commit 0688f47.
In your code, you are creating a new Dataset for each fold. For more details, you may want to see this related discussion: #4319
Could you try modifying your code to create one Dataset and take each fold's training data from it with .subset()?
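To see why the bin boundaries matter, here is a minimal sketch in plain Python. It uses simple equal-frequency binning, which is an assumption made for illustration and not LightGBM's actual bin-finding algorithm; the helper `bin_boundaries` and the index-modulo fold split are hypothetical. It only shows that boundaries computed from each fold's rows generally differ from boundaries computed once on the full data.

```python
# Illustrative only: equal-frequency binning, NOT LightGBM's real bin-finding code.
import random


def bin_boundaries(values, n_bins):
    """Return the n_bins - 1 cut points of equal-frequency bins (hypothetical helper)."""
    ordered = sorted(values)
    # Cut point k sits at the k/n_bins position of the sorted sample.
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]


random.seed(0)
full = [random.gauss(0.0, 1.0) for _ in range(300)]

# Hypothetical custom 3-fold split: fold i trains on every row whose index % 3 != i.
train_rows = [[v for j, v in enumerate(full) if j % 3 != i] for i in range(3)]

# Separate binning per fold, analogous to building a new Dataset from each fold's raw data.
per_fold = [bin_boundaries(rows, 4) for rows in train_rows]

# Shared binning, analogous to subsetting one Dataset built from the full data.
shared = bin_boundaries(full, 4)

for i, bounds in enumerate(per_fold):
    print(f"fold {i}: own boundaries {bounds}")
print(f"shared boundaries: {shared}")
```

If each fold is binned on its own, the trees are grown over slightly different candidate split points, so the resulting boosters can differ even when the rows and parameters are identical.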
Thank you very much. I changed the code according to your suggestion and it works like a charm! There is very good discussion in #4319 too. I much appreciate your effort on this repo.
# ------------------------------
boosters = []
for i in range(3):
    # take the fold's training rows from the single shared Dataset
    train_fold = train_data.subset(folds[i][0])  # <----- change here
    booster = lgb.train(
        params=DEFAULTED_PARAMS,
        train_set=train_fold,
        num_boost_round=10
    )
    boosters.append(booster)
# ------------------------------
valid_feat = data.loc[folds[0][1], ['FEAT1', 'FEAT2']]
pred_cv = cvbooster['cvbooster'].boosters[0].predict(valid_feat)
pred_train = boosters[0].predict(valid_feat)
pred_cv
array([0.12345])
pred_train
array([0.12345])  # exactly the same as in cv
oh great, glad that worked! I'll close this for now, come back any time if you have other questions 👋
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Description
For my own research and to understand the package better, I would like to replicate the boosters generated by lgb.cv() (with custom folds). I use lgb.train() on each custom fold's data, but the boosters from lgb.train() are different from the boosters from lgb.cv().
In the following example, I use custom 3-folds on a toy data set.
Reproducible example
So cvbooster['cvbooster'].boosters[0] is a different booster from boosters[0]. I've tried many different data sets and custom folds but still can't replicate the booster from cv. Sorry if I made a silly mistake somewhere.
Environment info
LightGBM version: 3.2.1
Command(s) you used to install LightGBM: