refit in Python does not support weights #3038
Please refer to #1629 (comment).
Thanks for pointing me to the comment, but
Example: a tree with three observations that makes one split. Using the weights, I can determine the value of the leaf for the bigger group.

import numpy as np
import lightgbm

X = np.array([1, 2, 2]).reshape((3, 1))
label = np.array([1, 2, 3])
data = lightgbm.basic.Dataset(X, label)
booster = lightgbm.engine.train(
    {
        "min_data_in_bin": 1,
        "min_data_in_leaf": 1,
        "learning_rate": 1,
        "boost_from_average": False,
    },
    data,
    num_boost_round=2,
)
booster.predict(X)
# array([1. , 2.5, 2.5])

# let's refit (to make sure it works)
booster_refit = booster.refit(X, label, decay_rate=0.0)
booster_refit.predict(X)
# array([1. , 2.5, 2.5])

# use weights (I added data_set_kwargs)
booster_refit = booster.refit(
    X, label, decay_rate=0.0, data_set_kwargs={"weight": np.array([1.0, 0.0, 1.0])}
)
booster_refit.predict(X)
# array([1., 3., 3.])

booster_refit = booster.refit(
    X, label, decay_rate=0.0, data_set_kwargs={"weight": np.array([1.0, 1.0, 0.0])}
)
booster_refit.predict(X)
# array([1., 2., 2.])
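The numbers in the example above can be checked by hand. The following is a minimal sketch in plain Python, assuming an L2 regression objective in which each row contributes a gradient of `-weight * residual` and a hessian of `weight`, and assuming the refitted leaf value is blended with the old one through `decay_rate`. The function `refit_leaf_value` and its parameter names are hypothetical and only illustrate the arithmetic; they are not LightGBM's internals.

```python
def refit_leaf_value(old_value, residuals, weights, decay_rate=0.0,
                     learning_rate=1.0, lambda_l2=0.0):
    """Blend an old leaf value with a weighted re-estimate (illustrative only).

    Assumes an L2 regression objective: gradient = -weight * residual,
    hessian = weight. Raises ZeroDivisionError if all weights are zero
    and lambda_l2 is zero.
    """
    sum_grad = -sum(w * r for w, r in zip(weights, residuals))
    sum_hess = sum(weights)
    # Newton step for the leaf, scaled by the learning rate.
    new_value = -sum_grad / (sum_hess + lambda_l2) * learning_rate
    # decay_rate=0.0 discards the old value, decay_rate=1.0 keeps it unchanged.
    return decay_rate * old_value + (1.0 - decay_rate) * new_value


# The right-hand leaf of the example holds the labels 2 and 3. With
# boost_from_average=False the first tree starts from a score of 0,
# so the residuals equal the labels.
print(refit_leaf_value(2.5, [2.0, 3.0], [1.0, 1.0]))  # 2.5
print(refit_leaf_value(2.5, [2.0, 3.0], [0.0, 1.0]))  # 3.0
print(refit_leaf_value(2.5, [2.0, 3.0], [1.0, 0.0]))  # 2.0
```

Under these assumptions, a zero weight removes a row from the leaf estimate, which is why the refitted predictions move from 2.5 to 3 or 2.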
Closed in favor of being in #2302. We decided to keep all feature requests in one place. Welcome to contribute to this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature.
I'll work on this. Here is the plan I'll follow; any changes to it are welcome.
…t() (fixes #3038) (#4894)
* feat: refit additional kwargs for dataset and predict
* test: kwargs for refit method
* fix: __init__ got multiple values for argument
* fix: pycodestyle E302 error
* refactor: dataset_params to avoid breaking change
* refactor: expose all Dataset params in refit
* feat: dataset_params updates new_params
* fix: remove unnecessary params to test
* test: parameters input are the same
* docs: address StrikeRUS changes
* test: refit test changes in train dataset
* test: set init_score and decay_rate to zero
This issue has been automatically locked since there has not been any recent activity since it was closed.
Summary
The refit method in Python does not support weights (or in fact anything other than data and labels). This is because here, the training data set gets created using:
It would be great if refit accepted additional arguments (or kwargs) specifically for the Dataset call.
Passing in the train_set directly would also be an option. But since we need to predict first, we would also need to pass the training data in as a data frame, which is not so nice.
Motivation
I'm using re-fit in an application where weights are very important.
I'm happy to open a PR if this sounds useful.
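The request in the summary, forwarding extra keyword arguments to the dataset constructor, can be sketched with toy stand-ins. Everything in this block is hypothetical: `ToyDataset` and `toy_refit` are illustrative names and are not part of LightGBM, and the `data_set_kwargs` parameter simply mirrors the name used in the example comment of this thread.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset container (not lightgbm.Dataset)."""

    def __init__(self, data, label, weight=None, init_score=None):
        self.data = data
        self.label = label
        self.weight = weight
        self.init_score = init_score


def toy_refit(data, label, decay_rate=0.9, data_set_kwargs=None):
    """Build the training set for a refit, forwarding optional dataset arguments.

    Only the dataset construction step is shown; the actual refitting of
    leaf values is out of scope for this sketch.
    """
    # Copy so the caller's dict is never mutated.
    kwargs = dict(data_set_kwargs or {})
    # Unknown keys fail loudly with a TypeError from the constructor.
    return ToyDataset(data, label, **kwargs)


train_set = toy_refit(
    [[1], [2], [2]], [1, 2, 3], decay_rate=0.0,
    data_set_kwargs={"weight": [1.0, 0.0, 1.0]},
)
print(train_set.weight)  # [1.0, 0.0, 1.0]
```

Keeping the dataset arguments in one dictionary leaves the existing positional signature untouched, so callers that pass only data and labels would not need to change.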