Implement easy access to single-tree prediction in fitted LGBM model #3058
Comments
Any update on this? We are facing similar issues.
@shiyu1994 can you help to check this?
Maybe we can add a
Will adding
LightGBM/python-package/lightgbm/basic.py, lines 2809 to 2851 (at b299de3)
LightGBM/python-package/lightgbm/basic.py, lines 2611 to 2633 (at b299de3)
I've done the implementation as @StrikerRUS suggested. If boost_from_average is enabled, the average score will be integrated into the first tree. So booster.predict(data, start_iteration=0, num_iteration=1) will provide the score of the first tree with the average value added. Does that meet your request? @pransito
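For readers landing here, a minimal sketch of what that comment describes (assuming a LightGBM version where Booster.predict accepts start_iteration, e.g. 3.0+; the data and parameter values below are only illustrative):

```python
import numpy as np
import lightgbm as lgb

# Illustrative data and model; shapes and parameters are placeholders.
X = np.random.rand(500, 10)
y = np.random.rand(500)
booster = lgb.train(
    {"objective": "regression", "boost_from_average": True},
    lgb.Dataset(X, label=y),
    num_boost_round=100,
)

# Score of the first tree, with the boost_from_average baseline folded in:
first_tree_pred = booster.predict(X, start_iteration=0, num_iteration=1)

# Output of tree i on its own (for regression, that tree's leaf values):
i = 5
tree_i_pred = booster.predict(X, start_iteration=i, num_iteration=1)
```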
Hello Juline,
There was some response from the LightGBM developers; see the comment above and the ticket on GitHub.
Is this what we were missing?
Hope you are doing well!
Regards
Francisco J. Navarro-Brull
This issue has been automatically locked since there has not been any recent activity after it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
This has been mentioned in #845. However, the solution suggested there does not work. Here I would like to re-emphasize the need and elaborate on the desired feature.
Summary
In sklearn it is very easy, via model.estimators_, to get the prediction of every single tree in the ensemble, i.e. each tree's own prediction regardless of all other trees (not a cumulative prediction). In LightGBM (I am mainly concerned with regression) this is difficult or even impossible so far. In #845 it was suggested to achieve this via booster.dump_model, leaf-index prediction, etc., but I have not managed to make that work: the values associated with the leaves seem to be mean-corrected, or to reflect only the incremental change relative to the previous trees. Even taking all of that into account, it is still a cumulative prediction and hence yields very narrow prediction distributions.
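For comparison, a minimal sketch of the sklearn behaviour referred to above (random forest case; model and data names are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Illustrative data; shapes are placeholders.
X = np.random.rand(200, 5)
y = np.random.rand(200)
model = RandomForestRegressor(n_estimators=100).fit(X, y)

# Each fitted tree is exposed directly and predicts on its own,
# independently of the rest of the ensemble:
per_tree_preds = np.stack([tree.predict(X) for tree in model.estimators_])
# per_tree_preds has shape (n_trees, n_rows): a full distribution of
# predictions per row, not a running cumulative sum.
```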
Motivation
It would be very useful to have this feature because in certain use cases it is important to get an idea of the distribution of predictions across all the trees (is it wide or narrow? is it skewed?). In some sense this distribution can be interpreted as a posterior distribution over the metric variable being predicted (in LGBM regression). This is relevant for both classical GBM regression and classical RF regression.
Description
Like in sklearn, there should be an .estimators_ object with a .predict(X) method that returns the prediction of every single tree for every row in X. It should be easily accessible, not hidden, and it should automatically account for whether boost_from_average was used. A clear distinction should be made between the cumulative prediction (currently implemented via .predict(num_iteration=i)) and the "iid" prediction (every single tree on its own), which I suggest implementing as a new feature. One could imagine giving the .predict() function a flag cumulative=True; when set to False, the trees would answer independently of one another (see the workaround sketch below).
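Such a cumulative flag does not exist today. As a stopgap under the current API, one could approximate per-tree increments by differencing cumulative predictions; note this is only a sketch and yields each tree's incremental contribution (with the boost_from_average baseline folded into the first one), not the independent per-tree predictions requested above:

```python
import numpy as np

def per_tree_increments(booster, X):
    """Incremental contribution of each tree, obtained by differencing
    cumulative predictions. Returns an array of shape (n_trees, n_rows)."""
    n_trees = booster.num_trees()
    cumulative = np.stack(
        [booster.predict(X, num_iteration=i) for i in range(1, n_trees + 1)]
    )
    # The first row retains the boost_from_average baseline, if any.
    return np.diff(cumulative, axis=0, prepend=np.zeros((1, X.shape[0])))
```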
References
https://github.com/scikit-learn/scikit-learn/blob/95d4f0841/sklearn/tree/_classes.py#L395