[python-package] use 2d collections for predictions, grads and hess in multiclass custom objective #4925
Is it safe to use `_Booster__num_class` here instead to avoid the lib call? I don't fully understand where `__num_class` gets converted to `_Booster__num_class`.
Yes. It is safe since `Booster.__num_class` comes from the lib call. See `LightGBM/python-package/lightgbm/basic.py`, lines 2598 to 2601 and lines 2616 to 2620 in a1fbe84.
I mean that the attribute changes name. I see it's used as `self.__num_class` in some places, but if I add a breakpoint at that line it doesn't have that attribute; it has `self._Booster__num_class`, which is the part that confuses me. Do you think the performance impact of calling the lib on each iteration is noticeable and that this should be changed to use the attribute instead?
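For context, this is Python's name mangling: inside a class body, an attribute spelled with two leading underscores is rewritten by the compiler to `_ClassName__attr`. A minimal sketch with a toy class (it reuses the `Booster` name only for illustration and is not LightGBM's actual class):

```python
# Toy class to illustrate name mangling; not LightGBM's Booster.
class Booster:
    def __init__(self):
        self.__num_class = 3  # stored on the instance as _Booster__num_class

    def num_class(self):
        # inside the class body the mangled lookup happens automatically
        return self.__num_class


b = Booster()
print(b._Booster__num_class)  # 3 -- the mangled name is the real attribute name
print(b.num_class())          # 3 -- same attribute, accessed from inside the class
# print(b.__num_class)        # AttributeError: the unmangled name only works inside the class body
```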
Could you please also check that the customized evaluation function works correctly with multiclass? I've read the code, and it seems that the customized evaluation function will ultimately take the output of `__inner_predict` as input, which has shape `n_sample * n_class`. This is inconsistent with the hint here: `LightGBM/python-package/lightgbm/basic.py`, line 3840 in 820ae7e.
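To make the shape question concrete, here is a minimal sketch of a multiclass custom eval function, assuming it receives the 2-D `(n_samples, n_classes)` raw scores this PR moves toward; the function name and the softmax/log-loss details are illustrative, not part of the PR:

```python
import numpy as np


def multiclass_logloss_feval(preds, eval_data):
    """Illustrative feval: multiclass log loss from raw scores.

    Assumes ``preds`` has shape (n_samples, n_classes) and still needs a softmax.
    """
    labels = eval_data.get_label().astype(int)
    shifted = preds - preds.max(axis=1, keepdims=True)  # numerical stability
    prob = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    eps = 1e-15
    loss = -np.mean(np.log(np.clip(prob[np.arange(labels.size), labels], eps, 1.0)))
    return 'custom_multi_logloss', loss, False  # (name, value, is_higher_better)
```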
Hmm, you're right. I've only modified the portions required for `fobj`; I'll work on `feval`.
I moved the reshaping to `__inner_predict` in 5a56a30 so that it works in both places, and added a test to check that we get the same result using the built-in log loss and computing it manually.
The grad and hess are weighted in the sklearn interface but not in `basic`; should we weight them there as well?
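For reference, the kind of weighting being discussed looks roughly like the sketch below: a wrapper that multiplies whatever grad/hess a custom objective returns by the per-sample weights. This is an illustration under the assumption of 2-D grad/hess for multiclass, not the actual code in `lightgbm/sklearn.py`.

```python
import numpy as np


def apply_sample_weights(objective):
    """Wrap a custom objective so its grad/hess are scaled by sample weights.

    Illustrative only; not the actual wrapper used by lightgbm.sklearn.
    """
    def wrapped(preds, dataset):
        grad, hess = objective(preds, dataset)
        weight = dataset.get_weight()
        if weight is not None:
            if grad.ndim == 2:  # multiclass: broadcast weights across the class axis
                weight = np.asarray(weight).reshape(-1, 1)
            grad = grad * weight
            hess = hess * weight
        return grad, hess
    return wrapped
```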
@guolinke Hey! Do you remember the reason for doing this?
I think with the interfaces in `basic.py`, the weighting is ultimately done on the C++ side. I'll double-check why weighting is done directly here in the sklearn interface.
@shiyu1994 You've merged this PR without resolving this conversation. Could you please share your findings about weighting derivatives here?
Sorry, I did not notice that what we discussed above is the customized objective; I thought we were discussing LightGBM's native objectives. I just noticed that weights are not handled correctly with a customized objective function in the Python API; see the code below. If we comment out the weights in the training dataset construction, the code produces exactly the same output as below. We need a separate PR to fix this.
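The original reproduction snippet and its log are not included in this excerpt; the sketch below is a hedged reconstruction of the kind of check described, comparing a run with sample weights against a run without them under a custom multiclass objective. It assumes the post-PR 2-D `(n_samples, n_classes)` layout and uses the callable-in-params spelling (the `fobj=` argument in the 3.x API).

```python
import numpy as np
import lightgbm as lgb
from sklearn.datasets import load_iris


def custom_multiclass_obj(preds, dataset):
    # Softmax cross-entropy grad/hess; assumes preds has shape (n_samples, n_classes).
    labels = dataset.get_label().astype(int)
    shifted = preds - preds.max(axis=1, keepdims=True)
    prob = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    grad = prob.copy()
    grad[np.arange(labels.size), labels] -= 1.0
    hess = 2.0 * prob * (1.0 - prob)  # common diagonal approximation
    return grad, hess


X, y = load_iris(return_X_y=True)
weights = np.random.default_rng(0).uniform(0.1, 10.0, size=y.size)
params = {'objective': custom_multiclass_obj, 'num_class': 3, 'verbose': -1}

with_w = lgb.train(params, lgb.Dataset(X, label=y, weight=weights), num_boost_round=5)
without_w = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=5)

# If the weights were silently ignored for the custom objective, the two
# models would produce identical raw scores.
print(np.allclose(with_w.predict(X, raw_score=True),
                  without_w.predict(X, raw_score=True)))
```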
BTW, I found an additional issue: the latest master branch does not produce any evaluation results in the log above; I got that log with version 3.3.2 instead. This is another issue we need to investigate.
Yes, that's what I noticed when I saw that in the scikit-learn interface grad and hess are weighted before boosting. I don't know whether it's because in `basic` you get a `Dataset` and have access to the weights, so you can weight them yourself in the objective function, while in sklearn you can't; if that's the case, it's worth mentioning in the docs. Regarding the missing log: I believe this is because callbacks are now preferred (#4878); to log the evaluation you have to specify `callbacks=[lgb.log_evaluation(1)]`.
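A minimal usage sketch of that callback (the dataset and parameters here are arbitrary placeholders):

```python
import lightgbm as lgb
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
train_data = lgb.Dataset(X, label=y)
params = {'objective': 'multiclass', 'num_class': 3, 'verbose': -1}

# Per-iteration evaluation logging is opt-in via the callback since #4878.
lgb.train(
    params,
    train_data,
    num_boost_round=5,
    valid_sets=[train_data],
    callbacks=[lgb.log_evaluation(period=1)],
)
```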