
Consistent release version-correlated decrease of LGBMRanker performance #4349

Closed
GerardBCN opened this issue Jun 7, 2021 · 6 comments

@GerardBCN

Description

First of all, thank you for your work on this amazing library; it has been extremely useful in our research. I'm opening this issue to ask for your opinion on something we have observed in my research group that may be relevant to other people here using the LGBM ranker.

It all started when a colleague of mine couldn't reproduce my results, and we noticed that we were using different versions of the lightgbm library. We ran a small experiment to compare performance (measured with a custom metric function) across release versions, and we found that performance decreases quite consistently from older to newer releases. See the plot below.

[Plot: custom-metric performance as a function of LightGBM release version]

We tried to pinpoint the possible breaking changes by tracking the commit history of the lambdarank test function, located at tests/python_package_test/test_sklearn.py. In particular, we observed several changes to the lambdarank loss function and to default parameters. Unfortunately, we are not well versed in the inner workings of lambdarank, so we can't fully grasp the relevance of those changes. What surprised us was that the plaintext performance values used in the equality tests also decrease quite consistently from old to new releases. Please see the plot below, which shows the performance written as plaintext as a function of time (commit date).

[Plot: plaintext test performance as a function of commit date]

Is there any guidance the developers could offer on choosing one version over another? Is there a reason the changes to the loss function were applied? Are there any bugs we should be aware of?

Reproducible example

The performances for the latter plot were extracted from the following commits (from old to new):

# https://github.com/microsoft/LightGBM/commit/496a07d1dbd5c3a8cf28d50f5aad84428fddf2f4#diff-711a5439fdebb728fb5859f49561c5cd1388e25276dd03c409dc63c46f2f88d2
# https://github.com/microsoft/LightGBM/commit/aee92f63ba124e1f6a3168eb2864d032567cbf9e#diff-711a5439fdebb728fb5859f49561c5cd1388e25276dd03c409dc63c46f2f88d2
# https://github.com/microsoft/LightGBM/commit/0dfda82607633132e10a693eba9666ed75585ac8#diff-711a5439fdebb728fb5859f49561c5cd1388e25276dd03c409dc63c46f2f88d2
# https://github.com/microsoft/LightGBM/commit/509c2e50c25eded99fc0997afe25ebee1b33285d#diff-98ca62132fa18e4a80cd57f16e9337fe3d72a08d5862d02eae9935bed9e43486
# https://github.com/microsoft/LightGBM/commit/ba0a1f8d38d12aeb29f1c769596308eb8b1e5874
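The plaintext values were collected by reading the test file at each of these commits. A hypothetical sketch of doing that mechanically is below, assuming a local clone of LightGBM; the regex and the file path are assumptions (older commits may use a different test layout):

```python
import re
import subprocess

# abbreviated hashes of the commits listed above
COMMITS = ["496a07d", "aee92f6", "0dfda82", "509c2e5", "ba0a1f8"]

def extract_thresholds(source):
    """Pull hard-coded comparison values like `... > 0.578` out of test code."""
    return [float(m) for m in re.findall(r">\s*(0\.\d+)", source)]

def thresholds_at(commit, path="tests/python_package_test/test_sklearn.py"):
    """Read the test file as it existed at `commit` (requires a local clone)."""
    src = subprocess.run(["git", "show", f"{commit}:{path}"],
                         capture_output=True, text=True, check=True).stdout
    return extract_thresholds(src)
```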
@GerardBCN GerardBCN changed the title Consistent release version-correlated decrease of LGBMRanker custom metric performance Consistent release version-correlated decrease of LGBMRanker performance Jun 7, 2021
@jameslamb
Collaborator

Thanks very much for using LightGBM and for the thorough write-up!

Are you able to provide a reproducible example? Without an example to try (and to rule out some theories), I think it will be very difficult to find an answer to this question.

@jameslamb
Collaborator

> to see that the plaintext performances used in the equality tests also decrease quite consistently from old to new releases

After re-reading, I understand what you mean by this. You're saying that for the lambdarank tests in LightGBM's test suite, you can see the hard-coded performance expectations being reduced.

e.g., from 509c2e5#diff-98ca62132fa18e4a80cd57f16e9337fe3d72a08d5862d02eae9935bed9e43486

[Screenshot of the diff, showing a hard-coded performance expectation being lowered in test_sklearn.py]
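For readers without the screenshot: the pattern in question is an assertion of roughly the shape below, where the fixed threshold is what gets lowered from release to release (the value here is invented, not taken from any actual commit):

```python
def clears_expectation(best_score, threshold=0.578):
    """Check whether the recorded validation ndcg@3 clears a hard-coded
    threshold. `best_score` mimics the sklearn wrapper's `best_score_`
    mapping; the threshold value is illustrative only."""
    return best_score["valid_0"]["ndcg@3"] > threshold
```

A commit that "reduces expectations" would lower `threshold` while leaving the training code unchanged.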

I think maybe @shiyu1994 or @btrotta will be able to give you the best guidance on this question.

@GerardBCN
Author

Yes, that's right. Thank you for your prompt answer!

@StrikerRUS
Collaborator

StrikerRUS commented Jun 9, 2021

@GerardBCN Thanks a lot for sharing your observations! All commits were made to fix bugs or to increase ranking performance on real data. I promise that no commit was merged with the aim of intentionally decreasing ranking performance 😃. You can see that some important changes were shown to increase the score on some "standard" benchmark ranking datasets: #2322 (comment), #2331 (comment), #3425 (comment).
Regarding the intentional score decreases in the test_sklearn.py file: this was done because improvements to the ranking algorithm in general, or particular bug fixes, do not always act positively on the one particular dataset used in our tests. I guess this is also applicable to your situation.

Thanks for providing the list of commits you've found related to ranking! Thanks to GitHub, we can easily check the corresponding pull requests and find out what the aim of those PRs was.

@no-response

no-response bot commented Jul 12, 2021

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!

@no-response no-response bot closed this as completed Jul 12, 2021
@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023