[dask] Parallel tree learner with Dask cannot overfit a small dataset #4471
Comments
Hi. I'm not able to reproduce this with the latest version in master. I believe it could be related to #4026 where if one split produced an empty child in one of the workers the predictions would become very large (which could be the cause of the high MSE you observe). That issue was fixed but hasn't been released yet. Can you try running this with the version in master? I ran it and got:
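One quick way to confirm which build is actually being imported (useful when comparing a released wheel against a build from master) is to check the reported version. This is only a small sketch; the exact version strings shown in the comments are assumptions based on LightGBM's usual versioning convention, not output from this thread.

```python
import lightgbm

# Released wheels report a plain version such as "3.2.1"; development builds
# from master typically carry a ".99" suffix (e.g. "3.2.1.99") -- an assumption
# based on LightGBM's versioning convention, not confirmed in this thread.
print(lightgbm.__version__)
```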
Hi José, thank you for your reply! It turned out to be an installation issue: after re-installing LightGBM 3.2.1 via pip/conda and re-running, I get results like the ones you provided. Thanks for the help!
thanks for the help @jmoralez
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Hi, thanks for providing such a fantastic gradient boosting library! I was doing a sanity check of the parallel tree learner with Dask, asking the model to overfit a small dataset. However, it seems the model fails to do so whenever there are at least two workers. The following code is modified from here. With `n_workers=1` the model successfully overfits the training data with a very small MSE (0.05 on my machine), but with `n_workers=2` it fails to do so, resulting in an MSE in the thousands (2890 on my machine). Could someone explain what training procedure is actually used in this case? I am using version 3.2.1 of the library. Thanks!
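The original reproduction script was not preserved in this thread. Below is a minimal sketch of the kind of check described above, assuming a `LocalCluster` with a configurable `n_workers`, a small synthetic regression dataset, and `lightgbm.DaskLGBMRegressor`; the dataset size, chunking, and hyperparameters are illustrative assumptions, not the exact values from the report.

```python
import dask.array as da
import lightgbm as lgb
from distributed import Client, LocalCluster
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error

if __name__ == "__main__":
    # A small dataset that a boosted model should be able to memorize.
    X, y = make_regression(n_samples=1000, n_features=50, random_state=42)

    # Switch n_workers between 1 and 2 to compare the training MSE.
    with LocalCluster(n_workers=2, threads_per_worker=1) as cluster, Client(cluster) as client:
        # Two chunks, so with two workers the data is spread across both.
        dX = da.from_array(X, chunks=(500, 50))
        dy = da.from_array(y, chunks=(500,))

        model = lgb.DaskLGBMRegressor(n_estimators=500, num_leaves=63)
        model.fit(dX, dy)

        preds = model.predict(dX).compute()
        print("training MSE:", mean_squared_error(y, preds))
```

With a single worker all chunks land on one machine and training effectively follows the single-process path; with two workers the chunks are split across workers and LightGBM's distributed tree learner is used, which is where the bug referenced in #4026 could show up on 3.2.1.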