[dask] Parallel tree learner with Dask cannot overfit a small dataset #4471

Closed
xingyuansun opened this issue Jul 13, 2021 · 4 comments
@xingyuansun

Hi, thanks for providing such a fantastic gradient boosting library! I was doing a sanity check of the parallel tree learner with Dask, asking the model to overfit a small dataset. However, the model seems to fail to do so whenever there are at least two workers. The following code is modified from here. With n_workers=1, the model successfully overfits the training data with a very small MSE (0.05 on my machine), but with n_workers=2 it fails, producing an MSE of a few thousand (2890 on my machine). Could someone explain what actually happens during distributed training in the library? I am using LightGBM version 3.2.1. Thanks!

import dask.array as da
from distributed import Client, LocalCluster
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
import lightgbm as lgb

if __name__ == "__main__":
    for n_workers in [1, 2]:
        # a small, fixed dataset that the model should easily overfit
        X, y = make_regression(n_samples=1000, n_features=50, random_state=0)
        cluster = LocalCluster(n_workers=n_workers)
        client = Client(cluster)
        # split the data into 10 chunks so it gets distributed across workers
        dX = da.from_array(X, chunks=(100, 50))
        dy = da.from_array(y, chunks=(100,))
        dask_model = lgb.DaskLGBMRegressor(n_estimators=1000, random_state=0)
        dask_model.fit(dX, dy)
        assert dask_model.fitted_
        # predict on the training data itself and compute the training MSE
        preds = dask_model.predict(dX)
        preds_local = preds.compute()
        actuals_local = dy.compute()
        mse = mean_squared_error(actuals_local, preds_local)
        print(f"MSE: {mse}")
        # shut down this cluster before starting the next iteration
        client.close()
        cluster.close()
@jmoralez
Collaborator

Hi. I'm not able to reproduce this with the latest version on master. I believe it could be related to #4026, where a split that produced an empty child on one of the workers would make the predictions very large (which could be the cause of the high MSE you observe). That issue was fixed, but the fix hasn't been released yet.

Can you try running this with the version in master? I ran it and got:

MSE: 0.05205989998796295
MSE: 0.049111817689211024
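
(As a side note, to distinguish the runaway-prediction symptom from #4026 from ordinary underfitting, a quick magnitude check along these lines should work; preds_local and actuals_local refer to the arrays computed in the reproduction script above.)

import numpy as np

# With the empty-child bug, predictions blow up far beyond the range of
# the targets, so a magnitude comparison makes the symptom obvious
# before even looking at the MSE.
print(f"max |target|:     {np.abs(actuals_local).max():.2f}")
print(f"max |prediction|: {np.abs(preds_local).max():.2f}")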

@xingyuansun
Author

Hi José, thank you for your reply! It turned out to be an installation issue: after re-installing LightGBM 3.2.1 with pip/conda, I get results like the ones you posted. Thanks for the help!
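
(For anyone who hits something similar: a quick way to confirm which LightGBM build your environment actually resolves to, an easy thing to get wrong when mixing pip and conda installs, is to print the version and import path.)

import lightgbm as lgb

# Print the version string and the on-disk location of the imported
# package; a stale or shadowed install shows up immediately here.
print(lgb.__version__)
print(lgb.__file__)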

@jameslamb
Collaborator

thanks for the help @jmoralez

@jameslamb jameslamb changed the title Parallel tree learner with Dask cannot overfit a small dataset [dask] Parallel tree learner with Dask cannot overfit a small dataset Jul 15, 2021
@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023