Summary
As of #4378, DaskLGBMClassifier.predict(X, pred_contrib=True) returns a list of Dask Arrays if the model is a multiclass classification model and X is a scipy sparse matrix.
However, those Dask Arrays only have a single chunk. That code should be updated to preserve the original chunking from X.
Motivation
Preserving the chunking would improve the parallelism of any postprocessing of the prediction results using other Dask Array operations, which would reduce the risk of out-of-memory issues.
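To illustrate why chunking matters here, the following sketch (with a hypothetical stand-in for a feature matrix; no LightGBM involved) shows that a blockwise operation which preserves row chunking lets downstream Dask Array postprocessing run in parallel over the same chunks, rather than on one large block:

```python
import dask.array as da
import numpy as np

# Hypothetical feature matrix: 100 rows split into four 25-row chunks.
X = da.from_array(np.ones((100, 10)), chunks=(25, 10))
print(X.chunks)  # ((25, 25, 25, 25), (10,))

# A blockwise "prediction" that keeps the row chunking: any later
# Dask Array postprocessing parallelizes over the same four chunks
# instead of materializing a single large block in memory.
preds = X.map_blocks(np.sum, axis=1, keepdims=True, chunks=(25, 1))
print(preds.chunks)  # ((25, 25, 25, 25), (1,))
```

A single-chunk result, by contrast, forces every downstream operation to start from one task holding the entire output.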
Description
See #4378 (comment) for a proposed solution, using dask.array.core.concatenate_lookup().
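As a rough sketch of why that lookup is relevant: concatenate_lookup is a type-dispatched registry that Dask uses to find the right concatenate implementation for a given chunk type, and Dask ships a registration for scipy sparse matrices (an assumption about current Dask internals), which is what allows sparse chunks to be stitched back together:

```python
from dask.array.core import concatenate_lookup
import scipy.sparse as ss

# Look up the concatenate implementation registered for scipy sparse
# matrices (dispatch walks the MRO, so csr_matrix resolves to the
# spmatrix registration).
concat = concatenate_lookup.dispatch(ss.csr_matrix)

# Two sparse "chunks" stacked vertically, staying sparse throughout.
parts = [ss.eye(2, format="csr"), ss.eye(2, format="csr")]
stacked = concat(parts, axis=0)
print(stacked.shape)  # (4, 2)
```

Reassembling per-chunk sparse outputs this way is what would let the predict path return results with the original chunking instead of one monolithic block.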
Per this project's process, I've added this to #2302, the issue where all feature requests are tracked. Anyone is welcome to contribute this feature. Please leave a comment here if you're interested in contributing and this issue can be re-opened.
References
Created from #4378 (comment) and #4378 (comment).
This issue is only relevant once #4378 is merged.
The different output format for the multiclass + pred_contrib + sparse X case is described in detail in #3881.