
[dask] Support custom objective functions #3934

Closed
jameslamb opened this issue Feb 10, 2021 · 3 comments · Fixed by #4920

@jameslamb (Collaborator)
Summary

The Dask estimators in lightgbm.dask should support the use of a custom objective function.

Motivation

This feature would bring Dask estimators closer to parity with the sklearn estimators.

Description

I haven't thought this through much yet; this is just a placeholder issue for discussion. If you're reading this and have ideas, please comment, and the issue can be re-opened.

References

See the following excerpt from the docstring of the sklearn estimators for an explanation of how this works there:

    A custom objective function can be provided for the ``objective`` parameter.
    In this case, it should have the signature
    ``objective(y_true, y_pred) -> grad, hess`` or
    ``objective(y_true, y_pred, group) -> grad, hess``:

        y_true : array-like of shape = [n_samples]
            The target values.
        y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
            The predicted values.
        group : array-like
            Group/query data.
            Only used in the learning-to-rank task.
            sum(group) = n_samples.
            For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``,
            that means that you have 6 groups, where the first 10 records are in the first group,
            records 11-30 are in the second group, records 31-70 are in the third group, etc.
        grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
            The value of the first order derivative (gradient) of the loss for each sample point.
        hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
            The value of the second order derivative (Hessian) of the loss for each sample point.

    For the binary task, y_pred is the raw margin.
    For the multi-class task, y_pred is grouped by class_id first, then by row_id:
    the prediction for the i-th row in the j-th class is at y_pred[j * num_data + i],
    and grad and hess should be laid out in the same way.
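To make that signature concrete, here is a minimal sketch of a custom objective in the form described above (illustrative only, not part of the original issue); it implements binary log-loss, treating y_pred as the raw margin:

```python
import numpy as np
import lightgbm as lgb


def logloss_objective(y_true, y_pred):
    """Binary log-loss objective with the documented signature.

    For the binary task, y_pred is the raw margin, so apply the
    sigmoid before computing the gradient and Hessian.
    """
    prob = 1.0 / (1.0 + np.exp(-y_pred))
    grad = prob - y_true        # first derivative of the loss w.r.t. the margin
    hess = prob * (1.0 - prob)  # second derivative of the loss w.r.t. the margin
    return grad, hess


# The sklearn estimators already accept a callable objective:
clf = lgb.LGBMClassifier(objective=logloss_objective)
```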

We can look at how xgboost.dask handles this for inspiration; a rough sketch of its approach is below.
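This sketch assumes xgboost's Dask interface forwards a custom objective via the ``obj`` argument of ``xgboost.dask.train``, with the native ``(preds, dtrain)`` callback signature evaluated on each worker's local data; details here are illustrative, not taken from the issue:

```python
import numpy as np
import dask.array as da
import xgboost as xgb
from distributed import Client


def squared_error_obj(preds, dtrain):
    """Least-squares objective: per-sample gradient and Hessian."""
    labels = dtrain.get_label()
    grad = preds - labels
    hess = np.ones_like(preds)
    return grad, hess


if __name__ == "__main__":
    client = Client()  # local cluster, for demonstration only
    X = da.random.random((1_000, 10), chunks=(100, 10))
    y = da.random.random((1_000,), chunks=(100,))
    dtrain = xgb.dask.DaskDMatrix(client, X, label=y)
    result = xgb.dask.train(
        client,
        {"tree_method": "hist"},
        dtrain,
        num_boost_round=10,
        obj=squared_error_obj,
    )
    booster = result["booster"]
```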

@jameslamb (Collaborator, Author)

Closing this in favor of putting it in #2302 with other feature requests. Anyone is welcome to pick up this feature! Please comment if interested and the issue can be re-opened.

@jameslamb (Collaborator, Author)

Re-opening this as I'm working on this one right now.

@jameslamb jameslamb reopened this Dec 29, 2021
@jameslamb jameslamb added the dask label Dec 29, 2021
@jameslamb jameslamb self-assigned this Dec 29, 2021
StrikerRUS added a commit that referenced this issue Jan 17, 2022
* add test for custom objective with regressor

* add test for custom binary classification objective with classifier

* isort

* got tests working for multiclass

* update docs

* train deeper model for classifier

* Apply suggestions from code review

Co-authored-by: José Morales <jmoralz92@gmail.com>

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update multiclass tests

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix multiclass probabilities

* linting

Co-authored-by: José Morales <jmoralz92@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
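With that change merged, using a custom objective with the Dask estimators mirrors the sklearn API. A minimal sketch on synthetic data (the objective name here is this example's own, not from the commit):

```python
import numpy as np
import dask.array as da
from distributed import Client
from lightgbm import DaskLGBMRegressor


def squared_error(y_true, y_pred):
    """Custom least-squares objective: per-sample gradient and Hessian."""
    grad = y_pred - y_true
    hess = np.ones_like(y_pred)
    return grad, hess


if __name__ == "__main__":
    client = Client()  # local cluster, for demonstration only
    X = da.random.random((1_000, 10), chunks=(100, 10))
    y = da.random.random((1_000,), chunks=(100,))
    reg = DaskLGBMRegressor(objective=squared_error, n_estimators=10)
    reg.fit(X, y)
    preds = reg.predict(X).compute()
```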
@github-actions (bot)
This issue has been automatically locked since there has not been any recent activity after it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues,
including a reference to this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 16, 2023