
Support for additional objective functions with random effects and boosting #7

Closed
dagley11 opened this issue Aug 18, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@dagley11

Hi, very excited about this package! Do you see extending support to additional loss functions for random effects + boosting as a potential near-term upgrade? For my use case, I need a custom objective function, which LightGBM currently supports via fobj. I am, of course, getting the following error: "GPBoostError: Gaussian process boosting can currently only be for 'objective = "regression"'". Curious whether adding support for either custom objectives or the other objective functions offered by LightGBM is in the cards.
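For context, this is roughly the kind of custom objective LightGBM's fobj mechanism expects: a function that returns the per-sample gradient and Hessian of the loss with respect to the raw score. The sketch below is my own illustration (not code from either library), shown for the Poisson negative log-likelihood with a log link, and with labels passed directly instead of via a Dataset object so it stays self-contained:

```python
import math

def poisson_fobj(preds, labels):
    # Per-sample Poisson NLL (up to a constant): L(f) = exp(f) - y * f,
    # where f is the raw score. A LightGBM-style fobj returns
    # (gradient, hessian) of L with respect to f for each sample.
    grad = [math.exp(f) - y for f, y in zip(preds, labels)]
    hess = [math.exp(f) for f in preds]
    return grad, hess
```

In real LightGBM usage the second argument would be the training Dataset and the labels would be read from it; the mathematical content is the same.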

Thanks for the great work here.

@fabsig
Owner

fabsig commented Aug 19, 2020

Thank you. Yes, this is a feature that is planned to be added in the near future (hopefully before the end of this year). We will start with well-known loss functions, such as those used in classification, but supporting custom loss functions would obviously also be a nice feature.

@fabsig
Owner

fabsig commented Jan 29, 2021

With the newly released version 0.3.0, the GPBoost library now also supports other objective functions. See, e.g., these examples on how to use them. Currently supported objective functions are: binary, poisson, and gamma.

Unfortunately, it is not easily possible to allow for custom objective functions. The situation is much more involved than in standard boosting without random effects or GP models, as the objective functions and their first three partial derivatives are used extensively in the C++ code (and not just once per boosting iteration). However, if you provide me with the objective function together with its first three partial derivatives with respect to the parameter that relates to the tree ensemble and random effects, I can try to implement it.

@fabsig fabsig closed this as completed Feb 8, 2021
@dagley11
Author

Thanks, Fabio! This is good news. Our loss functions are more complex, though. Let me quickly explain; I'd love your take on whether this would be implementable on your end. I am working with a team of engineers on this problem (some with C++ experience), and we could certainly pitch in to help! We are trying to estimate a coefficient on the right-hand side of a regression where this coefficient is the single unknown variable. For example, in the Poisson case our target variable is actually a (not Y):

E[Y] = exp(a*p + offset)

So, our goal is to map a high-dimensional feature set onto the coefficient a. To do this, we take the partial derivatives of the Poisson loss with respect to a. Note that we also need the offset vector and the p vector (as well as Y) in the partial derivative calculations. So if we were to implement this as a custom loss in LightGBM, we would need to include these variables as arguments to the gradient/hessian calculation function. These variables are static and would not change over the course of estimating a.
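One way to thread static vectors like p and offset into a LightGBM-style gradient/hessian function is to capture them in a closure, so the returned function keeps the usual preds-only shape. The sketch below is my own (names and structure are illustrative): for the per-sample Poisson NLL L(a) = exp(a*p + offset) - y*(a*p + offset), differentiating with respect to a gives grad = p*(exp(eta) - y) and hess = p^2 * exp(eta), with eta = a*p + offset:

```python
import math

def make_grad_hess(p, offset, y):
    """Build a gradient/hessian function for the per-sample loss
    L(a) = exp(a*p + offset) - y*(a*p + offset), i.e. the Poisson NLL
    of the model E[Y] = exp(a*p + offset), differentiated w.r.t. a.
    The static vectors p, offset, y are captured in the closure."""
    def grad_hess(a_vec):
        grad, hess = [], []
        for a, p_i, off_i, y_i in zip(a_vec, p, offset, y):
            eta = a * p_i + off_i
            grad.append(p_i * (math.exp(eta) - y_i))        # dL/da
            hess.append(p_i * p_i * math.exp(eta))          # d2L/da2
        return grad, hess
    return grad_hess
```

The third derivative, if needed as described above, would analogously be p^3 * exp(eta).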

We would love to have an implementation for both the Poisson and logistic use cases. We have the partial derivative calculations ready and can easily share them. Please let me know if this feels feasible or if you need additional info/context on our loss functions.

@fabsig fabsig added the enhancement New feature or request label Feb 20, 2021
@fabsig
Owner

fabsig commented Feb 20, 2021

I think I need more information to say whether this is possible or not. Do you have a document you could share with more details about your approach? Maybe it's easier if you contact me by email.

One thing is a requirement: the loss function (or the negative exponential of the loss function, depending on how you define it) needs to be integrable with respect to the multivariate Gaussian distribution of the random effects. Otherwise, we have a situation where quantities required by the GPBoost algorithm (and also by generalized linear mixed effects models) are not well defined.

@dagley11
Author

Sure, happy to follow up via email, and I can definitely provide a brief document with the technical details of our approach. Will you link me to your email address, or reply to me directly at alexander.s.dagley@gmail.com? Appreciate the help!
