add num_leaves dials object #256

Merged
merged 6 commits into from
Nov 3, 2022

Conversation

joeycouse
Contributor

This PR adds the num_leaves dials object. This is an engine-specific tuning parameter for 'lightgbm'. It's probably the main parameter for controlling the complexity of lightgbm models, since they grow leaf-wise instead of depth-wise (see the lightgbm docs).

joeycouse and others added 5 commits October 13, 2022 08:48
Merge commit 'd47dc47f42ad9c190d7f4a1ec85db0e385345ec0'

#Conflicts:
#	tests/testthat/test-params.R
(roxygen2 version bump does not affect the Rds)
@hfrick
Member

hfrick commented Nov 2, 2022

Thank you @joeycouse ! I've put in a PR for parsnip so that the tidymodels machinery can pick up the new dials parameter.

@simonpcouch do you have opinions on this? In particular: any comments on the default range chosen here? 🙌

@simonpcouch
Contributor

This PR looks great, thank you @joeycouse!

100 seems reasonable for the upper bound. 31 feels possibly high—31 is lightgbm’s default. I would think something like 5, maybe 10?

@jameslamb, do you have any thoughts here? For context, this PR adds objects that support tidymodels tuning the num_leaves parameter. If a user notes they'd like to tune num_leaves but doesn't supply the grid of values they'd like to evaluate over, tidymodels will select a set of values in some default range. That range can be adjusted, as well as the number of draws from that range and the sampling design used to take those draws.
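
To make the "default range, number of draws, sampling design" idea concrete, here is a rough, language-neutral sketch in Python. It is not the dials implementation. The helper names `regular_integer_grid` and `random_integer_draws` are hypothetical, and the range of 5 to 100 is just one of the candidates floated in this thread.

```python
import random


def regular_integer_grid(lower: int, upper: int, levels: int) -> list[int]:
    """Return up to `levels` evenly spaced integers in [lower, upper].

    Hypothetical helper for illustration; not part of dials or tidymodels.
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if levels == 1:
        return [lower]
    step = (upper - lower) / (levels - 1)
    values = [round(lower + i * step) for i in range(levels)]
    # Rounding can create repeats on narrow ranges; keep first occurrences in order.
    return list(dict.fromkeys(values))


def random_integer_draws(lower: int, upper: int, size: int, seed: int = 0) -> list[int]:
    """Draw `size` integers uniformly from [lower, upper], both ends inclusive."""
    rng = random.Random(seed)
    return [rng.randint(lower, upper) for _ in range(size)]


# A regular design with 6 levels over the range discussed in this thread.
print(regular_integer_grid(5, 100, 6))  # [5, 24, 43, 62, 81, 100]

# A random design with 4 draws over the same range.
print(random_integer_draws(5, 100, 4))
```

Changing `lower`/`upper` corresponds to adjusting the range, `levels`/`size` to the number of draws, and the choice between the two functions to the sampling design.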

A stray note, for bonsai: lightgbm allows passing this argument with a few different aliases. We ought to keep an eye out for users attempting to tune this parameter with one of these aliases, probably nudging them to just use the argument with this name so they get the tuning machinery "for free."

Related to tidymodels/bonsai#49.

@jameslamb

thanks for the @ and context @simonpcouch !

I agree that 31 is probably too high of a floor. Knowing nothing about the size and shape of data, I think the range num_leaves in [5, 100] is a pretty good starting point!

Please also keep in mind this related discussion from @dfsnow: tidymodels/bonsai#49. Some combinations of max_depth and num_leaves are impossible.
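
One way to read "impossible" here, offered as an assumption rather than a summary of the linked issue: a binary tree of depth d can hold at most 2^d leaves, so a num_leaves value above that bound cannot be reached. The sketch below filters a candidate grid on that basis. The helper names are hypothetical, and it relies on LightGBM treating max_depth <= 0 as "no limit".

```python
from itertools import product


def max_leaves_for_depth(max_depth: int) -> float:
    """Upper bound on the leaf count of a binary tree with the given depth.

    LightGBM treats max_depth <= 0 as "no limit", so no bound applies there.
    """
    return float("inf") if max_depth <= 0 else 2 ** max_depth


def feasible_pairs(max_depths, num_leaves_values):
    """Keep only (max_depth, num_leaves) pairs where the leaf count is reachable."""
    return [
        (depth, leaves)
        for depth, leaves in product(max_depths, num_leaves_values)
        if leaves <= max_leaves_for_depth(depth)
    ]


# Depth 3 allows at most 8 leaves and depth 6 at most 64; -1 means unlimited.
print(feasible_pairs([3, 6, -1], [5, 31, 100]))
# [(3, 5), (6, 5), (6, 31), (-1, 5), (-1, 31), (-1, 100)]
```

A tuning grid built independently for each parameter would include the dropped pairs, such as (3, 31) and (6, 100).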

> with a few different aliases

LightGBM supports so many different aliases for parameters as a way to offer compatibility with a wide range of frameworks in different languages. I understand that that can make it challenging to configure, and has been a source of a lot of bugs and maintenance effort over the years 😭

We have some internal mechanisms in the R and Python packages (and another in the core C/C++ code) for resolving aliases. If you'd like, in a separate issue thread, I'd be happy to provide some links to those and to have a discussion about the possibility of making some of those mechanisms part of the public API that you could just import.
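
For a sense of what alias resolution involves, here is a minimal sketch. It is not LightGBM's internal mechanism and not a public API. The function name `resolve_num_leaves` is hypothetical, the alias list is the one the LightGBM parameter docs give for num_leaves, and raising on conflicting names is a choice made for this sketch only.

```python
# Aliases that the LightGBM parameter docs list for num_leaves.
NUM_LEAVES_ALIASES = ("num_leaf", "max_leaves", "max_leaf", "max_leaf_nodes")


def resolve_num_leaves(params: dict) -> dict:
    """Return a copy of `params` with any num_leaves alias renamed to num_leaves.

    Hypothetical helper: raises if the parameter is supplied under more than
    one name, which is a design choice of this sketch.
    """
    resolved = dict(params)
    names = ("num_leaves",) + NUM_LEAVES_ALIASES
    present = [name for name in names if name in resolved]
    if len(present) > 1:
        raise ValueError(f"num_leaves supplied under several names: {present}")
    if present and present[0] != "num_leaves":
        # Move the value from the alias to the canonical name.
        resolved["num_leaves"] = resolved.pop(present[0])
    return resolved


print(resolve_num_leaves({"max_leaves": 40, "learning_rate": 0.1}))
# {'learning_rate': 0.1, 'num_leaves': 40}
```

A wrapper package could run a check like this before tuning, then either rename the argument or nudge the user toward the canonical name.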

@simonpcouch
Contributor

Thanks a lot for chiming in, @jameslamb!

Let's go with [5, 100], then.🏄‍♀️

Following up here, I'll revisit tidymodels/bonsai#49 and open an issue in bonsai related to the aliasing interface. Thanks for your willingness to discuss / make changes here! 🙂

@hfrick
Member

hfrick commented Nov 3, 2022

Excellent, thanks for the input @jameslamb and thanks for the PR @joeycouse !

@simonpcouch I'll merge this PR and leave the parsnip one open until we've discussed release schedule etc 👍

@github-actions

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Nov 18, 2022