[R-package] Promote number of threads to top-level argument in lightgbm() and change default to number of cores #4972
Thank you for the proposal! I'm ok with adding num_threads as a top-level argument to lightgbm::lightgbm(), and with using the built-in {parallel} package to set a default value for it.
Please see my first round of review comments.
In addition, please add unit tests checking this behavior. At least the following:
- "lightgbm() does not require parameter num_threads"
- "if num_threads in params and keyword argument to lightgbm() have different values, the value in params is used"
- "if num_threads is not found in params, lightgbm() uses the value passed to keyword argument num_threads"
Whenever you propose user-facing changes to this project, please add tests on those changes to improve maintainers' confidence and to ensure that future developments in the project don't break the proposed behavior inadvertently.
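The precedence rule behind the three requested tests can be sketched in a few lines. This is a minimal illustration in Python, not the R package's implementation: `resolve_num_threads` is a hypothetical helper, and the sketch ignores parameter aliases.

```python
from typing import Any, Dict, Optional


def resolve_num_threads(params: Optional[Dict[str, Any]] = None,
                        num_threads: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper: merge a keyword-level num_threads into params.

    A value already present in params always wins over the keyword argument.
    """
    resolved = dict(params or {})  # copy so the caller's dict is not mutated
    if "num_threads" not in resolved and num_threads is not None:
        resolved["num_threads"] = num_threads
    return resolved


# 1. num_threads is not required: nothing is added when it is absent everywhere.
assert "num_threads" not in resolve_num_threads({"objective": "regression"})

# 2. Conflicting values: the value in params is used.
assert resolve_num_threads({"num_threads": 1}, num_threads=10)["num_threads"] == 1

# 3. Not found in params: the keyword argument is used.
assert resolve_num_threads({}, num_threads=10)["num_threads"] == 10
```

Each assertion corresponds to one of the test names listed above, so the same three cases could be carried over to the package's own unit tests.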
Updated.
Updated, with a fallback to |
Thanks very much for the changes, and the idea to fall back to parallel::detectCores() if RhpcBLASctl::get_num_cores() is not available.
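The "optional package first, built-in fallback second" pattern described here looks roughly like the following in Python. This is only an analogy to the R code under review: `detect_num_cores` is a hypothetical name, psutil stands in for the optional dependency, and `os.cpu_count()` stands in for the built-in fallback.

```python
import os


def detect_num_cores() -> int:
    """Hypothetical helper: prefer an optional library, fall back to the stdlib."""
    try:
        import psutil  # optional third-party dependency; may not be installed
        physical = psutil.cpu_count(logical=False)  # can return None
        if physical:
            return physical
    except ImportError:
        pass
    # Built-in fallback: logical CPU count, or 1 if it cannot be determined.
    return os.cpu_count() or 1


print(detect_num_cores())
```

The design point is that the default never depends on the optional package being installed: every failure path ends in a value that is always available.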
- please remove all the added num_threads = 1 in docs / examples / tests not otherwise touched by this PR, per the request in [R-package] Promote number of threads to top-level argument in lightgbm() and change default to number of cores #4972 (comment)
- please preserve the alphabetical ordering of imports in DESCRIPTION
  - {RhpcBLASctl} should be between {processx} and {rmarkdown}
  - {parallel} should be between {methods} and {utils}
regarding this linter warning (build link)
Please suppress it with a |
Aren't capitalized letters supposed to come before small-case letters?
maybe in some types of sorting, but alphabetizing that list is mainly intended to benefit developers on this project visually inspecting that file, and I don't always remember if, for example, |
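The two orderings in question differ only in how case is treated. A small Python sketch, using package names mentioned in this thread:

```python
packages = ["rmarkdown", "RhpcBLASctl", "processx"]

# Code-point (ASCII) order: every uppercase letter sorts before any lowercase one.
ascii_order = sorted(packages)
assert ascii_order == ["RhpcBLASctl", "processx", "rmarkdown"]

# Case-insensitive order: what a reader scanning the list alphabetically expects.
case_insensitive = sorted(packages, key=str.lower)
assert case_insensitive == ["processx", "RhpcBLASctl", "rmarkdown"]
```

The case-insensitive result matches the placement requested in the review, with {RhpcBLASctl} between {processx} and {rmarkdown}.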
Updated.
Looks like |
Yes please. I think these are all the places:
- LightGBM/.ci/test_r_package_windows.ps1 Line 125 in 0688f47
- LightGBM/.github/workflows/r_package.yml Line 191 in eb686a7
- LightGBM/.ci/test_r_package_solaris.sh Line 10 in 5fa887b
- LightGBM/.ci/test_r_package.sh Line 114 in 0688f47
- Line 316 in d4cdbcf
I left some suggestions for how to get the R linter to pass, and a suggestion to please remove unrelated changes from this PR.
Updated.
Looks great to me! Thanks very much for your help with this, and with the corresponding changes to the Python package in #5105.
ref #4968
The Python interface for lightgbm allows passing the number of threads to use as a top-level parameter, while the R interface does not. The Python interface also defaults to using all available threads, while the R interface doesn't.
This PR promotes the number of threads to a top-level parameter in lightgbm(), changes the default to match Python's, and explains that a number of "zero" means using the configured OMP threads.
IMHO the number of threads is a very common thing to want to control (probably more so than other top-level parameters such as callbacks), and its default value should be explicit to the user.
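To make the meaning of "zero" concrete, here is a simplified Python model of how a thread count of 0 defers to the OpenMP configuration. It is a sketch under assumptions, not LightGBM's code: `effective_num_threads` is a hypothetical name, and a real OpenMP runtime considers more settings than the `OMP_NUM_THREADS` environment variable.

```python
import os
from typing import Mapping, Optional


def effective_num_threads(num_threads: int,
                          env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical model: a positive value is used as-is, zero defers to OpenMP."""
    env = os.environ if env is None else env
    if num_threads > 0:
        return num_threads
    # OMP_NUM_THREADS may hold a comma-separated list (one entry per nesting
    # level); this sketch only looks at the first entry.
    first = env.get("OMP_NUM_THREADS", "").split(",")[0].strip()
    if first.isdigit() and int(first) > 0:
        return int(first)
    # With nothing configured, assume the runtime uses all available processors.
    return os.cpu_count() or 1


assert effective_num_threads(4, env={}) == 4
assert effective_num_threads(0, env={"OMP_NUM_THREADS": "2"}) == 2
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.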