update feature_fraction_bynode #2381
Now the behavior is the same as xgb.
As usual, minor style comments from me 😄
@StrikerRUS can this be merged?
Yeah, I think so. I like the new way of setting
A great feature. As this is one of the core elements that make random forests shine, at least in high-dimensional settings, I was playing with the diamonds data set. The results are a bit uncomfortable:
What could be the reason for this sudden drop in the lgb version? (Of course, we cannot directly compare the results due to different parametrizations).
@mayer79 did you try different seeds?
Not yet. With this data set, an R-squared of 0.98 is quite bad, so a value of 0.91 is extreme.
@mayer79 could you provide the data file? CSV format would be better.
It is shipped along with ggplot2 in R. The raw source is https://github.com/tidyverse/ggplot2/blob/master/data-raw/diamonds.csv
@mayer79 BTW, I changed the sample rate of xgb to 0.33, and its result is only 0.86:
@mayer79
@guolinke: It does indeed seem to be a rounding issue. I was using floating point 1/3 as the rate, leading xgb to sample 6/3 = 2 features and lgb to sample only one. I would not force sampling of at least 2 features, but rather keep your implementation as it is.
Maybe rounding up the number of sampled columns would be an idea.
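To make the rounding discussion concrete, here is a minimal sketch of how the rounding rule changes the number of features sampled per node. It is not the actual xgb or lgb code: the helper `sampled_feature_count`, the three modes, and the rate `0.333333` (assumed here to be a 1/3 that lost precision somewhere along the way) are all hypothetical and chosen only to illustrate the effect with 6 features.

```python
import math


def sampled_feature_count(n_features: int, fraction: float, mode: str) -> int:
    """Hypothetical helper: how many features get sampled under a rounding rule."""
    raw = n_features * fraction
    if mode == "floor":
        count = math.floor(raw)        # truncate toward zero
    elif mode == "round":
        count = math.floor(raw + 0.5)  # round half up
    elif mode == "ceil":
        count = math.ceil(raw)         # always round up
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Always keep at least one feature so a split remains possible.
    return max(1, count)


n_features = 6
fraction = 0.333333  # assumption: 1/3 after losing precision

counts = {
    mode: sampled_feature_count(n_features, fraction, mode)
    for mode in ("floor", "round", "ceil")
}
print(counts)  # {'floor': 1, 'round': 2, 'ceil': 2}
```

With a rate just below 1/3, truncation yields one feature while rounding to nearest or rounding up yields two, which matches the one-versus-two gap described above.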
I just noticed that `feature_fraction` has an alias, `colsample_bytree`. Therefore, using it for the column sampling by node is not straightforward.
The following is the new definition of `feature_fraction_bynode`:
ping @BlindApe for the changes.