improving LightGBM, XGBoost experience with Dask #104
Comments
Thanks for opening this. I'm generally +1 on the core libraries (XGBoost, LightGBM) handling this if they're willing. There might be some building blocks that should go into |
Great to hear from you. I think both these efforts are seen as high profile and important from the community's point of view. Yes, this is probably the best place to start the conversation aside from the two repos (but maybe xref on dask-ml for a slightly different set of eyes). If you want a more general conversation around the strategy and Dask's involvement in the projects, you might want to attend our community meeting on the first Thursday of the month. |
Hello @jameslamb, great to hear that you are interested in helping us with the dask-lightgbm package. The idea of merging dask-lightgbm into the main LightGBM repo seems reasonable to me. I agree with @TomAugspurger that main building blocks could be moved to Second option that came to my mind is to merge dask-xgboost and dask-lightgbm into a single library (or move them to dask-ml), since a lot of the code can be reused, specify LightGBM and XGBoost as its optional dependencies, and use it as a middle-man between LightGBM, XGBoost and Dask. In general I like the idea. The only thing that I am not sure about is where to separate the libraries and what should be maintained where. What are your plans (in LightGBM) for supporting different frameworks for distributed computing? Do you have/plan to have support for Spark, Flink, etc.? If you already have the support in your code, the effort you proposed could follow a similar path. |
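For readers unfamiliar with the "optional dependencies" idea in the comment above, here is a minimal sketch of how a shared middle-man package could detect which backends are installed. It is only an illustration under stated assumptions: the `OPTIONAL_BACKENDS` mapping and both function names are hypothetical, and nothing here is taken from dask-lightgbm, dask-xgboost or dask-ml.

```python
from importlib import import_module
from importlib.util import find_spec

# Hypothetical mapping: backend name -> top-level module that provides it.
OPTIONAL_BACKENDS = {"lightgbm": "lightgbm", "xgboost": "xgboost"}


def available_backends(backends=OPTIONAL_BACKENDS):
    """Return the sorted names of the optional backends that can be imported."""
    return sorted(
        name for name, module in backends.items() if find_spec(module) is not None
    )


def load_backend(name, backends=OPTIONAL_BACKENDS):
    """Import an optional backend lazily, failing with a clear message."""
    if name not in backends:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {sorted(backends)}"
        )
    module = backends[name]
    if find_spec(module) is None:
        # The shared package itself installs fine; only this backend is missing.
        raise ImportError(
            f"backend {name!r} requires the optional dependency {module!r}"
        )
    return import_module(module)
```

With this shape, the shared package would import without either library installed, and a user would only hit an `ImportError` when asking for a backend that is missing.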
cc @RAMitchell @hcho3 @trivialfis from the xgboost side |
Thanks! adding @guolinke , @StrikerRUS from LightGBM
Today in LightGBM, the only option for distributed training that is maintained in the main project is our own ( We produce a library for JVM languages using SWIG. This is then used by the LightGBM doesn't currently directly support Dask...users who want to use LightGBM with Dask go to There are also
I think that keeping the work in the This seems like the conclusion that this thread reached: dask/dask-xgboost#39 I think doing this also reduces the difference between the experience with, say, One reason that I've opened this issue is to say that I'm willing to do the legwork to coordinate between XGBoost and LightGBM to borrow from each other and to find opportunities to push things from the modelling frameworks upstream into |
I've checked the xgboost code for dask integration and it seems reasonable to use the same approach. So, I am OK with merging the dask-lightgbm functionality directly into the LightGBM core library. |
|
Yeah, at dask/dask-xgboost#39.
Last I checked there were a few issues / missing features from
dmlc/xgboost, but the eventual goal is to deprecate dask-xgboost when those
are resolved.
…On Wed, Oct 28, 2020 at 1:38 PM Joshua Patterson ***@***.***> wrote:
Has anyone discussed deprecating dask-xgboost to avoid confusion?
|
Yep! They're listed in this comment: dask/dask-xgboost#39 (comment). I have some free cycles to work on this, and @hcho3 I'm willing to help with those issues in XGBoost if you need another set of hands 😀 |
@jameslamb Thanks for your offer. Currently the only blocking issue is dmlc/xgboost#5765. in the latest |
Thanks very much @SfinxCZ ! I've talked with other LightGBM maintainers in microsoft/LightGBM#2791 and they're open to the idea as well 🎉 Could you open a pull request into https://github.com/microsoft/LightGBM with the contents of We can talk in the PR about:
And actually, before we do that...are there other |
@jameslamb does that mean LightGBM will potentially have MNMG support :) |
@datametrician it could be a route to that, yes |
@jameslamb I've created a new PR microsoft/LightGBM#3515 where I've put the core functionality + unit tests. I would suggest moving the discussion regarding the migration of the code to the PR. Regarding other contributors, I am not sure if others are interested in the discussion (maybe @striajan ?), but I would suggest starting the moving process, and if others choose to remain silent I would assume that they agree :-). |
+1 for lazy consensus. In general I think that it's fine to proceed.
|
Thanks for including me in the discussion. To put my two cents in, integration of this repository into the main LightGBM repository sounds great, even though I do not plan to work on it in the future because I am moving to a different company. Maybe @shcherbin would be interested in this discussion? |
The last blocking issue (dmlc/xgboost#5765) has been addressed, so we can proceed with migration from dask-xgboost to xgboost.dask. We are planning to release a Release Candidate for the next release (XGBoost 1.3.0) by end of this month. |
Wanted to update those following this issue. Thanks to a lot of effort from @SfinxCZ , the main parts of I think there will be substantial changes to it between now and the 3.2.0 release of |
I've opened up a Request For Comment in LightGBM, about how LightGBM should get the Dask client you want to use. If anyone here has time / interest, we'd appreciate comments at microsoft/LightGBM#3808. Thanks to @jsignell for already adding her opinion. |
This is the first release of You can see the full details of this release at https://github.com/microsoft/LightGBM/releases/tag/v3.2.0. Thanks to everyone on this issue for your help. I especially want to shout out @ffineis, @jmoralez, and @StrikerRUS for putting a lot of time and energy into this first |
Thanks for the update James! 😀 Nice work everyone! 👏 Should we go ahead and close this issue out then? 😉 |
Woot!
|
@jakirkham I think it can be closed, yeah! There is still some work to be done but all the relevant conversations are already happening in the specific projects that might be affected, so I don't think we need this issue any more.
I'm going to close this out, thanks again to everyone for your help! |
Yeah it sounds like with XGBoost that |
👋 hello from Chicago!
I'm a LightGBM maintainer and an engineer at Saturn Cloud. I'd like to devote some cycles to improving the experience of using XGBoost and LightGBM on Dask.
I'm planning to talk with the XGBoost maintainers about helping out on the issues noted in dask/dask-xgboost#39 , and starting a similar discussion proposing that we migrate https://github.com/dask/dask-lightgbm into LightGBM directly and maintain it there.
Success for this would look like formally deprecating dask-xgboost and dask-lightgbm eventually and focusing all development in XGBoost and LightGBM (similar to how xgboost-spark and xgboost-flink are maintained in the main xgboost project).
What do folks here think about this? Are there other groups besides the maintainers of projects I've mentioned above that should be involved in this conversation?
Thanks for your time and consideration.