[GPU] multi-gpu #620
Sorry for my late reply; I was traveling recently.

The multi-GPU case can be viewed as a special case of distributed feature-parallel training, and much of the existing code can (hopefully) be reused. Right now we can theoretically run multiple instances of LightGBM on one machine, each assigned a different GPU and a subset of features, and use the feature-parallel distributed tree learner. But this makes multiple copies of the data and is inefficient. One way to enable multi-GPU is to simply allow LightGBM to launch multiple instances of the parallel learner that share the input data. This is not perfect, but I think it is the fastest path and needs only minimal changes to the GPU code. @guolinke Do you think this is possible?

I am currently working on another issue. Currently, we build feature histograms on the GPU and transfer them to the CPU to find the best split. The overhead of this data transfer is significant on datasets with many features and limits the achievable speedup. If we find the best split on the GPU instead, it can be done in very high-bandwidth GPU local memory, eliminating most of the data transfer overhead. To do this, we also need to keep the histograms on the GPU, because a future split may need them to construct the feature histogram for the larger child (the subtraction trick; see the sketch below). This requires implementing a histogram pool on the GPU (similar to what we have on the CPU), moving histograms to CPU memory only when GPU memory is insufficient. After this is implemented, I expect a significant speedup on the Bosch, YahooLTR, and epsilon datasets, especially when the number of bins used is larger (255).
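For readers unfamiliar with the subtraction trick mentioned above, here is a minimal NumPy sketch (illustrative only; the variable names and data are hypothetical, not LightGBM's actual implementation). A node's histogram accumulates per-bin gradient sums; only the smaller child's histogram is built from data, and the larger child's is recovered by subtracting it from the parent's, which is why parent histograms must stay resident in a histogram pool:

```python
import numpy as np

# Illustrative sketch of the histogram subtraction trick (not LightGBM's code).
n_bins = 255
rng = np.random.default_rng(0)

binned_feature = rng.integers(0, n_bins, size=10_000)  # bin index of one feature, per row
gradients = rng.normal(size=10_000)                    # per-row gradient values
in_small_child = rng.random(10_000) < 0.3              # rows routed to the smaller child

# Parent histogram: per-bin gradient sums over all rows of the node.
parent_hist = np.bincount(binned_feature, weights=gradients, minlength=n_bins)

# Only the smaller child's histogram is built from data...
small_hist = np.bincount(binned_feature[in_small_child],
                         weights=gradients[in_small_child], minlength=n_bins)

# ...and the larger child's histogram is then recovered by subtraction, for free.
# Keeping parent_hist resident (e.g. in a GPU histogram pool) is what makes this work.
large_hist = parent_hist - small_hist
```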
@huanzhang12 I am running LightGBM in R on a machine with 4 GPUs. I tried to specify gpu_device_id with 0, 1, 2, and 3, but it always runs on the default device 0. Is gpu_device_id not supported by the R wrapper, or am I doing something wrong? I need your help, thanks! By the way, the improvement you are working on could be super useful: the dataset I am using has ~4K features, and training is actually slower than using the CPU (16 cores) alone. I reckon the overhead of transferring data to the CPU can become a bottleneck for multi-GPU setups that race for CPU resources.
@zhukunism It currently runs only on a single GPU, as explained in this issue.
@Laurae2 I am trying to launch multiple sessions and have each LightGBM instance run on a different GPU device, but with no luck. My machine has 4 GPUs. So my question is: can it run on one specific GPU by configuring gpu_device_id?
@Laurae2 I fixed the issue after checking that doc. I need to specify both gpu_platform_id and gpu_device_id to use GPU devices other than the default one. Thanks!
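For anyone hitting the same problem, here is a minimal sketch of the fix using the Python API (the same gpu_platform_id/gpu_device_id parameters apply in the R wrapper; the indices below are placeholders for your own system):

```python
import lightgbm as lgb
import numpy as np

# Toy data, just to make the example self-contained.
X = np.random.rand(1_000, 10)
y = np.random.rand(1_000)

params = {
    "objective": "regression",
    "device": "gpu",        # OpenCL-based GPU tree learner
    "gpu_platform_id": 0,   # OpenCL platform index (placeholder; inspect yours with e.g. clinfo)
    "gpu_device_id": 1,     # device index within that platform (placeholder)
}

booster = lgb.train(params, lgb.Dataset(X, y), num_boost_round=10)
```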
So @huanzhang12, does LightGBM training make use of all GPUs available on the machine, like XGBoost does? Thanks!
Closed in favor of #2302. We decided to keep all feature requests in one place. You are welcome to contribute this feature! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing this feature.
For everyone subscribed to this issue: please try our new experimental CUDA version, which was kindly contributed by our friends from IBM. This version supports multi-GPU training. We would really appreciate any early feedback on this experimental feature (please create new issues; do not comment here). How to install: https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html#build-cuda-version-experimental. Argument to specify the number of GPUs: https://lightgbm.readthedocs.io/en/latest/Parameters.html#num_gpu.
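Based on the linked docs, a minimal sketch of trying the experimental CUDA version from Python might look like the following (assuming LightGBM was built with CUDA support as described in the install guide; the num_gpu parameter is taken from the linked Parameters page):

```python
import lightgbm as lgb
import numpy as np

# Toy data, just to make the example self-contained.
X = np.random.rand(100_000, 50)
y = np.random.rand(100_000)

params = {
    "objective": "regression",
    "device_type": "cuda",  # requires a build with the experimental CUDA learner
    "num_gpu": 2,           # number of GPUs to use (see the linked Parameters page)
}

booster = lgb.train(params, lgb.Dataset(X, y), num_boost_round=10)
```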
Hi @StrikerRUS, I find that in the latest release the multi-GPU support has been removed. Since commit 6b56a90 by @shiyu1994, a new CUDA learner has been introduced, which later completely replaced the old multi-GPU learner.

@huanzhang12 Do you have plans for multi-GPU support?