Documentation for GPU installation on Windows #409

Merged
merged 11 commits into microsoft:master from Laurae2:patch-2 on Apr 17, 2017
Conversation

Laurae2
Contributor

@Laurae2 Laurae2 commented Apr 12, 2017

Working installation steps; applies to the Windows installation of:

  • Command Line Interface (CLI)
  • Python
  • R

Indirectly applies to any Unix-based installation for R (similar steps apply for installation, path, Makeconf, and Makevars modification).

Step-by-step installation with pictures.

In addition, this allows R installation, but it requires small changes to R's Makeconf. We are using the following variables, as they would not be found easily otherwise:

BOOST_INCLUDE_DIR = "C:/boost/boost-build/include"
BOOST_LIBRARY = "C:/boost/boost-build/lib"
OpenCL_INCLUDE_DIR = "C:/Program Files (x86)/AMD APP SDK/3.0/include"
OpenCL_LIBRARY = "C:/Program Files (x86)/AMD APP SDK/3.0/lib/x86_64"

These variables are identical to what cmake uses.

Users are free to change the paths, as they depend on each installation. The example uses the AMD OpenCL SDK.
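
For reference, a minimal sketch of a CLI build passing these same four variables to cmake (assuming a MinGW toolchain; the generator, flags, and paths are illustrative and depend on your setup):

cd LightGBM
mkdir build
cd build
cmake .. -G "MinGW Makefiles" -DUSE_GPU=1 ^
  -DBOOST_INCLUDE_DIR="C:/boost/boost-build/include" ^
  -DBOOST_LIBRARY="C:/boost/boost-build/lib" ^
  -DOpenCL_INCLUDE_DIR="C:/Program Files (x86)/AMD APP SDK/3.0/include" ^
  -DOpenCL_LIBRARY="C:/Program Files (x86)/AMD APP SDK/3.0/lib/x86_64"
mingw32-make -j4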

Check out the tutorial as a direct link on this GitHub blob: https://github.com/Laurae2/LightGBM/blob/patch-2/docs/GPU-Windows.md

Edit: see now https://github.com/Microsoft/LightGBM/blob/master/docs/GPU-Windows.md

Tested working on:

  • Intel GPU
  • AMD GPU
  • NVIDIA GPU

Untested with OpenCL on CPU, but it is already confirmed working on Unix-based OSes: #389 (comment)

Currently, the GPU version is known to work on:

OS       CLI      Python   R
Linux    Yes      Yes      Yes
Windows  Yes*     Yes*     Yes
Mac      Yes* **  Yes*     ?***

* requires a small change
** based on @wxchan's work (#389 (comment)) and CLI -> Python installs
*** untested/unknown status

ping @gugatr0n1c for method

ping @huanzhang12 for correctness.

Please test and review the steps before merging. We might not have the same Windows environments, even though I tested this today on 10 different machines to use the GPU version.

In addition, regarding the variables used (BOOST_INCLUDE_DIR, BOOST_LIBRARY, OpenCL_INCLUDE_DIR, and OpenCL_LIBRARY), please review whether we can find a better way of handling them.

@huanzhang12
Contributor

@Laurae2 I was able to follow the Windows GPU installation steps on Windows Server 2016 (from Azure) and successfully compile LightGBM, as well as install the Python interface. All tests pass as well, despite some small glitches.

The GPU does not achieve the same speedup as on Ubuntu (with the same GPU on Azure); I am not sure whether it is a GPU driver/compiler problem or a GPU configuration problem, but the good news is that it works!

Some small suggestions for the building procedure:

  • optionally disable antivirus software (like Windows Defender) temporarily; I found this can greatly increase Boost installation speed. With antivirus on, even unzipping the Boost package can take a long time.
  • optionally use a parallel build for Boost, by adding -j N to b2, where N is the number of cores (see the sketch below)
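
As a minimal sketch of the parallel Boost build, assuming Boost is installed to the prefix used above (the toolset and core count are illustrative):

b2 install --prefix="C:/boost/boost-build" toolset=gcc -j 4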

I have fixed the Boost.Compute bug on Windows in boostorg/compute#704. The problem is not the OpenMP parallel region, but an undefined macro. You can try to apply the patch and see if it fixes the Windows builds for you. If yes, we don't need to hack gpu_tree_learner.cpp on Windows.

@Laurae2
Contributor Author

Laurae2 commented Apr 13, 2017

@huanzhang12

All tests pass as well, despite some small glitches.

Do you know what the small glitches are? I also noticed it's slower on Windows, but I am not sure exactly why.

Some small suggestions for the building procedure

Added.

I have fixed the Boost.Compute bug on Windows in boostorg/compute#704. The problem is not the OpenMP parallel region, but an undefined macro. You can try to apply the patch and see if it fixes the Windows builds for you. If yes, we don't need to hack gpu_tree_learner.cpp on Windows.

With your proposed change, for CLI / Python:

  • the .cpp hack does not work on Windows (crash)
  • the .h hack works on Windows (no crash)
  • using no hack does not work on Windows (crash)

This does not apply to the R package, which works without any hack (because R's installation process works like on Unix variants).

@huanzhang12
Contributor

The small glitches are fixed in #411 and #412. Make sure you are testing the latest master.

I think the crash is related to the GCC/MinGW version; there can be multiple bugs there. I am using the latest MinGW with GCC 6.3.0 on Windows Server 2016 (which should be roughly equivalent to Windows 10). After fixing boostorg/compute#704, it will not crash.

Can you compile with debug information and get a backtrace on where the bug is?
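
For reference, a minimal sketch of such a debug build with CMake and MinGW (the flags and paths are illustrative and may need adjusting to your setup):

cd LightGBM\build
cmake .. -G "MinGW Makefiles" -DUSE_GPU=1 -DCMAKE_BUILD_TYPE=Debug
mingw32-make -j4
gdb --args ..\lightgbm.exe config=train.conf device=gpu
(gdb) run
(gdb) backtrace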

@Laurae2
Contributor Author

Laurae2 commented Apr 13, 2017

@huanzhang12 I am using commit 3a8b5e5 (6h ago) + MinGW 5.3 for CLI / Python, and MinGW 4.9 for R, with your boostorg/compute#704 fix.

It is possible that it is related to the MinGW version; currently I have this error with 5.3:

[screenshot of the error]

Here is the full log if you need it; I don't know whether it is useful for you:

C:\xgboost\LightGBM\examples\binary_classification>gdb --args "../../lightgbm.exe" config=train.conf data=binary.train valid=binary.test objective=binary device=gpu
GNU gdb (GDB) 7.10.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ../../lightgbm.exe...done.
(gdb) run
Starting program: C:\xgboost\LightGBM\lightgbm.exe "config=train.conf" "data=binary.train" "valid=binary.test" "objective=binary" "device=gpu"
[New Thread 105220.0x199b8]
[New Thread 105220.0x783c]
[Thread 105220.0x783c exited with code 0]
[LightGBM] [Info] Finished loading parameters
[New Thread 105220.0x19490]
[New Thread 105220.0x1a71c]
[New Thread 105220.0x19a24]
[New Thread 105220.0x4fb0]
[Thread 105220.0x4fb0 exited with code 0]
[LightGBM] [Info] Loading weights...
[New Thread 105220.0x19988]
[Thread 105220.0x19988 exited with code 0]
[New Thread 105220.0x1a8fc]
[Thread 105220.0x1a8fc exited with code 0]
[LightGBM] [Info] Loading weights...
[New Thread 105220.0x1a90c]
[Thread 105220.0x1a90c exited with code 0]
[LightGBM] [Info] Finished loading data in 1.011408 seconds
[LightGBM] [Info] Number of positive: 3716, number of negative: 3284
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6143
[LightGBM] [Info] Number of data: 7000, number of used features: 28
[New Thread 105220.0x1a62c]
[LightGBM] [Info] Using GPU Device: Oland, Vendor: Advanced Micro Devices, Inc.
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...

Program received signal SIGSEGV, Segmentation fault.
0x00007ffbb37c11f1 in strlen () from C:\Windows\system32\msvcrt.dll
(gdb) backtrace
#0  0x00007ffbb37c11f1 in strlen () from C:\Windows\system32\msvcrt.dll
#1  0x000000000048bbe5 in std::char_traits<char>::length (__s=0x0)
    at C:/PROGRA~1/MINGW-~1/X86_64~1.0-P/mingw64/x86_64-w64-mingw32/include/c++/bits/char_traits.h:267
#2  std::operator+<char, std::char_traits<char>, std::allocator<char> > (__rhs="\\", __lhs=0x0)
    at C:/PROGRA~1/MINGW-~1/X86_64~1.0-P/mingw64/x86_64-w64-mingw32/include/c++/bits/basic_string.tcc:1157
#3  boost::compute::detail::appdata_path[abi:cxx11]() () at C:/boost/boost-build/include/boost/compute/detail/path.hpp:38
#4  0x000000000048eec3 in boost::compute::detail::program_binary_path (hash="d27987d5bd61e2d28cd32b8d7a7916126354dc81", create=create@entry=false)
    at C:/boost/boost-build/include/boost/compute/detail/path.hpp:46
#5  0x00000000004913de in boost::compute::program::load_program_binary (hash="d27987d5bd61e2d28cd32b8d7a7916126354dc81", ctx=...)
    at C:/boost/boost-build/include/boost/compute/program.hpp:605
#6  0x0000000000490ece in boost::compute::program::build_with_source (
    source="\n#ifndef _HISTOGRAM_256_KERNEL_\n#define _HISTOGRAM_256_KERNEL_\n\n#pragma OPENCL EXTENSION cl_khr_local_int32_base_atomics : enable\n#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable\n\n//"..., context=...,
    options=" -D POWER_FEATURE_WORKGROUPS=5 -D USE_CONSTANT_BUF=0 -D USE_DP_FLOAT=0 -D CONST_HESSIAN=0 -cl-strict-aliasing -cl-mad-enable -cl-no-signed-zeros -cl-fast-relaxed-math") at C:/boost/boost-build/include/boost/compute/program.hpp:549
#7  0x0000000000454339 in LightGBM::GPUTreeLearner::BuildGPUKernels () at C:\xgboost\LightGBM\src\treelearner\gpu_tree_learner.cpp:583
#8  0x00000000636044f2 in libgomp-1!GOMP_parallel () from C:\Program Files\mingw-w64\x86_64-5.3.0-posix-seh-rt_v4-rev0\mingw64\bin\libgomp-1.dll
#9  0x0000000000455e7e in LightGBM::GPUTreeLearner::BuildGPUKernels (this=this@entry=0x3b9cac0)
    at C:\xgboost\LightGBM\src\treelearner\gpu_tree_learner.cpp:569
#10 0x0000000000457b49 in LightGBM::GPUTreeLearner::InitGPU (this=0x3b9cac0, platform_id=<optimized out>, device_id=<optimized out>)
    at C:\xgboost\LightGBM\src\treelearner\gpu_tree_learner.cpp:720
#11 0x0000000000410395 in LightGBM::GBDT::ResetTrainingData (this=0x1f26c90, config=<optimized out>, train_data=0x1f28180, objective_function=0x1f280e0,
    training_metrics=std::vector of length 2, capacity 2 = {...}) at C:\xgboost\LightGBM\src\boosting\gbdt.cpp:98
#12 0x0000000000402e93 in LightGBM::Application::InitTrain (this=this@entry=0x23f9d0) at C:\xgboost\LightGBM\src\application\application.cpp:213
---Type <return> to continue, or q <return> to quit---
#13 0x00000000004f0b55 in LightGBM::Application::Run (this=0x23f9d0) at C:/xgboost/LightGBM/include/LightGBM/application.h:84
#14 main (argc=6, argv=0x1f21e90) at C:\xgboost\LightGBM\src\main.cpp:7

Line 583 of gpu_tree_learner.cpp:

program = boost::compute::program::build_with_source(kernel_source_, ctx_, opts.str());
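// This call (frame #7 in the backtrace) enters boost::compute::program::build_with_source
// (frame #6), which crashes in Boost.Compute's appdata_path() at path.hpp:38 (frame #3).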

@huanzhang12
Contributor

huanzhang12 commented Apr 13, 2017 via email

@Laurae2
Contributor Author

Laurae2 commented Apr 13, 2017

@huanzhang12 I tested it on another Windows computer and it does not crash anymore.

On the computer where LightGBM was crashing, I wiped and reinstalled my main LightGBM directory, and it also works now (no more crashes). It always crashed before, even though I was deleting the build directory and lib_lightgbm.dll / lib_lightgbm.dll.a / lightgbm.exe (is cmake caching stuff elsewhere?).

So we currently no longer need the small hack, as long as we use the new compute module.

Can you update the submodule in LightGBM to boostorg/compute@6de7f64? I'll modify my tutorial afterwards to reflect the changes. I will also add a tutorial for debugging using the CLI and gdb in case we get users with crashes; this will help developers trace problems faster.

@huanzhang12
Contributor

@Laurae2 Sure, I will update the submodule. I didn't update it earlier because I wanted to see your testing results. Now it is safe to update.

@huanzhang12
Contributor

@Laurae2 Yes, it will be very useful to add a debugging tutorial to collect crash information from users. Thanks!

@Laurae2
Contributor Author

Laurae2 commented Apr 14, 2017

@huanzhang12 done.

If everything is OK, we can ask @guolinke whether it's OK to merge, and whether he is fine with the Makeconf modifications using four predefined variables (I don't have an easier solution so far, but it makes the installation invariant afterwards).

Users who have set up R "perfectly" can install the R package with GPU support at any time using devtools::install_github("Laurae2/LightGBM", subdir = "R-package"), as no flags can be passed directly to choose between compiling with or without GPU.

As with any R package, copying & pasting the library folder from R should work (keeping the GPU identical), even if you get rid of the installation, as long as you are not using architecture-dependent compilation flags (-march=native, for instance).
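
A sketch of such an export, assuming a hypothetical default library location (both paths are illustrative):

xcopy /E /I "C:\Program Files\R\R-3.3.3\library\lightgbm" "D:\export\lightgbm"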

@huanzhang12
Contributor

@Laurae2 I checked the steps you wrote and they look good! I don't know about R, but I can confirm that CLI and Python work by following the steps.

For GDB debugging, do we need to set CMAKE_BUILD_TYPE=debug? I am not sure about Windows, but on Linux this flag is required to get a full backtrace with exact line numbers (otherwise only function names are shown).

@Laurae2
Contributor Author

Laurae2 commented Apr 14, 2017

@huanzhang12

For GDB debugging, do we need to set CMAKE_BUILD_TYPE=debug? I am not sure about Windows, but on Linux this flag is required to get a full backtrace with exact line numbers (otherwise only function names are shown).

Correct, I forgot this step; I added it!

I don't know about R, but I can confirm that CLI and Python work by following the steps.

For R, it should be OK, as I had friends reproduce the installation steps without problems or supervision. However, I don't have the "self-contained package" set up for it. For Windows it is not useful, as it would be reliant on your computer (and you only need to copy & paste the library folder from R to export/install it).

@huanzhang12
Contributor

huanzhang12 commented Apr 15, 2017

@Laurae2 The installation steps look good now. I really like the detailed steps and screenshots, which can be really helpful for newbies.

@guolinke Maybe you can take a final review now, probably make some small revisions if needed and merge the PR.

@gugatr0n1c

@Laurae2 @huanzhang12 thanks for working on the Windows tutorial, I will try it with a GTX 1080 next week. I guess you used the latest cmake (3.8 x64), right? When I was compiling xgboost a couple of months ago with MinGW, it was critical to use the right version of MinGW (with OpenMP, POSIX, ...). Can you please share which version of MinGW you used, and from which source? (Until now I have used Visual Studio on Windows.)

@Laurae2
Contributor Author

Laurae2 commented Apr 15, 2017

@gugatr0n1c tested under the following conditions:

  • cmake: 3.5 to 3.8
  • MinGW: 4.9 to 6.3

During the MinGW installation, you only need to change the architecture to x86_64; all the other defaults are fine.

Source: in the GPU tutorial file linked in the first post: https://github.com/Laurae2/LightGBM/blob/patch-2/docs/GPU-Windows.md

Simple MinGW installation: http://iweb.dl.sourceforge.net/project/mingw-w64/Toolchains%20targetting%20Win32/Personal%20Builds/mingw-builds/installer/mingw-w64-install.exe

@gugatr0n1c

@Laurae2 great, thanks, so you are using MinGW-w64 (maybe this can go in the docs as well? there are several MinGW versions/forks: MinGW, MinGW-w64, Cygwin, neune, the TDM fork, ...).
Also, the homepage of MinGW-w64 lists the latest version as 5.0.3, but on SourceForge there is also 6.3, which is a little bit confusing. But thanks for the link to the Windows installer.

@Laurae2
Contributor Author

Laurae2 commented Apr 16, 2017

Also, the homepage of MinGW-w64 lists the latest version as 5.0.3, but on SourceForge there is also 6.3, which is a little bit confusing.

MinGW's latest version is 6.3; you should check here for new versions: https://sourceforge.net/projects/mingw-w64/files/?source=navbar

The "homepage(s)" you are describing are pages which were not updated manually for many years, they all mostly already moved to sourceforge if you want the new versions (5.x already works fine for 99% of cases anyway, even 4.9 is already enough for most uses).

there are several MinGW versions/forks: MinGW, MinGW-w64, Cygwin, neune, the TDM fork, ...

Avoid at all costs using MinGW forks such as TDM, because they patch std::thread differently (or patch different things), which can result in unexpected behaviors/results/crashes, unless you have a real need to use them (even worse in TDM is the mixing of folders, while in the original MinGW they are not mixed).

The same rules apply to xgboost compilation. If you have the default MinGW, it works flawlessly out of the box no matter what your Windows installation is, unless your CPU/Windows flavor is not supported. Using forked MinGWs is asking for trouble unless, as said earlier, you have a very specific reason to use them.

so you are using MinGW-w64

It is already in the doc.

maybe this can go in the docs as well?

It is already in the doc.

@chivee chivee merged commit f154da6 into microsoft:master Apr 17, 2017
@Laurae2 Laurae2 deleted the patch-2 branch April 17, 2017 12:05