-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build integrated Python package library #3144
Build integrated Python package library #3144
Conversation
CMakeOpenCLPythonPackageCache.txt
Outdated
set(USE_TIMETAG OFF CACHE BOOL "") | ||
set(USE_DEBUG OFF CACHE BOOL "") | ||
set(BUILD_FOR_R OFF CACHE BOOL "") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have other comments to offer yet, but can you please be sure that CMake
cache files are not checked into the repo?
Maybe add a *Cache.txt
entry to .gitignore
, I don't think that would conflict with anything we have intentionally checked in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the early feedback.
The file was meant as a cmake preload file, but since it's likely to cause further confusion, I've moved its contents into the main CMakeLists.txt file and removed the offending file.
I don't believe the failing test (doc build) is related to my patch. |
Just checking in to see if there are any comments on this PR. I don't believe the test failures are related to my changes (two are R-package issues and one seems like a transient download error). Thanks. |
@StrikerRUS can you help to review this PR? |
Sorry, I'm not sure that I'm able to provide a thoughtful review for this PR as I'm not familiar enough with OpenCl. I believe that @huanzhang12 should be the main reviewer for this PR. Of course, after his review I'll review the Python part. |
@huanzhang12 any chance you could review this? Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tpboudreau Thanks a lot for your amazing work! Please take a look at one more round of review when have time.
endif() | ||
list(APPEND INTEGRATED_OPENCL_INCLUDES ${OPENCL_ICD_LOADER_HEADERS_DIR}) | ||
list(APPEND INTEGRATED_OPENCL_LIBRARIES ${opencl-icd-loader_BINARY_DIR}/Release/OpenCL.lib cfgmgr32.lib runtimeobject.lib) | ||
list(APPEND INTEGRATED_OPENCL_DEFINITIONS CL_TARGET_OPENCL_VERSION=120) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please comment on this from compatibility side? Why 1.2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be the latest earliest stable version that contains all the required api's (and 2.x seems to have been a long experiment: https://www.extremetech.com/computing/309842-opencl-3-0-kicks-off-with-a-huge-step-backwards).
But the library compiles without specifying the version (with a few extra messages), so we can remove this if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification! I'm afraid I have too little knowledge about OpenCL to take a responsibility for this change. I'd better leave it for your decision or more experienced in OpenCL maintainer.
I just want to make it possible that as many as possible users will be able to simply download our wheel and run LightGBM with GPU support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, including the 1.2 guard when you compile the headers prevents you from inadvertently including a call to a 2.x function, which would limit the library to newer implementations. OpenCL versions are backward compatible, so a 2.x implementation can run a 1.2 application/library. (See item 3 here for example: http://developer.amd.com/wordpress/media/2013/12/AMD_APP_SDK_FAQ1.pdf)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent! Thanks again!
CMakeIntegratedOpenCL.cmake
Outdated
set(BOOST_BOOTSTRAP "${BOOST_BASE}/source/bootstrap.bat") | ||
set(BOOST_BUILD "${BOOST_BASE}/source/b2.exe") | ||
set(BOOST_FLAGS "") | ||
list(APPEND BOOST_SUBMODULES "libs/*" "tools/*") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can workaround 1.5Gb+ size problem by specifying only required submodules here.
IIRC the issue is that all boost libraries were being cloned even though only a handful were needed and I couldn't figure out a way to get around this.
#3144 (comment)
Starting from chrono, filesystem, headers, system, tools/*
by trials and errors I came up with the following list which allows to compile successfully, but I bet you know better which submodules are needed. Also, I believe we can reduce number of submodules under the tools
folder.
list(APPEND BOOST_SUBMODULES "libs/*" "tools/*") | |
list(APPEND BOOST_SUBMODULES "libs/algorithm" "libs/align" "libs/any" "libs/array" "libs/assert" "libs/bind" "libs/chrono" "libs/compute" "libs/concept_check" "libs/config" "libs/container" "libs/container_hash" "libs/core" "libs/detail" "libs/filesystem" "libs/foreach" "libs/format" "libs/function" "libs/function_types" "libs/fusion" "libs/headers" "libs/integer" "libs/io" "libs/iterator" "libs/lexical_cast" "libs/math" "libs/move" "libs/mpl" "libs/multi_index" "libs/numeric/conversion" "libs/optional" "libs/predef" "libs/preprocessor" "libs/property_tree" "libs/range" "libs/ratio" "libs/serialization" "libs/smart_ptr" "libs/static_assert" "libs/system" "libs/throw_exception" "libs/tuple" "libs/typeof" "libs/type_index" "libs/type_traits" "libs/utility" "libs/uuid" "libs/winapi" "tools/boost_install" "tools/build") |
I know, this list is looking super scary, but explicitly listing submodules it is possible to decrease build time and downloading size significantly (1.5Gb+ -> 700Mb
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I believe we can reduce number of submodules under the tools folder.
OK, I've experimented more and it looks like we do not need submodules from . I'm updating my yesterday original comment with new suggestion.tools/
at all
UPD: Was wrong. tools/boost_install
and tools/build
are needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@StrikerRUS , thank you for doing this exploration.
I'm not opposed to this change, and I've committed the suggested patch, but I think there are some considerations the maintainers should be aware of.
One of the challenges in this patch was coaxing three different build systems (Boost's B2, cmake, and MSVC/make) to cooperate. And as you know I was originally targeting more than just the Windows platform. I found that the Boost dependency list generated at build time differed significantly by platform -- on Linux (generally) only the few expected dependent submodules were built, on Windows (generally), many more submodules were fetched. This was a bit surprising, but I didn't spend a lot of time chasing down why, because it seemed to me that downloading all submodules was always safe, and that even if I figured it all out for each platform, it would lead to complicated platform checking code that might turn out to be fragile in the long run (meaning a risk of not working anymore if the Boost libs or any of the three build systems changed or started interacting differently).
Of course I realize that there may be costs associated with fetching and processing code unecessarily in the CI pipeline, and also that non-Windows builds may never be needed, so I'm OK with the suggested change. I just wanted to alert you all that you may have to rethink this targeted download approach if other platforms are added or the build becomes unstable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I understand! Please correct me: as long as we fix a particular commit in the boost
repository and building the library only for Windows users, it is safe to go with my list. In future we can update this list to support building on Linux machines or add conditioning if(WIN32) ... else() ...
or fall back to downloading all submodules for all OSes.
Despite that we are using only free CI services (BTW, it will be not true anymore after introducing CUDA support #3160), we build nightly artifacts for each commit in master
and in general I believe it is good practice to download things only that are needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR, but just my curiosity. Don't you know were there any changes in macOS related to OpenCL support? Refer to #1523 (comment) and #667. I believe some macOS users with eGPUs will benefit a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, but I'm not too familiar with the details of OpenCL on MacOS. But the #667 issue you linked to seems (at first glance) like it could be resolved by an approach similar to this PR. I'll look into it further when I get a bit of time.
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Targeted download of Boost submodules. Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tpboudreau Excellent contribution! I hope it will greatly help to make GPU installation easier for users. Just one super minor comment below.
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Looks green after the latest sync to master. Thanks @StrikerRUS and @jameslamb for your attention and contributions to this PR. |
@tpboudreau Could you please resolve two conflicts and I think we can merge this PR. |
I think this is ready for merging. I believe the failing test is a resource availability issue unrelated to this patch -- please let me know if you think otherwise. |
Hey @tpboudreau @itamarst ! Is there any news about integration of OpenCL on non-Windows platforms? |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
PLEASE DO NOT MERGE WITHOUT MODIFICATION -- SEE BELOW.
This PR introduces a new build option that builds lib_lightgbm.{so,dll} with the GPU code enabled, but without any runtime dependency on an OpenCL ICD loader or Boost libraries. It does this by integrating the Khronos ICD Loader and Boost filesystem and system libraries into the LightGMB shared library at build time (akin to static linking). This build strategy is useful for building the library for inclusion in a Python wheel targeted for Windows, as it allows the package to run on both GPU and non-GPU (OpenCL) equipped machines.
There are at least two issues that must be resolved before merging:
the static build technique also works on Linux, but will require a small upstream change in the build options for the Khronos ICD library. For now, I'm referencing my fork with that change in the Khronos build script. Depending on community feedback (that is, whether this build option would be useful on Linux too) I'll either pursue the upstream change in the Khronos library or pull out the Linux build code from this PR.
Ideally, when building a dll for inclusion in a Windows wheel, the library name would contain a unique stamp or hash of some kind as a failsafe to avoid conflicts with other copies of the library that might be present on the runtime system (see for example this discussion: https://discuss.python.org/t/packaging-dlls-on-windows/1401). The current dll build options don't do this. I've included logic in the new build path to generate such a stamp, but I've commented out the lines of the script that rename the final library until I hear back from the community on whether this approach is desirable. (It would require some changes to setup.py and perhaps other files, and maybe it belongs in a different PR.)
To run this build you can do: "python setup.py install --opencl-python-package".
Alternatively, to build directly from a git-bash command line on Windows run:
You can also build with the equivalent MSVC GUI actions.
On Linux, build with:
(As mentioned above, the Linux build won't run to completion until we resolve the Khronos upstream change issue.) [EDIT: the Linux build should work now, but the upstream change issue still must be resolved before merging.]