[cmake] Some improvements to handling of OpenMP on macOS #6489

Merged
merged 8 commits into from
Jul 14, 2024

Conversation

barracuda156
Contributor

Please review.

This is still hackish and makes assumptions which may not hold. However, it is arguably a bit more sane:

  1. While retaining the default use of Homebrew, let the user disable it.
  2. Do not bake in paths to libomp with GCC, which uses its own libgomp (and which normally does not need specific paths at all).

@borchero
Collaborator

@jameslamb we've repeatedly had issues with OpenMP on macOS; I'm wondering whether we should just advertise a conda-based compilation process. I successfully compiled locally with the following environment:

dependencies:
  - python
  - cxx-compiler
  - llvm-openmp
  - cmake
  - make

and by setting CXXFLAGS="-I${CONDA_PREFIX}/include".
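
For reference, the environment above could be written to a file and used roughly as follows. This is only a sketch: the helper name `write_env_file`, the environment name `lightgbm-build`, and the explicit `conda-forge` channel are assumptions for illustration, since the comment lists only the dependencies.

```shell
# Write the dependency list from the comment above to an environment file.
# The environment name and the conda-forge channel are assumptions; the
# original comment only lists the dependencies.
write_env_file() {
    cat > "$1" <<'EOF'
name: lightgbm-build
channels:
  - conda-forge
dependencies:
  - python
  - cxx-compiler
  - llvm-openmp
  - cmake
  - make
EOF
}

# Typical usage (requires conda; not executed here):
#   write_env_file environment.yml
#   conda env create -f environment.yml
#   conda activate lightgbm-build
#   export CXXFLAGS="-I${CONDA_PREFIX}/include"
#   cmake -B build -S . && cmake --build build --target _lightgbm -j4
```

The build commands in the usage comment are the ones used elsewhere in this thread.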

@barracuda156
Contributor Author

@borchero I think any default which you (upstream) prefer is fine, just do not hardcode it, otherwise everyone else is forced to patch the code to get around that.

Compilation works perfectly fine in MacPorts, for example, but I have to throw away huge chunks from CMakeLists now, because we do not want rpaths, and certainly do not want brewisms, and even less so hardcoded usage of incompatible libraries.

That solves the problem for us, but it still exists elsewhere, since some third-party software borrows pre-built LightGBM, and that has hardcoded paths to Homebrew prefix, which of course cannot work in any other setup: ankane/lightgbm-ruby#7

@jameslamb
Collaborator

Thanks for your interest in LightGBM.

To start, please... don't come here and say that the current state is not "sane". We can discuss the relative benefits and disadvantages of different approaches without insulting each other.


Compilation works perfectly fine in MacPorts, for example, but I have to throw away huge chunks from CMakeLists now, because we do not want rpaths, and certainly do not want brewisms, and even less so hardcoded usage of incompatible libraries.

Can you link us to the patches you're using to do that, so we can see specifically what you're referring to?


that has hardcoded paths to Homebrew prefix, which of course cannot work in any other setup: ankane/lightgbm-ruby#7

The hard-coded install name /opt/homebrew/opt/libomp/lib/libomp.dylib mentioned in that issue was removed in v4.4.0, thanks to the changes in #6391.


I'm wondering whether we should just advertise a conda-based compilation process.

The project should already support this without any need to add any other headers, and without adding any new conda-specific changes to its CMakeLists. conda's compilers do all that manipulation of includes, linker paths, etc. for you as a part of how they work.

How I tested that:

conda create \
    --name delete-me \
    -c conda-forge \
    --yes \
        python=3.10 \
        cmake \
        cxx-compiler \
        llvm-openmp

source activate delete-me

cmake -B build -S .
cmake --build build --target _lightgbm -j4

That produces a library with the expected path entries.

otool -L lib_lightgbm.dylib
# lib_lightgbm.dylib:
#	@rpath/lib_lightgbm.dylib (compatibility version 0.0.0, current version 0.0.0)
#	@rpath/libomp.dylib (compatibility version 5.0.0, current version 5.0.0)
#	@rpath/libc++.1.dylib (compatibility version 1.0.0, current version 1.0.0)
#	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

And an RPATH entry pointing to where libomp.dylib was found in conda's libraries during compilation.

otool -l lib_lightgbm.dylib
# Load command 15
#          cmd LC_RPATH
#      cmdsize 56
#         path /Users/jlamb/miniforge3/envs/delete-me/lib (offset 12)
# Load command 16
#          cmd LC_RPATH
#      cmdsize 48
#         path /opt/homebrew/opt/libomp/lib (offset 12)
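
A small helper can pull just those entries out of the `otool -l` output. This is a sketch rather than anything from the PR: `list_rpaths` is a hypothetical name, and it assumes the usual `cmd` / `path` layout of the load commands shown above and paths without spaces.

```shell
# Print every LC_RPATH path found in `otool -l` output read from stdin.
# Assumes paths contain no spaces (awk splits fields on whitespace).
list_rpaths() {
    awk '
        $1 == "cmd"              { in_rpath = ($2 == "LC_RPATH") }
        in_rpath && $1 == "path" { print $2 }
    '
}

# Demo on the load commands quoted above (without the leading "#").
list_rpaths <<'EOF'
Load command 15
          cmd LC_RPATH
      cmdsize 56
         path /Users/jlamb/miniforge3/envs/delete-me/lib (offset 12)
Load command 16
          cmd LC_RPATH
      cmdsize 48
         path /opt/homebrew/opt/libomp/lib (offset 12)
EOF
```

On macOS the same function would be used as `otool -l lib_lightgbm.dylib | list_rpaths`.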

The Python package built in that way works without issue.

source activate delete-me
sh build-python.sh bdist_wheel install
conda install -c conda-forge --yes pandas scikit-learn
python examples/python-guide/sklearn_example.py

If you tried this and observed something different, please tell me.


Do not bake in paths to libomp with GCC, which uses its own libgomp

Can you share an example where you ran into this issue? Because the install name LightGBM uses for the OpenMP it found at build time should be libgomp.dylib when using gcc.

That's what I see (on my M2 Mac).

brew install gcc
export CC=gcc-14
export CXX=g++-14

cmake -B build -S .
Build logs showing that 'gcc -fopenmp' was used:

-- The C compiler identification is GNU 14.1.0
-- The CXX compiler identification is GNU 14.1.0
-- Checking whether C compiler has -isysroot
-- Checking whether C compiler has -isysroot - yes
-- Checking whether C compiler supports OSX deployment target flag
-- Checking whether C compiler supports OSX deployment target flag - yes
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/homebrew/bin/gcc-14 - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Checking whether CXX compiler has -isysroot
-- Checking whether CXX compiler has -isysroot - yes
-- Checking whether CXX compiler supports OSX deployment target flag
-- Checking whether CXX compiler supports OSX deployment target flag - yes
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/homebrew/bin/g++-14 - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Performing Test MM_PREFETCH
-- Performing Test MM_PREFETCH - Failed
-- Performing Test MM_MALLOC
-- Performing Test MM_MALLOC - Failed
-- Configuring done (2.8s)
-- Generating done (0.1s)
-- Build files have been written to: /Users/jlamb/repos/LightGBM/build
cmake --build build --target _lightgbm -j4

otool showing that libgomp.dylib, not libomp.dylib, was linked.

otool -L lib_lightgbm.dylib
./lib_lightgbm.dylib:
	@rpath/lib_lightgbm.dylib (compatibility version 0.0.0, current version 0.0.0)
	/opt/homebrew/opt/gcc/lib/gcc/current/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.33.0)
	/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib (compatibility version 2.0.0, current version 2.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

It would be a problem for relocation to have that absolute, Homebrew-specific install name for libgomp.dylib in the binary... something not caught here because all of the macOS binaries we build from this repo for redistribution are built with clang. I'd welcome a change to allow this project to produce more relocation-friendly binaries on macOS using gcc... although I'm not sure that the approach you're currently proposing in this PR would do that. I mean this part:

if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
    INSTALL_RPATH "/opt/homebrew/opt/libomp/lib;/opt/local/lib/libomp;${OpenMP_LIBRARY_DIR}"
else()
    INSTALL_RPATH "${OpenMP_LIBRARY_DIR}"
endif()

I suspect OpenMP_LIBRARY_DIR will still be an absolute path at that point, and maybe not a highly portable one if using Homebrew's gcc / g++.
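
To spot the kind of absolute, package-manager-specific install names discussed here, the `otool -L` output can be filtered. The following is a minimal sketch; `list_absolute_deps` is a hypothetical helper, and treating `/usr/lib/` and `/System/` as the only system locations is an assumption.

```shell
# Read `otool -L` output on stdin and print dependencies whose install name is
# an absolute path outside the locations assumed to be provided by macOS.
list_absolute_deps() {
    awk '
        NR == 1              { next }      # first line is the inspected file
        $1 ~ /^\/usr\/lib\// { next }      # system libraries
        $1 ~ /^\/System\//   { next }      # system frameworks
        $1 ~ /^\//           { print $1 }  # any other absolute install name
    '
}

# Demo on the gcc build output quoted above.
list_absolute_deps <<'EOF'
./lib_lightgbm.dylib:
	@rpath/lib_lightgbm.dylib (compatibility version 0.0.0, current version 0.0.0)
	/opt/homebrew/opt/gcc/lib/gcc/current/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.33.0)
	/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib (compatibility version 2.0.0, current version 2.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)
EOF
```

For the gcc build this reports the libstdc++ and libgomp entries. One possible way to rewrite such an entry after the build is `install_name_tool -change <old> <new> <file>`, although nothing in this thread says the project does that.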

While retaining the default use of Homebrew, let the user disable it.

I'm not convinced that the case you've described justifies further complicating the API of the project's CMakeLists with a new top-level option like USE_BREW. The code I think you're referring to, starting here:

LightGBM/CMakeLists.txt

Lines 164 to 167 in d56a7a3

if(NOT OpenMP_FOUND)
# libomp 15.0+ from brew is keg-only, so have to search in other locations.
# See https://github.com/Homebrew/homebrew-core/issues/112107#issuecomment-1278042927.
execute_process(COMMAND brew --prefix libomp

will run only if find_package(OpenMP) has not found OpenMP by other means. If MacPorts is placing libomp.dylib at a standard path like /usr/local/lib, I'd be surprised to learn that find_package(OpenMP) is not finding it.
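
The order of checks described here can be modelled in a few lines of shell. This is an illustration only, not LightGBM's CMake logic: `resolve_openmp_prefix` and both of its arguments are hypothetical, with the second argument standing in for the kind of opt-out switch this PR proposes.

```shell
# Hypothetical model of the lookup order; not LightGBM's actual code.
# $1: prefix where the normal search already found OpenMP ("" if it failed)
# $2: ON/OFF, whether asking Homebrew is allowed (defaults to ON)
resolve_openmp_prefix() {
    found_prefix="$1"
    allow_brew="${2:-ON}"
    if [ -n "$found_prefix" ]; then
        # Normal detection succeeded, so Homebrew is never consulted.
        printf '%s\n' "$found_prefix"
        return 0
    fi
    if [ "$allow_brew" = "ON" ] && command -v brew >/dev/null 2>&1; then
        # Fallback: ask Homebrew where its keg-only libomp lives.
        brew --prefix libomp
        return $?
    fi
    # Nothing found and no fallback available.
    return 1
}
```

With the fallback switched off, a failed normal search simply reports failure instead of picking up whatever Homebrew has installed.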

Can you describe precisely, in a way that I could reproduce, a case where you saw different behavior?

@barracuda156
Contributor Author

@jameslamb Thank you for responding in detail.

We can discuss the relative benefits and disadvantages of different approaches without insulting each other.

Absolutely. I apologize if my choice of words created such an impression; it was not intended. (I readily admit that my own code is not sane in some instances.)

Can you share an example where you ran into this issue? Because the install name LightGBM uses for the OpenMP it found at build time should be libgomp.dylib when using gcc.

Yes, you are right, of course. I just did not see any relevant condition on compiler here:

INSTALL_RPATH "/opt/homebrew/opt/libomp/lib;${OpenMP_LIBRARY_DIR}"

So my impression was that this would still be used regardless. Sorry if I misread the code.

It would be a problem for relocation to have that absolute, Homebrew-specific install name for libgomp.dylib in the binary.

Any specific hardcoded path is problematic for general-case distribution, even without any concern for relocating a binary, since there are no standard paths for non-system libraries. I do not know a solution for the general case, as long as pre-built binaries/libraries are distributed. (Using rpaths is not perfect either, though preferable to baked-in paths to any package manager.)
For building from source, a general solution is to allow the build to be configurable (while maintaining the default behavior which you consider optimal). A downstream distribution using Unix-style installations may not need or want any rpaths being used, since absolute paths are more robust (provided they match the actual environment, of course).

I'm not convinced that the case you've described justifies further complicating the API of the project's CMakeLists with a new top-level option like USE_BREW.

To be honest, I would rather remove all package-specific code, since it is redundant at best: package managers handle the install prefix etc. in their own build systems. But I think someone may get upset, whether now or at some point in the future, if such a commit is merged, that somebody from MacPorts removed Homebrew code :) So I do not want to do that.
But I hope Homebrew is capable enough to handle its installations, and in that case this code is indeed unneeded.

However, if the approach is to leave defaults as they are (there may be reasons for that which I did not think of, after all), then there is a case to allow disabling certain default behavior. It is conceivable that someone may have both Homebrew and MacPorts, or have Homebrew but be trying to build without relying on a package manager, etc.
(I have seen cases when some other software packages tried to download and install some random stuff. It is good if it fails explicitly, worse if it succeeds without one noticing what is going on.)
But yes, I have noticed it is a fallback.

If MacPorts is placing libomp.dylib at a standard path like /usr/local/lib, I'd be surprised to learn that find_package(OpenMP) is not finding it.

MacPorts is placing it in /opt/local/lib/libomp because we do not want it to be found accidentally. But the codebase takes care of finding it when it is needed, so there is no problem in this sense. I rather see a problem in something being found when it should not be (from a point of view of a given user).

Can you link us to the patches you're using to do that, so we can see specifically what you're referring to?

Sure, but this is our local patch, not a proposal for changes (I understand it may not fit the needs/preferences of others).
https://github.com/macports/macports-ports/blob/b9671ddc017ddf902b248ea760e7fd2a05178792/math/LightGBM/files/0001-Fix-CMakeLists.txt.patch
(Though IMO it will be cool to have configure options to use external libraries instead of building and installing duplicates.)

@barracuda156
Contributor Author

@jameslamb Should I drop the second commit and leave only 26cc564 which should be uncontroversial?

@jameslamb
Collaborator

It may be a few days until I'm able to provide a thoughtful answer here, sorry.

The state of these codepaths is very focused on building the library for redistribution (e.g. in Python wheels) and you've brought up some excellent points about how that might make other types of builds more difficult. I need to find a bit of time to think carefully about this.

@barracuda156
Contributor Author

@jameslamb Sure, thank you. No hurry here.

@jameslamb
Collaborator

IMO it will be cool to have configure options to use external libraries instead of building and installing duplicates.

Thanks for this. We intentionally vendor sources of specific, fixed commits of Eigen, fmt, fast_double_parser, and (in some builds) Boost here for stability reasons and because the sources are relatively small. I'd like to preserve that pattern... this project has been struggling for years from a lack of maintainer availability (relative to its size), and I don't want to take on the packaging and maintenance burden of allowing those dependencies to be pluggable.

But if you do want to propose that separately, we'd be happy to talk about it more on a separate issue.

A downstream distribution using Unix-style installations may not need or want any rpaths being used, since absolute paths are more robust

Sure, and this is why conda re-writes all of the paths embedded in the binaries it creates (as one example): https://docs.conda.io/projects/conda-build/en/latest/resources/make-relocatable.html.

if the approach is to leave defaults as they are (there may be reasons for that which I did not think of, after all)

Yes I'd like to preserve the defaults. In short, we want to support the following:

  • distribute pre-compiled binaries for macOS that are compiled with clang + use LLVM OpenMP (libomp)
  • where those are Python wheels (which contain a lib_lightgbm.dylib):
    • if the library is loaded into a process that already has a libomp.dylib loaded (e.g. by some other library), dynamically link to that instead of loading a second, different copy of libomp.dylib
    • if the linker needs to search for libomp.dylib, it should eventually try wherever Homebrew puts libomp.dylib before raising a runtime error

In this project, we are producing shared libraries that are distributed in Python wheels installed with e.g. pip... a package manager that does not have a distribution of OpenMP. And to make things even more fun, we want to support the common case of installing such a wheel into an environment otherwise managed by conda (for, e.g., building one of the many variants of the Python package that we do not publish precompiled binaries for, like pip install --no-binary lightgbm -C cmake.define=-DUSE_CUDA=ON).

Given that, I strongly think the project should continue to set the install name for its OpenMP dependency to @rpath/lib[go|io|o]mp.dylib.

MacPorts is placing it in /opt/local/lib/libomp because we do not want it to be found accidentally.

Ah sorry, my mistake. Thank you for explaining, that makes sense to me.


Based on my read of the things you've written, and reviewing other OpenMP codepaths in LightGBM's CMake configuration, I'm open to adding a new CMake option as you've proposed.

Here's what I'd like to propose, please let me know what you think:

  1. call this new option USE_HOMEBREW_FALLBACK
    • default ON to preserve the current behavior
    • docstring "(macOS-only) set to OFF to avoid looking in 'brew --prefix' for libraries (e.g. OpenMP)"
  2. change this other compiler condition from STREQUAL "Clang" to MATCHES "Clang" as you have in this PR
  3. change any other mentions of 'libomp' in code comments in CMakeLists.txt that are not LLVM-specific to "OpenMP" or "lib[go|io|o]mp" to make it clearer that they shouldn't be LLVM-specific
  4. put OpenMP_LIBRARY_DIR as the first entry on the list of clang paths added to the RPATH, like this:
if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
    INSTALL_RPATH "${OpenMP_LIBRARY_DIR};/opt/homebrew/opt/libomp/lib;/opt/local/lib/libomp"
else()
    INSTALL_RPATH "${OpenMP_LIBRARY_DIR}"
endif()

That should ensure that if you use the shared library on the same system where you built it, the location where libomp.dylib was found at build time will be the one that's loaded at runtime.
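
Why the ordering matters can be illustrated with a simplified model of resolving an `@rpath/...` install name: the RPATH entries are tried in order and the first directory containing the library wins. The sketch below is hypothetical (`first_rpath_match` is not a real tool) and deliberately ignores everything else the dynamic loader considers, such as libraries already loaded into the process or `DYLD_*` environment variables.

```shell
# Simplified model of resolving "@rpath/<lib>": walk the RPATH entries in
# order and print the first directory that contains the library.
# $1: library file name
# $2: RPATH entries separated by ';' (same separator as in INSTALL_RPATH)
first_rpath_match() {
    lib_name="$1"
    old_ifs="$IFS"
    IFS=';'
    for dir in $2; do
        if [ -e "$dir/$lib_name" ]; then
            IFS="$old_ifs"
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Example call (the first path is a placeholder for the build-time location):
#   first_rpath_match libomp.dylib \
#       "/path/where/libomp/was/found;/opt/homebrew/opt/libomp/lib;/opt/local/lib/libomp"
```

Under this model, listing the build-time directory first means the Homebrew and MacPorts locations are reached only when the library is missing from the directory where it was found during the build. Assuming the option keeps the name proposed above, the Homebrew lookup at configure time would be disabled with `cmake -B build -S . -DUSE_HOMEBREW_FALLBACK=OFF`.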

Collaborator

@jameslamb jameslamb left a comment

leaving a blocking review based on my last comment (meant to post that as a review not a regular comment)

@barracuda156
Contributor Author

barracuda156 commented Jun 23, 2024

@jameslamb Thank you for reviewing, sounds good to me. I will deal with this tomorrow and rebase the PR.

P.S. As for supporting external dependencies, I do not think that, as a non-default option, it is likely to increase the maintenance burden. A notice can be added that such a configuration is not tested / not guaranteed to work (or something to this effect). Maybe also use mark_as_advanced.

@jameslamb
Collaborator

@barracuda156 would it be ok if I push a commit to this PR with the changes recommended in #6489 (comment)?

We are going to have to do a new release very soon to keep the R package on CRAN (#6522) and I'd love to get this fix into that release.

@barracuda156
Contributor Author

@jameslamb Yes, please, and thank you very much.

@jameslamb
Collaborator

(I'll fix the cmakelint error in the next commit)

@barracuda156 I've pushed some changes here, could you please take a look and let me know what you think?

@barracuda156
Contributor Author

@jameslamb LGTM, thank you very much.

@jameslamb
Collaborator

Since I pushed some commits here, my approval probably shouldn't be enough for a merge.

@borchero could you look again whenever you have time?

@jameslamb jameslamb mentioned this pull request Jul 12, 2024
27 tasks
@jameslamb
Collaborator

I've updated this to latest master. I feel confident merging it based on @guolinke 's approval and @barracuda156 's verbal approval.

I'll merge this once CI succeeds. @borchero if you come back and look later whenever you have time, please do comment here if you see anything problematic. We can always make follow-up changes.

@jameslamb jameslamb changed the title Some improvements to handling of OpenMP on macOS [cmake] Some improvements to handling of OpenMP on macOS Jul 14, 2024
@jameslamb jameslamb merged commit 4842832 into microsoft:master Jul 14, 2024
45 checks passed
@barracuda156 barracuda156 deleted the macports branch July 16, 2024 13:04
@barracuda156
Contributor Author

Thank you!
