Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci] [python-package] replace 'python setup.py' with a shell script #5837

Merged
merged 20 commits into from
May 4, 2023

Conversation

jameslamb
Copy link
Collaborator

@jameslamb jameslamb commented Apr 18, 2023

Contributes to #5061.

This PR proposes replacing all existing direct invocations of python setup.py with a shell script, build-python-package.sh.

What will the use experience be like?

For people using pip install or conda install, this PR does not change anything.

Anyone currently building / installing the Python package from sources here on GitHub using commands like the following:

cd python-package
python setup.py {command} [OPTIONS]

Would now instead run

sh ./build-python.sh {command} [OPTIONS]

Why a shell script?

Most Python build tools (e.g. wheel, build, pip wheel, scikit-build, hatchling, etc.) assume a directory structure with all source files at or below the level of where the tool is called from. This repo contains more than just a Python package, and so it can't be directly used that way.

This project currently solves that problem via Python code in a setup.py file which reaches up into the parent directory and copying files at build time.

if not IS_SOURCE_FLAG_PATH.is_file():
copy_files_helper('include')
copy_files_helper('src')
for submodule in (CURRENT_DIR.parent / 'external_libs').iterdir():
submodule_stem = submodule.stem
if submodule_stem == 'compute' and not use_gpu:
continue
copy_files_helper(Path('external_libs') / submodule_stem)

As described #5061, the Python packaging community is moving away from supporting such setup.py invocations.

I'm proposing that we use an sh-compatible shell script to create an isolated source directory prior to invoking any Python build tools for the following reasons:

How I tested this

In addition to all off the checks run in CI, I did the following:

@@ -23,7 +23,6 @@ clone_depth: 5

install:
- git submodule update --init --recursive # get `external_libs` folder
- set PATH=%PATH:C:\Program Files\Git\usr\bin;=% # delete sh.exe from PATH (mingw32-make fix)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was originally put in to work around the following error using mingw32-make.exe + CMake:

CMake Error at C:/Program Files (x86)/CMake/share/cmake-3.16/Modules/CMakeMinGWFindMake.cmake:12 (message):
  sh.exe was found in your PATH, here:
  C:/Program Files/Git/usr/bin/sh.exe
  For MinGW make to work correctly sh.exe must NOT be in your path.
  Run cmake from a shell that does not have sh.exe in your PATH.
  If you want to use a UNIX shell, then use MSYS Makefiles.
Call Stack (most recent call first):
  CMakeLists.txt:39 (project)

If we're going to encourage the use of a shell script to build a development version of the package, that's a problem.

Thankfully, I found that passing a flag through to CMake that says "don't worry, I'm confident having sh.exe on PATH won't be a problem" (as suggested in this Stack Overflow post) works for LightGBM's Python build!

@@ -46,41 +46,6 @@ def find_lib() -> List[str]:
return LIB_PATH


def copy_files(integrated_opencl: bool = False, use_gpu: bool = False) -> None:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this copying-files-around stuff is unnecessary now the setup.py will only be invoked from the root of a directory containing all of the C/C++ source files + CMake scripts in the expected places.

@@ -160,7 +125,8 @@ def compile_cpp(
if use_mpi:
raise Exception('MPI version cannot be compiled by MinGW due to the miss of MPI library in it')
logger.info("Starting to compile with CMake and MinGW.")
silent_call(cmake_cmd + ["-G", "MinGW Makefiles"], raise_error=True,
# ref: https://stackoverflow.com/a/45104058/3986677
silent_call(cmake_cmd + ["-G", "MinGW Makefiles", "-DCMAKE_SH=CMAKE_SH-NOTFOUND"], raise_error=True,
Copy link
Collaborator Author

@jameslamb jameslamb Apr 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the successful Appveyor MinGW build for evidence that this works: https://ci.appveyor.com/project/guolinke/lightgbm/builds/46835500/job/a4uo45pxxl8gwwti

Comment on lines +97 to +100
if [[ "$1" != *=* ]];
then shift;
fi
BOOST_INCLUDE_DIR="${1#*=}"
Copy link
Collaborator Author

@jameslamb jameslamb Apr 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this pattern allows both of these invocations to work corrrectly:

sh ./build-python.sh bdist_wheel --boost-include-dir /usr/local/src/Boost/include
sh ./build-python.sh bdist_wheel --boost-include-dir=/usr/local/src/Boost/include

as suggested here: https://unix.stackexchange.com/a/580258/550004

@jameslamb jameslamb changed the title WIP: [ci] replace setup.py with a shell script [ci] replace setup.py with a shell script Apr 20, 2023
@jameslamb jameslamb marked this pull request as ready for review April 20, 2023 04:44
@jameslamb jameslamb changed the title [ci] replace setup.py with a shell script [ci] [python-package] replace setup.py with a shell script Apr 20, 2023
@jameslamb jameslamb changed the title [ci] [python-package] replace setup.py with a shell script [ci] [python-package] replace 'python setup.py' with a shell script Apr 20, 2023
@jameslamb
Copy link
Collaborator Author

Alright I'm excited to say that this is ready for review!

If other maintainers agree with this proposal, it'll be a big step towards finally resolving #5061, and I think it'll make the next PR proposing the use of scikit-build-core a bit easier to review.

Since this is a significant breaking change, I'd ideally like approvals from all 3 of @shiyu1994 @guolinke and @jmoralez whenever you have time. Thank you for considering it!

@jameslamb
Copy link
Collaborator Author

@guolinke could you look through this and let me know what you think?

I know that removing the ability to run python setup.py {command} will be disruptive for some users but I think that that's going to be forced on us eventually by the Python packaging tools like setuptools and wheel, based on all the documentation in #5061, and that switching to a shell script means we'll never have to break "build the Python package from git sources" this way again.

@guolinke
Copy link
Collaborator

guolinke commented May 3, 2023

Sorry for missing this PR, will check it today

@jameslamb
Copy link
Collaborator Author

Thanks for the help everyone! This is a painful but I think necessary step towards keeping lightgbm up to date with changes in Python packaging practices.

And I hope this'll also be a step towards unlocking some other features that newer build systems offer, like cross-compilation for platforms we don't currently publish wheels for.

I strongly suspect that this one is going to be found from search engines, so after merging I'm going to lock discussion on this PR.

If you are reading this comment right now and want to say or ask something about this change, please comment on #5061 or open a new issue.

@jameslamb jameslamb merged commit a97c444 into master May 4, 2023
@jameslamb jameslamb deleted the python/use-a-script branch May 4, 2023 22:06
@microsoft microsoft locked as resolved and limited conversation to collaborators May 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants