Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package] [ci] switch to PEP 517 / 518 builds (remove setup.py) (fixes #5061) #5759

Merged
merged 110 commits into from
Jun 15, 2023

Conversation

jameslamb
Copy link
Collaborator

@jameslamb jameslamb commented Mar 1, 2023

Fixes #5061.

This PR proposes removing the project's setup.py, and instead using scikit-build-core (https://github.com/scikit-build/scikit-build-core) as a build backend.

scikit-build-core has been developed with support from an NSF grant (link) and is being developed by the same team developing scikit-build, which is used by other popular stats and machine learning projects, like:

(I got that list from https://github.com/orgs/scikit-build/discussions/976)

scikit-build-core is purpose-built for Python packages that use CMake to compile one or more other artifacts (in our case, the lib_lightgbm shared library).

What would happen if we did nothing?

Users would have to pin to older versions of setuptools and pip to be able to install lightgbm (which will make lightgbm harder to use alongside other packages). We've already started seeing evidence of that pain:

Benefits of this approach

  • users can use the newest versions of pip, setuptools, and build in environments where they install lightgbm
  • no more need to maintain 400ish lines of Python code in setup.py
  • makes it easier to compile wheels for more platforms in the future (e.g. MUSL-based wheels for Alpine Linux, macosx-arm64 for the M1/M2 Macs)

What other approaches could we take?

These are all possible routes to resolve #5061.

I'm proposing using scikit-build-core because it's powerful while also significantly simplifying the code that we have to manage in this project, and because it's maintained by members of PyPA who will keep it up to date with the latest changes in Python packaging (i.e. the latest accepted PEPs related to packaging).

Existing features preserved with this approach

list of features (click me)

All of the follow commands are currently supported, and we should try to preserve as many of them as possible.

  • {command} is python setup.py right now
  • {dist} is either the name lightgbm (pull from PyPI) or a path to a *.tar.gz or *.whl

Installing from source tree.

{command} install
{command} install --precompile
{command} install --precompile --user
{command} install --mingw
{command} install --mpi
{command} install --nomp
{command} install --gpu
{command} install --gpu --opencl-include-dir=/usr/local/cuda/include/
{command} install --cuda
{command} install --hdfs
{command} install --bit32

Building an sdist.

{command} sdist
{command} sdist --formats gztar

Building wheels.

{command} bdist_wheel --plat-name=macosx --python-tag py3
{command} bdist_wheel --integrated-opencl --plat-name=$PLATFORM --python-tag py3
{command} bdist_wheel --integrated-opencl --plat-name=win-amd64 --python-tag py3
{command} bdist_wheel --gpu
{command} bdist_wheel --cuda
{command} bdist_wheel --mpi

Customized installation of an sdist (possibly from PyPI).

pip install {dist} --install-option=--nomp
pip install {dist} --install-option=--mpi
pip install {dist} --install-option=--gpu
pip install {dist} --install-option=--gpu --install-option="--opencl-include-dir=/usr/local/cuda/include/" --install-option="--opencl-library=/usr/local/cuda/lib64/libOpenCL.so"
pip install {dist} --install-option=--cuda
pip install {dist} --install-option=--hdfs
pip install {dist} --install-option=--mingw
pip install {dist} --install-option=--bit32
pip install {dist} --install-option=--time-costs
pip install {dist}[dask]
pip install --no-binary `:all:` {dist}
pip install --user {dist}

Existing features lost with this approach

  • automatically redirecting CMake logs to a file (code link)
    • CMake logs will now just be written to stdout directly, and users will have to redirect it themselves if they want to save it to a file
    • honestly, I think this is an improvement... no more seeing a remote build like a CI job fail without knowing the underlying CMake error
  • using MSBuild to compile with pip install {sdist}, then multiple versions of MSVC (link)
  • compiling the 32-bit version with pip install {sdist} (link)
    • as of this PR, the only way to do that type of build will be to clone the repo and run sh build-python.sh install --bit32

@jameslamb jameslamb marked this pull request as ready for review June 9, 2023 14:22
@jameslamb
Copy link
Collaborator Author

Happy to say this is finally ready for review!

I understand that what I'm proposing is a major breaking change... but I don't think we have a choice. We're relying on a pattern for the Python package which the Python packaging tools (pip, setuptools, build) no longer want to support: #5061 (comment)

Given how impactful this change could be, I won't merge it until there are at least two approvals from some mix of @shiyu1994 , @guolinke , and @jmoralez .

Thanks in advance for your thorough review on this, take all the time you need and please ask any questions you have. I'll do my best to answer them.

@@ -84,7 +84,7 @@ Build GPU Version

.. code:: sh

pip install lightgbm --install-option=--gpu
pip install lightgbm --config-setting=cmake.define.USE_GPU=ON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pip exposes --config-settings option: https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-C. Is --config-setting coming from somewhere else?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, thank you! No you're absolutely right, this is a mistake on my part.

I was mixing two things... pip has --config-settings (with an s), and build has --config-setting (no s) (code link).

I was following from https://github.com/scikit-build/scikit-build-core/blob/a145985892e4c904f4e4f089f987b4fef491543d/README.md?plain=1#L112-L113 and didn't read closely enough.

Just pushed 41badff fixing that.

Copy link
Contributor

@hcho3 hcho3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it interesting that you were able to use scikit-build-core without adding any custom glue code. I suppose that build-python.sh does much of the heavy-lifting of packaging the right files in the right place?

For now, XGBoost adds a custom PEP 517 glue code to marshal C++ source files, but in the future I may consider writing a bash script instead, should we decide to transition to scikit-build-core. Thanks for the inspiration!

@jameslamb
Copy link
Collaborator Author

Thanks so much for the review help @hcho3 ! I know you're busy enough with the other projects you work on, I always really appreciate your help here.

build-python.sh does much of the heavy-lifting of packaging the right files in the right place

Yep, exactly! I described my perspective on this in #5837 if you want to see a bit more detail on how I'm thinking about it.

Essentially, I think any Python user of LightGBM whose needs aren't met by pip install (from PyPI) or conda install (from conda-forge) will already need to clone the repo and run some command(s). Making that "run a shell script" should be just as easy for those people as python setup.py install or pip install ., so it's worth switching to that in exchange for simplifying the build system for the artifacts we distribute via PyPI.

I'm also hoping that having a zero-glue-code setup with scikit-build-core will make it easier in the future to build wheels for more platforms, further reducing the need for people to build custom from the sources here in GitHub.

We've been used a similar approach with the R package in this repo for a few years, and I think it's worked well so far: https://github.com/microsoft/LightGBM/blob/master/build-cran-package.sh.

"Operating System :: POSIX",
"Operating System :: Unix",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.7",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't Python 3.6 be here given that we have requires-python = ">=3.6"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh sure, I can add that.

It's not there in setup.py and I was just copying from that.

classifiers=['Development Status :: 5 - Production/Stable',
'Intended Audience :: Science/Research',
'License :: OSI Approved :: MIT License',
'Natural Language :: English',
'Operating System :: MacOS',
'Operating System :: Microsoft :: Windows',
'Operating System :: POSIX',
'Operating System :: Unix',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
'Programming Language :: Python :: 3.10',
'Topic :: Scientific/Engineering :: Artificial Intelligence',
'Typing :: Typed'])

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refer to #4891

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, thanks @guolinke for reminding me ... in #4891 we dropped support, meaning:

  • dropped that trove classifier 'Programming Language :: Python :: 3.6' from the PyPI labels
  • do not have any CI jobs testing Python 3.6

And we decided not to move up to requires-python = ">=3.7" so that people could technically try to build for Python 3.6 if they needed that a bit longer (as @jmoralez also referred to in #5765 (comment)).

By the way, I tried tonight and it looks like lightgbm is still at least minimally compatible with Python 3.6 🎉 .

docker run \
    --rm \
    -it python:3.6 \
    bash

apt-get update
apt-get install \
    --no-install-recommends \
    -y \
        cmake

git clone --recursive \
    https://github.com/microsoft/LightGBM.git

cd LightGBM
sh ./build-python.sh install --nomp

pip install pandas
python ./examples/python-guide/simple_example.py

logs:

Loading data...
Starting training...
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.002058 seconds.
You can set `force_col_wise=true` to remove the overhead.
Training until validation scores don't improve for 5 rounds
Did not meet early stopping. Best iteration is:
[20]	valid_0's l2: 0.198063	valid_0's l1: 0.431129
Saving model...
Starting predicting...
The RMSE of prediction is: 0.4450426449744025

I don't feel strongly either way about whether or not to have the Python 3.6 trove classifier. Since @StrikerRUS removed it in #4891, I think we should defer to his judgment and continue to omit it. Is that ok @jmoralez ? If, after reading this, you still feel strongly that it should be included since lightgbm at least minimally supports Python 3.6, I'll add it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's ok, I agree with not adding it back.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, thanks for talking through it with us! In another PR, I might put up a minimal CI job that runs exactly that dockerized little script I just shared above. So we can rely on CI to catch issues of the form "this PR would be a hard compatibility break because it's not even syntactically valid Python 3.6".

I don't want to take on any more support than that for Python 3.6, but would be useful (I think) to get automated feedback at that level at least, and I'd expect that job shouldn't require too much maintenance burden.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For those finding this conversation from search... here's the PR I promised: #5936.

@jameslamb
Copy link
Collaborator Author

Thanks again for the reviews (on this PR and the many that led up to it)!!!

Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC] [python] 'setup.py install' is deprecated
4 participants