Shared libraries are not stripped in the manylinux1 wheels #19531

Closed
xhochy opened this issue Feb 4, 2018 · 8 comments

@xhochy
Contributor

xhochy commented Feb 4, 2018

In all pandas wheels uploaded for manylinux1 so far, the shared libraries are not stripped and therefore carry a lot of unnecessary bytes.

# du -sh /opt/_internal/cpython-3.5.4/lib/python3.5/site-packages/pandas/_libs/
70M	/opt/_internal/cpython-3.5.4/lib/python3.5/site-packages/pandas/_libs/

Calling strip on these saves 53MiB of space.

# strip /opt/_internal/cpython-3.5.4/lib/python3.5/site-packages/pandas/_libs/*.so
# du -sh /opt/_internal/cpython-3.5.4/lib/python3.5/site-packages/pandas/_libs/
17M	/opt/_internal/cpython-3.5.4/lib/python3.5/site-packages/pandas/_libs/

If someone could point me to the official scripts that generate these wheels, I would volunteer to change the build process so that we only include stripped binaries.
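
For anyone who wants to repeat the measurement above in bytes rather than rounded `du` output, here is a minimal sketch. The function names `so_bytes` and `strip_savings` are made up for this example, and it assumes a POSIX shell with binutils `strip` on the PATH. Note that `strip` rewrites the files in place, so it is safer to run this on a copy.

```shell
# Print the combined size in bytes of the *.so files directly inside directory $1.
so_bytes() {
  total=0
  for f in "$1"/*.so; do
    [ -f "$f" ] || continue   # skip the literal pattern when nothing matches
    total=$((total + $(wc -c < "$f")))
  done
  echo "$total"
}

# Strip every *.so in directory $1 in place and report the bytes saved.
# (hypothetical helper; modifies the files, so run it on a copy)
strip_savings() {
  before=$(so_bytes "$1")
  strip "$1"/*.so
  after=$(so_bytes "$1")
  echo "before=$before after=$after saved=$((before - after))"
}
```

Usage would look like `strip_savings /opt/_internal/cpython-3.5.4/lib/python3.5/site-packages/pandas/_libs`.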

@jreback
Contributor

jreback commented Feb 4, 2018

https://github.com/MacPython/pandas-wheels

is our repo for building

prob need to patch the upstream one though (which is where most of the actual building code is)

@TomAugspurger
Contributor

https://github.com/matthew-brett/multibuild is the upstream repo.

@xhochy
Contributor Author

xhochy commented Feb 4, 2018

Small size update with gcc-5 (I'm currently working on setting up a fork of the release builds). Here are the sizes for different configurations (first the unpacked wheel content, then the wheel itself).

Default build options:

104M    pandas
27M     pandas-0.23.0.dev0+218.g3f3b4e0-cp36-cp36m-linux_x86_64.whl

C{,XX}FLAGS='-Wl,-strip-all'

44M     pandas
11M     pandas-0.23.0.dev0+218.g3f3b4e0-cp36-cp36m-linux_x86_64.whl

C{,XX}FLAGS='-flto'

46M     pandas
12M     pandas-0.23.0.dev0+218.g3f3b4e0-cp36-cp36m-linux_x86_64.whl

C{,XX}FLAGS='-Wl,-strip-all -flto'

44M     pandas
11M     pandas-0.23.0.dev0+218.g3f3b4e0-cp36-cp36m-linux_x86_64.whl

Size-wise, the important option is -Wl,-strip-all. Link-time optimisation only shrinks the non-stripped wheel and makes no difference to the stripped one. As LTO might also improve performance, I'm going to check whether all tests pass and whether there is a noticeable difference in the benchmarks. Otherwise, I would only add -Wl,-strip-all to the current build flags.
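
A rough sketch of how a comparison like the one above could be reproduced locally. It is not the script that produced the numbers in this comment. It assumes it is run from a source checkout with `pip` available. The helper names `flag_sets` and `compare_wheel_sizes` and the output directories `wheels-0`, `wheels-1`, ... are made up for illustration.

```shell
# The four flag sets from the comparison, one per line (empty line = default build).
flag_sets() {
  printf '%s\n' '' '-Wl,-strip-all' '-flto' '-Wl,-strip-all -flto'
}

# Build one wheel per flag set into its own directory and print the wheel size.
# Assumption: the current directory is the project source tree.
compare_wheel_sizes() {
  i=0
  flag_sets | while IFS= read -r flags; do
    out="wheels-$i"
    echo "flags: ${flags:-<default>}"
    # </dev/null keeps the build from consuming the loop's input
    CFLAGS="$flags" CXXFLAGS="$flags" pip wheel --no-deps -w "$out" . </dev/null >/dev/null
    du -sh "$out"/*.whl
    i=$((i + 1))
  done
}
```

Calling `compare_wheel_sizes` then prints one `du -sh` line per configuration. The default case here means empty `CFLAGS`/`CXXFLAGS`, which may differ from whatever the environment would otherwise set.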

@jreback jreback added the Build Library building on various platforms label Feb 4, 2018
@jreback jreback added this to the 0.23.0 milestone Feb 4, 2018
@xhochy
Contributor Author

xhochy commented Feb 4, 2018

The release build job using -Wl,-strip-all also passed: https://travis-ci.org/xhochy/pandas-wheels/builds/337220847 and the resulting wheel had the expected size of 11M instead of the currently uploaded 23M. The necessary change was: xhochy/multibuild@453308d

As a PR to multibuild would also affect other packages and there have been issues in the past (pypa/manylinux#119, which should be resolved by using the latest multibuild commit and manylinux1 image), I'm going to check whether the change works for a number of other packages.

@jreback
Contributor

jreback commented Feb 4, 2018

@xhochy are these changes that should be in our compilation steps or only the wheel building steps? (or both)?

@xhochy
Contributor Author

xhochy commented Feb 4, 2018

-Wl,-strip-all should only be in the wheel building. Whereas -flto might be beneficial in general, it first needs to show an advantage.
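
To illustrate what "only in the wheel building" could look like, here is a hedged sketch of a snippet that a wheel-building job might source before compiling. It is not the actual change from xhochy/multibuild@453308d. The helper name `with_strip_flag` is hypothetical.

```shell
# Print flag string $1 with -Wl,-strip-all appended, unless it is already present.
with_strip_flag() {
  case " $1 " in
    *" -Wl,-strip-all "*) printf '%s\n' "$1" ;;
    *) printf '%s\n' "${1:+$1 }-Wl,-strip-all" ;;
  esac
}

# In the wheel-building job only (not the regular development build):
CFLAGS=$(with_strip_flag "${CFLAGS:-}")
CXXFLAGS=$(with_strip_flag "${CXXFLAGS:-}")
export CFLAGS CXXFLAGS
```

Keeping the flag out of the regular compilation steps means development builds keep their symbols for debugging, while the published wheels are stripped.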

@jreback
Contributor

jreback commented Feb 4, 2018

can u open an issue against pandas wheels then?
and link the multi build hash

@jreback jreback closed this as completed Feb 4, 2018
@xhochy
Contributor Author

xhochy commented Feb 4, 2018

@jreback will do once verified that it works with other packages.
