Support CPython 3.11, 3.12, and aarch64 processors #2331

Merged
merged 91 commits into from
Sep 11, 2024

Contributor

@ddelange ddelange commented Jan 20, 2023

Hoi 👋

linux-aarch64 accounts for almost 10% of all platforms (ref giampaolo/psutil#2103)

aarch64 has already surpassed Windows in terms of downloads for this package. Oracle, Amazon, Google, and Microsoft all offer aarch64 cloud instances at a very competitive price point compared to amd/intel, so the demand will only grow

  • this PR is adapted from Add arm64 mac and linux wheels MagicStack/asyncpg#954
  • uses QEMU emulation for linux arm64 wheels: manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs 😅
  • manylinux2014 wheels are built with GCC 10, which I think does not guarantee proper functioning of pybind11 (docs).
    • so with this PR, linux wheels are built with GCC 12 (manylinux_2_28).
    • pip will only install these wheels on linux operating systems with glibc >= 2.28 (nearly all 2020+ linux distributions, e.g. debian 10 buster, ubuntu 20.04 focal, almalinux/rhel 8, ...).
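
As a rough illustration of the glibc constraint in the last bullet, the sketch below checks whether the running system reports a glibc new enough for manylinux_2_28 wheels. It uses the standard library's platform.libc_ver(); the helper name supports_manylinux_2_28 is made up for this example, and pip's real tag matching is more involved (it also considers the CPU architecture and musl).

```python
import platform


def supports_manylinux_2_28(libc_name: str, libc_version: str) -> bool:
    """Return True if the reported C library is glibc >= 2.28.

    Hypothetical helper for illustration only; pip's actual compatibility
    check also looks at the CPU architecture and other tags.
    """
    if libc_name != "glibc":
        # musl-based systems (e.g. Alpine) need musllinux wheels instead.
        return False
    try:
        major, minor = (int(part) for part in libc_version.split(".")[:2])
    except ValueError:
        # Empty or unparseable version string: be conservative.
        return False
    return (major, minor) >= (2, 28)


if __name__ == "__main__":
    name, version = platform.libc_ver()
    print(name, version, supports_manylinux_2_28(name, version))
```

On a system that falls below the threshold, pip would skip the manylinux_2_28 wheels and look for another compatible distribution instead.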

the wheels from this PR can be installed with:

# comma separated list for --find-links
export PIP_FIND_LINKS=https://github.com/ddelange/vaex/releases/expanded_assets/core-v4.17.1.post4
pip install --force-reinstall vaex

fixes #2366, fixes #2368, fixes #2397, closes #2427, fixes #2384

@maartenbreddels
Member

Hoi 👋

exciting, will take a look early next week!

  • manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs

that worries me a bit.. :)

groeten,

Maarten

@ddelange
Contributor Author

ddelange commented Jan 21, 2023

here are all timings: https://github.com/ddelange/vaex/actions/runs/3965720337/usage

depending on how many times a month you release vaex, this could eat into the 2k free minutes of GH...

as the parallelization is maximised and the wheels are pushed to PyPI as soon as they're built, most of them will be available soon after a release regardless
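
For a back-of-the-envelope feel for the numbers, here is a small sketch that adds up the emulated aarch64 build time for one release. The per-wheel durations are the approximate figures quoted earlier in this thread; the count of six CPython versions (3.6 to 3.11) is an assumption, and the result is only an upper-bound estimate, not a measured value.

```python
# Rough estimate of emulated (QEMU) build minutes for one release.
# Durations are the approximate figures quoted in this thread; the
# count of six CPython versions (3.6 to 3.11) is an assumption.
PYTHON_VERSIONS = 6
MANYLINUX_AARCH64_MINUTES = 150  # "around 2.5hrs per wheel"
MUSLLINUX_AARCH64_MINUTES = 240  # "up to 4 hrs"


def emulated_minutes(versions: int, manylinux: int, musllinux: int) -> int:
    """Sum the minutes of all emulated jobs; parallel jobs still add up."""
    return versions * (manylinux + musllinux)


total = emulated_minutes(
    PYTHON_VERSIONS, MANYLINUX_AARCH64_MINUTES, MUSLLINUX_AARCH64_MINUTES
)
print(f"{total} minutes per release (upper bound)")
```

Parallel jobs shorten the wall-clock wait, but the summed job minutes are what a usage table would report.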

here are all the wheels: distributions.zip

@ddelange
Contributor Author

interestingly, that was 8260 minutes ^

apparently that's OK? then I don't understand their explanation 🤔 https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#included-storage-and-minutes

@ddelange
Contributor Author

ddelange commented Jan 21, 2023

ah there is a fair amount of duplication in that usage table for whatever reason 🤯

@ddelange
Contributor Author

a diff of current PyPI vs the zip above:

 vaex_core-4.16.1-cp310-cp310-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp310-cp310-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-win_amd64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_10_9_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_11_0_arm64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-win_amd64.whl
 vaex_core-4.16.1-cp36-cp36m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp36-cp36m-win_amd64.whl
 vaex_core-4.16.1-cp37-cp37m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp37-cp37m-win_amd64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp38-cp38-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-win_amd64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp39-cp39-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-win_amd64.whl
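
To read a listing like the one above programmatically, a minimal sketch along these lines can split each wheel filename into its compatibility tags and group platforms per Python version. The function names are hypothetical, and the parsing is simplified: it assumes every name ends in '-{python}-{abi}-{platform}.whl', which holds for the files in this diff.

```python
from collections import defaultdict


def parse_wheel_tags(filename: str) -> tuple[str, str, str]:
    """Split a wheel filename into (python_tag, abi_tag, platform_tag).

    Simplified: assumes the name ends in '-{python}-{abi}-{platform}.whl'.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the compatibility tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def platforms_by_python(filenames: list[str]) -> dict[str, list[str]]:
    """Group platform tags by Python tag to spot gaps in the build matrix."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for name in filenames:
        python_tag, _, platform_tag = parse_wheel_tags(name)
        grouped[python_tag].append(platform_tag)
    return dict(grouped)


wheels = [
    "vaex_core-4.16.1-cp311-cp311-manylinux_2_28_aarch64.whl",
    "vaex_core-4.16.1-cp311-cp311-musllinux_1_1_x86_64.whl",
    "vaex_core-4.16.1-cp36-cp36m-win_amd64.whl",
]
print(platforms_by_python(wheels))
```

Compound platform tags such as manylinux_2_17_x86_64.manylinux2014_x86_64 come through as a single string, since they are joined by a dot rather than a dash.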

Comment on lines -16 to -23
namespace std {
    template<>
    struct hash<PyObject*> {
        size_t operator()(const PyObject *const &o) const {
            return PyObject_Hash((PyObject*)o);
        }
    };
}
Contributor Author

@ddelange ddelange Jan 23, 2023

@maartenbreddels any thoughts on this (incl me updating the pybind11 submodule)?

@@ -183,12 +183,14 @@ def __str__(self):
include_package_data=True,
ext_modules=([extension_vaexfast] if on_rtd else [extension_vaexfast, extension_strings, extension_superutils, extension_superagg]) if not use_skbuild else [],
zip_safe=False,
python_requires=">=3.6",
Contributor Author

cibuildwheel parses this to determine which wheels to build
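
To make the effect of python_requires on the build matrix concrete, here is a simplified sketch of filtering candidate interpreter versions by a minimum version. It is not cibuildwheel's implementation: the function names and the candidate list are hypothetical, and only the plain '>=X.Y' form is handled, whereas a real tool would support full version specifiers.

```python
def parse_minimum(python_requires: str) -> tuple[int, int]:
    """Extract (major, minor) from a '>=X.Y' specifier.

    Only handles the single '>=X.Y' form; anything else is rejected.
    """
    spec = python_requires.strip()
    if not spec.startswith(">="):
        raise ValueError(f"unsupported specifier: {python_requires!r}")
    major, minor = spec[2:].strip().split(".")[:2]
    return int(major), int(minor)


def select_build_targets(python_requires, candidates):
    """Keep the candidate (major, minor) versions that satisfy the minimum."""
    minimum = parse_minimum(python_requires)
    return [version for version in candidates if version >= minimum]


# Hypothetical candidate list; the real set depends on the tool version.
candidates = [(3, 6), (3, 7), (3, 8), (3, 9), (3, 10), (3, 11)]
print(select_build_targets(">=3.8", candidates))
```

Comparing (major, minor) tuples rather than strings matters here: as strings, "3.10" would sort before "3.8".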

Contributor Author

cc @franz101

see also the diff above

@ddelange
Contributor Author

I'm guessing this is blocked by #2339

@maartenbreddels
Member

Just letting you know I'm very busy and had a vacation.
Yes, I'll try to get #2339 green first!

@ddelange
Contributor Author

fwiw there are now third party free minutes on native arm64 machines, to get rid of the slow qemu builds

@ddelange ddelange changed the title from "Build aarch64 wheels" to "Build aarch64 wheels and support python 3.11" Jul 10, 2023
@maartenbreddels
Member

Could you try rebasing this?

@ddelange
Contributor Author

@maartenbreddels already merged in master 👍

@ddelange
Contributor Author

    ERROR: Could not find a version that satisfies the requirement vaex-core<4.17,>=4.17.0 (from vaex)
    ERROR: No matching distribution found for vaex-core<4.17,>=4.17.0
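
The requirement in this error can never be met, because 4.17 and 4.17.0 denote the same release, so ">=4.17.0" and "<4.17" exclude each other. The sketch below demonstrates this with a deliberately simplified comparison (numeric release segments only, no pre-releases or post-releases); the helper names are hypothetical and this is not pip's resolver logic.

```python
def release_tuple(version: str, width: int = 3) -> tuple[int, ...]:
    """Turn '4.17' into (4, 17, 0) so that '4.17' and '4.17.0' compare equal."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version: str, lower: str, upper: str) -> bool:
    """Check lower <= version < upper, i.e. the range '>=lower,<upper'."""
    return release_tuple(lower) <= release_tuple(version) < release_tuple(upper)


# The range from the error message: >=4.17.0 and <4.17
for candidate in ["4.16.1", "4.17", "4.17.0", "4.17.1"]:
    print(candidate, satisfies(candidate, lower="4.17.0", upper="4.17"))
```

Every candidate fails the check, which is why pip reports that no matching distribution exists.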

@maartenbreddels
Member

Yeah, a bug/artifact of our release script. Should be good now.

@ddelange
Contributor Author

ddelange commented Aug 3, 2023

hoi @maartenbreddels 👋

I pulled master and fixed merge conflicts, but it looks like CI is still not very happy. Seeing errors like hdf file missing on disk, and TypeError: train() got an unexpected keyword argument 'early_stopping_rounds'.

Do you think it might be related to this PR?

ddelange referenced this pull request in rapidfuzz/RapidFuzz Aug 10, 2023
@franz101
Contributor

Just wondering here about the Python packaging. Python 3.6 and 3.7 are now deprecated. On the other hand, can we bump to 3.10 and 3.11?

@to-bee

to-bee commented Aug 28, 2023

Do we have any updates on this MR?

@ddelange
Contributor Author

ddelange commented Sep 1, 2023

Hi @maartenbreddels 👋

Was your s3 account deleted by any chance?

vaex.open('s3://vaex/taxi/yellow_taxi_2009_2015_f32.hdf5?anon=true')

raises

FileNotFoundError: [Errno 2] Path does not exist 'vaex/taxi/yellow_taxi_2009_2015_f32.hdf5'. Detail: [errno 2] No such file or directory

@ddelange ddelange force-pushed the build-matrix branch 3 times, most recently from 5680eb9 to 2136629 Compare September 4, 2023 08:28
@maartenbreddels
Member

Love the fix! I was afraid this wouldn't work, because you cannot release with references to files or URLs. But the top-level setup.py does not see PyPI, so it should be good.

I'm installing vaex on a windows machine to see if I can debug that segfault.

@maartenbreddels
Member

I think 91263bc makes it impossible to release on pypi, they will refuse an upload.

@ddelange
Contributor Author

ahh of course!

This reverts commit 91263bc.
@maartenbreddels
Member

maartenbreddels commented Aug 30, 2024

I ran Windows + Python 3.8 100s of times, and I cannot reproduce the failure. I'm leaning towards either:

  • Drop Python 3.8 support on windows only (is that possible?)
  • Release it anyway, assuming it is a 'fluke'

Ideas?

@ddelange
Contributor Author

looks like that's not allowed. I have a feeling it might be a memory leak in the threading implementation of Python 3.8 on Windows

@ddelange
Contributor Author

ddelange commented Aug 30, 2024

you could simply skip publishing that vaex-core wheel, and not upload the tar.gz for vaex-core. it's the tensorflow approach

done in the commits below
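
The idea can be sketched as a filter over the built distribution files before upload. This is only an illustration, not the code from the commits in this PR: the function name, its parameters, and the version number in the example filenames are hypothetical.

```python
def select_uploads(filenames, package="vaex_core",
                   skip_python="cp38", skip_platform="win_amd64"):
    """Return the files to publish.

    Drops the sdist of `package` and its wheel for one Python/platform
    combination; everything else is kept.
    """
    # Accept both the underscore and the hyphen spelling of the name.
    prefixes = (package + "-", package.replace("_", "-") + "-")
    keep = []
    for name in filenames:
        is_target = name.startswith(prefixes)
        if is_target and name.endswith(".tar.gz"):
            continue  # without an sdist, pip has nothing to build from source
        if (is_target
                and f"-{skip_python}-" in name
                and name.endswith(f"-{skip_platform}.whl")):
            continue  # the wheel for the failing combination
        keep.append(name)
    return keep


# Hypothetical file names for illustration.
files = [
    "vaex_core-4.17.1.tar.gz",
    "vaex_core-4.17.1-cp38-cp38-win_amd64.whl",
    "vaex_core-4.17.1-cp39-cp39-win_amd64.whl",
    "vaex_core-4.17.1-cp38-cp38-manylinux_2_28_x86_64.whl",
]
print(select_uploads(files))
```

With neither a matching wheel nor an sdist published, pip on that one combination cannot install the new vaex-core release, while every other platform is unaffected.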

.github/workflows/wheel.yml
Comment on lines +81 to +86
# https://github.com/pypa/gh-action-pypi-publish#trusted-publishing
- name: Publish package distributions to PyPI
  uses: pypa/gh-action-pypi-publish@v1.9.0
  if: startsWith(github.ref, 'refs/tags')
  with:
    skip-existing: true
Contributor Author

@ddelange ddelange Aug 30, 2024

all you need to do for this to work is add vaexio/vaex as trusted publisher here: https://pypi.org/manage/project/vaex/settings/publishing (and to the other vaex pypi projects)

@ddelange
Contributor Author

that did it. we're green 🏁

fwiw, pandas deprecated python 3.8 in August 2023 (v2.1.0)

@maartenbreddels
Member

Wow, amazing. You did most of the work @ddelange, thanks a lot for the work and support!

I'll set up the trusted publisher (probably next week) and see if we can make a release 🥳

@maartenbreddels maartenbreddels merged commit e3e1842 into vaexio:master Sep 11, 2024
51 checks passed
@EwoutH
Contributor

EwoutH commented Sep 11, 2024

Wow it got merged! Congratulations!!

@erwanp

erwanp commented Sep 13, 2024

Well done @ddelange @maartenbreddels @EwoutH; that's great news for our codes!

@HajimeKawahara

Great news! Thanks a lot. @ddelange @maartenbreddels @EwoutH

@franz101
Contributor

@Ben-Epstein wake up it's christmas

@ddelange
Contributor Author

fyi followup PR #2434 should land soon, so release is imminent:)
landed
