Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NumPy v2 #4925

Draft
wants to merge 101 commits into
base: main
Choose a base branch
from
Draft

NumPy v2 #4925

wants to merge 101 commits into from

Conversation

agriyakhetarpal
Copy link
Member

@agriyakhetarpal agriyakhetarpal commented Jul 10, 2024

Description

This PR bumps the NumPy version in-tree to version 2.0.0, released on June 16, 2024. A lot of packages here are expected to fail with their builds because there has been an ABI change in NumPy, and many packages are being tracked in numpy/numpy#26191 – let us hope there won't be too many of them here. I would suggest that this goes into v0.27 rather than 0.26.2, not that there is a lot of time in 0.26.2 anyway... I shall try my best to update as many failing packages as I can in this PR, or open separate PRs for each of them if that would be better. I should be able to get an estimate once the CI runs. Versions of packages that are compatible with v2.0.0rc1 should be ABI-compatible with NumPy v2.

Checklists

@agriyakhetarpal
Copy link
Member Author

First failure is with building NumPy itself

FileNotFoundError: [Errno 2] No such file or directory:                         
'/tmp/tmpn7pa0ha3/numpy-2.0.0/numpy/core/include/numpy/numpyconfig.h'

which is just because it's included in _core/ and not in core/, and similarly _numpyconfig.h will be created there

@agriyakhetarpal
Copy link
Member Author

The new error shows that the static libs are placed in newer paths as well.

From https://github.com/numpy/numpy/blob/2176571e1c7e5040df7a3cdfd9f7d859f4feb014/numpy/_core/npymath.ini.in#L18 I see that it moved to _core/ as well, but libnpyrandom is at the same location: https://github.com/numpy/numpy/blob/2176571e1c7e5040df7a3cdfd9f7d859f4feb014/numpy/random/meson.build#L16

@agriyakhetarpal
Copy link
Member Author

I got NumPy to build with not a lot of effort, which is great. Next, I'm going to the NumPy dependents. SciPy v1.13.0 is the first version to support NumPy v2 (#4719), which I need to return to after #4920 gets merged. I'll explore the rest of the dependents right now.

@ryanking13
Copy link
Member

I got NumPy to build with not a lot of effort, which is great. Next, I'm going to the NumPy dependents. SciPy v1.13.0 is the first version to support NumPy v2 (#4719), which I need to return to after #4920 gets merged. I'll explore the rest of the dependents right now.

Cooool! That's awesome. I agree that this should go to 0.27. While we try our best to update packages as much as possible to support Numpy 2.0, I think it is also an option to disable some of them temporarily if they do not support Numpy 2.0.

@agriyakhetarpal
Copy link
Member Author

agriyakhetarpal commented Jul 11, 2024

Cooool! That's awesome. I agree that this should go to 0.27. While we try our best to update packages as much as possible to support Numpy 2.0, I think it is also an option to disable some of them temporarily if they do not support Numpy 2.0.

@ryanking13, thanks for your input, I can look into disabling packages once I have further progress. Would you know of a way to reliably run the Selenium code in test_numpy.py inside the provided Docker container? I seem to run into errors with pyodide_dist_dir everytime even when I spin up a fresh container. Running locally could help me debug faster.

@ryanking13
Copy link
Member

Could you share with me the error you get? Some of the package versions in our docker image (e.g. pytest) is outdated so it may cause some error, but I am not sure.

@agriyakhetarpal
Copy link
Member Author

Ah, I see it's a very standard error:

$ python packages/numpy/test_numpy.py

which gets me

Traceback (most recent call last):
  File "/src/packages/numpy/test_numpy.py", line 363, in <module>
    @run_in_pyodide(packages=["numpy"])
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/.docker_home/.local/lib/python3.12/site-packages/pytest_pyodide/decorator.py", line 408, in __init__
    pytest_assert_rewrites and package_is_built("pytest")
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/.docker_home/.local/lib/python3.12/site-packages/pytest_pyodide/decorator.py", line 22, in package_is_built
    return _package_is_built(package_name, pytest.pyodide_dist_dir)
                                           ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'pytest' has no attribute 'pyodide_dist_dir'

I upgraded pytest and pytest-pyodide but didn't get anywhere with that, I'll spend some time debugging the JS interface error I just received

@agriyakhetarpal
Copy link
Member Author

NumPy test suite passing (not sure if this is the best way to handle the buffer), but I'll move on to the rest now.

@agriyakhetarpal
Copy link
Member Author

scikit-learn v1.4.2 was the first one to include support for NumPy v2.0.0, but we already have v1.5.0 in the main branch. Should I bump it to the latest release here, i.e., v1.5.1?

@agriyakhetarpal
Copy link
Member Author

agriyakhetarpal commented Jul 12, 2024

Here, I have updated OpenCV (opencv-python) and yt. As mentioned above, scikit-learn is already at a compatible version, so it seems that bumping SciPy to v1.13.0 (#4719) is the only issue holding up this PR, which is waiting on #4920. I assume that the versions of the rest of the packages we have right now will have been built against NumPy v2, but if they haven't been, I'll take care of that.

@ryanking13
Copy link
Member

python packages/numpy/test_numpy.py

I think you need to call pytest not python.

pytest packages/numpy/test_numpy.py

FYI: I generally prefer verbose mode + chrome browser when testing locally.

pytest -v --rt=chrome packages/numpy/test_numpy.py

@ryanking13
Copy link
Member

scikit-learn v1.4.2 was the first one to include support for NumPy v2.0.0, but we already have v1.5.0 in the main branch. Should I bump it to the latest release here, i.e., v1.5.1?

Updating the package version is not required, but updates are always welcome. If you're having trouble updating, you can leave it for later.

it seems that bumping SciPy to v1.13.0 (#4719) is the only issue holding up this PR

Awesome! Thank you for updating these packages!

@agriyakhetarpal
Copy link
Member Author

Updated scikit-learn to version 1.5.1 in 4ebc115 – tagging the recipe and package maintainer @lesteve for visibility

@agriyakhetarpal

This comment was marked as resolved.

@agriyakhetarpal
Copy link
Member Author

There's a failure with building LightGBM in CI, which I'm unable to reproduce locally – LightGBM builds perfectly fine for me...

@ryanking13
Copy link
Member

There's a failure with building LightGBM in CI, which I'm unable to reproduce locally – LightGBM builds perfectly fine for me...

Yeah it works fine for me too locally but it fails in CI occasionally. I am guessing that some other package building in parallel with lightgbm is affecting the build, but I am not 100% sure.

While we try to isolate package build as much as possible, some package affect each other. For example, when building scipy, it locates boost-cpp headers using pkg-config, and use it if they are found. So I when we build boost-cpp and scipy simultaneously, it is possible that some strange behavior happens.

I am guessing that something similar is happening in the lightGBM...

@agriyakhetarpal
Copy link
Member Author

agriyakhetarpal commented Nov 28, 2024

I might need some help with fixing Matplotlib here, I disabled PyArrow earlier as requested. I was initially going to do a new release for matplotlib-pyodide yesterday and earlier in the day today, but on second thought(s), I don't think investing time into upgrading Matplotlib to 3.8.4 is helpful because there are a lot of versions, so what I've done is that I adapted all of the necessary changes in a patch to make Matplotlib 3.5.2 support NumPy v2 (@rgommers and I discussed this recently where it was noted that Matplotlib is most likely not doing something crazy with NumPy at build-time), and most of them are API-only (please see comments in the patch). I'm not sure if this is the right approach to go forward with.

Edit: also, Matplotlib hasn't had an upper bound on the Python version in 3.5.2, so while it builds fine, 3.7.3 is the first release to actually build against Python 3.12 and provide cp312 wheels on PyPI. Do we need to upgrade to 3.7.3?

@rgommers
Copy link
Contributor

I don't think investing time into upgrading Matplotlib to 3.8.4 is helpful because there are a lot of versions, so what I've done is that I adapted all of the necessary changes in a patch to make Matplotlib 3.5.2 support NumPy v2 (@rgommers and I discussed this recently where it was noted that Matplotlib is most likely not doing something crazy with NumPy at build-time), and most of them are API-only (please see comments in the patch). I'm not sure if this is the right approach to go forward with.

I'm not sure about the details here, but I'd always ask "is this useful long term". Patching an old version of Matplotlib is in principle effort that only has a net negative effort on future work, especially if you're trying to patch package versions together that don't actually work together for regular installs from PyPI. So it seems wrong to me from first principles. I don't understand what the problem is with upgrading Matplotlib to latest, and then dealing with whatever is left (that's then Pyodide-specific issues)?

@agriyakhetarpal
Copy link
Member Author

I'm not sure about the details here, but I'd always ask "is this useful long term". Patching an old version of Matplotlib is in principle effort that only has a net negative effort on future work, especially if you're trying to patch package versions together that don't actually work together for regular installs from PyPI. So it seems wrong to me from first principles. I don't understand what the problem is with upgrading Matplotlib to latest, and then dealing with whatever is left (that's then Pyodide-specific issues)?

Upgrading Matplotlib to version 3.8.4 or later does bring Pyodide-specific issues because of the gap between versions – that is, more function signature mismatches with matplotlib-pyodide's WASM backend, in which no one has claimed expertise. I can try upgrading incrementally (to 3.6, 3.7.3, etc.) first, maybe that can help.

@rgommers
Copy link
Contributor

Upgrading Matplotlib to version 3.8.4 or later does bring Pyodide-specific issues

It's very well possible that it's (too) hard right now, but this upgrade has to be done anyway sooner or later, so it's not wasted effort. I'd suggest at least trying, and otherwise do the maximum version bump you can get done and report blockers as new issues, before patching an old version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants