[Python] test_numpy_array_protocol test failures with numpy 2.0.0rc1 #41319

Closed
mgorny opened this issue Apr 21, 2024 · 4 comments

@mgorny
Contributor

mgorny commented Apr 21, 2024

Describe the bug, including details regarding any error messages, version, and platform.

When running the pyarrow 16.0.0 test suite against numpy 2.0.0rc1 on Gentoo Linux amd64, I'm seeing the following test failures:

========================================================= test session starts =========================================================
platform linux -- Python 3.11.9, pytest-8.1.1, pluggy-1.5.0 -- /var/tmp/portage/dev-python/pyarrow-16.0.0/work/apache-arrow-16.0.0/python-python3_11/install/usr/bin/python3.11
cachedir: .pytest_cache
rootdir: /var/tmp/portage/dev-python/pyarrow-16.0.0/temp
plugins: xdist-3.6.0
created: 12/12 workers
12 workers [7653 items]

scheduling tests via WorkStealingScheduling

[…]
============================================================== FAILURES ===============================================================
______________________________________________________ test_numpy_array_protocol ______________________________________________________
[gw6] linux -- Python 3.11.9 /var/tmp/portage/dev-python/pyarrow-16.0.0/work/apache-arrow-16.0.0/python-python3_11/install/usr/bin/python3.11

>   ???


pyarrow/array.pxi:1522: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/array.pxi:1587: in pyarrow.lib.Array.to_numpy
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   pyarrow.lib.ArrowInvalid: Needed to copy 1 chunks with 1 nulls, but zero_copy_only was True


pyarrow/error.pxi:91: ArrowInvalid

During handling of the above exception, another exception occurred:

    def test_numpy_array_protocol():
        # test the __array__ method on pyarrow.Array
        arr = pa.array([1, 2, 3])
        result = np.asarray(arr)
        expected = np.array([1, 2, 3], dtype="int64")
        np.testing.assert_array_equal(result, expected)
    
        # this should not raise a deprecation warning with numpy 2.0+
        result = np.array(arr, copy=False)
        np.testing.assert_array_equal(result, expected)
    
        result = np.array(arr, dtype="int64", copy=False)
        np.testing.assert_array_equal(result, expected)
    
        # no zero-copy is possible
        arr = pa.array([1, 2, None])
        expected = np.array([1, 2, np.nan], dtype="float64")
        result = np.asarray(arr)
        np.testing.assert_array_equal(result, expected)
    
        if Version(np.__version__) < Version("2.0"):
            # copy keyword is not strict and not passed down to __array__
>           result = np.array(arr, copy=False)

arr        = <pyarrow.lib.Int64Array object at 0x7fb5f9052500>
[
  1,
  2,
  null
]
expected   = array([ 1.,  2., nan])
result     = array([ 1.,  2., nan])

../work/apache-arrow-16.0.0/python-python3_11/install/usr/lib/python3.11/site-packages/pyarrow/tests/test_array.py:3328: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   ValueError: Unable to avoid a copy while creating a numpy array as requested.
E   If using `np.array(obj, copy=False)` replace it with `np.asarray(obj)` to allow a copy when needed


pyarrow/array.pxi:1524: ValueError
__________________________________________________ test_numpy_array_protocol[table] ___________________________________________________
[gw4] linux -- Python 3.11.9 /var/tmp/portage/dev-python/pyarrow-16.0.0/work/apache-arrow-16.0.0/python-python3_11/install/usr/bin/python3.11

constructor = <cyfunction table at 0x7fdfdbb7d150>

    @pytest.mark.parametrize("constructor", [pa.table, pa.record_batch])
    def test_numpy_array_protocol(constructor):
        table = constructor([[1, 2, 3], [4.0, 5.0, 6.0]], names=["a", "b"])
        expected = np.array([[1, 4], [2, 5], [3, 6]], dtype="float64")
    
        if Version(np.__version__) < Version("2.0"):
            # copy keyword is not strict and not passed down to __array__
>           result = np.array(table, copy=False)

constructor = <cyfunction table at 0x7fdfdbb7d150>
expected   = array([[1., 4.],
       [2., 5.],
       [3., 6.]])
table      = pyarrow.Table
a: int64
b: double
----
a: [[1,2,3]]
b: [[4,5,6]]

../work/apache-arrow-16.0.0/python-python3_11/install/usr/lib/python3.11/site-packages/pyarrow/tests/test_table.py:3249: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   ValueError: Unable to avoid a copy while creating a numpy array as requested (converting a pyarrow.Table always results in a copy).
E   If using `np.array(obj, copy=False)` replace it with `np.asarray(obj)` to allow a copy when needed


pyarrow/array.pxi:1575: ValueError
_______________________________________________ test_numpy_array_protocol[record_batch] _______________________________________________
[gw4] linux -- Python 3.11.9 /var/tmp/portage/dev-python/pyarrow-16.0.0/work/apache-arrow-16.0.0/python-python3_11/install/usr/bin/python3.11

constructor = <cyfunction record_batch at 0x7fdfdbb7d080>

    @pytest.mark.parametrize("constructor", [pa.table, pa.record_batch])
    def test_numpy_array_protocol(constructor):
        table = constructor([[1, 2, 3], [4.0, 5.0, 6.0]], names=["a", "b"])
        expected = np.array([[1, 4], [2, 5], [3, 6]], dtype="float64")
    
        if Version(np.__version__) < Version("2.0"):
            # copy keyword is not strict and not passed down to __array__
>           result = np.array(table, copy=False)

constructor = <cyfunction record_batch at 0x7fdfdbb7d080>
expected   = array([[1., 4.],
       [2., 5.],
       [3., 6.]])
table      = pyarrow.RecordBatch
a: int64
b: double
----
a: [1,2,3]
b: [4,5,6]

../work/apache-arrow-16.0.0/python-python3_11/install/usr/lib/python3.11/site-packages/pyarrow/tests/test_table.py:3249: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   ValueError: Unable to avoid a copy while creating a numpy array as requested (converting a pyarrow.RecordBatch always results in a copy).
E   If using `np.array(obj, copy=False)` replace it with `np.asarray(obj)` to allow a copy when needed


pyarrow/array.pxi:1575: ValueError

(in addition to the health check failure that I've reported in #41318)

Full build + test log: dev-python:pyarrow-16.0.0:20240421-124023.log.gz

Component(s)

Python
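
For readers who want to reproduce the behaviour without pyarrow, here is a minimal sketch of the `copy=False` semantics behind the `ValueError`s above. `NeedsCopy` and `copy_false_raises` are hypothetical names made up for this illustration; this is not pyarrow's implementation, only a stand-in whose `__array__` refuses `copy=False` in the same way the tracebacks show.

```python
import numpy as np


class NeedsCopy:
    """Hypothetical stand-in for an object whose conversion always copies."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # NumPy 2 forwards copy=False to __array__; NumPy 1.x never passes
        # the `copy` keyword, so this branch is only reached on NumPy 2.
        if copy is False:
            raise ValueError(
                "Unable to avoid a copy while creating a numpy array as requested."
            )
        return np.array(self._values, dtype=dtype)


def copy_false_raises(obj):
    """Return True if np.array(obj, copy=False) raises ValueError."""
    try:
        np.array(obj, copy=False)
    except ValueError:
        return True
    return False


obj = NeedsCopy([1.0, 2.0, float("nan")])
numpy_major = int(np.__version__.split(".")[0])

# np.asarray permits a copy when one is needed, so it works on 1.x and 2.x.
result = np.asarray(obj)

# Expected to be True on NumPy 2 (strict copy=False), False on NumPy 1.x.
strict = copy_false_raises(obj)
print(np.__version__, "copy=False raises:", strict)
```

As the error message itself suggests, `np.asarray(obj)` is the call to use when a copy is acceptable.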

@bnavigator

Same for openSUSE.
python-pyarrow_log.txt

@bnavigator

Patch to make it work with the numpy 2 release candidate:

Index: arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_array.py
===================================================================
--- arrow-apache-arrow-16.0.0.orig/python/pyarrow/tests/test_array.py
+++ arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_array.py
@@ -3323,7 +3323,7 @@ def test_numpy_array_protocol():
     result = np.asarray(arr)
     np.testing.assert_array_equal(result, expected)
 
-    if Version(np.__version__) < Version("2.0"):
+    if Version(np.__version__) < Version("2.0.0rc1"):
         # copy keyword is not strict and not passed down to __array__
         result = np.array(arr, copy=False)
         np.testing.assert_array_equal(result, expected)
Index: arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_table.py
===================================================================
--- arrow-apache-arrow-16.0.0.orig/python/pyarrow/tests/test_table.py
+++ arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_table.py
@@ -3244,7 +3244,7 @@ def test_numpy_array_protocol(constructo
     table = constructor([[1, 2, 3], [4.0, 5.0, 6.0]], names=["a", "b"])
     expected = np.array([[1, 4], [2, 5], [3, 6]], dtype="float64")
 
-    if Version(np.__version__) < Version("2.0"):
+    if Version(np.__version__) < Version("2.0.0rc1"):
         # copy keyword is not strict and not passed down to __array__
         result = np.array(table, copy=False)
         np.testing.assert_array_equal(result, expected)
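
The reason the original check misfires is most likely version ordering: a release candidate sorts before its final release, so `2.0.0rc1` counts as "less than 2.0". A small sketch of this, assuming `Version` behaves like `packaging.version.Version` (the test suite may import it from elsewhere); `predates_numpy_2` is a hypothetical helper shown only as one possible alternative to the patch, not what upstream adopted:

```python
from packaging.version import Version

rc = Version("2.0.0rc1")

# Pre-releases sort before the final release, so the original guard
# treated the release candidate as a 1.x-style NumPy.
print(rc < Version("2.0"))

# The patched guard compares against the release candidate itself.
print(rc < Version("2.0.0rc1"))


def predates_numpy_2(version_string):
    """Hypothetical helper: compare only the release tuple, ignoring rc/dev tags."""
    return Version(version_string).release < (2,)


print(predates_numpy_2("1.26.4"))
print(predates_numpy_2("2.0.0rc1"))
```

Note that comparing against `2.0.0rc1` would still classify earlier pre-releases (for example a `2.0.0b1` or a dev build) as older than 2.0, whereas comparing the release tuple would not.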

@jorisvandenbossche jorisvandenbossche added this to the 17.0.0 milestone Jun 6, 2024
jorisvandenbossche added a commit that referenced this issue Jun 13, 2024
### Rationale for this change

The tests are failing on Windows when using the NumPy 2.0 RC, probably related to the default integer bit width.

### What changes are included in this PR?

### Are these changes tested?

### Are there any user-facing changes?

* GitHub Issue: #41319
* GitHub Issue: #41924

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
@jorisvandenbossche
Member

Resolved by #42099

@mgorny
Contributor Author

mgorny commented Jun 13, 2024

Thanks!
