ENH: should we support np.allclose for ExtensionArrays? #37915

Open
arw2019 opened this issue Nov 17, 2020 · 1 comment
Labels
Compat pandas objects compatability with Numpy or Python functions Enhancement ExtensionArray Extending pandas with custom dtypes or arrays. Needs Discussion Requires discussion from core team before further action ufuncs __array_ufunc__ and __array_function__

Comments

@arw2019

arw2019 commented Nov 17, 2020

xref https://github.com/pandas-dev/pandas/pull/33435/files#r406534850

I'm looking into picking up #33435. The current blocker is that np.allclose raises a TypeError when called on an ExtensionArray (EA):

In [1]: import numpy as np
   ...: import pandas as pd
   ...: 
   ...: A = pd.array([1, 2], dtype='Int64')
   ...: B = pd.array([1, 2], dtype='Int64')
   ...: np.allclose(A, B)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-01ae8e8fc321> in <module>
      4 A = pd.array([1, 2], dtype='Int64')
      5 B = pd.array([1, 2], dtype='Int64')
----> 6 np.allclose(A, B)

<__array_function__ internals> in allclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol, equal_nan)
   2187 
   2188     """
-> 2189     res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
   2190     return bool(res)
   2191 

<__array_function__ internals> in isclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in isclose(a, b, rtol, atol, equal_nan)
   2285     y = array(y, dtype=dt, copy=False, subok=True)
   2286 
-> 2287     xfin = isfinite(x)
   2288     yfin = isfinite(y)
   2289     if all(xfin) and all(yfin):

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

The error message says the isfinite ufunc is not supported for the input types. The root cause is that np.isclose calls np.asanyarray on its inputs, and calling np.allclose on the converted arrays triggers the same error:

In [14]: a = np.asanyarray(A)
    ...: b = np.asanyarray(B)
    ...: np.allclose(a, b)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-9c8d10e21030> in <module>
      1 a = np.asanyarray(A)
      2 b = np.asanyarray(B)
----> 3 np.allclose(a, b)

<__array_function__ internals> in allclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol, equal_nan)
   2187 
   2188     """
-> 2189     res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
   2190     return bool(res)
   2191 

<__array_function__ internals> in isclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in isclose(a, b, rtol, atol, equal_nan)
   2285     y = array(y, dtype=dt, copy=False, subok=True)
   2286 
-> 2287     xfin = isfinite(x)
   2288     yfin = isfinite(y)
   2289     if all(xfin) and all(yfin):

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according

The issue is that np.asanyarray returns an object-dtype array when called on EA input:

In [16]: A = pd.array([1, 2], dtype='Int64')
    ...: np.asanyarray(A)
Out[16]: array([1, 2], dtype=object)
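
The same failure can be reproduced without pandas, which suggests that it is the object dtype, rather than the EA itself, that np.isfinite rejects. A minimal sketch using only NumPy (the variable names are illustrative):

```python
import numpy as np

# An object-dtype array like the one np.asanyarray returned above.
obj = np.array([1, 2], dtype=object)

try:
    np.isfinite(obj)
    isfinite_failed = False
except TypeError:
    # np.isfinite has no loop for object dtype, so it raises.
    isfinite_failed = True

# Casting to a numeric dtype first avoids the error.
as_float = obj.astype("float64")
result = np.allclose(as_float, as_float)
```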

As far as #33435 is concerned, one solution is to cast to a NumPy array with a numeric dtype before calling np.allclose. Would we, however, want to make np.allclose work directly on the integer/floating EAs?
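
A rough sketch of the cast-first workaround, assuming the inputs are masked integer/floating arrays. The helper name allclose_masked and its equal_na parameter are hypothetical, not pandas API, and mapping pd.NA to NaN is one possible choice rather than settled behaviour:

```python
import numpy as np
import pandas as pd


def allclose_masked(left, right, rtol=1e-05, atol=1e-08, equal_na=False):
    """Hypothetical helper: compare two masked EAs via float64 NumPy arrays."""
    # na_value=np.nan maps pd.NA to NaN so the cast works when values are missing.
    a = left.to_numpy(dtype="float64", na_value=np.nan)
    b = right.to_numpy(dtype="float64", na_value=np.nan)
    # equal_nan controls whether missing values in the same position count as equal.
    return bool(np.allclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_na))


A = pd.array([1, 2], dtype="Int64")
B = pd.array([1, 2], dtype="Int64")
same = allclose_masked(A, B)

C = pd.array([1, None], dtype="Int64")
with_na_strict = allclose_masked(C, C)
with_na_equal = allclose_masked(C, C, equal_na=True)
```

Note that going through float64 can lose precision for very large int64 values, so this is only suitable where approximate comparison is intended anyway.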

@arw2019 arw2019 added Needs Triage Issue that has not been reviewed by a pandas team member Usage Question ExtensionArray Extending pandas with custom dtypes or arrays. API Design Compat pandas objects compatability with Numpy or Python functions and removed Usage Question labels Nov 17, 2020
@mroeschke mroeschke added Enhancement Needs Discussion Requires discussion from core team before further action and removed API Design Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 14, 2021
@jbrockmendel

Would we, however, want to make np.allclose work directly on the integer/floating EAs?

IIUC this would require implementing __array_function__. This would be very nice to have, but I've had trouble implementing it because there is no supported way to implement it for only a few functions (xref numpy/numpy#18186).
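
To illustrate the limitation, here is a toy sketch of the __array_function__ protocol. MaskedInts, _unwrap and _allclose are hypothetical names for illustration, not pandas classes or functions, and this is not how pandas implements its arrays. The class handles np.allclose and returns NotImplemented for everything else, which makes NumPy raise a TypeError instead of falling back to its default behaviour:

```python
import numpy as np


class MaskedInts:
    """Hypothetical toy container; not a pandas class."""

    def __init__(self, values, mask):
        self.values = np.asarray(values, dtype="int64")
        self.mask = np.asarray(mask, dtype=bool)

    def _as_float(self):
        # Represent masked entries as NaN in a float64 copy.
        out = self.values.astype("float64")
        out[self.mask] = np.nan
        return out

    def __array_function__(self, func, types, args, kwargs):
        if func is np.allclose:
            return _allclose(*args, **kwargs)
        # Any other NumPy function is declined, with no fallback.
        return NotImplemented


def _unwrap(obj):
    return obj._as_float() if isinstance(obj, MaskedInts) else obj


def _allclose(a, b, *args, **kwargs):
    # Plain ndarrays go in, so this call does not dispatch back to MaskedInts.
    return np.allclose(_unwrap(a), _unwrap(b), *args, **kwargs)


x = MaskedInts([1, 2], [False, False])
handled = np.allclose(x, x)

try:
    np.sum(x)
    unhandled_raises = False
except TypeError:
    # Opting in for one function means every other function now fails.
    unhandled_raises = True
```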

@jbrockmendel jbrockmendel added the ufuncs __array_ufunc__ and __array_function__ label Jul 28, 2023