Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support downcasting of nullable arrays #33435

Closed
wants to merge 6 commits into from

Conversation

yixinxiao7
Copy link

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Addressing GH #33013. Supports downcasting of nullable dtypes by transferring elements in extension array to ndarray to avoid TypeError thrown by np.allclose. Enhancement documented on v1.1.0 in the "other enhancements" section. Unit test written in test_to_numeric.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR

pd.to_numeric(pd.Series([1, 2, 3], dtype="Int64"), downcast="integer")
pd.to_numeric(pd.Series([1, 2], dtype="Int32"), downcast="signed")
pd.to_numeric(pd.Series([1, 2, 3], dtype="Int32"), downcast="float")
except TypeError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than the try...except you would ideally do expected = ... and do tm.assert_frame_equal(result, expected) for the various cases; you will see this throughout other tests if you need a place to look

def test_support_downcast_of_nullable_dtypes():
# GH 33013
try:
pd.to_numeric(pd.Series([1, 2, 3], dtype="Int32"), downcast="integer")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you parametrize these instead?

@@ -239,7 +236,9 @@ def trans(x):
if (new_result == result).all():
return new_result
else:
if np.allclose(new_result, result, rtol=0):
# np.allclose raises TypeError on extension arrays
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm not sure about this. I think allclose should still work and maybe that is the root cause instead, but let's see what others think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree , we might need to enable soething on EAs to make this work, but that is the root cause

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on enabling something on EAs? Would this be a new feature to it, or a current feature?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when this is on an EA (e.g. show an exampel with this line specifically)

@WillAyd WillAyd added the ExtensionArray Extending pandas with custom dtypes or arrays. label Apr 9, 2020
@@ -88,6 +88,7 @@ Other enhancements
- :class:`Series.str` now has a `fullmatch` method that matches a regular expression against the entire string in each row of the series, similar to `re.fullmatch` (:issue:`32806`).
- :meth:`DataFrame.sample` will now also allow array-like and BitGenerator objects to be passed to ``random_state`` as seeds (:issue:`32503`)
- :meth:`MultiIndex.union` will now raise `RuntimeWarning` if the object inside are unsortable, pass `sort=False` to suppress this warning (:issue:`33015`)
- :func:`to_numeric` will now support downcasting of nullable dtypes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the issue number

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to 1.2

@@ -239,7 +236,9 @@ def trans(x):
if (new_result == result).all():
return new_result
else:
if np.allclose(new_result, result, rtol=0):
# np.allclose raises TypeError on extension arrays
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree , we might need to enable soething on EAs to make this work, but that is the root cause

@jreback jreback changed the title To numeric Support downcasting of nullable arrays Apr 10, 2020
@jreback jreback added NA - MaskedArrays Related to pd.NA and nullable extension arrays Numeric Operations Arithmetic, Comparison, and Logical operations labels Apr 10, 2020
@jreback
Copy link
Contributor

jreback commented Jun 14, 2020

can you merge master and address comments

@WillAyd
Copy link
Member

WillAyd commented Sep 10, 2020

@yixinxiao7 is this still active? Can you merge master and address comments?

@@ -88,6 +88,7 @@ Other enhancements
- :class:`Series.str` now has a `fullmatch` method that matches a regular expression against the entire string in each row of the series, similar to `re.fullmatch` (:issue:`32806`).
- :meth:`DataFrame.sample` will now also allow array-like and BitGenerator objects to be passed to ``random_state`` as seeds (:issue:`32503`)
- :meth:`MultiIndex.union` will now raise `RuntimeWarning` if the object inside are unsortable, pass `sort=False` to suppress this warning (:issue:`33015`)
- :func:`to_numeric` will now support downcasting of nullable dtypes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to 1.2

@@ -239,7 +236,9 @@ def trans(x):
if (new_result == result).all():
return new_result
else:
if np.allclose(new_result, result, rtol=0):
# np.allclose raises TypeError on extension arrays
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when this is on an EA (e.g. show an exampel with this line specifically)

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@arw2019
Copy link
Member

arw2019 commented Nov 18, 2020

Closing as stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. NA - MaskedArrays Related to pd.NA and nullable extension arrays Numeric Operations Arithmetic, Comparison, and Logical operations Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants