BUG: np.inf now causes Index to upcast from int to float #16996

Merged 7 commits on Jul 18, 2017
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.21.0.txt
@@ -145,7 +145,7 @@ Bug Fixes
 ~~~~~~~~~

 - Fixes regression in 0.20, :func:`Series.aggregate` and :func:`DataFrame.aggregate` allow dictionaries as return values again (:issue:`16741`)
-- Fixes inf upcast from integer indices in 0.20, previously :func:`Int64Index.__contains__` and `DataFrame.loc.__setitem__` raised an `OverflowError` when given `np.inf` (:issue:`16957`)
+- Fixes bug where indexing with `np.inf` caused an `OverflowError` to be raised (:issue:`16957`)
Contributor

Use double backticks around ``np.inf`` and ``OverflowError``.
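
For context, the `OverflowError` named in this entry is what plain Python raises when a float infinity is coerced to an integer, while NaN raises `ValueError` instead. The sketch below shows one way an integer membership check could guard that coercion. `contains_int` is a hypothetical helper written for illustration; it is not the pandas implementation, which this page does not show.

```python
import math


def contains_int(values, key):
    """Hypothetical helper: membership test against a collection of ints.

    Illustration only, not pandas code. It shows why the coercion step
    has to guard against OverflowError (inf) and ValueError (NaN).
    """
    try:
        as_int = int(key)  # int(inf) -> OverflowError, int(nan) -> ValueError
    except (OverflowError, ValueError):
        return False
    # Reject lossy coercions such as 3.5 -> 3 before looking the value up.
    return as_int == key and as_int in values


values = {1, 2, 3}
print(contains_int(values, math.inf))  # False
print(contains_int(values, math.nan))  # False
print(contains_int(values, 3.0))       # True
print(contains_int(values, 3.5))       # False
```

Without the `try`/`except`, the first two calls would raise rather than return `False`, which is the kind of failure the entry describes.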


Conversion
^^^^^^^^^^
27 changes: 16 additions & 11 deletions pandas/tests/indexing/test_indexing.py
@@ -75,8 +75,7 @@ def test_inf_upcast(self):
         df.loc[np.inf] = 3

         # make sure we can look up the value
-        result = df.loc[np.inf, 0]
-        tm.assert_almost_equal(result, 3)
+        assert df.loc[np.inf, 0] == 3

         result = df.index
         expected = pd.Float64Index([1, 2, np.inf])
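
The upcast that `test_inf_upcast` checks can be reproduced in isolation with a sketch like the one below. The starting frame is an assumption (the setup lines of the test are not visible in this hunk), and the check is written against the index dtype rather than `pd.Float64Index`, which newer pandas releases no longer provide.

```python
import numpy as np
import pandas as pd

# Assumed starting point: an integer-labelled frame with one column named 0.
df = pd.DataFrame({0: [1, 2]}, index=[1, 2])

# Adding a row labelled np.inf should upcast the index from int to float
# instead of raising OverflowError.
df.loc[np.inf] = 3

looked_up = df.loc[np.inf, 0]
index_values = df.index.tolist()

assert looked_up == 3
assert index_values == [1.0, 2.0, np.inf]
assert df.index.dtype == np.float64
```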
@@ -571,17 +570,23 @@ def test_astype_assignment_with_dups(self):
         # result = df.get_dtype_counts().sort_index()
         # expected = Series({'float64': 2, 'object': 1}).sort_index()

-    def test_coercion_with_contains(self):
+    @pytest.mark.parametrize("val,expected", [
+        (np.inf, False),
+        (np.nan, False),
+        (3.0, True),
+        (3.5, False),
+        (0.0, True),
+    ])
Member

Nice! Lots of good test cases.

However, I'd like to split these in two tests based on the expected result. This is because statements like "assert result is True" are less idiomatic. Rather, we should be saying "assert val in index" OR "assert val not in index."

It also means one fewer parameter in your pytest parametrization. 😄
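
The split suggested above might look like the following sketch. It assumes a plain integer `pd.Index` rather than the `pd.Int64Index` and `pd.UInt64Index` classes used in the PR, and the test names and the extra value `1.0` are illustrative, not taken from the PR.

```python
import numpy as np
import pandas as pd
import pytest


@pytest.mark.parametrize("val", [3.0, 1.0])
def test_int_index_contains(val):
    # Floats that are exactly equal to a stored integer count as members.
    index = pd.Index([1, 2, 3])
    assert val in index


@pytest.mark.parametrize("val", [np.inf, np.nan, 3.5])
def test_int_index_does_not_contain(val):
    # inf, nan and non-integral floats are reported as absent, without raising.
    index = pd.Index([1, 2, 3])
    assert val not in index
```

Each test now takes a single parameter and states its expectation directly in the assertion, so no `expected` column is needed.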

+    def test_coercion_with_contains(self, val, expected):
Member

Not quite what I was going for here. What I actually was asking you to do was create example indices that contain np.inf or np.nan. You can by all means keep the ones you have, but include test cases where testing membership of np.inf or np.nan returns true.

Contributor Author

Ah, I guess I didn't understand that because I was explicitly using a pd.Int64Index, which should and does raise a TypeError if you try to initialize it with np.inf. I guess it should be fine if I use a normal pd.Index.

Member @gfyoung, Jul 17, 2017

Yes, you can by all means use pd.Index! 👍
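
As a rough sketch of the positive cases discussed in this thread, a generic `pd.Index` can hold `np.inf` or `np.nan` alongside integers, and membership then returns `True`. The variable names are illustrative, and the dtype check reflects the assumption that mixing integers with `inf` or `nan` yields a float index.

```python
import numpy as np
import pandas as pd

inf_index = pd.Index([1, 2, np.inf])
nan_index = pd.Index([1, 2, np.nan])

# Mixing integers with inf or nan produces a float64 index.
assert inf_index.dtype == np.float64
assert nan_index.dtype == np.float64

# Membership is True only for the special value each index actually holds.
assert np.inf in inf_index
assert np.nan in nan_index
assert np.inf not in nan_index
assert np.nan not in inf_index
```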

         # Related to GH 16957
         # Checking if Int64Index contains np.inf should catch the OverflowError
-        for val in [np.inf, np.nan]:
-            index = pd.Int64Index([1, 2, 3])
-            result = val in index
-            tm.assert_almost_equal(result, False)
-
-        index = pd.UInt64Index([1, 2, 3])
-        result = np.inf in index
-        tm.assert_almost_equal(result, False)
+        index = pd.Int64Index([1, 2, 3])
+        result = val in index
+        assert result is expected
+
+        index = pd.UInt64Index([1, 2, 3])
+        result = val in index
+        assert result is expected

     def test_index_type_coercion(self):
