Skip to content

Commit

Permalink
BUG: Fix behavior of argmax and argmin with inf (#16449)
Browse files Browse the repository at this point in the history
Closes #13595

The implementations of `nanargmin` and `nanargmax` in `nanops` were
forcing the `_get_values` utility function to always mask out infinite
values. For example, in `nanargmax`,

    >>> nanops._get_values(np.array([1, np.nan, np.inf]), True,
    isfinite=True, fill_value_typ='-inf')

    (array([  1., -inf, -inf]),
     array([False,  True,  True], dtype=bool),
     dtype('float64'),
     numpy.float64)

The first element of the result tuple (the masked version of the
values array) is used for actually finding the max or min argument. As
a result, infinite values could never be correctly recognized as the
maximum or minimum values in an array.

This also affects the behavior of `Series.idxmax` with string data (or
the `object` dtype generally). Previously, `nanargmax` would always
attempt to coerce its input to float, even when there were no missing
values. Now, it will not, and so will work correctly in the particular
case of a `Series` of strings with no missing values. However, because
it's difficult to ensure that `nanargmin` and `nanargmax` will behave
consistently for arbitrary `Series` of `object` with and without
missing values, these functions are now explicitly disallowed for
`object`.
  • Loading branch information
DGrady authored and Daniel Grady committed Aug 14, 2017
1 parent 06850a1 commit 9a3d62d
Show file tree
Hide file tree
Showing 4 changed files with 54 additions and 6 deletions.
3 changes: 2 additions & 1 deletion doc/source/whatsnew/v0.21.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -265,7 +265,7 @@ Other API Changes
- Removed the ``@slow`` decorator from ``pandas.util.testing``, which caused issues for some downstream packages' test suites. Use ``@pytest.mark.slow`` instead, which achieves the same thing (:issue:`16850`)
- Moved definition of ``MergeError`` to the ``pandas.errors`` module.
- The signature of :func:`Series.set_axis` and :func:`DataFrame.set_axis` has been changed from ``set_axis(axis, labels)`` to ``set_axis(labels, axis=0)``, for consistency with the rest of the API. The old signature is deprecated and will show a ``FutureWarning`` (:issue:`14636`)

- :func:`Series.argmin` and :func:`Series.argmax` will now raise a ``TypeError`` when used with ``object`` dtypes, instead of a ``ValueError`` (:issue:`13595`)

.. _whatsnew_0210.deprecations:

Expand Down Expand Up @@ -369,6 +369,7 @@ Reshaping
- Fixes regression from 0.20, :func:`Series.aggregate` and :func:`DataFrame.aggregate` allow dictionaries as return values again (:issue:`16741`)
- Fixes dtype of result with integer dtype input, from :func:`pivot_table` when called with ``margins=True`` (:issue:`17013`)
- Bug in :func:`crosstab` where passing two ``Series`` with the same name raised a ``KeyError`` (:issue:`13279`)
- :func:`Series.argmin`, :func:`Series.argmax`, and their counterparts on ``DataFrame`` and groupby objects work correctly with floating point data that contains infinite values (:issue:`13595`).

Numeric
^^^^^^^
Expand Down
8 changes: 4 additions & 4 deletions pandas/core/nanops.py
Original file line number Diff line number Diff line change
Expand Up @@ -486,23 +486,23 @@ def reduction(values, axis=None, skipna=True):
nanmax = _nanminmax('max', fill_value_typ='-inf')


@disallow('O')
def nanargmax(values, axis=None, skipna=True):
"""
Returns -1 in the NA case
"""
values, mask, dtype, _ = _get_values(values, skipna, fill_value_typ='-inf',
isfinite=True)
values, mask, dtype, _ = _get_values(values, skipna, fill_value_typ='-inf')
result = values.argmax(axis)
result = _maybe_arg_null_out(result, axis, mask, skipna)
return result


@disallow('O')
def nanargmin(values, axis=None, skipna=True):
"""
Returns -1 in the NA case
"""
values, mask, dtype, _ = _get_values(values, skipna, fill_value_typ='+inf',
isfinite=True)
values, mask, dtype, _ = _get_values(values, skipna, fill_value_typ='+inf')
result = values.argmin(axis)
result = _maybe_arg_null_out(result, axis, mask, skipna)
return result
Expand Down
2 changes: 1 addition & 1 deletion pandas/tests/groupby/test_groupby.py
Original file line number Diff line number Diff line change
Expand Up @@ -2339,7 +2339,7 @@ def test_non_cython_api(self):
assert_frame_equal(result, expected)

# idxmax
expected = DataFrame([[0], [nan]], columns=['B'], index=[1, 3])
expected = DataFrame([[0.0], [nan]], columns=['B'], index=[1, 3])
expected.index.name = 'A'
result = g.idxmax()
assert_frame_equal(result, expected)
Expand Down
47 changes: 47 additions & 0 deletions pandas/tests/series/test_operators.py
Original file line number Diff line number Diff line change
Expand Up @@ -1857,3 +1857,50 @@ def test_op_duplicate_index(self):
result = s1 + s2
expected = pd.Series([11, 12, np.nan], index=[1, 1, 2])
assert_series_equal(result, expected)

@pytest.mark.parametrize(
"test_input,error_type",
[
(pd.Series([]), ValueError),
# For strings, or any Series with dtype 'O'
(pd.Series(['foo', 'bar', 'baz']), TypeError),
(pd.Series([(1,), (2,)]), TypeError),
# For mixed data types
(
pd.Series(['foo', 'foo', 'bar', 'bar', None, np.nan, 'baz']),
TypeError
),
]
)
def test_assert_argminmax_raises(self, test_input, error_type):
"""
Cases where ``Series.argmax`` and related should raise an exception
"""
with pytest.raises(error_type):
test_input.argmin()
with pytest.raises(error_type):
test_input.argmin(skipna=False)
with pytest.raises(error_type):
test_input.argmax()
with pytest.raises(error_type):
test_input.argmax(skipna=False)

def test_argminmax_with_inf(self):
# For numeric data with NA and Inf (GH #13595)
s = pd.Series([0, -np.inf, np.inf, np.nan])

assert s.argmin() == 1
assert np.isnan(s.argmin(skipna=False))

assert s.argmax() == 2
assert np.isnan(s.argmax(skipna=False))

# Using old-style behavior that treats floating point nan, -inf, and
# +inf as missing
with pd.option_context('mode.use_inf_as_na', True):
assert s.argmin() == 0
assert np.isnan(s.argmin(skipna=False))
assert s.argmax() == 0
np.isnan(s.argmax(skipna=False))

0 comments on commit 9a3d62d

Please sign in to comment.