DEPR: enforce indexing deprecations #49511

Merged
merged 7 commits into from
Nov 4, 2022
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.11.0.rst
@@ -368,7 +368,7 @@ Enhancements
 - You can now select with a string from a DataFrame with a datelike index, in a similar way to a Series (:issue:`3070`)

 .. ipython:: python
-   :okwarning:
+   :okexcept:

    idx = pd.date_range("2001-10-1", periods=5, freq='M')
    ts = pd.Series(np.random.rand(len(idx)), index=idx)
4 changes: 4 additions & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -306,6 +306,7 @@ Removal of prior version deprecations/changes
 - Removed argument ``kind`` from :meth:`Index.get_slice_bound`, :meth:`Index.slice_indexer` and :meth:`Index.slice_locs` (:issue:`41378`)
 - Removed arguments ``prefix``, ``squeeze``, ``error_bad_lines`` and ``warn_bad_lines`` from :func:`read_csv` (:issue:`40413`, :issue:`43427`)
 - Removed argument ``datetime_is_numeric`` from :meth:`DataFrame.describe` and :meth:`Series.describe` as datetime data will always be summarized as numeric data (:issue:`34798`)
+- Disallow passing list ``key`` to :meth:`Series.xs` and :meth:`DataFrame.xs`, pass a tuple instead (:issue:`41789`)
 - Disallow subclass-specific keywords (e.g. "freq", "tz", "names", "closed") in the :class:`Index` constructor (:issue:`38597`)
 - Removed argument ``inplace`` from :meth:`Categorical.remove_unused_categories` (:issue:`37918`)
 - Disallow passing non-round floats to :class:`Timestamp` with ``unit="M"`` or ``unit="Y"`` (:issue:`47266`)
@@ -381,6 +382,8 @@ Removal of prior version deprecations/changes
 - Enforced disallowing a tuple of column labels into :meth:`.DataFrameGroupBy.__getitem__` (:issue:`30546`)
 - Enforced disallowing setting values with ``.loc`` using a positional slice. Use ``.loc`` with labels or ``.iloc`` with positions instead (:issue:`31840`)
 - Enforced disallowing positional indexing with a ``float`` key even if that key is a round number, manually cast to integer instead (:issue:`34193`)
+- Enforced disallowing using a :class:`DataFrame` indexer with ``.iloc``, use ``.loc`` instead for automatic alignment (:issue:`39022`)
+- Enforced disallowing ``set`` or ``dict`` indexers in ``__getitem__`` and ``__setitem__`` methods (:issue:`42825`)
 - Enforced disallowing indexing on a :class:`Index` or positional indexing on a :class:`Series` producing multi-dimensional objects e.g. ``obj[:, None]``, convert to numpy before indexing instead (:issue:`35141`)
 - Enforced disallowing ``dict`` or ``set`` objects in ``suffixes`` in :func:`merge` (:issue:`34810`)
 - Enforced disallowing :func:`merge` to produce duplicated columns through the ``suffixes`` keyword and already existing columns (:issue:`22818`)
@@ -392,6 +395,7 @@ Removal of prior version deprecations/changes
 - Enforced :meth:`Rolling.count` with ``min_periods=None`` to default to the size of the window (:issue:`31302`)
 - Renamed ``fname`` to ``path`` in :meth:`DataFrame.to_parquet`, :meth:`DataFrame.to_stata` and :meth:`DataFrame.to_feather` (:issue:`30338`)
 - Enforced disallowing indexing a :class:`Series` with a single item list with a slice (e.g. ``ser[[slice(0, 2)]]``). Either convert the list to tuple, or pass the slice directly instead (:issue:`31333`)
+- Changed behavior indexing on a :class:`DataFrame` with a :class:`DatetimeIndex` index using a string indexer, previously this operated as a slice on rows, now it operates like any other column key; use ``frame.loc[key]`` for the old behavior (:issue:`36179`)
 - Enforced the ``display.max_colwidth`` option to not accept negative integers (:issue:`31569`)
 - Removed the ``display.column_space`` option in favor of ``df.to_string(col_space=...)`` (:issue:`47280`)
 - Removed the deprecated method ``mad`` from pandas classes (:issue:`11787`)
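Two of the whatsnew entries added above (:issue:`36179` and :issue:`42825`) can be illustrated with a short sketch. It assumes pandas 2.0 or later is installed; the frame, its column names and its dates are made up for illustration and do not come from this PR.

```python
import pandas as pd

# Hypothetical frame with a daily DatetimeIndex (illustrative data only).
idx = pd.date_range("2001-10-01", periods=5, freq="D")
df = pd.DataFrame({"a": range(5), "b": range(5, 10)}, index=idx)

# GH 36179: a string key passed to DataFrame.__getitem__ is treated as a
# column label, so a date-like string that is not a column raises KeyError.
try:
    df["2001-10"]
    string_key_error = None
except KeyError as err:
    string_key_error = err

# Row selection by partial date string goes through .loc instead.
october_rows = df.loc["2001-10"]

# GH 42825: set and dict indexers raise TypeError; a list is accepted.
try:
    df[{"a", "b"}]
    set_key_error = None
except TypeError as err:
    set_key_error = err

both_columns = df[["a", "b"]]
```

On pandas versions before 2.0 both `try` blocks would be expected to emit a `FutureWarning` rather than raise, so the two `*_error` names would stay `None` there.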
22 changes: 10 additions & 12 deletions pandas/core/frame.py
@@ -187,8 +187,7 @@
 )
 from pandas.core.indexing import (
     check_bool_indexer,
-    check_deprecated_indexers,
-    convert_to_index_sliceable,
+    check_dict_or_set_indexers,
 )
 from pandas.core.internals import (
     ArrayManager,
@@ -3703,7 +3702,7 @@ def _iter_column_arrays(self) -> Iterator[ArrayLike]:
             yield self._get_column_array(i)

     def __getitem__(self, key):
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         key = lib.item_from_zerodim(key)
         key = com.apply_if_callable(key, self)

@@ -3723,17 +3722,18 @@ def __getitem__(self, key):
             elif is_mi and self.columns.is_unique and key in self.columns:
                 return self._getitem_multilevel(key)
         # Do we have a slicer (on rows)?
-        indexer = convert_to_index_sliceable(self, key)
-        if indexer is not None:
+        if isinstance(key, slice):
+            indexer = self.index._convert_slice_indexer(
+                key, kind="getitem", is_frame=True
+            )
             if isinstance(indexer, np.ndarray):
+                # reachable with DatetimeIndex
                 indexer = lib.maybe_indices_to_slice(
                     indexer.astype(np.intp, copy=False), len(self)
                 )
                 if isinstance(indexer, np.ndarray):
                     # GH#43223 If we can not convert, use take
                     return self.take(indexer, axis=0)
-            # either we have a slice or we have a string that can be converted
-            # to a slice for partial-string date indexing
             return self._slice(indexer, axis=0)

         # Do we have a (boolean) DataFrame?
@@ -3903,11 +3903,9 @@ def __setitem__(self, key, value):
         key = com.apply_if_callable(key, self)

         # see if we can slice the rows
-        indexer = convert_to_index_sliceable(self, key)
-        if indexer is not None:
-            # either we have a slice or we have a string that can be converted
-            # to a slice for partial-string date indexing
-            return self._setitem_slice(indexer, value)
+        if isinstance(key, slice):
+            slc = self.index._convert_slice_indexer(key, kind="getitem", is_frame=True)
+            return self._setitem_slice(slc, value)

         if isinstance(key, DataFrame) or getattr(key, "ndim", None) == 2:
             self._setitem_frame(key, value)
7 changes: 1 addition & 6 deletions pandas/core/generic.py
@@ -3930,12 +3930,7 @@ class animal locomotion
         labels = self._get_axis(axis)

         if isinstance(key, list):
-            warnings.warn(
-                "Passing lists as key for xs is deprecated and will be removed in a "
-                "future version. Pass key as a tuple instead.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
-            )
+            raise TypeError("list keys are not supported in xs, pass a tuple instead")

         if level is not None:
             if not isinstance(labels, MultiIndex):
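From the caller's side, the `xs` change above amounts to swapping a list key for a tuple. The sketch below assumes pandas 2.0 or later; the three-level index and the `value` column are hypothetical and chosen only to mirror the shape used in the tests further down.

```python
import pandas as pd

# Hypothetical three-level index; labels and data are illustrative only.
mi = pd.MultiIndex.from_tuples(
    [(2008, 1, "sat"), (2008, 2, "sun"), (2009, 1, "sat")],
    names=["year", "month", "day"],
)
df = pd.DataFrame({"value": [1, 2, 3]}, index=mi)

# A tuple key supplies one label per requested level.
selected = df.xs((2008, "sat"), level=["year", "day"], drop_level=False)

# The same key as a list is rejected before any lookup happens.
try:
    df.xs([2008, "sat"], level=["year", "day"])
    list_key_error = None
except TypeError as err:
    list_key_error = err
```

Wrapping an existing list with `tuple(key)` at the call site, as the updated tests do, should be the smallest migration in most cases.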
72 changes: 16 additions & 56 deletions pandas/core/indexing.py
@@ -687,7 +687,7 @@ def _get_setitem_indexer(self, key):

         if isinstance(key, tuple):
             for x in key:
-                check_deprecated_indexers(x)
+                check_dict_or_set_indexers(x)

         if self.axis is not None:
             key = _tupleize_axis_indexer(self.ndim, self.axis, key)
@@ -813,7 +813,7 @@ def _ensure_listlike_indexer(self, key, axis=None, value=None) -> None:

     @final
     def __setitem__(self, key, value) -> None:
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         if isinstance(key, tuple):
             key = tuple(list(x) if is_iterator(x) else x for x in key)
             key = tuple(com.apply_if_callable(x, self.obj) for x in key)
@@ -1004,7 +1004,7 @@ def _getitem_nested_tuple(self, tup: tuple):
         # we should be able to match up the dimensionality here

         for key in tup:
-            check_deprecated_indexers(key)
+            check_dict_or_set_indexers(key)

         # we have too many indexers for our dim, but have at least 1
         # multi-index dimension, try to see if we have something like
@@ -1062,7 +1062,7 @@ def _convert_to_indexer(self, key, axis: AxisInt):

     @final
     def __getitem__(self, key):
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         if type(key) is tuple:
             key = tuple(list(x) if is_iterator(x) else x for x in key)
             key = tuple(com.apply_if_callable(x, self.obj) for x in key)
@@ -1499,12 +1499,9 @@ def _has_valid_setitem_indexer(self, indexer) -> bool:
             raise IndexError("iloc cannot enlarge its target object")

         if isinstance(indexer, ABCDataFrame):
-            warnings.warn(
-                "DataFrame indexer for .iloc is deprecated and will be removed in "
-                "a future version.\n"
-                "consider using .loc with a DataFrame indexer for automatic alignment.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
+            raise TypeError(
+                "DataFrame indexer for .iloc is not supported. "
+                "Consider using .loc with a DataFrame indexer for automatic alignment.",
             )

         if not isinstance(indexer, tuple):
@@ -2493,40 +2490,6 @@ def _tupleize_axis_indexer(ndim: int, axis: AxisInt, key) -> tuple:
     return tuple(new_key)


-def convert_to_index_sliceable(obj: DataFrame, key):
-    """
-    If we are index sliceable, then return my slicer, otherwise return None.
-    """
-    idx = obj.index
-    if isinstance(key, slice):
-        return idx._convert_slice_indexer(key, kind="getitem", is_frame=True)
-
-    elif isinstance(key, str):
-
-        # we are an actual column
-        if key in obj.columns:
-            return None
-
-        # We might have a datetimelike string that we can translate to a
-        # slice here via partial string indexing
-        if idx._supports_partial_string_indexing:
-            try:
-                res = idx._get_string_slice(str(key))
-                warnings.warn(
-                    "Indexing a DataFrame with a datetimelike index using a single "
-                    "string to slice the rows, like `frame[string]`, is deprecated "
-                    "and will be removed in a future version. Use `frame.loc[string]` "
-                    "instead.",
-                    FutureWarning,
-                    stacklevel=find_stack_level(),
-                )
-                return res
-            except (KeyError, ValueError, NotImplementedError):
-                return None
-
-    return None
-
-
 def check_bool_indexer(index: Index, key) -> np.ndarray:
     """
     Check if key is a valid boolean indexer for an object with such index and
@@ -2661,27 +2624,24 @@ def need_slice(obj: slice) -> bool:
     )


-def check_deprecated_indexers(key) -> None:
-    """Checks if the key is a deprecated indexer."""
+def check_dict_or_set_indexers(key) -> None:
+    """
+    Check if the indexer is or contains a dict or set, which is no longer allowed.
+    """
     if (
         isinstance(key, set)
         or isinstance(key, tuple)
         and any(isinstance(x, set) for x in key)
     ):
-        warnings.warn(
-            "Passing a set as an indexer is deprecated and will raise in "
-            "a future version. Use a list instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
+        raise TypeError(
+            "Passing a set as an indexer is not supported. Use a list instead."
         )

     if (
         isinstance(key, dict)
         or isinstance(key, tuple)
         and any(isinstance(x, dict) for x in key)
     ):
-        warnings.warn(
-            "Passing a dict as an indexer is deprecated and will raise in "
-            "a future version. Use a list instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
+        raise TypeError(
+            "Passing a dict as an indexer is not supported. Use a list instead."
         )
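The check above only inspects the key itself and, for a tuple, its top-level elements. A standalone sketch of that logic, runnable without pandas, is shown below; `reject_dict_or_set` and `is_rejected` are hypothetical names used for illustration and are not part of pandas.

```python
def reject_dict_or_set(key) -> None:
    """Standalone copy of the check's logic; not the pandas function itself."""
    for kind in (set, dict):
        # Same condition as the diff: the key is a set/dict, or it is a tuple
        # with a set/dict among its top-level elements.
        if isinstance(key, kind) or (
            isinstance(key, tuple) and any(isinstance(x, kind) for x in key)
        ):
            raise TypeError(
                f"Passing a {kind.__name__} as an indexer is not supported. "
                "Use a list instead."
            )


def is_rejected(key) -> bool:
    """Return True when the key would be refused by the check above."""
    try:
        reject_dict_or_set(key)
    except TypeError:
        return True
    return False


# Keys taken from the parametrized tests in this PR, plus two accepted keys.
examples = [{1}, {1: 1}, ({1}, "a"), (1, {"a": "a"}), [1, 2], (1, "a")]
results = [is_rejected(k) for k in examples]
```

The explicit parentheses around the tuple branch match how Python already groups the original `or ... and ...` expression, since `and` binds tighter than `or`.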
6 changes: 3 additions & 3 deletions pandas/core/series.py
@@ -140,7 +140,7 @@
 from pandas.core.indexes.multi import maybe_droplevels
 from pandas.core.indexing import (
     check_bool_indexer,
-    check_deprecated_indexers,
+    check_dict_or_set_indexers,
 )
 from pandas.core.internals import (
     SingleArrayManager,
@@ -914,7 +914,7 @@ def _slice(self, slobj: slice, axis: Axis = 0) -> Series:
         return self._get_values(slobj)

     def __getitem__(self, key):
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         key = com.apply_if_callable(key, self)

         if key is Ellipsis:
@@ -1057,7 +1057,7 @@ def _get_value(self, label, takeable: bool = False):
         return self.iloc[loc]

     def __setitem__(self, key, value) -> None:
-        check_deprecated_indexers(key)
+        check_dict_or_set_indexers(key)
         key = com.apply_if_callable(key, self)
         cacher_needs_updating = self._check_is_chained_assignment_possible()

10 changes: 6 additions & 4 deletions pandas/tests/frame/indexing/test_getitem.py
@@ -142,8 +142,10 @@ def test_getitem_listlike(self, idx_type, levels, float_frame):
         idx_check = list(idx_type(keys))

         if isinstance(idx, (set, dict)):
-            with tm.assert_produces_warning(FutureWarning):
-                result = frame[idx]
+            with pytest.raises(TypeError, match="as an indexer is not supported"):
+                frame[idx]
+
+            return
         else:
             result = frame[idx]

@@ -467,9 +469,9 @@ def test_getitem_datetime_slice(self):
 class TestGetitemDeprecatedIndexers:
     @pytest.mark.parametrize("key", [{"a", "b"}, {"a": "a"}])
     def test_getitem_dict_and_set_deprecated(self, key):
-        # GH#42825
+        # GH#42825 enforced in 2.0
         df = DataFrame(
             [[1, 2], [3, 4]], columns=MultiIndex.from_tuples([("a", 1), ("b", 2)])
         )
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df[key]
22 changes: 11 additions & 11 deletions pandas/tests/frame/indexing/test_indexing.py
@@ -1711,14 +1711,14 @@ def test_loc_on_multiindex_one_level(self):
         tm.assert_frame_equal(result, expected)


-class TestDepreactedIndexers:
+class TestDeprecatedIndexers:
     @pytest.mark.parametrize(
         "key", [{1}, {1: 1}, ({1}, "a"), ({1: 1}, "a"), (1, {"a"}), (1, {"a": "a"})]
     )
     def test_getitem_dict_and_set_deprecated(self, key):
-        # GH#42825
+        # GH#42825 enforced in 2.0
         df = DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key]

     @pytest.mark.parametrize(
@@ -1733,22 +1733,22 @@ def test_getitem_dict_and_set_deprecated(self, key):
         ],
     )
     def test_getitem_dict_and_set_deprecated_multiindex(self, key):
-        # GH#42825
+        # GH#42825 enforced in 2.0
         df = DataFrame(
             [[1, 2], [3, 4]],
             columns=["a", "b"],
             index=MultiIndex.from_tuples([(1, 2), (3, 4)]),
         )
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key]

     @pytest.mark.parametrize(
         "key", [{1}, {1: 1}, ({1}, "a"), ({1: 1}, "a"), (1, {"a"}), (1, {"a": "a"})]
     )
-    def test_setitem_dict_and_set_deprecated(self, key):
-        # GH#42825
+    def test_setitem_dict_and_set_disallowed(self, key):
+        # GH#42825 enforced in 2.0
         df = DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key] = 1

     @pytest.mark.parametrize(
@@ -1762,12 +1762,12 @@ def test_setitem_dict_and_set_deprecated(self, key):
             ((1, 2), {"a": "a"}),
         ],
     )
-    def test_setitem_dict_and_set_deprecated_multiindex(self, key):
-        # GH#42825
+    def test_setitem_dict_and_set_disallowed_multiindex(self, key):
+        # GH#42825 enforced in 2.0
         df = DataFrame(
             [[1, 2], [3, 4]],
             columns=["a", "b"],
             index=MultiIndex.from_tuples([(1, 2), (3, 4)]),
         )
-        with tm.assert_produces_warning(FutureWarning):
+        with pytest.raises(TypeError, match="as an indexer is not supported"):
             df.loc[key] = 1
11 changes: 4 additions & 7 deletions pandas/tests/frame/indexing/test_xs.py
@@ -107,8 +107,7 @@ def test_xs_keep_level(self):
         expected = df[:1]
         tm.assert_frame_equal(result, expected)

-        with tm.assert_produces_warning(FutureWarning):
-            result = df.xs([2008, "sat"], level=["year", "day"], drop_level=False)
+        result = df.xs((2008, "sat"), level=["year", "day"], drop_level=False)
         tm.assert_frame_equal(result, expected)

     def test_xs_view(self, using_array_manager, using_copy_on_write):
@@ -225,8 +224,7 @@ def test_xs_with_duplicates(self, key, level, multiindex_dataframe_random_data):
         expected = concat([frame.xs("one", level="second")] * 2)

         if isinstance(key, list):
-            with tm.assert_produces_warning(FutureWarning):
-                result = df.xs(key, level=level)
+            result = df.xs(tuple(key), level=level)
         else:
             result = df.xs(key, level=level)
         tm.assert_frame_equal(result, expected)
@@ -412,6 +410,5 @@ def test_xs_list_indexer_droplevel_false(self):
         # GH#41760
         mi = MultiIndex.from_tuples([("x", "m", "a"), ("x", "n", "b"), ("y", "o", "c")])
         df = DataFrame([[1, 2, 3], [4, 5, 6]], columns=mi)
-        with tm.assert_produces_warning(FutureWarning):
-            with pytest.raises(KeyError, match="y"):
-                df.xs(["x", "y"], drop_level=False, axis=1)
+        with pytest.raises(KeyError, match="y"):
+            df.xs(("x", "y"), drop_level=False, axis=1)
10 changes: 4 additions & 6 deletions pandas/tests/indexes/datetimes/test_partial_slicing.py
@@ -295,12 +295,10 @@ def test_partial_slicing_dataframe(self):
                     expected = df["a"][theslice]
                     tm.assert_series_equal(result, expected)

-                    # Frame should return slice as well
-                    with tm.assert_produces_warning(FutureWarning):
-                        # GH#36179 deprecated this indexing
-                        result = df[ts_string]
-                    expected = df[theslice]
-                    tm.assert_frame_equal(result, expected)
+                    # pre-2.0 df[ts_string] was overloaded to interpret this
+                    #  as slicing along index
+                    with pytest.raises(KeyError, match=ts_string):
+                        df[ts_string]

             # Timestamp with resolution more precise than index
             # Compatible with existing key