CLN: Enforce deprecation of pinning name in SeriesGroupBy.agg #57671

Merged on Mar 5, 2024 (3 commits)

2 changes: 1 addition & 1 deletion doc/source/whatsnew/v3.0.0.rst
@@ -191,6 +191,7 @@ Removal of prior version deprecations/changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- :func:`read_excel`, :func:`read_json`, :func:`read_html`, and :func:`read_xml` no longer accept raw string or byte representation of the data. That type of data must be wrapped in a :py:class:`StringIO` or :py:class:`BytesIO` (:issue:`53767`)
- :meth:`Series.dt.to_pydatetime` now returns a :class:`Series` of :py:class:`datetime.datetime` objects (:issue:`52459`)
- :meth:`SeriesGroupBy.agg` no longer pins the name of the group to the input passed to the provided ``func`` (:issue:`51703`)
- All arguments except ``name`` in :meth:`Index.rename` are now keyword only (:issue:`56493`)
- All arguments except the first ``path``-like argument in IO writers are now keyword only (:issue:`54229`)
- All arguments in :meth:`Index.sort_values` are now keyword only (:issue:`56493`)
@@ -238,7 +239,6 @@ Removal of prior version deprecations/changes
- Removed unused arguments ``*args`` and ``**kwargs`` in :class:`Resampler` methods (:issue:`50977`)
- Unrecognized timezones when parsing strings to datetimes now raises a ``ValueError`` (:issue:`51477`)


.. ---------------------------------------------------------------------------
.. _whatsnew_300.performance:
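
Review note: for callers that read the group key from `x.name` inside `agg`, the removed deprecation message suggests iterating over the groupby object instead. A minimal sketch of that migration follows. The data, the `keys` list, and the `group_constants` mapping are made up for illustration and are not taken from this PR.

```python
import pandas as pd

# Hypothetical data and per-group offsets, for illustration only.
ser = pd.Series([1.0, 3.0, 5.0, 7.0], name="value")
keys = [0, 0, 1, 1]
group_constants = {0: 10, 1: 20}

grouped = ser.groupby(keys)

# Iterating yields (key, group) pairs, so the key is available
# without relying on it being pinned to group.name.
by_iteration = pd.Series(
    {key: group_constants[key] + group.mean() for key, group in grouped}
)

# Alternative: aggregate first, then combine with the constants
# through ordinary index alignment on the group keys.
by_alignment = grouped.mean() + pd.Series(group_constants)

print(by_iteration)
print(by_alignment)
```

Both forms pass the key explicitly, so they should behave the same before and after this change. The alignment form avoids a Python-level loop when the per-group adjustment can be expressed as a Series indexed by the group keys.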

52 changes: 2 additions & 50 deletions pandas/core/groupby/generic.py
@@ -62,10 +62,7 @@
)
import pandas.core.common as com
from pandas.core.frame import DataFrame
from pandas.core.groupby import (
base,
ops,
)
from pandas.core.groupby import base
from pandas.core.groupby.groupby import (
GroupBy,
GroupByPlot,
@@ -373,32 +370,7 @@ def aggregate(self, func=None, *args, engine=None, engine_kwargs=None, **kwargs)
index=self._grouper.result_index,
dtype=obj.dtype,
)

if self._grouper.nkeys > 1:
return self._python_agg_general(func, *args, **kwargs)

try:
return self._python_agg_general(func, *args, **kwargs)
except KeyError:
# KeyError raised in test_groupby.test_basic is bc the func does
# a dictionary lookup on group.name, but group name is not
# pinned in _python_agg_general, only in _aggregate_named
result = self._aggregate_named(func, *args, **kwargs)

warnings.warn(
"Pinning the groupby key to each group in "
f"{type(self).__name__}.agg is deprecated, and cases that "
"relied on it will raise in a future version. "
"If your operation requires utilizing the groupby keys, "
"iterate over the groupby object instead.",
FutureWarning,
stacklevel=find_stack_level(),
)

# result is a dict whose keys are the elements of result_index
result = Series(result, index=self._grouper.result_index)
result = self._wrap_aggregated_output(result)
return result
return self._python_agg_general(func, *args, **kwargs)

agg = aggregate
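
Review note: the control flow removed in this hunk can be pictured with the pure-Python sketch below. It is a simplified stand-in, not the pandas implementation: `Group`, `agg_general`, `agg_named`, `agg_before`, and `agg_after` are hypothetical names, groups are plain lists, and the `nkeys > 1` shortcut is omitted.

```python
import warnings


class Group(list):
    """Hypothetical stand-in for a per-group Series; not a pandas class."""

    name = None


def agg_general(groups, func):
    # Stand-in for the general path: func sees each group without its key.
    return {key: func(Group(values)) for key, values in groups.items()}


def agg_named(groups, func):
    # Stand-in for the removed named path: pin the key as group.name first.
    result = {}
    for key, values in groups.items():
        group = Group(values)
        group.name = key
        result[key] = func(group)
    return result


def agg_before(groups, func):
    # Old flow: try the general path, fall back on KeyError and warn.
    try:
        return agg_general(groups, func)
    except KeyError:
        result = agg_named(groups, func)
        warnings.warn(
            "Pinning the groupby key to each group is deprecated",
            FutureWarning,
            stacklevel=2,
        )
        return result


def agg_after(groups, func):
    # New flow: only the general path; a KeyError from func propagates.
    return agg_general(groups, func)


groups = {0: [1.0, 3.0], 1: [5.0, 7.0]}
constants = {0: 10, 1: 20}


def func(group):
    # Looks up the group key via group.name, as the removed test did.
    return constants[group.name] + sum(group) / len(group)
```

In this sketch, `agg_before(groups, func)` succeeds through the fallback and emits a `FutureWarning`, while `agg_after(groups, func)` lets the `KeyError` from the dictionary lookup surface to the caller.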

@@ -527,26 +499,6 @@ def _wrap_applied_output(
result.index = default_index(len(result))
return result

def _aggregate_named(self, func, *args, **kwargs):
# Note: this is very similar to _aggregate_series_pure_python,
# but that does not pin group.name
result = {}
initialized = False

for name, group in self._grouper.get_iterator(self._obj_with_exclusions):
# needed for pandas/tests/groupby/test_groupby.py::test_basic_aggregations
object.__setattr__(group, "name", name)

output = func(group, *args, **kwargs)
output = ops.extract_result(output)
if not initialized:
# We only do this validation on the first iteration
ops.check_result_array(output, group.dtype)
initialized = True
result[name] = output

return result

__examples_series_doc = dedent(
"""
>>> ser = pd.Series([390.0, 350.0, 30.0, 20.0],
10 changes: 0 additions & 10 deletions pandas/tests/groupby/test_reductions.py
@@ -65,16 +65,6 @@ def test_basic_aggregations(dtype):
with pytest.raises(pd.errors.SpecificationError, match=msg):
grouped.aggregate({"one": np.mean, "two": np.std})

group_constants = {0: 10, 1: 20, 2: 30}
msg = (
"Pinning the groupby key to each group in SeriesGroupBy.agg is deprecated, "
"and cases that relied on it will raise in a future version"
)
with tm.assert_produces_warning(FutureWarning, match=msg):
# GH#41090
agged = grouped.agg(lambda x: group_constants[x.name] + x.mean())
assert agged[1] == 21

# corner cases
msg = "Must produce aggregated value"
# exception raised is type Exception