Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REG: DataFrame/Series.transform with list and non-list dict values #40090

Merged
merged 3 commits into from
Feb 27, 2021

Conversation

rhshadrach
Copy link
Member

@rhshadrach rhshadrach commented Feb 26, 2021

Currently on master, agg does some preprocessing on dict-like arguments that transform does not, leading to incorrect behavior in transform. This PR moves agg's preprocessing into validate_dictlike_arg which is used by both agg and transform, fixing the issue.

As identified in #40018, this is a regression from 1.1 -> 1.2 (almost certainly from one of my PRs, but I haven't checked which). Backporting this to 1.2 will probably require a bit of a different fix.

@rhshadrach rhshadrach added Regression Functionality that used to work in a prior pandas version Apply Apply, Aggregate, Transform, Map labels Feb 26, 2021
@jreback jreback added this to the 1.2.3 milestone Feb 26, 2021
@@ -327,6 +327,7 @@ Numeric
- Bug in :func:`select_dtypes` different behavior between Windows and Linux with ``include="int"`` (:issue:`36569`)
- Bug in :meth:`DataFrame.apply` and :meth:`DataFrame.agg` when passed argument ``func="size"`` would operate on the entire ``DataFrame`` instead of rows or columns (:issue:`39934`)
- Bug in :meth:`DataFrame.transform` would raise ``SpecificationError`` when passed a dictionary and columns were missing; will now raise a ``KeyError`` instead (:issue:`40004`)
- Bug in :meth:`DataFrame.transform` and :meth:`Series.transform` would have incorrect column labels when passed a dictionary with a mix of list and non-list values (:issue:`40018`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put in 1.2.3

@jreback
Copy link
Contributor

jreback commented Feb 26, 2021

seems fine. can you merge master. and then can backport (agreed may require some work on that branch).


Ensures that necessary columns exist if obj is a DataFrame, and
that a nested renamer is not passed.
that a nested renamer is not passed. Also normalizes to all lists
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe not here if too many changes for a backport, but maybe change function name to say normalize_dictlike_arg

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins Thanks for the suggestion. That's actually how I started this out, but then changed it to validate recalling util._validators (some of those do only validate, but a bunch also normalize). I'm seeing a mix of normalize or validate used. I'll change to normalize here, but maybe we should make this consistent.

Related: #19171

@jreback jreback merged commit 47c6d16 into pandas-dev:master Feb 27, 2021
@jreback
Copy link
Contributor

jreback commented Feb 27, 2021

thanks @rhshadrach pls track the backport itself as it likely won't merge cleanly

@jreback
Copy link
Contributor

jreback commented Feb 27, 2021

@meeseeksdev backport 1.2.x

@lumberbot-app

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Apply Apply, Aggregate, Transform, Map Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame/Series.transform with list and non-list dict values
3 participants