Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: dispatch Block.fillna to putmask/where #45911

Merged
merged 2 commits into from
Feb 11, 2022

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Among other things, we avoid re-calculating 'mask' when splitting.

@jreback jreback added Internals Related to non-user accessible pandas implementation Refactor Internal refactoring of code labels Feb 11, 2022
@jreback jreback added this to the 1.5 milestone Feb 11, 2022
@jreback jreback merged commit 769fc54 into pandas-dev:main Feb 11, 2022
@jbrockmendel jbrockmendel deleted the ref-share-fillna-2 branch February 11, 2022 02:44
Comment on lines +1489 to +1491
if self.dtype.kind == "m":
try:
res_values = self.values.fillna(value, limit=limit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for putting this behind a if self.dtype.kind == "m": check?

Because this now will call super().fillna() for all EAs, but the parent fillna() implementation doesn't actually call EA.fillna()

(this was catched in the geopandas test suite)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for putting this behind a if self.dtype.kind == "m": check?

to get the FutureWarning.

Copy link
Member

@jorisvandenbossche jorisvandenbossche Feb 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which future warning?

Let me rephrase my question: what's the reason for no longer calling EA.fillna() here for all EAs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which future warning?

The FutureWarning on L1494-1502 just below this.

Let me rephrase my question: what's the reason for no longer calling EA.fillna() here for all EAs?

Primarily code-sharing, with the side-benefit that we may actually respect the "inplace" keyword (see TODO comment L1508)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the point of the EA.fillna() method is that this gets called from the DataFrame/Series fillna, so that an EA can provide a specialized implementation. So I think we should keep that logic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so that an EA can provide a specialized implementation

does anyone (e.g. pyarrow) actually have a specialized implementation where it makes a real difference? e.g. is more performant than going through putmask/where?

the point of the EA.fillna() method is that this gets called from the DataFrame/Series fillna

I see "the point" of the EA interface in general as being to provide a minimal(ish) set of primitives from which we can implement the Series/DataFrame(/Index!) API. If Series.foobar can be implemented efficiently without EA.foobar, that is a reason to remove EA.foobar, not to use it unnecessarily.

(Though in this case we still use EA.fillna for pad/bfill/interpolate cases, so I'm not actually advocating getting rid of it)

Last, while the corrected 'inplace' behavior was not the motivator behind this change, I do think it is an important improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants