DISC: add accessor attributes to Index for consistency with Series #17134

Open
jbrockmendel opened this issue Aug 1, 2017 · 11 comments
Labels
API - Consistency Internal Consistency of API/Behavior API Design Enhancement

Comments

@jbrockmendel
Member

Broken off of #17117 for discussion, xref #8162, #17061, recent mailing list thread.

With a datetime-like column we can access year, hour, ... with self.dt.foo. With a DatetimeIndex (PeriodIndex, ...) we access these attributes directly as self.foo, without the .dt. I'd like to add a .dt property to the appropriate Index subclasses so that these attributes can be accessed symmetrically, i.e. instead of:

if isinstance(obj, pd.Index):
    year = obj.year
elif isinstance(obj, pd.Series):
    year = obj.dt.year

we can just use year = obj.dt.year regardless.

The implementation is three lines in core.indexes.datetimelike.DatetimeIndexOpsMixin:

    @property
    def dt(self):
        return self

Thoughts?

@TomAugspurger
Contributor

I'd support this. It makes writing generic downstream code easier.
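
As a rough sketch of what such generic downstream code has to do without an Index.dt, a helper can fall back to the object itself when no .dt accessor is available. The name datetime_field is hypothetical and not part of pandas; the sketch assumes that getattr with a default swallows the AttributeError raised when .dt is missing.

```python
import pandas as pd


def datetime_field(obj, name):
    """Return a datetime field such as "year" from a Series or a datetime-like Index.

    Hypothetical helper for illustration only, not a pandas API.
    """
    # A Series exposes datetime fields through .dt; a DatetimeIndex exposes
    # them directly, so fall back to the object itself when .dt is absent.
    accessor = getattr(obj, "dt", obj)
    return getattr(accessor, name)


idx = pd.date_range("2017-01-01", periods=3)
print(datetime_field(idx, "year"))              # Index-like result
print(datetime_field(idx.to_series(), "year"))  # Series result
```

With the proposed property, the helper would reduce to getattr(obj.dt, name) for both kinds of object.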

@jorisvandenbossche jorisvandenbossche changed the title DISC: Make Index Behave More Like Series DISC: add accessor attributes to Index for consistency with Series Aug 1, 2017
@jorisvandenbossche
Member

I would also support the idea, although I wouldn't implement it like that: returning self would not limit tab completion to the datetime attributes, and it would make it possible to misuse the accessor by calling an unrelated method (e.g. df.index.dt.mean()).
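
The difference between the two designs can be sketched with toy classes. Everything below (ToyDatetimeIndex, SelfAccessorIndex, FieldAccessor, RestrictedAccessorIndex) is a hypothetical stand-in written for illustration, not pandas code and not the implementation in any PR; it only shows why returning self leaks unrelated methods while a dedicated accessor object does not.

```python
from datetime import date


class ToyDatetimeIndex:
    """Hypothetical stand-in for a datetime-like Index; not a pandas class."""

    def __init__(self, values):
        self._values = list(values)

    @property
    def year(self):
        return [v.year for v in self._values]

    @property
    def day(self):
        return [v.day for v in self._values]

    def mean(self):
        # Stands in for any Index method unrelated to datetime fields.
        ordinals = [v.toordinal() for v in self._values]
        return date.fromordinal(round(sum(ordinals) / len(ordinals)))


class SelfAccessorIndex(ToyDatetimeIndex):
    """Design 1: .dt simply returns the index itself."""

    @property
    def dt(self):
        return self


class FieldAccessor:
    """Design 2: a separate object exposing only the datetime fields."""

    _fields = ("year", "day")

    def __init__(self, index):
        self._index = index

    def __getattr__(self, name):
        # Called only when normal lookup fails; delegate whitelisted fields.
        if name in self._fields:
            return getattr(self._index, name)
        raise AttributeError(name)

    def __dir__(self):
        # Controls what tab completion would offer.
        return list(self._fields)


class RestrictedAccessorIndex(ToyDatetimeIndex):
    @property
    def dt(self):
        return FieldAccessor(self)


dates = [date(2017, 1, 1), date(2017, 1, 2), date(2017, 1, 3)]
print(hasattr(SelfAccessorIndex(dates).dt, "mean"))        # unrelated method is reachable
print(hasattr(RestrictedAccessorIndex(dates).dt, "mean"))  # unrelated method is hidden
print(dir(RestrictedAccessorIndex(dates).dt))              # only the datetime fields
```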

@jbrockmendel
Member Author

@jorisvandenbossche That's a good point. It shouldn't be too difficult to implement the "right" way. It'll take some small edits to core.indexes.accessors.maybe_to_datetimelike. Implementing that will be easier if/when #17042 gets merged.

@jreback
Contributor

jreback commented Aug 8, 2017

I think this is ok. This makes Index/Series more consistent, which is a good thing.

@jorisvandenbossche
Member

jorisvandenbossche commented Aug 9, 2017

One thing to discuss is what it should return.

In the PR you just opened, you say:

it will return an object that is effectively index.to_series().dt

But it could also return the same as the plain index attribute (which is what you originally proposed: an index.dt that would effectively return self)? The two options are not the same:

In [72]: idx = pd.date_range("2017-01-01", periods=3)

In [74]: idx.day
Out[74]: Int64Index([1, 2, 3], dtype='int64')

In [75]: idx.to_series().dt.day
Out[75]: 
2017-01-01    1
2017-01-02    2
2017-01-03    3
Freq: D, dtype: int64

IMO it should return [74], not [75].
Although, this may somewhat defeat your goal of not having to care whether something is an Index or a Series (.dt would work on both, but would still return something different).
But for that (having an index and a column that really behave the same), it may be more worthwhile to work on #8162 instead of adding dt to the Index.
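
The two return types above can also be compared programmatically, which avoids relying on the exact repr (the integer dtype and the Index class name shown in Out[74] vary between pandas versions). A minimal sketch using only public pandas calls; the variable names are illustrative:

```python
import pandas as pd

idx = pd.date_range("2017-01-01", periods=3)

direct = idx.day                      # behaviour [74]: an Index of day numbers
via_series = idx.to_series().dt.day   # behaviour [75]: a Series of day numbers

print(type(direct).__name__, type(via_series).__name__)

# The values agree; what differs is the container and its labels:
# the Series result is labelled by the original datetimes.
print(direct.tolist() == via_series.tolist())
print(via_series.index.equals(idx))
```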

@jreback
Contributor

jreback commented Aug 9, 2017

this should for sure be [74]. In fact the impl is pretty trivial (and indicated at the top of the PR).

@jreback jreback added Compat pandas objects compatability with Numpy or Python functions Difficulty Intermediate labels Aug 9, 2017
@jreback jreback added this to the Next Major Release milestone Aug 9, 2017
@jorisvandenbossche
Member

and indicated at the top of the PR).

In one of the previous ones, but not in the latest: #17204 (I have to admit I am a bit lost in all the different PRs with related content)

@jbrockmendel
Member Author

(I have to admit I am a bit lost in all the different PRs with related content)

Sorry for the inundation. I'll be slowing down shortly.

@jreback & @jorisvandenbossche
The implementation in #17204 behaves like [75]. The implementation with a property at the top of the PR behaves like [74]. I'm largely indifferent between the two; I implemented the [75]-like version to address @jorisvandenbossche's comment about dir(index.dt).

The main advantage of the [75] version in #17204 is that it makes index.dt actually go through the same code paths for Index and Series, so the similar behavior is not just cosmetic. I view that as a step towards e.g. #8162.

@jorisvandenbossche
Member

I'll be slowing down shortly.

No need to slow down! It is just that it can be confusing when multiple PRs try to solve related or identical problems in different ways; you then need to explain carefully in each PR what it does and how it relates to the other open PRs.

The main advantage of the [75] version in #17204 is that it makes index.dt actually go through the same code paths for Index and Series, so the similar behavior is not just cosmetic. I view that as a step towards e.g. #8162.

It does make the output for Series and Index more similar, but it makes the output of Index.dt completely inconsistent with the other Index attributes. So I still think it should be like [74].
But as I said above, if you really want to be able to handle your index as if it were a column/Series, I think #8162 is a good issue to work on.

@jreback
Contributor

jreback commented Aug 10, 2017

This needs to just use the simpler implementation (e.g. Index.dt -> self); anything else is inconsistent. This should be very straightforward to do.

@jbrockmendel
Member Author

I'm on board with @jreback's suggestion. @jorisvandenbossche is your original concern about the index.dt namespace a sticking point?

@jbrockmendel jbrockmendel added the API - Consistency Internal Consistency of API/Behavior label Dec 26, 2019
@mroeschke mroeschke added Enhancement and removed Compat pandas objects compatability with Numpy or Python functions labels Apr 10, 2020
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
5 participants