Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Deprecate week, weekofyear in Series.dt,DatetimeIndex #33595

Merged
merged 15 commits into from
May 26, 2020

Conversation

mgmarino
Copy link
Contributor

@mgmarino mgmarino commented Apr 16, 2020

Closes #33503

This PR is a followup to #33220 and implements what was discussed there.

A few comments:

  • DatetimeIndex.isocalendar does not set the index on the returned dataframe,
    leading to some cumbersome syntax. This may be worth changing, please let me
    know what you think.

  • Should we also deprecate Timestamp.week/weekofyear?

  • closes #xxxx

  • tests added / passed

  • passes black pandas

  • passes git diff upstream/master -u -- "*.py" | flake8 --diff

  • whatsnew entry

@mgmarino mgmarino marked this pull request as ready for review April 16, 2020 20:05
@mgmarino mgmarino marked this pull request as draft April 16, 2020 20:05
stocks["week_id"] = pd.to_datetime(stocks.index).week
stocks["week_id"] = (
pd.to_datetime(stocks.index).isocalendar().set_index(stocks.index).week
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the cumbersome syntax I referenced in the PR message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pd.to_datetime(stocks.index) is just dates, ditto stocks.index, so this can become dates.isocalendar().set_index(dates)["week"]

@mgmarino mgmarino marked this pull request as ready for review April 16, 2020 20:34
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks fine to me

Series.dt.weekofyear and Series.dt.week have been deprecated.
Please use Series.dt.isocalendar().week instead.
"""
import warnings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can put this import at the top of the file (same everywhere)

)
week_series = self.isocalendar().week
week_series.name = self.name
if week_series.hasnans:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do it like this now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in the current week accessor the array is converted to a float and masked if there are nans. This reproduces that behavior instead of passing back a UInt32 series.



def test_week_and_week_of_year_are_deprecated():
arr = DatetimeArray._from_sequence(["2000-01-03"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at the issue number as a comment

@@ -375,6 +375,14 @@ def test_iter_readonly():
list(dti)


def test_week_and_weekofyear_are_deprecated():
idx = pd.date_range(start="2019-12-29", freq="D", periods=4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same



def test_week_and_weekofyear_are_deprecated():
series = pd.to_datetime(pd.Series(["2020-01-01"]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@jreback
Copy link
Contributor

jreback commented Apr 16, 2020

DatetimeIndex.isocalendar does not set the index on the returned dataframe,
leading to some cumbersome syntax. This may be worth changing, please let me
know what you think.

hmm you mean the index of the original dataframe? I think we do this on the accessors, so we should do this.

@jreback jreback added Deprecate Functionality to remove in pandas Datetime Datetime data dtype labels Apr 16, 2020
@@ -809,6 +809,32 @@ def indexer_between_time(

return mask.nonzero()[0]

@property
def weekofyear(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be inheriting this from the underlying DatetimeArray using inherit_names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the deprecation should be handled by the underlying DatetimeArray? I started trying to do it that way. The reason for doing it this way is to get the stacklevel right for the warning. If we are ok with having it "broken" for a direct call to DatetimeArray.week, then I can go that route.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will almost always be seen by users via the DatetimeIndex, so i think thats the case to prioritize getting the stacklevel right for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I moved the deprecation In DatetimeIndex to only be in DatetimeArray and optimized the stacklevel appropriately.

@jbrockmendel
Copy link
Member

DatetimeIndex.isocalendar does not set the index on the returned dataframe,
leading to some cumbersome syntax

agreed

@mgmarino
Copy link
Contributor Author

hmm you mean the index of the original dataframe? I think we do this on the accessors, so we should do this.

Yes, I mean that the DataFrame returned by DatetimeIndex.isocalendar() should probably have its index set to the DatetimeIndex. This is done e.g. in Series.dt.isocalendar(): the returned DataFrame has it's index set to that of the Series.

For Index classes, the current behavior of inherit_names when wrapping is to convert the result of the delegated call to the same type if possible (i.e. DatetimeIndex => DatetimeIndex) and to fall back to a generic Index if not.

DatetimeIndex.isocalendar does not set the index on the returned dataframe,
leading to some cumbersome syntax

agreed

Ok, the question would then be where to do this:

  • It can be done in inherit_names, but the code there would need to be adapted to handle a DataFrame (or something more generic?) and this is currently the only case for that.
  • Or the delegation can be done "manually" in an explicitly definedDatetimeIndex.isocalendar method.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am personally not sure a deprecation is needed, see #33503

@mgmarino mgmarino force-pushed the deprecate-week-of-year branch 2 times, most recently from 943e8f9 to 29ce5ac Compare April 20, 2020 20:49
@mgmarino
Copy link
Contributor Author

Build failing due to numba version. Will wait for #33686 or patch from @mroeschke

@mgmarino mgmarino force-pushed the deprecate-week-of-year branch 2 times, most recently from f651e61 to 6e90ae7 Compare April 21, 2020 05:35
@mgmarino
Copy link
Contributor Author

@jbrockmendel @jreback
Sorry for the delay, I believe all open comments have been addressed. Tests green.

Regarding the setting of the index, I would like to do that in a separate PR if there are no objections.

@@ -66,6 +66,9 @@ def test_nat_vector_field_access():
# on NaT/Timestamp for compat with datetime
if field == "weekday":
continue
if field in ["week", "weekofyear"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should properly parameterize these tests (if you would like to make an issue would be great), PR even better :->

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im happy to make another PR, will wait until this is merged.

FutureWarning,
stacklevel=3,
)
return pd.Int64Index(self.isocalendar().week)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should return ndarray here and be wrapped on DatetimeIndex to return Int64Index

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, thanks.

@mgmarino mgmarino force-pushed the deprecate-week-of-year branch 2 times, most recently from 4539708 to 47c3d32 Compare April 21, 2020 16:50
- This is then wrapped by DatetimeIndex.week
@mgmarino
Copy link
Contributor Author

@mroeschke @jreback @jbrockmendel Just checking in here to see if anything else needs to be done re discussion/PR? Thanks!

@mgmarino
Copy link
Contributor Author

@mroeschke @jreback I would like to ping one more time, to try to keep this from getting too stale. If more discussion needs to be add about this, I'm totally fine with that, but I would like to also address the two issues that came up in the PR here. If this needs more time to merge, then I would go ahead and address those already.

@mroeschke
Copy link
Member

Sorry for the delay @mgmarino. I think everything looked good to me a while back. Looks like there's a merge conflict, but once that's resolved I think this should be good

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @jreback

@jreback jreback added this to the 1.1 milestone May 13, 2020
@jreback
Copy link
Contributor

jreback commented May 13, 2020

yeah this is ok, ping on green. @jorisvandenbossche unless any objections here.

@mgmarino
Copy link
Contributor Author

mgmarino commented May 13, 2020

Build failure due to a version bump of flake in the CI (now 3.8.1, last succeeded with 3.7.9), see #34150.

@mgmarino
Copy link
Contributor Author

@jreback back to green

@jreback
Copy link
Contributor

jreback commented May 25, 2020

this lgtm. we can merge on green.

@mroeschke mroeschke merged commit 3ae33c6 into pandas-dev:master May 26, 2020
@mroeschke
Copy link
Member

Thanks for sticking with this @mgmarino. Great PR as always!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DEPR: weekofyear
5 participants