Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: resample fixes #12449

Closed
wants to merge 1 commit into from
Closed

BUG: resample fixes #12449

wants to merge 1 commit into from

Conversation

jreback
Copy link
Contributor

@jreback jreback commented Feb 26, 2016

make sure .resample(...).plot() warns and returns a correct plotting object
make sure that .groupby(...).resample(....) is hitting warnings when appropriate

closes #12448

@jreback jreback added API Design Resample resample method Compat pandas objects compatability with Numpy or Python functions labels Feb 26, 2016
@jreback jreback added this to the 0.18.0 milestone Feb 26, 2016
@jreback
Copy link
Contributor Author

jreback commented Feb 26, 2016

@jorisvandenbossche this should take care of 1) and 3) in #12448 and doc 2)

@jreback jreback force-pushed the resample branch 3 times, most recently from 370d88b to 8e070c3 Compare February 26, 2016 14:07
@jreback
Copy link
Contributor Author

jreback commented Feb 27, 2016

@jorisvandenbossche ?

def plot(self, *args, **kwargs):
# for compat with prior versions, we want to
# have the warnings shown here and just have this work
return _maybe_process_deprecations(self, how='mean').plot(*args,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small remark, by passing how='mean' here, the user gets the message "FutureWarning: how in .resample() is deprecated", while he did not use 'how' (in eg s.resample('15min').plot())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, he DID use mean implicity. Oh you want a more meaningful warning? ok sure

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that is what I mean (a more meaningful warning). As this is now rather confusing to the user IMO, as the warning says something about how while the user has not explicitely used it (with other operations, you get a "resample is now a deferred operation, please use df.resample().mean() instead message, which is more appropriate here)

@jorisvandenbossche
Copy link
Member

It is possible that the previous workaround with warning ("resample is now a deferred ..") for accessing elements does not work anymore?

In [13]: s.resample('15min').mean()[0]
Out[13]: -0.27773169409289944

In [14]: s.resample('15min')[0]
KeyError: 'Column not found: 0'

I thought this last one worked (gave the same result), but warned?

@jorisvandenbossche
Copy link
Member

For case 3, I still not get the same as in 0.17.1. With this PR:

In [19]: df.groupby('group').resample('1D', fill_method='ffill')
C:\Anaconda\envs\devel\Scripts\ipython-script.py:1: FutureWarning: fill_method i
s deprecated to .resample()
the new syntax is .resample(...).ffill()
  if __name__ == '__main__':
Out[19]:
            val
date
2016-01-03    5
2016-01-10    6
2016-01-17    7
2016-01-24    8

So the actual grouper has disappeared

@jreback
Copy link
Contributor Author

jreback commented Feb 27, 2016

You get the same KeyError in 0.17.1 when you are using a DataFrame

In [5]: s.to_frame('foo').resample('15T')[0]

I will fix the Series behavior

@jreback
Copy link
Contributor Author

jreback commented Feb 27, 2016

@jorisvandenbossche

see #12486

I can't fix this now. I think its possible, but will require some work. This is NOT the same as a multi-groupby of a column / freq, because resampling does some auto-filling. To be honest this is prob a bit to magical in the past.

@jreback jreback force-pushed the resample branch 3 times, most recently from 199c3b4 to b131f8f Compare February 27, 2016 22:55
@jreback
Copy link
Contributor Author

jreback commented Mar 5, 2016

@jorisvandenbossche any comments?

@jorisvandenbossche
Copy link
Member

The error with the Series is still there (this is not necessarily related to the changes in this PR though):

In [21]: s.resample('2s').mean()[0]
Out[21]: 0.22864360896621477

In [22]: s.resample('2s')[0]
KeyError: 'Column not found: 0'

I think this should give something like iloc does?:

In [23]: s.resample('2s').iloc[0]
ValueError: .resample() is now a deferred operation
        use .resample(...).mean() instead of .resample(...)
        assignment will have no effect as you are working on a copy

@jorisvandenbossche
Copy link
Member

BTW, the "assignment will have no effect as you are working on a copy" in that error message is not really clear to me. It's not that it has no effect, as it raises an error.

@jorisvandenbossche
Copy link
Member

On the PR itself, plot change and whatsnew update looks good!

@jorisvandenbossche
Copy link
Member

Something else with getitem:

In [8]: rs['2010-01-01 09:00:00']

AbstractMethodError: This method must be defined in the concrete class of Series
GroupBy

But these getitem things are not really related to this PR, so I can open another issue for that, and then this PR can be merged

@jreback
Copy link
Contributor Author

jreback commented Mar 8, 2016

Are you sure you did with this PR? (for the getitem issue). This is explicity fixed/mentioned.

In [5]: s.resample('2s').mean()[0]
Out[5]: -0.0026492975424703812

In [6]: s.resample('2s')[0]
Out[6]: -0.0026492975424703812

@jreback
Copy link
Contributor Author

jreback commented Mar 8, 2016

took out the assignment part of the message

@jreback
Copy link
Contributor Author

jreback commented Mar 8, 2016

ok, both those errors fixed (and now they show the deprecation warning as well).

ping if ok and i'll merge (I need to squash)

@jorisvandenbossche
Copy link
Member

Looks good!

@jreback
Copy link
Contributor Author

jreback commented Mar 8, 2016

ok!

make sure .resample(...).plot() warns and returns a correct plotting object
make sure that .groupby(...).resample(....) is hitting warnings when appropriate

closes pandas-dev#12448
@benoit9126
Copy link
Contributor

benoit9126 commented Apr 19, 2016

Hi,

I have a FutureWarning from pandas 0.18.0 related to a resample operation which seems inappropriate:

import pandas as pd
import datetime as dt

df=pd.DataFrame(data=[1,3], index=[dt.timedelta(), dt.timedelta(minutes=3)])
df.resample('1T').interpolate(method='linear')

The result I get from the last line is correct but I also get the following warning:

FutureWarning: .resample() is now a deferred operation use .resample(...).mean() instead of .resample(...)

What is wrong with my syntax?
Thanks

@jorisvandenbossche
Copy link
Member

@benoit9126 No, I think this warning is correct, and I think what you want is:

df.resample('1T').mean().interpolate(method='linear')

interpolate is not a method that is available on a Resample object. You first have to indicate how you want to resample the values (in this case using mean does work because you downsample).

But that aside, it does maybe make sense to have interpolate available, as this can work similarly to to resample().fillna()

@jreback
Copy link
Contributor Author

jreback commented Apr 19, 2016

@benoit9126 as @jorisvandenbossche points out, in < 0.18.0 there was an implicit how=mean being done when you said df.resample('1T) w/o any how= arg.

@jorisvandenbossche yes, having a .interpolate method would be fine (and very straightforward to do actually). if you can open an issue with a nice example would be great.

@jorisvandenbossche
Copy link
Member

@jreback Something else I just noticed, seems like a bug in asfreq, but didn't yet test with master:

In [16]: df=pd.DataFrame(data=[1,3], index=[dt.timedelta(), dt.timedelta(minutes
=3)])
In [17]: df
Out[17]:
          0
00:00:00  1
00:03:00  3

In [18]: df.resample('1T').mean()
Out[18]:
            0
00:00:00  1.0
00:01:00  NaN
00:02:00  NaN
00:03:00  3.0

In [19]: df.resample('1T').asfreq()
Out[19]:
            0
00:00:00  1.0
00:01:00  NaN
00:02:00  NaN

Missing the the last value in asfreq?

@jreback
Copy link
Contributor Author

jreback commented Apr 19, 2016

@jorisvandenbossche hmm, that last does look like a bug, can you create an issue and i'll take a look. ty.

@benoit9126
Copy link
Contributor

benoit9126 commented Apr 19, 2016

Thanks a lot for the explanation.
Nevertheless, I keep my code as it is hoping that it won't throw a warning in pandas 0.18.2 😉

@jorisvandenbossche
Copy link
Member

@benoit9126 To be clear, there is no guarantee that #12925 will be implemented by 0.19, so it is possible that this code will start raising an error. Your responsibility :-) (or always welcome to put up a PR! I don't think it will be that hard to implement)

@benoit9126
Copy link
Contributor

@jorisvandenbossche To be honest I do not have enough time to implement this feature in a close future (especially because I have never carefully inspected the pandas core). If it changes, it will be a pleasure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Compat pandas objects compatability with Numpy or Python functions Resample resample method
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Breaking examples due to resample refactor
3 participants