Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: fastpath on Period construction from PeriodIndex #5155

Closed
jreback opened this issue Oct 8, 2013 · 11 comments · Fixed by #7891
Closed

PERF: fastpath on Period construction from PeriodIndex #5155

jreback opened this issue Oct 8, 2013 · 11 comments · Fixed by #7891
Labels
Frequency DateOffsets Performance Memory or execution speed performance Period Period data type
Milestone

Comments

@jreback
Copy link
Contributor

jreback commented Oct 8, 2013

vbench already in timeseries/period_setitem
very inefficient because on the boxing of the PeriodIndex (to an object index), it recomputes the freq for each Period (even though it is passed), need a fastpath on this type of construction

In [1]: rng = period_range('1/1/1990', freq='S', periods=20000)

In [2]: df = DataFrame(index=range(len(rng)))

In [3]: def f():
   ...:     df['col'] = rng
   ...:     
   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    40000    0.055    0.000    0.095    0.000 abc.py:128(__instancecheck__)
    20000    0.043    0.000    0.345    0.000 period.py:66(__init__)
    20000    0.037    0.000    0.381    0.000 period.py:683(<lambda>)
    80031    0.037    0.000    0.131    0.000 {isinstance}
    20000    0.034    0.000    0.223    0.000 frequencies.py:76(get_freq_code)
    60003    0.029    0.000    0.029    0.000 _weakrefset.py:68(__contains__)
    20000    0.028    0.000    0.063    0.000 frequencies.py:464(_base_and_stride)
    20000    0.028    0.000    0.028    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
    20000    0.027    0.000    0.038    0.000 frequencies.py:767(_period_str_to_code)
    40000    0.023    0.000    0.144    0.000 common.py:1730(is_integer)
        1    0.016    0.016    0.398    0.398 {pandas.lib.map_infer}
    40007    0.010    0.000    0.010    0.000 {getattr}
    20000    0.009    0.000    0.012    0.000 frequencies.py:112(_get_freq_str)
    60000    0.008    0.000    0.008    0.000 {method 'get' of 'dict' objects}
    40000    0.006    0.000    0.006    0.000 {method 'group' of '_sre.SRE_Match' objects}
    20000    0.004    0.000    0.004    0.000 {method 'lower' of 'str' objects}
    20000    0.002    0.000    0.002    0.000 {method 'upper' of 'str' objects}
    20015    0.002    0.000    0.002    0.000 {len}
@jtratner
Copy link
Contributor

@jreback did you resolve this already? I might have to deal with this soon anyways, as I'm about to tackle Period and DatetimeIndex fixups.

@jreback
Copy link
Contributor Author

jreback commented Oct 15, 2013

nope
it shouldn't be too hard; Period needs a fast path where it just sets stuff

it works now but does lots of work when u have a long PeriodIndex

@jreback
Copy link
Contributor Author

jreback commented Oct 15, 2013

IOW it reinterprets the freq code on every Period construction even when u r passing valid values (but it doesn't know that)

@jtratner
Copy link
Contributor

You should also make sure you're fastpathing asobject (not sure if it matters for PI though)

@jreback
Copy link
Contributor Author

jreback commented Oct 15, 2013

not planning in this till 0.14

@jtratner
Copy link
Contributor

got it

On Mon, Oct 14, 2013 at 11:10 PM, jreback notifications@github.com wrote:

not planning in this till 0.14


Reply to this email directly or view it on GitHubhttps://github.com//issues/5155#issuecomment-26306475
.

@jreback
Copy link
Contributor Author

jreback commented Apr 9, 2014

revisit after #5148

@nehalecky
Copy link
Contributor

Any relevance: #2949?

@jreback
Copy link
Contributor Author

jreback commented Apr 24, 2014

that's a separate issue...this is a pretty easy fix

@nehalecky
Copy link
Contributor

Thanks, just wanted to throw it in the mix. I love (conceptually) PeriodIndexs, but they need a lot of tlc soon to be all they can be. :)

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Apr 25, 2014
@jreback jreback modified the milestones: 0.15.1, 0.15.0 Aug 3, 2014
@jreback
Copy link
Contributor Author

jreback commented Aug 3, 2014

@nehalecky turns out this was pretty trivial, see #7891

Period constructor has lots of inference going on. So when you know that you simply want to wrap an ordinal/freq its trivial to construct. (you can usually only do this AFTER its constructed, but that's usually when you pay the cost, eg. when passing to matplotlib or setting actual periods in a data frame).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Frequency DateOffsets Performance Memory or execution speed performance Period Period data type
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants