PERF: ewm passed with times is so slow compared to ewm passed with int #39784

Closed
1 task
jasonzhang2s opened this issue Feb 13, 2021 · 3 comments · Fixed by #40072
Labels: Performance (memory or execution speed performance), Window (rolling, ewma, expanding)
Milestone: 1.3

Comments

jasonzhang2s commented Feb 13, 2021

  • [yes ] I have checked that this issue has not already been reported.

  • [yes ] I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

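import pandas as pd

# 50,000 evenly spaced timestamps between 2000-01-01 and 2020-12-31
idx = pd.date_range('20000101', '20201231', periods=50000)

df = pd.DataFrame(data=range(50000), index=idx)

%%time
# halflife given as a Timedelta, with the index passed as times
df.ewm(halflife=pd.Timedelta('100d'), times=df.index).mean()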

CPU times: user 1min 13s, sys: 0 ns, total: 1min 13s
Wall time: 1min 13s


%%time
df.ewm(halflife=100).mean()

CPU times: user 1.72 ms, sys: 26 µs, total: 1.75 ms
Wall time: 1.16 ms

Problem description

ewm passed with times is far slower than ewm passed with an int halflife: about 1min 13s versus about 1 ms on the same 50,000 rows.

Expected Output

ewm passed with times should run at a similar speed to ewm passed with an int

version 1.2.2

jasonzhang2s added the Bug and Needs Triage (issue that has not been reviewed by a pandas team member) labels Feb 13, 2021
Contributor

jreback commented Feb 13, 2021

thanks @jasonzhang2s

cc @mroeschke if you can take a look when you have a chance.

jreback added the Performance (memory or execution speed performance) and Window (rolling, ewma, expanding) labels and removed the Bug and Needs Triage labels Feb 13, 2021
jreback changed the title from "BUG: ewm passed with times is so slow compared to ewm passed with int" to "PERF: ewm passed with times is so slow compared to ewm passed with int" Feb 13, 2021
jreback added this to the 1.3 milestone Feb 13, 2021
mroeschke (Member) commented

Looks like there was already an initial performance improvement over the original implementation in #37389.

The ewma with times uses a different algorithm than a regular ewma: it is O(n^2) rather than O(n), because the weights need to be recomputed at each point due to the variability of time-based weights. I am not sure if this can be done with an O(n) algorithm.

This op might be a good candidate for numba?
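
To illustrate why recomputing the weights at each point leads to quadratic cost, here is a minimal NumPy sketch. It is not the pandas implementation: the function name `ewm_mean_times_naive` is hypothetical, and it assumes `times` and `halflife` are plain numbers in the same unit, with each weight equal to 0.5 raised to the observation's age in halflives.

```python
import numpy as np


def ewm_mean_times_naive(values, times, halflife):
    """Time-weighted exponential mean that rebuilds every weight at each point.

    Hypothetical illustration only (not pandas' code). `times` and `halflife`
    are assumed to be numeric and expressed in the same unit.
    """
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    out = np.empty_like(values)
    for i in range(len(values)):
        # Age of each observation up to i, measured in halflives from times[i]
        age = (times[i] - times[: i + 1]) / halflife
        weights = 0.5 ** age
        # Each step touches i + 1 elements, so the whole loop is O(n^2)
        out[i] = np.sum(weights * values[: i + 1]) / np.sum(weights)
    return out
```

Each output point does work proportional to the number of observations before it, which would be consistent with the large gap between the two timings reported above.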

lsgiorello commented

Hi guys,

For most applications, we can use an approximation based on the recursive formula. The end result won't be the exact value, but we could add this mode for real-life use.

We can do an online algorithm based on Y_t = alpha_t * Y_{t-1} + (1 - alpha_t) * S_{t-1}.
Because multiplying the decay factors makes the exponents add up, each step only needs the previous result, which solves the issue and gives O(n) complexity.

It could be offered as a "fast ewm" option. It doesn't give 100% of the information, but it works for most applications, and it is what I see being used on a daily basis.
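
A minimal sketch of that recursion, under stated assumptions: the function name `ewm_mean_times_recursive` is hypothetical, `alpha_t` is taken to be 0.5 ** (dt / halflife) with dt the gap since the previous timestamp, and the newest observation is blended in at each step (the formula above writes S_t-1, which may just be a different indexing convention). `times` and `halflife` are assumed to be plain numbers in the same unit.

```python
import numpy as np


def ewm_mean_times_recursive(values, times, halflife):
    """Approximate time-weighted exponential mean in a single O(n) pass.

    Hypothetical illustration of the recursion discussed above, not pandas' code.
    """
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    out = np.empty_like(values)
    if len(values) == 0:
        return out
    out[0] = values[0]
    for i in range(1, len(values)):
        # Assumed decay factor: halves for every `halflife` of elapsed time
        alpha = 0.5 ** ((times[i] - times[i - 1]) / halflife)
        # Only the previous result is needed, so each step is O(1)
        out[i] = alpha * out[i - 1] + (1.0 - alpha) * values[i]
    return out
```

Note that this is not expected to reproduce the exact weighted average at every point, in line with the caveat above that the result is an approximation. Timestamps would first need converting to numeric offsets in the same unit as the halflife.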


4 participants