Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Compute joint null mask before calling rolling corr/cov stats #18246

Merged
merged 2 commits into from
Aug 22, 2024

Conversation

agossard
Copy link
Contributor

Fixes #18217

@agossard agossard changed the title Compute joint null mask before calling rolling corr/cov stats fix: Compute joint null mask before calling rolling corr/cov stats Aug 18, 2024
@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars and removed title needs formatting labels Aug 18, 2024
Copy link

codecov bot commented Aug 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.24%. Comparing base (a284174) to head (ff1a318).
Report is 36 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18246      +/-   ##
==========================================
+ Coverage   80.23%   80.24%   +0.01%     
==========================================
  Files        1500     1500              
  Lines      198871   198897      +26     
  Branches     2837     2837              
==========================================
+ Hits       159556   159604      +48     
+ Misses      38788    38768      -20     
+ Partials      527      525       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@agossard
Copy link
Contributor Author

Doubt the hypothesis test failure is due to this PR. Maybe it's being addressed by #18245 @MarcoGorelli ?

Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely happy about the implementation as it leaves a lot to CSE, but it is fine for now. Let's first make it correct.

Thanks! Got one comment about the test.

pl.rolling_corr("a", "lag_a", window_size=10, min_periods=5, ddof=1).tail(1)
).item()

assert val_1 == val_2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also test the actual value here?

pl.rolling_cov("a", "lag_a", window_size=10, min_periods=5, ddof=1).tail(1)
).item()

assert val_1 == val_2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also test the actual value here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the test as suggested and pushed. However, I also spent some time trying to put together a hypothesis test that would cross check these corr and cov functions against numpy. I could not get it to pass, and have an example frame which yields correlation > 1.0 :-/

df = pl.DataFrame(
{
"a": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
"b": [101.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.000061, 0.0],
}
)

df_corr = df.select(
pl.rolling_corr("a", "b", window_size=7, min_periods=5, ddof=1)
)

I don't have time to push more on this right now (or even this week maybe). But I will log a separate issue.

@ritchie46 ritchie46 merged commit 56b1219 into pola-rs:main Aug 22, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rolling_corr giving inconsistent results
2 participants