feat(python): Make expressions containing Python UDFs serializable #18135

Merged
ritchie46 merged 6 commits into main from cloud-allow-lambda on Sep 2, 2024

Conversation

stinodego

@stinodego stinodego commented Aug 11, 2024

Changes:

  • Update Expr::AnonymousFunction to implement SerDe for Python functions.
  • Update Expr::RenameAlias to raise an error when serializing instead of simply being skipped. Can implement SerDe for this later.
  • Update the Polars Cloud prep functionality accordingly.
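
To illustrate why Python functions need dedicated SerDe handling, here is a minimal standard-library sketch. It is not the Polars implementation, and the helper names `can_pickle`, `serialize_udf`, and `deserialize_udf` are hypothetical. Plain `pickle` stores functions by reference (module plus name), so a lambda cannot be pickled. Serializing the bytecode by value, shown here with `marshal`, is one way around that, under the assumption that the function has no closure and no default arguments.

```python
import builtins
import marshal
import pickle
import types


def can_pickle(obj) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Lambdas and local functions cannot be looked up by name,
        # so pickling them by reference fails.
        return False


def serialize_udf(fn) -> bytes:
    """Serialize a closure-free function by value (hypothetical helper)."""
    if fn.__closure__ is not None:
        raise ValueError("functions with closures are not supported in this sketch")
    # marshal output is specific to the Python version that produced it.
    return marshal.dumps(fn.__code__)


def deserialize_udf(data: bytes):
    """Rebuild a function from bytes produced by serialize_udf."""
    code = marshal.loads(data)
    # Give the rebuilt function access to builtins only.
    return types.FunctionType(code, {"__builtins__": builtins})


double = lambda x: x * 2
payload = serialize_udf(double)
restored = deserialize_udf(payload)
```

In this sketch `restored` behaves like `double` after the round trip, whereas `can_pickle(double)` reports that the reference-based approach fails for the same lambda.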

@github-actions github-actions bot added labels internal (An internal refactor or improvement), python (Related to Python Polars), rust (Related to Rust Polars) Aug 11, 2024

codspeed-hq bot commented Aug 11, 2024

CodSpeed Performance Report

Merging #18135 will not alter performance

Comparing cloud-allow-lambda (2d9704f) with main (c03e760)

Summary

✅ 37 untouched benchmarks


codecov bot commented Aug 11, 2024

Codecov Report

Attention: Patch coverage is 63.82979% with 34 lines in your changes missing coverage. Please review.

Project coverage is 79.87%. Comparing base (c03e760) to head (2d9704f).
Report is 13 commits behind head on main.

Files with missing lines                      Patch %   Lines
crates/polars-plan/src/dsl/expr_dyn_fn.rs     63.15%    21 Missing ⚠️
crates/polars-plan/src/dsl/python_udf.rs      63.88%    13 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18135      +/-   ##
==========================================
+ Coverage   79.86%   79.87%   +0.01%     
==========================================
  Files        1501     1501              
  Lines      202029   202069      +40     
  Branches     2868     2868              
==========================================
+ Hits       161342   161400      +58     
+ Misses      40140    40122      -18     
  Partials      547      547              


@stinodego stinodego marked this pull request as draft August 12, 2024 08:35
@stinodego stinodego force-pushed the cloud-allow-lambda branch 5 times, most recently from 9f310f5 to bf444f8 Compare August 30, 2024 11:10
@stinodego stinodego marked this pull request as ready for review August 30, 2024 12:44
@stinodego stinodego changed the title from "refactor: Allow Python UDFs in Polars Cloud plans" to "feat(python): Make LazyFrames containing Python UDFs serializable" Aug 30, 2024
@github-actions github-actions bot added the enhancement (New feature or an improvement of an existing feature) label Aug 30, 2024
@stinodego stinodego changed the title from "feat(python): Make LazyFrames containing Python UDFs serializable" to "feat(python): Make expressions containing Python UDFs serializable" Aug 30, 2024

@ritchie46 ritchie46 left a comment


Looks great. Got some minor comments.

where
    D: Deserializer<'a>,
{
    use serde::de::Error;
So we could add a PythonRenameAlias as well, I presume?


@stinodego stinodego Sep 2, 2024


Yes - we'd have to do something similar to PythonUdfExpression where we store the lambda as a PyObject.

But it's not trivial, and name.map and name.map_fields are implemented differently. I'd rather pick these up separately: I don't think they are too important, and it will require some minor refactoring.
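
For reference, the change described in the PR summary for Expr::RenameAlias (raise an error on serialization rather than silently skip the node) can be sketched as follows. This is a toy Python model, not the Rust code in polars-plan; the classes `Column` and `RenameAlias` and the functions `to_plain` and `serialize` are hypothetical stand-ins.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Column:
    """Toy stand-in for a column expression."""
    name: str


@dataclass
class RenameAlias:
    """Toy stand-in for an expression that renames via a Python callable."""
    function: Callable[[str], str]
    expr: object


def to_plain(expr) -> dict:
    """Convert a toy expression into JSON-compatible data."""
    if isinstance(expr, Column):
        return {"Column": expr.name}
    if isinstance(expr, RenameAlias):
        # Fail loudly: dropping the node would silently change the result.
        raise TypeError("serialization of a name-mapping function is not supported")
    raise TypeError(f"unknown expression: {expr!r}")


def serialize(expr) -> str:
    """Serialize a toy expression to a JSON string."""
    return json.dumps(to_plain(expr))
```

Here `serialize(Column("a"))` succeeds, while `serialize(RenameAlias(str.upper, Column("a")))` raises `TypeError`, so the caller learns that the plan cannot be round-tripped instead of receiving one that quietly lost its rename step.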

crates/polars-plan/src/dsl/python_udf.rs (outdated review thread, resolved)
@ritchie46 ritchie46 merged commit 8a20588 into main Sep 2, 2024
26 checks passed
@ritchie46 ritchie46 deleted the cloud-allow-lambda branch September 2, 2024 09:47
@c-peters c-peters added the accepted (Ready for implementation) label Sep 3, 2024