[FEAT] Enable Actor Pool UDFs by default #3488
CodSpeed Performance Report
Merging #3488 will degrade performances by 26.92%
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3488      +/-   ##
==========================================
+ Coverage   77.00%   77.38%   +0.38%
==========================================
  Files         696      699       +3
  Lines       86039    84942    -1097
==========================================
- Hits        66256    65736     -520
+ Misses      19783    19206     -577
Seems good to me, with some overall comments:
- I think we should take a stab at wrapping user-provided functions in a no-op class (perhaps as a follow-on)
- Let's check the user experience around error handling/reporting of initializations
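To make the first bullet concrete, here is a minimal sketch of what wrapping a plain function in a no-op class could look like. `wrap_in_noop_class` and `add_one` are hypothetical names used only for illustration; this is not code from the PR, just one way a stateless function could be given the same construct-then-call shape as a class-based UDF.

```python
from typing import Any, Callable


def wrap_in_noop_class(fn: Callable[..., Any]) -> type:
    """Return a class whose __init__ does nothing and whose __call__ forwards to fn.

    Hypothetical helper for illustration only.
    """

    class _NoOpWrapper:
        def __init__(self) -> None:
            # Nothing to initialize: a plain function carries no state.
            pass

        def __call__(self, *args: Any, **kwargs: Any) -> Any:
            return fn(*args, **kwargs)

    # Keep the user's function name so error messages stay recognizable.
    _NoOpWrapper.__name__ = getattr(fn, "__name__", "wrapped_fn")
    return _NoOpWrapper


def add_one(x: int) -> int:
    return x + 1


AddOne = wrap_in_noop_class(add_one)
instance = AddOne()  # "initialization" is a no-op
```

With a wrapper like this, function UDFs and class UDFs could share a single initialize-then-call code path, at the cost of one extra call per invocation.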
    expression: PyExpr,
) -> dict[str, tuple[PartialStatefulUDF, InitArgsType]]: ...
def bind_stateful_udfs(expression: PyExpr, initialized_funcs: dict[str, Callable]) -> PyExpr: ...
def initialize_udfs(expression: PyExpr) -> PyExpr: ...
Can we check what initialization errors look like using this instead of the bind_stateful_udfs approach?
One thing I like about extract_partial_stateful_udf_py + bind_stateful_udfs is that the initialization happens in Python-land, so we get nice backtraces and stuff if initialization code (which is in "user-space") errors out.
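A rough sketch of why Python-side initialization gives readable backtraces, under stated assumptions: the mapping shape below (name to class plus init args) only loosely mirrors the extract/bind signatures shown in the diff, and `BrokenModel` and `initialize_in_python` are hypothetical stand-ins rather than anything from the codebase.

```python
import traceback
from typing import Any

# Hypothetical stand-in: UDF name -> (class to construct, (args, kwargs) for __init__).
PartialUDFs = dict[str, tuple[type, tuple[tuple, dict]]]


class BrokenModel:
    """User-space class whose initialization fails."""

    def __init__(self, path: str) -> None:
        raise FileNotFoundError(f"no model at {path}")

    def __call__(self, x: int) -> int:
        return x


def initialize_in_python(partials: PartialUDFs) -> dict[str, Any]:
    """Construct every UDF in Python so failures keep their full traceback."""
    initialized = {}
    for name, (cls, (args, kwargs)) in partials.items():
        try:
            initialized[name] = cls(*args, **kwargs)
        except Exception as exc:
            # Chain the user's exception so both frames appear in the report.
            raise RuntimeError(f"failed to initialize UDF {name!r}") from exc
    return initialized


try:
    initialize_in_python({"broken_model": (BrokenModel, (("/tmp/missing",), {}))})
except RuntimeError as err:
    # The formatted report contains the user's __init__ frame and the wrapper message.
    report = "".join(traceback.format_exception(type(err), err, err.__traceback__))
```

Because the exception is raised and caught inside the same interpreter, the user's `__init__` frame survives in the chained traceback, which is the property worth comparing against the initialize_udfs path.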
For actor pool UDFs it's not as nice, but that has always been the case. The native executor actually does a decent job, but since things run in a separate process, it's just not possible to do it as well.
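As a generic illustration of the separate-process limitation (this is not how the actor pool is implemented, only a sketch using the standard library), a parent process sees a child's initialization failure as an exit code and some text, not as a live exception object with frames it can inspect. `CHILD_SOURCE` and `BrokenModel` are hypothetical.

```python
import subprocess
import sys

# Hypothetical user code that fails during __init__, run in a child interpreter.
CHILD_SOURCE = """
class BrokenModel:
    def __init__(self):
        raise ValueError("bad init_args")

BrokenModel()
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD_SOURCE],
    capture_output=True,
    text=True,
)

# The parent only learns that the child failed, plus whatever text it printed.
child_failed = result.returncode != 0
child_traceback = result.stderr
```

Any nicer reporting across the process boundary has to be built by serializing the traceback text (or a summary of it) and re-raising on the parent side, which is why it tends to be less faithful than an in-process error.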
Terminates once it receives None.
"""
initialized_projection = ExpressionsProjection([e._initialize_udfs() for e in uninitialized_projection])
Yeah, curious what the error handling looks like here (e.g. bad init_args, runtime errors in __init__)
Todo: