Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CHORE] Add check for stateful UDF outside of project #2771

Merged
merged 5 commits into from
Sep 4, 2024

Conversation

kevinzwang
Copy link
Member

@kevinzwang kevinzwang commented Aug 30, 2024

Add a parameter to resolve_expr and its related functions to allow for stateful UDFs in the expression. This is set to false everywhere except for projects.

Additionally, df['column name'] used to call resolve_exprs to check for validity and returned its output, which would be problematic if there were wildcards in the expression. Now, I've created a function that only does the validity check, and DataFrame.__getitem__ would just return col(name), which will be actually resolved later in the builder.

@kevinzwang kevinzwang requested a review from jaychia August 30, 2024 18:08
@github-actions github-actions bot added the chore label Aug 30, 2024
Copy link

codspeed-hq bot commented Aug 30, 2024

CodSpeed Performance Report

Merging #2771 will degrade performances by 55.47%

Comparing kevin/resolve-expr-stateful-udf (3dbea60) with main (a97d871)

Summary

❌ 2 regressions
✅ 14 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main kevin/resolve-expr-stateful-udf Change
test_count[1 Small File] 10.1 ms 22.6 ms -55.47%
test_show[100 Small Files] 55.8 ms 64.2 ms -13%

Copy link

codecov bot commented Aug 30, 2024

Codecov Report

Attention: Patch coverage is 85.56701% with 14 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@a97d871). Learn more about missing BASE report.
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/daft-dsl/src/resolve_expr.rs 78.46% 14 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2771   +/-   ##
=======================================
  Coverage        ?   63.26%           
=======================================
  Files           ?     1001           
  Lines           ?   113771           
  Branches        ?        0           
=======================================
  Hits            ?    71980           
  Misses          ?    41791           
  Partials        ?        0           
Files with missing lines Coverage Δ
daft/dataframe/dataframe.py 86.00% <100.00%> (ø)
src/daft-dsl/src/expr.rs 74.37% <100.00%> (ø)
src/daft-dsl/src/lib.rs 100.00% <100.00%> (ø)
src/daft-dsl/src/python.rs 94.12% <100.00%> (ø)
...rc/daft-plan/src/logical_ops/actor_pool_project.rs 40.27% <100.00%> (ø)
src/daft-plan/src/logical_ops/agg.rs 52.63% <100.00%> (ø)
src/daft-plan/src/logical_ops/explode.rs 79.54% <100.00%> (ø)
src/daft-plan/src/logical_ops/filter.rs 58.33% <100.00%> (ø)
src/daft-plan/src/logical_ops/join.rs 91.54% <100.00%> (ø)
src/daft-plan/src/logical_ops/pivot.rs 71.42% <100.00%> (ø)
... and 7 more

src/daft-dsl/src/resolve_expr.rs Show resolved Hide resolved
src/daft-dsl/src/resolve_expr.rs Outdated Show resolved Hide resolved
Copy link

netlify bot commented Sep 3, 2024

Deploy Preview for daft-docs canceled.

Name Link
🔨 Latest commit 3dbea60
🔍 Latest deploy log https://app.netlify.com/sites/daft-docs/deploys/66d77331764b1300084f04d8

@kevinzwang kevinzwang requested a review from jaychia September 3, 2024 20:33
@kevinzwang kevinzwang merged commit c5a4adc into main Sep 4, 2024
40 of 41 checks passed
@kevinzwang kevinzwang deleted the kevin/resolve-expr-stateful-udf branch September 4, 2024 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants