-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Update Parquet row filtering to handle type coercion #10716
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is looking definitely on the right path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jeffreyssmith2nd
I went over the code and test carefully and I think it looks good. I took the liberty of merging up to resolve a conflict.
I also verified that the test covers the code change by removing the code change locally and running the test. When I did so the test fails like this:
assertion failed: filtered.is_ok()
thread 'datasource::physical_plan::parquet::row_filter::test::test_filter_type_coercion' panicked at datafusion/core/src/datasource/physical_plan/parquet/row_filter.rs:549:9:
assertion failed: filtered.is_ok()
stack backtrace:
0: rust_begin_unwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:645:5
1: core::panicking::panic_fmt
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panicking.rs:72:14
2: core::panicking::panic
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panicking.rs:145:5
3: datafusion::datasource::physical_plan::parquet::row_filter::test::test_filter_type_coercion
at ./src/datasource/physical_plan/parquet/row_filter.rs:549:9
4: datafusion::datasource::physical_plan::parquet::row_filter::test::test_filter_type_coercion::{{closure}}
at ./src/datasource/physical_plan/parquet/row_filter.rs:486:35
5: core::ops::function::FnOnce::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
6: core::ops::function::FnOnce::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
error: test failed, to rerun pass `--lib`
error: 1 target failed:
`--lib`
I have some suggestions for the test, but otherwise I think this PR is ready to go.
@Ted-Jiang and @thinkharderdev I wonder if you have a moment to review this too as I believe you actively use the filter pushdown into parquet
datafusion/core/src/datasource/physical_plan/parquet/row_filter.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @jeffreyssmith2nd
I plan to leave this open for another day in case there are additional comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice. lgtm
Thank you @jeffreyssmith2nd and @thinkharderdev |
Which issue does this PR close?
Closes #7925.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?