Nested predicate push down to Parquet Reader #7045

Closed
wants to merge 9 commits into from


zhenxiao
Collaborator

Built on top of #6892.

Currently, the Parquet TupleDomain is constructed from HiveColumnHandle. This does not work when nested predicates are pushed down, e.g.
select s.a from t where s.b > 10

In this implementation:
Analyze s.b and put s.b > 10 into ExtractionResult as an optional nested predicate
Add the nested predicate to TableLayout
Pass the nested predicate to the file scan, the same way as a flat predicate
Skip reading row groups when the nested predicate does not match the Parquet statistics
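
To make the last step concrete, here is a minimal sketch of statistics-based row-group skipping for a nested field. It is not the code from this PR: `NestedPredicatePruning`, `LongStats`, `GreaterThan`, and the dotted path key `"s.b"` are hypothetical names chosen for illustration, and the sketch assumes each row group exposes min/max statistics per leaf column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NestedPredicatePruning {
    // Hypothetical min/max statistics for one leaf column within one row group.
    static final class LongStats {
        final long min;
        final long max;

        LongStats(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    // Hypothetical predicate "path > lowerBound", keyed by a dotted leaf path such as "s.b".
    static final class GreaterThan {
        final String path;
        final long lowerBound;

        GreaterThan(String path, long lowerBound) {
            this.path = path;
            this.lowerBound = lowerBound;
        }
    }

    // Returns true only when the statistics prove that no value can satisfy the predicate.
    // Missing statistics mean the row group must still be read.
    static boolean canSkip(Map<String, LongStats> rowGroupStats, GreaterThan predicate) {
        LongStats stats = rowGroupStats.get(predicate.path);
        if (stats == null) {
            return false;
        }
        return stats.max <= predicate.lowerBound;
    }

    // Returns the indexes of the row groups that cannot be skipped.
    static List<Integer> rowGroupsToRead(List<Map<String, LongStats>> rowGroups, GreaterThan predicate) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < rowGroups.size(); i++) {
            if (!canSkip(rowGroups.get(i), predicate)) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, LongStats>> rowGroups = List.of(
                Map.of("s.b", new LongStats(1, 10)),   // max is 10, so s.b > 10 can never match
                Map.of("s.b", new LongStats(5, 42)),   // may contain matching rows
                Map.of("s.a", new LongStats(0, 3)));   // no statistics for s.b, must be read

        // Predicate from the example query: s.b > 10
        System.out.println(rowGroupsToRead(rowGroups, new GreaterThan("s.b", 10))); // prints [1, 2]
    }
}
```

Keying the statistics by the full leaf path rather than by the top-level column is what lets a predicate on `s.b` be checked separately from `s.a`. The check is deliberately conservative: a row group is skipped only when its statistics rule out every possible match.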

@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch 3 times, most recently from 6afa71e to c073b6b Compare January 18, 2017 01:43
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch 2 times, most recently from ffe3335 to 47dfe1c Compare January 25, 2017 14:04
@zhenxiao
Collaborator Author

@dain @nezihyigitbasi @martint any comments or suggestions?

@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from a80cd4f to 6334ef9 Compare March 1, 2017 00:04
@zhenxiao
Collaborator Author

zhenxiao commented Mar 1, 2017

@martint @dain @nezihyigitbasi comments or suggestions?

@dain
Contributor

dain commented Mar 10, 2017

In an offline discussion, we decided that @zhenxiao would investigate using synthetic virtual columns in the connector to enable this push-down feature.

@dain dain removed their request for review March 10, 2017 18:42
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from 6334ef9 to 6422c0a Compare July 7, 2017 07:31
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from 6422c0a to 3aef172 Compare March 11, 2019 21:14
@zhenxiao zhenxiao force-pushed the parquet-nested-predicate branch from 3aef172 to 0f34d07 Compare March 11, 2019 22:52
@mbasmanova
Contributor

I assume #13271 supersedes this one, hence closing.

@mbasmanova mbasmanova closed this Aug 27, 2019
@zhenxiao zhenxiao deleted the parquet-nested-predicate branch January 22, 2022 15:33
5 participants