Is your feature request related to a problem or challenge?
We have several forms of predicate pushdown in DataFusion's Parquet reader. The code path taken depends on the exact data layout and the predicates defined.
@itsjunetime is working on #4028 to improve performance by being more clever about some of these predicates.
The current code paths taken depend on:
Row group size
Sort order of the data within the file
File repartitioning size (how many partitions are read)
Number of row groups
Datapage size
Use predicate pushdown?
Use predicate reordering?
Describe the solution you'd like
I would like additional test coverage for correctness when reading Parquet files, especially with the various forms of pushdown enabled.
Describe alternatives you've considered
I would like to have a test that:
Creates multiple Parquet files with different orderings, row group distributions, etc.
Runs the same query against each file (the same logical input)
Compares the results from the different queries and ensures they are the same
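The steps above could be sketched roughly as follows. This is a toy, in-memory model rather than DataFusion code: `write_file`, `run_query`, and `all_layouts_agree` are hypothetical names, a "file" is just a list of row groups, and "pushdown" is modeled as min/max pruning of row groups. It only illustrates the shape of the check: write the same rows in several layouts, run one query, and compare every result against a baseline.

```rust
// Toy model only: none of these names are DataFusion or parquet crate APIs.

/// Model a "file" as a list of row groups, each holding raw values.
fn write_file(values: &[i64], row_group_size: usize, sorted: bool) -> Vec<Vec<i64>> {
    let mut data = values.to_vec();
    if sorted {
        data.sort();
    }
    data.chunks(row_group_size).map(|c| c.to_vec()).collect()
}

/// Evaluate `lo <= v <= hi`. With `pushdown`, skip row groups whose
/// min/max statistics prove that no row can match.
fn run_query(file: &[Vec<i64>], lo: i64, hi: i64, pushdown: bool) -> Vec<i64> {
    let mut out = Vec::new();
    for group in file {
        if pushdown {
            let min = *group.iter().min().unwrap();
            let max = *group.iter().max().unwrap();
            if max < lo || min > hi {
                continue; // pruned using statistics alone
            }
        }
        out.extend(group.iter().copied().filter(|v| (lo..=hi).contains(v)));
    }
    // Compare as a multiset: row order may legitimately differ between layouts.
    out.sort();
    out
}

/// Run the same query over every layout and check each result against a baseline.
fn all_layouts_agree(values: &[i64], lo: i64, hi: i64) -> bool {
    // Baseline: a single unsorted row group, no pushdown.
    let baseline = run_query(&write_file(values, values.len().max(1), false), lo, hi, false);
    for &row_group_size in &[1usize, 3, 8] {
        for &sorted in &[false, true] {
            for &pushdown in &[false, true] {
                let file = write_file(values, row_group_size, sorted);
                if run_query(&file, lo, hi, pushdown) != baseline {
                    return false;
                }
            }
        }
    }
    true
}
```

Calling `all_layouts_agree(&values, 0, 9)` on a small input returns `true` only if every layout and pushdown setting produces the same rows as the baseline. A real test would presumably write actual Parquet files and run SQL through DataFusion in place of these stand-ins.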
Parameters to check:
Row group size
Sort order
Number of row groups
Datapage size
Use predicate pushdown
Use predicate reordering
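One way to cover these parameters is to enumerate their full cross product and run the comparison once per combination. The sketch below is only an illustration: the `Scenario` struct, its field names, and the concrete values are placeholders chosen for the example, not DataFusion configuration keys or recommended sizes.

```rust
// Hypothetical scenario type; the fields mirror the parameter list above and
// are not DataFusion configuration keys.
#[derive(Debug, Clone, PartialEq)]
struct Scenario {
    row_group_size: usize,
    sorted: bool,
    num_row_groups: usize,
    data_page_size: usize,
    pushdown: bool,
    reorder: bool,
}

impl Scenario {
    /// Total rows to generate so the file has the requested number of row groups.
    fn total_rows(&self) -> usize {
        self.row_group_size * self.num_row_groups
    }
}

/// Enumerate every combination of the parameters (values are placeholders).
fn scenarios() -> Vec<Scenario> {
    let mut out = Vec::new();
    for &row_group_size in &[64usize, 4096] {
        for &sorted in &[false, true] {
            for &num_row_groups in &[1usize, 4] {
                for &data_page_size in &[128usize, 8192] {
                    for &pushdown in &[false, true] {
                        for &reorder in &[false, true] {
                            out.push(Scenario {
                                row_group_size,
                                sorted,
                                num_row_groups,
                                data_page_size,
                                pushdown,
                                reorder,
                            });
                        }
                    }
                }
            }
        }
    }
    out
}
```

With two values per parameter this yields 64 scenarios. If that becomes too slow with real files, a test could sample from the list, at the cost of less complete coverage.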
Additional context
No response