Not skipping hive partitioned data when using is_in
#13358
alexander-beedie changed the title from "Not skipping hive partitioned data when using isin" to "Not skipping hive partitioned data when using is_in" on Jan 1, 2024
bchalk101 pushed commits to bchalk101/polars that referenced this issue on Jan 4 (four commits), Jan 7, and Jan 8, 2024:
fix(rust): Fix hive partitioned files not being skipped (pola-rs#13358)
ritchie46 pushed a commit that referenced this issue on Jan 8, 2024
Checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Log output
Issue description
When using is_in on the partition column, every file between the first and the last partition listed in the is_in values is labeled as "parquet file must be read, statistics not sufficient for predicate".
Expected behavior
There is no reason to read so many files. Only the first partition and the 100th partition should be read. Even for those two, there should be no need to read the actual file contents, because the query has no filter beyond what the partition and the parquet file statistics already cover.
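To illustrate the gap between the observed and the expected behavior, here is a minimal pure-Python sketch. It is not Polars code and makes no claim about how Polars evaluates the predicate internally. The dataset layout, the partition key name `part`, and the helpers `partition_value`, `prune_by_membership`, and `prune_by_range` are all hypothetical. The sketch contrasts pruning by exact membership with pruning by the min/max range of the requested values, which would produce the pattern described in this issue.

```python
from pathlib import PurePosixPath


def partition_value(path: str, key: str) -> int:
    """Extract an integer hive partition value (key=value) from a file path."""
    for part in PurePosixPath(path).parts:
        name, sep, value = part.partition("=")
        if sep and name == key:
            return int(value)
    raise ValueError(f"no partition {key!r} in {path!r}")


def prune_by_membership(paths, key, wanted):
    """Keep only files whose partition value is one of the requested values."""
    wanted = set(wanted)
    return [p for p in paths if partition_value(p, key) in wanted]


def prune_by_range(paths, key, wanted):
    """Keep every file whose partition value lies between the min and max requested."""
    lo, hi = min(wanted), max(wanted)
    return [p for p in paths if lo <= partition_value(p, key) <= hi]


# Hypothetical layout: 200 partitions named part=1 ... part=200.
paths = [f"dataset/part={i}/data.parquet" for i in range(1, 201)]
wanted = [1, 100]  # stands in for the values passed to is_in

exact = prune_by_membership(paths, "part", wanted)
ranged = prune_by_range(paths, "part", wanted)
print(len(exact), len(ranged))
```

With these assumed inputs, membership pruning keeps 2 files, while range pruning keeps all 100 files from part=1 through part=100.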
Installed versions