-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Derive predicate over base column for CAST(a_varchar AS date) > a_date_constant
condition
#12925
Comments
For the record, #12795 #12918 improve Iceberg pushdown capabilities, but only when the partition key is of a temporal type. In this issue, your Iceberg table doesn't seem partitioned and Hive table is partitioned. |
The iceberg table t1 is partitioned, just the show create table result can't tell it. Below is what I got from the metadata file: |
as far as I can understand, the push down does not succeed because the dt is a varchar type, while timestamp(6) is expected, at here |
the actual way to test this is to count number of source splits |
I am not sure it's addressed by #9830. As for the latest release 387, such query will keep coordinator busy for long time, and then timeout (the table is relative big, with millions of parquet files, and millions of partitions) |
do you have a stacktrace? |
here is one
|
and in Tableau, we got following error message: |
Edit: disregard. |
When We still could derive a predicate directly over |
CAST(a_varchar AS date) > a_date_constant
condition
BTW, for Iceberg i do strongly recommend to switch your column data type to be the issue may still be useful for existing data sets with suboptimal data type, and also useful for Hive. |
the table is a successor of a hive table, so we kept the data type. |
It will work okay but you loose the ability to use any date transforms: https://iceberg.apache.org/spec/#partition-transforms |
can we treat identity transform separately, so iceberg can match hive's perf in this scenario |
For iceberg table, if partition key in an expression (cast) in the where clause, then it's not pushed down to table scan. while hive table does.
below is the explain output of a query, which the partition key dt is casted to date (this is required by Tableau for incremental refresh), the predicate is not pushed down to table scan.
for a similar HIVE table , the condition is pushed down.
The text was updated successfully, but these errors were encountered: