We are experiencing full table scans in a query that operates on a Delta Lake dataset using a Trino connection. The query involves multiple joins and filters on partitioned and Z-order indexed columns.
The following optimizations are in place:
The table is partitioned on the created_date column.
Z-ordering has been applied to the columns used in the joins (b_id, c_id) and in the filters (entity_id, created_date).
Despite these, the query still results in full table scans regardless of the created_date range specified (e.g., 1 day, 1 week, or 2 weeks). This raises concerns about the effectiveness of Z-order indexing in our setup and about overall query performance.
We need to determine whether Z-order indexing is providing any performance benefits and identify alternative approaches to optimize the query execution plan and avoid full table scans.
Questions for Discussion:
Z-order Indexing Impact:
Does Z-order indexing improve query performance in this specific use case with Delta Lake and Trino?
Are there scenarios where Z-order indexing might not provide noticeable benefits?
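To make the question above concrete: as far as we understand, Z-ordering does not build a separate index structure. It rewrites data files so that the per-file min/max statistics of the chosen columns become narrow enough for an engine to skip files. The sketch below is a simplified model of that file-level skipping, not Trino's or Delta Lake's actual implementation. FileStats, layout, and files_to_read are hypothetical names, and the entity_id values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file min/max statistics for a single column."""
    path: str
    min_value: int
    max_value: int


def layout(values: List[int], rows_per_file: int) -> List[FileStats]:
    """Split values into files in the given order and record min/max per file."""
    files = []
    for i in range(0, len(values), rows_per_file):
        chunk = values[i:i + rows_per_file]
        files.append(FileStats(f"part-{i // rows_per_file}", min(chunk), max(chunk)))
    return files


def files_to_read(files: List[FileStats], lo: int, hi: int) -> List[FileStats]:
    """Keep only files whose [min, max] range overlaps the filter range [lo, hi]."""
    return [f for f in files if f.max_value >= lo and f.min_value <= hi]


# Assumption: 1000 distinct entity_id values, 100 rows per file.
entity_ids = list(range(1000))

# Clustered layout: each file holds 100 consecutive ids, so ranges are narrow.
clustered = layout(sorted(entity_ids), 100)

# Scattered layout: each file holds every 10th id, so every file spans
# almost the whole value range and no file can be skipped.
scattered = layout([v for start in range(10) for v in range(start, 1000, 10)], 100)

# Filter: entity_id BETWEEN 250 AND 260
clustered_hits = files_to_read(clustered, 250, 260)
scattered_hits = files_to_read(scattered, 250, 260)
print(len(clustered_hits), "of", len(clustered), "files read (clustered)")
print(len(scattered_hits), "of", len(scattered), "files read (scattered)")
```

In this model, skipping only helps when the filtered column's values are tightly clustered per file. Spreading the clustering across many columns (here b_id, c_id, entity_id, and created_date) would be expected to widen each column's per-file range, which is one scenario where Z-ordering might show little benefit.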
Query Optimization:
Are there alternative approaches to optimize the query execution plan to prevent full table scans?
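One thing we could check first is that the created_date filter reaches the engine as a plain comparison against date literals, since a filter wrapped in a function or cast may be harder for an engine to turn into partition pruning. This is an assumption about a possible cause, not a diagnosis of our query. The helper below is a minimal sketch that builds such a predicate and wraps a query in EXPLAIN ANALYZE so the reported input rows can be compared across date ranges. The table name events and both function names are hypothetical.

```python
from datetime import date, timedelta


def created_date_predicate(end: date, days: int, column: str = "created_date") -> str:
    """Build an inclusive range filter on the partition column using DATE literals."""
    if days < 1:
        raise ValueError("days must be at least 1")
    start = end - timedelta(days=days - 1)
    return f"{column} BETWEEN DATE '{start.isoformat()}' AND DATE '{end.isoformat()}'"


def explain_analyze(query: str) -> str:
    """Prefix a query with EXPLAIN ANALYZE to inspect the executed plan."""
    return f"EXPLAIN ANALYZE {query}"


# Hypothetical table name; the real query and its joins are not shown here.
predicate = created_date_predicate(date(2024, 1, 14), days=7)
statement = explain_analyze(f"SELECT count(*) FROM events WHERE {predicate}")
print(statement)
```

Running the generated statement for 1-day, 1-week, and 2-week ranges and comparing the input rows reported for the table scan would show whether the date range changes how much data is actually read.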
Trino-Specific Behaviors:
Does Trino’s query execution engine fully leverage Z-order indexing for data pruning in Delta Lake?
Are there specific configurations or best practices for Trino to better utilize Z-order indexes?