This loader is intended to work well with Delta's data skipping feature to enable efficient downstream queries in the warehouse. It should work like this:
1. The loader outputs a file in which the `load_tstamp` field is set to a single uniform value.
2. The Delta metadata file contains statistics, so it knows the `load_tstamp` value for each file without needing to open the file.
3. An incremental query is run with a clause such as `SELECT ? WHERE load_tstamp > ?`.
4. The query engine goes directly to the relevant files, using the Delta metadata. It does not need to scan every file in the partition.
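
The file-pruning step above can be sketched in plain Python. This is only an illustration of the idea, not Delta's implementation: the `FileStats` class, the `files_to_scan` function, the file names, and the timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file min/max of load_tstamp, as a metadata log might record it."""
    path: str
    min_load_tstamp: str  # ISO-8601 UTC strings of equal length sort chronologically
    max_load_tstamp: str


def files_to_scan(stats, lower_bound):
    """Return paths of files that may contain rows with load_tstamp > lower_bound."""
    # A file can be skipped when even its maximum value fails the predicate.
    return [s.path for s in stats if s.max_load_tstamp > lower_bound]


# Each file holds a single uniform load_tstamp, so min == max.
stats = [
    FileStats("part-0001.parquet", "2023-06-01T10:00:00Z", "2023-06-01T10:00:00Z"),
    FileStats("part-0002.parquet", "2023-06-01T10:05:00Z", "2023-06-01T10:05:00Z"),
    FileStats("part-0003.parquet", "2023-06-01T10:10:00Z", "2023-06-01T10:10:00Z"),
]

selected = files_to_scan(stats, "2023-06-01T10:05:00Z")
print(selected)  # ['part-0003.parquet']
```

Because every file carries one uniform `load_tstamp`, the min/max range for each file is a single point, which makes the pruning decision exact rather than approximate.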
However, this feature only works if `load_tstamp` is one of the first few columns in the table (by default, Delta collects statistics only for the first 32 columns). Currently, the `load_tstamp` column is the 129th column, so Delta collects no statistics for it and the query cannot skip files.
The solution is to re-order the atomic columns when we first create the table.
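
A minimal sketch of the re-ordering idea, in plain Python rather than the loader's own code. The `reorder_columns` and `has_stats` helpers and the placeholder column names are hypothetical; the limit of 32 is assumed from Delta's default for the `delta.dataSkippingNumIndexedCols` table property.

```python
# Assumed default number of leading columns for which Delta collects statistics.
STATS_INDEXED_COLS = 32


def reorder_columns(columns, first):
    """Move the columns named in `first` to the front, keeping the rest in order."""
    missing = [c for c in first if c not in columns]
    if missing:
        raise ValueError(f"unknown columns: {missing}")
    front = list(first)
    rest = [c for c in columns if c not in front]
    return front + rest


def has_stats(columns, name, indexed=STATS_INDEXED_COLS):
    """True if `name` falls within the leading columns that get statistics."""
    return columns.index(name) < indexed


# Stand-in schema: 128 placeholder columns, then load_tstamp as the 129th.
original = [f"col_{i}" for i in range(1, 129)] + ["load_tstamp"]
reordered = reorder_columns(original, ["load_tstamp"])

print(has_stats(original, "load_tstamp"))   # False
print(has_stats(reordered, "load_tstamp"))  # True
```

Applying the new order only when the table is first created avoids having to rewrite the schema of an existing table.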