Order chunks for compression by range_start

Previously, the order in which chunks were compressed was non-deterministic: chunks were usually processed in chunk id order, but this was not guaranteed. And while chunk id typically corresponds with range_start, it does not always do so: backfilling can create chunks that have larger chunk ids but cover older time ranges. In general, it seems preferable to compress chunks from oldest to newest. This matters even more with the experimental compress_chunk_time_interval option, because oldest-to-newest order allows for the most efficient roll-ups. Otherwise, a chunk compressed out of order can "cut off" a rolled-up chunk before it is completely full.
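
The difference between the two orderings can be illustrated with a small sketch. This is not the policy code, only a toy model: the `Chunk` class, the ids, and the `range_start` values are made up for illustration, with chunk 4 standing in for a chunk created later by a backfill of older data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: int     # hypothetical id, assigned in creation order
    range_start: int  # hypothetical start of the chunk's time range


# Chunks 1-3 were created as data arrived; chunk 4 was created last,
# by a backfill, but covers the oldest time range.
chunks = [
    Chunk(chunk_id=1, range_start=10),
    Chunk(chunk_id=2, range_start=20),
    Chunk(chunk_id=3, range_start=30),
    Chunk(chunk_id=4, range_start=0),
]

# Ordering by chunk id visits the backfilled (oldest) chunk last.
by_id = [c.chunk_id for c in sorted(chunks, key=lambda c: c.chunk_id)]

# Ordering by range_start visits chunks oldest to newest.
by_range_start = [c.chunk_id for c in sorted(chunks, key=lambda c: c.range_start)]

print(by_id)           # [1, 2, 3, 4]
print(by_range_start)  # [4, 1, 2, 3]
```

With id order the oldest chunk is handled last, while ordering by range_start puts it first, which is the order the ORDER BY clause in this change is meant to enforce.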
timescale · Jul 23, 2024 · c7a5c90
1 parent d18361e
Showing 1 changed file with 4 additions and 0 deletions.
diff --git a/sql/policy_internal.sql b/sql/policy_internal.sql
@@ -94,6 +94,9 @@ BEGIN
       INNER JOIN pg_class pgc ON pgc.oid = show.oid
       INNER JOIN pg_namespace pgns ON pgc.relnamespace = pgns.oid
       INNER JOIN _timescaledb_catalog.chunk ch ON ch.table_name = pgc.relname AND ch.schema_name = pgns.nspname AND ch.hypertable_id = htid
+      INNER JOIN _timescaledb_catalog.chunk_constraint cc ON ch.id = cc.chunk_id
+      INNER JOIN _timescaledb_catalog.dimension d ON d.hypertable_id = ch.hypertable_id
+      INNER JOIN _timescaledb_catalog.dimension_slice ds ON d.id = ds.dimension_id AND cc.dimension_slice_id = ds.id
     WHERE
       NOT ch.dropped AND NOT ch.osm_chunk
       AND (
@@ -105,6 +108,7 @@ BEGIN
           )
         )
       )
+    ORDER BY ds.range_start
   LOOP
     IF chunk_rec.status = 0 THEN
       BEGIN