On TimescaleDB 0.9.2 (PostgreSQL 10.4) we observed a segfault on certain queries. I haven't tested 0.10.0 yet, but the part of the code that crashes does not appear to have changed in the new version. I've narrowed the query down to the following:
EXPLAIN
SELECT * FROM normal_table ik
LEFT JOIN LATERAL (SELECT max(time0::bigint) FROM hypertable s WHERE ik.name = 'value' AND s.time0 > now()) s ON TRUE;
The query is a bit weird in that the WHERE clause inside the subquery references the outer query (ik.name = 'value'). It seems the combination of ik.name = 'value' and the max() aggregate is what makes TimescaleDB crash: removing either one results in a valid plan. Removing the condition s.time0 > now() also yields a valid plan, but only because constraint_aware_append is not triggered without that condition.
The first part of the stack trace:
#0 list_nth_cell (list=0x2d44bd0, n=47494368) at list.c:399
#1 0x00000000006239e2 in list_nth (list=<optimized out>, n=<optimized out>) at list.c:413
#2 0x00007f58d865d6bb in ca_append_begin (node=0x2d4a550, estate=0x2d486e8, eflags=17) at /home/florisvannee/timescaledb/src/constraint_aware_append.c:165
#3 0x00000000005f1712 in ExecInitCustomScan (cscan=cscan@entry=0x2d440b8, estate=estate@entry=0x2d486e8, eflags=eflags@entry=17) at nodeCustom.c:105
#4 0x00000000005e54fd in ExecInitNode (node=0x2d440b8, estate=estate@entry=0x2d486e8, eflags=eflags@entry=17) at execProcnode.c:282
#5 0x00000000005fe644 in ExecInitResult (node=node@entry=0x2d44408, estate=estate@entry=0x2d486e8, eflags=eflags@entry=17) at nodeResult.c:223
#6 0x00000000005e52fd in ExecInitNode (node=0x2d44408, estate=estate@entry=0x2d486e8, eflags=eflags@entry=17) at execProcnode.c:164
It crashes on the following line in ca_append_begin:
RangeTblEntry *rte = rt_fetch(scan->scanrelid, estate->es_range_table);
The es_range_table is a list with 6 entries, while scanrelid in the T_Result node is a huge garbage value, so it ends up trying to access the 47494376th element of the range-table list:
(gdb) print *estate->es_range_table
$1 = {type = T_List, length = 6, head = 0x2d44bb0, tail = 0x2d458a0}
(gdb) print *scan
$2 = {plan = {type = T_Result, startup_cost = 0.14999999999999999, total_cost = 7.2587499999999991, plan_rows = 1, plan_width = 8, parallel_aware = 0 '\000', parallel_safe = 0 '\000', plan_node_id = 7,
targetlist = 0x2d4adc8, qual = 0x0, lefttree = 0x2d4aeb8, righttree = 0x0, initPlan = 0x0, extParam = 0x2d4b4b8, allParam = 0x2d4b4d0}, scanrelid = 47494376}
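For reference, rt_fetch() is just an unchecked list_nth() lookup, and a Result node has no scanrelid at all, so the value read through the Scan cast is whatever happens to follow the common Plan fields in the Result struct (judging by the dump above, presumably the low bits of the resconstantqual pointer). The relevant PostgreSQL 10 definitions, for context:

/* src/include/parser/parsetree.h */
#define rt_fetch(rangetable_index, rangetable) \
    ((RangeTblEntry *) list_nth(rangetable, (rangetable_index)-1))

/* src/include/nodes/plannodes.h */
typedef struct Scan
{
    Plan        plan;
    Index       scanrelid;          /* index into the range table */
} Scan;

typedef struct Result
{
    Plan        plan;
    Node       *resconstantqual;    /* when a Result is miscast to Scan,
                                     * scanrelid aliases the start of this
                                     * field */
} Result;

With only 6 entries in the range table, list_nth() then walks millions of cells past the end of the list, which is the crash in list_nth_cell().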
I believe the aggregate turns the children of the Append node into T_Result nodes (where each T_Result child contains the actual sequential/index scan on the relation). This is the plan when I change the query to use Postgres partition exclusion instead of the constraint-aware append (a sketch of how the underlying scans could be reached through these Result nodes follows the plan):
Nested Loop Left Join (cost=14.02..10131.40 rows=720 width=93)
-> Seq Scan on normal_table ik (cost=0.00..17.20 rows=720 width=85)
-> Aggregate (cost=14.02..14.03 rows=1 width=8)
-> Append (cost=0.00..13.68 rows=135 width=8)
-> Result (cost=0.00..0.00 rows=1 width=8)
One-Time Filter: (ik.name = 'value'::text)
-> Seq Scan on hyper_table s (cost=0.00..0.00 rows=1 width=8)
Filter: (time0 > '2018-05-20 00:00:00.0 +0200'::timestamp9)
-> Result (cost=0.00..13.68 rows=134 width=8)
One-Time Filter: (ik.name = 'value'::text)
-> Seq Scan on _hyper_383_10587_chunk s_1 (cost=0.00..13.68 rows=134 width=8)
Filter: (time0 > '2018-05-20 00:00:00.0 +0200'::timestamp9)
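The underlying scans are still reachable from those Result nodes: a gating Result (one carrying a one-time filter) keeps its real input in lefttree. A minimal sketch of a helper that drills down to it (the name is invented here, this is not existing TimescaleDB code):

/* Hypothetical helper, not part of TimescaleDB: walk through any gating
 * Result nodes (one-time filters) to reach the plan node underneath. */
static Plan *
skip_result_nodes(Plan *plan)
{
    /* A gating Result keeps its input in lefttree; a Result that merely
     * computes a constant target list has lefttree == NULL, in which case
     * there is no scan to find and we return the Result itself. */
    while (IsA(plan, Result) && plan->lefttree != NULL)
        plan = plan->lefttree;
    return plan;
}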
To me it looks like a quick fix to avoid the segfault would be to move the rt_fetch of the range-table entry (the line quoted above) inside the case block. This avoids the segfault, but it is not the behavior you'd really want, as those chunks will then never be pruned from the plan. Ideally, there'd be a special case for T_RESULT that looks at the children of the T_RESULT to determine whether the chunk can be pruned or not, which would give correct behavior; roughly along the lines of the sketch below.
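Concretely (very much a sketch; the names below are invented and this is not the actual ca_append_begin code), the per-child logic could special-case T_Result by unwrapping it with the helper above and only calling rt_fetch() once it is certain it has a real Scan node:

/* Hypothetical sketch, not the real TimescaleDB code: decide whether an
 * Append child can be excluded, tolerating a gating Result on top of the
 * scan.  rt_fetch() is only reached once the node is known to be a Scan,
 * so a T_Result can no longer send us past the end of es_range_table. */
static bool
child_can_be_excluded(Plan *child, EState *estate)
{
    Plan *plan = skip_result_nodes(child);   /* helper sketched above */

    switch (nodeTag(plan))
    {
        case T_SeqScan:
        case T_IndexScan:
        case T_IndexOnlyScan:
        case T_BitmapHeapScan:
            {
                Scan          *scan = (Scan *) plan;
                RangeTblEntry *rte = rt_fetch(scan->scanrelid,
                                              estate->es_range_table);

                /* ... run the existing chunk restriction check on rte ... */
                (void) rte;
                return false;    /* placeholder: keep the chunk for now */
            }

        default:
            /* No plain scan underneath: keep the child and never read a
             * scanrelid that does not exist. */
            return false;
    }
}

Whether the one-time filter should also feed into the pruning decision is a separate question; the point of the sketch is only that the bogus scanrelid read goes away.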
Any ideas on how best to approach this?