First off, I am excited about Presto supporting this now and I get why it works the way it does. Just filing this for a discussion.
Looking here: https://github.com/prestosql/presto/blob/340/presto-main/src/main/java/io/prestosql/sql/planner/QueryPlanner.java#L235, there is a single plan built, with size O(n^2), and it is always built out to maxRecursionDepth.
This is not a complaint, just pointing out that - as is - the feature is only partially useful.
As an engineer I totally get it - Presto has no place to store intermediate results.
(If we had one, we could create a temp table with an extra current_depth column and store/query intermediate results that way; we would also need "looping" query execution. At the end we could query the whole temp table - UNION ALL - or do a DISTINCT scan over all columns except the depth - UNION.)
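Just to sketch what that could look like (entirely hypothetical - the temp table, the current_depth column, and the looping execution below are made up and do not exist in Presto):
-- hypothetical scratch table: one batch of rows per recursion level
CREATE TABLE tmp_fib (cur BIGINT, prev BIGINT, current_depth INTEGER);
-- level 0: the anchor part of the recursive CTE
INSERT INTO tmp_fib SELECT 0, 1, 0;
-- level 1: the recursive part applied to the previous level; the engine would
-- re-run this step, bumping the depth filter, until it inserts no new rows
INSERT INTO tmp_fib
SELECT cur + prev, cur, current_depth + 1
FROM tmp_fib
WHERE current_depth = 0
  AND cur + prev < 1000000;
-- UNION ALL semantics: read the whole temp table back
SELECT cur, prev FROM tmp_fib;
-- UNION semantics: DISTINCT scan over all columns except the depth
SELECT DISTINCT cur, prev FROM tmp_fib;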
As a user, though, I would try something like the below and would either be surprised or disappointed.
The user is forced to pretty closely anticipate the recursion depth, or the query will either be inefficient (depth too high) or fail (depth too low).
WITH RECURSIVE fib (cur, prev) AS
(
    SELECT 0, 1
    UNION ALL
    SELECT cur + prev, cur FROM fib WHERE cur + prev < 1000000
)
SELECT cur FROM fib ORDER BY cur;
In PostgreSQL:
cur
--------
0
[...]
832040
(31 rows)
Time: 0.836 ms
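For reference, the result above has 31 rows, so the recursion only goes about 30 levels deep. Carrying the depth along makes that visible in PostgreSQL (the extra depth column is only for illustration, it is not part of the original query):
WITH RECURSIVE fib (cur, prev, depth) AS
(
    SELECT 0, 1, 0
    UNION ALL
    SELECT cur + prev, cur, depth + 1 FROM fib WHERE cur + prev < 1000000
)
SELECT max(depth) AS levels_needed FROM fib;
-- returns 30 - the number a Presto user has to anticipate when picking max_recursion_depth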
In Presto, the same query fails out of the box - the default max_recursion_depth is too low for the ~30 levels this query needs.
Ok... So:
set session max_recursion_depth=100;
Still no luck. Ok... So:
set session max_recursion_depth=40;
Now it works, but it takes almost 35 seconds.
And now, even when I change the 1000000 limit in the query down to 10, it still takes ~35s, since the plan is always built out to the max_recursion_depth property rather than to where the recursion actually stops.
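To make that concrete, the planning described above effectively amounts to unrolling the recursion ahead of time, one step per allowed level up to max_recursion_depth - roughly like the hand-written query below (an illustration truncated to four levels, not actual Presto planner output):
WITH
    fib_0 AS (SELECT 0 AS cur, 1 AS prev),
    fib_1 AS (SELECT cur + prev AS cur, cur AS prev FROM fib_0 WHERE cur + prev < 1000000),
    fib_2 AS (SELECT cur + prev AS cur, cur AS prev FROM fib_1 WHERE cur + prev < 1000000),
    fib_3 AS (SELECT cur + prev AS cur, cur AS prev FROM fib_2 WHERE cur + prev < 1000000)
    -- the real expansion keeps going like this, one level per allowed step,
    -- all the way to max_recursion_depth, regardless of the WHERE predicate
SELECT cur FROM (
    SELECT cur, prev FROM fib_0
    UNION ALL SELECT cur, prev FROM fib_1
    UNION ALL SELECT cur, prev FROM fib_2
    UNION ALL SELECT cur, prev FROM fib_3
) t
ORDER BY cur;
If each fib_i is inlined wherever it is referenced rather than computed once, level k ends up repeating all k earlier steps, which lines up with the O(n^2) plan size mentioned above - and every level is planned and executed even when the WHERE predicate stops producing rows after a handful of them.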
I don't know offhand how to improve this, unless we invent a place to store intermediate data; but at the very least we should document the limitation very carefully.