Plan table function invocation with table arguments #14175

Merged (6 commits) on Dec 8, 2022

@kasiafi kasiafi commented Sep 18, 2022

This PR adds the RelationPlanner part for table function invocation with table and descriptor arguments.
These types of arguments are not yet supported -- they cause the query to fail during the initial planning phase.

No docs or release notes required.

based on #14115

Example output from the PlanPrinter:
query:

SELECT * FROM TABLE(mock.system.different_arguments_function(
     INPUT_1 => TABLE(SELECT 'a') t1(c1) PARTITION BY c1 ORDER BY c1,
     INPUT_3 => TABLE(SELECT 'b') t3(c3) PARTITION BY c3,
     INPUT_2 => TABLE(VALUES 1) t2(c2),
     ID => BIGINT '2001',
     LAYOUT => DESCRIPTOR (x boolean, y bigint)
     COPARTITION (t1, t3))) t

fragment of the plan:

         └─ Project[]
            │   Layout: [some_column:boolean, expr:varchar(1), field:integer, expr_0:varchar(1)]
            │   Estimates: 
            └─ TableFunction[name = different_arguments_function]
               │   Layout: [some_column:boolean, expr:varchar(1), field:integer, expr_0:varchar(1)]
               │   Estimates: 
               │   Arguments:
               │   INPUT_1 => TableArgument{partition by: [expr], order by: [expr ASC NULLS LAST], pass through columns}
               │   INPUT_3 => TableArgument{partition by: [expr_0], prune when empty}
               │   INPUT_2 => TableArgument{row semantics, prune when empty, pass through columns}
               │   ID => ScalarArgument{type=bigint, value=2001}
               │   LAYOUT => DescriptorArgument{(X boolean, Y bigint)}
               │   Co-partition: [(INPUT_1, INPUT_3)]
               ├─ [INPUT_1] Project[]
               │  │   Layout: [expr:varchar(1)]
               │  │   Estimates: 
               │  └─ Project[]

@cla-bot cla-bot bot added the cla-signed label Sep 18, 2022
@github-actions github-actions bot added the docs label Sep 18, 2022
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch 2 times, most recently from 5216363 to 5cc676b Compare September 18, 2022 14:37
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch 7 times, most recently from fff6ce8 to b2aefe8 Compare October 1, 2022 15:01
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch 2 times, most recently from c08453b to ba2d670 Compare October 7, 2022 09:37
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch 4 times, most recently from 09dffde to 8913a4b Compare October 14, 2022 09:38
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch 2 times, most recently from 334339f to 5c5ce68 Compare October 20, 2022 08:41
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch from 5c5ce68 to 2a29508 Compare October 27, 2022 08:59
@kasiafi kasiafi requested a review from martint October 27, 2022 09:29
@@ -1886,6 +1924,10 @@ public NodeRepresentation addNode(
        List<PlanCostEstimate> estimatedCosts = allNodes.stream()
                .map(nodeId -> estimatedStatsAndCosts.getCosts().getOrDefault(nodeId, PlanCostEstimate.unknown()))
                .collect(toList());
        String argumentName = argumentNames.get(rootNode.getId());
Member

Instead of passing the whole map of node id -> name and having this method look it up, it would be cleaner to pass the intended name directly. At the end of the day, for each branch of recursion when visiting a child, there's only one name that's valid.

This means that the Context would just contain a single name that represents the "tag" associated with the subtree. If present, it will be prepended to the root of that subtree when printed. It also makes it more generic and allows for tagging any subtree (e.g., for a JOIN, we might want to tag the left and right differently).

Member Author

If the context had just a single tag instead of the map, I would have to process sources in a loop, creating a new context per child instead of just calling processChildren.

However, my main reason for using a map by id was to make it more robust. The usual usage pattern for context is that we process the node, and then recursively process children, passing the same context. In the current design I can safely do it, as node ids are unique. If the context had just a tag, we would need to explicitly clear the context before the recursive call. It would be unintuitive for future contributors.
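The map-by-id design described above can be sketched roughly as follows. This is a minimal illustration, not the PlanPrinter code: `MapContextPrinterSketch`, its `Node` record, and the string node ids are hypothetical stand-ins. The same map is handed unchanged to every recursive call, and each node looks up its own id to decide whether to print a tag.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not Trino's PlanNode or PlanPrinter.
final class MapContextPrinterSketch
{
    // A minimal plan node with a unique id, a label, and child nodes.
    record Node(String id, String label, List<Node> children) {}

    private MapContextPrinterSketch() {}

    static String print(Node root, Map<String, String> argumentNames)
    {
        StringBuilder out = new StringBuilder();
        print(root, argumentNames, 0, out);
        return out.toString();
    }

    private static void print(Node node, Map<String, String> argumentNames, int depth, StringBuilder out)
    {
        out.append("   ".repeat(depth));
        // Each node looks itself up by id; only nodes registered in the map get a tag.
        String argumentName = argumentNames.get(node.id());
        if (argumentName != null) {
            out.append('[').append(argumentName).append("] ");
        }
        out.append(node.label()).append('\n');
        for (Node child : node.children()) {
            // The same map is passed down unchanged, which is safe because ids are unique.
            print(child, argumentNames, depth + 1, out);
        }
    }

    public static void main(String[] args)
    {
        Node source = new Node("2", "Project[]", List.of(new Node("3", "Project[]", List.of())));
        Node root = new Node("1", "TableFunction[name = different_arguments_function]", List.of(source));
        System.out.print(print(root, Map.of("2", "INPUT_1")));
    }
}
```

In this sketch the caller registers a tag for node `2` only, so its descendants print untagged without anything having to be cleared between recursive calls.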

Member

> If the context had just a single tag instead of the map, I would have to process sources in a loop, creating a new context per child instead of just calling processChildren.

That's ok. It would make the intention much clearer.

> However, my main reason for using a map by id was to make it more robust. The usual usage pattern for context is that we process the node, and then recursively process children, passing the same context.

That's an abuse of what "context" in the case of the visitor is meant to do. The context is supposed to be specific to the node being visited (e.g., in filter pushdown, it contains the effective filter applied to the output of the current node). Sometimes, it's just easy to pass the same object, but that's not necessarily the best option in all cases.

> If the context had just a tag, we would need to explicitly clear the context before the recursive call.

No, it'd be better to just create a new context with the proper tag.
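A rough sketch of the per-child context suggested in this thread, for comparison. All names here (`TaggedPlanPrinterSketch`, `Node`, `TaggedChild`, `Context`) are hypothetical and chosen for illustration; the actual PlanPrinter change may look different. The context carries at most one tag, and a new context is created for each child instead of mutating or clearing a shared one.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for illustration; not Trino's PlanNode or PlanPrinter.
final class TaggedPlanPrinterSketch
{
    // A minimal plan node: a label plus child nodes.
    record Node(String label, List<Node> children) {}

    // A child paired with the tag to print in front of it (e.g. a table argument name).
    record TaggedChild(String tag, Node node) {}

    // Context specific to the node being visited: an optional tag and the indentation depth.
    record Context(Optional<String> tag, int depth) {}

    private TaggedPlanPrinterSketch() {}

    // Prints a node whose sources each carry their own tag.
    static String printTableFunction(String label, List<TaggedChild> sources)
    {
        StringBuilder out = new StringBuilder();
        out.append(label).append('\n');
        for (TaggedChild source : sources) {
            // A fresh context per child, so no tag has to be cleared afterwards.
            print(source.node(), new Context(Optional.of(source.tag()), 1), out);
        }
        return out.toString();
    }

    static void print(Node node, Context context, StringBuilder out)
    {
        out.append("   ".repeat(context.depth()));
        context.tag().ifPresent(tag -> out.append('[').append(tag).append("] "));
        out.append(node.label()).append('\n');
        for (Node child : node.children()) {
            // Descendants get an untagged context: the tag applies to the subtree root only.
            print(child, new Context(Optional.empty(), context.depth() + 1), out);
        }
    }

    public static void main(String[] args)
    {
        Node input1 = new Node("Project[]", List.of(new Node("Project[]", List.of())));
        Node input3 = new Node("Project[]", List.of());
        System.out.print(printTableFunction(
                "TableFunction[name = different_arguments_function]",
                List.of(new TaggedChild("INPUT_1", input1), new TaggedChild("INPUT_3", input3))));
    }
}
```

Because the tag lives in the context rather than in a lookup table, the same mechanism could tag any subtree, such as the two sides of a join, at the cost of an explicit loop over the children.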

     private final boolean rowSemantics;
     private final boolean pruneWhenEmpty;
     private final boolean passThroughColumns;
-    private final Specification specification;
+    private final Optional<Specification> specification;
Member

It's strange for a table function to use a WindowNode.Specification. We should either move it out of WindowNode (preferable) or create a separate one (not ideal if they represent the same concept).

Member Author

Extracted Specification.
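As an illustration of the shape being discussed, here is a minimal sketch of a stand-alone specification used as an optional property of a table argument. The names `TableArgumentSketch`, `Specification`, and `TableArgumentProperties`, the use of plain strings for symbols and orderings, and the `describe()` method are assumptions made for this example; they are not the classes extracted in the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for illustration; not the classes from the Trino code base.
final class TableArgumentSketch
{
    // Stand-alone partitioning and ordering specification, not nested in a window-specific class.
    record Specification(List<String> partitionBy, List<String> orderBy) {}

    // Properties of a table argument; the specification is optional.
    record TableArgumentProperties(
            boolean rowSemantics,
            boolean pruneWhenEmpty,
            boolean passThroughColumns,
            Optional<Specification> specification)
    {
        // Renders the properties in a form similar to the plan fragment in the PR description.
        String describe()
        {
            List<String> parts = new ArrayList<>();
            if (rowSemantics) {
                parts.add("row semantics");
            }
            specification.ifPresent(spec -> {
                if (!spec.partitionBy().isEmpty()) {
                    parts.add("partition by: " + spec.partitionBy());
                }
                if (!spec.orderBy().isEmpty()) {
                    parts.add("order by: " + spec.orderBy());
                }
            });
            if (pruneWhenEmpty) {
                parts.add("prune when empty");
            }
            if (passThroughColumns) {
                parts.add("pass through columns");
            }
            return "TableArgument{" + String.join(", ", parts) + "}";
        }
    }

    private TableArgumentSketch() {}

    public static void main(String[] args)
    {
        TableArgumentProperties input1 = new TableArgumentProperties(
                false,
                false,
                true,
                Optional.of(new Specification(List.of("expr"), List.of("expr ASC NULLS LAST"))));
        System.out.println(input1.describe());
    }
}
```

An argument without partitioning or ordering, like `INPUT_2` in the example query, would simply carry `Optional.empty()` in this sketch.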

@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch from 2a29508 to 16c9f2f Compare November 21, 2022 12:33
@kasiafi kasiafi requested a review from martint November 21, 2022 16:17

kasiafi commented Nov 28, 2022

@martint could you please take a look?

@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch from 16c9f2f to 2cf93aa Compare November 29, 2022 17:51

kasiafi commented Nov 29, 2022

@martint I changed the PlanPrinter's context like you suggested. Please take a look.

@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch from 2cf93aa to a38a039 Compare December 1, 2022 11:14
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch from a38a039 to 91bff67 Compare December 6, 2022 13:43
@kasiafi kasiafi force-pushed the 390PlanTFNodeWithSources branch from 91bff67 to 8bd1717 Compare December 7, 2022 08:12
@kasiafi kasiafi requested a review from martint December 7, 2022 20:42
@kasiafi kasiafi merged commit 07e4d35 into trinodb:master Dec 8, 2022
@github-actions github-actions bot added this to the 404 milestone Dec 8, 2022