Union All optimisation #15097

Closed
wants to merge 1 commit into from

Conversation

Contributor

@ankitdixit ankitdixit commented Aug 30, 2020

Test plan - (Please fill in how you tested your changes)

== RELEASE NOTES ==

General Changes
* Improve performance for queries that union all many constants.

@linux-foundation-easycla

linux-foundation-easycla bot commented Aug 30, 2020

CLA Check
The committers are authorized under a signed CLA.

  • ✅ Ankit Dixit (6a201d8ebcdadb6de48722ae71a9ab97ec13f2a9)

@ankitdixit ankitdixit changed the title from "Union All optimisation" to "[WIP] Union All optimisation" Sep 6, 2020
@ajaygeorge
Contributor

Hi @ankitdixit, thanks for working on this. Can you please fill in the test plan and look into the Travis build failures to see if they are related to your changes?

@rongrong / @rschlussel Would you be able to review this PR? It could fix the high stage count issue we see, which causes the REMOTE_TASK_ERROR issues.

@rschlussel
Contributor

fixes #13896

Contributor

@rongrong rongrong left a comment

The test failure seems related. Please take a look. Thanks!

@ankitdixit ankitdixit force-pushed the unionall branch 3 times, most recently from d5ed95c to f5baa3d on September 28, 2020 08:57
@ankitdixit
Contributor Author

@rongrong I have tried to address the comments.
The failure seems to be unrelated. Can you please check?

@ankitdixit
Contributor Author

> The test failure seems related. Please take a look. Thanks!

The current failures are in presto-verifier, and the test cases seem to pass locally.
Please let me know if we need to fix them.

@kaikalur
Contributor

I would like to see some test queries with this pattern as well - not just the unit test.

@rongrong rongrong requested a review from kaikalur September 29, 2020 20:43
@ankitdixit
Contributor Author

> I would like to see some test queries with this pattern as well - not just the unit test.

@kaikalur Do you want me to paste some queries and plans here? I do not understand what test queries mean here. Can you please give me an example?

{
List<PlanNode> values = captures.get(CHILDREN);
//Return if not union over ValuesNode
if (!(values.stream().map(x -> context.getLookup().resolve(x)).allMatch(x -> x instanceof ValuesNode))) {
Contributor

We don't need all the children to be values nodes. If any are values nodes, we can merge those and leave the other children as they were.

Contributor Author

@ankitdixit ankitdixit Sep 30, 2020

@rschlussel As of now, this optimizer replaces a Union node whose children are all ValuesNodes with a single ValuesNode.
What you are asking for is to merge all the ValuesNode children of a Union node into one ValuesNode, even when the union has other children.
I can:

  1. Modify this optimizer to either convert the union to a ValuesNode or merge the ValuesNode children within the union, depending on whether all children are ValuesNodes
  2. Implement a separate optimizer

Which is the preferred option?

Contributor

Good question. I think option 1 is best.
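
To make option 1 concrete, here is a minimal sketch on a toy plan model. The `Node`, `Values`, `Scan`, and `Union` classes, the `row` helper, and `mergeValuesChildren` are hypothetical stand-ins for illustration, not Presto's `PlanNode` classes or the code in this PR. The sketch also ignores the output-to-input symbol mapping that a real union node carries, and it assumes UNION ALL semantics, where the order of children does not affect the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionValuesMergeSketch {
    // Hypothetical stand-ins for plan nodes; these are NOT Presto's PlanNode classes.
    interface Node {}

    static final class Values implements Node {
        final List<List<Object>> rows;
        Values(List<List<Object>> rows) { this.rows = rows; }
    }

    static final class Scan implements Node {
        final String table;
        Scan(String table) { this.table = table; }
    }

    static final class Union implements Node {
        final List<Node> children;
        Union(List<Node> children) { this.children = children; }
    }

    // Convenience helper for building one row of constants.
    static List<Object> row(Object... cells) {
        return Arrays.asList(cells);
    }

    // Merges all Values children of a UNION ALL into one Values node and
    // leaves every other child as it was.
    static Node mergeValuesChildren(Union union) {
        List<List<Object>> mergedRows = new ArrayList<>();
        List<Node> others = new ArrayList<>();
        int valuesCount = 0;
        for (Node child : union.children) {
            if (child instanceof Values) {
                mergedRows.addAll(((Values) child).rows);
                valuesCount++;
            }
            else {
                others.add(child);
            }
        }
        if (valuesCount == 0) {
            return union; // nothing to merge
        }
        if (others.isEmpty()) {
            return new Values(mergedRows); // every child is Values: the union collapses
        }
        if (valuesCount == 1) {
            return union; // a single Values child has nothing to merge with
        }
        List<Node> newChildren = new ArrayList<>(others);
        newChildren.add(new Values(mergedRows));
        return new Union(newChildren);
    }

    public static void main(String[] args) {
        Node v1 = new Values(Arrays.asList(row(1, 2, 3)));
        Node v2 = new Values(Arrays.asList(row(5, 6, 6), row(10, 20, 30)));
        Node result = mergeValuesChildren(new Union(Arrays.asList(v1, new Scan("t"), v2)));
        System.out.println(((Union) result).children.size() + " children after merging");
    }
}
```

Returning the original union unchanged when there is nothing to merge is a deliberate choice in this sketch: a rewrite that always produced a new node could be applied repeatedly without making progress.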

@rschlussel
Contributor

> > I would like to see some test queries with this pattern as well - not just the unit test.
>
> @kaikalur Do you want me to paste some queries and plans here? I do not understand what test queries mean here. Can you please give me an example?

@ankitdixit You can add queries with this pattern to AbstractTestQueries. That will test that the queries work end to end and validate correctness.

@kaikalur
Contributor

> > I would like to see some test queries with this pattern as well - not just the unit test.
>
> @kaikalur Do you want me to paste some queries and plans here? I do not understand what test queries mean here. Can you please give me an example?

I actually want to see something like

SELECT * FROM (select 1, 2, 3 union all select * from (values (5,6,6),(10,20,30)))

and its output.

@ajaygeorge
Contributor

Hi @ankitdixit, can we get this PR merged in a week or so? This would really help us with the remote task errors that we are seeing.

@stale

stale bot commented Jun 2, 2021

This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions!

@stale stale bot added the stale label Jun 2, 2021
@stale stale bot closed this Jun 23, 2021
5 participants