Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make PartitionedOutputOperator.finish() idempotent #14168

Merged
merged 1 commit into from
Sep 19, 2022

Conversation

losipiuk
Copy link
Member

The PartitionedOutputOperator.finish() was not idempotent in the case when only position counter was tracked in
PositionsAppenderPageBuilder. This could result in wrong query results.

Description

Non-technical explanation

Release notes

( ) This is not user-visible and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# General
* Fix bug which could manifest in wrong results for queries where all data scanned from tables 
  was processed in the leaf stage and was not passed for downstream. ({issue}`issuenumber`)

Copy link
Member

@lukasz-stec lukasz-stec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % minor comment

@@ -99,7 +99,9 @@ public boolean isEmpty()
public Page build()
{
if (channelAppenders.length == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe drop the if instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have a test for this case (I can add it in a separate PR). Do you know what kind of query has zero-column partitioning?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have a test for this case (I can add it in a separate PR). Do you know what kind of query has zero-column partitioning?

I observed that with queries like this one:

SELECT o1.orderkey FROM orders o1 RIGHT JOIN (SELECT * FROM orders LIMIT 10) o2 ON true ORDER BY o1.orderkey LIMIT 20

where you are joining two tables, where neither join condition, nor output depend on any column from one of the tables.

As for the testing, it is problematic as I only observed problem when POO.finish() was called twice. And this does not happen often.

The PartitionedOutputOperator.finish() was not idempotent in the
case when only position counter was tracked in
PositionsAppenderPageBuilder. This could result in wrong query results.
@losipiuk losipiuk merged commit 390f4c8 into trinodb:master Sep 19, 2022
@github-actions github-actions bot added this to the 397 milestone Sep 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants