Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexec: fix projections of joiners #40995

Merged
merged 3 commits into from
Sep 27, 2019

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Sep 23, 2019

colexec: fix projections of joiners

Hash and merge joiners handle the projection themselves for efficiency
reason. However, previously, they didn't do it correctly in all cases -
when in the requested projection some columns from the right come before
some columns from the left side, the output would not be correct because
the joiners always output all the columns from the left side first. Now
this is fixed.

Fixes: #40621.
Addresses: #38994.

Release justification: Category 2: Bug fixes and low-risk updates
to new functionality.

colexec: fix remapping of IVars during filter planning

Previously, the remapping could be done incorrectly. Consider a
case when we have @2 = @4 with idxVarMap = {-1, 0, -1, 1}. After
we replaced @4 with @2, we would have @2 = @2 with no way to
distinguish between the original and the already replaced ordinals.
Now this is fixed by inserting a special symbol # inside of the
newly replaced ordinal with a second step of removing it later.

Release justification: Category 2: Bug fixes and low-risk updates
to new functionality.

colexec: refactor creation of hash and merge joiners

A lot of the code in "planning" of the hash and merge joiners is
the same, so this commit extracts it out in a single function that
is used when creating both joiners.

Release justification: Category 1: Non-production code changes.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich requested review from asubiotto and a team September 23, 2019 21:04
@yuzefovich
Copy link
Member Author

This PR contains two commits, and I think it'll be easier to review them one at a time (the second one is merely a refactor with no changes to the planning logic).

@yuzefovich yuzefovich force-pushed the fix-joiner-projection branch 2 times, most recently from d5fea3f to 93e1229 Compare September 23, 2019 23:17
@yuzefovich
Copy link
Member Author

I need to fix one more bug here as well as refactor the code a little bit more, so please hold off on the reviews for now.

@yuzefovich yuzefovich force-pushed the fix-joiner-projection branch 2 times, most recently from 3833483 to 19a4137 Compare September 24, 2019 17:58
@yuzefovich
Copy link
Member Author

This PR is now RFAL.

@yuzefovich
Copy link
Member Author

Testing that would catch the bug that is fixed in the second commit here is added in a different PR.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1, 1 of 1 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)


pkg/sql/colexec/execplan.go, line 116 at r3 (raw file):

// createJoiner adds a new hash or merge join with the argument function
// createJoinOpWithOnExprPlanning distinguishing between the two.
func (r *NewColOperatorResult) createJoiner(

It feels a bit awkward to make this a method of NewColOperatorResult. Could this maybe be a function that takes one as an argument?


pkg/sql/colexec/hashjoiner_test.go, line 933 at r1 (raw file):

			// The "core" of the test - we ask for a projection in which the columns
			// from the left and from the right are intertwined.
			OutputColumns: []uint32{3, 1, 0, 5, 4, 2},

The test comment seems to imply that the projections will be random (due to the word arbitrary). Maybe that's something you could do here? If it doesn't seem like it'll add too much, don't worry about it, but maybe update the comment.

Hash and merge joiners handle the projection themselves for efficiency
reason. However, previously, they didn't do it correctly in all cases -
when in the requested projection some columns from the right come before
some columns from the left side, the output would not be correct because
the joiners always output all the columns from the left side first. Now
this is fixed.

Release justification: Category 2: Bug fixes and low-risk updates
to new functionality.

Release note: None
Previously, the remapping could be done incorrectly. Consider a
case when we have `@2 = @4` with idxVarMap = {-1, 0, -1, 1}. After
we replaced `@4` with `@2`, we would have `@2 = @2` with no way to
distinguish between the original and the already replaced ordinals.
Now this is fixed by inserting a special symbol `#` inside of the
newly replaced ordinal with a second step of removing it later.

Release justification: Category 2: Bug fixes and low-risk updates
to new functionality.

Release note: None
A lot of the code in "planning" of the hash and merge joiners is
the same, so this commit extracts it out in a single function that
is used when creating both joiners.

Release justification: Category 1: Non-production code changes.

Release note: None
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @jordanlewis)


pkg/sql/colexec/execplan.go, line 116 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

It feels a bit awkward to make this a method of NewColOperatorResult. Could this maybe be a function that takes one as an argument?

I see your point, done. I was concerned that this function takes in too many arguments (one of those is another function that takes many arguments and returns many things as well), so I thought of using a receiver to reduce the "argument burden" a little. I don't see a reasonable way to refactor createJoiner function (like no "coherent" struct stands out to me), but maybe it is fine as is since this function should be called only within execplan.go file from two places.


pkg/sql/colexec/hashjoiner_test.go, line 933 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

The test comment seems to imply that the projections will be random (due to the word arbitrary). Maybe that's something you could do here? If it doesn't seem like it'll add too much, don't worry about it, but maybe update the comment.

I updated the comment. It would be too much code to handle a truly random projection because we need to keep calls to b.ColVec(i).TYPE in sync with it, and I agree with you that it wouldn't add much to the coverage. The particular projection that I hardcoded seems sufficient.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yuzefovich
Copy link