Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: prevent columns reuse in Union and UnionAll #58477

Merged
merged 3 commits into from
Jan 7, 2021

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Jan 6, 2021

opt: fix columns in SplitScanIntoUnionScans constraint

This commit fixes a minor bug in SplitScanIntoUnionScans that resulted
in a scan's constraint containing columns not associated with the scan.
This did not affect the correctness of results. However it appears that
it did cause inaccurate stats calculations; I had to add histogram
buckets to the tests to coerce the optimizer into choosing the same
plan for the corresponding test.

Release note: None

opt: do not reuse columns for Unions in SplitScanIntoUnionScans

Unions generated in SplitScanIntoUnionScans no longer reuse column IDs
from their left children as output column IDs. Reusing column IDs in
this way has shown to be dangerous (see #58434).

Release note: None

opt: add Union column ID check to CheckExpr

A check has been added to CheckExpr that asserts that the output
columns of Unions and UnionAlls are not reused from the left or
right inputs of the union. Reusing columns in this way is dangerous
(see #58434).

Release note: None

@mgartner mgartner requested a review from a team as a code owner January 6, 2021 00:47
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@mgartner
Copy link
Collaborator Author

mgartner commented Jan 6, 2021

cc @DrewKimball: in case you want to take a look at this

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1, 5 of 5 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, and @RaduBerinde)


pkg/sql/opt/xform/limit_funcs.go, line 252 at r2 (raw file):

	// makeNewUnion extends the Union tree rooted at 'last' to include 'newScan'.
	// The ColumnIDs of the original Scan are used by the resulting expression.
	//oldColList := scan.Relational().OutputCols.ToList()

dead code


pkg/sql/opt/xform/limit_funcs.go, line 290 at r2 (raw file):

			// Add the scan to the union tree. If it is the final union in the
			// tree, use the original scan's columns as the union's out columns.
			// Otherwise, created new output column IDs for the union.

created -> create


pkg/sql/opt/xform/testdata/rules/limit, line 92 at r1 (raw file):

    "null_count": 0,
    "histo_col_type": "int",
    "histo_buckets": [

[nit] the histogram row count adds up to more than the row count above.

(I've been meaning to add some histogram validation into our testutils but haven't done it yet)

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @RaduBerinde, and @rytaft)


pkg/sql/opt/xform/limit_funcs.go, line 252 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

dead code

Done.


pkg/sql/opt/xform/limit_funcs.go, line 290 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

created -> create

Done.


pkg/sql/opt/xform/testdata/rules/limit, line 92 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] the histogram row count adds up to more than the row count above.

(I've been meaning to add some histogram validation into our testutils but haven't done it yet)

1000 + 1000 + 998000 = 1000000 What am I missing?

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r4, 1 of 1 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball and @RaduBerinde)


pkg/sql/opt/xform/testdata/rules/limit, line 92 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

1000 + 1000 + 998000 = 1000000 What am I missing?

Oh sorry I got confused -- I thought the 2000 was num_eq not upper_bound. Ignore me!

@mgartner
Copy link
Collaborator Author

mgartner commented Jan 6, 2021

No worries! TFTR.

bors r=rytaft

@craig
Copy link
Contributor

craig bot commented Jan 6, 2021

Build failed:

This commit fixes a minor bug in `SplitScanIntoUnionScans` that resulted
in a scan's constraint containing columns not associated with the scan.
This did not affect the correctness of results. However it appears that
it did cause inaccurate stats calculations; I had to add histogram
buckets to the tests to coerce the optimizer into choosing the same
plan for the corresponding test.

Release note: None
Unions generated in SplitScanIntoUnionScans no longer reuse column IDs
from their left children as output column IDs. Reusing column IDs in
this way has shown to be dangerous (see cockroachdb#58434).

Release note: None
A check has been added to `CheckExpr` that asserts that the output
columns of `Union`s and `UnionAll`s are not reused from the left or
right inputs of the union. Reusing columns in this way is dangerous
(see cockroachdb#58434).

Release note: None
@mgartner
Copy link
Collaborator Author

mgartner commented Jan 7, 2021

bors r=rytaft

@craig
Copy link
Contributor

craig bot commented Jan 7, 2021

Build succeeded:

@craig craig bot merged commit d8b5cb0 into cockroachdb:master Jan 7, 2021
@mgartner mgartner deleted the fix-split-scans branch January 7, 2021 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants