optbuilder: refactor semantic analysis of FOR UPDATE #111258

michae2 · 2023-09-26T08:45:02Z

optbuilder: a few minor tweaks to building of FOR UPDATE

Make a few minor tweaks to semantic analysis of FOR UPDATE locking
clauses.

Disallow multiple FOR UPDATE clauses on parenthesized queries. We do
not currently handle scopes of parenthesized queries correctly.
Because of this, we disallow multiple ORDER BY and LIMIT clauses on
parenthesized queries. The previous implementation of FOR UPDATE was
simple enough that we could work around this, but the upcoming
changes make it impossible to support.
Allow FOR UPDATE on statements with VALUES in the FROM list (but
continue to disallow FOR UPDATE on VALUES directly). This matches
Postgres.

Release note (sql change): Two minor changes to FOR UPDATE clauses:

Multiple FOR UPDATE clauses on fully-parenthesized queries are now
disallowed. For example, the following statements are now disallowed:

(SELECT 1 FOR UPDATE) FOR UPDATE;
SELECT * FROM ((SELECT 1 FOR UPDATE) FOR UPDATE) AS x;

whereas statements like the following are still allowed:

SELECT * FROM (SELECT 1 FOR UPDATE) AS x FOR UPDATE;
SELECT (SELECT 1 FOR UPDATE) FOR UPDATE;

This does not match PostgreSQL, which allows all of these, but does
match our behavior for ORDER BY and LIMIT.

FOR UPDATE is now allowed on statements with VALUES in the FROM list
or as a subquery. For example, the following statements are now allowed:

SELECT (VALUES (1)) FOR UPDATE;
SELECT * FROM (VALUES (1)) AS x FOR UPDATE;

Using FOR UPDATE directly on VALUES is still disallowed:

VALUES (1) FOR UPDATE;
(VALUES (1)) FOR UPDATE;
INSERT INTO t VALUES (1) FOR UPDATE;

This matches PostgreSQL.

optbuilder: refactor semantic analysis of FOR UPDATE

Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of
the data sources in a query's FROM list, depending on whether they have
targets (FOR UPDATE OF x). Without targets, they always apply within
subqueries in the FROM list. With targets, they apply within subqueries
if the subquery alias matches the target.

Because of this scope-like nature of FOR UPDATE and FOR SHARE, we
implement semantic analysis using a stack of locking items that grow as
we build inner subqueries deeper in the recursive optbuilder calls.

Prior to this change, we only used the stack of locking items during
buildScan, at the very bottom of the recursion. Because of this, calls
to lockingSpec.filter could afford to compress the stack into a single
locking item on our way deeper in the recursion.

As part of the upcoming fix for #75457, however, we will need to build a
new Lock operator when popping off locking items after returning from
the recursion. That Lock operator will need some information gathered
from buildScan at the bottom of the recursion.

To support this, we refactor the stack of locking items to be two
stacks: one that tracks all locking items in scope, and a second that
tracks which locking items currently apply. This will allow buildScan to
associate table information with the correct locking item(s), which can
then be used to build Lock operators when popping the locking items.

As a bonus, by using only the applied locking item stack in
validateLockingInFrom we can make validation of SELECT FOR UPDATE
queries a little more precise, which allows some queries we were
incorrectly disallowing.

Informs: #57031, #75457

Epic: CRDB-25322

Release note (sql change): Allow FOR UPDATE on some queries that were
previously disallowed. Queries that use the following operations are now
allowed to have FOR UPDATE OF as long as the prohibited operation is in
a subquery not locked by the FOR UPDATE OF:

UNION
INTERSECT
EXCEPT
DISTINCT
GROUP BY
HAVING
aggregations
window functions

For example, the following query is now allowed because the subquery
using the prohibited operations is not affected by the FOR UPDATE OF:

SELECT * FROM t,
  (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u
  FOR UPDATE OF t;

This matches PostgreSQL.

cockroach-teamcity · 2023-09-26T08:45:13Z

This change is

michae2 · 2023-09-29T22:19:52Z

TLDR: this changes

type lockingSpec []*tree.LockingItem

to

type lockingSpec []*lockingItem

And then replaces almost all uses of lockingSpec in select.go with lockingContext, which is:

type lockingContext struct {
  lockScope []*lockingItem
  locking lockingSpec
}

michae2 · 2023-10-03T16:09:03Z

The test failures have been flakes. This is RFAL.

DrewKimball

Great work! This is very well commented. I just have some test suggestions and a question.

Reviewed 10 of 10 files at r3, 10 of 10 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @michae2, @msirek, and @nvanbenschoten)

-- commits line 32 at r3:
Do we have tests somewhere for allowed cases like these?

pkg/sql/opt/optbuilder/locking.go line 160 at r4 (raw file):

// blankLockingScope is a sentinel locking item that, when pushed, prevents
// lockCtx.filter from matching targets outside it.

Is "it" the current scope?

pkg/sql/opt/optbuilder/select.go line 1148 at r4 (raw file):

	}

	for range lockingClause {

Huh, I didn't know this was possible.

pkg/sql/logictest/testdata/logic_test/select_for_update line 168 at r4 (raw file):


query II
SELECT * FROM (SELECT 1 UNION SELECT 1) a, i FOR UPDATE OF i

Could we do another test with FOR UPDATE OF a?

nvanbenschoten

Reviewed 2 of 2 files at r1, 2 of 10 files at r2, 10 of 10 files at r3, 10 of 10 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @msirek)

-- commits line 94 at r2:
These release notes are great.

pkg/sql/opt/optbuilder/select.go line 1100 at r1 (raw file):

	inScope *scope,
) (outScope *scope) {
	locking.apply(lockingClause)

lockingSpec.apply modifies the receiver, not the parameter. We're then passing the parameter to rejectIfLocking for VALUES clauses, but still passing the receiver for UNION clauses. That's intentional?

If so, consider a comment explaining why we aren't passing the entire lockingSpec down on the VALUES paths.

EDIT: this is different in the second commit. I believe it no longer applies.

pkg/sql/opt/optbuilder/select.go line 1148 at r4 (raw file):

	}

	for range lockingClause {

Consider giving this a comment, explaining the two different roles that it is playing.

msirek

Very nice work!

Reviewed 10 of 10 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @michae2)

pkg/sql/opt/optbuilder/locking.go line 122 at r4 (raw file):

	// might include locking items that do not currently apply because they have
	// an unmatched target.
	lockScope []*lockingItem

Is there a reason lockScope is a slice, while locking is a type with the same slice definition? Is it just to indicate locking is the Spec of actually applicable items? Just wondering if lockingSpec should be used in both cases.

Make a few minor tweaks to semantic analysis of FOR UPDATE locking clauses. 1. Disallow multiple FOR UPDATE clauses on parenthesized queries. We do not currently handle scopes of parenthesized queries correctly. Because of this, we disallow multiple ORDER BY and LIMIT clauses on parenthesized queries. The previous implementation of FOR UPDATE was simple enough that we could work around this, but the upcoming changes make it impossible to support. 2. Allow FOR UPDATE on statements with VALUES in the FROM list (but continue to disallow FOR UPDATE on VALUES directly). This matches Postgres. Release note (sql change): Two minor changes to FOR UPDATE clauses: 1. Multiple FOR UPDATE clauses on fully-parenthesized queries are now disallowed. For example, the following statements are now disallowed: ``` (SELECT 1 FOR UPDATE) FOR UPDATE; SELECT * FROM ((SELECT 1 FOR UPDATE) FOR UPDATE) AS x; ``` whereas statements like the following are still allowed: ``` SELECT * FROM (SELECT 1 FOR UPDATE) AS x FOR UPDATE; SELECT (SELECT 1 FOR UPDATE) FOR UPDATE; ``` This does not match PostgreSQL, which allows all of these, but does match our behavior for ORDER BY and LIMIT. 2. FOR UPDATE is now allowed on statements with VALUES in the FROM list or as a subquery. For example, the following statements are now allowed: ``` SELECT (VALUES (1)) FOR UPDATE; SELECT * FROM (VALUES (1)) AS x FOR UPDATE; ``` Using FOR UPDATE directly on VALUES is still disallowed: ``` VALUES (1) FOR UPDATE; (VALUES (1)) FOR UPDATE; INSERT INTO t VALUES (1) FOR UPDATE; ``` This matches PostgreSQL.

Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of the data sources in a query's FROM list, depending on whether they have targets (FOR UPDATE OF x). Without targets, they always apply within subqueries in the FROM list. With targets, they apply within subqueries if the subquery alias matches the target. Because of this scope-like nature of FOR UPDATE and FOR SHARE, we implement semantic analysis using a stack of locking items that grow as we build inner subqueries deeper in the recursive optbuilder calls. Prior to this change, we only used the stack of locking items during buildScan, at the very bottom of the recursion. Because of this, calls to `lockingSpec.filter` could afford to compress the stack into a single locking item on our way deeper in the recursion. As part of the upcoming fix for cockroachdb#75457, however, we will need to build a new Lock operator when popping off locking items after returning from the recursion. That Lock operator will need some information gathered from buildScan at the bottom of the recursion. To support this, we refactor the stack of locking items to be two stacks: one that tracks all locking items in scope, and a second that tracks which locking items currently apply. This will allow buildScan to associate table information with the correct locking item(s), which can then be used to build Lock operators when popping the locking items. As a bonus, by using only the applied locking item stack in `validateLockingInFrom` we can make validation of SELECT FOR UPDATE queries a little more precise, which allows some queries we were incorrectly disallowing. Informs: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Allow FOR UPDATE on some queries that were previously disallowed. Queries that use the following operations are now allowed to have FOR UPDATE OF as long as the prohibited operation is in a subquery not locked by the FOR UPDATE OF: - UNION - INTERSECT - EXCEPT - DISTINCT - GROUP BY - HAVING - aggregations - window functions For example, the following query is now allowed because the subquery using the prohibited operations is not affected by the FOR UPDATE OF: ``` SELECT * FROM t, (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u FOR UPDATE OF t; ``` This matches PostgreSQL.

michae2

TFTRs! I'll go ahead and bors this and then if there are more comments, I can make changes in the next PR.

bors r=DrewKimball,msirek

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, @msirek, and @nvanbenschoten)

-- commits line 32 at r3:

Previously, DrewKimball (Drew Kimball) wrote…

Do we have tests somewhere for allowed cases like these?

Oh, good call. I added these to pkg/sql/logictest/testdata/logic_test/select_for_update.

pkg/sql/opt/optbuilder/locking.go line 122 at r4 (raw file):

Previously, msirek (Mark Sirek) wrote…

Is there a reason lockScope is a slice, while locking is a type with the same slice definition? Is it just to indicate locking is the Spec of actually applicable items? Just wondering if lockingSpec should be used in both cases.

Good question. Yes, it is to differentiate between the applied items and the items that are in scope but not necessarily applied. For example, it wouldn't make sense to call .isSet() or .get() on the items that are in scope but not necessarily applied.

I also did it to make the diff a little more meaningful (locking lockingSpec has the same purpose both before and after this PR: to describe the currently-applied locking).

pkg/sql/opt/optbuilder/locking.go line 160 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Is "it" the current scope?

"It" is the blank locking scope that was pushed.

For example, if we have a statement like:

SELECT * FROM t, (SELECT * FROM s, t FOR UPDATE OF s FOR SHARE) AS u FOR UPDATE OF t;

Then when we're building the FROM clause of the subquery the lockCtx.lockScope will look like:

[ { strength: FOR UPDATE, targets: [t] }, blankLockingScope, { strength: FOR SHARE, targets: [] }, { strength: FOR UPDATE, targets: [s] } ]

And inside the subquery, only the FOR SHARE locking applies to t.

pkg/sql/opt/optbuilder/select.go line 1100 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

lockingSpec.apply modifies the receiver, not the parameter. We're then passing the parameter to rejectIfLocking for VALUES clauses, but still passing the receiver for UNION clauses. That's intentional?

If so, consider a comment explaining why we aren't passing the entire lockingSpec down on the VALUES paths.

EDIT: this is different in the second commit. I believe it no longer applies.

This was intentional, but yes it is confusing. Added a comment. For VALUES we only check the immediate statement for locking, whereas for UNION we check all outer statements as well. It is to match Postgres.

pkg/sql/opt/optbuilder/select.go line 1148 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Huh, I didn't know this was possible.

Me neither! crlfmt did this for me!

pkg/sql/opt/optbuilder/select.go line 1148 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Consider giving this a comment, explaining the two different roles that it is playing.

Done.

pkg/sql/logictest/testdata/logic_test/select_for_update line 168 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Could we do another test with FOR UPDATE OF a?

Done.

craig · 2023-10-06T08:04:58Z

Build succeeded:

Bazel Essential CI (Cockroach)

michae2 force-pushed the locking branch from 2d0cb8d to 7ae8e66 Compare September 29, 2023 22:05

michae2 requested review from msirek, nvanbenschoten and mgartner September 29, 2023 22:11

michae2 marked this pull request as ready for review September 29, 2023 22:11

michae2 requested a review from a team as a code owner September 29, 2023 22:11

michae2 force-pushed the locking branch 2 times, most recently from 4577677 to b964599 Compare October 3, 2023 04:38

DrewKimball approved these changes Oct 4, 2023

View reviewed changes

nvanbenschoten reviewed Oct 4, 2023

View reviewed changes

msirek approved these changes Oct 4, 2023

View reviewed changes

michae2 added 2 commits October 5, 2023 23:44

michae2 force-pushed the locking branch from b964599 to fb58047 Compare October 6, 2023 06:49

michae2 commented Oct 6, 2023

View reviewed changes

craig bot merged commit 4ab430e into cockroachdb:master Oct 6, 2023

michae2 deleted the locking branch October 6, 2023 17:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

optbuilder: refactor semantic analysis of FOR UPDATE #111258

optbuilder: refactor semantic analysis of FOR UPDATE #111258

michae2 commented Sep 26, 2023 •

edited

Loading

cockroach-teamcity commented Sep 26, 2023

michae2 commented Sep 29, 2023

michae2 commented Oct 3, 2023

DrewKimball left a comment •

edited

Loading

nvanbenschoten left a comment

msirek left a comment

michae2 left a comment

craig bot commented Oct 6, 2023

optbuilder: refactor semantic analysis of FOR UPDATE #111258

optbuilder: refactor semantic analysis of FOR UPDATE #111258

Conversation

michae2 commented Sep 26, 2023 • edited Loading

cockroach-teamcity commented Sep 26, 2023

michae2 commented Sep 29, 2023

michae2 commented Oct 3, 2023

DrewKimball left a comment • edited Loading

Choose a reason for hiding this comment

nvanbenschoten left a comment

Choose a reason for hiding this comment

msirek left a comment

Choose a reason for hiding this comment

michae2 left a comment

Choose a reason for hiding this comment

craig bot commented Oct 6, 2023

michae2 commented Sep 26, 2023 •

edited

Loading

DrewKimball left a comment •

edited

Loading