sql: make sure that "inner" plans use the LeafTxn if the "outer" does #98120

yuzefovich · 2023-03-07T04:45:11Z

This commit fixes a bug where "inner" plans could incorrectly use
the RootTxn when the "outer" plan used the LeafTxn. One example of such
a situation is when the "main" query is using the streamer (and thus is
using the LeafTxn) and also has an apply join, but the apply join
iteration plans would use the RootTxn. This could lead to "concurrent
txn usage" being detected on the RootTxn. This problem is fixed by auditing
all code paths that might run plans that can spin up "inner" plans and
plumbing the information that the LeafTxn must be used by those "inner"
plans via the planner (we don't really have any other more convenient
place to do that plumbing).
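
To make the plumbing concrete, here is a minimal sketch of the idea, not the actual CockroachDB code. All names in it (`txnKind`, `planner`, `mustUseLeafTxn`, `txnKindForInnerPlan`) are hypothetical and chosen only for illustration: the "outer" plan records on the planner that the LeafTxn is in use, and anything that spins up an "inner" plan consults that flag instead of defaulting to the RootTxn.

```go
package main

import "fmt"

// txnKind is a hypothetical stand-in for the two kinds of txn handles
// discussed in this PR.
type txnKind int

const (
	rootTxn txnKind = iota
	leafTxn
)

func (k txnKind) String() string {
	if k == leafTxn {
		return "LeafTxn"
	}
	return "RootTxn"
}

// planner is a hypothetical, heavily simplified planner. The only state
// modeled here is the flag that tells "inner" plans which txn kind the
// "outer" plan ended up using.
type planner struct {
	mustUseLeafTxn bool
}

// txnKindForInnerPlan returns the txn kind that an "inner" plan (for
// example, an apply join iteration plan) should use, based on what the
// "outer" plan recorded on the planner.
func txnKindForInnerPlan(p *planner) txnKind {
	if p.mustUseLeafTxn {
		return leafTxn
	}
	return rootTxn
}

func main() {
	// The "outer" plan uses the streamer, so it runs on the LeafTxn and
	// the "inner" plans must do the same.
	withStreamer := &planner{mustUseLeafTxn: true}
	fmt.Println(txnKindForInnerPlan(withStreamer)) // prints "LeafTxn"

	// Without the streamer the "outer" plan stays on the RootTxn.
	withoutStreamer := &planner{}
	fmt.Println(txnKindForInnerPlan(withoutStreamer)) // prints "RootTxn"
}
```

The point of the sketch is only that the decision is driven by state carried on the planner rather than made independently by each "inner" plan.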

Note that when we create the flow for the main query, we know for sure
whether it'll use the LeafTxn only after the flow setup is complete, so
we adjust an existing finishedSetupFn callback to check the type of the
txn that the flow ends up using and update the planner accordingly.
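
A rough sketch of that callback shape follows, under stated assumptions. Only the name `finishedSetupFn` comes from this PR; its signature here, along with `flow`, `setupFlow`, `usesLeafTxn`, and `mustUseLeafTxn`, is invented for illustration and does not mirror the real code. The sketch shows why the planner update has to happen in the callback: the txn kind is not settled until setup has run.

```go
package main

import "fmt"

// flow is a hypothetical, simplified flow. Which txn kind it uses is
// only decided while the flow is being set up.
type flow struct {
	usesLeafTxn bool
}

// planner is a hypothetical, simplified planner that carries the flag
// consulted later by "inner" plans.
type planner struct {
	mustUseLeafTxn bool
}

// setupFlow models flow setup. In this sketch the flow uses the LeafTxn
// exactly when the streamer is used. Once setup is complete, the
// finishedSetupFn callback (if any) is invoked with the finished flow.
func setupFlow(usesStreamer bool, finishedSetupFn func(f *flow)) *flow {
	f := &flow{}
	f.usesLeafTxn = usesStreamer // decided during setup, not before
	if finishedSetupFn != nil {
		finishedSetupFn(f)
	}
	return f
}

func main() {
	p := &planner{}
	// The callback inspects the txn kind the flow ended up with and
	// records it on the planner.
	setupFlow(true, func(f *flow) {
		p.mustUseLeafTxn = f.usesLeafTxn
	})
	fmt.Println(p.mustUseLeafTxn) // prints "true"
}
```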

This bug reliably reproduces when creating a materialized view, but for
some (unknown to me) reason just running the query as is doesn't seem to
trigger the bug (I tried stressing the query with no luck and decided it
wasn't worth spending more time on it). I also believe that even though
the underlying mechanism for the bug has been present since forever, it
was really introduced only when we enabled the streamer by default in
22.2 (since without the streamer we always use the RootTxn for flows
with apply joins or UDFs - they must be local).

Fixes: #97989.

Release note (bug fix): Previously, CockroachDB could encounter a
"concurrent txn use detected" internal error in some rare cases. This
is now fixed. The bug was introduced in 22.2.0.

cockroach-teamcity · 2023-03-07T04:45:20Z

This change is Reviewable.

DrewKimball

Nice work!

Reviewed 14 of 14 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)

yuzefovich · 2023-03-10T02:01:11Z

TFTR! And nice job figuring that bug out!

bors r+

craig · 2023-03-10T02:25:07Z

Build failed:

Bazel Essential CI (Cockroach)

yuzefovich · 2023-03-10T02:27:37Z

Unrelated flake.

bors r+

craig · 2023-03-10T02:57:37Z

Build succeeded:

Bazel Essential CI (Cockroach)

blathers-crl · 2023-03-10T02:57:44Z

Encountered an error creating backports. Some common things that can go wrong:

The backport branch might have already existed.
There was a merge conflict.
The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.

error creating merge commit from ed7feb0 to blathers/backport-release-22.2-98120: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

When resuming a portal, we always reset the planner. However, the planner still needs to respect the outer txn's situation after the reset, as we did in cockroachdb#98120. Release note: None
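
As a hedged illustration of that follow-up, the sketch below models a planner reset that clears per-statement state and then reapplies the outer txn requirement. Every name in it (`planner`, `stmt`, `mustUseLeafTxn`, `resetPlanner`) is hypothetical; it shows the intent described in the commit message, not the real portal-resumption code.

```go
package main

import "fmt"

// planner is a hypothetical, simplified planner with some per-statement
// state (stmt) plus the flag that "inner" plans consult.
type planner struct {
	stmt           string
	mustUseLeafTxn bool
}

// resetPlanner clears all per-statement state and then reapplies the
// outer txn requirement, so that a reset does not silently send the
// "inner" plans back to the RootTxn.
func resetPlanner(p *planner, outerUsesLeafTxn bool) {
	*p = planner{}
	p.mustUseLeafTxn = outerUsesLeafTxn
}

func main() {
	p := &planner{stmt: "previous statement", mustUseLeafTxn: true}
	// Resuming a portal resets the planner; the outer txn still uses
	// the LeafTxn, so the flag has to be carried across the reset.
	resetPlanner(p, true)
	fmt.Printf("stmt=%q mustUseLeafTxn=%t\n", p.stmt, p.mustUseLeafTxn)
}
```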

yuzefovich force-pushed the fix-apply-join-txn branch 2 times, most recently from 584f523 to c7f3b6a Compare March 8, 2023 17:31

yuzefovich added the backport-22.2.x label Mar 8, 2023

yuzefovich marked this pull request as ready for review March 8, 2023 17:31

yuzefovich requested a review from a team as a code owner March 8, 2023 17:31

yuzefovich requested review from a team March 8, 2023 17:31

yuzefovich requested a review from a team as a code owner March 8, 2023 17:31

yuzefovich requested review from miretskiy, msirek, mgartner and DrewKimball and removed request for a team, miretskiy and msirek March 8, 2023 17:31

yuzefovich force-pushed the fix-apply-join-txn branch from c7f3b6a to ed7feb0 Compare March 8, 2023 18:15

DrewKimball approved these changes Mar 10, 2023

craig bot merged commit d4a584e into cockroachdb:master Mar 10, 2023

yuzefovich deleted the fix-apply-join-txn branch March 10, 2023 02:59

yuzefovich mentioned this pull request Mar 10, 2023

release-22.2: sql: make sure that "inner" plans use the LeafTxn if the "outer" does #98406

Merged
