sql/catalog/descs: unify and fix TxnWithExecutor #86427
The code was bogus before because it was trying to use the transaction after committing. That's not allowed. Also, the retry behavior was unsound. We make it sound by tracking when the invariant violation has happened and propagating that up for a true restart.
Release justification: fixes very broken and needed code
Release note: None

We need to track the job records and we need to run the jobs.
Release note: None
Release justification:

Release justification: testing only change
Release note: None

It was not being used.
Release justification: Import cleanup as part of a bug fix.
Release note: None
@ZhouXing19 and/or @rafiss can y'all review this? I'm reasonably pleased with where it ended up. @maryliag would very much like #86433 merged ASAP.
Thanks for fixing up the new interface, and LGTM! But maybe we want a stamp from @rafiss too.
Just wanted to confirm my understanding of some critical changes in this PR:
- Previously there were 2 sets of logic for validating the schema changes, one in CollectionFactory.Txn() and the other in ex.commitSQLTransactionInternal(). And even worse, in CollectionFactory.TxnWithExecutor(), we used both of them. We now unify them to the latter one.
- The retry invoked by a two-version invariant violation has been changed to first roll back the current kv.Txn, then propagate a retry error to the sql layer, and finally trigger the conn executor to reset the txn.
Do I understand it correctly?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @miretskiy)
-- commits
line 20 at r4:
nit: release justification is missing
pkg/sql/internal.go
line 1042 at r1 (raw file):
    }
    defer ex.close(ctx, externalTxnClose)
    return ex.commitSQLTransactionInternal(ctx)
Should we call ex.commitSQLTransaction() instead? Otherwise we're not applying the changes to provoke conn executor to reset txn when the two-version invariant is violated.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @ZhouXing19)
pkg/sql/internal.go
line 1042 at r1 (raw file):
Otherwise we're not applying the changes to provoke conn executor to reset txn when the two-version invariant is violated.
The internal executor in question doesn't live past this function call, so what value does resetting its transaction have?
This isn't quite right. We always rolled back the transaction when the invariant was violated, we just did it with a more confusing API. Before this change, we called See Lines 691 to 700 in 00af7d7
And Lines 835 to 840 in 00af7d7
The code doesn't change the behavior in
nice! thanks for the new datadriven test too
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @ZhouXing19)
Previously, ZhouXing19 (Jane Xing) wrote…
nit: release justification is missing
I believe the justification is only needed in the PR (but confusingly there is a git hook that adds it to the commit msg too).
I see, thanks for the detailed explanation!
pkg/sql/internal.go
line 1042 at r1 (raw file):
Previously, ajwerner wrote…
Otherwise we're not applying the changes to provoke conn executor to reset txn when the two-version invariant is violated.
The internal executor in question doesn't live past this function call, so what value does resetting its transaction have?
Ah I see, sorry I misunderstood.
TFTR!
bors r+
Build failed (retrying...):
Build failed (retrying...):
Build failed (retrying...):
Build succeeded:
86433: server: proper transaction state management in sql-over-http r=ajwerner a=ajwerner
First 4 commits are #86427. Next commit is #86461.
We need to construct the internal executor in the context of the transaction so that we can make sure that its side-effects are properly managed. Without this change, we'd be throwing away all of the extraTxnState between each statement. We'd fail to create the jobs (which we defer to the end of the transaction), and we'd fail to run those jobs and check for errors. We'd also fail to validate the two-version invariant or wait for one version.
Fixes #86332
Release justification: Fixes critical bugs in new functionality.
Release note: None
Co-authored-by: Andrew Werner <awerner32@gmail.com>