concurrency: rollback locks sequence numbers on lock reacquisition for unreplicated locks #102269

arulajmani · 2023-04-25T18:29:30Z

Background

All operations performed by a transaction in CRDB have an associated sequence number, which are used to provide relative ordering for all operations. Savepoints are built on top of this concept -- whenever a savepoint is rolled back, associated sequence numbers are considered to be "ignored".

As of the time of writing, CRDB deviates from lock rollback compared to Postgres -- savepoint rollback does not result in lock rollbacks. This is because the client never issues an RPC to update any locks acquired under savepoints. However, the OnLockUpdate method on the LockManager, and the concurrency_control package in general, is designed to handle lock rollbacks.

Propopsal

Locks in the lock table store all sequence numbers at which they have been acquired. Calls to OnLockUpdate are passed a list of IgnoredSequenceNumbers. These are then used to prune the list of sequence numbers (holder.seqs) the lock is held at; if this list becomes empty, the lock is no longer held, and removed from the lock table.

In addition to the OnLockUpdate path, we should do this pruning on the lock (re-)acquisition path as well. This would entail teaching the arguments of OnLockAcquired (roachpb.LockAcquisition) to have access to a list of IgnoredSequenceNumbers. We could then make use of this list like we do in OnLockUpdate.

Jira issue: CRDB-27375

The text was updated successfully, but these errors were encountered:

When the lock table hears about lock acquisition at a newer epoch, it no longer needs to track lock history from prior epochs. Prior to this patch, however, we would only throw away lock history from prior epochs for the durability with which we were acquiring the lock. This meant that we could be needlessly tracking lock history for the other durability from a prior epoch. This patch changes this behavior -- we now get rid of all prior lock acquisition history from the previous epoch for both replicated and unreplicated locks. This change enables us to stop storing lock histories based on durability, which is what is proposed in the shared locks RFC. With this behavior change, I had to rework some of our existing tests that asserted the previous behavior. Some of the changes were non-trivial, but I tried to make sure we continued to test what the original test was intending to. While here, we also add an assertion to ensure the epoch of a lock holder never regresses. This codifies the requirement around epoch monotonicity from a comment on the `lockHolderInfo` struct. Informs cockroachdb#102269 Informs cockroachdb#91545 Release note: None

This patch extends the `roachpb.LockAcquisition` struct to include more information -- namely the strength with which the lock is being acquired and there are any ignored sequence numbers for the transaction acquiring the lock. For now, the strength is statically initialized to `lock.Intent`. We'll generalize this as we go about supporting multiple lock strengths in the lock table. The `AcquireLock` codepath is then changed to accept a `roachpb.LockAcquisition` struct. This doesn't change any functionality yet, but will enable us to rollback savepoints on lock acquisition and support multiple lock strengths in the lock table in upcoming patches. Informs #cockroachdb#102269 Informs cockroachdb#102270 Release note: None

arulajmani · 2023-05-05T21:40:21Z

@nvanbenschoten and I spoke about this at length today. Previously, the issue talked about pruning sequence numbers for both replicated and unreplicated locks on lock acquisition. However, this gets complex because of a few reason (that I'll describe below). So now we've changed our thinking to only rollback sequence numbers for unreplicated locks -- this is because the in-memory lock table is the source of truth for in-memory locks, not replicated locks.

Motivation

Locks can be acquired by a transaction multiple times, at multiple sequence numbers. Keeping track of the entire history of a lock's acquisition can thus grow unboundedly. Ideally, we would be able to compress this history while still preserving correctness.

The shared locks RFC (#101799) proposes that we only keep track of <strength, lowest sequence number (that hasn't been rolled back)> pairs. Put another way, the lock table can discard lock re-acquisitions at higher sequence numbers if the lock is held at the same strength at a lower sequence number, as long as that sequence number hasn't been rolled back.

Today, sequence numbers are only rolled back in tryUpdateLock. There is no guarantee we end up in this codepath after a savepoint rollback. In fact, it's quite common that we don't, given savepoint rollback doesn't try and eagerly resolve intents.

If we were not to consult ignored sequence numbers on the lock acquisition codepath, and instead blindly discard lock re-acquisitions at higher sequence numbers, we could (incorrectly) drop locks.

Challenges

While working on this, we discovered that lock re-acquisition for replicated locks (mvccPutInternal) does not filter ignored sequence numbers from its history; instead, it only decides not to move an older value into the IntentHistory if that value was written at a now rolled back sequence number. So rolling back is entirely "best effort" in that its incomplete. Relevant code:

cockroach/pkg/storage/mvcc.go

Lines 2051 to 2059 in 7c4867b

    
           // Only add the current provisional value to the intent 
        
           // history if the current sequence number is not ignored. There's no 
        
           // reason to add past committed values or a value already in the intent 
        
           // history back into it. 
        
           if curProvValRaw != nil { 
        
           	prevIntentValRaw := curProvValRaw 
        
           	prevIntentSequence := meta.Txn.Sequence 
        
           	buf.newMeta.AddToIntentHistory(prevIntentSequence, prevIntentValRaw) 
        
           }

This logic is impossible to replicate with what's proposed in the shared locks RFC. This means that the in-memory lock table's understanding of how a durable lock looks like can get out of sync. This can lead to all sorts of correctness bugs -- one can construct scenarios for bugs with infinite rediscovery loops or locks being incorrectly dropped (the original motivation for this issue).

Takeaways

All this points to the fact that keeping the in-memory state in sync with the replicated state (the source of truth) is fraught with subtle issues. We've seen this in other contexts as well. Instead of trying to work through and account for all these cases, we should be dumbed in the in-memory lock table. As a reminder to my future self, we should work under the principle that the in-memory lock table is just the source of truth for unreplicated locks and is dumb about replicated locks. That may mean that replicated locks might end up getting re-discovered by waiting requests -- not commonly, but that's fine.

With that in mind, the new proposal is to only filter the list of unreplicated locks by checking against ignored sequence numbers on the lock re-acquisition path. This then allows us to track just <strength, lowest sequence number (that hasn't been rolled back)> for unreplicated locks.

All this also means that we don't actually need to track sequence numbers for replicated locks in the in-memory lock table. Instead, we can just track whether a replicated lock exists or not. The way ignored sequence numbers are cleared for replicated locks is using a ResolveIntent request, which eventually cals into the in-memory lock table and into tryUpdateLock, where replicated locks are simply forgotten:

cockroach/pkg/kv/kvserver/concurrency/lock_table.go

Line 2249 in f4b07ba

if lock.Durability(i) == lock.Replicated || txn.Epoch > holder.txn.Epoch {

So tracking sequence numbers for replicated locks in the in-memory lock table doesn't achieve anything.

This patch extends the `roachpb.LockAcquisition` struct to include more information -- namely the strength with which the lock is being acquired and if there are any ignored sequence numbers for the transaction acquiring the lock. For now, the strength is statically initialized to `lock.Intent`. We'll generalize this as we go about supporting multiple lock strengths in the lock table. The `AcquireLock` codepath is then changed to accept a `roachpb.LockAcquisition` struct. This doesn't change any functionality yet, but will enable us to rollback savepoints on lock acquisition and support multiple lock strengths in the lock table in upcoming patches. Informs cockroachdb#102269 Informs cockroachdb#102270 Release note: None

101694: schemafeed: Add an optimization case where we don't wait on Peek or Pop r=Xiang-Gu a=Xiang-Gu This commit utilizes the recently introduced storage parameter "schema_locked" to achieve low-latency changefeed. Namely, we added logic to Peek/Pop to return "there is no table events" if the table schemas are locked without waiting for `highWater` to reach `atOrBefore`. This change allows us to reduce event commit-to-emit latency drastically from `changefeed.experimental_poll_interval` (which defaults to 1s) to milliseconds. Epic: CRDB-2258 Release note: None 102714: build: update nogo config to lint more generated code r=srosenberg,healthy-pod a=rickystewart To date we have used a regex like `cockroach/pkg/.*/.*\\.go` to identify first-party code, which does find our checked-in code, but does *not* match generated code, which is not staged at that path in the Bazel sandbox. Instead, these generated files are generally found in `bazel-out/.../bin`. In places where we've set `only_files`, in addition to the original regex, we add another regex to capture these generated files. Many packages are now (correctly) being flagged as missing package-level comments. In these cases I have added a very short package-level comment. Closes #102191. Epic: none Release note: None 102770: kv: extend lock acquisition struct to include more information r=nvanbenschoten a=arulajmani This patch extends the `roachpb.LockAcquisition` struct to include more information -- namely the strength with which the lock is being acquired and there are any ignored sequence numbers for the transaction acquiring the lock. For now, the strength is statically initialized to `lock.Intent`. We'll generalize this as we go about supporting multiple lock strengths in the lock table. The `AcquireLock` codepath is then changed to accept a `roachpb.LockAcquisition` struct. This doesn't change any functionality yet, but will enable us to rollback savepoints on lock acquisition and support multiple lock strengths in the lock table in upcoming patches. Informs ##102269 Informs #102270 Release note: None Co-authored-by: Xiang Gu <xiang@cockroachlabs.com> Co-authored-by: Ricky Stewart <rickybstewart@gmail.com> Co-authored-by: Arul Ajmani <arulajmani@gmail.com>

Lock re-acquisitions from requests that belong to prior epochs is not allowed. For replicated locks, mvccPutInternal rejects such requests with an error. However, for unreplicated locks where the in-memory lock table is the source of truth, no such check happens. This patch adds one. The comment on `lockHolderInfo.txn` field suggests this was always the intended behavior; it was just never enforced. While here, we also add an assertion to ensure a replicated lock acquisition never calls into the lock table with a transaction belonging to a prior epoch. We expect this case to be handled in mvccPutInternal, so unlike the unreplicated case, this deserves to be an assertion failure. This patch also updates a test that unintentionally re-acquired a lock using a txn from an older epoch. We also add a new test. Informs: cockroachdb#102269 Release note: None

This patch adds logic to prune the list of sequence numbers tracked by the lock table for unreplicated locks. This is done when some of the tracked sequence numbers are considered ignored, by virtue of a savepoint rollback. Note that we only do so for unreplicated locks, and only if an unreplicated lock is being reacquired. This is because the in-memory lock table is only the source of truth for in-memory locks; the mvcc keyspace is the source of truth for replicated ones. As such, trying to mimic the logic is hard/error-prone -- so we don't attempt to do so. Fixes cockroachdb#102269 Release note: None

Lock re-acquisitions from requests that belong to prior epochs is not allowed. For replicated locks, mvccPutInternal rejects such requests with an error. However, for unreplicated locks where the in-memory lock table is the source of truth, no such check happens. This patch adds one. The comment on `lockHolderInfo.txn` field suggests this was always the intended behavior; it was just never enforced. While here, we also add an assertion to ensure a replicated lock acquisition never calls into the lock table with a transaction belonging to a prior epoch. We expect this case to be handled in mvccPutInternal, so unlike the unreplicated case, this deserves to be an assertion failure. This patch also updates a test that unintentionally re-acquired a lock using a txn from an older epoch. We also add a new test. Informs: cockroachdb#102269 Release note: None

103387: grpcutil: improve gRPC logging r=knz a=tbg Extracted from #99191, where I ran a bunch of experiments to make sure the logging around connection failures is clean and helpful. I did manual testing on the changes in this PR. It's tricky to test vmodule-depth-thingies programmatically and I don't want to spend time building new infra here. - cli: don't lower gRPC severity threshold - grpcutil: adopt DepthLoggerV2 - sidetransport: suppress logging when peer down Release note: None Epic: CRDB-21710 103492: kv: add assertion to prevent lock re-acquisition at prior epochs r=nvanbenschoten a=arulajmani Lock re-acquisitions from requests that belong to prior epochs is not allowed. For replicated locks, mvccPutInternal rejects such requests with an error. However, for unreplicated locks where the in-memory lock table is the source of truth, no such check happens. This patch adds one. The comment on `lockHolderInfo.txn` field suggests this was always the intended behavior; it was just never enforced. While here, we also add an assertion to ensure a replicated lock acquisition never calls into the lock table with a transaction belonging to a prior epoch. We expect this case to be handled in mvccPutInternal, so unlike the unreplicated case, this deserves to be an assertion failure. This patch also updates a test that unintentionally re-acquired a lock using a txn from an older epoch. We also add a new test. Informs: #102269 Release note: None 103533: allocator2: fix normalization for the no constraint case r=kvoli a=sumeerbhola Informs #103320 Epic: CRDB-25222 Release note: None 103579: server: add comment pointing out TestGraphite slowness r=aliher1911 a=tbg Epic: none Release note: None 103581: server,rpc,circuit: assortment of small cleanups/improvements r=knz a=tbg Extracted from #99191. - server: fix a (harmless) test buglet - rpc: small comment tweak - rpc: fix incorrect comment - circuit: provide type for the anonymous Signal interface - rpc: rename a package - rpc: minor Goland lint appeasement - rpc: avoid shadowing `backoff` package - rpc: make some unused args anonymous - rpc: make it clear that TestStoreMetrics restarts stores Release note: None Epic: None Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com> Co-authored-by: Arul Ajmani <arulajmani@gmail.com> Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>

103493: concurrency: rollback ignored sequence numbers on lock acquisition r=nvanbenschoten a=arulajmani This patch adds logic to prune the list of sequence numbers tracked by the lock table for unreplicated locks. This is done when some of the tracked sequence numbers are considered ignored, by virtue of a savepoint rollback. Note that we only do so for unreplicated locks, and only if an unreplicated lock is being reacquired. This is because the in-memory lock table is only the source of truth for in-memory locks; the mvcc keyspace is the source of truth for replicated ones. As such, trying to mimic the logic is hard/error-prone -- so we don't attempt to do so. Fixes #102269 Release note: None 103617: util/parquet: refactor random testing types r=miretskiy a=jayshrivastava This change refactors randomized testing to use `randgen.RandType`. `randgen.RandType` is better as it takes into account all allowable types which can appear in CRDB (ex. array of tuple). The previous code only generated random types which are supported by the writer which leaves a gap when new types are added. Now, the code defaults to all types and filters out unsupported ones. Informs: #99028 Epic: https://cockroachlabs.atlassian.net/browse/CRDB-15071 Release note: None 103668: roachtest: get debug zip with --include-range-info r=renatolabs a=tbg Noticed in #103552 that we are missing the ranges.jsons, which is sad. This flag is new as of #102289. Epic: none Release note: None Co-authored-by: Arul Ajmani <arulajmani@gmail.com> Co-authored-by: Jayant Shrivastava <jayants@cockroachlabs.com> Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>

arulajmani added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-transactions Relating to MVCC and the transactional model. T-kv KV Team A-read-committed Related to the introduction of Read Committed labels Apr 25, 2023

arulajmani mentioned this issue Apr 25, 2023

kv: implement SHARED lock strength for unreplicated locks #91545

Closed

arulajmani self-assigned this May 1, 2023

arulajmani mentioned this issue May 4, 2023

concurrency: discard lock state from prior epochs on lock acquisition #102759

Closed

arulajmani mentioned this issue May 4, 2023

kv: extend lock acquisition struct to include more information #102770

Merged

arulajmani changed the title ~~concurrency: rollback locks sequence numbers on lock reacquisition~~ concurrency: rollback locks sequence numbers on lock reacquisition for unreplicated locks May 5, 2023

arulajmani mentioned this issue May 10, 2023

rfc: introduce shared locks RFC #101799

Open

arulajmani mentioned this issue May 17, 2023

kv: add assertion to prevent lock re-acquisition at prior epochs #103492

Merged

arulajmani mentioned this issue May 17, 2023

concurrency: rollback ignored sequence numbers on lock acquisition #103493

Merged

craig bot closed this as completed in 9e78b0a May 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

concurrency: rollback locks sequence numbers on lock reacquisition for unreplicated locks #102269

concurrency: rollback locks sequence numbers on lock reacquisition for unreplicated locks #102269

arulajmani commented Apr 25, 2023 •

edited by cockroach-jira-scripts

Loading

arulajmani commented May 5, 2023

concurrency: rollback locks sequence numbers on lock reacquisition for unreplicated locks #102269

concurrency: rollback locks sequence numbers on lock reacquisition for unreplicated locks #102269

Comments

arulajmani commented Apr 25, 2023 • edited by cockroach-jira-scripts Loading

arulajmani commented May 5, 2023

arulajmani commented Apr 25, 2023 •

edited by cockroach-jira-scripts

Loading