storage/concurrency: push reservation holders to detect deadlocks #45567
Force-pushed from a92dd8d to 2b2f8eb
Just to check my understanding, a "reservation" is the state you're in if you're done waiting for some lock, but still blocking on locks on other keys, correct? I tried convincing myself of that from a comment in the code somewhere, but had a bit of trouble finding a comprehensive one. I'm also not sure the word "reservation" is really a concept in the code. Perhaps there's an opportunity to clarify here. (I looked at lock_table.go and lock_table_waiter.go.)
but @sumeerbhola has a better shot at finding subtle issues here if they still exist.
Reviewed 4 of 12 files at r1, 2 of 12 files at r2, 31 of 40 files at r4, 27 of 30 files at r5, 2 of 3 files at r7, 1 of 1 files at r8, 4 of 4 files at r9, 1 of 1 files at r10, 4 of 4 files at r11, 8 of 8 files at r12.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)
pkg/storage/concurrency/lock_table_waiter.go, line 384 at r12 (raw file):
```
h := w.pushHeader(req)
_, err := w.ir.PushTransaction(ctx, ws.txn, h, pushType)
```
And cleanup is not necessary in this case (if there was truly a deadlock)?
TFTR!
Just to check my understanding, a "reservation" is the state you're in if you're done waiting for some lock, but still blocking on locks on other keys, correct?
Yes, that's correct. Reservations are discussed in depth in the `// Reservations:` comment block.
How do you think we can make that easier to find and understand for new readers? Would a top-level diagram on that file explaining the different concepts at play in the lockTable have helped?
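Something like the following, perhaps: a hypothetical sketch with made-up keys, requests, and txns, not drawn from the actual code:

```
keyA: [lock: held by txn1]
        wait-queue: req2(txn2), req3(txn3)

keyB: [reservation: held by req2(txn2)]
        wait-queue: req4(txn4)

req2 holds the reservation on keyB (it is done waiting there), but is still
blocked on txn1's lock on keyA, so it can sit in the middle of a dependency
cycle without holding any lock itself.
```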
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola and @tbg)
pkg/storage/concurrency/lock_table_waiter.go, line 384 at r12 (raw file):
Previously, tbg (Tobias Grieger) wrote…
And cleanup is not necessary in this case (if there was truly a deadlock)?
No, there's nothing to clean up. If we do end up stumbling into a lock for this txn then we'll push it using pushLockTxn and clear the lock through that code path. I'll add a comment.
It turns out that this segment of the dependency cycle is always local to a single concurrency manager, so it could potentially forward the push through the reservation links to shorten the cycle and prevent non-lock holders from ever being the victim of a deadlock abort. This is tricky though, so for now, we just push.
Are you suggesting collapsing paths in the lockTable that contain only reservations, to construct edges from waiters to lock holders and sending those edges out to the txn wait queue?
For example:

```
[lock X: (txnA)]     <- req1 from txnB
[res req1 from txnB] <- req2 from txnA
```

And making the transitive path into an edge would give a self-cycle:

```
txnA <- txnA
```

So txnA would be aborted?
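To make the collapsing concrete, here is a runnable toy model in Go; the `txn`/`waiter` types and `collapseToLockHolder` are hypothetical simplifications, not the lockTable's actual structures:

```go
package main

import "fmt"

// Toy model of lockTable wait edges; the real lockTable types are far
// richer. A waiter is blocked on either a held lock or a reservation.
type txn struct{ name string }

type waiter struct {
	lockHolder *txn    // non-nil if blocked on a held lock
	resHolder  *waiter // non-nil if blocked on a reservation holder's request
}

// collapseToLockHolder hops over reservation-only links until it reaches
// a txn that actually holds a lock: the txn a pusher would push directly.
func collapseToLockHolder(w *waiter) *txn {
	for w.lockHolder == nil {
		w = w.resHolder
	}
	return w.lockHolder
}

func main() {
	txnA := &txn{name: "txnA"}
	// [lock X: (txnA)] <- req1 from txnB
	req1 := &waiter{lockHolder: txnA}
	// [res req1 from txnB] <- req2 from txnA
	req2 := &waiter{resHolder: req1}
	// Collapsing req2's path yields txnA <- txnA: a self-cycle, so txnA
	// would appear to be pushing itself.
	fmt.Println("req2 transitively waits on", collapseToLockHolder(req2).name)
}
```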
Reviewed 6 of 8 files at r12.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/storage/concurrency/lock_table_waiter.go, line 270 at r12 (raw file):
```
go w.watchForNotifications(pushCtx, pushCancel, newStateC)
err = w.pushRequestTxn(pushCtx, req, timerWaitingState)
pushCancel()
```
Why this call to pushCancel -- isn't this where the synchronous push RPC has finished (with or without cancellation), so what is being cancelled here?
pkg/storage/concurrency/lock_table_waiter.go, line 380 at r12 (raw file):
```
// block until either a) the pushee or the pusher is aborted due to a
// deadlock or b) the request exits the lock wait-queue and the caller
// of this function cancels the push.
```
there can be a state transition from holding the reservation to holding the lock. Seems like that will also cause cancellation.
Should we cancel only when the waited-on txnID changes? I guess the nature of the push once it is holding the lock is different so it is better to cancel.
Are you suggesting collapsing paths in the lockTable that contain only reservations, to construct edges from waiters to lock holders and sending those edges out to the txn wait queue?
Yes, that's what I'm suggesting. However, this doesn't necessarily need to lead to aborting txns. We could use this knowledge of local cycles or partially local cycles to break reservations and avoid aborting txns unnecessarily. That would be the real benefit of doing something like this, though I haven't thought through exactly how we'd want to do it.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola and @tbg)
pkg/storage/concurrency/lock_table_waiter.go, line 270 at r12 (raw file):
Yes, this is where the sync push has succeeded. context.WithCancel mandates that its cancelation function is always called to "free resources":

"Canceling this context releases resources associated with it, so code should call cancel as soon as the operations running in this Context complete."

There's even a lint warning in go vet for this.
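For reference, the usual shape of that pattern (a generic illustration of the context API, not the waiter's actual code):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	// Always call cancel, even after the operation completes successfully;
	// otherwise the context's resources aren't released until the parent
	// context is canceled (go vet's lostcancel check flags this).
	defer cancel()

	select {
	case <-time.After(10 * time.Millisecond): // stand-in for a synchronous push
		fmt.Println("push finished")
	case <-ctx.Done():
		fmt.Println("push canceled:", ctx.Err())
	}
}
```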
pkg/storage/concurrency/lock_table_waiter.go, line 380 at r12 (raw file):
Previously, sumeerbhola wrote…
there can be a state transition from holding the reservation to holding the lock. Seems like that will also cause cancellation.
Should we cancel only when the waited-on txnID changes? I guess the nature of the push once it is holding the lock is different so it is better to cancel.
Yeah, we could try to be a little smarter here about upgrading this form of push into the other form, but I don't think it's that important, especially if it makes the code more complex. We really shouldn't be performing this form of push outside of deadlock scenarios because kv.lock_table.deadlock_detection_push_delay is so large (and should be larger).
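A minimal sketch of that delayed-push shape, assuming a hypothetical deferredPush helper (not the actual lock_table_waiter code): with a large delay, the push should only ever fire in a genuine deadlock.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// deferredPush waits out the deadlock-detection delay before pushing, and
// skips the push entirely if the waiter is unblocked first. Hypothetical
// sketch only.
func deferredPush(ctx context.Context, delay time.Duration,
	unblocked <-chan struct{}, push func(context.Context) error) error {
	timer := time.NewTimer(delay)
	defer timer.Stop()
	select {
	case <-timer.C:
		return push(ctx) // still blocked after the delay: presume deadlock
	case <-unblocked:
		return nil // unblocked first: no push needed
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	unblocked := make(chan struct{})
	go func() { time.Sleep(time.Millisecond); close(unblocked) }()
	err := deferredPush(context.Background(), time.Second, unblocked,
		func(context.Context) error { return nil })
	fmt.Println("err:", err) // err: <nil>, since the waiter was unblocked in time
}
```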
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @sumeerbhola and @tbg)
Thanks for the reviews! I'll rebase this on master and merge once #45482 goes in.
… tests
Leaving this as a TODO for the LockTable tests.

This is a partial reversion of cockroachdb#45420.

It turns out that there are cases where a reservation holder is a link in a dependency cycle. This can happen when a reservation holder txn is holding on to one reservation while blocking on a lock on another key. If any txns queued up behind the reservation did not push someone, they wouldn't detect the deadlock.

```
     range A     .       range B
  txnB           .      txnC              txnA
   |             .       | ^________       |
   v             .       v          \      v
[lock X: (txnA)] .  [lock Y: (txnB)]  [res Z: (txnC)]
```

It turns out that this segment of the dependency cycle is always local to a single concurrency manager, so it could potentially forward the push through the reservation links to shorten the cycle and prevent non-lock holders from ever being the victim of a deadlock abort. This is tricky though, so for now, we just push.

To address the issue that motivated cockroachdb#45420, we perform this form of push asynchronously while continuing to listen to state transitions in the lockTable. If the pusher is unblocked (see cockroachdb#45420 for an example of when that can happen), it simply cancels the push and proceeds with navigating the lockTable.
Force-pushed from 2b2f8eb to 51178dc
bors r+
1 similar comment
bors r+
Build failed (retrying...)
Bors crashed because of my PR, so: bors r+ |
Already running a review |
Build succeeded |
This is a partial reversion of #45420.
It turns out that there are cases where a reservation holder is a link in a dependency cycle. This can happen when a reservation holder txn is holding on to one reservation while blocking on a lock on another key. If any txns queued up behind the reservation did not push someone, they wouldn't detect the deadlock.
It turns out that this segment of the dependency cycle is always local to a single concurrency manager, so it could potentially forward the push through the reservation links to shorten the cycle and prevent non-lock holders from ever being the victim of a deadlock abort. This is tricky though, so for now, we just push.
To address the issue that motivated #45420, we perform this form of push asynchronously while continuing to listen to state transitions in the lockTable. If the pusher is unblocked (see #45420 for an example of when that can happen), it simply cancels the push and proceeds with navigating the lockTable.
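A self-contained sketch of that async-push-with-cancellation pattern, with stand-ins for the real waiter methods (the pushRequestTxn here is a toy closure, not the actual method):

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

func main() {
	newStateC := make(chan struct{}, 1)

	// Stand-in for the real pushRequestTxn: it blocks until its context is
	// canceled, much as a push blocks until a deadlock is resolved.
	pushRequestTxn := func(ctx context.Context) error {
		<-ctx.Done()
		return ctx.Err()
	}

	pushCtx, pushCancel := context.WithCancel(context.Background())

	// Watcher: a lockTable state transition means the request is unblocked,
	// so the in-flight push is no longer needed and gets canceled.
	go func() {
		select {
		case <-newStateC:
			pushCancel()
		case <-pushCtx.Done():
		}
	}()

	// Simulate the request being unblocked shortly after the push starts.
	go func() {
		time.Sleep(10 * time.Millisecond)
		newStateC <- struct{}{}
	}()

	err := pushRequestTxn(pushCtx)
	pushCancel() // always release the push context's resources
	if errors.Is(err, context.Canceled) {
		fmt.Println("push canceled; request proceeds through the lockTable")
	}
}
```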
This PR also adds a set of datadriven deadlock tests to the concurrency manager test suite: testdata/concurrency_manager/deadlocks. These tests construct deadlocks due to lock ordering and request ordering and demonstrate how the deadlocks are resolved.
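For context, tests in that suite are driven with the cockroachdb/datadriven package. A minimal sketch of the harness shape, where the directive handling is illustrative and the real test registers many more commands:

```go
package concurrency_test

import (
	"testing"

	"github.com/cockroachdb/datadriven"
)

// Sketch of a datadriven harness; the actual concurrency manager test
// threads a mock concurrency manager through many more directives.
func TestDeadlocksSketch(t *testing.T) {
	datadriven.RunTest(t, "testdata/concurrency_manager/deadlocks",
		func(t *testing.T, d *datadriven.TestData) string {
			switch d.Cmd {
			case "new-txn":
				// ... create a txn from d.CmdArgs ...
				return ""
			default:
				d.Fatalf(t, "unknown command: %s", d.Cmd)
				return ""
			}
		})
}
```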