
storage: write to AbortSpan less, clean up AbortSpan more #42765

Merged

Conversation


@nvanbenschoten nvanbenschoten commented Nov 26, 2019

This PR introduces two improvements to our handling of the AbortSpan. It also adds the first set of comprehensive testing around AbortSpan entry state transitions, which was sorely missing.

This comes after a few different customer issues that at least at first glance appeared to be AbortSpan leaks. There's still more to do to resolve those, mostly by improving GC, but anything we can do here on the frontend to reduce the number of AbortSpan entries that need to be GCed in the first place helps.

Clean up span on non-poisoning, aborting EndTransaction request

Fixes #29128.

Before this change, an EndTransaction request sent to rollback a transaction record would not remove any AbortSpan entries, even if its own Poison flag was set to false. This allowed AbortSpan entries to leak. This commit fixes this behavior by removing the AbortSpan entry in this case.

There were concerns raised in #29128 about this being safe. It turns out that we already do this on every Range except the transaction record's Range during normal intent resolution, so this isn't introducing any new concerns.

Only write AbortSpan entries if intents are removed

This reduces the frequency of AbortSpan entries that can be abandoned even without a transaction coordinator failure. Specifically, it protects against the case where intent resolution races with a transaction coordinator cleaning up its own transaction record and intents. This can happen for both aborted and committed transactions.

In the first case, a pusher might find a transaction's intent and then find its record to be aborted after that transaction had cleanly rolled back its own intents. Even though the transaction's coordinator had already cleaned up and potentially "unpoisoned" AbortSpans, the pusher would happily re-introduce AbortSpan records when it goes to resolve the intents that were already cleaned up. These AbortSpan entries would be fully abandoned and have to wait out the GC.

Similarly, in the second case, the transaction might have committed. Here, the pushee might hit an intent and the txn coordinator might clean up and auto-GC its txn record before the pushee arrives at the txn record. Once the pushee gets there, it would mistake the txn for aborted (by design) and proceed to write an AbortSpan record where the intent it had once observed had been (not by design).

We can detect both of these cases simply by checking whether intent resolution actually succeeded. If intent resolution doesn't find an intent, then we might be in either case. That's fine, because we only ever need to poison the abort span if we actually removed an intent that could confuse a zombie transaction.

@cockroach-teamcity

This change is Reviewable


@tbg tbg left a comment


:lgtm:

The fact that we clear when !Poison seems problematic. This isn't new behavior, but it looks worth fixing (though not necessarily in this PR).

Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 3 of 3 files at r3, 15 of 15 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/storage/batcheval/cmd_resolve_intent.go, line 42 at r4 (raw file):

		txnID = t.IntentTxn.ID
	}
	if status == roachpb.ABORTED {

Could add a comment here that we won't even always write the key in this case, but we can't know this from this code. Your call


pkg/storage/batcheval/transaction_test.go, line 190 at r2 (raw file):

		{
			// NOTE: this request doesn't make sense, but we handle it. An abort
			// span shouldn't be present if the transaction is still committable.

In practice the commit would run into the abort span, but this check is outside of the code being tested, right?


pkg/storage/batcheval/transaction_test.go, line 306 at r2 (raw file):

		},
		{
			name:   "resolve intent, txn aborted, no poison, intent missing, abort span present",

Isn't this problematic? If a txn gets pushed first but the corresponding non-poisoning ResolveIntent gets delayed, then the txn gets aborted and poisoning happens, won't the delayed request unpoison accidentally?

The direct blame is broken due to code movement, but I think this is a problem I introduced at some point. Worth filing an issue. The bool should (as always) have been an enum NOOP, CLEAR, POISON.


@nvanbenschoten nvanbenschoten left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)


pkg/storage/batcheval/cmd_resolve_intent.go, line 42 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Could add a comment here that we won't even always write the key in this case, but we can't know this from this code. Your call

Done.


pkg/storage/batcheval/transaction_test.go, line 190 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

In practice the commit would run into the abort span, but this check is outside of the code being tested, right?

Well, an abort span should be possible to write unless the txn is already aborted, in which case the EndTxn would necessarily fail.


pkg/storage/batcheval/transaction_test.go, line 306 at r2 (raw file):

If a txn gets pushed first but the corresponding non-poisoning ResolveIntent gets delayed

I don't think there's a case where a PushTxn triggers a ResolveIntent without a Poison flag (even for PUSH_TIMESTAMP), so I think the only case where a ResolveIntent request can have Poison=false is when it was initiated by the transaction coordinator itself or when it was initiated by GC. Maybe I'm missing something.

The fact that we clear when !Poison seems problematic.

Isn't this kind of the point? When !Poison, we're saying that we're operating as the transaction and we're aware of the ABORT, so we're able to clear. When were you thinking we'd set Poison = NOOP?


@andreimatei andreimatei left a comment


LGTM good stuff

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @nvanbenschoten, and @tbg)


pkg/storage/batcheval/cmd_end_transaction.go, line 521 at r7 (raw file):

	}

	if WriteAbortSpanOnResolve(txn.Status) {

I like txn.Status == ABORTED better. In fact I'd get rid of this function.


pkg/storage/batcheval/cmd_end_transaction.go, line 522 at r7 (raw file):

	if WriteAbortSpanOnResolve(txn.Status) {
		if err := SetAbortSpan(ctx, evalCtx, batch, ms, txn.TxnMeta, args.Poison); err != nil {

please add a comment that this surprisingly might clear the entry. Or better yet, split the SetAbortSpan() function in two. I think this function is the textbook example of a bool argument anti-pattern.


pkg/storage/batcheval/transaction_test.go, line 27 at r6 (raw file):

// TestSetAbortSpan tests the different ways that request can set, update, and
// delete AbortSpan entries.
func TestSetAbortSpan(t *testing.T) {

excellent test


pkg/storage/batcheval/transaction_test.go, line 43 at r6 (raw file):

	txn := roachpb.MakeTransaction("test", txnKey, 0, hlc.Timestamp{WallTime: 1}, 0)
	txnAbortSpanEntry := roachpb.AbortSpanEntry{

maybe name this newTxnAbortSpanEntry


pkg/storage/batcheval/transaction_test.go, line 50 at r6 (raw file):

	// Used to detect updates to the AbortSpan entry.
	txnPrev := txn.Clone()
	txnPrev.WriteTimestamp.Add(-1, 0)

can you remind the reader if the priority and write timestamp matter for anything or if they're just testing artifacts?


pkg/storage/batcheval/transaction_test.go, line 191 at r6 (raw file):

			// NOTE: this request doesn't make sense, but we handle it. An abort
			// span shouldn't be present if the transaction is still committable.
			name:   "end txn, commit, no poison, abort span present",

let's crash in this case


pkg/storage/batcheval/transaction_test.go, line 212 at r6 (raw file):

			// NOTE: this request doesn't make sense, but we handle it. An EndTxn
			// should never pass Commit = true and Poison = true.
			name:   "end txn, commit, poison, abort span present",

let's crash


pkg/storage/batcheval/transaction_test.go, line 262 at r6 (raw file):

			name: "resolve intent, txn pending, poison, intent missing, abort span missing",
			run: func(b engine.ReadWriter, rec EvalContext) error {
				return resolveIntent(b, rec, roachpb.PENDING, true /* poison */)

resolving an intent with PENDING doesn't make sense, right? Let's crash.


@tbg tbg left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)


pkg/storage/batcheval/transaction_test.go, line 306 at r2 (raw file):

I don't think there's a case where a PushTxn triggers a ResolveIntent without a Poison flag (even for PUSH_TIMESTAMP),

Heh, it turns out you're right, but only because I had this same thought in 2017 and made that change:

// We always poison due to limitations of the API: not poisoning equals
// clearing the AbortSpan, and if our pushee transaction first got pushed
// for timestamp (by us), then (by someone else) aborted and poisoned, and
// then we run the below code, we're clearing the AbortSpan illegally.
// Furthermore, even if our pushType is not PUSH_ABORT, we may have ended
// up with the responsibility to abort the intents (for example if we find
// the transaction aborted).
//
// To do better here, we need per-intent information on whether we need to
// poison.
if err := ir.ResolveIntents(ctx, resolveIntents,
ResolveOptions{Wait: false, Poison: true}); err != nil {
return cleanup, roachpb.NewError(err)
}

I was just mildly confused how this all worked (if we push an intent timestamp, certainly the txn should not get a TransactionAbortedError?) but it looks like even before your PR we only mutate the abort span when aborting. So in a roundabout way, we have already implemented the NOOP option. I don't love the messiness of this, but it seems to be working as intended already: if we're not removing intents, Poison is effectively ignored.

@tbg tbg self-requested a review December 12, 2019 08:16

@nvanbenschoten nvanbenschoten left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)


pkg/storage/batcheval/cmd_end_transaction.go, line 521 at r7 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I like txn.Status == ABORTED better. In fact I'd get rid of this function.

But there's more to this function than just txn.Status == ABORTED.


pkg/storage/batcheval/cmd_end_transaction.go, line 522 at r7 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

please add a comment that this surprisingly might clear the entry. Or better yet, split the SetAbortSpan() function in two. I think this function is the textbook example of a bool argument anti-pattern.

How about renaming it to UpdateAbortSpan? I'd rather not pull the condition out to all callers and force them to make decisions that don't really concern them.


pkg/storage/batcheval/transaction_test.go, line 43 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

maybe name this newTxnAbortSpanEntry

Done.


pkg/storage/batcheval/transaction_test.go, line 50 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

can you remind the reader if the priority and write timestamp matter for anything or if they're just testing artifacts?

Good idea, done.


pkg/storage/batcheval/transaction_test.go, line 191 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

let's crash in this case

If we're all about the KV layer providing an external API (which it sounds like we are), then I don't think we should crash on user input. This case doesn't do anything wrong; we just wouldn't even get to batch evaluation if there's an abort span present.


pkg/storage/batcheval/transaction_test.go, line 212 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

let's crash

Again, I don't think we should crash, but in this case, I could see some extra input validation and rejecting the request. Done.


pkg/storage/batcheval/transaction_test.go, line 262 at r6 (raw file):

resolving an intent with PENDING doesn't make sense, right?

No, it does. That's what happens during a PUSH_TIMESTAMP push.

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/abortSpanResolve branch 2 times, most recently from 77d4805 to fbc5c1f Compare December 28, 2019 03:49

@andreimatei andreimatei left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @nvanbenschoten, and @tbg)


pkg/storage/batcheval/transaction_test.go, line 191 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

If we're all about the KV layer providing an external API (which it sounds like we are) then I don't think we should crash on user input. This case doesn't do anything wrong, we just wouldn't even get to batch evaluation if there's an abort span present.

Exactly - we wouldn't get to batch evaluation if there's an abort span. Even if the KV layer were to be treated as an "external" API (where I'd litigate the meaning of "external"), the KV layer isn't the request evaluation functions. Committing a txn with an abort span entry set would be a bug, and a bad one (wouldn't it?). So what are we testing here? The test says // Not aborted, don't touch abort span. What does "not aborted" mean? The txn is aborted. Whether or not the EndTransaction updates the abort span in this case seems like a frivolous concern.
Do you like this test? If I understand it correctly, I think we should get rid of it and just leave a note here about this case not being possible. And if it's easy enough, I do think it'd make sense to refuse the request (or crash), I guess at the evaluateBatch() level. But that higher level function is beyond the concern of this test.

@knz

knz commented Dec 28, 2019

I haven't looked into this PR yet but reflecting on the last two comments I thought I'd share my views:

  1. if an API needs to reject some invalid combinations of inputs and it was not designed to return error, then panic() is the next best thing before log.Fatal or os.Exit.

  2. in my book it's perfectly a-OK to implement unit tests that verify that an API rejects invalid input combinations. In this case, the test could ascertain that a panic is indeed thrown.

  3. we've started to grow a habit of throwing error objects using panic(), instead of strings, and this will pay dividends down the line. If done here, it also enables the test to verify the contents of the exception being thrown.


@nvanbenschoten nvanbenschoten left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)


pkg/storage/batcheval/transaction_test.go, line 191 at r6 (raw file):

What does "not aborted" mean?

Would you like it better if it said "not aborting"?

I think we should get rid of it and just leave a note here about this case not being possible

We could get rid of the test case, but I think there's value in enumerating all combinations of inputs here and just explaining the test cases that should never be hit in practice. It makes it easier to ensure that we've exhaustively tested all valid cases and at least considered the validity of invalid ones. What's the cost of keeping these two cases?

And if it's easy enough, I do think I'd make sense to refuse the request (or crash)

We certainly don't want to read from RocksDB just to perform a local assertion that is already caught further up the stack.


@andreimatei andreimatei left a comment


LGTM

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @nvanbenschoten, and @tbg)


pkg/storage/batcheval/cmd_end_transaction.go, line 521 at r7 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

But there's more to this function that just txn.Status == ABORTED.

ack


pkg/storage/batcheval/cmd_end_transaction.go, line 522 at r7 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

How about renaming it to UpdateAbortSpan? I'd rather not pull the condition out to all callers and force them to make decisions that don't really concern them.

UpdateAbortSpan sounds good.


pkg/storage/batcheval/transaction_test.go, line 191 at r6 (raw file):

What's the cost of keeping these two cases?

The biggest cost is paid when this test fails one day and I start wracking my brain about what it is testing.
If you want to enumerate, enumerate it as a comment. The way I see it, the only test to write (if any) is a test that the request is rejected - at whatever layer.
But I'll shut up now.

We certainly don't want to read from RocksDB just to perform a local assertion that is already caught further up the stack.

Right, I wasn't suggesting we check anything in EndTxn. I was saying we should do it in evaluateBatch() - as I now see that we already do.


pkg/storage/batcheval/transaction_test.go, line 212 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Again, I don't think we should crash, but in this case, I could see some extra input validation and rejecting the request. Done.

Thanks. I think the comment is no longer necessary on this test and the one above, now that we expect an error. Or change it to It is an error to....


pkg/storage/batcheval/transaction_test.go, line 262 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

resolving an intent with PENDING doesn't make sense, right?

No, it does. That's what happens during a PUSH_TIMESTAMP push.

So what "doesn't make sense" here then? Can you add more words to the comment?

Up until this point, we had no testing around this. For instance, not a single
test would catch it if we removed the deletion path from SetAbortSpan.
…ction request

Fixes cockroachdb#29128.

Release note (bug fix): AbortSpan records are now cleaned up more
aggressively when doing so is known to be safe.

Release note: None
Return an error if not.

Release note: None

@nvanbenschoten nvanbenschoten left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)


pkg/storage/batcheval/cmd_end_transaction.go, line 522 at r7 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

UpdateAbortSpan sounds good.

Done.


pkg/storage/batcheval/transaction_test.go, line 191 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

What's the cost of keeping these two cases?

The biggest cost is paid when this test fails one day and I start wracking my brain about what it is testing.
If you want to enumerate, enumerate it as a comment. The way I see it, the only test to write (if any) is a test that the request is rejected - at whatever layer.
But I'll shut up now.

We certainly don't want to read from RocksDB just to perform a local assertion that is already caught further up the stack.

Right, I wasn't suggesting we check anything in EndTxn. I was saying we should do it in evaluateBatch() - as I now see that we already do.

Ok, I removed the test cases.


pkg/storage/batcheval/transaction_test.go, line 212 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Thanks. I think the comment is no longer necessary on this test and the one above, now that we expect an error. Or change it to It is an error to....

Done.


pkg/storage/batcheval/transaction_test.go, line 262 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

So what "doesn't make sense" here then? Can you add more words to the comment?

I removed those comments. They actually do make sense because of the complexity @tbg brought up above.


@andreimatei andreimatei left a comment


🚢 it

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @tbg)

@nvanbenschoten

bors r+

craig bot pushed a commit that referenced this pull request Jan 2, 2020
42765: storage: write to AbortSpan less, clean up AbortSpan more r=nvanbenschoten a=nvanbenschoten


43383: storage: stop modifying requests and returning responses from TxnWaitQueue r=nvanbenschoten a=nvanbenschoten

This PR contains three pieces of cleanup that pave the way for pulling the TxnWaitQueue underneath the upcoming `pkg/storage/concurrency` package. It removes all cases where the queue mandates an update to the PushTxnRequest that entered it. It also removes all cases where the queue could field requests directly and return results for them. This restricts the TxnWaitQueue's interface and allows it to be pushed into the concurrency manager abstraction without bloating this new abstraction.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig

craig bot commented Jan 2, 2020

Build succeeded

And happy new year from bors! 🎉

@craig craig bot merged commit c7bbc4f into cockroachdb:master Jan 2, 2020
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/abortSpanResolve branch January 2, 2020 21:27
@tbg

tbg commented Feb 28, 2020

In light of recent events, how nervous would we be about backporting this?

@nvanbenschoten

I feel ok about backporting this to v19.2. Almost all of this was testing, and the few non-testing changes that went in here seem safe.

Here's the PR: #45553. It was mostly clean, and the parts that weren't were because of renames and other non-behavioral changes.

tbg added a commit to tbg/cockroach that referenced this pull request Mar 2, 2020
If there is "a lot" of data in Sys{Bytes,Count}, then we are likely
experiencing a large abort span. The abort span is not supposed to
become that large, but it does happen and causes stability fallout,
usually due to a combination of shortcomings:

1. there's no trigger for GC based on abort span size alone (before
   this commit)
2. transaction aborts tended to create unnecessary abort span entries,
   fixed (and 19.2-backported) in:
   cockroachdb#42765
3. aborting transactions in a busy loop:
   cockroachdb#38088
   (and we suspect this also happens in user apps occasionally)
4. large snapshots would never complete due to the queue time limits
   (addressed in cockroachdb#44952).

In an ideal world, we would factor in the abort span into this method
directly, but until then the condition guarding this block will do.
At worst, there is some other reason for SysBytes becoming that large
while also incurring a large SysCount, but I'm not sure how this
would happen. The only other things in this span are the versioned
range descriptors (which follow regular GC and it's only ever adding
a SysCount of one), or transaction records (which expire like abort
span records).

Release note (bug fix): Range garbage collection will now trigger
based on a large abort span, adding defense-in-depth against ranges
growing large (and eventually unstable).
tbg added a commit to tbg/cockroach that referenced this pull request Mar 4, 2020
tbg added a commit to tbg/cockroach that referenced this pull request Mar 5, 2020
craig bot pushed a commit that referenced this pull request Mar 5, 2020
45573: storage: trigger GC based on SysCount/SysBytes r=ajwerner a=tbg

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
tbg added a commit to tbg/cockroach that referenced this pull request Mar 5, 2020
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Apr 30, 2020
Fixes cockroachdb#43707.
Fixes cockroachdb#48046.
Fixes cockroachdb#48189.

Part of the change made by cockroachdb#42765 was to clear AbortSpan entries on
non-poisoning, aborting EndTxn requests. Specifically, this change was
made in 1328787. The change forgot to update the corresponding span
declaration logic to reflect the fact that we were now writing to the
AbortSpan in cases where we previously weren't.

This was triggering an assertion in race builds that tried to catch this
kind of undeclared span access. The assertion failure was very rare
because it required the following conditions to all be met:
1. running a test with the race detector enabled
2. a txn (A) must have been aborted by another txn (B)
3. txn B must have cleared an intent on txn A's transaction record range
4. txn A must have noticed and issued a non-poisoning EndTxn(ABORT)

We should backport this when we get a chance (once v20.1.0 has
stabilized), but I don't expect that this could actually cause any
issues. The AbortSpan update was strictly a matter of performance and we
should never be racing with another request that is trying to read the
same AbortSpan entry.
craig bot pushed a commit that referenced this pull request Apr 30, 2020
48190: sql: inject tenant ID in sqlServerArgs, pass through ExecutorConfig r=nvanbenschoten a=nvanbenschoten

Fixes #47903.
Informs #48123.

Also known as "the grand plumbing", this change replaces a few instances of `TODOSQLCodec` in `pkg/sql/sqlbase/index_encoding.go` and watches the house of cards fall apart. It then glues the world back together, this time using a properly injected tenant-bound SQLCodec to encode and decode all SQL table keys.

A tenant ID field is added to `sqlServerArgs`. This is used to construct a tenant-bound `keys.SQLCodec` during server creation. This codec morally lives on the `sql.ExecutorConfig`. In practice, it is also copied onto `tree.EvalContext` and `execinfra.ServerConfig` to help carry it around. SQL code is adapted to use this codec whenever it needs to encode or decode keys.

If all tests pass after this refactor, there is a good chance it got things right. This is because any use of an uninitialized SQLCodec will panic immediately when the codec is first used. This was helpful in ensuring that it was properly plumbed everywhere.

48245: kv: declare write access to AbortSpan on all aborting EndTxn reqs r=nvanbenschoten a=nvanbenschoten

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request May 4, 2020
Successfully merging this pull request may close these issues.

storage: clear abort span on non-poisoning, aborting EndTransaction request