kv: don't unquiesce uninitialized replicas #73362
Conversation
I think this is incorrect in two ways which offset each other. Uninitialized replicas don't know that they are learners, because they don't have access to the range descriptor. This means that they aren't disqualified from being "promotable" on that basis, at least from their perspective. However, uninitialized replicas don't know the other members of their range and don't even track themselves in their progress map. This also means that leader elections cannot be used to trigger the cleanup of uninitialized replicas. @tbg's new state machine diagram of replica initialization also hints at there being no path for an uninitialized replica to be removed without going through an initialized replica.
#47982 looks relevant. I guess that confirms the suspicion. So I guess the next steps here are:
@ajwerner do you have interest in pushing that PR over the finish line?
Let's keep Andrew out of it, I'm sure he has enough to do. I might be missing something, but isn't GC'ing uninitialized replicas difficult? We would need an index RangeID->RangeDesc which we don't have (meta ranges are keyed on EndKey, not RangeID) or do full meta scans every now and then.
Also worth pausing to think about the benefit of replicaGC'ing uninit'ed replicas once we don't tick them. Once we no longer tick them, other than a sense of closure the main upshot is removing them from the in-memory maps. The disk state is negligibly different, RangeTombstone key vs HardState key.
We can definitely replicaGC uninitialized replicas that have a trivial HardState.
I lean towards doing a ~daily scan through the meta ranges per node and performing GC of uninitialized replicas based on that, since that is simple and effective and handles the problem fully.
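To make the shape of that idea concrete, here is a minimal sketch, assuming invented types and a stand-in for the meta-range scan (it is not the CockroachDB implementation):

```go
package main

import "fmt"

type RangeID int64

// store is a hypothetical stand-in for the real Store; it only tracks which
// uninitialized replicas it holds.
type store struct {
	uninitialized map[RangeID]bool
}

// gcUninitialized drops uninitialized replicas whose range no longer appears
// in liveRanges, the (hypothetical) result of a scan through the meta ranges.
// The real system would also need to write a RangeTombstone before removal.
func (s *store) gcUninitialized(liveRanges map[RangeID]bool) {
	for id := range s.uninitialized {
		if !liveRanges[id] {
			delete(s.uninitialized, id)
			fmt.Printf("GC'ed uninitialized replica for r%d\n", id)
		}
	}
}

func main() {
	s := &store{uninitialized: map[RangeID]bool{7: true, 9: true}}
	// Pretend the daily meta scan only found r7; r9 was merged away.
	s.gcUninitialized(map[RangeID]bool{7: true})
}
```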
Reviewed 4 of 4 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
-- commits, line 28 at r2:
It's by design in the sense that we can't replicaGC them for the reasons discussed in the main thread. There's definitely no desire to not be able to replicaGC them but it hasn't been a priority.
-- commits, line 31 at r2:
Correct, which might never happen, for example if the range doesn't exist any more (merge).
-- commits, line 35 at r2:
nit: they aren't always learners. You can also get uninit'ed voters with a quick split-rebalance succession. But the uninitialized replica doesn't even know what it is; it will have an empty config, see here:
cockroach/pkg/kv/kvserver/replica_raftstorage.go, lines 66 to 69 at 4c8c0d4:

    // For uninitialized ranges, membership is unknown at this point.
    if raft.IsEmptyHardState(hs) || err != nil {
        return raftpb.HardState{}, raftpb.ConfState{}, err
    }
So they certainly will not call an election, ever, since the progress map is empty. They also can't send a message to anyone in general, since that would require them to know about at least one peer, and peers would have to be in the configuration. The only thing they can do is respond to incoming messages.
This indicates that by not ticking uninit'ed replicas, we're not regressing on their ability to "commonly" replicaGC themselves in response to a rejection of one of their outgoing messages, as this basically never happens in the first place anyway.
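For illustration, here is a small self-contained sketch of that point. It uses the upstream etcd raft types, but the surrounding function is made up: with an empty ConfState there are no peers to campaign against or to message.

```go
package main

import (
	"fmt"

	"go.etcd.io/etcd/raft/v3"
	"go.etcd.io/etcd/raft/v3/raftpb"
)

// initialState mimics the quoted logic: for an uninitialized range the
// membership is unknown, so raft is handed an empty ConfState.
func initialState(hs raftpb.HardState, voters []uint64) (raftpb.HardState, raftpb.ConfState) {
	if raft.IsEmptyHardState(hs) {
		return raftpb.HardState{}, raftpb.ConfState{}
	}
	return hs, raftpb.ConfState{Voters: voters}
}

func main() {
	_, cs := initialState(raftpb.HardState{}, []uint64{1, 2, 3})
	// With no voters in the config, the replica has nobody to campaign
	// against and nobody to send messages to; it can only react.
	fmt.Println(len(cs.Voters)) // prints 0
}
```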
pkg/kv/kvserver/store_raft.go, line 638 at r2 (raw file):
    // Filter out uninitialized replicas. We could tick them, but doing so is
    // unnecessary because uninitialized replicas can never win elections, so
Might want to update this to reflect our finding that uninited replicas cannot even send non-reactive messages since they don't know who the peers are.
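As a rough sketch of what such a filter looks like (simplified, invented types rather than the real store_raft.go code), the tick path simply skips replicas that are still uninitialized:

```go
package main

import "fmt"

type rangeID int64

type replica struct {
	initialized bool
}

type store struct {
	unquiesced map[rangeID]*replica
}

// updateIDsToTick appends the IDs of replicas that should receive a Raft tick,
// filtering out uninitialized replicas, which never campaign anyway.
func (s *store) updateIDsToTick(ids []rangeID) []rangeID {
	for id, r := range s.unquiesced {
		if !r.initialized {
			continue // uninitialized: nothing useful for a tick to do
		}
		ids = append(ids, id)
	}
	return ids
}

func main() {
	s := &store{unquiesced: map[rangeID]*replica{
		1: {initialized: true},
		2: {initialized: false},
	}}
	fmt.Println(s.updateIDsToTick(nil)) // only r1 is ticked
}
```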
pkg/kv/kvserver/store_raft.go, line 647 at r2 (raw file):
    // them through the Raft scheduler and ticking them avoids a meaningful
    // amount of periodic work for each uninitialized replica.
    s.mu.RLock()
Reading this makes me wonder why we allow uninited replicas to be unquiesced. Not suggesting you change that, but maybe add a comment saying that it's just the way it is and there is no good reason and we don't like lots of things about uninitialized replicas and see #72374.
isn't GC'ing uninitialized replicas difficult? We would need an index RangeID->RangeDesc which we don't have (meta ranges are keyed on EndKey, not RangeID) or do full meta scans every now and then.
This makes sense to me, thanks for explaining. This must mean that the approach in #47982 is broken. It seems to perform its consistent meta lookup with the replica's in-memory RangeDescriptor here. But if the replica is uninitialized, this descriptor will be empty. So it won't actually look up the right range.
Also worth pausing to think about the benefit of replicaGC'ing uninit'ed replicas once we don't tick them. Once we no longer tick them, other than a sense of closure the main upshot is removing them from the in-memory maps. The disk state is negligibly different, RangeTombstone key vs HardState key.
👍 that matches my understanding as well.
I lean towards doing a ~daily scan through the meta ranges per node and performing GC of uninitialized replicas based on that, since that is simple and effective and handles the problem fully.
This seems nice and simple.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/kv/kvserver/store_raft.go, line 647 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Reading this makes me wonder why we allow uninited replicas to be unquiesced. Not suggesting you change that, but maybe add a comment saying that it's just the way it is and there is no good reason and we don't like lots of things about uninitialized replicas and see #72374.
That's a really interesting idea. If we allowed uninited replicas to quiesce (on their own, with no leader input) then we wouldn't need any of this special handling. These replicas would be removed from the unquiescedReplicas map and stay out of it until they heard from a peer. I'll explore that alternate approach, as it seems pretty clean and conceptually consistent.
Thanks for all your work on this @nvanbenschoten! Wrote up an issue to handle GC in #73424.
Force-pushed from 6a7631b to 32b96f4.
Thanks for opening the issue @erikgrinaker! I updated the code comment to point to that issue.
@tbg I rewrote this PR according to your suggestion. Instead of letting uninitialized replicas unquiesce but then filtering them out in the tick loop, we now no longer let uninitialized replicas unquiesce. This seems cleaner to me, as it introduces a reasonable relationship between state machine initialization and quiescence and uses the existing quiescence logic to prevent these replicas from ticking, instead of introducing an additional orthogonal condition for ticking. PTAL.
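A minimal sketch of that relationship, using hypothetical types rather than the real Replica: Raft traffic is still handled reactively, but only an initialized replica actually leaves the quiesced state.

```go
package main

import "fmt"

type replica struct {
	quiescent   bool
	initialized bool
	msgs        int
}

// handleRaftMessage always does the reactive work, but only unquiesces the
// replica once its state machine has been initialized (e.g. via snapshot).
func (r *replica) handleRaftMessage() {
	r.msgs++ // reactive processing happens regardless
	if r.initialized && r.quiescent {
		r.quiescent = false
	}
}

func main() {
	r := &replica{quiescent: true}
	r.handleRaftMessage()
	fmt.Println(r.quiescent, r.msgs) // true 1: still quiesced, but it responded
	r.initialized = true             // snapshot applied
	r.handleRaftMessage()
	fmt.Println(r.quiescent, r.msgs) // false 2: now it unquiesces
}
```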
Force-pushed from 32b96f4 to 6b66392.
Reviewed 2 of 4 files at r4, 9 of 9 files at r9, 4 of 4 files at r10, 4 of 4 files at r11, 4 of 4 files at r12, 7 of 7 files at r13, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/client_raft_test.go, line 4105 at r13 (raw file):
    defer tc.Stopper().Stop(ctx)
    _, desc, err := tc.Servers[0].ScratchRangeEx()
Add a comment about using an expiration-based lease or, my preference if possible, switch to ScratchRange.
pkg/kv/kvserver/client_raft_test.go, line 4110 at r13 (raw file):
    require.NoError(t, tc.WaitForSplitAndInitialization(key))
    // Block a snapshot.
// Block incoming snapshots on s2 until channel is signaled.
pkg/kv/kvserver/client_raft_test.go, line 4114 at r13 (raw file):
    handlerFuncs := noopRaftHandlerFuncs()
    handlerFuncs.snapErr = func(header *kvserver.SnapshotRequest_Header) error {
        <-blockSnapshot
give up on stopper.ShouldQuiesce()?
pkg/kv/kvserver/client_raft_test.go, line 4117 at r13 (raw file):
        return nil
    }
    s1, err := tc.Server(1).GetStores().(*kvserver.Stores).GetStore(tc.Server(1).GetFirstStoreID())
s2? s1 is on tc.Server(0). Indexing, always a joy.
pkg/kv/kvserver/client_raft_test.go, line 4125 at r13 (raw file):
    })
    // Try to up-replicate to s1. Should block on a learner snapshot after the new
s2 here and below?
pkg/kv/kvserver/client_raft_test.go, line 4130 at r13 (raw file):
    // receiving Raft traffic from the leader.
    replicateErrChan := make(chan error)
    go func() {
nit: I know it doesn't matter here but if we have a stopper might as well use it.
pkg/kv/kvserver/replica.go, line 306 at r13 (raw file):
    //
    // Replica objects always begin life in a quiescent state, as the field is
    // set to true in the Replica constructor newUnloadedReplica. They unquiesce
Ohh, interesting. I had no idea. That explains your earlier commit. I take back my comment about || internalRaftGroup == nil then.
pkg/kv/kvserver/replica.go, line 310 at r13 (raw file):
    // unquiesceWithOptionsLocked, which are called in response to Raft traffic.
    //
    // Only initialized replicas that have a non-nil internalRaftGroup are
This makes it sound like there are initialized replicas with a nil raft group, which raises the question of how these are allowed to unquiesce (since they are not according to this comment). So either init'ed replicas with nil group don't exist, or whether the group is nil or not doesn't matter for purposes of unquiescing?
pkg/kv/kvserver/replica_metrics.go, line 62 at r11 (raw file):
    raftStatus := r.raftStatusRLocked()
    leaseStatus := r.leaseStatusAtRLocked(ctx, now)
    quiescent := r.mu.quiescent || r.mu.internalRaftGroup == nil
Interesting. Have I missed a commit? Because, wouldn't an uninitialized replica start out with quiescent = false now when it was previously true? As is, this seems like a behavior change to me. Not a terribly important one (since this is only for ReplicaMetrics and internalRaftGroup is populated anyway within ~10 minutes (scan interval)) but still, this doesn't quite check out with the commit message.
pkg/kv/kvserver/replica_raft_quiesce.go, line 52 at r13 (raw file):
    }
    func (r *Replica) unquiesceWithOptionsLocked(campaignOnWake bool) {
maybeUnquiesceWithOptionsLocked to indicate that it may not be possible?
pkg/kv/kvserver/replica_raft_quiesce.go, line 116 at r13 (raw file):
    // lazily in response to any Raft traffic (see stepRaftGroup) or KV request
    // traffic (see maybeInitializeRaftGroup). If it has yet to be initialized,
    // let it remain quiesced. The Raft group will be initialized soon enough.
This is pretty annoying; we morally say that a replica unquiesces when it needs to use raft, but then we also say this isn't true, you need some folks bypassing this check and creating the group first? This is pretty complex and I'm not at all clear on why this works out in practice (I assume we have the right calls in the actual critical path).
I think we should initialize the raft group right away. Is now perhaps the right time to make that change? Doing this lazily was a half-baked way to avoid election storms back when this was still a problem and possibly before we even had quiescence. Now that quiescence is the default, what's the benefit of the lazy raft group?
I understand that we're looking to backport this, so maybe a fast-follow on master only?
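For context, a toy sketch of the lazy-creation pattern under discussion, with invented names standing in for withRaftGroupLocked and friends: the group only exists after some caller actually needs Raft, which is why an unquiesce attempt can observe a nil group.

```go
package main

import "fmt"

type raftGroup struct{}

type replica struct {
	group     *raftGroup
	quiescent bool
}

// withRaftGroup lazily creates the Raft group before running f, mirroring the
// "create on first use" behavior described above.
func (r *replica) withRaftGroup(f func(g *raftGroup)) {
	if r.group == nil {
		r.group = &raftGroup{} // created on first use, not at replica creation
	}
	f(r.group)
}

// maybeUnquiesce refuses to unquiesce while the group is still nil, which is
// the dependency this thread is wrestling with.
func (r *replica) maybeUnquiesce() bool {
	if r.group == nil {
		return false
	}
	r.quiescent = false
	return true
}

func main() {
	r := &replica{quiescent: true}
	fmt.Println(r.maybeUnquiesce()) // false: no group has been created yet
	r.withRaftGroup(func(*raftGroup) {})
	fmt.Println(r.maybeUnquiesce()) // true: group exists, replica unquiesces
}
```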
pkg/kv/kvserver/store_create_replica.go, line 351 at r13 (raw file):
    // quiesced, so we don't need to campaign or wake the leader. We just want
    // to start ticking.
    lockedRepl.unquiesceWithOptionsLocked(false /* campaignOnWake */)
Beating the same drum as before, but now we have to know that the internal raft group is no longer nil (probably because we hit withReplicaForRequest?)
Did you have to add this line in response to any failures? It makes sense, but shouldn't be necessary, right?
Force-pushed from 6b66392 to b4f814d.
TFTR! I finally got around to addressing your comments.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvserver/client_raft_test.go, line 4105 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Add a comment about using an expiration-based lease or, my preference if possible, switch to ScratchRange.
ScratchRangeEx isn't about an expiration-based lease; it's the "extended" version of ScratchRange that returns the left and right descriptor instead of a start key. This made more sense to me than the logic I see in some places where we ScratchRange and then take the returned start key and look up the corresponding descriptor that we just dropped in ScratchRange.
pkg/kv/kvserver/client_raft_test.go, line 4110 at r13 (raw file):
// Block incoming snapshots on s2 until channel is signaled.
Done.
pkg/kv/kvserver/client_raft_test.go, line 4114 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
give up on stopper.ShouldQuiesce()?
Done.
pkg/kv/kvserver/client_raft_test.go, line 4117 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
s2? s1 is on tc.Server(0). Indexing, always a joy.
Done.
pkg/kv/kvserver/client_raft_test.go, line 4125 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
s2 here and below?
Done.
pkg/kv/kvserver/client_raft_test.go, line 4130 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
nit: I know it doesn't matter here but if we have a stopper might as well use it.
Done.
pkg/kv/kvserver/replica.go, line 306 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Ohh, interesting. I had no idea. That explains your earlier commit. I take back my comment about || internalRaftGroup == nil then.
Me neither until writing this up. But I think it makes sense for things to work like that.
pkg/kv/kvserver/replica.go, line 310 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
This makes it sound like there are initialized replicas with a nil raft group, which raises the question of how these are allowed to unquiesce (since they are not according to this comment). So either init'ed replicas with nil group don't exist, or whether the group is nil or not doesn't matter for purposes of unquiescing?
The dependency is the other way around. There are initialized replicas with a nil raft group. They unquiesce by first initializing their raft group in withRaftGroupLocked and then trying to unquiesce. But let's get rid of this craziness in #73715.
pkg/kv/kvserver/replica_metrics.go, line 62 at r11 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Interesting. Have I missed a commit? Because, wouldn't an uninitialized replica start out with quiescent = false now when it was previously true? As is, this seems like a behavior change to me. Not a terribly important one (since this is only for ReplicaMetrics and internalRaftGroup is populated anyway within ~10 minutes (scan interval)) but still, this doesn't quite check out with the commit message.
I think this comment came before your "I take back my comment", but in case it wasn't clear, all replicas (initialized or uninitialized) start with quiescent = true. They can then only unquiesce if r.mu.internalRaftGroup != nil, which has always been the case and is not changing in this PR. So !r.mu.quiescent implies r.mu.internalRaftGroup != nil, meaning that this was redundant.
pkg/kv/kvserver/replica_raft_quiesce.go, line 52 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
maybeUnquiesceWithOptionsLocked to indicate that it may not be possible?
Done. I also added a return value that we can use for assertions in a few places. PTAL at the new commit.
pkg/kv/kvserver/replica_raft_quiesce.go, line 116 at r13 (raw file):
Previously, tbg (Tobias Grieger) wrote…
This is pretty annoying; we morally say that a replica unquiesces when it needs to use raft, but then we also say this isn't true, you need some folks bypassing this check and creating the group first? This is pretty complex and I'm not at all clear on why this works out in practice (I assume we have the right calls in the actual critical path).
I think we should initialize the raft group right away. Is now perhaps the right time to make that change? Doing this lazily was a half-baked way to avoid election storms back when this was still a problem and possibly before we even had quiescence. Now that quiescence is the default, what's the benefit of the lazy raft group?
I understand that we're looking to backport this, so maybe a fast-follow on master only?
I think I see what you're saying. The idea would be that if quiescence is the default and we only unquiesce in all the same places that would otherwise initialize the raft group, then this lazy raft group creation isn't doing anything anymore, and so the dependency from internalRaftGroup != nil -> !quiescent is unnecessary complexity. I think I agree with that in theory, as it simplifies the states that a replica can be in significantly.
But I'd like to avoid expanding the scope of this PR any further, so a fast-follow sounds good. I opened #73715 to track the work.
pkg/kv/kvserver/store_create_replica.go, line 351 at r13 (raw file):
but now we have to know that the internal raft group is no longer nil (probably because we hit withReplicaForRequest?)
Yes, that's correct. If we made it here then it was in response to a raft message and so the internal raft group is no longer nil.
Did you have to add this line in response to any failures? It makes sense, but shouldn't be necessary, right?
No, this wasn't in response to any failures. In fact, it's not strictly necessary because this method is only ever called inside of handleRaftReadyRaftMuLocked in response to snapshots, which also calls withRaftGroupLocked with a true return value for unquiesceAndWakeLeader before completing. So if we ever initialize ourselves inside of handleRaftReadyRaftMuLocked and don't unquiesce before getting to the end, that will take care of unquiescing for us.
But this isn't entirely superfluous; there is a benefit to calling this here instead of waiting. That's because doing so allows us to call unquiesceWithOptionsLocked and to pass !campaignOnWake. So since we know we don't need to campaign or wake the leader, it's better to unquiesce now and prevent the other code path from doing so.
I added both of these points to the comment.
pkg/kv/kvserver/store_raft.go, line 638 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Might want to update this to reflect our finding that uninited replicas cannot even send non-reactive messages since they don't know who the peers are.
Done.
Reviewed 8 of 8 files at r14, 1 of 1 files at r15, 1 of 1 files at r16, 6 of 6 files at r17, 10 of 10 files at r18, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/store_create_replica.go, line 369 at r18 (raw file):
    // the range, so we can unquiesce without campaigning or waking the leader.
    if !lockedRepl.maybeUnquiesceWithOptionsLocked(false /* campaignOnWake */) {
        return errors.Errorf("expected replica %s to unquiesce after initialization", desc)
errors.AssertionFailedf? If we hit this the world will end, right? Can't imagine this will be handled gracefully by the callers.
This is a small refactor that eliminates a call to `HLC.Now` on each replica tick. Due to the issue fixed by the following commit, we were ticking 1000s of uninitialized replicas in fast succession and as a result, these clock accesses were a major source of mutex contention. We shouldn't be ticking uninitialized replicas, but since we now know that this is an expensive part of Replica.tick, we might as well make it cheaper.
Refactor that does not change behavior.
A replica cannot unquiesce if its internalRaftGroup is nil, so this condition was redundant but confusing.
In a [support issue](cockroachlabs/support#1340), we saw that 10s of thousands of uninitialized replicas were being ticked regularly and creating a large amount of background work on a node, driving up CPU. This commit updates the Raft quiescence logic to disallow uninitialized replicas from being unquiesced and Tick()'ing themselves.

Keeping uninitialized replicas quiesced even in the presence of Raft traffic avoids wasted work. We could Tick() these replicas, but doing so is unnecessary because uninitialized replicas can never win elections, so there is no reason for them to ever call an election. In fact, uninitialized replicas do not even know who their peers are, so there would be no way for them to call an election or for them to send any other non-reactive message. As a result, all work performed by an uninitialized replica is reactive and in response to incoming messages (see processRequestQueue).

There are multiple ways for an uninitialized replica to be created and then abandoned, and we don't do a good job garbage collecting them at a later point (see cockroachdb#73424), so it is important that they are cheap. Keeping them quiesced instead of letting them unquiesce and tick every 200ms indefinitely avoids a meaningful amount of periodic work for each uninitialized replica.

Release notes (bug fix): uninitialized replicas that are abandoned after an unsuccessful snapshot no longer perform periodic background work, so they no longer have a non-negligible cost.
This commit renames the `unquiesceXYZ` methods to include a "maybe" prefix to indicate that they may not succeed in unquiescing the replica. The commit also introduces a boolean return value to these methods to inform the caller about whether the unquiesce attempt was successful. Finally, the commit uses this return value in `maybeMarkReplicaInitializedLockedReplLocked` to assert that after initializing a replica, it did unquiesce immediately.
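A condensed sketch of that shape, with a hypothetical replica type standing in for the real one: the maybe-prefixed method reports whether it unquiesced, and the initialization path turns a refusal into an assertion failure via errors.AssertionFailedf, as suggested later in the review.

```go
package main

import (
	"fmt"

	"github.com/cockroachdb/errors"
)

type replica struct {
	quiescent     bool
	hasRaftGroup  bool
	isInitialized bool
}

// maybeUnquiesce reports whether the replica was unquiesced. It refuses if the
// replica is not quiesced, is still uninitialized, or has no Raft group yet.
func (r *replica) maybeUnquiesce() bool {
	if !r.quiescent || !r.isInitialized || !r.hasRaftGroup {
		return false
	}
	r.quiescent = false
	return true
}

// markInitialized asserts that a freshly initialized replica unquiesces.
func markInitialized(r *replica) error {
	r.isInitialized = true
	if !r.maybeUnquiesce() {
		return errors.AssertionFailedf("expected replica to unquiesce after initialization")
	}
	return nil
}

func main() {
	r := &replica{quiescent: true, hasRaftGroup: true}
	fmt.Println(markInitialized(r)) // <nil>: the assertion holds
}
```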
Force-pushed from b4f814d to 95a7671.
TFTR!
bors r=tbg
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvserver/store_create_replica.go, line 369 at r18 (raw file):
Previously, tbg (Tobias Grieger) wrote…
errors.AssertionFailedf? If we hit this the world will end, right? Can't imagine this will be handled gracefully by the callers.
Done.
Build succeeded.
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.

- error creating merge commit from 95a7671 to blathers/backport-release-21.1-73362: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []; you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.1.x failed. See errors above.
- error creating merge commit from 95a7671 to blathers/backport-release-21.2-73362: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []; you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
After cockroachdb#73362, all uninitialized replicas are quiescent. The callers of addToReplicasByRangeIDLocked are either inserting an uninitialized Replica, or a newly created initialized Replica which has not been linked to the Store yet and hence has not been unquiesced either. The quiescence check is thus a no-op, and can be removed from addToReplicasByRangeIDLocked.

As a second "proof" for why it can be removed:
- rangeID is added to unquiescedReplicas in 2 places (Replica.maybeUnquiesce*), both of which also set r.mu.quiescent to false a few lines above.
- rangeID is removed from unquiescedReplicas in 2 places (Replica.quiesceLocked and Replica removal in Store.unlinkReplicaByRangeIDLocked). The first one sets r.mu.quiescent to true a few lines apart, and the second one outright removes the replica.

Thus, it is already true that r.mu.quiescent == false iff the replica's rangeID is in unquiescedReplicas (with a little exception of when r is being removed, which does not impact our check).

Release note: none
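A toy sketch of the invariant argued above, with hypothetical types: the quiescent flag and membership in the store's unquiescedReplicas set are only ever updated together, so checking one is equivalent to checking the other.

```go
package main

import "fmt"

type rangeID int64

type replica struct {
	id        rangeID
	quiescent bool
}

type store struct {
	unquiesced map[rangeID]struct{}
}

// unquiesce and quiesce update the flag and the set in lockstep, which is the
// whole point of the argument quoted above.
func (s *store) unquiesce(r *replica) {
	r.quiescent = false
	s.unquiesced[r.id] = struct{}{}
}

func (s *store) quiesce(r *replica) {
	r.quiescent = true
	delete(s.unquiesced, r.id)
}

func main() {
	s := &store{unquiesced: map[rangeID]struct{}{}}
	r := &replica{id: 1, quiescent: true}
	s.unquiesce(r)
	_, inSet := s.unquiesced[r.id]
	fmt.Println(!r.quiescent == inSet) // true: the two stay in sync
}
```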
In a support issue, we saw that 10s of thousands of uninitialized replicas were being ticked regularly and creating a large amount of background work on a node, driving up CPU. This commit updates the Raft quiescence logic to disallow uninitialized replicas from being unquiesced and Tick()'ing themselves.
Keeping uninitialized replicas quiesced even in the presence of Raft traffic avoids wasted work. We could Tick() these replicas, but doing so is unnecessary because uninitialized replicas can never win elections, so there is no reason for them to ever call an election. In fact, uninitialized replicas do not even know who their peers are, so there would be no way for them to call an election or for them to send any other non-reactive message. As a result, all work performed by an uninitialized replica is reactive and in response to incoming messages (see processRequestQueue).

There are multiple ways for an uninitialized replica to be created and then abandoned, and we don't do a good job garbage collecting them at a later point (see #73424), so it is important that they are cheap. Keeping them quiesced instead of letting them unquiesce and tick every 200ms indefinitely avoids a meaningful amount of periodic work for each uninitialized replica.
Release notes (bug fix): uninitialized replicas that are abandoned after an unsuccessful snapshot no longer perform periodic background work, so they no longer have a non-negligible cost.