Skip to content

Commit

Permalink
storage: remove learners one by one
Browse files Browse the repository at this point in the history
We don't support removing multiple learners atomically just yet, though
\cockroachdb#40268 will fix this (likely in raft). That PR though is obstructed by cockroachdb#40207
because we'll need that first to be able to bump raft again (assuming we
don't want to fork). Instead of dealing with all that upfront, let's
just not remove multiple learners at once right now so that we can flip
the default for atomic replication changes to on.

If anyone is still trying to remove only learners atomically, they will
fail. However the replicate queues place a high priority on removing
stray learners whenever it finds them, so this wouldn't be a permanent
problem. Besides, we never add multiple learners at once so it's
difficult to get into that state in the first place.

Without this commit, TestLearnerAdminRelocateRange fails once atomic
replication changes are enabled.

Release note: None
  • Loading branch information
tbg committed Sep 3, 2019
1 parent f3773a2 commit 00a1d32
Show file tree
Hide file tree
Showing 2 changed files with 16 additions and 5 deletions.
18 changes: 13 additions & 5 deletions pkg/storage/replica_command.go
Original file line number Diff line number Diff line change
Expand Up @@ -2079,6 +2079,7 @@ func removeTargetFromSlice(
func removeLearners(
ctx context.Context, db *client.DB, desc *roachpb.RangeDescriptor,
) (*roachpb.RangeDescriptor, error) {
origDesc := desc
learners := desc.Replicas().Learners()
if len(learners) == 0 {
return desc, nil
Expand All @@ -2088,13 +2089,20 @@ func removeLearners(
targets[i].NodeID = learners[i].NodeID
targets[i].StoreID = learners[i].StoreID
}
chgs := roachpb.MakeReplicationChanges(roachpb.REMOVE_REPLICA, targets...)
log.VEventf(ctx, 2, `removing learner replicas %v from %v`, targets, desc)
newDesc, err := db.AdminChangeReplicas(ctx, desc.StartKey, *desc,
roachpb.MakeReplicationChanges(roachpb.REMOVE_REPLICA, targets...))
if err != nil {
return nil, errors.Wrapf(err, `removing learners from %s`, desc)
// NB: unroll the removals because at the time of writing, we can't atomically
// remove multiple learners. This will be fixed in:
//
// https://github.com/cockroachdb/cockroach/pull/40268
for i := range chgs {
var err error
desc, err = db.AdminChangeReplicas(ctx, desc.StartKey, *desc, chgs[i:i+1])
if err != nil {
return nil, errors.Wrapf(err, `removing learners from %s`, origDesc)
}
}
return newDesc, nil
return desc, nil
}

// adminScatter moves replicas and leaseholders for a selection of ranges.
Expand Down
3 changes: 3 additions & 0 deletions pkg/storage/replica_learner_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -783,6 +783,9 @@ func TestLearnerAdminRelocateRange(t *testing.T) {
})
defer tc.Stopper().Stop(ctx)

_, err := tc.Conns[0].Exec(`SET CLUSTER SETTING kv.atomic_replication_changes.enabled = true`)
require.NoError(t, err)

scratchStartKey := tc.ScratchRange(t)
ltk.withStopAfterLearnerAtomic(func() {
_ = tc.AddReplicasOrFatal(t, scratchStartKey, tc.Target(1))
Expand Down

0 comments on commit 00a1d32

Please sign in to comment.