
kv/kvserver: TestReplicateQueueRebalance failed [non-zero HardState.Commit on uninitialized] #77030

Closed
cockroach-teamcity opened this issue Feb 25, 2022 · 21 comments · Fixed by #77335
Labels: branch-master (Failures and bugs on the master branch) · C-test-failure (Broken test, automatically or manually discovered) · O-robot (Originated from a bot)

Comments

@cockroach-teamcity (Member) commented Feb 25, 2022

kv/kvserver.TestReplicateQueueRebalance failed with artifacts on master @ 94e64ce989a374e796cfaad95cb34b3702b9a6e2:

=== RUN   TestReplicateQueueRebalance
    test_log_scope.go:79: test logs captured to: /artifacts/tmp/_tmp/751d67000aac5f3394c2369309253f02/logTestReplicateQueueRebalance3009854166
    test_log_scope.go:80: use -show-logs to present logs inline

See also: How To Investigate a Go Test Failure (internal)
Parameters in this failure:

  • TAGS=bazel,gss

/cc @cockroachdb/kv


Jira issue: CRDB-13375

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Feb 25, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Feb 25, 2022
@erikgrinaker (Contributor)

setting non-zero HardState.Commit on uninitialized replica [n1,s1,r1/6:{-}]. HS={Term:7 Vote:0 Commit:110}

@sumeerbhola @tbg This seems like it could be related to the recent Raft changes.
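
For context, the message quoted above comes from an assertion in raft ready handling. A paraphrased sketch of its shape (based on the quoted file/line, replica_raft.go:826; names and exact structure are assumptions, not verbatim source):

    // Paraphrased sketch, not verbatim: before persisting a Ready's HardState,
    // handleRaftReadyRaftMuLocked refuses a non-zero Commit on a replica that
    // has no initialized state (and thus no log to back that commit index).
    if !raft.IsEmptyHardState(rd.HardState) {
    	if !r.IsInitialized() && rd.HardState.Commit != 0 {
    		log.Fatalf(ctx, "setting non-zero HardState.Commit on uninitialized replica %v. HS=%+v",
    			r, rd.HardState)
    	}
    	// ... persist rd.HardState ...
    }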

@sumeerbhola (Collaborator)

Here are the log statements pertaining to range r1 on n1,s1. The failure happens due to setting non-zero HardState.Commit on uninitialized replica [n1,s1,r1/6:{-}]. HS={Term:7 Vote:0 Commit:110}, so replicaID 6. The stack shows the normal raft ready processing path beginning in the raft scheduler. This is the first mention of r1/6.

About 60ms prior to this, r1/1 was removed from this store ([n1,s1,r1/1:‹/{Min-System/NodeL…}›,raft] 582 removing replica r1/1). This was done using removeInitializedReplicaRaftMuLocked. It says VOTER_DEMOTING_LEARNER -- I am assuming this was an initialized learner. Perhaps we didn't properly remove the HardState and are finding it when dealing with r1/6. But my understanding was that a learner can't vote, so it can have a non-zero HardState.Commit but not a non-zero Term, and this failure shows Term:7. Oh, I see -- it was a fully voting replica earlier ([(n1,s1):1 (n5,s5):2 (n4,s4):3]) and is being removed.

I220225 08:34:01.676010 923461 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333 ⋮ [n1,s1,r1/1:‹/{Min-System/NodeL…}›] 528  proposing SIMPLE(l2) [(n5,s5):2LEARNER]: after=[(n1,s1):1 (n5,s5):2LEARNER] next=3
I220225 08:34:01.692071 923461 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333 ⋮ [n1,s1,r1/1:‹/{Min-System/NodeL…}›] 535  proposing SIMPLE(v2) [(n5,s5):2]: after=[(n1,s1):1 (n5,s5):2] next=3
I220225 08:34:01.699306 923461 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333 ⋮ [n1,s1,r1/1:‹/{Min-System/NodeL…}›] 540  proposing SIMPLE(l3) [(n4,s4):3LEARNER]: after=[(n1,s1):1 (n5,s5):2 (n4,s4):3LEARNER] next=4
I220225 08:34:01.714994 923461 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333 ⋮ [n1,s1,r1/1:‹/{Min-System/NodeL…}›] 551  proposing SIMPLE(v3) [(n4,s4):3]: after=[(n1,s1):1 (n5,s5):2 (n4,s4):3] next=4
I220225 08:34:01.722606 923461 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333 ⋮ [n1,s1,r1/1:‹/{Min-System/NodeL…}›] 558  proposing SIMPLE(l4) [(n3,s3):4LEARNER]: after=[(n1,s1):1 (n5,s5):2 (n4,s4):3 (n3,s3):4LEARNER] next=5
I220225 08:34:01.733910 923461 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333 ⋮ [n1,s1,r1/1:‹/{Min-System/NodeL…}›] 568  proposing ENTER_JOINT(r1 l1 v4) [(n3,s3):4VOTER_INCOMING], [(n1,s1):1VOTER_DEMOTING_LEARNER]: after=[(n1,s1):1VOTER_DEMOTING_LEARNER (n5,s5):2 (n4,s4):3 (n3,s3):4VOTER_INCOMING] next=5
I220225 08:34:01.751129 923873 kv/kvserver/pkg/kv/kvserver/store_remove_replica.go:150 ⋮ [n1,s1,r1/1:‹/{Min-System/NodeL…}›,raft] 582  removing replica r1/1
I220225 08:34:01.816498 923849 1@kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 616  the server is terminating due to a fatal error (see the DEV channel for details)
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617  setting non-zero HardState.Commit on uninitialized replica [n1,s1,r1/6:{-}]. HS={Term:7 Vote:0 Commit:110}
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !goroutine 923849 [running]:
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x0)
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0x8a
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0xc005428360, {{{0xc009b48360, 0x24}, {0x44de35e, 0x1}, {0x0, 0x0}, {0x0, 0x0}}, 0x16d6fa68c99cd78a, ...})
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/util/log/clog.go:237 +0xb8
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/util/log.logfDepth({0x5712018, 0xc0106ac810}, 0x1, 0x4, 0x0, {0x459ab12, 0x45}, {0xc00f3e5848, 0x2, 0x2})
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/util/log/channels.go:60 +0x385
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/util/log.Fatalf(...)
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/bazel-out/k8-dbg/bin/pkg/util/log/log_channels_generated.go:834
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked(0xc0106ae000, {0x5712018, 0xc0106ac810}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 +0x123b
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReady(0xc00f3e5e70, {0x5712018, 0xc0106ac810}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:556 +0x15b
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processReady(0xc004399b00, 0x5712018)
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:512 +0xd7
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).worker(0xc000142c30, {0x5712018, 0xc00dbdad80})
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:305 +0x25d
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:494 +0x16f
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:483 +0x445
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !For more context, check log files in: /artifacts/tmp/_tmp/751d67000aac5f3394c2369309253f02/logTestReplicateQueueRebalance3009854166
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !****************************************************************************
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !This node experienced a fatal error (printed above), and as a result the
F220225 08:34:01.816536 923849 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 ⋮ [n1,s1,r1/6:{-},raft] 617 !process is terminating.

@sumeerbhola (Collaborator)

  • None of my recent PRs (raft truncation, RaftReplicaID, RangeAppliedState.RaftAppliedIndexTerm) added any place where we set HardState, or changed when HardState is removed.
  • I am trying to reproduce locally, but haven't been successful yet.
  • There are many paths that end in removeInitializedReplicaRaftMuLocked, and I am having some trouble figuring out which ones conclusively remove HardState. If destroyRaftMuLocked were called, we would see this log statement, but we don't:

    log.Infof(ctx, "removed %d (%d+%d) keys in %0.0fms [clear=%0.0fms commit=%0.0fms]", ...)

@sumeerbhola (Collaborator)

@tbg do you have any thoughts?

@erikgrinaker (Contributor)

Thanks for having a look @sumeerbhola, we'll pick this up.

@erikgrinaker (Contributor)

Got a couple of repros with TestReplicateQueueDecommissioningNonVoters/remove within about 150 runs (3 minutes); now running a bisection.

@erikgrinaker erikgrinaker self-assigned this Mar 1, 2022
@tbg (Member) commented Mar 1, 2022

Gonna also go through the logs to see if I can understand any of this.

Edit: trimmed logs

kvservertest.log
test.log

@tbg (Member) commented Mar 1, 2022

First of all, why don't we get all the stack traces? Is GOTRACEBACK not set correctly? I don't see it set here, but I'm also not sure how it works now with Bazel. cc @rickystewart

@tbg (Member) commented Mar 1, 2022

I think the history makes sense, with the exception of the crash. r1 is on n1 initially (makes sense), then it gets moved off and the replica is destroyed (makes sense). It then gets re-added as a learner (why not), and now things get weird: it receives a Commit: 110 from someone (or otherwise learns about it, so that it pops out of ready handling). That someone could've been regular comms from the leader, or it could have been the snapshot. Either way it shouldn't happen: one of the (undocumented, but real, I think) invariants is that raft won't communicate a commit index that isn't also in the follower's log.
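
The follower-side half of that invariant can be seen in etcd/raft's MsgApp handling; a hedged, abbreviated paraphrase of (*raft).handleAppendEntries (not verbatim):

    // The follower only advances its commit index to entries it actually has:
    // maybeAppend internally clamps via commitTo(min(m.Commit, lastNewIndex)).
    func (r *raft) handleAppendEntries(m pb.Message) {
    	if m.Index < r.raftLog.committed {
    		r.send(pb.Message{To: m.From, Type: pb.MsgAppResp, Index: r.raftLog.committed})
    		return
    	}
    	if mlastIndex, ok := r.raftLog.maybeAppend(m.Index, m.LogTerm, m.Commit, m.Entries...); ok {
    		r.send(pb.Message{To: m.From, Type: pb.MsgAppResp, Index: mlastIndex})
    		return
    	}
    	// Append rejected: m.Commit is not acted upon at all.
    }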

@tbg (Member) commented Mar 1, 2022

  • are we perhaps calling ReportSnapshotStatus before the snapshot has actually applied? I.e., could we trick the raft leader into thinking that index 100 is now on the follower when this isn't the case yet? ReportSnapshotStatus moves the follower to StateProbe at some index (the one a snapshot was originally requested for, if any), so any subsequent MsgApp would carry the leader's .Commit. (So, strictly speaking, there goes the invariant: for a regular follower, it does now look as though the leader may, say, communicate Commit: 100 when trying to give entry 50 to a follower, "committing ahead of its log".) However, followers only use .Commit if they can actually append the entry (see (*raft).handleAppendEntries), so the invariant is morally still true here (the follower won't act on such entries, so they might as well not have been sent). Also, nothing in this area changed.

  • next: trying to distill the test down. I'm surprised that the following doesn't reproduce this most easily:

    // TestRaftRemoveRace adds and removes a replica repeatedly in an attempt to
    // reproduce a race (see #1911 and #9037).
    func TestRaftRemoveRace(t *testing.T) {
    	defer leaktest.AfterTest(t)()
    	defer log.Scope(t).Close(t)
    	numServers := 10
    	if util.RaceEnabled {
    		// In race builds, running 10 nodes needs more than 1 full CPU (due to
    		// background gossip and heartbeat overhead), so it can't keep up when run
    		// under stress with one process per CPU. Run a reduced version of this test
    		// in race builds. This isn't as likely to reproduce the races but will
    		// still have a chance to do so.
    		numServers = 3
    	}
    	ctx := context.Background()
    	tc := testcluster.StartTestCluster(t, numServers,
    		base.TestClusterArgs{
    			ReplicationMode: base.ReplicationManual,
    		})
    	defer tc.Stopper().Stop(ctx)
    	key := tc.ScratchRange(t)
    	desc := tc.LookupRangeOrFatal(t, key)
    	// Cyclically up-replicate to a bunch of nodes which stresses a condition
    	// where replicas receive messages for a previous or later incarnation of the
    	// replica.
    	targets := make([]roachpb.ReplicationTarget, len(tc.Servers)-1)
    	for i := 1; i < len(tc.Servers); i++ {
    		targets[i-1] = tc.Target(i)
    	}
    	tc.AddVotersOrFatal(t, key, targets...)
    	for i := 0; i < 10; i++ {
    		tc.RemoveVotersOrFatal(t, key, tc.Target(2))
    		tc.AddVotersOrFatal(t, key, tc.Target(2))
    		// Verify the tombstone key does not exist. See #12130.
    		tombstoneKey := keys.RangeTombstoneKey(desc.RangeID)
    		var tombstone roachpb.RangeTombstone
    		if ok, err := storage.MVCCGetProto(
    			ctx, tc.GetFirstStoreFromServer(t, 2).Engine(), tombstoneKey,
    			hlc.Timestamp{}, &tombstone, storage.MVCCGetOptions{},
    		); err != nil {
    			t.Fatal(err)
    		} else if ok {
    			t.Fatal("tombstone should not exist")
    		}
    	}
    }

@erikgrinaker (Contributor)

Bisecting is a bit iffy since it sometimes passes for hundreds of runs. But I'll keep at it.

@erikgrinaker (Contributor)

Bisected to 1bb58e5.

@sumeerbhola (Collaborator)

@tbg, I see raft.newRaft calling raft.Storage.InitialState, which would make sense. Wouldn't that propagate into the Ready.HardState? That is why my earlier guess was that we hadn't properly cleaned up the previous n1,s1,r1/1, which correlates with the fact that we don't see the log statement from destroyRaftMuLocked. What am I missing?
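
For reference, the propagation path in question: newRawNode seeds prevHardSt from Storage.InitialState(), and a Ready carries a HardState only when the current one differs from it. An abbreviated paraphrase of etcd/raft's Ready construction (assumed shape, not verbatim):

    // A Ready surfaces the HardState only when it changed relative to
    // prevHardSt (which newRawNode seeded from Storage.InitialState()).
    // Crucially, Commit comes from r.raftLog.committed, not from InitialState.
    func newReady(r *raft, prevSoftSt *SoftState, prevHardSt pb.HardState) Ready {
    	rd := Ready{
    		Entries:          r.raftLog.unstableEntries(),
    		CommittedEntries: r.raftLog.nextEnts(),
    		Messages:         r.msgs,
    	}
    	if hardSt := r.hardState(); !isHardStateEqual(hardSt, prevHardSt) {
    		rd.HardState = hardSt
    	}
    	return rd
    }

So an empty InitialState() combined with a non-zero raftLog.committed is precisely the combination that would make a Commit value pop out of the first Ready.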

@tbg (Member) commented Mar 1, 2022

Good point, let me take a quick look before I head out. It is suspicious that we're seeing

// During merges, the context might have the subsuming range, so we explicitly
// log the replica to be removed.
log.Infof(ctx, "removing replica r%d/%d", rep.RangeID, rep.replicaID)

but not the logging inside of

rep.disconnectReplicationRaftMuLocked(ctx)
if opts.DestroyData {
	if err := rep.destroyRaftMuLocked(ctx, nextReplicaID); err != nil {
		return nil, err
	}
}

I was hoping to find a hole where we unlink r1/1 from the store before destroying the data, allowing r1/6 to sneak in and consume r1/1's HardState. However, there isn't obviously one. We call destroyRaftMuLocked before unlinking from the Store, and the replica's raftMu is held throughout, so a message for r1/6 would be taking this branch:

if repl, ok := s.mu.replicasByRangeID.Load(rangeID); ok {
	repl.raftMu.Lock() // not unlocked on success

if it were to see the HardState. And the first thing that happens upon entering this branch is acquiring raftMu, so it would properly synchronize with the removal and hit the .Removed() early return here.
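
A sketch of that synchronization, under the assumption that the destroy status is re-checked once raftMu is held (names as quoted above; illustrative, not the exact store code):

    // Illustrative: an incoming message for r1/6 loads the old replica and
    // blocks on raftMu, which the removal holds across destroyRaftMuLocked.
    // Once acquired, the destroy status is visible and we bail out early.
    repl.raftMu.Lock()
    repl.mu.RLock()
    removed := repl.mu.destroyStatus.Removed()
    repl.mu.RUnlock()
    if removed {
    	repl.raftMu.Unlock()
    	return // the stale HardState was already destroyed under raftMu
    }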

@erikgrinaker (Contributor) commented Mar 1, 2022

FWIW, I ran 3300 stress runs of 5b42a98, the immediate parent of 1bb58e5, and it had no failures -- so it seems fairly certain to have been introduced or triggered by 1bb58e5.

@erikgrinaker erikgrinaker changed the title kv/kvserver: TestReplicateQueueRebalance failed kv/kvserver: TestReplicateQueueRebalance failed [non-zero HardState.Commit on uninitialized] Mar 2, 2022
sumeerbhola added a commit to sumeerbhola/cockroach that referenced this issue Mar 2, 2022
This is a temporary change, until we can find the root cause of
cockroachdb#77030.

Release justification: Temporary bug workaround.

Release note: None
@sumeerbhola (Collaborator)

Based on running TestReplicateQueueDecommissioningNonVoters/remove with additional instrumentation, it isn't the HardState from the old replica. Here are the relevant lines (from the fuller logs below): the commit value was 71 for r3/1 prior to removal, a subsequent attempt to load HardState finds nothing, and then we attempt to set a Commit value of 46 on the uninitialized r3/6.

I220302 17:36:44.849779 655 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1356  setting HardState for r3: {Term:6 Vote:1 Commit:71}
I220302 17:36:44.853321 655 kv/kvserver/pkg/kv/kvserver/store_remove_replica.go:150  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1372  removing replica r3/1
I220302 17:36:45.019540 684 kv/kvserver/stateloader/stateloader.go:366  [n1,s1,r3/6:{-}] 1618  no HardState for r3, err: <nil>
I220302 17:36:45.019715 684 1@kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1619  the server is terminating due to a fatal error (see the DEV channel for details)
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663  setting non-zero HardState.Commit on uninitialized replica [n1,s1,r3/6:{-}]. HS={Term:7 Vote:0 Commit:46}

Fuller logs

I220302 17:36:44.440976 793 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333  [n1,s1,r3/1:/System/{NodeLive…-tsd}] 1248  proposing SIMPLE(l4) [(n4,s4):4LEARNER]: after=[(n1,s1):1 (n3,s3):2 (n5,s5):3 (n4,s4):4LEARNER] next=5
I220302 17:36:44.441997 691 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1249  setting HardState for r3: {Term:6 Vote:1 Commit:64}
I220302 17:36:44.463800 696 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1283  setting HardState for r3: {Term:6 Vote:1 Commit:65}
I220302 17:36:44.474001 793 kv/kvserver/pkg/kv/kvserver/replica_raft.go:333  [n1,s1,r3/1:/System/{NodeLive…-tsd}] 1292  proposing ENTER_JOINT(r1 l1 v4) [(n4,s4):4VOTER_INCOMING], [(n1,s1):1VOTER_DEMOTING_LEARNER]: after=[(n1,s1):1VOTER_DEMOTING_LEARNER (n3,s3):2 (n5,s5):3 (n4,s4):4VOTER_INCOMING] next=5
I220302 17:36:44.474830 713 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1295  setting HardState for r3: {Term:6 Vote:1 Commit:66}
I220302 17:36:44.480151 715 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1301  setting HardState for r3: {Term:6 Vote:1 Commit:67}
I220302 17:36:44.490542 724 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1313  setting HardState for r3: {Term:6 Vote:1 Commit:68}
I220302 17:36:44.493355 731 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1323  setting HardState for r3: {Term:6 Vote:1 Commit:69}
I220302 17:36:44.510397 737 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1331  setting HardState for r3: {Term:6 Vote:1 Commit:70}
I220302 17:36:44.849779 655 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1356  setting HardState for r3: {Term:6 Vote:1 Commit:71}
I220302 17:36:44.853321 655 kv/kvserver/pkg/kv/kvserver/store_remove_replica.go:150  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 1372  removing replica r3/1
I220302 17:36:45.019540 684 kv/kvserver/stateloader/stateloader.go:366  [n1,s1,r3/6:{-}] 1618  no HardState for r3, err: <nil>
I220302 17:36:45.019715 684 1@kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1619  the server is terminating due to a fatal error (see the DEV channel for details)
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663  setting non-zero HardState.Commit on uninitialized replica [n1,s1,r3/6:{-}]. HS={Term:7 Vote:0 Commit:46}
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !goroutine 684 [running]:
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x1)
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0x8a
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0xc000aea040, {{{0xc0006cda40, 0x24}, {0x452957c, 0x1}, {0x0, 0x0}, {0x0, 0x0}}, 0x16d8a0ed56514282, ...})
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/util/log/clog.go:239 +0x97
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/util/log.logfDepth({0x577cfd8, 0xc002318cf0}, 0x1, 0x4, 0x0, {0x45e70a8, 0x45}, {0xc005f6f848, 0x2, 0x2})
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/util/log/channels.go:60 +0x385
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/util/log.Fatalf(...)
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/bazel-out/k8-fastbuild/bin/pkg/util/log/log_channels_generated.go:834
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked(0xc008ee8a80, {0x577cfd8, 0xc002318cf0}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:826 +0x123b
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReady(0xc005f6fe70, {0x577cfd8, 0xc002318cf0}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:556 +0x15b
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processReady(0xc0039f6000, 0x577cfd8)
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:512 +0xd7
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).worker(0xc0035d8f70, {0x577cfd8, 0xc003bcc2a0})
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:308 +0x25d
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:494 +0x16f
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
F220302 17:36:45.019775 684 kv/kvserver/pkg/kv/kvserver/replica_raft.go:826  [n1,s1,r3/6:{-},raft] 1663 !	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:483 +0x445

craig bot pushed a commit that referenced this issue Mar 2, 2022

77278: kvserver: set kv.raft_log.loosely_coupled_truncation.enabled to false r=erikgrinaker a=sumeerbhola

This is a temporary change, until we can find the root cause of
#77030.

Release justification: Temporary bug workaround.

Release note: None

Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
@sumeerbhola (Collaborator)

The instrumentation is in sumeerbhola@7db8067

Since there is a log statement for every call to StateLoader.SetHardState, we are not setting any HardState between the last setting HardState for r3: {Term:6 Vote:1 Commit:71} for the previous replica and the HS={Term:7 Vote:0 Commit:46} that panics. So maybe it is the snapshot thing @tbg was thinking about. But the Commit value is regressing a bit, and it corresponds to the following log line from 1s earlier. Who could be sending an old snapshot, and why?

I220302 17:36:43.857999 654 kv/kvserver/stateloader/stateloader.go:379  [n1,s1,r3/1:/System/{NodeLive…-tsd},raft] 703  setting HardState for r3: {Term:6 Vote:1 Commit:46}

Hmm, I don't think the following code in acquireReplicaForTruncator is correct. There is a gap between the lock held for the isAlive check and the raftMu acquisition, which could potentially allow it to return a destroyed Replica (though I don't know why that would cause this HardState panic):

	if isAlive := func() bool {
		r.mu.Lock()
		defer r.mu.Unlock()
		return r.mu.destroyStatus.IsAlive()
	}(); !isAlive {
		return nil
	}
	r.raftMu.Lock()
	return (*raftTruncatorReplica)(r)
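
One way to close the gap is to take raftMu first and perform the liveness check while holding it; a minimal sketch of that shape (illustrative of the eventual fix in #77335, not necessarily its exact code):

    // Sketch: destroyRaftMuLocked runs under raftMu, so a replica observed
    // alive while raftMu is held cannot be destroyed until we release it. The
    // truncator can then no longer write a TruncatedState to a replica that
    // was GC'ed between the check and the use.
    r.raftMu.Lock()
    r.mu.Lock()
    isAlive := r.mu.destroyStatus.IsAlive()
    r.mu.Unlock()
    if !isAlive {
    	r.raftMu.Unlock()
    	return nil
    }
    return (*raftTruncatorReplica)(r) // caller releases raftMu when done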

@sumeerbhola (Collaborator) commented Mar 2, 2022

sumeerbhola@7edb8f4 has been running for ~5000 runs on gceworker without failure.

Edit: now at ~19,000 successful runs over ~8 hours.

@tbg (Member) commented Mar 3, 2022

Uh, yeah, that missing raftMu.Lock() is an obvious bug, I'm not sure how I missed that during review. Let me take a stab at understanding the failure mode we're seeing here. In the meantime, mind sending your patch as a PR which can also revert the cluster setting change?

@tbg (Member) commented Mar 3, 2022

So with the mutex gap the following could happen:

  1. replica is loaded and its destroy status checked inside of acquireReplicaForTruncator
  2. replica is replicaGC'ed (but we're already past the destroy check in 1.); in particular, it is unlinked from the store maps
  3. replica.raftMu acquired, replica returned to tryEnactTruncations
  4. truncation runs, and writes a TruncatedState (where there wasn't one previously due to replicaGC)
  5. replica is re-added under higher replicaID, it loads the truncated state (via the InitialState method)
  6. that new incarnation sees the index from the truncated state pop up as its committed index

I still have to verify 6., but it seems like an obvious problem if there is a bogus truncated state on an uninitialized replica.

@tbg (Member) commented Mar 3, 2022

When RawNode is initialized, it sets prevHardSt:

https://github.com/etcd-io/etcd/blob/8ac44ffa5fcccc7928876be4682c07f50b5e3b7e/raft/rawnode.go#L40-L55

which essentially equals InitialState at that point, except for the Commit field, which will come from the TruncatedState:

https://github.dev/etcd-io/etcd/blob/8ac44ffa5fcccc7928876be4682c07f50b5e3b7e/raft/raft.go#L323

https://github.dev/etcd-io/etcd/blob/8ac44ffa5fcccc7928876be4682c07f50b5e3b7e/raft/log.go#L65-L76

and InitialState will be empty (since nothing is persisted):

func (r *replicaRaftStorage) InitialState() (raftpb.HardState, raftpb.ConfState, error) {
	ctx := r.AnnotateCtx(context.TODO())
	hs, err := r.mu.stateLoader.LoadHardState(ctx, r.store.Engine())
	// For uninitialized ranges, membership is unknown at this point.
	if raft.IsEmptyHardState(hs) || err != nil {
		return raftpb.HardState{}, raftpb.ConfState{}, err
	}
	cs := r.mu.state.Desc.Replicas().ConfState()
	return hs, cs, nil
}

Now when the leaseholder probes us, we are going to respond saying that our log starts at index 41. The leader will thus send us an append(idx=[42, 43, ..., 100], committed=100) (or something like that), i.e. entries we shouldn't be receiving yet. I just think that the moving Commit index is the first place where this explodes.
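
Putting those pieces together, an abbreviated paraphrase of the two etcd/raft snippets linked above (assumed shape, not verbatim):

    // raft.go: what gets reported as Ready.HardState when it differs from
    // prevHardSt. Commit is taken from raftLog.committed, seeded below.
    func (r *raft) hardState() pb.HardState {
    	return pb.HardState{
    		Term:   r.Term,
    		Vote:   r.Vote,
    		Commit: r.raftLog.committed,
    	}
    }

    // log.go: a fresh raftLog assumes everything below the storage's first
    // index was already committed. CockroachDB's Storage.FirstIndex() is
    // TruncatedState.Index+1, so a bogus leftover TruncatedState{Index: 46}
    // yields committed = 46 on a replica whose persisted HardState is empty --
    // exactly the Commit value in the crash above.
    func newLog(storage Storage) *raftLog {
    	firstIndex, _ := storage.FirstIndex()
    	return &raftLog{
    		storage:   storage,
    		committed: firstIndex - 1,
    	}
    }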

craig bot pushed a commit that referenced this issue Mar 3, 2022

77335: kvserver: fix race that caused truncator to truncate non-alive replica r=tbg,erikgrinaker a=sumeerbhola

This was causing truncated state to be written to such a
replica, which would then get picked up as the
HardState.Commit value when a different replica was later
added back for the same range. See
#77030 (comment)
for the detailed explanation.

Also restore the default value of
kv.raft_log.loosely_coupled_truncation.enabled to true.

Fixes #77030

Release justification: Bug fix.
Release note: None

Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
@craig craig bot closed this as completed in 1925799 Mar 3, 2022