Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RemoveTablet during TabletExternallyReparented causing connection issues #16371

Merged
merged 9 commits into from
Aug 9, 2024
30 changes: 28 additions & 2 deletions go/vt/discovery/healthcheck.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ import (
"encoding/json"
"fmt"
"hash/crc32"
"math"
"net/http"
"sort"
"strings"
Expand Down Expand Up @@ -467,10 +468,16 @@ func (hc *HealthCheckImpl) deleteTablet(tablet *topodata.Tablet) {
continue
}
delete(ths, tabletAlias)
// delete from healthy list

healthy, ok := hc.healthy[key]
if ok && len(healthy) > 0 {
hc.recomputeHealthy(key)
if tabletType == topodata.TabletType_PRIMARY {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this some more..

  • we are deleting a tablet. if that tablet's type is not PRIMARY, we should not even "recompute the healthy primary"
  • even if that tablet's type is PRIMARY, why should we iterate over the healthData for PRIMARY? we know that at any point there should only be one "healthy" primary. i.e healthy[PRIMARY] should have a size of either 0 or 1. If the tablet that we are deleting is THE current healthy primary, we should just set healthy[PRIMARY] to an empty slice. If it is not, we should do nothing.

It seems to me that the problem with the original implementation of this defer func was that it was calling recomputeHealthy for PRIMARY type (which was never done before this defer func was added) instead of handling PRIMARY separately. This caused it to violate the assumption that there is at most one healthy PRIMARY at any point in time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're absolutely right. I simplified this changeset by simply performing the same logic as in updateHealth for PRIMARY tablet types, and calling recomputeHealth for the other types.

// If tablet type is primary, we should only have one tablet in the healthy list.
hc.recomputeHealthyPrimary(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting bug. We have safeguards in updateHealth which make sure that there is exactly one element in healthy for the PRIMARY tablet type, but the recomputation here had no such checks.
While the fix seems correct, what do you think about pushing it into recomputeHealthy instead of creating a new function call here?
Right now this is the only call site where recomputeHealthy is called for the PRIMARY tablet type, but it seems more compact and future proof to change recomputeHealthy to maintain the invariant.

Copy link
Contributor Author

@arthurschreiber arthurschreiber Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the fix seems correct, what do you think about pushing it into recomputeHealthy instead of creating a new function call here?

I feel like having two separate methods helps highlighting the difference between the recomputation logic for primaries vs any other replica type more clearly.

I was thinking that maybe having a recomputeHealthy method that then calls out to either recomputeHealthyReplicas and recomputeHealthyPrimaries would probably be the best for clarity (and the go compiler should just inline all of these calls anyway, so not a performance issue either).

But then I noticed that pushing down the tablet type check into recomputeHealth would require passing down the key and tabletType as arguments, which seems redundant, because key contains the tablet type already, just as a string. Not passing down key would require passing down other arguments to be able to build the key, and then things just start to become more messy.

As I mentioned in Slack, I'm very interested giving the healthcheck code a polishing pass separately from this fix, so I'd suggest we punt on this for now. 🙇‍♂️ What do you think?

Copy link
Member

@deepthi deepthi Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that instead of passing the "key", you would now have to pass the target. When this was written, it just seemed convenient to use the key because it has already been computed. So we have

			hc.recomputeHealthy(targetKey)

However, later on, we are doing a key computation for the second call

			oldTargetKey := KeyFromTarget(prevTarget)
			hc.recomputeHealthy(oldTargetKey)

It does not seem too invasive to change recomputeHealthy to be

func (hc *HealthCheckImpl) recomputeHealthy(target *query.Target) {
			key := KeyFromTarget(target)
			if target.TabletType == ..._PRIMARY {
			...
}

} else {
// Simply recompute the list of healthy tablets for all other tablet types.
hc.recomputeHealthy(key)
}
}
}
}()
Expand Down Expand Up @@ -598,6 +605,25 @@ func (hc *HealthCheckImpl) recomputeHealthy(key KeyspaceShardTabletType) {
hc.healthy[key] = FilterStatsByReplicationLag(allArray)
}

// Recompute the healthy primary tablet for the given key.
func (hc *HealthCheckImpl) recomputeHealthyPrimary(key KeyspaceShardTabletType) {
highestPrimaryTermStartTime := int64(math.MinInt64)
var newestPrimary *TabletHealth

for _, s := range hc.healthData[key] {
if s.PrimaryTermStartTime >= highestPrimaryTermStartTime {
highestPrimaryTermStartTime = s.PrimaryTermStartTime
newestPrimary = s
}
}
deepthi marked this conversation as resolved.
Show resolved Hide resolved

if newestPrimary != nil {
hc.healthy[key] = []*TabletHealth{newestPrimary}
} else {
hc.healthy[key] = []*TabletHealth{}
}
}

// Subscribe adds a listener. Used by vtgate buffer to learn about primary changes.
func (hc *HealthCheckImpl) Subscribe() chan *TabletHealth {
hc.subMu.Lock()
Expand Down
121 changes: 121 additions & 0 deletions go/vt/discovery/healthcheck_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -784,6 +784,127 @@ func TestRemoveTablet(t *testing.T) {
assert.Empty(t, a, "wrong result, expected empty list")
}

// When an external primary failover is performed,
// the demoted primary will advertise itself as a `PRIMARY`
// tablet until it recognizes that it was demoted,
// and until all in-flight operations have either finished
// (successfully or unsuccessfully, see `--shutdown_grace_period` flag).
//
// During this time, operations like `RemoveTablet` should not lead
// to multiple tablets becoming valid targets for `PRIMARY`.
func TestRemoveTabletDuringExternalReparenting(t *testing.T) {
ctx := utils.LeakCheckContext(t)

// reset error counters
hcErrorCounters.ResetAll()
ts := memorytopo.NewServer(ctx, "cell")
defer ts.Close()
hc := createTestHc(ctx, ts)
// close healthcheck
defer hc.Close()

firstTablet := createTestTablet(0, "cell", "a")
firstTablet.Type = topodatapb.TabletType_PRIMARY

secondTablet := createTestTablet(1, "cell", "b")
secondTablet.Type = topodatapb.TabletType_REPLICA

thirdTablet := createTestTablet(2, "cell", "c")
thirdTablet.Type = topodatapb.TabletType_REPLICA

firstTabletHealthStream := make(chan *querypb.StreamHealthResponse)
firstTabletConn := createFakeConn(firstTablet, firstTabletHealthStream)
firstTabletConn.errCh = make(chan error)

secondTabletHealthStream := make(chan *querypb.StreamHealthResponse)
secondTabletConn := createFakeConn(secondTablet, secondTabletHealthStream)
secondTabletConn.errCh = make(chan error)

thirdTabletHealthStream := make(chan *querypb.StreamHealthResponse)
thirdTabletConn := createFakeConn(thirdTablet, thirdTabletHealthStream)
thirdTabletConn.errCh = make(chan error)

resultChan := hc.Subscribe()

hc.AddTablet(firstTablet)
hc.AddTablet(secondTablet)
hc.AddTablet(thirdTablet)

<-resultChan
<-resultChan

arthurschreiber marked this conversation as resolved.
Show resolved Hide resolved
firstTabletPrimaryTermStartTimestamp := time.Now().Unix() - 10

firstTabletHealthStream <- &querypb.StreamHealthResponse{
TabletAlias: firstTablet.Alias,
Target: &querypb.Target{Keyspace: "k", Shard: "s", TabletType: topodatapb.TabletType_PRIMARY},
Serving: true,

PrimaryTermStartTimestamp: firstTabletPrimaryTermStartTimestamp,
RealtimeStats: &querypb.RealtimeStats{ReplicationLagSeconds: 0, CpuUsage: 0.5},
}

secondTabletHealthStream <- &querypb.StreamHealthResponse{
TabletAlias: secondTablet.Alias,
Target: &querypb.Target{Keyspace: "k", Shard: "s", TabletType: topodatapb.TabletType_REPLICA},
Serving: true,

PrimaryTermStartTimestamp: 0,
RealtimeStats: &querypb.RealtimeStats{ReplicationLagSeconds: 1, CpuUsage: 0.5},
}

thirdTabletHealthStream <- &querypb.StreamHealthResponse{
TabletAlias: thirdTablet.Alias,
Target: &querypb.Target{Keyspace: "k", Shard: "s", TabletType: topodatapb.TabletType_REPLICA},
Serving: true,

PrimaryTermStartTimestamp: 0,
RealtimeStats: &querypb.RealtimeStats{ReplicationLagSeconds: 1, CpuUsage: 0.5},
}

<-resultChan
<-resultChan
<-resultChan

secondTabletPrimaryTermStartTimestamp := time.Now().Unix()

// Simulate a failover
firstTabletHealthStream <- &querypb.StreamHealthResponse{
TabletAlias: firstTablet.Alias,
Target: &querypb.Target{Keyspace: "k", Shard: "s", TabletType: topodatapb.TabletType_PRIMARY},
Serving: true,

PrimaryTermStartTimestamp: firstTabletPrimaryTermStartTimestamp,
RealtimeStats: &querypb.RealtimeStats{ReplicationLagSeconds: 0, CpuUsage: 0.5},
}

secondTabletHealthStream <- &querypb.StreamHealthResponse{
TabletAlias: secondTablet.Alias,
Target: &querypb.Target{Keyspace: "k", Shard: "s", TabletType: topodatapb.TabletType_PRIMARY},
Serving: true,

PrimaryTermStartTimestamp: secondTabletPrimaryTermStartTimestamp,
RealtimeStats: &querypb.RealtimeStats{ReplicationLagSeconds: 0, CpuUsage: 0.5},
}

<-resultChan
<-resultChan

hc.RemoveTablet(thirdTablet)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does removing an unrelated tablet trigger the bug?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see this

When RemoveTablet was called, the list of healthy tablets was recomputed, but the logic did not account for the fact that there can only be a single entry in the list of healthy primary tablets.


// tablet 1 is the primary now
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is wrong, it should be

Suggested change
// tablet 1 is the primary now
// `secondTablet` should be the primary now

expectedTabletStats := []*TabletHealth{{
Tablet: secondTablet,
Target: &querypb.Target{Keyspace: "k", Shard: "s", TabletType: topodatapb.TabletType_PRIMARY},
Serving: true,
Stats: &querypb.RealtimeStats{ReplicationLagSeconds: 0, CpuUsage: 0.5},
PrimaryTermStartTime: secondTabletPrimaryTermStartTimestamp,
}}

actualTabletStats := hc.GetHealthyTabletStats(&querypb.Target{Keyspace: "k", Shard: "s", TabletType: topodatapb.TabletType_PRIMARY})
mustMatch(t, expectedTabletStats, actualTabletStats, "unexpected result")
}

// TestGetHealthyTablets tests the functionality of GetHealthyTabletStats.
func TestGetHealthyTablets(t *testing.T) {
ctx := utils.LeakCheckContext(t)
Expand Down
Loading