
New config & failover behavior: PreventCrossDataCenterMasterFailover #766

Merged
merged 7 commits into master from disable-cross-dc-master-failover
Jan 14, 2019

Conversation

shlomi-noach
Collaborator

Introducing PreventCrossDataCenterMasterFailover (boolean), defaults to false.

Setting it to true forces orchestrator to only fail over masters within the same DC as the failed master.
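
For context, a minimal sketch of how the flag might appear in orchestrator's Go configuration struct; only the field name and its default come from this PR, and the struct shown is an illustrative stand-in:

```go
// Illustrative stand-in for orchestrator's configuration struct; only the
// field name and default below are taken from this PR.
type Configuration struct {
	// PreventCrossDataCenterMasterFailover, when true, restricts master
	// failover to servers in the same data center as the failed master.
	PreventCrossDataCenterMasterFailover bool // defaults to false
}
```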

Some notes

regardless of this new config:

  • orchestrator will try its best to pick a replica from the same DC as the failed master
  • If unsuccessful, it may pick a server in a different DC
  • orchestrator then proceeds to check whether it should perform a 2-step promotion, i.e. whether it can promote yet another server on top of the one already chosen.

Now, when PreventCrossDataCenterMasterFailover: true:

  • It will completely disregard any server not in the failed master's DC
  • It will do whatever it can to replace the chosen server with one that is in the failed master's DC

Finally:

  • whatever happens, if PreventCrossDataCenterMasterFailover: true and the final suggested server is not in the same DC as the failed master, the failover is aborted with an error (see the sketch after these notes).
    • RESET SLAVE ALL and SET @@global.read_only will not be executed
    • PostMasterFailoverProcesses will not be executed
    • PostUnsuccessfulFailoverProcesses will be executed

Also:

  • This config does not affect intermediate master or master-master failovers.
  • When PreventCrossDataCenterMasterFailover: true, the raft DC distribution becomes mostly irrelevant. To elaborate:
    • It doesn't matter where the orchestrator/raft nodes are running; there will never be a cross-DC master failover.
    • Say all masters are in dc1 and dc1 gets network isolated:
      • If orchestrator/raft has a quorum in dc1 then it is happy because it can see the masters and there is no failover.
      • If the orchestrator/raft quorum is outside dc1, then the leader (running from some dc2) will attempt a failover. It will run pre-failover hooks. But it will very quickly realize it cannot find a server to promote, because all of the servers in dc1 are inaccessible to it, and all other servers are disqualified.
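
To tie these notes together, here is a minimal Go sketch of the decision flow when the flag is enabled. All names here (Instance, pickMostAdvancedReplica, takeOverMaster) are illustrative stand-ins, not orchestrator's actual API:

```go
package main

import "fmt"

// Instance is an illustrative stand-in for an orchestrator-managed server.
type Instance struct {
	Name       string
	DataCenter string
}

// chooseNewMaster sketches the flow described above: pick the most advanced
// replica regardless of DC, then, if PreventCrossDataCenterMasterFailover is
// set and that replica is in the wrong DC, attempt a 2-step promotion of a
// same-DC replica; if none can be promoted, abort the failover.
func chooseNewMaster(failedMasterDC string, replicas []Instance, preventCrossDC bool) (Instance, error) {
	promoted := pickMostAdvancedReplica(replicas) // may be in any DC
	if !preventCrossDC || promoted.DataCenter == failedMasterDC {
		return promoted, nil
	}
	// 2-step promotion: try to put a same-DC replica on top of the one chosen.
	for _, r := range replicas {
		if r.DataCenter == failedMasterDC && takeOverMaster(r, promoted) {
			return r, nil
		}
	}
	// Abort path: no RESET SLAVE ALL, no SET @@global.read_only, no
	// PostMasterFailoverProcesses; PostUnsuccessfulFailoverProcesses run instead.
	return Instance{}, fmt.Errorf("no promotable server in %s; failover aborted", failedMasterDC)
}

// Placeholder helpers so the sketch compiles.
func pickMostAdvancedReplica(replicas []Instance) Instance { return replicas[0] }
func takeOverMaster(newMaster, interim Instance) bool      { return true }

func main() {
	replicas := []Instance{{"srvX", "dc2"}, {"srvB", "dc1"}, {"srvC", "dc1"}}
	fmt.Println(chooseNewMaster("dc1", replicas, true))
}
```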

cc @github/database-infrastructure @matt-ullmer @jeremycole, @sroysen

@shlomi-noach
Collaborator Author

shlomi-noach commented Dec 27, 2018

TODO:

  • documentation

@shlomi-noach
Collaborator Author

As an example of a complex scenario:

srvA.dc1
+ srvB.dc1
+ srvC.dc1
+ srvX.dc2
  + srvY.dc2
  + srvZ.dc2

Assume master srvA.dc1 fails, and "PreventCrossDataCenterMasterFailover": true.

If most up-to-date replica is srvB.dc1 or srvC.dc1 then everything is simple and there's no problem picking the replacement master.

If most up-to-date replica is srvX.dc2, then orchestrator will:

  • step 1:
srvX.dc2
+ srvB.dc1
+ srvC.dc1
+ srvY.dc2
+ srvZ.dc2
  • step 2:
    Realize the promoted server is invalid. Try a 2-step promotion of, say, srvB.dc1 (both steps are sketched after this example)

    • if successful:
srvB.dc1
+ srvX.dc2
  + srvC.dc1
  + srvY.dc2
  + srvZ.dc2
  • if unsuccessful, fail the operation.
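
A toy Go sketch of this walkthrough, modeling the topology as a map of "replicates from" pointers; the hostnames follow the example, everything else is illustrative:

```go
package main

import "fmt"

func main() {
	// "replicates from" pointers for the example topology, before the failure.
	parents := map[string]string{
		"srvB.dc1": "srvA.dc1", "srvC.dc1": "srvA.dc1", "srvX.dc2": "srvA.dc1",
		"srvY.dc2": "srvX.dc2", "srvZ.dc2": "srvX.dc2",
	}
	// Step 1: srvA.dc1 fails and srvX.dc2 is the most up-to-date replica,
	// so it is promoted as the interim master and the dc1 replicas are
	// repointed under it.
	for server, master := range parents {
		if master == "srvA.dc1" && server != "srvX.dc2" {
			parents[server] = "srvX.dc2"
		}
	}
	delete(parents, "srvX.dc2") // interim master replicates from no one
	// Step 2: srvX.dc2 is in the wrong DC, so srvB.dc1 takes over on top of
	// it: srvB.dc1 becomes the master and srvX.dc2 (with everything under it)
	// replicates from srvB.dc1, matching the final diagram above.
	delete(parents, "srvB.dc1")
	parents["srvX.dc2"] = "srvB.dc1"
	fmt.Println(parents)
}
```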

@sroysen

sroysen commented Jan 2, 2019

ping @jordanwheeler, @akshaysuryawanshi

@akshaysuryawanshi

  • if unsuccessful, fail the operation.

@shlomi-noach so does this fail only step 2, or does it undo step 1 as well? Because now we effectively have a master in another DC, although it is not accepting writes, since it fails to find the right master and setting read_only off also fails?

@shlomi-noach
Collaborator Author

@akshaysuryawanshi

or does it undo step 1 as well

It does not undo step 1. Can you please explain your scenario again? I'm not sure whether what you're describing is a failover experiment with this branch, or whether your scenario is unrelated.

@akshaysuryawanshi

@shlomi-noach I was trying to understand your example scenario. If I understand correctly, the config option will make Orchestrator choose a better replica (one in the same DC) in the second step of the failover process. So if it isn't able to find a valid replica, supposedly due to replication lag on them, Orchestrator will leave the topology with a remote master host, but not make it writable. Is that a correct understanding of this PR?

Also, does the 2-step promotion retry based on some timeout or number of retries, or does it check only once after step 1 is completed?

@shlomi-noach
Collaborator Author

If I understand correctly, the config option will make Orchestrator choose a better replica (one in the same DC) in the second step of the failover process. So if it isn't able to find a valid replica, supposedly due to replication lag on them, Orchestrator will leave the topology with a remote master host, but not make it writable. Is that a correct understanding of this PR?

correct

Also, does the 2-step promotion retry based on some timeout or number of retries, or does it check only once after step 1 is completed?

Only once.

I should clarify the example I presented is the most complex case. "If most up-to-date replica is srvB.dc1 or srvC.dc1 then everything is simple and there's no problem picking the replacement master." should be the more common case.

@akshaysuryawanshi

I should clarify the example I presented is the most complex case.

Makes sense, we are testing a similar kind of flag, which is checked in IsBannedFromBeingCandidateReplica. If the candidateReplica's DC is not the same as its master's, then we return false for that replica and abort the failover. It is then up to the user to take the correct action based on the state of the cluster, one of which is what this PR does in step 1 (a sketch of such a check follows below).

The scenario is exactly as you mentioned, a pretty complex one.
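
A minimal Go sketch of the kind of check described above; the method name comes from the comment, but the signature and fields here are illustrative assumptions, not orchestrator's actual code:

```go
// Illustrative stand-in type; only the idea of the DC check comes from the
// comment above.
type Instance struct {
	DataCenter string
}

// A candidate whose DC differs from its master's DC is treated as banned
// (disqualified), so the failover is aborted rather than promoting a server
// in another data center.
func isBannedFromBeingCandidateReplica(candidate, master Instance) bool {
	return candidate.DataCenter != master.DataCenter
}
```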

@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=conductor January 6, 2019 06:38 Inactive
@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=conductor January 8, 2019 06:14 Inactive
@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=conductor January 8, 2019 13:56 Inactive
@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=conductor January 9, 2019 07:07 Inactive
@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=conductor January 14, 2019 06:36 Inactive
@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=conductor January 14, 2019 06:55 Inactive
@shlomi-noach
Collaborator Author

woot! Tests well in production

@shlomi-noach shlomi-noach merged commit 7d98b24 into master Jan 14, 2019
@shlomi-noach shlomi-noach deleted the disable-cross-dc-master-failover branch January 14, 2019 07:04