New config & failover behavior: PreventCrossDataCenterMasterFailover #766
Conversation
TODO:

As an example of a complex scenario: assume the master fails. If the most up-to-date replica is in the same DC as the failed master, picking the replacement is simple. If the most up-to-date replica is in another DC, a 2-step promotion is required.
ping @jordanwheeler, @akshaysuryawanshi
@shlomi-noach so does this fail only step 2, or does it undo step 1 as well? Because now we effectively have a master in another DC, although not accepting writes, since it fails to find the right master and setting read_only off also fails?
it does not undo step 1. Can you please explain your scenario again? I'm not sure whether what you're describing is a failover experiment with this branch, or whether your scenario is unrelated.
@shlomi-noach I was trying to understand your example scenario. If I understand correctly, the config option will make Orchestrator choose a better replica (one in the same DC) in the second step of the failover process. So if it isn't able to find a valid replica, supposedly due to replication lag on them, Orchestrator will leave the topology with a remote master host, but not make it writable. Is that a correct understanding of this PR? Also, does the 2-step promotion retry based on some timeout or number of retries, or does it check only once after step 1 is completed?
Correct.
Only once. I should clarify that the example I presented is the most complex case. "If most up-to-date replica is srvB.dc1 or srvC.dc1 then everything is simple and there's no problem picking the replacement master" should be the more common case.
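The decision flow discussed above could be sketched roughly as follows. This is a simplified, hypothetical illustration, not orchestrator's actual code: `Instance`, `planFailover`, and the `ExecutedGTID` field are made up for the example, and real promotion rules consider much more (promotion rules, binlog formats, lag, etc.).

```go
package main

import (
	"errors"
	"fmt"
)

// Instance is a hypothetical, stripped-down stand-in for a replica;
// ExecutedGTID is a rough proxy for "how up-to-date" it is.
type Instance struct {
	Host         string
	DataCenter   string
	ExecutedGTID int64
}

// planFailover sketches the flow described in this PR:
// step 1 promotes the most up-to-date replica (replication requires it),
// step 2 then tries to place a same-DC server on top of it, and with
// PreventCrossDataCenterMasterFailover the failover aborts if the final
// candidate is still in another DC.
func planFailover(masterDC string, replicas []Instance, preventCrossDC bool) (*Instance, error) {
	if len(replicas) == 0 {
		return nil, errors.New("no replicas to promote")
	}
	// Step 1: most up-to-date replica, regardless of DC.
	step1 := &replicas[0]
	for i := range replicas {
		if replicas[i].ExecutedGTID > step1.ExecutedGTID {
			step1 = &replicas[i]
		}
	}
	final := step1
	// Step 2: if step 1 landed cross-DC, attempt a 2-step promotion onto
	// the most up-to-date same-DC replica.
	if step1.DataCenter != masterDC {
		for i := range replicas {
			r := &replicas[i]
			if r.DataCenter == masterDC && (final.DataCenter != masterDC || r.ExecutedGTID > final.ExecutedGTID) {
				final = r
			}
		}
	}
	if preventCrossDC && final.DataCenter != masterDC {
		return nil, fmt.Errorf("final candidate %s is in %s, not %s: aborting failover", final.Host, final.DataCenter, masterDC)
	}
	return final, nil
}

func main() {
	replicas := []Instance{
		{"srvB", "dc1", 100},
		{"srvC", "dc1", 90},
		{"srvD", "dc2", 120}, // most up-to-date, but in the wrong DC
	}
	final, err := planFailover("dc1", replicas, true)
	fmt.Println(final.Host, err) // prints: srvB <nil>
}
```

With the flag set and only `srvD.dc2` available, the same call would instead return an error, matching the "abort with error" behavior below.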
Makes sense; we are testing a similar kind of flag. The scenario is exactly as you mentioned, a pretty complex one.
woot! Tests well in production |
Introducing `PreventCrossDataCenterMasterFailover` (boolean), defaults `false`. Setting to `true` forces `orchestrator` to only fail over masters within the same DC as the failed master.

Some notes, regardless of this new config:
- `orchestrator` will try its best to pick a replica from the same DC.
- `orchestrator` then proceeds to check whether it should perform a 2-step promotion, i.e. whether it can promote yet another server on top of the one already chosen.

Now, when `PreventCrossDataCenterMasterFailover: true`, the final-suggested server must be in the same DC as the failed master. If it is not, the failover is aborted with an error:
- `RESET SLAVE ALL` and `SET @@global.read_only` will not be executed.
- `PostMasterFailoverProcesses` will not be executed.
- `PostUnsuccessfulFailoverProcesses` will be executed.

Also: with `PreventCrossDataCenterMasterFailover: true`, the `raft` DC distribution becomes mostly irrelevant. To elaborate: no matter which DCs the `orchestrator/raft` members are running in, there will never be a cross-DC master failover. Say the master is in `dc1` and `dc1` gets network isolated:
- If `orchestrator/raft` has a quorum in `dc1`, then it is happy: it can see the master and there is no failover.
- If the `orchestrator/raft` quorum is outside `dc1`, then the leader (running from some `dc2`) will attempt a failover. It will run pre-failover hooks, but it will very quickly realize it cannot find a server to promote, because all of the servers in `dc1` are inaccessible to it and all other servers are disqualified.

cc @github/database-infrastructure @matt-ullmer @jeremycole @sroysen
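For reference, enabling the behavior might look something like the fragment below in `orchestrator`'s JSON configuration file. This is a sketch: the flag is only meaningful when orchestrator can resolve each instance's data center (e.g. via `DetectDataCenterQuery` or `DataCenterPattern`), and the query shown here is just an illustrative example, not a recommended value.

```json
{
  "PreventCrossDataCenterMasterFailover": true,
  "DetectDataCenterQuery": "select substring_index(@@hostname, '-', 1)"
}
```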