From 0f651c7567d147f9837b2d83aa3253a4d84db01a Mon Sep 17 00:00:00 2001
From: Vitaliy Kukharik
Date: Mon, 16 Nov 2020 17:27:58 +0300
Subject: [PATCH] etcd: set "initial-election-tick-advance=false"

By default, etcd runs with --initial-election-tick-advance=true, which
makes the local member fast-forward its election ticks to speed up the
"initial" leader election trigger. This benefits deployments with larger
election ticks. For instance, a cross-datacenter deployment may require
a longer election timeout of 10 seconds. If the option is true, the
local node does not need to wait up to 10 seconds. Instead, it forwards
its election ticks to 8 seconds and has only 2 seconds left before
leader election.

The major assumptions are that either the cluster has no active leader,
so advancing ticks enables faster leader election, or the cluster
already has an established leader and a rejoining follower is likely to
receive heartbeats from the leader after the tick advance and before the
election timeout.

However, when the network from the leader to the rejoining follower is
congested and the follower does not receive a leader heartbeat within
the remaining election ticks, a disruptive election has to happen, which
affects cluster availability.

Disabling this option slows down the initial bootstrap process for
cross-datacenter deployments. We don't care too much about the delay in
cluster bootstrap, but we do care about the availability of etcd
clusters. With "initial-election-tick-advance" set to false, a rejoining
node has a better chance of receiving leader heartbeats before
disrupting the cluster.

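The timing described above can be sketched as simple arithmetic. This is
only an illustration, not etcd's implementation: the helper names are
hypothetical, the 8-second advance is taken from the example in this
message, and the sketch assumes one election tick equals one heartbeat
interval.

```python
# Illustrative arithmetic only; this is NOT etcd's implementation.
# Helper names are hypothetical and exist just for this sketch.


def election_ticks(election_timeout_ms: int, heartbeat_interval_ms: int) -> int:
    """Number of heartbeat-sized ticks in one election timeout (assumed 1 tick = 1 heartbeat)."""
    return election_timeout_ms // heartbeat_interval_ms


def remaining_ms(election_timeout_ms: int, advanced_ms: int) -> int:
    """Time left before a follower may start an election after fast-forwarding."""
    return election_timeout_ms - advanced_ms


# Example from this message: a 10 s timeout fast-forwarded to 8 s.
cross_dc_remaining = remaining_ms(10_000, 8_000)

# Values from etcd.conf.j2 in this patch, with the tick advance disabled
# (nothing is fast-forwarded, so the whole timeout remains).
role_ticks = election_ticks(5000, 1000)
role_remaining = remaining_ms(5000, 0)

print(cross_dc_remaining, role_ticks, role_remaining)
```

With the advance disabled, a rejoining follower in this role keeps its
whole election timeout, so the leader has several heartbeat intervals to
reach it before the follower can start an election.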
https://github.com/etcd-io/etcd/issues/9333
---
 roles/etcd/templates/etcd.conf.j2 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/roles/etcd/templates/etcd.conf.j2 b/roles/etcd/templates/etcd.conf.j2
index 6e7d913a4..cc29baa25 100644
--- a/roles/etcd/templates/etcd.conf.j2
+++ b/roles/etcd/templates/etcd.conf.j2
@@ -9,4 +9,5 @@ ETCD_INITIAL_CLUSTER_STATE="new"
 ETCD_DATA_DIR="{{ etcd_data_dir }}"
 ETCD_ELECTION_TIMEOUT="5000"
 ETCD_HEARTBEAT_INTERVAL="1000"
+ETCD_INITIAL_ELECTION_TICK_ADVANCE="false"
 ETCD_ENABLE_V2="true"