This repository has been archived by the owner on Feb 6, 2025. It is now read-only.

Update etcd to 3.4.10+ (3.4.13) #1327

Closed


jordimassaguerpla commented Aug 18, 2020

Why is this PR needed?

Fixes CVE-2020-15106 (bsc#1174951). See https://github.com/SUSE/avant-garde/issues/1876

Reminder: Add the "fixes bsc#XXXX" to the title of the commit so that it will
appear in the changelog.

What does this PR do?

Updates etcd to fix a security issue (CVE-2020-15106).

Anything else a reviewer needs to know?

The packages are in https://build.suse.de/project/show/Devel:CaaSP:4.5:Branches:etcd_3.4.10

Info for QA

This is info for QA so that they can validate this. This is mandatory if this PR fixes a bug.
If this is a new feature, a good description in "What does this PR do" may be enough.

Related info

Info that can be relevant for QA:

  • link to other PRs that should be merged together
  • link to packages that should be released together
  • upstream issues

Status BEFORE applying the patch

How can we reproduce the issue? How can we see this issue? Please provide the steps and the proof
that this issue is not fixed.

Check the etcd version.

Status AFTER applying the patch

How can we validate this issue is fixed? Please provide the steps and the proof that this issue is fixed.

Check the etcd version and check the test results for regressions.
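
As a rough illustration of the version check, the sketch below reads the version from etcd's startup log line (the `etcdmain: etcd Version: ...` format that etcd 3.4 prints with the capnslog logger) and compares it with 3.4.10, the minimum named in this PR's title. The function names and the `MIN_FIXED` constant are hypothetical, and the comparison is only meaningful for the 3.4 release line.

```python
import re

# Minimum fixed version, taken from the PR title ("3.4.10+").
# Assumption: only the 3.4 release line is being validated.
MIN_FIXED = (3, 4, 10)

# Matches the startup line etcd prints, e.g. "etcdmain: etcd Version: 3.4.10".
VERSION_RE = re.compile(r"etcd Version: (\d+)\.(\d+)\.(\d+)")


def etcd_version(log_text):
    """Return the (major, minor, patch) tuple found in log_text, or None."""
    match = VERSION_RE.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_fixed(log_text):
    """Return True if the logged etcd version is at least MIN_FIXED."""
    version = etcd_version(log_text)
    return version is not None and version >= MIN_FIXED
```

The log text could come from `sudo crictl logs <container-id>` on a master node. Comparing integer tuples rather than strings avoids ranking "3.4.3" above "3.4.10".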

Docs

If docs need to be updated, please add a link to a PR to https://github.com/SUSE/doc-caasp.
At the time of creating the issue, this PR can be work in progress (set its title to [WIP]),
but the documentation needs to be finalized before the PR can be merged.

The etcd version should be updated in the attributes file:

SUSE/doc-caasp#972

Merge restrictions

(Please do not edit this)

We are in v4-maintenance phase, so we will restrict what can be merged to prevent unexpected surprises:

What can be merged (merge criteria):
    2 approvals:
        1 developer: code is fine
        1 QA: QA is fine
    there is a PR for updating documentation (or a statement that this is not needed)

fixes CVE-2020-15106 bsc#1174951

Signed-off-by: Jordi Massaguer Pla <jmassaguerpla@suse.de>
jordimassaguerpla added a commit to SUSE/doc-caasp that referenced this pull request Aug 18, 2020
This goes along with:

SUSE/skuba#1327

jordimassaguerpla commented Aug 18, 2020

This is the error I see in the kubelet log:

Aug 18 12:26:10 0100164095144 kubelet[20993]: I0818 12:26:10.978689   20993 kubelet_getters.go:173] status for pod etcd-caasp-master-105-caasp-jobs-dev-e2e-test-0 updated to {Running [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:11:11 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:25:59 +0000 UTC  } {ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:25:59 +0000 UTC  } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:11:11 +0000 UTC  }]    10.164.95.144 10.164.95.144 [{10.164.95.144}] 2020-08-18 12:11:11 +0000 UTC [] [{etcd {nil &ContainerStateRunning{StartedAt:2020-08-18 12:25:59 +0000 UTC,} nil} {nil nil &ContainerStateTerminated{ExitCode:2,Signal:0,Reason:Error,Message:,StartedAt:2020-08-18 12:21:45 +0000 UTC,FinishedAt:2020-08-18 12:23:15 +0000 UTC,ContainerID:cri-o://6680ca37dbcffa0b2a75f2868221e61951f8630bc568156861fe012dfe492630,}} true 7 registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/etcd:3.4.10 registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/etcd@sha256:1ba623cd1cfe7e96161bb0306b3d9981ac69b8915a199a17cff74a5f739bf73b cri-o://471d4157cb521c57303c117ac5f0e28893caacd9f127ff127bbac2cadce1675a 0xc001b137a9}] BestEffort []}
Aug 18 12:26:10 0100164095144 kubelet[20993]: I0818 12:26:10.978723   20993 kubelet_getters.go:173] status for pod kube-apiserver-caasp-master-105-caasp-jobs-dev-e2e-test-0 updated to {Running [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:11:11 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:22:29 +0000 UTC ContainersNotReady containers with unready status: [kube-apiserver]} {ContainersReady False 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:22:29 +0000 UTC ContainersNotReady containers with unready status: [kube-apiserver]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:11:11 +0000 UTC  }]    10.164.95.144 10.164.95.144 [{10.164.95.144}] 2020-08-18 12:11:11 +0000 UTC [] [{kube-apiserver {&ContainerStateWaiting{Reason:CrashLoopBackOff,Message:back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-caasp-master-105-caasp-jobs-dev-e2e-test-0_kube-system(89696ff2b9d2600bb7ad2106752bb89e),} nil nil} {nil nil &ContainerStateTerminated{ExitCode:2,Signal:0,Reason:Error,Message:,StartedAt:2020-08-18 12:22:08 +0000 UTC,FinishedAt:2020-08-18 12:22:28 +0000 UTC,ContainerID:cri-o://22ac539d13180a243b10ef22225324ac35fae5044577fe376f6dfe47491103e4,}} false 7 registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/kube-apiserver:v1.18.6 registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/kube-apiserver@sha256:a16e46ca50fadbe5676d8cf37a8f3f2194e06b0a70f3aa0d6c59edcd71ee403b cri-o://22ac539d13180a243b10ef22225324ac35fae5044577fe376f6dfe47491103e4 0xc000eba609}] Burstable []}
Aug 18 12:26:10 0100164095144 kubelet[20993]: I0818 12:26:10.978754   20993 kubelet_getters.go:173] status for pod kube-scheduler-caasp-master-105-caasp-jobs-dev-e2e-test-0 updated to {Running [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:11:11 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:13:11 +0000 UTC  } {ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:13:11 +0000 UTC  } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-08-18 12:11:11 +0000 UTC  }]    10.164.95.144 10.164.95.144 [{10.164.95.144}] 2020-08-18 12:11:11 +0000 UTC [] [{kube-scheduler {nil &ContainerStateRunning{StartedAt:2020-08-18 12:13:10 +0000 UTC,} nil} {nil nil &ContainerStateTerminated{ExitCode:255,Signal:0,Reason:Error,Message:,StartedAt:2020-08-18 12:09:56 +0000 UTC,FinishedAt:2020-08-18 12:13:10 +0000 UTC,ContainerID:cri-o://a859af8bc5046a70e65c008f7f4bd452522b3263c3cb8c242b9535fedd6bed6d,}} true 1 registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/kube-scheduler:v1.18.6 registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/kube-scheduler@sha256:90685b90dcb5c7f427b02150d08fd94fffd8c524e4eda85a752ee3cf8b8e9b0b cri-o://a98528ef0365b58392708e75b2c8df37c58f8b6c5c351cabc63bb74e14b5a9d3 0xc0017c4b89}] Burstable []}
Aug 18 12:26:11 0100164095144 kubelet[20993]: E0818 12:26:11.295505   20993 event.go:269] Unable to write event: 'Post https://10.164.95.141:6443/api/v1/namespaces/kube-system/events: EOF' (may retry after sleeping)
Aug 18 12:26:13 0100164095144 kubelet[20993]: E0818 12:26:13.196127   20993 reflector.go:178] object-"kube-system"/"oidc-gangway-cert": Failed to list *v1.Secret: an error on the server ("") has prevented the request from succeeding (get secrets)
Aug 18 12:26:13 0100164095144 kubelet[20993]: W0818 12:26:13.431141   20993 status_manager.go:556] Failed to get status for pod "etcd-caasp-master-105-caasp-jobs-dev-e2e-test-0_kube-system(33d5f96b91fa3565cbfa8e3e52da970f)": an error on the server ("") has prevented the request from succeeding (get pods etcd-caasp-master-105-caasp-jobs-dev-e2e-test-0)
Aug 18 12:26:14 0100164095144 kubelet[20993]: I0818 12:26:14.020966   20993 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 0f7b30184cb6150fb560691bc9267451d4582e4d1dc6b1422a9a4cb930ee77a0
Aug 18 12:26:14 0100164095144 kubelet[20993]: I0818 12:26:14.020996   20993 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 22ac539d13180a243b10ef22225324ac35fae5044577fe376f6dfe47491103e4
Aug 18 12:26:14 0100164095144 kubelet[20993]: E0818 12:26:14.021356   20993 pod_workers.go:191] Error syncing pod 5cae50ee-3a6b-45b7-9298-431d379487cc ("kucero-tjdjq_kube-system(5cae50ee-3a6b-45b7-9298-431d379487cc)"), skipping: failed to "StartContainer" for "kucero" with CrashLoopBackOff: "back-off 5m0s restarting failed container=kucero pod=kucero-tjdjq_kube-system(5cae50ee-3a6b-45b7-9298-431d379487cc)"
Aug 18 12:26:14 0100164095144 kubelet[20993]: E0818 12:26:14.021558   20993 pod_workers.go:191] Error syncing pod 89696ff2b9d2600bb7ad2106752bb89e ("kube-apiserver-caasp-master-105-caasp-jobs-dev-e2e-test-0_kube-system(89696ff2b9d2600bb7ad2106752bb89e)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-caasp-master-105-caasp-jobs-dev-e2e-test-0_kube-system(89696ff2b9d2600bb7ad2106752bb89e)"
Aug 18 12:26:15 0100164095144 kubelet[20993]: E0818 12:26:15.096292   20993 controller.go:136] failed to ensure node lease exists, will retry in 7s, error: an error on the server ("") has prevented the request from succeeding (get leases.coordination.k8s.io caasp-master-105-caasp-jobs-dev-e2e-test-0)
Aug 18 12:26:15 0100164095144 kubelet[20993]: I0818 12:26:15.384257   20993 prober.go:124] Liveness probe for "etcd-caasp-master-105-caasp-jobs-dev-e2e-test-0_kube-system(33d5f96b91fa3565cbfa8e3e52da970f):etcd" failed (failure): HTTP probe failed with statuscode: 503

The full log is at https://ci.suse.de/view/CaaSP/view/CaaSP-Dev/job/caasp-jobs/job/dev/job/e2e-test/105/artifact/platform_logs/master_10_164_95_144/kubelet.log
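
To pick the failing containers out of a kubelet log like the one above, something along these lines might help. It is a minimal sketch that assumes the `pod_workers.go` message format shown in the excerpt; the function name is hypothetical.

```python
import re

# Matches kubelet pod_workers errors of the form seen in the excerpt:
#   failed to "StartContainer" for "<name>" with CrashLoopBackOff
CRASHLOOP_RE = re.compile(
    r'failed to "StartContainer" for "([^"]+)" with CrashLoopBackOff'
)


def crashlooping_containers(log_lines):
    """Return sorted, de-duplicated container names reported in CrashLoopBackOff."""
    names = set()
    for line in log_lines:
        match = CRASHLOOP_RE.search(line)
        if match:
            names.add(match.group(1))
    return sorted(names)
```

Applied to the excerpt above, this would report `kube-apiserver` and `kucero`, both of which depend on a working API server and therefore on etcd.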


jordimassaguerpla commented

This is the log from the etcd container that "exited":


sles@0100164095144:~> sudo crictl logs dc2b32768f9d5
2020-08-18 12:48:44.162007 W | pkg/flags: unrecognized environment variable ETCD_UNSUPPORTED_ARCH=arm64
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2020-08-18 12:48:44.162096 I | etcdmain: etcd Version: 3.4.10
2020-08-18 12:48:44.162099 I | etcdmain: Git SHA: Not provided (use ./build instead of go build)
2020-08-18 12:48:44.162101 I | etcdmain: Go Version: go1.14.2
2020-08-18 12:48:44.162103 I | etcdmain: Go OS/Arch: linux/amd64
2020-08-18 12:48:44.162106 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2020-08-18 12:48:44.162149 N | etcdmain: the server is already initialized as member before, starting as etcd member...
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2020-08-18 12:48:44.162166 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = 
2020-08-18 12:48:44.162776 I | embed: name = caasp-master-105-caasp-jobs-dev-e2e-test-0
2020-08-18 12:48:44.162783 I | embed: data dir = /var/lib/etcd
2020-08-18 12:48:44.162786 I | embed: member dir = /var/lib/etcd/member
2020-08-18 12:48:44.162788 I | embed: heartbeat = 100ms
2020-08-18 12:48:44.162790 I | embed: election = 1000ms
2020-08-18 12:48:44.162792 I | embed: snapshot count = 10000
2020-08-18 12:48:44.162796 I | embed: advertise client URLs = https://10.164.95.144:2379
2020-08-18 12:48:44.162799 I | embed: initial advertise peer URLs = https://10.164.95.144:2380
2020-08-18 12:48:44.162808 I | embed: initial cluster = 
2020-08-18 12:48:44.177965 I | etcdserver: restarting member 404272f994ea9271 in cluster 279d994dc5d9c353 at commit index 1394
raft2020/08/18 12:48:44 INFO: 404272f994ea9271 switched to configuration voters=()
raft2020/08/18 12:48:44 INFO: 404272f994ea9271 became follower at term 723
raft2020/08/18 12:48:44 INFO: newRaft 404272f994ea9271 [peers: [], term: 723, commit: 1394, applied: 0, lastindex: 1398, lastterm: 2]
2020-08-18 12:48:44.178949 W | auth: simple token is not cryptographically signed
2020-08-18 12:48:44.183967 I | etcdserver: starting server... [version: 3.4.10, cluster version: to_be_decided]
raft2020/08/18 12:48:44 INFO: 404272f994ea9271 switched to configuration voters=(4630389783161115249)
2020-08-18 12:48:44.184456 I | etcdserver/membership: added member 404272f994ea9271 [https://10.164.95.144:2380] to cluster 279d994dc5d9c353
2020-08-18 12:48:44.184523 N | etcdserver/membership: set the initial cluster version to 3.4
2020-08-18 12:48:44.184552 I | etcdserver/api: enabled capabilities for version 3.4
2020-08-18 12:48:44.185522 I | embed: ClientTLS: cert = /etc/kubernetes/pki/etcd/server.crt, key = /etc/kubernetes/pki/etcd/server.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = 
2020-08-18 12:48:44.185591 I | embed: listening for peers on 10.164.95.144:2380
2020-08-18 12:48:44.185658 I | embed: listening for metrics on http://127.0.0.1:2381
raft2020/08/18 12:48:44 INFO: 404272f994ea9271 switched to configuration voters=(4630389783161115249 7598523962297662078)
2020-08-18 12:48:44.188752 I | etcdserver/membership: added member 69736148f60bba7e [https://10.164.95.139:2380] to cluster 279d994dc5d9c353
2020-08-18 12:48:44.188774 I | rafthttp: starting peer 69736148f60bba7e...
2020-08-18 12:48:44.188797 I | rafthttp: started HTTP pipelining with peer 69736148f60bba7e
2020-08-18 12:48:44.189221 I | rafthttp: started streaming with peer 69736148f60bba7e (writer)
2020-08-18 12:48:44.189464 I | rafthttp: started streaming with peer 69736148f60bba7e (writer)
2020-08-18 12:48:44.190243 I | rafthttp: started peer 69736148f60bba7e
2020-08-18 12:48:44.190289 I | rafthttp: added peer 69736148f60bba7e
2020-08-18 12:48:44.190294 I | rafthttp: started streaming with peer 69736148f60bba7e (stream MsgApp v2 reader)
2020-08-18 12:48:44.190339 I | rafthttp: started streaming with peer 69736148f60bba7e (stream Message reader)
raft2020/08/18 12:48:45 INFO: 404272f994ea9271 is starting a new election at term 723
raft2020/08/18 12:48:45 INFO: 404272f994ea9271 became candidate at term 724
raft2020/08/18 12:48:45 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 724
raft2020/08/18 12:48:45 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 724
raft2020/08/18 12:48:46 INFO: 404272f994ea9271 is starting a new election at term 724
raft2020/08/18 12:48:46 INFO: 404272f994ea9271 became candidate at term 725
raft2020/08/18 12:48:46 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 725
raft2020/08/18 12:48:46 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 725
raft2020/08/18 12:48:47 INFO: 404272f994ea9271 is starting a new election at term 725
raft2020/08/18 12:48:47 INFO: 404272f994ea9271 became candidate at term 726
raft2020/08/18 12:48:47 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 726
raft2020/08/18 12:48:47 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 726
raft2020/08/18 12:48:48 INFO: 404272f994ea9271 is starting a new election at term 726
raft2020/08/18 12:48:48 INFO: 404272f994ea9271 became candidate at term 727
raft2020/08/18 12:48:48 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 727
raft2020/08/18 12:48:48 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 727
2020-08-18 12:48:49.190429 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:48:49.190493 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:48:50 INFO: 404272f994ea9271 is starting a new election at term 727
raft2020/08/18 12:48:50 INFO: 404272f994ea9271 became candidate at term 728
raft2020/08/18 12:48:50 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 728
raft2020/08/18 12:48:50 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 728
2020-08-18 12:48:51.184247 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:48:51 INFO: 404272f994ea9271 is starting a new election at term 728
raft2020/08/18 12:48:51 INFO: 404272f994ea9271 became candidate at term 729
raft2020/08/18 12:48:51 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 729
raft2020/08/18 12:48:51 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 729
raft2020/08/18 12:48:53 INFO: 404272f994ea9271 is starting a new election at term 729
raft2020/08/18 12:48:53 INFO: 404272f994ea9271 became candidate at term 730
raft2020/08/18 12:48:53 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 730
raft2020/08/18 12:48:53 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 730
2020-08-18 12:48:54.190542 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:48:54.190573 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:48:54 INFO: 404272f994ea9271 is starting a new election at term 730
raft2020/08/18 12:48:54 INFO: 404272f994ea9271 became candidate at term 731
raft2020/08/18 12:48:54 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 731
raft2020/08/18 12:48:54 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 731
raft2020/08/18 12:48:56 INFO: 404272f994ea9271 is starting a new election at term 731
raft2020/08/18 12:48:56 INFO: 404272f994ea9271 became candidate at term 732
raft2020/08/18 12:48:56 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 732
raft2020/08/18 12:48:56 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 732
raft2020/08/18 12:48:57 INFO: 404272f994ea9271 is starting a new election at term 732
raft2020/08/18 12:48:57 INFO: 404272f994ea9271 became candidate at term 733
raft2020/08/18 12:48:57 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 733
raft2020/08/18 12:48:57 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 733
2020-08-18 12:48:58.184358 E | etcdserver: publish error: etcdserver: request timed out
2020-08-18 12:48:59.190677 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:48:59.190808 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:48:59 INFO: 404272f994ea9271 is starting a new election at term 733
raft2020/08/18 12:48:59 INFO: 404272f994ea9271 became candidate at term 734
raft2020/08/18 12:48:59 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 734
raft2020/08/18 12:48:59 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 734
raft2020/08/18 12:49:00 INFO: 404272f994ea9271 is starting a new election at term 734
raft2020/08/18 12:49:00 INFO: 404272f994ea9271 became candidate at term 735
raft2020/08/18 12:49:00 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 735
raft2020/08/18 12:49:00 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 735
raft2020/08/18 12:49:02 INFO: 404272f994ea9271 is starting a new election at term 735
raft2020/08/18 12:49:02 INFO: 404272f994ea9271 became candidate at term 736
raft2020/08/18 12:49:02 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 736
raft2020/08/18 12:49:02 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 736
raft2020/08/18 12:49:03 INFO: 404272f994ea9271 is starting a new election at term 736
raft2020/08/18 12:49:03 INFO: 404272f994ea9271 became candidate at term 737
raft2020/08/18 12:49:03 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 737
raft2020/08/18 12:49:03 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 737
2020-08-18 12:49:04.190859 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:04.190884 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:05.184537 E | etcdserver: publish error: etcdserver: request timed out
2020-08-18 12:49:05.384131 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)
raft2020/08/18 12:49:05 INFO: 404272f994ea9271 is starting a new election at term 737
raft2020/08/18 12:49:05 INFO: 404272f994ea9271 became candidate at term 738
raft2020/08/18 12:49:05 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 738
raft2020/08/18 12:49:05 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 738
raft2020/08/18 12:49:06 INFO: 404272f994ea9271 is starting a new election at term 738
raft2020/08/18 12:49:06 INFO: 404272f994ea9271 became candidate at term 739
raft2020/08/18 12:49:06 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 739
raft2020/08/18 12:49:06 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 739
raft2020/08/18 12:49:08 INFO: 404272f994ea9271 is starting a new election at term 739
raft2020/08/18 12:49:08 INFO: 404272f994ea9271 became candidate at term 740
raft2020/08/18 12:49:08 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 740
raft2020/08/18 12:49:08 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 740
2020-08-18 12:49:09.190928 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:09.190954 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:09 INFO: 404272f994ea9271 is starting a new election at term 740
raft2020/08/18 12:49:09 INFO: 404272f994ea9271 became candidate at term 741
raft2020/08/18 12:49:09 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 741
raft2020/08/18 12:49:09 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 741
raft2020/08/18 12:49:10 INFO: 404272f994ea9271 is starting a new election at term 741
raft2020/08/18 12:49:10 INFO: 404272f994ea9271 became candidate at term 742
raft2020/08/18 12:49:10 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 742
raft2020/08/18 12:49:10 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 742
2020-08-18 12:49:12.184673 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:49:12 INFO: 404272f994ea9271 is starting a new election at term 742
raft2020/08/18 12:49:12 INFO: 404272f994ea9271 became candidate at term 743
raft2020/08/18 12:49:12 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 743
raft2020/08/18 12:49:12 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 743
raft2020/08/18 12:49:13 INFO: 404272f994ea9271 is starting a new election at term 743
raft2020/08/18 12:49:13 INFO: 404272f994ea9271 became candidate at term 744
raft2020/08/18 12:49:13 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 744
raft2020/08/18 12:49:13 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 744
2020-08-18 12:49:14.190991 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:14.191018 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:15 INFO: 404272f994ea9271 is starting a new election at term 744
raft2020/08/18 12:49:15 INFO: 404272f994ea9271 became candidate at term 745
raft2020/08/18 12:49:15 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 745
raft2020/08/18 12:49:15 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 745
2020-08-18 12:49:15.384113 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)
raft2020/08/18 12:49:16 INFO: 404272f994ea9271 is starting a new election at term 745
raft2020/08/18 12:49:16 INFO: 404272f994ea9271 became candidate at term 746
raft2020/08/18 12:49:16 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 746
raft2020/08/18 12:49:16 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 746
raft2020/08/18 12:49:17 INFO: 404272f994ea9271 is starting a new election at term 746
raft2020/08/18 12:49:17 INFO: 404272f994ea9271 became candidate at term 747
raft2020/08/18 12:49:17 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 747
raft2020/08/18 12:49:17 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 747
raft2020/08/18 12:49:18 INFO: 404272f994ea9271 is starting a new election at term 747
raft2020/08/18 12:49:18 INFO: 404272f994ea9271 became candidate at term 748
raft2020/08/18 12:49:18 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 748
raft2020/08/18 12:49:18 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 748
2020-08-18 12:49:19.184806 E | etcdserver: publish error: etcdserver: request timed out
2020-08-18 12:49:19.191158 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:19.191183 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:20 INFO: 404272f994ea9271 is starting a new election at term 748
raft2020/08/18 12:49:20 INFO: 404272f994ea9271 became candidate at term 749
raft2020/08/18 12:49:20 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 749
raft2020/08/18 12:49:20 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 749
raft2020/08/18 12:49:21 INFO: 404272f994ea9271 is starting a new election at term 749
raft2020/08/18 12:49:21 INFO: 404272f994ea9271 became candidate at term 750
raft2020/08/18 12:49:21 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 750
raft2020/08/18 12:49:21 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 750
raft2020/08/18 12:49:23 INFO: 404272f994ea9271 is starting a new election at term 750
raft2020/08/18 12:49:23 INFO: 404272f994ea9271 became candidate at term 751
raft2020/08/18 12:49:23 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 751
raft2020/08/18 12:49:23 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 751
2020-08-18 12:49:24.191212 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:24.191253 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:24 INFO: 404272f994ea9271 is starting a new election at term 751
raft2020/08/18 12:49:24 INFO: 404272f994ea9271 became candidate at term 752
raft2020/08/18 12:49:24 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 752
raft2020/08/18 12:49:24 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 752
2020-08-18 12:49:25.384031 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)
raft2020/08/18 12:49:25 INFO: 404272f994ea9271 is starting a new election at term 752
raft2020/08/18 12:49:25 INFO: 404272f994ea9271 became candidate at term 753
raft2020/08/18 12:49:25 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 753
raft2020/08/18 12:49:25 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 753
2020-08-18 12:49:26.184983 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:49:27 INFO: 404272f994ea9271 is starting a new election at term 753
raft2020/08/18 12:49:27 INFO: 404272f994ea9271 became candidate at term 754
raft2020/08/18 12:49:27 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 754
raft2020/08/18 12:49:27 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 754
raft2020/08/18 12:49:28 INFO: 404272f994ea9271 is starting a new election at term 754
raft2020/08/18 12:49:28 INFO: 404272f994ea9271 became candidate at term 755
raft2020/08/18 12:49:28 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 755
raft2020/08/18 12:49:28 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 755
2020-08-18 12:49:29.191279 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:29.191319 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:29 INFO: 404272f994ea9271 is starting a new election at term 755
raft2020/08/18 12:49:29 INFO: 404272f994ea9271 became candidate at term 756
raft2020/08/18 12:49:29 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 756
raft2020/08/18 12:49:29 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 756
raft2020/08/18 12:49:30 INFO: 404272f994ea9271 is starting a new election at term 756
raft2020/08/18 12:49:30 INFO: 404272f994ea9271 became candidate at term 757
raft2020/08/18 12:49:30 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 757
raft2020/08/18 12:49:30 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 757
raft2020/08/18 12:49:32 INFO: 404272f994ea9271 is starting a new election at term 757
raft2020/08/18 12:49:32 INFO: 404272f994ea9271 became candidate at term 758
raft2020/08/18 12:49:32 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 758
raft2020/08/18 12:49:32 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 758
2020-08-18 12:49:33.185154 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:49:33 INFO: 404272f994ea9271 is starting a new election at term 758
raft2020/08/18 12:49:33 INFO: 404272f994ea9271 became candidate at term 759
raft2020/08/18 12:49:33 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 759
raft2020/08/18 12:49:33 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 759
2020-08-18 12:49:34.191326 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:34.191363 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:35 INFO: 404272f994ea9271 is starting a new election at term 759
raft2020/08/18 12:49:35 INFO: 404272f994ea9271 became candidate at term 760
raft2020/08/18 12:49:35 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 760
raft2020/08/18 12:49:35 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 760
2020-08-18 12:49:35.384093 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)
raft2020/08/18 12:49:36 INFO: 404272f994ea9271 is starting a new election at term 760
raft2020/08/18 12:49:36 INFO: 404272f994ea9271 became candidate at term 761
raft2020/08/18 12:49:36 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 761
raft2020/08/18 12:49:36 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 761
raft2020/08/18 12:49:37 INFO: 404272f994ea9271 is starting a new election at term 761
raft2020/08/18 12:49:37 INFO: 404272f994ea9271 became candidate at term 762
raft2020/08/18 12:49:37 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 762
raft2020/08/18 12:49:37 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 762
raft2020/08/18 12:49:38 INFO: 404272f994ea9271 is starting a new election at term 762
raft2020/08/18 12:49:38 INFO: 404272f994ea9271 became candidate at term 763
raft2020/08/18 12:49:38 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 763
raft2020/08/18 12:49:38 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 763
2020-08-18 12:49:39.191390 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:39.191418 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:40.185273 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:49:40 INFO: 404272f994ea9271 is starting a new election at term 763
raft2020/08/18 12:49:40 INFO: 404272f994ea9271 became candidate at term 764
raft2020/08/18 12:49:40 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 764
raft2020/08/18 12:49:40 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 764
raft2020/08/18 12:49:41 INFO: 404272f994ea9271 is starting a new election at term 764
raft2020/08/18 12:49:41 INFO: 404272f994ea9271 became candidate at term 765
raft2020/08/18 12:49:41 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 765
raft2020/08/18 12:49:41 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 765
raft2020/08/18 12:49:42 INFO: 404272f994ea9271 is starting a new election at term 765
raft2020/08/18 12:49:42 INFO: 404272f994ea9271 became candidate at term 766
raft2020/08/18 12:49:42 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 766
raft2020/08/18 12:49:42 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 766
2020-08-18 12:49:44.191499 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:44.191529 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:44 INFO: 404272f994ea9271 is starting a new election at term 766
raft2020/08/18 12:49:44 INFO: 404272f994ea9271 became candidate at term 767
raft2020/08/18 12:49:44 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 767
raft2020/08/18 12:49:44 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 767
2020-08-18 12:49:45.384079 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)
raft2020/08/18 12:49:46 INFO: 404272f994ea9271 is starting a new election at term 767
raft2020/08/18 12:49:46 INFO: 404272f994ea9271 became candidate at term 768
raft2020/08/18 12:49:46 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 768
raft2020/08/18 12:49:46 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 768
2020-08-18 12:49:47.185449 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:49:47 INFO: 404272f994ea9271 is starting a new election at term 768
raft2020/08/18 12:49:47 INFO: 404272f994ea9271 became candidate at term 769
raft2020/08/18 12:49:47 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 769
raft2020/08/18 12:49:47 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 769
raft2020/08/18 12:49:49 INFO: 404272f994ea9271 is starting a new election at term 769
raft2020/08/18 12:49:49 INFO: 404272f994ea9271 became candidate at term 770
raft2020/08/18 12:49:49 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 770
raft2020/08/18 12:49:49 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 770
2020-08-18 12:49:49.191978 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:49.192004 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:50 INFO: 404272f994ea9271 is starting a new election at term 770
raft2020/08/18 12:49:50 INFO: 404272f994ea9271 became candidate at term 771
raft2020/08/18 12:49:50 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 771
raft2020/08/18 12:49:50 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 771
raft2020/08/18 12:49:52 INFO: 404272f994ea9271 is starting a new election at term 771
raft2020/08/18 12:49:52 INFO: 404272f994ea9271 became candidate at term 772
raft2020/08/18 12:49:52 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 772
raft2020/08/18 12:49:52 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 772
2020-08-18 12:49:54.185609 E | etcdserver: publish error: etcdserver: request timed out
2020-08-18 12:49:54.192084 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:54.192150 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:49:54 INFO: 404272f994ea9271 is starting a new election at term 772
raft2020/08/18 12:49:54 INFO: 404272f994ea9271 became candidate at term 773
raft2020/08/18 12:49:54 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 773
raft2020/08/18 12:49:54 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 773
2020-08-18 12:49:55.383951 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)
raft2020/08/18 12:49:55 INFO: 404272f994ea9271 is starting a new election at term 773
raft2020/08/18 12:49:55 INFO: 404272f994ea9271 became candidate at term 774
raft2020/08/18 12:49:55 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 774
raft2020/08/18 12:49:55 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 774
raft2020/08/18 12:49:57 INFO: 404272f994ea9271 is starting a new election at term 774
raft2020/08/18 12:49:57 INFO: 404272f994ea9271 became candidate at term 775
raft2020/08/18 12:49:57 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 775
raft2020/08/18 12:49:57 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 775
raft2020/08/18 12:49:58 INFO: 404272f994ea9271 is starting a new election at term 775
raft2020/08/18 12:49:58 INFO: 404272f994ea9271 became candidate at term 776
raft2020/08/18 12:49:58 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 776
raft2020/08/18 12:49:58 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 776
2020-08-18 12:49:59.192241 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:49:59.192268 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:50:00 INFO: 404272f994ea9271 is starting a new election at term 776
raft2020/08/18 12:50:00 INFO: 404272f994ea9271 became candidate at term 777
raft2020/08/18 12:50:00 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 777
raft2020/08/18 12:50:00 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 777
2020-08-18 12:50:01.185778 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:50:01 INFO: 404272f994ea9271 is starting a new election at term 777
raft2020/08/18 12:50:01 INFO: 404272f994ea9271 became candidate at term 778
raft2020/08/18 12:50:01 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 778
raft2020/08/18 12:50:01 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 778
raft2020/08/18 12:50:02 INFO: 404272f994ea9271 is starting a new election at term 778
raft2020/08/18 12:50:02 INFO: 404272f994ea9271 became candidate at term 779
raft2020/08/18 12:50:02 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 779
raft2020/08/18 12:50:02 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 779
2020-08-18 12:50:04.192350 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:50:04.192375 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:50:04 INFO: 404272f994ea9271 is starting a new election at term 779
raft2020/08/18 12:50:04 INFO: 404272f994ea9271 became candidate at term 780
raft2020/08/18 12:50:04 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 780
raft2020/08/18 12:50:04 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 780
2020-08-18 12:50:05.384017 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)
raft2020/08/18 12:50:06 INFO: 404272f994ea9271 is starting a new election at term 780
raft2020/08/18 12:50:06 INFO: 404272f994ea9271 became candidate at term 781
raft2020/08/18 12:50:06 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 781
raft2020/08/18 12:50:06 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 781
raft2020/08/18 12:50:08 INFO: 404272f994ea9271 is starting a new election at term 781
raft2020/08/18 12:50:08 INFO: 404272f994ea9271 became candidate at term 782
raft2020/08/18 12:50:08 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 782
raft2020/08/18 12:50:08 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 782
2020-08-18 12:50:08.185936 E | etcdserver: publish error: etcdserver: request timed out
raft2020/08/18 12:50:09 INFO: 404272f994ea9271 is starting a new election at term 782
raft2020/08/18 12:50:09 INFO: 404272f994ea9271 became candidate at term 783
raft2020/08/18 12:50:09 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 783
raft2020/08/18 12:50:09 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 783
2020-08-18 12:50:09.192418 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:50:09.192454 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:50:10 INFO: 404272f994ea9271 is starting a new election at term 783
raft2020/08/18 12:50:10 INFO: 404272f994ea9271 became candidate at term 784
raft2020/08/18 12:50:10 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 784
raft2020/08/18 12:50:10 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 784
raft2020/08/18 12:50:12 INFO: 404272f994ea9271 is starting a new election at term 784
raft2020/08/18 12:50:12 INFO: 404272f994ea9271 became candidate at term 785
raft2020/08/18 12:50:12 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 785
raft2020/08/18 12:50:12 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 785
raft2020/08/18 12:50:13 INFO: 404272f994ea9271 is starting a new election at term 785
raft2020/08/18 12:50:13 INFO: 404272f994ea9271 became candidate at term 786
raft2020/08/18 12:50:13 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 786
raft2020/08/18 12:50:13 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 786
2020-08-18 12:50:14.192547 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
2020-08-18 12:50:14.192574 W | rafthttp: health check for peer 69736148f60bba7e could not connect: dial tcp 10.164.95.139:2380: connect: connection refused
raft2020/08/18 12:50:15 INFO: 404272f994ea9271 is starting a new election at term 786
raft2020/08/18 12:50:15 INFO: 404272f994ea9271 became candidate at term 787
raft2020/08/18 12:50:15 INFO: 404272f994ea9271 received MsgVoteResp from 404272f994ea9271 at term 787
raft2020/08/18 12:50:15 INFO: 404272f994ea9271 [logterm: 2, index: 1398] sent MsgVote request to 69736148f60bba7e at term 787
2020-08-18 12:50:15.186071 E | etcdserver: publish error: etcdserver: request timed out
2020-08-18 12:50:15.384067 W | etcdserver/api/etcdhttp: /health error; no leader (status code 503)

@jordimassaguerpla
Member Author

@dannysauer Were you able to deploy a cluster with this new etcd? Can you give it a try?

@dannysauer
Contributor

It ran, but I didn't do a full cluster. I'll give that a shot now.

@dannysauer
Contributor

It ran, but I didn't do a full cluster. I'll give that a shot now.

Well, that's not working out. My SCC access still doesn't work, so I can't enable anything on my SLES 15.2 image, which inhibits building an actual cluster. 🤦

The etcd container log above indicates that there are two nodes in this etcd cluster - 404272f994ea9271 is the one running (on 10.164.95.144), and 69736148f60bba7e, which should be reachable at 10.164.95.139:2380. However, messages sent to 69736148f60bba7e are getting "connection refused", which means either that second etcd instance isn't listening on port 2380, or the connection is being rejected by a firewall rule. I don't think that indicates an error in etcd, since this container's instance didn't log an issue with starting up and binding to port 2380. But that does cause kubeadm to fail to work, since the etcd cluster appears to fail to enter a healthy state.
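A quick way to confirm the "connection refused" symptom from another node is a plain TCP connect to the peer port. This is only a sketch: the helper name `peer_port_open` is made up for illustration, and it cannot tell a closed port apart from a firewall rejecting or dropping the connection.

```python
import socket


def peer_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; it only tests reachability,
    not whether the listener is actually a healthy etcd peer.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError, timeouts, and unreachable-host errors
        # are all subclasses of OSError.
        return False
```

For the cluster in the log, `peer_port_open("10.164.95.139", 2380)` run from .144 would be expected to return False while the second etcd is down.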

I thought I'd pull down the container and see if I could replicate that behavior in just a manual etcd cluster, but the registry URL in the kubelet log - registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/etcd:3.4.10 - isn't something podman will pull for me. I had to use registry.suse.de/devel/caasp/4.5/branches/etcd_3.4.10/containers/caasp/v4.5/etcd:3.4.10. So I'm a little curious how the CI worked when I can't get that container to run at all here. But that's maybe a separate thing.

@jordimassaguerpla
Member Author

It ran, but I didn't do a full cluster. I'll give that a shot now.

Well, that's not working out. My SCC access still doesn't work, so I can't enable anything on my SLES 15.2 image, which inhibits building an actual cluster. 🤦

The etcd container log above indicates that there are two nodes in this etcd cluster - 404272f994ea9271 is the one running (on 10.164.95.144), and 69736148f60bba7e, which should be reachable at 10.164.95.139:2380. However, messages sent to 69736148f60bba7e are getting "connection refused", which means either that second etcd instance isn't listening on port 2380, or the connection is being rejected by a firewall rule. I don't think that indicates an error in etcd, since this container's instance didn't log an issue with starting up and binding to port 2380. But that does cause kubeadm to fail to work, since the etcd cluster appears to fail to enter a healthy state.

I thought I'd pull down the container and see if I could replicate that behavior in just a manual etcd cluster, but the registry URL in the kubelet log - registry.suse.de/devel/caasp/4.5/containers/containers/caasp/v4.5/etcd:3.4.10 - isn't something podman will pull for me. I had to use registry.suse.de/devel/caasp/4.5/branches/etcd_3.4.10/containers/caasp/v4.5/etcd:3.4.10. So I'm a little curious how the CI worked when I can't get that container to run at all here. But that's maybe a separate thing.

Hi! I just sent you a subscription key for SCC via private message.

CI uses a cri-o feature for mirroring registry.suse.de/devel/caasp/4.5/containers/containers/ to registry.suse.de/devel/caasp/4.5/branches/etcd_3.4.10/. In short, you can see the registries.conf configuration in:

https://ci.suse.de/job/caasp-jobs/job/pr/job/test/job/PR-1327/1/artifact/platform_logs/master_10_164_95_77/containers/registries.conf

unqualified-search-registries = [ "docker.io",]
[[registry]]
prefix = "registry.suse.de/devel/caasp/4.5/containers/containers"
location = "registry.suse.de/devel/caasp/4.5/containers/containers"
[[registry.mirror]]
location = "registry.suse.de/devel/caasp/4.5/branches/etcd_3.4.10/containers"
insecure = true
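As a rough illustration of what that mirror entry does to an image reference, the sketch below swaps the matched prefix for the mirror location. It is a simplified model, not the actual containers/image logic: the function name `mirror_reference` is hypothetical, and the real resolver also handles wildcard prefixes, digests, and falling back to the primary location.

```python
def mirror_reference(image: str, prefix: str, mirror_location: str) -> str:
    """Rewrite an image reference as a prefix-matched mirror would see it.

    Simplified, hypothetical model of the registries.conf entry above.
    """
    if image != prefix and not image.startswith(prefix + "/"):
        # Prefix does not match: the reference is used unchanged.
        return image
    # Keep everything after the prefix (repository path and tag).
    return mirror_location + image[len(prefix):]


PREFIX = "registry.suse.de/devel/caasp/4.5/containers/containers"
MIRROR = "registry.suse.de/devel/caasp/4.5/branches/etcd_3.4.10/containers"
IMAGE = PREFIX + "/caasp/v4.5/etcd:3.4.10"

print(mirror_reference(IMAGE, PREFIX, MIRROR))
```

Under this model the rewritten reference is the same branch URL that had to be passed to podman by hand.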

@jordimassaguerpla
Member Author

I was able to pass the tests with master, which is why I think the etcd update is the cause of the "broken deployment".

@dannysauer
Contributor

Well, if changing etcd is what breaks it, then it's reasonable to guess that etcd is the problem. 😂

Thanks for sharing the key. I'll get a cluster up this afternoon and see if I can track down where things are going wrong.

@dannysauer
Contributor

Ah. On the other master node, which is .139, there's this:

sles@0100164095139:~> sudo crictl logs 05b5d90a0028a
2020-08-18 16:44:44.111591 W | pkg/flags: unrecognized environment variable ETCD_UNSUPPORTED_ARCH=arm64
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2020-08-18 16:44:44.111672 I | etcdmain: etcd Version: 3.4.10
2020-08-18 16:44:44.111675 I | etcdmain: Git SHA: Not provided (use ./build instead of go build)
2020-08-18 16:44:44.111678 I | etcdmain: Go Version: go1.14.2
2020-08-18 16:44:44.111681 I | etcdmain: Go OS/Arch: linux/amd64
2020-08-18 16:44:44.111683 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2020-08-18 16:44:44.111736 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file =
2020-08-18 16:44:44.112171 I | embed: name = caasp-master-105-caasp-jobs-dev-e2e-test-1
2020-08-18 16:44:44.112180 I | embed: data dir = /var/lib/etcd
2020-08-18 16:44:44.112183 I | embed: member dir = /var/lib/etcd/member
2020-08-18 16:44:44.112186 I | embed: heartbeat = 100ms
2020-08-18 16:44:44.112188 I | embed: election = 1000ms
2020-08-18 16:44:44.112190 I | embed: snapshot count = 10000
2020-08-18 16:44:44.112200 I | embed: advertise client URLs = https://10.164.95.139:2379
2020-08-18 16:44:44.112285 C | etcdmain: cannot access data directory: directory "/var/lib/etcd","drwxr-xr-x" exist without desired file permission "-rwx------".

So node .144 has an etcd which (perhaps accurately) thinks it's in an existing cluster with .139, but .139 isn't starting because the permissions on the datadir are wrong. Sure enough:

sles@0100164095139:~> ls -ld /var/lib/etcd
drwxr-xr-x 1 root root 0 Aug 18 12:12 /var/lib/etcd
sles@0100164095139:~> sudo chmod 0700 /var/lib/etcd

After fixing the data directory permissions to what they're supposed to be, the second etcd came up just fine and then the first one also came up. Provisioning failed because the second etcd was added but didn't run, which makes a 2-node etcd cluster with only one live member. That member can't reach quorum on its own, so it behaves as if it's in split-brain. In that case, etcd doesn't handle reads or writes, and then the API server can't API so everything fails.
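The arithmetic behind that failure is the usual Raft majority rule; a minimal sketch (function names are illustrative, not etcd APIs):

```python
def quorum(members: int) -> int:
    """Number of members that must agree: a strict majority."""
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True if enough members are up to elect a leader and commit writes."""
    return healthy >= quorum(members)


# A 2-member cluster needs both members, so one down member stalls it.
print(quorum(2), has_quorum(2, 1))
# A 3-member cluster tolerates one failure.
print(quorum(3), has_quorum(3, 2))
```

This is why the log on .144 shows an endless series of elections: it only ever collects its own vote, one short of the two it needs.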

@jordimassaguerpla
Member Author

How do we fix this? And why is it happening with the new etcd and not with the old one?

@dannysauer
Contributor

I wonder if it was a sporadic failure in the CI? I'm digging to see what makes that directory to begin with. I think the container expects that to exist before it starts, so I'm assuming either kubernetes or skuba? Does cri-o create the directory if a non-existing directory is specified as a volume? That would make sense for why it was created with the default 022 umask.
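To show why a 022 umask would explain the drwxr-xr-x mode, here is a small sketch that creates a directory under a given umask and reports the resulting permission bits. It says nothing about which component actually creates /var/lib/etcd; the helper name is made up, and it assumes a POSIX system.

```python
import os
import stat
import tempfile


def mkdir_mode_under_umask(umask: int, requested: int = 0o777) -> int:
    """Create a scratch directory under `umask` and return its mode bits.

    Hypothetical helper: temporarily changes the process umask and
    restores it afterwards.
    """
    old = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as parent:
            path = os.path.join(parent, "etcd")
            os.mkdir(path, requested)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)


print(oct(mkdir_mode_under_umask(0o022)))  # mode a default-umask mkdir yields
print(oct(mkdir_mode_under_umask(0o077)))  # mode under a restrictive umask
```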

@dannysauer
Contributor

Adding the second etcd node failed the same way on reexecution. So this seems likely to be related to the container somehow. I'll look into what is different here this afternoon.

If we can't find the issue, I don't think this vulnerability is significant enough to delay the initial release. The etcd endpoints should only be accessible inside the cluster if a customer has set up their firewall rules / network segmentation following our suggestions in the admin guide; etcd should only be accessible by k8s nodes (or by trusted nodes). Exploiting this vulnerability requires an attacker to take control of the etcd leader in order to send crafted WAL entries, which means access to the SSL certs or local machine access. Those are generally pretty high bars. So, delaying this to a maintenance release rather than calling it a release blocker should be ok.

@dannysauer
Contributor

Looks like this is actually a feature in newer etcd. I looked at an old 4.2.0 cluster:

sauer@macdanny:~> ssh 192.168.0.244 ls -ld /var/lib/etcd/
drwx------ 3 root root 20 Jul  9 13:48 /var/lib/etcd/
sauer@macdanny:~> ssh 192.168.0.233 ls -ld /var/lib/etcd/
drwxr-xr-x 3 root root 20 Jul  9 14:48 /var/lib/etcd/
sauer@macdanny:~> ssh 192.168.0.227 ls -ld /var/lib/etcd/
drwxr-xr-x 3 root root 20 Jun 26 01:26 /var/lib/etcd/

Note that the first master node created has the right permissions on /var/lib/etcd, but the second and third nodes do not. What's new is that etcd requires the tighter perms. There was an issue created in kubeadm which addressed this behavior a while ago - kubernetes/kubeadm#1308. So, it could be that there's actually a defect (either on our end or in kubeadm) when growing the etcd cluster.

This directory permission checking behavior is newly introduced with etcd 3.4.10 in etcd-io/etcd#11798, and is mentioned as a breaking change in the changelog (whoops) https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md
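For reference, the kind of check involved can be sketched as follows. This is not etcd's code (etcd is written in Go); the function name and the `strict` flag are assumptions for illustration, with `strict=True` modelling a hard failure like the one logged on .139 and `strict=False` modelling a warning-only variant.

```python
import os
import stat
import warnings


def check_data_dir(path: str, want: int = 0o700, strict: bool = True) -> bool:
    """Return True if `path` has exactly the `want` permission bits.

    Hypothetical sketch: raises PermissionError when strict, otherwise
    emits a warning and returns False.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode == want:
        return True
    msg = f"directory {path!r} has mode {mode:04o}, want {want:04o}"
    if strict:
        raise PermissionError(msg)
    warnings.warn(msg)
    return False
```

Running `check_data_dir("/var/lib/etcd")` on a node where the directory is drwxr-xr-x would raise, which mirrors the "exist without desired file permission" failure; after `chmod 0700` it returns True.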

@dannysauer
Contributor

Looks like the mkdir for additional etcd nodes was removed in kubernetes/kubernetes@6bbed9f#diff-0960dc0bb7c3e7113a0daa027856b8f9. Opening upstream issue.

@dannysauer
Contributor

dannysauer commented Aug 18, 2020

I guess this is blocked until kubernetes/kubeadm#2256 is resolved or one of us fixes it locally.

@jordimassaguerpla
Member Author

I am happy we could test this before accepting the packages into the build service repo. @pablochacin @davidcassany well done with the "github label to ibs project" link CI feature 👍

@dannysauer
Contributor

The failure was reduced to a warning in 3.4.13 at the end of last week, so we can deploy that without waiting for the kubeadm fix/backport or breaking customers. I'll get the IBS request updated.

@dannysauer dannysauer changed the title Update etcd to 3.4.10 Update etcd to 3.4.10+ (3.4.13) Sep 3, 2020
@dannysauer
Contributor

I'm still having a problem with it not building cleanly after the OBS go module service updates the vendor tarball. I might need some help from someone who knows how go module vendoring works better than I do (which is setting the bar pretty low :D).

@Itxaka
Contributor

Itxaka commented Nov 10, 2020

If I'm not mistaken, this is superseded by #1402, so closing this.

@Itxaka Itxaka closed this Nov 10, 2020