This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

Cannot start etcd 3.2 cluster with etcd-member.service #2095

Closed
jsoriano opened this issue Aug 9, 2017 · 2 comments

jsoriano commented Aug 9, 2017

Issue Report

This is an issue I'm having while trying to start an etcd cluster with version 3.2 using CoreOS Container Linux and the provided etcd-member.service unit file. The problem appears both when upgrading from 3.1 and when starting a new cluster. It does not happen with any earlier version (tried 2.3, 3.0, and 3.1) using the same configuration.

Maybe I'm misconfiguring something, but I cannot find what.

Bug

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1409.7.0
VERSION_ID=1409.7.0
BUILD_ID=2017-07-19-0005
PRETTY_NAME="Container Linux by CoreOS 1409.7.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

CoreOS Container Linux on virtualbox using vagrant scenario.

Expected Behavior

The etcd cluster can be started correctly using the etcd-member.service unit file.

Actual Behavior

The etcd cluster doesn't reach a healthy state.

Reproduction Steps

Two ways to reproduce this:

With a new cluster:

  1. Configure etcd-member.service with a drop-in like this one:
[Service]
Environment=ETCD_IMAGE_TAG=v3.2
Environment=ETCD_NAME=%H
Environment=ETCD_INITIAL_CLUSTER="core-01=http://172.17.8.101:2380,core-02=http://172.17.8.102:2380,core-03=http://172.17.8.103:2380"
Environment=ETCD_INITIAL_ADVERTISE_PEER_URLS=http://$public_ipv4:2380
Environment=ETCD_INITIAL_CLUSTER_STATE=new
Environment=ETCD_INITIAL_CLUSTER_TOKEN=token-1
Environment=ETCD_ADVERTISE_CLIENT_URLS=http://$public_ipv4:2379
Environment=ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001
Environment=ETCD_LISTEN_PEER_URLS=http://$private_ipv4:2380,http://$private_ipv4:7001
Environment=ETCD_DATA_DIR=/var/lib/etcd2
  2. Start the members.
  3. The cluster doesn't reach a healthy state.
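For reference, installing the drop-in on a host can be sketched as below. The IP, file name, and local directory are placeholders for this example; on Container Linux the drop-in would live in /etc/systemd/system/etcd-member.service.d/, and the $public_ipv4/$private_ipv4 variables are normally substituted by the provisioning tooling:

```shell
# Render a per-host drop-in (IP and paths are placeholders for illustration).
IP=172.17.8.101
DROPIN_DIR=./etcd-member.service.d   # normally /etc/systemd/system/etcd-member.service.d
mkdir -p "$DROPIN_DIR"
cat > "$DROPIN_DIR/20-cluster.conf" <<EOF
[Service]
Environment=ETCD_IMAGE_TAG=v3.2
Environment=ETCD_NAME=core-01
Environment=ETCD_INITIAL_ADVERTISE_PEER_URLS=http://$IP:2380
Environment=ETCD_ADVERTISE_CLIENT_URLS=http://$IP:2379
EOF
cat "$DROPIN_DIR/20-cluster.conf"
```

After writing the real drop-in, `systemctl daemon-reload` and `systemctl start etcd-member` would pick it up.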

Upgrading an existing 3.1 cluster:

  1. Start the members with the same configuration as in the previous scenario, but with ETCD_IMAGE_TAG=v3.1.
  2. The cluster is healthy.
  3. Change the configuration so ETCD_IMAGE_TAG=v3.2 and restart the members one by one.
  4. When the last member is restarted, the cluster goes unhealthy and doesn't recover.
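The rolling upgrade in steps 3–4 amounts to flipping the image tag in the drop-in and restarting each member in turn. A minimal sketch; the drop-in path and file name are assumptions, a local copy is used here for illustration, and the restart/health-check commands are shown as comments because they need a live member:

```shell
# Flip the image tag in the drop-in (the real file would be under
# /etc/systemd/system/etcd-member.service.d/).
CONF=./20-cluster.conf
printf 'Environment=ETCD_IMAGE_TAG=v3.1\n' > "$CONF"
sed -i 's/ETCD_IMAGE_TAG=v3.1/ETCD_IMAGE_TAG=v3.2/' "$CONF"
cat "$CONF"
# Then, on each member, one at a time:
#   sudo systemctl daemon-reload
#   sudo systemctl restart etcd-member
#   etcdctl cluster-health   # wait until healthy before moving to the next member
```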

Other Information

Members are continuously logging messages like these (a leader-election loop: the term keeps increasing, but vote responses from the other two members never arrive):

3:44.297060 I | raft: b84cadd6519c819f became candidate at term 20
3:44.298000 I | raft: b84cadd6519c819f received MsgVoteResp from b84cadd6519c819f at term 20
3:44.298733 I | raft: b84cadd6519c819f [logterm: 18, index: 6752] sent MsgVote request to c7e5681337a477ff at term 20
3:44.298788 I | raft: b84cadd6519c819f [logterm: 18, index: 6752] sent MsgVote request to e057fbb8dd7aa671 at term 20
3:45.394940 I | raft: b84cadd6519c819f is starting a new election at term 20
3:45.395373 I | raft: b84cadd6519c819f became candidate at term 21
...

jsoriano commented Aug 9, 2017

It seems to be related to having multiple peer URLs.
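If multiple listen peer URLs are indeed the trigger, a possible workaround (an assumption, not verified here) would be to drop the legacy 7001 port and listen on a single peer URL in the drop-in:

```
[Service]
Environment=ETCD_LISTEN_PEER_URLS=http://$private_ipv4:2380
```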

jsoriano commented

Fixed by etcd-io/etcd#8414
