etcd2 cluster ID mismatch #3710

Closed
haozhenxiao opened this issue Oct 19, 2015 · 21 comments

@haozhenxiao

I have two CoreOS machines with IPs 10.10.26.160 and 10.10.24.156, and I'm bootstrapping them statically. The bootstrap script on 10.10.26.160 is:

etcd2 -name etcd1 -data-dir data \
-advertise-client-urls http://10.10.26.160:2379 \
-listen-client-urls http://10.10.26.160:2379,http://127.0.0.1:2379 \
-initial-advertise-peer-urls http://10.10.26.160:2380 \
-listen-peer-urls http://10.10.26.160:2380 \
-initial-cluster-token etcd-cluster-2 \
-initial-cluster etcd0=http://10.10.24.156:2380,etcd1=http://10.10.26.160:2380 \
-initial-cluster-state new

The bootstrap script on 10.10.24.156 is:

etcd2 -name etcd0 -data-dir data \
-advertise-client-urls http://10.10.24.156:2379 \
-listen-client-urls http://10.10.24.156:2379,http://127.0.0.1:2379 \
-initial-advertise-peer-urls http://10.10.24.156:2380 \
-listen-peer-urls http://10.10.24.156:2380 \
-initial-cluster-token etcd-cluster-2 \
-initial-cluster etcd0=http://10.10.24.156:2380,etcd1=http://10.10.26.160:2380 \
-initial-cluster-state new

When I run the two scripts on the two CoreOS machines, I get errors. The log on 10.10.26.160 is:

2015/10/16 07:51:46 raft: 8f87889e2f3130e3 is starting a new election at term 88
2015/10/16 07:51:46 raft: 8f87889e2f3130e3 became candidate at term 89
2015/10/16 07:51:46 raft: 8f87889e2f3130e3 received vote from 8f87889e2f3130e3 at term 89
2015/10/16 07:51:46 raft: 8f87889e2f3130e3 [logterm: 1, index: 3] sent vote request to db30be88917b6839 at term 89
2015/10/16 07:51:46 raft: 8f87889e2f3130e3 [logterm: 1, index: 3] sent vote request to f2d68f8a4e38f628 at term 89
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:46 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:47 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:47 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:47 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:47 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:47 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:47 rafthttp: request sent was ignored (cluster ID mismatch: remote[db30be88917b6839]=ac5f3aa02066b598, local=9b09b40f488fe304)
2015/10/16 07:51:47 rafthttp: failed to dial db30be88917b6839 on stream Message (dial tcp 10.10.24.156:2380: connection refused)
2015/10/16 07:51:47 rafthttp: failed to dial db30be88917b6839 on stream MsgApp v2 (dial tcp 10.10.24.156:2380: connection refused)
2015/10/16 07:51:47 rafthttp: failed to dial f2d68f8a4e38f628 on stream Message (dial tcp 10.10.24.161:2380: no route to host)

The log on 10.10.24.156 is:

2015/10/16 07:51:44 raft: 6ada9347d44a3950 is starting a new election at term 355
2015/10/16 07:51:44 raft: 6ada9347d44a3950 became candidate at term 356
2015/10/16 07:51:44 raft: 6ada9347d44a3950 received vote from 6ada9347d44a3950 at term 356
2015/10/16 07:51:44 raft: 6ada9347d44a3950 [logterm: 1, index: 2] sent vote request to 17c82b75bae3cfdf at term 356
2015/10/16 07:51:44 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:44 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: request received was ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: failed to write 17c82b75bae3cfdf on pipeline (dial tcp 10.10.48.217:2390: i/o timeout)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:45 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:46 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:46 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:46 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)
2015/10/16 07:51:46 rafthttp: streaming request ignored (cluster ID mismatch got 9b09b40f488fe304 want ac5f3aa02066b598)

I also checked the status of etcd2 with systemctl status -l etcd2. On the 10.10.24.156 machine the service sometimes stops and sometimes recovers by itself; its output varies between:

etcd2.service - etcd2
Loaded: loaded (/usr/lib64/systemd/system/etcd2.service; disabled; vendor preset: disabled)
Drop-In: /run/systemd/system/etcd2.service.d
└─20-cloudinit.conf
Active: activating (auto-restart) since Mon 2015-10-19 06:58:15 UTC; 9s ago
Process: 1109 ExecStart=/usr/bin/etcd2 (code=exited, status=0/SUCCESS)
Main PID: 1109 (code=exited, status=0/SUCCESS)

and:

etcd2.service - etcd2
Loaded: loaded (/usr/lib64/systemd/system/etcd2.service; disabled; vendor preset: disabled)
Drop-In: /run/systemd/system/etcd2.service.d
└─20-cloudinit.conf
Active: active (running) since Mon 2015-10-19 06:58:25 UTC; 784ms ago
Main PID: 1118 (etcd2)
Memory: 4.1M
CPU: 781ms
CGroup: /system.slice/etcd2.service
└─1118 /usr/bin/etcd2

Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: recovered store from snapshot at index 160016
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: name = fab7de8892ac4659aa45ab6c640bb05d
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: data dir = /var/lib/etcd2
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: member dir = /var/lib/etcd2/member
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: heartbeat = 100ms
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: election = 1000ms
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: snapshot count = 10000
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: discovery URL= https://discovery.etcd.io/c0eb5523934c5a502f2e314f9326781f
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: advertise client URLs = http://:2379,http://:4001
Oct 19 06:58:25 localhost etcd2[1118]: 2015/10/19 06:58:25 etcdserver: loaded cluster information from store:

On 10.10.26.160, etcdctl member list prints "client: no endpoints available"; on 10.10.24.156 it prints "client: etcd cluster is unavailable or misconfigured".

What am I missing while bootstrapping etcd2 here? Thank you in advance.

@xiang90
Contributor

xiang90 commented Oct 19, 2015

@haozhenxiao One of the members was bootstrapped via the discovery service. You must remove the previous data-dir to clean up the member information; otherwise the member will ignore the new configuration and start with the old one. That is why you see the mismatch.

See https://github.com/coreos/etcd/blob/master/Documentation/admin_guide.md#lifecycle for more details.

Thanks.
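
A minimal sketch of that cleanup, assuming the member's data lives in /var/lib/etcd2 (the data dir shown in the systemctl output above) and that etcd2 is stopped first. The function name archive_data_dir is made up for illustration; it moves the directory aside rather than deleting it, so the old member state can still be inspected afterwards.

```shell
# Hypothetical helper: move an etcd data directory out of the way so the next
# start bootstraps from the command-line flags instead of the stored state.
archive_data_dir() {
  data_dir=$1
  if [ ! -d "$data_dir" ]; then
    echo "nothing to archive at $data_dir" >&2
    return 1
  fi
  # Timestamped backup name, so the old state is kept under a distinct path.
  backup="${data_dir}.bak.$(date +%Y%m%d%H%M%S)"
  mv "$data_dir" "$backup" && echo "$backup"
}
```

Run as root between systemctl stop etcd2 and systemctl start etcd2, for example archive_data_dir /var/lib/etcd2; the function prints the backup path on success.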

@xiang90 xiang90 closed this as completed Oct 19, 2015
@haozhenxiao
Author

@xiang90 If -data-dir is not specified, where is the data directory by default? I have now deleted all the data directories, but I still get the cluster ID mismatch complaint.

@yichengq
Contributor

@xiang90
Contributor

xiang90 commented Oct 23, 2015

@haozhenxiao When etcd starts, it will print out the data dir it uses and how it starts (using old configuration or bootstrap). You can check it yourself.
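
One way to pull those lines out of a long log, as a rough sketch. The function name etcd_start_summary is hypothetical, and the patterns are taken from the log excerpts quoted in this thread, so other etcd versions may word them differently. In those excerpts, "starting member" appears on a fresh bootstrap, while "restarting member" and "already initialized as member" appear when an existing data directory is reused.

```shell
# Hypothetical helper: keep only the startup lines that show which data
# directory etcd uses and whether it bootstrapped or reused stored state.
# Reads log text on stdin.
etcd_start_summary() {
  # "starting member " also matches "restarting member ".
  grep -E 'data dir = |member dir = |starting member |already initialized as member'
}
```

On a systemd host this could be fed from the journal, for example journalctl -u etcd2 --no-pager | etcd_start_summary.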

@dannysauer
Contributor

On a related note for someone's future use, I also had this happen when I added the $WAL_DIR parameter to an existing cluster's startup config, but forgot to move the existing wal directory contents to the new location on a couple of nodes. Which is pretty much the same problem: a partially missing data directory. :)
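
For that case, a hedged sketch of the missing step, assuming the WAL previously sat in the default location under the data directory (member/wal) and that etcd is stopped while the files move. move_wal is an illustrative name, not an etcd tool.

```shell
# Hypothetical helper: move existing WAL files to a new WAL directory before
# restarting etcd with the new location configured.
move_wal() {
  old_wal=$1
  new_wal=$2
  if [ ! -d "$old_wal" ]; then
    echo "no WAL directory at $old_wal" >&2
    return 1
  fi
  mkdir -p "$new_wal" || return 1
  # Refuse to mix logs: a non-empty target may hold another member's WAL.
  if [ -n "$(ls -A "$new_wal")" ]; then
    echo "$new_wal is not empty" >&2
    return 1
  fi
  mv "$old_wal"/* "$new_wal"/
}
```

For example, move_wal /var/lib/etcd2/member/wal /mnt/wal, where both paths are placeholders for the node's real directories.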

@Bregor

Bregor commented Nov 4, 2016

Looks like the same problem here:

# ./etcd --version
etcd Version: 2.3.7
Git SHA: fd17c91
Go Version: go1.6.2
Go OS/Arch: linux/amd64

Data dir is empty on start:

# ls -al /var/lib/etcd
ls: cannot access '/var/lib/etcd': No such file or directory

Starting the server:

./etcd --name=192.168.222.1 \
--initial-advertise-peer-urls=http://192.168.222.1:2380 \
--listen-peer-urls=http://192.168.222.1:2380,http://127.0.0.1:2380 \
--listen-client-urls=http://192.168.222.1:2379,http://127.0.0.1:2379 \
--advertise-client-urls=http://192.168.222.1:2379 \
--initial-cluster-token=yupNocFiftOweulNubJifsIthIbMeewayChodevasJilpalImNoydJalUmjajkoj \
--initial-cluster=192.168.222.1=http://192.168.222.1:2380 \
--initial-cluster-state=new \
--force-new-cluster=true \
--data-dir=/var/lib/etcd

2016-11-04 22:56:06.497385 I | etcdmain: etcd Version: 2.3.7
2016-11-04 22:56:06.497459 I | etcdmain: Git SHA: fd17c91
2016-11-04 22:56:06.497468 I | etcdmain: Go Version: go1.6.2
2016-11-04 22:56:06.497477 I | etcdmain: Go OS/Arch: linux/amd64
2016-11-04 22:56:06.497485 I | etcdmain: setting maximum number of CPUs to 24, total number of available CPUs is 24
2016-11-04 22:56:06.497558 I | etcdmain: listening for peers on http://127.0.0.1:2380
2016-11-04 22:56:06.497594 I | etcdmain: listening for peers on http://192.168.222.1:2380
2016-11-04 22:56:06.497628 I | etcdmain: listening for client requests on http://127.0.0.1:2379
2016-11-04 22:56:06.497661 I | etcdmain: listening for client requests on http://192.168.222.1:2379
2016-11-04 22:56:06.497915 I | etcdserver: name = 192.168.222.1
2016-11-04 22:56:06.497939 I | etcdserver: force new cluster
2016-11-04 22:56:06.497948 I | etcdserver: data dir = /var/lib/etcd
2016-11-04 22:56:06.497957 I | etcdserver: member dir = /var/lib/etcd/member
2016-11-04 22:56:06.497968 I | etcdserver: heartbeat = 100ms
2016-11-04 22:56:06.497978 I | etcdserver: election = 1000ms
2016-11-04 22:56:06.497988 I | etcdserver: snapshot count = 10000
2016-11-04 22:56:06.498002 I | etcdserver: advertise client URLs = http://192.168.222.1:2379
2016-11-04 22:56:06.498015 I | etcdserver: initial advertise peer URLs = http://192.168.222.1:2380
2016-11-04 22:56:06.498034 I | etcdserver: initial cluster = 192.168.222.1=http://192.168.222.1:2380
2016-11-04 22:56:06.530276 I | etcdserver: starting member 3cab03d6d1b6adf7 in cluster 60faf2bb35e8d189
2016-11-04 22:56:06.530360 I | raft: 3cab03d6d1b6adf7 became follower at term 0
2016-11-04 22:56:06.530379 I | raft: newRaft 3cab03d6d1b6adf7 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2016-11-04 22:56:06.530386 I | raft: 3cab03d6d1b6adf7 became follower at term 1
2016-11-04 22:56:06.530548 I | etcdserver: starting server... [version: 2.3.7, cluster version: to_be_decided]
2016-11-04 22:56:06.531070 E | etcdmain: failed to notify systemd for readiness: No socket
2016-11-04 22:56:06.531099 E | etcdmain: forgot to set Type=notify in systemd service file?
2016-11-04 22:56:06.531183 N | etcdserver: added local member 3cab03d6d1b6adf7 [http://192.168.222.1:2380] to cluster 60faf2bb35e8d189
2016-11-04 22:56:06.593088 E | rafthttp: request cluster ID mismatch (got 79ff626ee1032c13 want 60faf2bb35e8d189)
2016-11-04 22:56:06.593238 E | rafthttp: request cluster ID mismatch (got 79ff626ee1032c13 want 60faf2bb35e8d189)
2016-11-04 22:56:06.694630 E | rafthttp: request cluster ID mismatch (got 79ff626ee1032c13 want 60faf2bb35e8d189)
2016-11-04 22:56:06.694735 E | rafthttp: request cluster ID mismatch (got 79ff626ee1032c13 want 60faf2bb35e8d189)
...

What am I doing wrong?

@alogoc

alogoc commented Nov 16, 2016

@Bregor Did you find any solution? I am having the exact same issue with etcd2

@Bregor

Bregor commented Nov 16, 2016

@alogoc nope, sorry :(

@prakashsingh08

I am facing the same issue :(

@m4r10k

m4r10k commented Nov 29, 2016

I had a similar problem. I stopped one node because I wanted to move its data directory to another location. After removing the etcd member, I ran the etcdctl command to add it again and then tried to start the new member with the new data dir, with the same result you have: cluster ID mismatch.

After some time I realized that I have to use -initial-cluster-state existing, as printed by the etcdctl member add command:

ETCD_INITIAL_CLUSTER_STATE="existing"

After changing my static docker setup to:

-initial-cluster-state existing

all logs are clean and the cluster is healthy.

Maybe that will help you.
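
A small sketch of how those printed settings might be captured, assuming etcdctl member add prints ETCD_...= lines like the one quoted above. member_add_env is a hypothetical helper name; it only filters text and does not talk to etcd itself.

```shell
# Hypothetical helper: keep only the ETCD_* assignments from the text that
# `etcdctl member add` prints, dropping the human-readable lines around them.
# Reads the command output on stdin.
member_add_env() {
  grep -E '^ETCD_[A-Z_]+='
}
```

The filtered lines can be saved to a file, for example etcdctl member add etcd1 http://10.10.26.160:2380 | member_add_env > etcd1.env, and exported before starting the new member, since etcd also reads its flags from ETCD_* environment variables.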

@jonathan-kosgei

I'm clearing all the data in /var/lib/etcd and using --initial-cluster-state existing for all nodes other than the first one, but I still get output like the following, which makes me think not all the data was cleared:

2017-03-23 21:53:03.092325 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2017-03-23 21:53:05.383384 I | etcdserver: restarting member fecf4b594409e830 in cluster eea12049faf24c47 at commit index 14
2017-03-23 21:53:15.780930 E | rafthttp: request cluster ID mismatch (got 50ff3c9548b23302 want eea12049faf24c47)
2017-03-23 21:53:15.781099 E | rafthttp: request cluster ID mismatch (got 50ff3c9548b23302 want eea12049faf24c47)
2017-03-23 21:53:15.881969 E | rafthttp: request cluster ID mismatch (got 50ff3c9548b23302 want eea12049faf24c47)
2017-03-23 21:53:15.892945 E | rafthttp: request cluster ID mismatch (got 50ff3c9548b23302 want eea12049faf24c47)
2017-03-23 21:53:15.893209 E | rafthttp: request cluster ID mismatch (got 50ff3c9548b23302 want eea12049faf24c47)
2017-03-23 21:53:16.004918 E | rafthttp: request cluster ID mismatch (got 50ff3c9548b23302 want eea12049faf24c47)

How can I fix this?

@jonathan-kosgei

Actually, it was stale data. I was backing etcd with Gluster, and it seems it was working too well.

@jonathan-kosgei

Or not. I cleared the data on all the nodes, but when I run etcd I still get:

2017-03-23 22:09:53.087431 N | etcdmain: the server is already initialized as member before, starting as etcd member...

Where is this coming from?

@heyitsanthony
Contributor

@jonathan-kosgei did you remove/add the node with etcdctl before restarting it with a wiped data directory? Since raft expects members to acknowledge writes, it also expects the member to keep any writes acknowledged; wiping the data directory drops that data, so the member is effectively lost. To get the node running again, the member has to be removed/added through etcdctl member to start fresh.

@jonathan-kosgei

I'm running etcd on Kubernetes. Once I delete the dirs, I simply delete and recreate the pods.

@heyitsanthony
Contributor

@jonathan-kosgei if it's deleting/recreating pods for a single member instead of the entire cluster, the member for that pod needs to be removed with etcdctl member remove, then added again with etcdctl member add, so that the cluster knows it's a fresh member. Deleting/recreating a pod won't communicate to etcd that it's a new member.

There's also the etcd-operator project which can help simplify this process for managing etcd members under kubernetes.
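
The remove/add sequence could be scripted roughly as follows. This is a sketch that assumes the v2-style syntax etcdctl member add <name> <peerURL>; with the v3 API the peer URL is passed via --peer-urls instead. replace_member and its argument names are illustrative, and the old member ID is whatever etcdctl member list reports for the wiped node.

```shell
# Hypothetical helper: tell the cluster that a wiped member is gone, then
# register a fresh member under the same name and peer URL.
replace_member() {
  if [ "$#" -ne 3 ]; then
    echo "usage: replace_member <old-member-id> <name> <peer-url>" >&2
    return 2
  fi
  old_id=$1
  name=$2
  peer_url=$3
  # Stop if the removal fails, so a member is never added on top of a stale one.
  etcdctl member remove "$old_id" || return 1
  etcdctl member add "$name" "$peer_url"
}
```

After this, the recreated pod would start with an empty data directory and --initial-cluster-state existing, as described in the earlier comments.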

@jonathan-kosgei

jonathan-kosgei commented Mar 24, 2017 via email

@erikbgithub

erikbgithub commented Jun 13, 2017

I ran into the same error with 3.1.8.

# one log message from etcd03:
request cluster ID mismatch (got 2ce64b9964276fd5 want 62da52d8fd9cccca)

The funny thing is that none of my three pods actually has either of these two IDs.

Here's the list of IDs as mentioned by the log message 2017-06-11 05:10:32.033556 I | rafthttp: starting peer 265604773c187a60...:

etcd01: 800f7631bb1b92c
etcd02: 53c932a78cd5e776
etcd03: 265604773c187a60

edit 1:
I found the ID via grep, it's in these lines:

etcd01: starting member c711e5ecfbe3e2c4 in cluster 2ce64b9964276fd5
etcd02: starting member 18b94c8050f6a27e in cluster 48b9bd5e2db29015
etcd03: starting member 86e317627da6e2e4 in cluster 62da52d8fd9cccca

Why is the member ID here different from the one in the other log message? And why are they starting different clusters instead of the same one?
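
To compare what each node bootstrapped, the cluster IDs can be collected from the startup lines. A small sketch, with cluster_ids as a made-up helper name and the line format taken from the excerpts above; more than one ID in the output means the nodes did not start in the same cluster.

```shell
# Hypothetical helper: print the distinct cluster IDs found in etcd startup
# lines ("starting member X in cluster Y" or "restarting member X in cluster Y").
# Reads log text on stdin.
cluster_ids() {
  sed -n 's/.*starting member [0-9a-f]* in cluster \([0-9a-f]*\).*/\1/p' |
    LC_ALL=C sort -u
}
```

For example, cat etcd01.log etcd02.log etcd03.log | cluster_ids, where the file names stand in for wherever each pod's log was saved.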

For the setup see #8079

@ccctask

ccctask commented Aug 8, 2018

I got it.

If you create an etcd member with the command

etcdctl member add ... http...

it will create a new member ID,

but the target host still has the old ID, which etcd saved under --data-dir {pathfile} on an earlier run.

You need to delete that data directory and start the member with --initial-cluster-state existing.

@RameshRM

RameshRM commented Oct 1, 2018

+1

+1, starting with initial-cluster-state existing fixed the issue. Another naive question: which node should start with the existing state?

@mgibson323

mgibson323 commented Apr 22, 2021

The links in replies on this issue are all broken. Here are working ones:

https://etcd.io/docs/v2.3/admin_guide/#lifecycle

https://etcd.io/docs/v3.4/op-guide/configuration/#-data-dir
