This repository has been archived by the owner on Mar 28, 2020. It is now read-only.

randomize etcd member name #1872

Closed
hongchaodeng opened this issue Jan 20, 2018 · 7 comments

@hongchaodeng (Member) commented Jan 20, 2018

Currently we have a counter in the member name that is incremented when adding new members. This can lose track, though: for example, the operator restarts while the highest-numbered member crashes, or all pods are deleted and the restored operator knows nothing about the previous count.

When two etcd pods have the same name, it can lead to bad results: the same DNS record returns two different IPs. In Kubernetes, pod deletion is asynchronous, so when we recreate a new pod with the same name, we cannot guarantee there will be exactly one pod with that name at any moment. Given these facts, the problem is clear. In fact, we have seen real issues like #1825.

We should randomize the etcd member name, just as a ReplicaSet does for each replica pod. This way we can prevent two etcd members from having the same name.
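
A minimal sketch of the idea, assuming a random-suffix helper like `rand.String` from `k8s.io/apimachinery/pkg/util/rand` (the actual helper and suffix length are not specified in this issue):

```go
package main

import (
	"fmt"

	utilrand "k8s.io/apimachinery/pkg/util/rand"
)

// uniqueMemberName builds a member name from the cluster name plus a random
// suffix, similar to how a ReplicaSet names its pods (e.g. "example-b2k7f"),
// instead of an incrementing counter that can be lost across operator restarts.
func uniqueMemberName(clusterName string) string {
	return fmt.Sprintf("%s-%s", clusterName, utilrand.String(5))
}

func main() {
	// A recreated member gets a fresh name, so it never shares a DNS record
	// with an old pod that is still terminating asynchronously.
	fmt.Println(uniqueMemberName("example"))
	fmt.Println(uniqueMemberName("example"))
}
```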

@alexandrem commented:

How will we build the list of peers (no discovery) and have TLS work with ALT names using random names?

@hongchaodeng (Member Author) commented:

@alexandrem
Member DNS names have the same subdomain:

"*.example.default.svc",

@alexandrem commented Jan 21, 2018

Right, so it's good with wildcards.

But what about the initial peer list?

@hongchaodeng (Member Author) commented:

> But what about the initial peer list?

Sorry, I don't understand what this concern is about. Mind giving an example?

@alexandrem commented Jan 22, 2018

Sorry, I had in mind the case where we change the restartPolicy of pod members, so this possibly doesn't apply at the moment.

Currently, I assume that if a member is unhealthy, the operator will replace it with a fresh pod and can therefore pass the currently active pod members as the peers list.

On the other hand, if we have restartPolicy: Always, then I assume a pod member that gets rescheduled or restarted for some reason will still have its old peers list configuration and won't be able to rejoin the cluster.

That is, if the membership has changed since that pod was created.

Is this correct?

@hongchaodeng (Member Author) commented:

etcd-operator will have the global membership knowledge and configure the peer list for each etcd pod.

But from my understanding, even if the membership changes during pod replacement, the etcd member will still have its logs (data) and will sync with the leader to catch up on what it missed.
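
As a rough sketch of that flow (the Member type and peer URL format here are assumptions for illustration, not the operator's real data structures): the operator renders the membership it knows about into the --initial-cluster value for each new pod.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Member is an illustrative stand-in for the operator's view of one etcd member.
type Member struct {
	Name    string
	PeerURL string
}

// initialCluster renders the --initial-cluster value for a new etcd pod from
// the operator's current picture of the membership.
func initialCluster(members []Member) string {
	entries := make([]string, 0, len(members))
	for _, m := range members {
		entries = append(entries, fmt.Sprintf("%s=%s", m.Name, m.PeerURL))
	}
	sort.Strings(entries) // deterministic output, easier to compare in logs
	return strings.Join(entries, ",")
}

func main() {
	members := []Member{
		{Name: "example-b2k7f", PeerURL: "https://example-b2k7f.example.default.svc:2380"},
		{Name: "example-x9q3z", PeerURL: "https://example-x9q3z.example.default.svc:2380"},
	}
	// A replacement pod joining an existing cluster would also be started
	// with --initial-cluster-state=existing alongside this list.
	fmt.Println("--initial-cluster=" + initialCluster(members))
}
```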

@alexandrem commented:

Ok, I don't have that much operational knowledge of etcd.

I was under the impression that, when using static configuration, a member's --initial-cluster parameter had to match the active members of the cluster exactly, including the new peer's name, otherwise it wouldn't sync.
