Race condition with etcd when joining control planes #2001
Comments
It works for some users, not for others. Right now our etcd clients race to join, but if something goes wrong kubeadm does nothing. Our long-term fix is tracked here:
/close
@neolit123: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
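As context for the race described above: the "not enough started members" error comes from etcd refusing a membership change while a previously added member has not finished starting. A minimal sketch for inspecting membership state between joins (not from the issue itself; the pod name `etcd-cp-1` and the certificate paths assume a default kubeadm static-pod layout):

```bash
# Hypothetical node name; kubeadm names its etcd static pods etcd-<node>.
kubectl -n kube-system exec etcd-cp-1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list
# Each output line reports a member's status. Joining the next control plane
# only after every existing member shows "started" avoids the
# re-configuration error described in this issue.
```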
I noticed when doing my Kubernetes cluster setup that it was common to see, at least once, either controller 2 or controller 3 fail to join the cluster as an additional control plane. The error would say something like "etcdserver: re-configuration failed due to not enough started members". Joining additional control planes concurrently appears to be flaky, so for now my solution is to join them serially rather than concurrently (a sketch follows below). For reference, a related GitHub issue with the same error: kubernetes/kubeadm#2001
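A rough sketch of that serial workaround, assuming SSH access to the nodes; the hostnames, endpoint, token, hash, and certificate key below are placeholders, not values from this thread:

```bash
# Join the extra control planes one at a time; each ssh command blocks
# until that node's `kubeadm join` completes, so the joins never overlap.
for node in cp-2 cp-3; do
  ssh "$node" sudo kubeadm join 10.0.0.10:6443 \
    --control-plane \
    --token "<token>" \
    --discovery-token-ca-cert-hash "sha256:<hash>" \
    --certificate-key "<key>"
done
```

The `--certificate-key` flag assumes the control-plane certificates were uploaded earlier with `kubeadm init --upload-certs`.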
Is this a BUG REPORT or FEATURE REQUEST?
Choose one: BUG REPORT
/kind bug

I'm running cluster-api on Docker for Mac. I've not had this problem until recently, when running `kubeadm join` simultaneously on multiple nodes. However, this might be my own problem, as I am running with `--ignore-preflight-errors=all`. I really thought I could join nodes concurrently. Am I wrong in that assumption? Should I revert to joining one node at a time?
Versions
1.15.3 and 1.17.0 (I expect it also exists in the 1.16 branch).
What happened?
I init'd a control plane and then simultaneously joined two more control planes to the cluster. One of them errors out with "etcdserver: re-configuration failed due to not enough started members".
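As an illustration, the failing sequence looks roughly like this (a hedged reconstruction, not commands copied from the report; the endpoint and credentials are placeholders):

```bash
# On controller 1: initialize and upload the control-plane certs.
kubeadm init --control-plane-endpoint 10.0.0.10:6443 --upload-certs

# On controllers 2 and 3, started at the same time: both joins race to add
# an etcd member, and the second add can fail with "not enough started
# members" if the first new member has not finished starting yet.
kubeadm join 10.0.0.10:6443 \
  --control-plane \
  --token "<token>" \
  --discovery-token-ca-cert-hash "sha256:<hash>" \
  --certificate-key "<key>"
```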
What you expected to happen?
I expected to be able to run the control plane joins simultaneously and end up with a bootstrapped cluster.
How to reproduce it (as minimally and precisely as possible)?
This happens when running the Docker end-to-end tests in cluster-api. I'm working off a fork, so it's not super easy to reproduce at the moment. This might not be a real bug; it might be my fault for bad assumptions or for ignoring preflight checks.