-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds a task for HA etcd kubeadm #8301
Conversation
/cc @timothysc and @kubernetes/sig-cluster-lifecycle-pr-reviews |
@chuckha: Reiterating the mentions to trigger a notification: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
also cc @Bradamant3 (I'll take care of the hugo transition later, I want to get some eyes on this for the time being) |
Deploy preview for kubernetes-io-master-staging ready! Built with commit fdb3bc0 https://deploy-preview-8301--kubernetes-io-master-staging.netlify.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outside of updating the SANs for the generated etcd certificates, this looks really good!
serverCertSANs: | ||
- "${HOST1}" | ||
- "${HOST2}" | ||
- "${HOST3}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are generating unique certs on each etcd host using the copied ca cert, this should probably contain ${CURRENT_HOST} only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, this is leftover from my lazy version :)
peerCertSANs: | ||
- "${HOST1}" | ||
- "${HOST2}" | ||
- "${HOST3}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
/assign |
|
||
### Certificate Authority | ||
|
||
Create a CA for the etcd certs. This only has to be done on one node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use CA generated outside of kubeadm? It looks like if /etc/kubernetes/pki/etcd/ca.{crt,key}
exists, kubeadm does not try to create it. Then could we mention that too?
I think this would be useful when using other tools with kubeadm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's worth mentioning. I'll add some copy. Thanks for the suggestion.
_data/tasks.yml
Outdated
@@ -166,7 +166,7 @@ toc: | |||
- docs/tasks/administer-cluster/reserve-compute-resources.md | |||
- docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods.md | |||
- docs/tasks/administer-cluster/declare-network-policy.md | |||
- docs/tasks/administer-cluster/kms-provider.md | |||
- docs/tasks/administer-cluster/kms-provider.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
editor automation 😪
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good! Just a few questions
* Three hosts that can talk to each other over default etcd ports (2379 for | ||
clients and 2380 for peers). | ||
* The `kubeadm` binary installed on each host. | ||
* SSH or similar to one host and from that host, SSH needs to work to both other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we clarify the wording here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I struggled with this one and it shows. Basically, I want to convey that the users can copy files from host A to hosts B and C. scp
would be the easiest from a command line perspective...how about something like "SSH between hosts"? The only thing is that this is not really a requirement for the system itself and just to copy certs around. I'll mull on this for a little bit. Happy to listen to suggestions.
* SSH or similar to one host and from that host, SSH needs to work to both other | ||
hosts (to copy certificate files around). | ||
* Docker installed on every host. | ||
* The `kubelet` installed and running with configuration to read static |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could link them to our installation guide?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, it's part of the main instruction so just linking to a block pre-req would be ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call 👍
section. | ||
|
||
``` | ||
export HOST1= # example: 10.0.0.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we use dummy vals for this? I think it helps users get an idea of what they need to replace with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing
1. `/etc/kubernetes/pki/etcd/ca.crt` | ||
2. `/etc/kubernetes/pki/etcd/ca.key` | ||
|
||
Copy these certs to the other two nodes and put them in `/etc/kubernetes/pki/etcd/`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think it's a good idea to output the commands people can run so they can just copy/paste? I find it lowers the cognitive burden for beginners
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to provide commands for users using ubuntu and we can discuss how you think that looks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pretty much agree with @jamiehannaford 's comments.
Please address and we'll get this merged.
* SSH or similar to one host and from that host, SSH needs to work to both other | ||
hosts (to copy certificate files around). | ||
* Docker installed on every host. | ||
* The `kubelet` installed and running with configuration to read static |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, it's part of the main instruction so just linking to a block pre-req would be ideal.
1. `/etc/kubernetes/pki/etcd/ca.crt` | ||
2. `/etc/kubernetes/pki/etcd/ca.key` | ||
|
||
Copy these certs to the other two nodes and put them in `/etc/kubernetes/pki/etcd/`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
{% endcapture %} | ||
|
||
{% capture prerequisites %} | ||
* Three hosts that can talk to each other over default etcd ports (2379 for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
etcd ports can be configured via kubeadmcfg.yaml
so it would be better if this wording reflects that.
63776df
to
3e7a600
Compare
@timothysc @jamiehannaford Changes in, would love some feedback, especially on the swizzling bit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @chuckha 👏!
Thanks a lot, this is a crucial piece of docs we really need.
Some comments, after those are addressed I want to merge this
--- | ||
|
||
{% capture overview %} | ||
Kubeadm by default will create a self-hosted etcd cluster of one member. This is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not self-hosted as per the definition of "self-hosting" we have in the community. I'd say it like this:
By default, kubeadm will run one etcd instance in a Static Pod ran by the kubelet on the master. This is not a highly-available setup in case your master with that one etcd replica goes unavailable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to
Kubeadm defaults to running a single member etcd cluster in a static pod managed
by the kubelet on the control plane node. This is not a highly-available setup
as the the etcd cluster contains only one member and cannot sustain any members
becoming unavailable.
document assumes these default ports. However, they are configurable through | ||
the kubeadm config file. | ||
* Each host must [have docker, kubelet, and kubeadm installed][toolbox]. | ||
* SSH between hosts is required for certificate copying. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you say you need some infrastructure to copy files around instead of saying SSH is the only way to move files around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I've been struggling with this. I think giving ssh as an example is helpful so I came up with
- Some infrastructure to copy files between hosts (e.g., ssh).
peerCertSANs: | ||
- "${CURRENT_HOST}" | ||
extraArgs: | ||
initial-cluster-token: etcd-cluster-1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, great catch. We don't even need this line as it has a default value and we are not spinning up multiple clusters. Here is the relevant documentation:
If spinning up multiple clusters (or creating and destroying a single cluster) with same configuration for testing purpose, it is highly recommended that each cluster is given a unique initial-cluster-token. By doing this, etcd can generate unique cluster IDs and member IDs for the clusters even if they otherwise have the exact same configuration. This can protect etcd from cross-cluster-interaction, which might corrupt the clusters.
|
||
{% capture steps %} | ||
|
||
## Kubeadm Configuration File |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create a Configuration File for kubeadm to consume
|
||
## Kubeadm Configuration File | ||
|
||
Create a kubeadm configuration file on each host that looks something like this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something like what's scripted below.
?
``` | ||
$ kubeadm alpha phase certs etcd-server --config=kubeadmcfg.yaml | ||
$ kubeadm alpha phase certs etcd-peer --config=kubeadmcfg.yaml | ||
$ kubeadm alpha phase certs etcd-healthcheck-client --config=kubeadmcfg.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a client cert, I'd rephrase the title to be "unique certs to be generated per peer" or similar
`kubeadm` command to generate a static manifest for etcd. | ||
|
||
``` | ||
kubeadm alpha phase etcd local --config=kubeadmcfg.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time I see alpha here I really look forward to the phases graduation 😄
|
||
## Optional: Check the cluster health | ||
|
||
### Get `etcdctl` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use the docker image here instead so we don't have to curl/wget etc. instead just
docker run -it --net host -v /etc/kubernetes:/etc/kubernetes quay.io/coreos/etcd:v3.2.14 etcdctl
makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hah yes, that is definitely an easier approach.
### Check the health | ||
|
||
``` | ||
./etcd-v3.1.13-linux-amd64/etcdctl --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --ca-file /etc/kubernetes/pki/etcd/ca.crt --endpoints https://${HOST1}:2379 cluster-health |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show successful output?
{% endcapture %} | ||
|
||
{% capture whatsnext %} | ||
* Learn how to use this HA etcd cluster as an external etcd when bootstrapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add link to the next chapter here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
please address @luxas comments and we'll get this merged.
oh right. jekyll got switched out for hugo. Also updating that. |
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
@luxas I changed the content significantly and added new content taking your suggestions into account. I'm better about certs & security here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
If folks have other edits we can do them as a knock-on PR.
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: timothysc The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
Signed-off-by: Chuck Ha ha.chuck@gmail.com