Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a task for HA etcd kubeadm #8301

Merged
merged 1 commit into from
May 11, 2018
Merged

Conversation

chuckha
Copy link
Contributor

@chuckha chuckha commented May 3, 2018

Signed-off-by: Chuck Ha ha.chuck@gmail.com

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 3, 2018
@chuckha
Copy link
Contributor Author

chuckha commented May 3, 2018

/cc @timothysc and @kubernetes/sig-cluster-lifecycle-pr-reviews

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label May 3, 2018
@k8s-ci-robot
Copy link
Contributor

@chuckha: Reiterating the mentions to trigger a notification:
@kubernetes/sig-cluster-lifecycle-pr-reviews

In response to this:

/cc @timothysc and @kubernetes/sig-cluster-lifecycle-pr-reviews

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@chuckha
Copy link
Contributor Author

chuckha commented May 3, 2018

also cc @Bradamant3 (I'll take care of the hugo transition later, I want to get some eyes on this for the time being)

@k8sio-netlify-preview-bot
Copy link
Collaborator

k8sio-netlify-preview-bot commented May 3, 2018

Deploy preview for kubernetes-io-master-staging ready!

Built with commit fdb3bc0

https://deploy-preview-8301--kubernetes-io-master-staging.netlify.com

Copy link
Member

@detiber detiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outside of updating the SANs for the generated etcd certificates, this looks really good!

serverCertSANs:
- "${HOST1}"
- "${HOST2}"
- "${HOST3}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are generating unique certs on each etcd host using the copied ca cert, this should probably contain ${CURRENT_HOST} only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this is leftover from my lazy version :)

peerCertSANs:
- "${HOST1}"
- "${HOST2}"
- "${HOST3}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@Bradamant3
Copy link
Contributor

/assign


### Certificate Authority

Create a CA for the etcd certs. This only has to be done on one node.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use CA generated outside of kubeadm? It looks like if /etc/kubernetes/pki/etcd/ca.{crt,key} exists, kubeadm does not try to create it. Then could we mention that too?

I think this would be useful when using other tools with kubeadm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's worth mentioning. I'll add some copy. Thanks for the suggestion.

_data/tasks.yml Outdated
@@ -166,7 +166,7 @@ toc:
- docs/tasks/administer-cluster/reserve-compute-resources.md
- docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods.md
- docs/tasks/administer-cluster/declare-network-policy.md
- docs/tasks/administer-cluster/kms-provider.md
- docs/tasks/administer-cluster/kms-provider.md
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

editor automation 😪

Copy link
Contributor

@jamiehannaford jamiehannaford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! Just a few questions

* Three hosts that can talk to each other over default etcd ports (2379 for
clients and 2380 for peers).
* The `kubeadm` binary installed on each host.
* SSH or similar to one host and from that host, SSH needs to work to both other
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we clarify the wording here?

Copy link
Contributor Author

@chuckha chuckha May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I struggled with this one and it shows. Basically, I want to convey that the users can copy files from host A to hosts B and C. scp would be the easiest from a command line perspective...how about something like "SSH between hosts"? The only thing is that this is not really a requirement for the system itself and just to copy certs around. I'll mull on this for a little bit. Happy to listen to suggestions.

* SSH or similar to one host and from that host, SSH needs to work to both other
hosts (to copy certificate files around).
* Docker installed on every host.
* The `kubelet` installed and running with configuration to read static
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could link them to our installation guide?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it's part of the main instruction so just linking to a block pre-req would be ideal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call 👍

section.

```
export HOST1= # example: 10.0.0.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use dummy vals for this? I think it helps users get an idea of what they need to replace with?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure thing

1. `/etc/kubernetes/pki/etcd/ca.crt`
2. `/etc/kubernetes/pki/etcd/ca.key`

Copy these certs to the other two nodes and put them in `/etc/kubernetes/pki/etcd/`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it's a good idea to output the commands people can run so they can just copy/paste? I find it lowers the cognitive burden for beginners

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to provide commands for users using ubuntu and we can discuss how you think that looks.

Copy link
Member

@timothysc timothysc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pretty much agree with @jamiehannaford 's comments.

Please address and we'll get this merged.

* SSH or similar to one host and from that host, SSH needs to work to both other
hosts (to copy certificate files around).
* Docker installed on every host.
* The `kubelet` installed and running with configuration to read static
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it's part of the main instruction so just linking to a block pre-req would be ideal.

1. `/etc/kubernetes/pki/etcd/ca.crt`
2. `/etc/kubernetes/pki/etcd/ca.key`

Copy these certs to the other two nodes and put them in `/etc/kubernetes/pki/etcd/`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

{% endcapture %}

{% capture prerequisites %}
* Three hosts that can talk to each other over default etcd ports (2379 for

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

etcd ports can be configured via kubeadmcfg.yaml so it would be better if this wording reflects that.

@chuckha chuckha force-pushed the ha-etcd branch 2 times, most recently from 63776df to 3e7a600 Compare May 7, 2018 18:56
@chuckha
Copy link
Contributor Author

chuckha commented May 7, 2018

@timothysc @jamiehannaford Changes in, would love some feedback, especially on the swizzling bit.

Copy link
Member

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @chuckha 👏!
Thanks a lot, this is a crucial piece of docs we really need.
Some comments, after those are addressed I want to merge this

---

{% capture overview %}
Kubeadm by default will create a self-hosted etcd cluster of one member. This is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not self-hosted as per the definition of "self-hosting" we have in the community. I'd say it like this:

By default, kubeadm will run one etcd instance in a Static Pod ran by the kubelet on the master. This is not a highly-available setup in case your master with that one etcd replica goes unavailable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to

Kubeadm defaults to running a single member etcd cluster in a static pod managed
by the kubelet on the control plane node. This is not a highly-available setup
as the the etcd cluster contains only one member and cannot sustain any members
becoming unavailable.

document assumes these default ports. However, they are configurable through
the kubeadm config file.
* Each host must [have docker, kubelet, and kubeadm installed][toolbox].
* SSH between hosts is required for certificate copying.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you say you need some infrastructure to copy files around instead of saying SSH is the only way to move files around?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I've been struggling with this. I think giving ssh as an example is helpful so I came up with

  • Some infrastructure to copy files between hosts (e.g., ssh).

peerCertSANs:
- "${CURRENT_HOST}"
extraArgs:
initial-cluster-token: etcd-cluster-1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, great catch. We don't even need this line as it has a default value and we are not spinning up multiple clusters. Here is the relevant documentation:

If spinning up multiple clusters (or creating and destroying a single cluster) with same configuration for testing purpose, it is highly recommended that each cluster is given a unique initial-cluster-token. By doing this, etcd can generate unique cluster IDs and member IDs for the clusters even if they otherwise have the exact same configuration. This can protect etcd from cross-cluster-interaction, which might corrupt the clusters.


{% capture steps %}

## Kubeadm Configuration File
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a Configuration File for kubeadm to consume


## Kubeadm Configuration File

Create a kubeadm configuration file on each host that looks something like this:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like what's scripted below.?

```
$ kubeadm alpha phase certs etcd-server --config=kubeadmcfg.yaml
$ kubeadm alpha phase certs etcd-peer --config=kubeadmcfg.yaml
$ kubeadm alpha phase certs etcd-healthcheck-client --config=kubeadmcfg.yaml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a client cert, I'd rephrase the title to be "unique certs to be generated per peer" or similar

`kubeadm` command to generate a static manifest for etcd.

```
kubeadm alpha phase etcd local --config=kubeadmcfg.yaml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time I see alpha here I really look forward to the phases graduation 😄


## Optional: Check the cluster health

### Get `etcdctl`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use the docker image here instead so we don't have to curl/wget etc. instead just

docker run -it --net host -v /etc/kubernetes:/etc/kubernetes quay.io/coreos/etcd:v3.2.14 etcdctl

makes sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hah yes, that is definitely an easier approach.

### Check the health

```
./etcd-v3.1.13-linux-amd64/etcdctl --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --ca-file /etc/kubernetes/pki/etcd/ca.crt --endpoints https://${HOST1}:2379 cluster-health
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show successful output?

{% endcapture %}

{% capture whatsnext %}
* Learn how to use this HA etcd cluster as an external etcd when bootstrapping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add link to the next chapter here?

Copy link
Member

@timothysc timothysc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

please address @luxas comments and we'll get this merged.

@timothysc timothysc added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 9, 2018
@chuckha
Copy link
Contributor Author

chuckha commented May 9, 2018

oh right. jekyll got switched out for hugo. Also updating that.

Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
@chuckha
Copy link
Contributor Author

chuckha commented May 9, 2018

@luxas I changed the content significantly and added new content taking your suggestions into account. I'm better about certs & security here.

Copy link
Member

@timothysc timothysc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

If folks have other edits we can do them as a knock-on PR.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 9702579 into kubernetes:master May 11, 2018
mbert pushed a commit to mbert/website that referenced this pull request May 15, 2018
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants