Finding a solution for etcd #277

Closed
jamiehannaford opened this issue May 24, 2017 · 12 comments

Labels: area/HA, priority/important-soon

jamiehannaford (Contributor) commented May 24, 2017

A few weeks ago, some folks on Slack brought up the idea of defining requirements for highly available etcd on kubeadm-provisioned clusters. The idea was to agree on requirements before implementing any solution. This discussion was continued in the sig-cluster-lifecycle meeting on May 16th 2017, where we came up with some initial criteria:

  1. High availability
    a. Recovers from member failure
    b. Recovers from quorum loss
    c. Recovers from full cluster failure, i.e. power-off
    d. Recovers from partial / failed / interrupted upgrades
  2. Handles discovery of etcd peers
  3. Secure by default
    a. TLS encryption
    b. Certificate rotation
  4. Support multiple form factors
    a. Non-self hosted
    b. Self-hosted (optional)
  5. Ability to restore from a backup (though possibly not the backup mechanism itself)
  6. Upgrades
    a. Rolling upgrades
    b. Downgrades (but tricky because of etcd)
  7. Resize/scale cluster from 1 -> 3 -> 5 members
  8. Ease of installation/teardown

Are there any I've missed?

The next stage is proposing solutions that meet the above criteria and can be verified in a fork.

cc/ @timothysc @justinsb @philips @xiang90 @aaronlevy

jamiehannaford (Contributor Author)

My proposed solution is the etcd-operator, since it meets nearly all of these criteria out of the box:

  • High availability
    • Recovers from member failure (see here)
    • Recovers from quorum loss (see here)
    • Recovers from full cluster failure, i.e. power-off (I think this can be accomplished using snapshotting and the operator's built-in restart tolerance)
    • Recovers from partial / failed / interrupted upgrades
  • Handles discovery of etcd peers
  • Secure by default
    • TLS encryption (see here)
    • Certificate rotation (no, but my personal preference is to use an external tool, such as a cert-rotator operator, to handle TLS cert rotation across the cluster)
  • Support multiple form factors
    • Non-self hosted
    • Self-hosted
  • Ability to restore from a backup (see here)
  • Upgrades
    • Rolling upgrades (see here)
    • Downgrades (but tricky because of etcd)
  • Resize/scale cluster from 1 -> 3 -> 5 members (see here)
  • Ease of installation/teardown

I've already integrated kubeadm and etcd-operator successfully in this PR, and here is the fork.

I think it's probably worthwhile to come up with a more granular disaster recovery requirement list, and also to think about the degree to which an etcd solution should cover all the bases. We already have several issues tracking this, so we should take the solutions suggested there into account too.
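
To make the proposal concrete, the object kubeadm would hand to the operator is roughly the following (a minimal sketch in Go; the API group and field names reflect my reading of the operator's EtcdCluster CRD and are assumptions, not kubeadm code). Scaling from 1 -> 3 -> 5 members would then just be an update to the size field that the operator reconciles.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterSpec is a deliberately simplified stand-in for the operator's
// EtcdCluster spec: declare the desired member count and etcd version,
// and the operator adds/removes members until the cluster matches.
type clusterSpec struct {
	Size    int    `json:"size"`
	Version string `json:"version"`
}

func main() {
	desired := map[string]interface{}{
		"apiVersion": "etcd.database.coreos.com/v1beta2", // assumed operator API group
		"kind":       "EtcdCluster",
		"metadata":   map[string]string{"name": "kubeadm-etcd"},
		"spec":       clusterSpec{Size: 3, Version: "3.1.10"},
	}
	out, err := json.MarshalIndent(desired, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```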

xiang90 commented May 24, 2017

Recovers from partial / failed / interrupted upgrades

see kubernetes-retired/bootkube#528

timothysc (Member)

@jamiehannaford if there is a branch you would like reviewed I'd be happy to go through it now.

jamiehannaford (Contributor Author)

@xiang90 Awesome. So if a failed upgrade occurs, the user can manually restore from a backup file. Is there a way that etcd can automatically check specific locations (like the local FS or S3) for backups without the user needing to specify one manually?

For example, assume that the etcd-operator has been backing data up to an S3 bucket. When it initialises, it checks the same bucket and boots from there (this assumes the user hasn't changed any backup options).

jamiehannaford (Contributor Author) commented May 25, 2017

@timothysc Thanks! The only branch I have is the one I submitted in my PR. I think you've already gone through this though. Unless you meant something else?

There are a bunch of comments on that PR which I can start to address as a next step forward. I think I'll also add TLS secrets to the PR too. Should I go ahead and do that?

@timothysc timothysc added this to the v1.8 milestone May 25, 2017
@timothysc timothysc added the priority/important-longterm label May 25, 2017
luxas (Member) commented May 29, 2017

There are a bunch of comments on that PR which I can start to address as a next step forward. I think I'll also add TLS secrets to the PR too. Should I go ahead and do that?

@jamiehannaford Feel free to. I'm gonna try to look at the TLS Secrets PR this week so it might yet change (@andrewrynhard), but I don't expect it to be part of v1.7, which gives us a little more time to think about it before v1.8.

@timothysc timothysc added the priority/important-soon label and removed the priority/important-longterm label Jun 6, 2017
anguslees (Member)

For example, assume that the etcd-operator has been backing data up to an S3 bucket. When it initialises, it checks the same bucket and boots from there

I'd just like to highlight that doing something like this automatically is a terrible idea and will give you multiple sources of truth if etcd is internally partitioned.

I suspect recovery will need to be manually triggered, because by definition it is required when the etcd cluster is incapable of making robust automatic decisions.
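
To illustrate what manually triggered recovery looks like in practice, restoring a member from a snapshot is an explicit step along these lines (a sketch wrapping etcdctl's v3 snapshot restore; the member name, peer URL, and paths are placeholders):

```go
package main

import (
	"log"
	"os"
	"os/exec"
)

// restoreFromSnapshot rebuilds one member's data directory from a snapshot
// file that a human has chosen. Nothing here scans a bucket or picks a
// snapshot automatically, which keeps a single, deliberate source of truth.
func restoreFromSnapshot(snapshotPath, dataDir string) error {
	cmd := exec.Command("etcdctl", "snapshot", "restore", snapshotPath,
		"--name", "infra0", // placeholder member name
		"--data-dir", dataDir,
		"--initial-cluster", "infra0=https://10.0.0.1:2380", // placeholder peer URL
		"--initial-cluster-token", "restored-cluster",
		"--initial-advertise-peer-urls", "https://10.0.0.1:2380",
	)
	cmd.Env = append(os.Environ(), "ETCDCTL_API=3") // use the etcdctl v3 API
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	if err := restoreFromSnapshot("/var/lib/etcd-backup/snapshot.db", "/var/lib/etcd-restored"); err != nil {
		log.Fatal(err)
	}
}
```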

xiang90 commented Aug 15, 2017

I'd just like to highlight that doing something like this automatically is a terrible idea and will give you multiple sources of truth if etcd is internally partitioned.

Totally agree. We designed this to be manual work, at least on the etcd-operator side.

luxas (Member) commented Aug 19, 2017

Moving milestone to v1.9. In v1.8, we're gonna stick with a local etcd instance listening on localhost.

@luxas luxas modified the milestones: v1.9, v1.8 Aug 19, 2017
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Oct 26, 2017
Automatic merge from submit-queue (batch tested with PRs 54593, 54607, 54539, 54105). If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Add HA feature gate and minVersion validation

**What this PR does / why we need it**:

As we add more feature gates, there might be occasions where a feature is only available on newer releases of K8s. If a user makes a mistake, we should notify them as soon as possible in the init procedure and not let them go down the path of hard-to-debug component issues.

Specifically with HA, we ideally need the new `TaintNodesByCondition` feature gate (added in v1.8.0 but only working properly in v1.9.0).
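
Roughly, the validation this adds boils down to the following (a simplified sketch, not kubeadm's actual types; the feature names and version handling are illustrative only):

```go
package main

import "fmt"

// featureSpec records the first Kubernetes minor release a gate works with.
// This is a simplified stand-in for kubeadm's real feature-gate metadata.
type featureSpec struct {
	minimumMinor int // e.g. 9 means "available from v1.9.0"
}

// knownFeatures uses illustrative names; the real gate set lives in kubeadm.
var knownFeatures = map[string]featureSpec{
	"HighAvailability": {minimumMinor: 9}, // assumed to need TaintNodesByCondition, usable from v1.9
}

// validateFeatureGates fails fast during init if an enabled gate needs a
// newer control-plane version than the one being deployed.
func validateFeatureGates(enabled map[string]bool, targetMinor int) error {
	for name, on := range enabled {
		if !on {
			continue
		}
		spec, ok := knownFeatures[name]
		if !ok {
			return fmt.Errorf("unknown feature gate %q", name)
		}
		if targetMinor < spec.minimumMinor {
			return fmt.Errorf("feature gate %q requires at least v1.%d.0, but the target version is v1.%d",
				name, spec.minimumMinor, targetMinor)
		}
	}
	return nil
}

func main() {
	// Enabling HA against a v1.8 target is rejected up front, rather than
	// surfacing later as hard-to-debug component failures.
	fmt.Println(validateFeatureGates(map[string]bool{"HighAvailability": true}, 8))
}
```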

**Which issue this PR fixes:**

kubernetes/kubeadm#261
kubernetes/kubeadm#277

**Release note**:
```release-note
Feature gates now check minimum versions
```

/cc @kubernetes/sig-cluster-lifecycle-pr-reviews @luxas @timothysc
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Nov 1, 2017
Automatic merge from submit-queue (batch tested with PRs 49840, 54937, 54543). If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Add self-hosted etcd API to kubeadm

**What this PR does / why we need it**:

This PR is part of a larger set that implements self-hosted etcd. This PR takes a first step by adding:

1. new API types in `cmd/kubeadm/app/apis` for configuring self-hosted etcd 
2. new Go types in `cmd/kubeadm/app/phases/etcd/spec` used for constructing EtcdCluster CRDs for the etcd-operator. The reason we define these in trunk is that kubeadm cannot import `github.com/coreos/etcd-operator` as a dependency until it's in its own repo. Until then, we need to redefine the structs in our codebase (a rough sketch of what these might look like follows).
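
As a rough sketch (not the final code in this PR), the redefined types might look something like this; the field names follow the operator's v1beta2 CRD as I understand it and are assumptions rather than confirmed API:

```go
// Package spec mirrors the etcd-operator's EtcdCluster types so kubeadm can
// construct the CRD without importing the operator as a dependency.
package spec

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// EtcdCluster is the custom resource the etcd-operator reconciles.
type EtcdCluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              ClusterSpec `json:"spec"`
}

// ClusterSpec declares the desired state of the etcd cluster.
type ClusterSpec struct {
	Size    int        `json:"size"`              // desired number of members, e.g. 3
	Version string     `json:"version,omitempty"` // etcd version to run
	TLS     *TLSPolicy `json:"TLS,omitempty"`     // static TLS secrets for peers and clients
}

// TLSPolicy and the types below name the Kubernetes Secrets that hold the
// member peer/server certificates and the client certificate the operator uses.
type TLSPolicy struct {
	Static *StaticTLS `json:"static,omitempty"`
}

type StaticTLS struct {
	Member         *MemberSecret `json:"member,omitempty"`
	OperatorSecret string        `json:"operatorSecret,omitempty"`
}

type MemberSecret struct {
	PeerSecret   string `json:"peerSecret,omitempty"`
	ServerSecret string `json:"serverSecret,omitempty"`
}
```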

**Which issue this PR fixes**:

kubernetes/kubeadm#261
kubernetes/kubeadm#277

**Special notes for your reviewer**:

This is the first-step PR, split out to save reviewers from a goliath PR.

**Release note**:
```release-note
NONE
```

luxas (Member) commented Nov 20, 2017

Moving the milestone for this to v1.10, as we depend on changes being made to the operator before we can use it, and the code freeze is coming up.

@timothysc timothysc modified the milestones: v1.10, v1.11 Jan 24, 2018
timothysc (Member)

Given all the history here and recent feedback, we need to go with the non-operator option.

timothysc (Member)

So I'm going to close this issue and open a new one to outline the doc on using the existing commands to lay down etcd. We will likely have to wait until some of the other phases work is done as well.

/cc @chuckha @fabriziopandini @stealthybox
