Dramatically Simplify Kubernetes Cluster Creation (kubeadm umbrella issue). #11

jbeda · 2016-06-22T17:14:18Z

Description

Creating a new Kubernetes cluster is too hard. We need to reduce the number and types of actions required to get a production cluster up and running.

Note that this is different from bringing up a development cluster (single node à la monokube or minikube) or automation around cluster creation (https://github.com/kubernetes/community/wiki/Roadmap:-Cluster-Deployment).

If we do this right, the number of manual steps to get a cluster running should be minimal. This will have the added benefit of making other deployment scenarios (dev cluster, cluster automation) simpler and smaller.

As part of this, we should make simplifying assumptions and have opinionated defaults. An example would be embedding etcd and picking an easy to use network technology. Certificates and trust should be established automatically.

Progress Tracker

FEATURE_STATUS is used for feature tracking and is to be updated by @kubernetes/feature-reviewers.
FEATURE_STATUS: IN_DEVELOPMENT

More advice:

Design

Once you get LGTM from a @kubernetes/feature-reviewers member, you can check this checkbox, and the reviewer will apply the "design-complete" label.

Coding

Use as many PRs as you need. Write tests in the same or different PRs, as is convenient for you.
As each PR is merged, add a comment to this issue referencing the PRs. Code goes in the http://github.com/kubernetes/kubernetes repository,
and sometimes http://github.com/kubernetes/contrib, or other repos.
When you are done with the code, apply the "code-complete" label.
When the feature has user docs, please add a comment mentioning @kubernetes/feature-reviewers and they will
check that the code matches the proposed feature and design, and that everything is done, and that there is adequate
testing. They won't do detailed code review: that already happened when your PRs were reviewed.
When that is done, you can check this box and the reviewer will apply the "code-complete" label.

Docs

Write user docs and get them merged in.
User docs go into http://github.com/kubernetes/kubernetes.github.io.
When the feature has user docs, please add a comment mentioning @kubernetes/docs.
When you get LGTM, you can check this checkbox, and the reviewer will apply the "docs-complete" label.


jbeda · 2016-06-22T17:21:21Z

Other related efforts/prior art:

List of core k8s features that will help all deploys: sharing infrastructure between deployments kubernetes-retired/kube-deploy#123
[closed] Original proposal around this [somewhat outdated]: Dramatically simplify Kubernetes deployment kubernetes#2303
[closed] Proposal to rework Kubernetes deployment CLI: kubernetes#5472
[closed] RFC: kube-bootstrap: kubernetes#16077
Implement the cluster bootstrap API: kubernetes#5754 (pulled out of Dramatically simplify Kubernetes deployment kubernetes#2303)
How to get kargo project into kubernetes github tree? kubernetes#27948
http://kubernetes.io/docs/getting-started-guides/scratch/
https://github.com/kubernetes/community/wiki/Roadmap:-Cluster-Deployment
https://github.com/kubernetes/kubernetes-anywhere
https://github.com/kubernetes/kube-deploy
https://github.com/coreos/bootkube

[I'll update this comment with new links as they come in]

jbeda · 2016-06-22T17:25:25Z

@mikedanese -- I know this is a lot of what you've been working on. I'd love to get that reflected here and scoped for 1.4. Do you mind shooting some pointers over?

philips · 2016-06-22T17:28:59Z

cc @derekparker @aaronlevy @pbx0 from the CoreOS team working on https://github.com/coreos/bootkube and the self-hosted stuff with @mikedanese to realize a k8s driven creation and update story.

jbeda · 2016-06-22T17:39:46Z

To be extra clear -- I'm proposing that we make this experience part of core kubernetes. The fact that core k8s is a set of things that have to be set up together is a powerful thing but it makes things look very very complex. We should be willing to have a sane set of defaults and embedded solutions built in to the main distribution.

Right now our "manual install" page is incredibly daunting. We should aim to reduce that (at least for a given set of integrated services) to a single screen without the crutch of automation tools that paper over the complexity.

mikedanese · 2016-06-22T20:07:51Z

There is a class of infrastructure (that doesn't currently exist) that would benefit all deployment automations. We should try to enumerate what these items are, give them relative priorities and advocate for them in v1.4 planning. I started to create a list a couple days ago: kubernetes-retired/kube-deploy#123 cc @justinsb @errordeveloper @bgrant0607

pires · 2016-06-22T20:15:38Z

@jbeda kubernetes/kubernetes#5472, kubernetes/kubernetes#16077 and kubernetes/kubernetes#5754 are related to this.

thockin · 2016-06-22T21:02:00Z

Something that crossed my mind with docker 1.12 - built-in kvstore means that libnetwork's overlay driver might be viable for us. Having a built-in network mode for Docker installs that works anywhere and doesn't require extra components might be nice.

Might require some work to not assume prefixes per node.

klizhentas · 2016-06-22T22:11:27Z

Just so I understand it better: embedded etcd will be optional, right? In production we would still want to deploy etcd separately from the API/Scheduler/Controller.

smarterclayton · 2016-06-22T22:14:32Z

Network is the hardest part - you can ignore security and edge cases as long as pods can talk to each other. Leveraging libnetwork seems like a practical choice where possible (or just have a daemonset that drops in your favorite network auto provisioner via CNI). Once the node is started we can run any code.

jbeda · 2016-06-22T22:29:55Z

@klizhentas Yes! The idea is to make small clusters super easy. Folks looking for large clusters will want to manage etcd independently. Users can choose to take on the complexity of breaking everything out but it'll be an advanced move.

jbeda · 2016-06-22T22:30:51Z

@smarterclayton I think we just need to pick something to get going. The easiest zero-config option would be the way to go.

derekwaynecarr · 2016-06-22T22:47:13Z

I think the model we present should not look that different from the production model. I am a fan of making it easy to launch a Kubelet that then has a static manifest with sensible defaults to launch a control plane that is not hidden in a mess of salt. In that model, etcd can still be a pod, as can other parts of our control plane.


klizhentas · 2016-06-22T22:52:04Z

@jbeda so in the simple case kube-controller-manager, kube-apiserver, kubelet + etcd will be one Go binary?

jbeda · 2016-06-22T23:17:16Z

@klizhentas @derekwaynecarr I don't know what the binaries will be that we ship. I do know that we have to make it dead easy to download a thing and get it up and running. If we can get stuff self hosted on the cluster in a container, that would be a good solution. The number of steps needs to be reduced to ~1 per node.

Let's start with the ideal set of things we want the end user to type. From there we can figure out how to get there in a sustainable way (and with the opportunity to do everything in a more explicit way for advanced users).

derekwaynecarr · 2016-06-22T23:18:48Z

I think we can demonstrate the composable nature of the platform without having to build monolithic binaries that are contrary to the spirit of microservice architecture.

There are two separate but related topics: the ability to create a node (and bring up the control plane), and the ability to have a new node join an existing cluster easily. I worry that moving to monolithic binaries doesn't necessarily help either cause.

I also think that if we want to advocate being agnostic about a particular container runtime, the setup process should follow suit. This is why I like @mikedanese's ideas in this space, since they start with the Kubelet (which could work with any container runtime it's pointed at) rather than starting with a particular container runtime launching the Kubelet.


derekwaynecarr · 2016-06-22T23:20:19Z

@jbeda - agree on focusing on desired ux command first


aaronlevy · 2016-06-22T23:30:23Z

Biased, because I'm already working on it, but I'd advocate for putting effort into self-hosted (k8s-on-k8s) installations as a means of simplifying both cluster creation, and lifecycle management.

If the installation contract becomes "I can run a kubelet" and everything else is built atop that in containers -- then installation criteria could become as simple as "does your node pass node-e2e tests?"

More or less this is already possible in a simple case with all core components being run as static pods. This is how many installations work, and is well understood. The problem with this approach is that it becomes difficult to transition from "this was easy to install" to "I want to customize/modify/update this installation".

As soon as we want to make modifications to this cluster, we're back to some kind of "modify files on disk" configuration management (salt/ansible/chef/etc). In this case it doesn't preclude us from having a "simple" installation and other "production" deployment tools. Or even decide on standardization/contract where a more complex tool can take over from the other static installation (e.g. kube-apiserver.yaml will exist in /etc/kubernetes/manifests)
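The hand-off contract mentioned above (a known set of static pod manifests in a known directory) could be checked with a few lines of shell. This is only a minimal sketch: `kube-apiserver.yaml` and `/etc/kubernetes/manifests` come from the comment, while the other file names and the `check_manifests` helper are assumptions made up for illustration.

```shell
#!/bin/sh
# Hypothetical check that a "simple" install left the static pod manifests
# where a more complex tool expects to find them.

# Expected manifest names. Only kube-apiserver.yaml is named in the thread;
# the rest are assumed for illustration.
EXPECTED="kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml"

# check_manifests DIR: print each missing manifest; fail if any is missing.
check_manifests() {
  dir=$1
  missing=0
  for name in $EXPECTED; do
    if [ ! -f "$dir/$name" ]; then
      echo "missing: $dir/$name"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check the directory given as $1, or the default from the thread.
if check_manifests "${1:-/etc/kubernetes/manifests}"; then
  echo "manifest contract satisfied"
else
  echo "manifest contract not satisfied"
fi
```

A tool taking over from the static installation could run a check like this before touching anything, and refuse to proceed if the layout is not what it expects.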

Alternatively, in the self-hosted scenario, the static installation can remain simple on first install (in concept, replace your static pod definitions with deployments/daemonsets). But then can be modified / extended without relying on external configuration management (or external deployment tooling that needs to evolve in lock-step with your cluster) -- everything is by definition just a kubernetes object.

Updates to the cluster can become api-driven / or even an update-controller application. Assets (tls, configuration, flags) travel with their components as they are also just kubernetes objects (secrets / configMaps). We get all the niceties of kubernetes lifecycle management.

Now all that being said, networking really is a hard part. Maybe this comes down to figuring out the coordination at the kubelet level + cni (e.g. how do I self-host flannel-client + allow it to configure networking for subsequent pods).

klizhentas · 2016-06-22T23:40:51Z

For this particular feature I think it would help to go backwards - not from implementation to UX, but vice versa. If we get the user experience right, the implementation will follow.

Here's the ideal scenario that users can see on k8s quickstart page:

Starting single node k8s

wget https://kubernetes.io/releases/latest/kube
# starts both node, API, etcd, all components really
kube start

This lets users explore Kubernetes and start containers.

Adding node

Then there's the use case where users want to run smaller clusters to play with failover, HA and so on.

On first node, execute:

# adds provisioning token to securely add new nodes to the cluster
kube token add
<token1>

On any node to be added in the cluster:

kube start --token=<token1> --cluster=https://<ip of the first node>

That's the minimum number of steps I can imagine to bootstrap the cluster in dev mode. If we figure out this UX first, everything else will follow.
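The token step above can be sketched in plain shell. This is a minimal illustration, not the `kube` tool from the comment (which is a hypothetical UX): the helper names `rand_hex`, `new_token` and `is_valid_token` are invented here, and the `<6 chars>.<16 chars>` token shape is an assumption borrowed from the bootstrap-token format that kubeadm later used.

```shell
#!/bin/sh
# Hypothetical helpers sketching the "kube token add" idea; not a real CLI.

# rand_hex N: print 2*N lowercase hex characters read from /dev/urandom.
rand_hex() {
  od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n'
}

# new_token: print a token shaped "<6 chars>.<16 chars>" (assumed format).
new_token() {
  printf '%s.%s\n' "$(rand_hex 3)" "$(rand_hex 8)"
}

# is_valid_token TOKEN: succeed only if TOKEN matches the assumed shape.
is_valid_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Usage: the first node mints a token; a joining node would present it.
token=$(new_token)
if is_valid_token "$token"; then
  echo "token accepted: $token"
else
  echo "token rejected" >&2
fi
```

A real implementation would also have to store the token on the first node and compare it against what the joining node presents; the sketch only covers minting and shape validation.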

bgrant0607 · 2016-06-23T00:38:33Z

This has become a discussion issue rather than a tracking issue. It's great that lots of people are interested in this topic. We could use help. I created a github team (sig-cluster-lifecycle) and googlegroup (kubernetes-sig-cluster-lifecycle), which you can request to join. I'm going to rename the sig-install slack channel to sig-cluster-lifecycle. We should brainstorm in the googlegroup rather than generate more github notifications.

Also, a number of people have been working in this area for a while. We're going to summarize the current state and make a prioritized list of work items that have already been identified.

metral · 2016-06-23T01:09:41Z

I took the liberty of creating a post that summarizes the generic expectations stated here in the googlegroup (kubernetes-sig-cluster-lifecycle) to continue the brainstorming: https://groups.google.com/forum/#!topic/kubernetes-sig-cluster-lifecycle/LRMygt2YNrE

philips · 2016-06-23T01:56:36Z

On Wed, Jun 22, 2016 at 2:02 PM Tim Hockin wrote:

Something that crossed my mind with docker 1.12 - built-in kvstore means that libnetwork's overlay driver might be viable for us. Having a built-in network mode for Docker installs that works anywhere and doesn't require extra components might be nice.

Relying on libnetwork for bootstrapping sounds like a mess long term. I would much rather figure out bootstrap, reconfiguration, etc. of CNI, which we need to figure out anyways, than make some compromise that puts a new dependency on the Docker engine.

aronchick · 2016-06-23T04:52:39Z

Reminder - these issues should (ideally) only be used for discussion about the flow of the work items related to the feature. If you would like to discuss the design (please do!), please do it in the Google group.

idvoretskyi · 2016-06-23T16:30:07Z

Please, take a look at this proposal - kubernetes/kubernetes#27948

jbeda · 2016-06-23T16:37:24Z

@aronchick @bgrant0607 @metral Agreed -- let's take this to the mailing list.

https://groups.google.com/forum/#!forum/kubernetes-sig-cluster-lifecycle

jimmycuadra · 2016-07-21T08:37:26Z

@jbeda For your list of related work: https://github.com/InQuicker/kaws

neolit123 · 2018-09-06T00:20:38Z

this gets my 👍 for GA in 1.12, in terms of docs.
(we do have to cover some other aspects.)

we have decent documentation in place and we haven't had major, negative feedback on the instructions for cluster creation with kubeadm which were improved a lot in 1.11. in the meantime we continue to improve the docs where possible.

pages to note:
https://kubernetes.io/docs/setup/independent/install-kubeadm/
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
https://kubernetes.io/docs/setup/independent/high-availability/

kacole2 · 2018-10-08T17:24:50Z

Hi. This is currently being tracked for 1.13. I want to see what changes are being made for 1.13 for the feature to be considered GA/Stable.

This release is targeted to be more ‘stable’ and will have an aggressive timeline. Please only include this enhancement if there is a high level of confidence it will meet the following deadlines:

Docs (open placeholder PRs): 11/8
Code Slush: 11/9
Code Freeze Begins: 11/15
Docs Complete and Reviewed: 11/27

Thanks!

timothysc · 2018-10-09T21:44:55Z

Our objective is to take kubeadm to GA this cycle.

bogdando · 2018-10-11T10:14:32Z

Reading through https://kubernetes.io/docs/setup/independent/high-availability/ I noticed there is now a neat config file (kubeadm-config.yaml in the guide). But I was not sure what the status of "feature flags" support is for that config. Can we have init steps executed selectively yet? Or has it dropped off the radar?

dims · 2018-10-11T11:33:13Z

@bogdando see the "full" example in the doc for featureGates:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/apis/kubeadm/v1alpha3/doc.go#L160-L248

neolit123 · 2018-10-11T12:10:46Z

@bogdando

Can we have init steps executed selectively yet? Or has it dropped off the radar?

yes, in 1.12 there is kubeadm alpha phase.... which handles init phases. in 1.13 these would be more widely available.

AishSundar · 2018-10-17T06:10:01Z

@neolit123 @roberthbailey is this still on track for GA in 1.13? Do we have a link to a list of pending PRs or issues so we can track this better?

neolit123 · 2018-10-17T13:10:46Z

@AishSundar

i don't have powers here to update the OP.
in terms of docs we are in a good state, test coverage might need some attention as far as i understand the goals here.

i'm going to have to defer to @timothysc on this one.

AishSundar · 2018-10-29T00:12:37Z

@timothysc could you please provide a more recent update on the status of Kubeadm for GA in 1.13? Specifically around

(i) how many and which PRs (code and test) are pending
(ii) latest status of docs and links to docs PR
(iii) Status of failing kubeadm tests in master blocking

With code slush nearing us on 11/9, could you provide us with an ETA of when you expect all pending things to land in master? Given we need some time to stabilize things before Code freeze on 11/16, it might be a good idea to timebox the remaining work and make a Go/No-Go call for GA in 1.13 before Code freeze. Thanks!

@neolit123 @kacole2 @tfogo

neolit123 · 2018-10-29T11:45:50Z

(iii) Status of failing kubeadm tests in master blocking

this should have green runs today.

timothysc · 2018-10-30T18:29:28Z

@AishSundar I wish it were that simple. There are a number of PRs in flight and most of the docs will be minor changes. Progress is good, but I probably won't have a good answer for you until ~ next week. Almost all of the work is not "net-new" features but cleanup and bug fixes in shuffling for GA, which will likely span into slush.

AishSundar · 2018-10-30T23:05:43Z

Ack that @timothysc and thanks for the update and consolidating all the Kubeadm GA work under this issue. We will check back once in slush.

claurence · 2018-11-07T15:47:52Z

@timothysc Hi I'm an enhancements shadow for 1.13 - checking in on progress for this issue. Code slush is 11/9 and Code freeze is 11/15, is this issue still on track for those milestones? Thanks!

kacole2 · 2018-11-08T19:36:50Z

@timothysc can you drop in a list of PRs we should be tracking for Kubeadm going GA? Thanks!

neolit123 · 2018-11-08T19:51:31Z

the issue is on track, yes.

kubeadm labels are auto-applied to our PRs, too many to log and track on our side:
https://github.com/kubernetes/kubernetes/pulls?q=is%3Apr+is%3Aopen+label%3Aarea%2Fkubeadm

as explained in a release team meeting these are our 2 GA items:

the list in here is a critical one:
kubernetes/kubeadm#1163

this is for the kubeadm config:
kubernetes/kubeadm#911

docs are mostly command reshuffle and are TBD.

AishSundar · 2018-11-12T00:19:10Z

@timothysc @neolit123 can one of you attend the Release burndown meeting next week (Mon, Wed or Fri) to give the latest update on Kubeadm GA?

neolit123 · 2018-11-12T15:57:48Z

i can try joining today.

update on docs:
for GA we only need 2 docs PRs, one already merged and one is a WIP placeholder, TBD before 19th:
kubernetes/website#10937 (comment)

neolit123 · 2018-11-18T19:27:57Z

update:

all feature PRs were merged. during code freeze we have time to dig for bugs.
remaining docs PR is WIP:
kubernetes/website#10960

neolit123 · 2018-12-03T22:46:19Z

our docs and all our items for 1.13 are in place.
this tracking issue can finally be closed.
kubeadm is now GA. 🎉

/close

k8s-ci-robot · 2018-12-03T22:46:21Z

@neolit123: Closing this issue.



idvoretskyi mentioned this issue Jun 23, 2016

How to get kargo project into kubernetes github tree ? kubernetes/kubernetes#27948

Closed

errordeveloper mentioned this issue Jun 28, 2016

Addon management layer #18

Closed

18 tasks

philips mentioned this issue Jun 29, 2016

Make etcdmain/serve.go public and more modular etcd-io/etcd#5430

Closed

idvoretskyi modified the milestone: v1.4 Jul 18, 2016

timothysc modified the milestones: v1.12, v1.13 Sep 10, 2018

timothysc mentioned this issue Oct 30, 2018

Extensible configuration/invocation of kubeadm #356

Closed

k8s-ci-robot closed this as completed Dec 3, 2018

kacole2 added the tracked/no label and removed the tracked/yes label Jul 15, 2019

ingvagabund pushed a commit to ingvagabund/enhancements that referenced this issue Apr 2, 2020

Merge pull request kubernetes#11 from enxebre/machine-instance-lifecycle

b41622f

Add machine-instance-lifecycle proposal

astoycos pushed a commit to astoycos/enhancements-1 that referenced this issue Jan 7, 2022

Merge pull request kubernetes#11 from abhiraut/minor-updates

87bf94e

Update priority scheme and workloadSelector struct

siyuanfoundation pushed a commit to siyuanfoundation/kep that referenced this issue Jan 26, 2024

Merge pull request kubernetes#11 from siyuanfoundation/compat-versions

92be63c

update feature gating changes section
