Gather user cases for kubeadm operator from CAPI side #7044

Closed
pacoxu opened this issue Aug 10, 2022 · 11 comments
Labels
kind/proposal Issues or PRs related to proposals. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@pacoxu
Member

pacoxu commented Aug 10, 2022

User Story
The kubeadm operator has been discussed many times before.
Generally, it can handle:

  • kubeadm cluster upgrade (some dry-run or prechecks)
  • kubeadm configuration changes
  • certs rotation

We want to gather a list of use cases from the CAPI side.

Detailed Description

I tried to build a kubeadm operator that can help users with day-2 operations.
I opened a thread in https://groups.google.com/g/kubernetes-sig-cluster-lifecycle/c/LMAABdj31DI as well.

Anything else you would like to add:

A description of the current kubeadm operator status:

I am not sure if this is the right place to discuss the kubeadm operator. There are related threads in kubernetes/kubeadm#2317 and kubernetes/enhancements#2505.

I wrote a simple kubelet-reloader as a helper tool for the kubeadm operator.

  • kubelet-reloader watches /usr/bin/kubelet-new.
  • Once a different version of kubelet-new appears, the reloader replaces /usr/bin/kubelet and restarts kubelet.
  • TODO: verify the kubelet configuration and version before replacing the binary.
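
To make the reloader steps above concrete, here is a minimal sketch in Python. It is not the code of the actual kubelet-reloader: the function names are hypothetical, "different version" is approximated by comparing SHA-256 digests of the two files, and it assumes kubelet runs as a systemd unit named `kubelet`. Only the two paths come from this issue.

```python
import hashlib
import os
import shutil
import subprocess
from pathlib import Path
from typing import Callable

NEW_KUBELET = Path("/usr/bin/kubelet-new")  # path named in this issue
CUR_KUBELET = Path("/usr/bin/kubelet")      # path named in this issue


def digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def restart_kubelet() -> None:
    # Assumption: kubelet is managed by systemd under the unit name "kubelet".
    subprocess.run(["systemctl", "restart", "kubelet"], check=True)


def reload_once(
    new: Path = NEW_KUBELET,
    cur: Path = CUR_KUBELET,
    restart: Callable[[], None] = restart_kubelet,
) -> bool:
    """Replace `cur` with `new` if their contents differ.

    Returns True when a replacement (and restart) happened, False otherwise.
    """
    if not new.exists():
        return False
    if cur.exists() and digest(new) == digest(cur):
        return False
    # Copy next to the target and rename, so the swap itself is atomic.
    tmp = cur.with_name(cur.name + ".tmp")
    shutil.copy2(new, tmp)
    os.chmod(tmp, 0o755)
    os.replace(tmp, cur)
    restart()
    return True
```

A real reloader would call something like `reload_once()` from a polling loop or a file-system watch, and would run the configuration and version checks from the TODO before the swap.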

Currently, kubeadm-operator v0.1.0 supports upgrades across versions, such as v1.22 to v1.24.

  • The kubeadm operator downloads kubectl/kubelet/kubeadm and performs the upgrade. (The current logic downloads the binaries directly; I am not sure whether yum upgrade/apt upgrade would be better.)
  • kubelet is placed at /usr/bin/kubelet-new for the kubelet-reloader.
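
As an illustration of what an upgrade such as v1.22 to v1.24 involves, the sketch below plans the intermediate hops. It assumes the operator steps through every minor version in between, since kubeadm does not support skipping minor versions during an upgrade; whether kubeadm-operator v0.1.0 plans its hops exactly this way is an assumption, and `plan_upgrade_path` is a hypothetical helper name.

```python
import re
from typing import List, Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Parse 'v1.22', '1.22' or 'v1.22.3' into (major, minor)."""
    m = re.fullmatch(r"v?(\d+)\.(\d+)(?:\.\d+)?", version)
    if m is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(m.group(1)), int(m.group(2))


def plan_upgrade_path(current: str, target: str) -> List[str]:
    """List the minor versions to upgrade through, one hop at a time."""
    cur_major, cur_minor = parse_minor(current)
    tgt_major, tgt_minor = parse_minor(target)
    if cur_major != tgt_major:
        raise ValueError("major version changes are not handled in this sketch")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not supported")
    return [f"v{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


print(plan_upgrade_path("v1.22", "v1.24"))  # ['v1.23', 'v1.24']
```

Each hop would then correspond to one download-and-upgrade round for kubeadm, kubelet, and kubectl.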

See quick-start.

Some thoughts on the next steps

My version https://github.com/pacoxu/kubeadm-operator is based on Fabrizio's first implementation kubernetes/kubeadm#2342, which follows the KEP https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/kubeadm/2505-Kubeadm-operator.
BTW, https://github.com/chendave/kubeadm-operator is a similar project to mine.

I hope to receive your feedback, suggestions, or requirements for the kubeadm operator.

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 10, 2022
@killianmuldoon
Contributor

/cc @chrischdi seems like something you might be interested in! 🙂

@sbueringer
Member

cc @fabriziopandini

@fabriziopandini
Member

@pacoxu thanks for reaching out with this issue.
As you probably know, Cluster API promotes the idea of "immutable" infrastructure, so every mutation happens by creating a new Machine and deleting the old one.

There is some early discussion about mutability in CAPI (cc @enxebre), and personally, I would like to be more involved in it and work toward a high-level design doc; IMO there are a few areas this document should figure out:

  • The API, modeling the current and desired state of components and providing the UX for mutable changes
  • The operating model, which most probably will be a mix of mutability and immutability
  • The scope and boundaries of mutability in CAPI; e.g. whether OS or kernel upgrades are in scope for mutability
  • The split of responsibilities between different layers of the stack: what is in CAPI core, what in providers, what else?

When we get to the last point, that is probably where we will start to figure out whether something like the kubeadm operator can be relevant for CAPI, and how and which use cases it can cover. In the meantime, I will try to join the kubeadm office hours to follow the discussion there and to brainstorm about this idea.

@pacoxu
Member Author

pacoxu commented Sep 1, 2022

@fabriziopandini I just watched the kubeadm office hours recording. https://mail.google.com/mail/u/0/#inbox/FMfcgzGqPpgGlrWbcSVJpPnNjmbXXcQf is updated.

Thanks for your advice and comments on it. More feedback and discussion are needed.
I need to rethink the design and its limitations.

@shawn111

shawn111 commented Sep 7, 2022

This operator provides something similar to rpm-ostree with --apply-live.
rpm-ostree focuses on immutability, and --apply-live adds some convenience.
It is useful but cannot cover all cases.
There are still cases where we need to replace components such as runc/containerd.

Sorry, I'm new to this project. These are just some comments from my experience as a user.

Thanks @pacoxu

@pacoxu
Member Author

pacoxu commented Sep 15, 2022

For the kubeadm-operator topic, I think runc/containerd is out of scope.

Package management with apt/yum may be part of it, for kubelet/kubeadm/kubectl upgrades.
I think there are several choices:

  1. Download the binaries and replace them directly. Simple, but it does not cover all cases.
  2. Upgrade using yum/apt with a configured repo.
  3. Use rpm for offline installs (or use a local repo with RPMs, as in choice 2).
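
To compare the three choices side by side, the sketch below maps each one to the command an operator agent might run on a node. Everything here is an assumption for illustration: `install_command` is a hypothetical helper, the `dl.k8s.io` URL follows the pattern used in the upstream install docs, the package version strings depend on how the configured repo names its packages, and the RPM file path is a placeholder.

```python
from typing import List, Optional

COMPONENTS = {"kubelet", "kubeadm", "kubectl"}


def install_command(
    method: str,
    component: str,
    version: str,
    dest: Optional[str] = None,      # target path for a direct binary download
    rpm_path: Optional[str] = None,  # local RPM file for the offline case
    arch: str = "amd64",
) -> List[str]:
    """Build (but do not run) the command for one install choice."""
    if component not in COMPONENTS:
        raise ValueError(f"unsupported component: {component}")
    v = version.lstrip("v")
    if method == "binary":
        # Choice 1: download the release binary directly.
        if dest is None:
            raise ValueError("dest is required for a binary download")
        url = f"https://dl.k8s.io/release/v{v}/bin/linux/{arch}/{component}"
        return ["curl", "-fsSLo", dest, url]
    if method == "apt":
        # Choice 2 (Debian family): version suffix depends on the repo.
        return ["apt-get", "install", "-y", f"{component}={v}-*"]
    if method == "yum":
        # Choice 2 (RPM family): package naming depends on the repo.
        return ["yum", "install", "-y", f"{component}-{v}"]
    if method == "rpm":
        # Choice 3: offline install from a local RPM file.
        if rpm_path is None:
            raise ValueError("rpm_path is required for an offline install")
        return ["rpm", "-Uvh", rpm_path]
    raise ValueError(f"unknown method: {method}")
```

For example, `install_command("binary", "kubelet", "v1.24.0", dest="/usr/bin/kubelet-new")` would produce the download that feeds the kubelet-reloader, while the apt/yum variants leave file placement and service restarts to the package scripts.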

@shawn111

shawn111 commented Sep 15, 2022

@pacoxu If you are considering managing binaries beyond kubelet/kubeadm/kubectl, do you think luet (https://luet.io/) would be a good solution?
Luet is a container-based package manager; its packages are stored in a container registry.
But that might be outside the CAPI design.

@fabriziopandini
Member

Let's keep this discussion open to channel feedback to @pacoxu
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 30, 2022
@fabriziopandini fabriziopandini added kind/proposal Issues or PRs related to proposals. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Nov 30, 2022
@furkatgofurov7
Member

@pacoxu @fabriziopandini hi folks!

Found this while exploring the options for automatic CA rotation in CAPI (issue: #7721), and this looks spot on! I am mostly interested in the kubeadm operator and its usage in the context of a bare-metal provider implementation of CAPI. Copy-pasting the use cases from #7721 here for better reach:

There are also some cases in which the CA of the target clusters might be
different from that of the management cluster. Some use cases:

1. Deploy of management cluster and many target clusters with the same CA. 
Perform the cluster CA rotation on the target clusters and the management clusters without impact on traffic
2. Deploy of management cluster and many target clusters with different CA. 
Perform the cluster CA rotation on the target clusters and the management clusters without impact on traffic

Also @pacoxu, I have not gotten to the bottom of the initial discussion yet, but does https://github.com/pacoxu/kubeadm-operator implement kubernetes/kubeadm#1698, and is it a prototype we can already try?

@fabriziopandini
Member

(doing some cleanup on old issues without updates)
/close
This work belongs to the kubeadm repo; if people have more use cases, they can report them there.

@k8s-ci-robot
Contributor

@fabriziopandini: Closing this issue.

