Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Control Planes taint to kubelet config instead of markcontrolplane phase #1621

Open
yagonobre opened this issue Jun 19, 2019 · 52 comments
Labels
area/UX kind/bug Categorizes issue or PR as related to a bug. kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.
Milestone

Comments

@yagonobre
Copy link
Member

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT

What happened?

Due we apply the control plane taint after the control plane comes up, on a multi-control plane case we can have pod scheduled to this control plane.

What you expected to happen?

Use the kubelet --register-with-taints config instead of handler it on a separate phase.

How to reproduce it (as minimally and precisely as possible)?

  1. Kubeadm init
  2. Create a daemonset
  3. Join another control plane

Anything else we need to know?

For now, I just use this config but would be nice if kubeadm can handler it.

apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
nodeRegistration:
  # Remove the default control-plane taint so we taint it manually with KUBELET_EXTRA_ARGS
  taints: []
```
@neolit123
Copy link
Member

i guess this can be a real problem. thanks for the report @yagonobre

does the kubelet allow self-taints with node roles such as node-role.kubernetes.io/master=:NoSchedule?
(it certainly does not for labels kubernetes/kubernetes#45361)

@neolit123 neolit123 added kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Jun 19, 2019
@neolit123 neolit123 added this to the v1.16 milestone Jun 19, 2019
@yagonobre
Copy link
Member Author

does the kubelet allow self-taints with node roles such as node-role.kubernetes.io/master=:NoSchedule?

Yes, probably we can keep the phase to add the label

@neolit123 neolit123 added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jul 18, 2019
@madhukar32
Copy link

@neolit123 : Can I take a crack at this?

@neolit123
Copy link
Member

@madhukar32
hi, this needs discussion before sending the PR.
if we remove the taint from markcontrolplane phase it will break existing users.

@yagonobre
Copy link
Member Author

I was away, but now I'll have enough time to do this.
@neolit123 what you think about add the taint to the kubelet config, and keep it on the markcontrolplane phase just for backward compatibility?

@neolit123
Copy link
Member

we might have to keep it in both places with a deprecation notice.

@blurpy
Copy link

blurpy commented Dec 5, 2019

We are seeing this when joining control plane nodes to existing clusters (1.15). We use nginx-ingress-controller as a daemonset, and it's on host port 443, same as the apiserver. So the apiserver always ends up in a CrashLoop until I manually delete the pod.

@neolit123
Copy link
Member

this is a known problem. changes in kubeadm phases are tricky - the existing workaround can be seen above, but we might have to have a period of time where we both taint using the kubelet configuration and the kubeadm mark-control-plane phase, potentially deprecating the tainting in the phase in the future.

@blurpy
Copy link

blurpy commented Dec 5, 2019

Not sure I understand how the workaround works. Isn't InitConfiguration used only during init of the first master? Or can it be updated in the configmap in kube-system and used during join --control-plane?

@neolit123
Copy link
Member

neolit123 commented Dec 5, 2019

both init and join configurations have the node registration options: https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#NodeRegistrationOptions

tainting using the KubeletConfigurtation is not possible.

@blurpy
Copy link

blurpy commented Dec 5, 2019

I've tried to use a JoinConfiguration to setup additional masters, but then I just get this message:

# kubeadm join --control-plane --config /etc/kubernetes/kubeadm-master-config.yaml
     can not mix '--config' with arguments [control-plane]

Related issue I found: #1485

@yagonobre
Copy link
Member Author

Today I add the taint directly to the kubelet config file. I'll try to work on it soon.

@neolit123
Copy link
Member

neolit123 commented Dec 5, 2019 via email

@neolit123
Copy link
Member

neolit123 commented Dec 5, 2019 via email

@yagonobre
Copy link
Member Author

Yago, i dont see a field for that in the kubeletconfiguration.

Sorry, it's a flag.
I use KUBELET_EXTRA_ARGS="--register-with-taints=node-role.kubernetes.io/master=:NoSchedule"

@neolit123
Copy link
Member

neolit123 commented Dec 5, 2019

@yagonobre i just tied passing --register-with-taints=node-role.kubernetes.io/master:NoSchedule to the kubelet instead of using the markcontorlplane phase for the taint and both the CNI (calico) and coredns are stuck in Pending saying that there is no Node to schedule on (even if they tolerate the master taint).

what k8s version have you tried this with?
i'm testing with the latest 1.18.

@yagonobre
Copy link
Member Author

yagonobre commented Dec 5, 2019 via email

@neolit123 neolit123 added this to the v1.26 milestone Aug 25, 2022
@neolit123
Copy link
Member

neolit123 commented Oct 11, 2022 via email

@pacoxu
Copy link
Member

pacoxu commented Oct 11, 2022

Paco, are you seeing the problem on your side with customers?

Not meet this recently.
BTW, this is not a P0 or P1 bug, maybe P2-P4 issue as it happened during cluster maintenance.

It also feels a bit odd to separate the taint and labels. Perhaps we want them to continue to be done by the kubeadm client instead of kubelet.

If we move it to the kubelet, the taint will be back once the kubelet restarts. It would make some users confused if they remove the tainted on purpose. It seems to be a behavior change.

@neolit123
Copy link
Member

If we move it to the kubelet, the taint will be back once the kubelet restarts. It would make some users confused if they remove the tainted on purpose.

that is indeed a concern

@pacoxu
Copy link
Member

pacoxu commented Oct 13, 2022

If we move it to the kubelet, the taint will be back once the kubelet restarts. It would make some users confused if they remove the tainted on purpose.

that is indeed a concern

IMO, it is a design problem of kubelet. One solution that I know is that kubelet should only add the node labels or taint only if it is the first time bootstrap and register to apiserver. After the bootstrap(node is already added to cluster), kubelet should not change its node label or taint itself and it should respect the apiserver/etcd storage. But this is another topic. If so, any cluster installer can add the default taint/labels during joining. It would make no disturbance in later kubelet restarts.

@neolit123
Copy link
Member

neolit123 commented Oct 13, 2022

true, so unless the kubelet changes its design we can play around it by adding the taints once and then removing them from the kubelet flags or local config. i must admit i am not a big fan of this solution as it creates more difficulties for us.

we can continue doing it in kubeadm with a client. perhaps sooner, but as far as i understand there could still be a short period when the taints are not there yet.

IMO this is not a very common issue. ideally users should create the control plane nodes and only then start applying workloads such as daemonsets.

@pacoxu
Copy link
Member

pacoxu commented Nov 3, 2022

To summarize,

  1. Workaround: use --patches when join new control plane. See Move Control Planes taint to kubelet config instead of markcontrolplane phase #1621 (comment).
  2. Problems that should be solved before we implement it.

@neolit123 neolit123 modified the milestones: v1.26, v1.27 Nov 21, 2022
@neolit123 neolit123 modified the milestones: v1.27, v1.28 Apr 17, 2023
@zyriljamez
Copy link

Is this issue resolved now ?

@neolit123 neolit123 modified the milestones: v1.28, v1.29 Jul 21, 2023
@neolit123 neolit123 modified the milestones: v1.29, v1.30 Nov 1, 2023
@neolit123 neolit123 modified the milestones: v1.30, v1.31 Apr 5, 2024
@neolit123 neolit123 modified the milestones: v1.31, Next, v1.32 Jul 3, 2024
@neolit123
Copy link
Member

some update here,

this feature:

missed 1.31, but once we have it we will have an instance-config.yaml with KubeletConfiguration on each node.
with that in place we can start using KubeletConfiguration patches to apply the taints with the instance-config.yaml.

i.e. same as the existing patches workaround:
#1621 (comment)

this would still need a KEP/design doc and a kubeadm feature gate / so that users can/opt-in out for a few releases.

@neolit123 neolit123 added kind/feature Categorizes issue or PR as related to a new feature. kind/design Categorizes issue or PR as related to design. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Jul 3, 2024
@neolit123 neolit123 modified the milestones: v1.32, v1.33 Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/UX kind/bug Categorizes issue or PR as related to a bug. kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
None yet