KEP: Node Readiness Gates #1003
Conversation
type Node struct {
…
Spec:
SystemExtensions []SystemExtension
I'll probably follow the same naming scheme as pod readiness gates and change this to something like
Spec:
ReadinessGates: []NodeReadinessGate
Thoughts @vishh?
The rationale for coming up with a generic name was to facilitate other use cases like identifying system addons for introspection/monitoring. If the naming scheme is too confusing, I don't mind changing it to be more specific.
I agree with @andrewsykim - Normalization helps. This would avoid confusion for users.
/sig scheduling for visibility
Thank you for your KEP.
/assign
The initial draft looks good. I'd say try implementing a prototype to ensure the design is implementable and then come back to refining this KEP further.
One of the goals for this proposal is to evaluate the viability and usefulness of combining Status and Health for extensions.
### Approach B - Use Readiness of Node extension pods (Preferred)
nit: Place this ahead of the previous approach.
I'm thinking of adding more detail/content to the preferred approach (approach B) and condensing the other ones under an "Alternatives" section. Outlining all the other approaches in as much detail as they have now makes the KEP difficult to read IMO. What do you think?
SGTM
Node selector will take into account the `Readiness` of all the pods that match the selector(s) specified in the Node Spec `SystemExtensions` field to determine Node Readiness.
nit: s/Node Selector/Node Controller/g
TODO

## Design Details
We also need to include a Production Maintenance section that talks about the health metrics and debugging tips needed to safely roll out and manage this feature in production.
Left some comments/suggestions
}

type SystemExtension struct {
Selector LabelSelector
Why?
I take back what I said, this would make the kubelet watch `priorityClass`.
type SystemExtension struct {
Selector LabelSelector
Namespace Namespace
AffectsReadiness bool // Ignore non-critical node extensions from node readiness
Is there a use case behind `AffectsReadiness`? (i.e. why reference a set of pods on the `nodeReadinessGate` if they do not affect node readiness)
The original use case was to identify all system pods on a node, including the ones whose health should not impact node health (like non-critical monitoring services, for example).
Node selector will take into account the `Readiness` of all the pods that match the selector(s) specified in the Node Spec `SystemExtensions` field to determine Node Readiness.

The individual system extensions are required to detect kubelet crash (or restart) quickly and reflect that in their Readiness.
If the Node Controller observes that all the system extension pods are Ready after the kubelet places the special meta readiness taint, it will remove that taint immediately.
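For concreteness, a rough sketch of the taint removal described above, assuming the "kubernetes.io/MetaReadiness" taint key that appears later in the KEP's journey section and the standard core/v1 types; the function and constant names here are illustrative, not part of the KEP:

```go
package sketch

import v1 "k8s.io/api/core/v1"

// Illustrative taint key; the journey section later in this thread spells it
// "kubernetes.io/MetaReadiness", and the name is still under discussion.
const metaReadinessTaintKey = "kubernetes.io/MetaReadiness"

// removeMetaReadinessTaint strips the readiness taint from a node so regular
// workloads can be scheduled. The controller would call this only after it has
// observed that every system extension pod matched by the node's
// SystemExtensions selectors reports Ready.
func removeMetaReadinessTaint(node *v1.Node) {
	kept := make([]v1.Taint, 0, len(node.Spec.Taints))
	for _, t := range node.Spec.Taints {
		if t.Key != metaReadinessTaintKey {
			kept = append(kept, t)
		}
	}
	node.Spec.Taints = kept
}
```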
s/Node Controller/NodeLifecycle Controller/g
How can we tolerate a subset of conditions?
What is the timeline for this? I want to review, but am buried.
@thockin enhancement freeze for v1.15 is tomorrow so I think this will have to wait until v1.16. I'll poke you for a review when it's at a better state and when we all have a bit more bandwidth :)
How to solve the race condition? Some critical pods might not have been created on the node before the kubelet or controller judges the node "Ready".
@DaiHao with the current proposal, existing node readiness will allow system pods to be scheduled, whereas application pods will be scheduled only after "meta readiness" is satisfied.
Handling of kubelet restarts can be tricky though since there is a chance that the extension pods may not reflect their connections with the kubelet soon after the kubelet restarts.
### Approach C - Use conditions
Can we also contrast using taints? As I understand it, Conditions was the plan, then we realized we needed a mechanism to tolerate certain Conditions, and so we introduced taints & tolerations (which are also generally useful).
Yes, taints will be used in the current preferred approach.
### Approach B - Use Readiness of Node extension pods (Preferred)
This approach assumes that all node-level extensions will be deployed as Kubernetes pods and that the health of those extension pods can be exposed via existing Readiness probes.
So it sounds like the model envisaged is pod-per-Node. I do think the selector would enable selection of a shared pod, but it isn't clear that a pod could express that some subset of nodes are bad and some are good.
I think anything that programs infrastructure would end up needing a "proxy pod" per node?
Yes, in an ideal world all critical system daemons would be run as pods (including k'let :) ). Until that happens, there can be placeholder pods that reflect the health of system daemons not run as pods for the purposes of monitoring/introspection.
...
}

type SystemExtension struct {
Presumably you need a merge-key in here (e.g. `name`)?
Thanks for the feedback, will be updating this KEP based on comments shortly.
The `ReadinessGates` field can be used by external components to dictate node readiness since they can apply an arbitrary readiness gate with a reserved label selector
that should not be used by any pods on the cluster. A common use case here is an external cloud provider that wants to ensure no pods are scheduled on a node until
networking for a given node is properly set up by the cloud provider. In this case, the cloud provider would apply any custom readiness gates to a node during registration
and a separate controller would remove the readiness gate when it sees that node's networking has been properly configured. No pods matching the label selector of that
This is kind of totally a gross hack. The cloud provider tells the node "I won't be ready until event X happens", knowing full well that event X will never happen, and then just says "never mind, forget I said that" when it's ready. It's not using the defined mechanism, it's abusing it.
And if it has permission to modify the Node object anyway, couldn't it just apply and then remove a taint directly rather than fiddling with the readiness gates?
> And if it has permission to modify the Node object anyway, couldn't it just apply and then remove a taint directly rather than fiddling with the readiness gates?
There's time between when the cloud provider can apply that initial taint and when the node is initialized. It's true that a cloud provider can apply custom taints to achieve the same behavior but it will always race against the scheduler for pods unless we add the gate during initialization.
> This is kind of totally a gross hack. The cloud provider tells the node "I won't be ready until event X happens", knowing full well that event X will never happen, and then just says "never mind, forget I said that" when it's ready. It's not using the defined mechanism, it's abusing it.
I have mixed feelings about this... yes it's not exactly using the feature the way it is fully intended but it satisfies most of the use-cases around readiness - which is waiting for system critical pods (usually DaemonSets) - while also allowing for the external cloud provider case.
I'm not opposed to a cloud provider specific integration for this where we allow the cloud provider to do initialization in 2 steps - one for node addresses, topology, etc and another for node readiness - but I would rather have a generic solution that fixes this for many other use-cases.
A readiness taint will be used to track the overall readiness of a node. The kubelet, node lifecycle controller, and other external controllers will
manage that taint.

The Node object will be extended to include a list of readiness gates like so:
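The code block that followed "like so:" is not part of this diff excerpt; the sketch below reassembles it from fragments quoted elsewhere in this thread (the `ReadinessGates` slice on the node spec, the per-gate `Name` field, and a per-gate pod label selector). The exact field set and tags are assumptions, not the KEP's verbatim text.

```go
package sketch

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Reconstructed sketch assembled from fragments quoted in this review; not verbatim KEP text.
type NodeSpec struct {
	// ... existing NodeSpec fields ...

	// Defines the list of readiness gates for a node
	// +optional
	ReadinessGates []NodeReadinessGate `json:"readinessGates,omitempty" protobuf:"bytes,7,opt,name=readinessGates"`
}

// NodeReadinessGate indicates a set of pods that must be ready on a given node
// before the node itself is considered ready for regular workloads.
type NodeReadinessGate struct {
	// The name of the node readiness gate
	Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
	// Selector for the pods whose readiness this gate tracks (assumed shape;
	// the text references a per-gate label selector but the excerpt does not show the field).
	Selector *metav1.LabelSelector `json:"selector,omitempty"`
}
```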
This allows for having completely unique readiness gates for each node, but it seems much more likely that you'd have a small set of distinct node types, where each type had its own readiness gates. And each node type would probably already be distinguishable by its labels. So having something that selects Nodes to apply readiness gates to them seems like it would be easier to manage than something that requires modifying the Nodes (especially since to avoid race conditions, the thing setting the `node.Spec.ReadinessGates` basically has to be in kubelet itself, whereas if you do it the other way, you can create the selectors well before creating the nodes).

And in fact, it seems like really we already have a thing that selects nodes that need special initialization: DaemonSets. You could just add the `BlocksReadiness` field to `DaemonSetSpec`, defaulting to `false`. And then a node would be "meta-ready" when all readiness-blocking DaemonSets that selected that node had successfully scheduled their pods to that node and the pods reported being ready.
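A minimal sketch of the DaemonSet-level alternative floated here; `BlocksReadiness` is a hypothetical field for discussion, not an existing Kubernetes API.

```go
package sketch

// Hypothetical extension of the apps/v1 DaemonSetSpec, per the suggestion above;
// not an actual Kubernetes API field.
type DaemonSetSpec struct {
	// ... existing DaemonSetSpec fields ...

	// BlocksReadiness marks this DaemonSet as node-readiness-blocking: a node
	// selected by this DaemonSet is not "meta-ready" until the DaemonSet's pod
	// on that node has been scheduled and reports Ready.
	// +optional
	BlocksReadiness bool `json:"blocksReadiness,omitempty"`
}
```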
> And in fact, it seems like really we already have a thing that selects nodes that need special initialization: DaemonSets. You could just add the BlocksReadiness field to DaemonSetSpec, defaulting to false. And then a node would be "meta-ready" when all readiness-blocking DaemonSets that selected that node had successfully scheduled their pods to that node and the pods reported being ready.
I like this idea but I do think there should still be a way to set the readiness gate outside of DaemonSets also. Maybe for an initial implementation we can only allow readiness gates from cloud providers and from DaemonSets and if other use-cases come up we can add support later?
Should something like `blocksReadiness` on a DaemonSet be allowed only on namespaces like `kube-system`? With this, any user with access to DaemonSets could prevent pods from scheduling on the entire cluster.
Ah... yeah, there would need to be some sort of protection there, but it can't be as simple as "limited to `kube-system`". I'm not sure what approach would be consistent with other features. ("Limited to namespaces with the annotation `X`"?)
// The name of the node readiness gate
// +optional
Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
// Whether a node is ready based on it's corresponding readiness gates
"its". But also, "gates" plural makes it sound like this is the aggregated ready state, but it's not. "Whether the node is ready based on this readiness gate
".
And neither field should be +optional
right?
Good catch, updated.
One limitation in this design is that all system critical pods that should gate the readiness of a node must be updated to tolerate the readiness taint. Otherwise,
those pods cannot be scheduled on a node since the readiness taint would already be applied by the kubelet on registration. If users forget to apply this toleration,
they may accidentally gate the readiness of their nodes, preventing workloads from being scheduled there.
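For illustration, the toleration such a readiness-gating pod (or its DaemonSet pod template) would need, assuming the "kubernetes.io/MetaReadiness" / NoSchedule taint described in the journey section below; this is a sketch, not text from the KEP.

```go
package sketch

import v1 "k8s.io/api/core/v1"

// Toleration a node-readiness-gating pod would carry so it can still be scheduled
// while the readiness taint is present on the node. The key follows the KEP's
// journey section; the naming is still debated elsewhere in this thread.
var metaReadinessToleration = v1.Toleration{
	Key:      "kubernetes.io/MetaReadiness",
	Operator: v1.TolerationOpExists,
	Effect:   v1.TaintEffectNoSchedule,
}
```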
It seems as though any pod that blocks node readiness would almost necessarily have to tolerate all taints... Otherwise, e.g., running low on disk space would cause all the node-readiness-gating pods to be evicted, which would then cause the node to become meta-unready, which would then cause most of the other pods to be evicted (even if they explicitly indicated that they can tolerate low disk space).
IME system critical pods tend to already tolerate all taints - but agreed this can be unexpected in many cases. If we specify readiness gate at the DaemonSet level like you suggested, we could set the right toleration when the DaemonSet is created? That doesn't sound ideal either because a DaemonSet would end up tolerating taints it shouldn't.
> If we specify readiness gate at the DaemonSet level like you suggested, we could set the right toleration when the DaemonSet is created?
Or just require the user to create it, and reject it in validation if they didn't.
> That doesn't sound ideal either because a DaemonSet would end up tolerating taints it shouldn't.
Maybe there are use cases for a node-readiness-blocking pod to tolerate some but not all taints. Anyway, the validation check could just check that it tolerates at least the meta-ready taint, rather than that it tolerates all taints.
Validation for tolerations sgtm
The following journey attempts to illustrate this solution in a bit more detail:

1. Kubelet starts up on a node and immediately taints the node with a special “kubernetes.io/MetaReadiness” taint with an effect of “NoSchedule”, in addition to flipping its “Ready” condition to true (when appropriate).
Maybe "ApplicationReadiness" rather than "MetaReadiness"?
and not NodeReadiness?
@andrewsykim do you think the KEP could be approved for the next release?
First, I agree with the premise of this. I have many thoughts about the design.
Second, it might be nice to enumerate even more concretely the use cases. Which things that we currently do would be replaced by this? Which things that people WANT to do would this enable? The scope of the list will help justify the complexity tradeoffs.
It's not quite like pod readiness, which uses a list of condition names. Why not? E.g. Every component registers a condition name in node.spec.readinessGates. Kubelet applies a taint until all conditions are present and True. This would force daemonsets to have a cluster-controller that tests whether that DS is present and ready on a given node and update the condition. Kind of bleh. As a refinement, each readiness gate could be a condition name OR a pod selector (or the name of a daemonset, which already has a selector).
I don't think modifying readiness gates on the fly for non-daemonset conditions is going to fly past API review.
As an alternative, why do we need more than taints? Let each readiness gate apply a taint on the node. NetworkNotReady, LoggingNotEnabled, etc. Why is that not sufficient?
Is this just about Node startup or are we talking about a real-time signal? E.g. is it a one-way, latched transition (node starts unready and becomes ready) or bi-directional (continuously evaluated)? If a critical daemonset becomes unready, does the taint come back? What happens to running pods which are, presumably, malfunctioning now?
It's much simpler if we assume, like pod readiness, that it's a "birth time" one-way flip. Can we get away with that?
I have to admit - you lost me with the discussion of special leases.
Is this something we want to press on in the new year? Maybe we should do a realtime discussion on it?
Hi @thockin thank you for having a look at this KEP. We are very interested in this feature (we are currently working on using https://github.com/uswitch/nidhogg, which works fine but requires some work to manage the scale we run at). A few example use-cases for us:
InitContainers failing is not a very big problem because they work after a few retries. The other 2 are harder and several applications implement retry logic to address this (which of course they will have to do anyway in case the pod providing the service fails during the application lifetime). The hardest scenario is missing cloud credentials because most SDKs will consider they don't exist and will continue without looking for them (if such a pod fails later, it's less of a problem because credentials are valid for some time and SDKs will retry). Having a solution for node start-up only would address most of our problems.
ReadinessGates []NodeReadinessGate `json:"readinessGates,omitempty" protobuf:"bytes,7,opt,name=readinessGates"`
}

// NodeReadinessGate indicates a set of pods that must be ready on a given node before the
Since the selector may select more than one pod, what is the expected behavior if not all of those selected pods are ready?
Is the gate ready if at least one of the selected pods is ready?
Or if all of the selected pods are ready?
If it's all, then this requires first calculating the set of system-critical daemonset pods that are expected to be scheduled on the node, before trying to evaluate the node readiness. And kubelet can't do that, only the scheduler can. So the all semantics would require the cooperation of the scheduler to determine node readiness.
In any case, this has to be clarified here.
}

For each `NodeReadinessGate`, kubelet will fetch the state of all pods that match its label selector and take the readiness state of those pods into account.
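As one concrete reading of that sentence, the per-gate evaluation in the kubelet might look like the sketch below. Whether a gate needs all matching pods ready or just one, and what an empty match means, are exactly the questions raised in the comments here, so the sketch flags them rather than settling them; the helper names are illustrative.

```go
package sketch

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// gateSatisfied evaluates one readiness gate against the pods the kubelet
// currently knows about on this node.
func gateSatisfied(gate *metav1.LabelSelector, podsOnNode []v1.Pod) (bool, error) {
	sel, err := metav1.LabelSelectorAsSelector(gate)
	if err != nil {
		return false, err
	}
	matched := 0
	for i := range podsOnNode {
		p := &podsOnNode[i]
		if !sel.Matches(labels.Set(p.Labels)) {
			continue
		}
		matched++
		if !isPodReady(p) {
			// Assumes "all matching pods must be Ready"; whether "any" would
			// suffice is an open question in this review.
			return false, nil
		}
	}
	// Also open: is a gate with zero matching pods satisfied? Here we say no,
	// since the expected pod may simply not have been scheduled yet.
	return matched > 0, nil
}

func isPodReady(p *v1.Pod) bool {
	for _, c := range p.Status.Conditions {
		if c.Type == v1.PodReady {
			return c.Status == v1.ConditionTrue
		}
	}
	return false
}
```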
I don't think kubelet alone can determine the readiness, because readiness will depend on the set of system-critical daemonset pods that are to be scheduled on the node, and that is determined by the scheduler.
So I think this requires cooperation from the scheduler.
Otherwise, kubelet may declare the node ready even though a critical daemonset has not yet been scheduled to run on the node.
...
// Defines the list of readiness gates for a node
// +optional
ReadinessGates []NodeReadinessGate `json:"readinessGates,omitempty" protobuf:"bytes,7,opt,name=readinessGates"`
If we ignore the implementation details for a moment, this API is not necessary.
The set of pods to wait for is already clearly specified by users: it is the set of pods with a `system-cluster-critical` or `system-node-critical` `priorityClass`. We shouldn't need any additional input.
I feel that this `ReadinessGates` API would just be used to replicate the information already specified in `priorityClass`: if a pod has a critical `priorityClass` it should be matched by a `ReadinessGates`, and vice-versa I can't think of a case where a pod matched by a `ReadinessGates` shouldn't have a critical `priorityClass`.
It seems that introducing `ReadinessGates` is driven by the desire to implement this 100% in `kubelet`. However, if the implementation involved the scheduler to determine the set of pods with a critical `priorityClass` to be scheduled on the node, that new API wouldn't be necessary.
I discussed this with @vishh. I was making the assumption that we'd have to deal with only one scheduler, because I was considering only daemonset pods. But if we want to deal with pods other than daemonsets, which are scheduled by different schedulers, then this becomes complicated because all schedulers would need to support this logic.
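For concreteness, the priorityClass-based alternative raised above might reduce to a filter like the sketch below; `system-node-critical` and `system-cluster-critical` are the built-in critical priority class names, and the function name is illustrative.

```go
package sketch

import v1 "k8s.io/api/core/v1"

// gatesNodeReadiness reports whether a pod would gate node readiness under the
// priorityClass-based alternative: no new API field, just the existing built-in
// critical priority classes.
func gatesNodeReadiness(p *v1.Pod) bool {
	switch p.Spec.PriorityClassName {
	case "system-node-critical", "system-cluster-critical":
		return true
	}
	return false
}
```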
This needs to be fleshed out more, I'll try to get to it in the next week or so but no guarantees. If other folks want to contribute please reach out :)
I've been steeping on this for a bit and I think a "conditions/taints hook" from the cloud controllers is sufficient for the use-cases we need (at least for the cloud providers). The most common use-case I hear is preventing nodes from accepting new pods before the cloud provider registers routes for that node. We can allow cloud providers to add a "no routes" taint (or use the existing NetworkUnavailable node condition) assuming the routes controller will remove it later. Worth noting that the GCP provider does this today but via the kubelet. We should make this functionality available to all the external providers. I will open a separate KEP for this use-case. Cloud providers aside, it seems like there are other valid use-cases (e.g. waiting for system critical pods) worth considering, but all of these can be addressed today by passing a custom taint to kubelet (via
/close
@andrewsykim: Closed this PR.
Signed-off-by: Andrew Sy Kim kiman@vmware.com
Co-authored-by: Vish Kannan vishnuk@google.com
Porting over the KEP for node readiness gates that @vishh worked on here kubernetes/community#2640. I'll be updating it with his guidance in the next few days so I wanted to open a WIP PR in case folks wanted to provide early feedback.
/sig node
cc @vishh @chenk008 @yastij