Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pods which have not "started" can not be "ready" #92196

Merged

Conversation

thockin
Copy link
Member

@thockin thockin commented Jun 16, 2020

Before this commit, containers which have both a startupProbe and a
readinessProbe are marked as ready=false during stratup, but
containers which have only a startupProbe are marked ready=true.
This doesn't make sense.

This commit only considers readiness if the container is considered to
have "started", which leaves ready=false while starting up.

/kind bug

Fixes #89995

Special notes for your reviewer:

I am NOT super familiar with this code area. I dug around to find this and empirically it seems to work.

Does this PR introduce a user-facing change?:

Containers which specify a `startupProbe` but not a `readinessProbe` were previously considered "ready" before the `startupProbe` completed, but are now considered "not-ready".

@thockin thockin requested a review from matthyx June 16, 2020 17:39
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 16, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 16, 2020
@thockin thockin added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Jun 16, 2020
@k8s-ci-robot k8s-ci-robot added area/kubelet and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 16, 2020
@matthyx
Copy link
Contributor

matthyx commented Jun 16, 2020

Thanks Tim. Let me have a look later today...

@thockin
Copy link
Member Author

thockin commented Jun 16, 2020

/retest

@matthyx
Copy link
Contributor

matthyx commented Jun 16, 2020

Tests are fine, which show the poor coverage of these edge cases...
Looks good, but I will add some tests in a follow-up PR.
/lgtm
/hold
Thanks!

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 16, 2020
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 16, 2020
@matthyx
Copy link
Contributor

matthyx commented Jun 16, 2020

  • you can cancel the hold when you are ready.

@matthyx
Copy link
Contributor

matthyx commented Jun 16, 2020

Once merged we should look at the serial e2e test results as there is some coverage there.

@@ -261,6 +249,20 @@ func (m *manager) UpdatePodStatus(podUID types.UID, podStatus *v1.PodStatus) {
started = !exists
}
podStatus.ContainerStatuses[i].Started = &started

if started {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no way container can be ready, but not started, perhaps the status can be stored in a single variable instead of two independent flags?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there are more ramifications... I think it's clearer to keep both even just for the sake of explanation and ease of testing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean in the API or in this function?

The API has shipped, we don't want to change that.

This commit was designed to be surgical - move the code block, add an if. Further cleanup may be possible, but it seems low-value to me - this code is pretty simple to read?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant API =). I commented before looked into it and understand it's seems to be hard to change. The only benefit is guaranteed consistency of the status. This method is ok. But there are more. For instance, this:

containerStatus.Ready = ready
where Ready and Started are set seemingly independently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That case (I think) is after readiness has been considered at a lower level, but I admit I am not 100% confident in this code area any more :(

@thockin
Copy link
Member Author

thockin commented Jun 17, 2020

/retest

verify failed after 2 hours and a LOT of logs

Before this commit, containers which have both a `startupProbe` and a
`readinessProbe` are marked as `ready=false` during stratup, but
containers which have only a `startupProbe` are marked `ready=true`.
This doesn't make sense.

This commit only considers readiness if the container is considered to
have "started", which leaves `ready=false` while starting up.
@thockin thockin removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 17, 2020
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2020
@k8s-ci-robot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@matthyx
Copy link
Contributor

matthyx commented Jun 17, 2020

@thockin what's the difference between your initial commit and the force-pushed one? I don't see it...

@matthyx
Copy link
Contributor

matthyx commented Jun 17, 2020

@SergeyKanzhelev there are 2 states, ready and started, and they represent 2 different notions that are monitored by different probes... started is a permanent state (once startupProbe succeeds it never changes) whereas ready depends on the result of the readinessProbe.

Please refer to the KEP (kubernetes/enhancements#950) or my lightning talk on the subject: https://youtu.be/wO1uy9QKNHQ

@SergeyKanzhelev
Copy link
Member

@SergeyKanzhelev there are 2 states, ready and started, and they represent 2 different notions that are monitored by different probes... started is a permanent state (once startupProbe succeeds it never changes) whereas ready depends on the result of the readinessProbe.

Totally with you on this. My comment was that there are three acceptable states represented with four possible combinations of flags. Which led to this Ready, but not Started issue. And there are seems to be other places which set these flags independently without ensuring that container is not ending up in un-acceptable state.

I'm pretty sure the Ready flag was already defined when Started was introduced. So it wasn't much of a choice for redesigning flags. I made my comment before looking deeper into changes needed to forbid this un-acceptable state in compile time.

Also, great lightning talk.

@matthyx
Copy link
Contributor

matthyx commented Jun 17, 2020

seems to be other places which set these flags independently without ensuring that container is not ending up in un-acceptable state.

Good point, should we create another issue, or put everything on #89995 - in that case this PR should not "fix" it alone.

@heckad
Copy link

heckad commented Jun 17, 2020

Recently, I noticed that pods without a readiness probe continue to receive traffic while they are in terminating status. In this regard, I propose to add new state for startUp probe and make liveness and readiness probs work only for running state. If our pods in terminating or startUp state, they are should be not ready also if readiness probe didn't set. Olso add startUp probe an excellent start to add a new feature which should run depended pods only when pod started. For example, if bd not in startUp state, we don't start pods dependent from DB.

Why is it so important to solve this problem?
Because the readiness probe is responsible not only for sending traffic but also for making a decision when stopping old pods, if we don't have readiness probe then Kubernetes stopping old pods when new just start starting. Because I propose to add a new state and make readiness and licenses probes available only for running pods, not for pods in startUp state or in terminating states.

Example behaviour when pod got kill signal but have ready state

Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-frfdw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-frfdw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason   Age   From                                  Message
  ----    ------   ----  ----                                  -------
  Normal  Killing  24s   kubelet, vmi310368.contaboserver.net  Stopping container backend-order-crud

Thanks for your attention!

@matthyx
Copy link
Contributor

matthyx commented Jun 17, 2020

I noticed that pods without a readiness probe continue to receive traffic while they are in terminating status.

This has always been the case... when you don't specify a readinessProbe, you assume an always ready state.

I propose to add a new state and make readiness and liveness probes available only for running pods, not for pods in starting or terminating states.

This is clearly a different use case than what the startupProbe was designed to solve. If you feel strong enough you can take the point and start a KEP to modify the handling of the shutdown phase...

However, now that we have the startupProbe, people could almost do without a readinessProbe (if they don't care about removing/adding back the pod to the load balancer pool) if that case had been covered. Then it could be a feature (or follow up) of my KEP...

What do you think @thockin (from a philosophical point of view) ?

@heckad
Copy link

heckad commented Jun 17, 2020

This has always been the case... when you don't specify a readinessProbe, you assume an always ready state.

Yes, I propose to change this behaviour. A pod should be ready only in running state and readiness probe should manage ready state also only for running state. In startUp and terminating state pod should be not ready. I think this is bad behaviour when requests can go to pods wich starting work and completing work. How do you think this is a good propose? This propose can solve many problems like this.

@matthyx
Copy link
Contributor

matthyx commented Jun 17, 2020

I think this is bad behaviour when requests can go to pods wich starting work and completing work.

According to this page, it should not be the case: https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods

Even if the pod is still "ready" no new traffic should go to it... so if it's something you can reproduce I suggest you fill a bug report (which is something I can help you with).

@heckad
Copy link

heckad commented Jun 17, 2020

Even if the pod is still "ready" no new traffic should go to it...

What mean "ready" for pods in termination state?

@matthyx
Copy link
Contributor

matthyx commented Jun 17, 2020

What mean "ready" for pods in termination state?

I need to check in the kubelet code...

@matthyx
Copy link
Contributor

matthyx commented Jun 17, 2020

/test pull-kubernetes-kubemark-e2e-gce-big
/test pull-kubernetes-e2e-kind
/test pull-kubernetes-verify

@thockin
Copy link
Member Author

thockin commented Jun 17, 2020

It was just a rebase - trying to unstick tests

@thockin thockin added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2020
@dims
Copy link
Member

dims commented Jun 17, 2020

/hold Temporary hold to get prow/tide to get back on its feet. Feel free to remove hold in a few hours.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 17, 2020
@dims
Copy link
Member

dims commented Jun 17, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 17, 2020
@k8s-ci-robot k8s-ci-robot merged commit 78b503d into kubernetes:master Jun 18, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Jun 18, 2020
k8s-ci-robot added a commit that referenced this pull request Jul 9, 2020
…6-upstream-release-1.18

Automated cherry pick of #92196: Pods which have not "started" can not be "ready"
@thockin thockin deleted the startup-probe-blocks-readiness branch August 2, 2021 04:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected startupProbe behavior.
6 participants