
Add MinReadySeconds to DCs #7114

Closed
0xmichalis opened this issue Feb 8, 2016 · 33 comments

@0xmichalis (Contributor)

Availability in Kubernetes deployments is dictated by MinReadySeconds. We consider pods available as soon as they become ready, w/o imposing any additional restriction.

    // Minimum number of seconds for which a newly created pod should be ready
    // without any of its container crashing, for it to be considered available.
    // Defaults to 0 (pod will be considered available as soon as it is ready)
    MinReadySeconds int32 `json:"minReadySeconds,omitempty"`

This issue is for tracking this gap.

@ironcladlou @smarterclayton @pweil-

@stevekuznetsov (Contributor)

@smarterclayton do we have executive direction on this gap?

@0xmichalis (Contributor, Author)

I was thinking that we may want to skip this and always default to 0 (consider available when ready).

@smarterclayton (Contributor)

I would like to have this - since I was the one who proposed it to Dan originally.

@0xmichalis (Contributor, Author)

ok

@0xmichalis (Contributor, Author)

@stevekuznetsov I would like to get #6233 in first, and then you could add this.

@0xmichalis (Contributor, Author)

@stevekuznetsov did you start working on this? If not, I would like @mfojtik to take a stab at it.

@stevekuznetsov (Contributor)

I haven't yet, and I'm OOO until next week. @mfojtik thanks!

@0xmichalis assigned mfojtik and unassigned stevekuznetsov on May 19, 2016
@mfojtik (Contributor) commented on May 19, 2016

@Kargakis @stevekuznetsov to make sure I understand this:

  1. We add MinReadySeconds to DeploymentConfig.
  2. When a new deployment is rolled out, every new pod that is started must become ready within ^^ seconds, otherwise we fail the deployment?

@0xmichalis (Contributor, Author)

MinReadySeconds takes effect after a pod is ready. If you have dc.Spec.MinReadySeconds == 0, the deployment process should consider pods available as soon as they become ready. If you have dc.Spec.MinReadySeconds == 5, the deployment process should consider pods available 5 seconds after they become ready. I think you will need changes in the rolling updater.
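
For illustration, the rule described here reduces to something like the following sketch in Go (the `available` helper and its signature are hypothetical; the actual rolling updater code will differ):

    package main

    import (
        "fmt"
        "time"
    )

    // available reports whether a pod that became ready at readySince should be
    // treated as available, given minReadySeconds. Sketch only, not updater code.
    func available(ready bool, readySince time.Time, minReadySeconds int32, now time.Time) bool {
        if !ready {
            return false
        }
        if minReadySeconds == 0 {
            // Default: available as soon as it is ready.
            return true
        }
        return now.Sub(readySince) >= time.Duration(minReadySeconds)*time.Second
    }

    func main() {
        readySince := time.Now().Add(-3 * time.Second)               // ready for ~3s
        fmt.Println(available(true, readySince, 0, time.Now()))      // true: minReadySeconds == 0
        fmt.Println(available(true, readySince, 5, time.Now()))      // false: not yet ready for 5s
    }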

@mfojtik (Contributor) commented on May 19, 2016

@Kargakis right, I get it now, I think.

@stevekuznetsov (Contributor)

What's the utility of this over an intelligent readiness probe?

@stevekuznetsov (Contributor)

This does seem less powerful and more frustrating to use than a readiness probe, anyway.

@0xmichalis (Contributor, Author)

@stevekuznetsov this doesn't replace readiness probes but complements them, since it will take effect as soon as the readiness probes in a pod succeed and the pod is considered ready.

@stevekuznetsov (Contributor)

Why is that useful? Shouldn't we want users to define readiness probes that are actually meaningful instead of "X and then it didn't crash for five seconds"?

@smarterclayton (Contributor)

Just because something is ready doesn't mean that you're ready to move on.


@stevekuznetsov (Contributor)

Isn't the point of the readiness probe to tell you when you're ready to use the pod, when it's available? I think I may be misunderstanding that feature.

@smarterclayton (Contributor)

Readiness probe tells you when to add it to a load balancer. Just because something says it's ready doesn't mean that it might not crash in the next few seconds. The delay allows you to add time for new pods to start accepting traffic in a stable fashion before continuing.


@stevekuznetsov (Contributor)

What's the use of adding it to the target pool of a load balancer if we're not sure it's booted to a stable state and able to service requests?

@smarterclayton (Contributor)

Because how are you going to take requests that tax the new process until you're added to the load balancer?


@stevekuznetsov (Contributor)

So we're dealing with a service that boots fine, and without load would signal ready & healthy forever, but might cave under load but only in the first X seconds after being ready, so we want to expose it to traffic so it has the chance of crashing? What if no traffic happens to be routed to that pod in the first X seconds and it crashes on getting the first packet after X+1 seconds? Sorry, not trying to be obtuse here, but this just seems like a very limited use case to be adding to the API. The decision making part of this PR has already happened, so you can ignore me, but when I'm back in the office I'd really appreciate some help understanding this use case.

@stevekuznetsov (Contributor)

I can see the value of marking the deployment as complete after the pods have been ready, healthy, and have serviced requests, but a seconds-based timeout seems like the most fragile way of doing it. Seems like what you'd really want is a Jenkins pod or whatever running an e2e suite, but the canonical action there is to label the image as successful, not the deployment.

@ironcladlou (Contributor)

I think @stevekuznetsov has convinced me that MinReadySeconds is redundant.

@smarterclayton (Contributor)

It's very, very common for services to crumple under load, or to manifest failures that would result in aborting the rollout.

Imagine the simplest possible deployment - a 2-pod rolling deployment. Dev pushes a change and the rollout starts. The first pod comes up, reports ready, and starts taking traffic. The next time that pod will be checked for readiness is configurable by the user, but defaults to 1m. Imagine the developer has a bug in his caching code - the first user hits it and immediately causes the error to manifest. The readiness check for that pod won't have a chance to report that error for one minute.

In the meantime, the other pod is scaled up and hits the same scenario. As soon as that pod is ready, the deployment is marked successful. 30s later, both pods have failing readiness checks, both are out of rotation, and the app is down. The deployment failed to guard against failure.

MinReadySeconds is necessary to allow the app author to design well-written readiness checks that ensure deployments succeed. Without it there is no way to force deployments to give the readiness check the chance to remain stable.
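
As a concrete sketch of the remedy argued for here - assuming the field lands on DC specs as proposed, a hypothetical dc/myapp, and a 10s readiness probe period - something like the following would set a stabilization window of a few probe periods:

    # Hypothetical once minReadySeconds is exposed on DeploymentConfigs:
    # read the readiness probe period, then set the stabilization window to
    # a few multiples of it (e.g. 3 x 10s = 30s).
    oc get dc/myapp -o jsonpath='{.spec.template.spec.containers[0].readinessProbe.periodSeconds}'
    oc patch dc/myapp -p '{"spec":{"minReadySeconds":30}}'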


@stevekuznetsov (Contributor)

Do we want to be failing this at the deployment level?

I'm not a sophisticated user of OpenShift but I thought that a mature pipeline would look like dev->testing->prod, so these checks should be happening prior to you deploying something to a "production" environment where real traffic is being directed.

Furthermore, there is no guarantee that the deployment you're making will be taking any load in the MinReadySeconds interval, making the check fragile.

As a consumer of OpenShift, I would be very hesitant to use this feature because of that fragility. Hell, if I threw in a

oc start-build my-build
# wait for completion
sleep 100

into our Bash scripts, I'd get smacked and someone would tell me to use os::build::wait_for_start.

@smarterclayton (Contributor)

Yes, we do.


@stevekuznetsov (Contributor)

Can't argue with that :)

@smarterclayton (Contributor)

Think about this as theory vs practice. In theory, you'd have this magic system where everything worked based on edge-driven changes. But the reality is that there's an infinite amount of complexity within the app, and so we ultimately care about levels. One way to help stabilize levels is to ensure that you give the system time to burn in.


@stevekuznetsov (Contributor)

Sure, I just didn't think we were going to be endorsing rolling deployments into production without prior testing as a good idea. This feature simply doesn't make sense if your rolling deployment into production happens only after the success of some e2e tests labels an image as production-ready - with OpenShift it should be trivial to keep your dev/test/prod deployment specs in sync and be confident that your testing will be relevant in that sense.

@smarterclayton (Contributor)

Testing in dev != verification in prod. Testing can only reduce the impact, but if you have no difference between stage and prod, then stage is prod. For instance, stage and prod usually have different secrets. Your readiness check might verify that you can contact a db with your secrets. But your readiness check can't verify that all the code that uses the secret behaves the same when real users are hitting the service, which is why you want to set a stabilization window.

Imagine this case. Your readiness check simulates a user login (expensive, but possible). But can your readiness check simulate 1000 simultaneous users hitting your brand new pod? No.


@stevekuznetsov (Contributor)

I really hope you're not doing your thousand-user stress test in production, and especially that you're not doing it by hoping one thousand users log in in the X seconds after the ready check succeeds...

As to your points about tangible differences in production environments, those make sense. We should doc the usefulness boundaries of this field and maybe even give warnings in oc status if users set the field to be shorter than a minute, or five minutes, or something; otherwise it's just fragile and could give a false sense of security when its utility is really minimal.

@smarterclayton (Contributor)

Definitely the description should include a use case for why you would use this - i.e., it only makes sense to set this to > your readiness interval, and in rough multiples.


@0xmichalis (Contributor, Author)

@mfojtik #6233 is merged so you can start working on this. Note that it's part of https://trello.com/c/Mljldox7/643-8-deployments-downstream-support-for-upstream-features

@mfojtik (Contributor) commented on Jun 27, 2016

UPSTREAM: kubernetes/kubernetes#28111
