Configuration for 0 downtime deployments #660
Comments
Hi Ben, as far as I know, there is no good solution for this under "mode ip" other than increasing the number of pods. The root cause is that the Ingress resource is only aware of the Service, not of the individual Pods. However, to achieve 0 downtime deployments, you can use "mode instance" 😸
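For reference, a minimal sketch of what switching to instance mode looks like via the controller's `alb.ingress.kubernetes.io/target-type` annotation; the Ingress name, Service name, and port below are placeholders, not from this thread:

```yaml
# Illustrative only: an Ingress using instance-mode targets, so the ALB
# routes to the Service's NodePort on the worker nodes instead of Pod IPs.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-app
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    # "instance" registers worker nodes in the target group;
    # "ip" registers Pod IPs directly.
    alb.ingress.kubernetes.io/target-type: instance
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: my-app   # must be a NodePort Service in instance mode
              servicePort: 80
```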
Hey @M00nF1sh, thanks for the pointer! I can confirm that on my end it does indeed look like using the `instance` mode gives 0 downtime deployments.
Sure, PRs are welcome 😸. I talked with @bigkraig; `ip` mode might also support 0 downtime in the future, but we don't know yet how to achieve that. It may need a k8s core change.
Right, it would seem that to do 0 downtime with the `ip` mode, something beyond the controller itself would have to change, as you said. Anyway, I'll look into making a PR to note that `instance` mode is the way to get 0 downtime for now.
Could Pod Readiness Gates help here?
It introduces a `readinessGates` field on the Pod spec, so additional conditions must be `True` before the Pod counts as Ready. At first glance, it looks like a custom PodCondition (e.g. one the ingress controller sets once the target is healthy in the ALB) could be used here.
It seems this would solve the issue of Pods being terminated before they are healthy in the ALB. Edit: actually, I think a normal Deployment rolling update should suffice, with the readiness gate in place.
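To illustrate the idea, here is a minimal sketch of a Deployment wired up with a readiness gate. The condition type `target-health.alb.ingress.k8s.aws/my-app` is an assumed, illustrative name; the ingress controller would be the component responsible for setting it once the target is healthy in the ALB:

```yaml
# Illustrative sketch: Pods declare a readiness gate, so they only count as
# Ready once an external controller sets the custom condition to "True".
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0   # never remove an old Pod before a new one is Ready
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      readinessGates:
        # Assumed condition type; the controller would flip this to "True"
        # only after the target shows up healthy in the target group.
        - conditionType: target-health.alb.ingress.k8s.aws/my-app
      containers:
        - name: app
          image: my-app:latest        # placeholder image
          readinessProbe:
            httpGet:
              path: /healthz          # placeholder health endpoint
              port: 8080
```

With `maxUnavailable: 0`, the rolling update would then only proceed as fast as targets actually become healthy in the ALB.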
Hey folks, I'm having a fair bit of trouble getting 0 downtime deployments to work. The issue:
After a new Pod passes its readiness check, the ALB target group places the Pod's IP in an "initial" state, which can last for a couple of seconds.
However, since the new Pod is ready as far as K8s is concerned, K8s begins terminating an old Pod, which immediately enters a "draining" state in the target group. At this point there are no Pods available to answer requests.
To some extent this can be handled by simply increasing the number of Pods. That isn't really a solution, though; it just lowers the probability that the rolling deployment outpaces the ALB's ability to keep up. If the AWS API were to undergo any kind of delay or outage, the deployment could complete without any live Pods actually registered in the target group.
Is there any known way to require that a Pod show up as "healthy" in the target group before K8s considers it ready?
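As a stopgap, one common mitigation (not proposed in this thread, and with durations that are guesses to be tuned against the target group's health check and deregistration delay) is to slow the rollout down so the ALB can keep pace: a `preStop` sleep keeps a terminating Pod serving while it drains, and `minReadySeconds` delays the rollout after each new Pod turns Ready:

```yaml
# Illustrative mitigation only: slow the rollout so ALB target registration
# and deregistration can catch up. All names and durations are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  minReadySeconds: 30     # wait after a Pod turns Ready before continuing the rollout
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: my-app:latest      # placeholder
          lifecycle:
            preStop:
              exec:
                # Keep serving briefly so the ALB finishes draining the target
                # before the container shuts down (assumes the image has sleep).
                command: ["sleep", "30"]
      terminationGracePeriodSeconds: 60
```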