pod termination might cause dropped connections #2366
@nirnanaaa In my mind, it can be handled in a protocol-specific way by the client/server, if you have control over the server's code.
Once we implement the above, we only need to set a large terminationGracePeriodSeconds. I have a sample app implementing this: https://github.com/M00nF1sh/ReInvent2019CON310R/blob/master/src/server.py#L24
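As a sketch of that approach (all names and values below are illustrative, not taken from the linked sample), the pod-spec side pairs a generous grace period with a readiness probe that the app starts failing after SIGTERM, while it keeps serving in-flight and newly arriving requests:

```yaml
# Hypothetical Deployment fragment: the app keeps serving after SIGTERM
# but starts failing its readiness probe, while a large grace period
# leaves time for the LB to drain before the pod is killed.
spec:
  terminationGracePeriodSeconds: 300   # must cover deregistration + drain time
  containers:
    - name: server                     # illustrative name
      image: example.com/server:latest # illustrative image
      readinessProbe:
        httpGet:
          path: /healthz               # app returns non-2xx here once SIGTERM arrives
          port: 8080
        periodSeconds: 5
```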
Hey @M00nF1sh, thanks for your input. I fear this is exactly what I thought was supposed to fix it. But thinking about it: if the server does all of this, what prevents the LB from still sending traffic to the target, even for a brief period? Unfortunately we're hitting this exact scenario quite often.
Am I the only one who sees this as a problem? A preStop hook is not very reliable IMO, as it's just eyeballing the timing issue.
We also ran into this issue. The preStop hack generally works, but it's still a hack, and it still seems to fail to synchronize the "deregistration before SIGTERM" properly. Even with this in place, we've seen intermittent cases where the container still gets a SIGTERM before the target is deregistered (possibly because the DeregisterTargets API call is getting throttled). This is a pretty serious issue for anyone trying to do zero-downtime deploys on top of Kubernetes with ALBs.
@nirnanaaa For this case, a sleep is indeed needed, since we don't have any information available on when the LB actually stops sending traffic to targets. (Even when the targets show as draining after the controller made the DeregisterTargets call, it still takes time for the change to actually propagate to ELB's data plane.)
I was just wondering whether this should be solved at a lower level; that's why I also opened an issue on k/k. This is not something limited to this controller (although in-cluster LBs are less likely to have this issue, it's still present). And you're absolutely right: client-side retries would probably solve this. There are even some protocols like gRPC which could work around this problem, but the truth is that we cannot really control what's being run on the cluster itself, hence also my doubts about the use of sleep. I thought about maybe having something more sophisticated.
This is indeed a problem that exists in Kubernetes generally, not just with ALB. We ran into this a lot using classic load balancers in proxy mode to nginx ingress with externalTrafficPolicy: Local to get real IPs. Using the v2 ELBs with IP target groups is a big improvement over the externalTrafficPolicy: Local mechanism, which requires health-check failures to get the node out of the instance list. That being said, this can still happen. Sleeps and preStop hooks are really the only game in town. I'm not aware of any kind of preStop gate like the readiness gate this controller can inject. Is there a community binary that does this already OOTB? If not, that'd be a good little project. Another alternative is to use a reverse proxy like ingress-nginx as the target for your ALB ingress instead of your application. Then the container lifecycle events for your ALB targets will be much, much less frequent.
@BrianKopp we've thought about running a sidecar which provides a statically linked binary via a shared in-memory volume that the main container could use as preStop. I just couldn't come up with a generic solution to check for "is this pod still in any target group" without either running into API throttling (and potentially blocking the main operation of the aws-lb-controller) or completely breaking single-responsibility principles.
IIRC, a sidecar for this sort of thing is a trap, since a preStop hook delays the SIGTERM for its own container only, not all containers. Your HTTP container would get its SIGTERM immediately. I've actually begun thinking about starting a project to address this. My thought is to have an HTTP service inside the cluster. What do you think? Is this worth making a thing?
Oh, I was not talking about processing the SIGTERM in the sidecar. If you read closely, I only spoke about providing the binary ;) For our use case, even a simple HTTP query will not work unless done through some statically linked binary, since we cannot even rely on libc being present.
OK, I see what you were suggesting. If the requirement is that we cannot place any runtime demands on the HTTP container, then yes, we would need some kind of sidecar to be injected to provide that functionality, along with a mutating webhook to add the container and the preStop hook in case one wasn't already present. That part seems like a bit of a minefield. I was thinking about putting together something that would work for now, until such a solution is possible, if at all. If one didn't want to place any dependency requirements on the main HTTP container (e.g. curl), a preStop hook could wait for a signal over a shared volume from a lightweight curling sidecar.
I think this is best addressed by extending the Pod API. Yes, that's not a trivial change. However, the Pod API is what the kubelet pays attention to. If you want the cluster to hold off sending SIGTERM, then there needs to be a way in the API to explain why that's happening. This controller doesn't have the levers to pull to make things change the way we'd need. Another option is to redefine SIGTERM to mean "get ready to shut down but keep serving". I don't think that's a helpful interpretation of SIGTERM, though.
Question for those who've dealt with this: would you agree that a correct configuration to handle this issue looks like this?
I think that's correct. This is generally what we've used in Helm charts:

```yaml
# ingress
alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds={{ .Values.deregistrationDelay }}
# pod
terminationGracePeriodSeconds: {{ add .Values.deregistrationDelay .Values.deregistrationGracePeriod 30 }}
command: ["sleep", "{{ add .Values.deregistrationDelay .Values.deregistrationGracePeriod }}"]
```
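For concreteness, with illustrative values of deregistrationDelay=60 and deregistrationGracePeriod=30 (these numbers are assumptions, not from the chart), that template would render to roughly:

```yaml
# Rendered example with illustrative values
# (deregistrationDelay=60, deregistrationGracePeriod=30)
alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=60
terminationGracePeriodSeconds: 120   # 60 + 30 + 30s of headroom
command: ["sleep", "90"]             # 60 + 30, used as the preStop command
```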
For NLBs in IP mode (everything should be similar for other LBs and modes):
That's an interesting finding. Is there some documentation you can point to that highlights this (assuming it's on the AWS side)?
I believe it is all on AWS's side, yes. No, I couldn't find any documentation about this. With all due respect to AWS engineers, their ELB documentation is horrible, missing a lot of important information. I found this by writing some programs that use raw TCP connections (so I could monitor every aspect) and manually triggering deregistration in various ELB configurations to record the timing. It consistently took 2 minutes.
In your testing, did the target group show the IP as draining while it was still receiving packets? Did it receive packets after the target dropped out of the target group completely? I've got a project I'm working on to add a delay in the preStop hook that waits until the IP is out of the target group. Would that be helpful here?
Yes, it did correctly show the IP on the target list as draining as soon as I requested deregistration. And when it was removed from the target list, it also stopped receiving new connections. So, if a tool monitored the target list and used that to delay sending SIGTERM to a pod until the pod is out of the target list (e.g. using preStop), that would solve the problem. It would be the opposite of a readiness gate. One difficulty is that AWS has some restrictive rate limits, so depending on your scale, I don't think you can just have every pod hitting the API in its preStop hook.
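A rough sketch of that kind of delay (assuming the aws CLI is present in the image, the pod has IAM permission for elasticloadbalancing:DescribeTargetHealth, and TARGET_GROUP_ARN / POD_IP are injected via the environment; these names are illustrative) could use the CLI's built-in waiter in a preStop hook:

```yaml
# Hypothetical preStop hook: block until this pod's IP has left the target group.
# The "target-deregistered" waiter polls DescribeTargetHealth, so at scale this
# runs into the API rate limits mentioned above.
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        - aws elbv2 wait target-deregistered --target-group-arn "$TARGET_GROUP_ARN" --targets Id="$POD_IP"
```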
We were told this by AWS support:
Sleeping for 180s (3m) still hasn't been reliable for us though, so we're currently at 240s (4m) 😭
We've actually started sharding our services across multiple AWS accounts/EKS clusters just to lower the number of pending/throttled API requests and to increase the speed at which each controller can operate. But then again, the probability of this error happening is still not zero. On a new AMI release we frequently experience dropped packets (even with X seconds of sleep as preStop).
I think the controller should automatically add a preStop hook to pods, which waits until the controller indicates it is safe to start terminating. That would nip this in the bud once and for all.
We came across this issue and found ours to be a combination of the Cluster Autoscaler (CA) and the Load Balancer Controller working in tandem when externalTrafficPolicy was set to Cluster. We'd have nodes getting ready to be scaled in (being pretty much empty except for kube-proxy), then the CA would kick in, taint the node, and terminate it well before it was deregistered by the LB Controller. This would result in in-flight requests being dropped from kube-proxy to another node (that ran our HAProxy pods). It appears now though that the latest CA introduces a flag for this. This would still require adequate pod-termination handling for whatever service gets the traffic, but most handle that perfectly well already. It seems like this is a pretty prevalent problem for what I assume is a very popular setup. If it does fix the issue, we really should call out in the documentation that administrators need to be cognizant of the Cluster Autoscaler and LB Controller potentially dropping requests in externalTrafficPolicy: Cluster mode.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
It doesn't seem to be working though, as the pod dies immediately instead of shutting down gracefully; we had to remove it.
This automatic preStop hook was a suggestion for what they should implement; it isn't implemented yet.
If you have a preStop hook that waits for shutdown, then it was not added by the controller.
When I say "wait for the shutdown" I just mean a simple `sleep`.
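That plain-sleep pattern, as discussed throughout this thread, is just a fixed pause sized by guesswork. A minimal sketch (the 120s is an illustrative guess at drain time, not a value from this thread):

```yaml
# Fixed-sleep workaround: SIGTERM is delayed by the preStop sleep, so
# terminationGracePeriodSeconds must exceed the sleep duration.
terminationGracePeriodSeconds: 150
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 120"]
```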
I can't think of a good way for the LBC to communicate that to a pod.
I think the issue here is just with long-lived connections, because the controller does not sever them when the target goes into draining mode, so clients keep reusing them even after the sleep/grace period is over, hence the errors. I don't know if there is a way to force the long-lived connections to be severed when entering draining mode, so that when the clients reconnect to the ingress, it would not pick the pods in Terminating state.
@Ghilteras no, the pod knows about its long-lived connections, is perfectly capable of closing them itself, and knows when they have gone away. The problem is the new connections that keep coming in from the load balancer. The pod does not know when the load balancer has finished deregistering it, and thus when there will no longer be any new incoming connections.
I'm no longer working at the company where I needed this day-to-day. However, we would have been willing to deal with a lot of setup, including service accounts etc., in order to have an automated fix for this bug. It was a major pain point. As you can see in these comments, it is also hard for many people to understand, and therefore hard to work around.
How do you close and re-open a gRPC connection from the client to force it to dial another, non-Terminating pod?
@Ghilteras You may be looking at the wrong GitHub issue. This issue is about a case where AWS load balancers can send new requests to terminating pods that have already stopped accepting new requests. NGINX is not involved. Edit: oh, I see, you mentioned the nginx ingress controller having the same problem. This issue is regarding connections from a load balancer to server pods. If you're using gRPC to connect to a load balancer, NGINX or an ALB, then that load balancer is intercepting the connections and doing the load balancing. So there isn't really anything you can do on the client side to fix this.
@Ghilteras I'm saying the terminating server pod is capable of shutting down the long-lived connection.
I don't see any way for this ingress controller (or others; as the OP said, this is not an issue specific to this controller) to be capable of shutting down long-lived connections. If you are aware of a way of doing this, please share.
I wasn't talking about this ingress controller shutting down the connection.
The gRPC server is already sending GOAWAYs; the issue is that when the clients reconnect, they just end up going back to the Terminating pod, which should not be getting new traffic through the ingress, but it does, and that's when we see the server throwing hundreds of errors.
@Ghilteras and thus the problem is with new connections coming in from the load balancer, as I previously stated.
For anyone still struggling with the interaction between NLB and ingress-nginx-controller, this is the setup that seems to best mitigate the issues for us for HTTP traffic. ingress-nginx chart values snippet:
Also, with this setup, rolling updates are reliable enough, albeit really slow, mostly due to having to account for #1834 (comment). Hope it helps someone, and if anyone detects any issues, improvements, or recommendations, it would be good to know. P.S.: I have yet to play around with pod readiness gates. Other issues referenced:
Thanks @Roberdvs, this is huge. I just came across the problem in my cluster while testing an upgrade, and I was able to fix it within an hour by referencing your work. I have been running statuscake.com with 30-second checks to help verify your fix works as expected. Prior to the change I had a minute or two of downtime. After the change, I appear to be at a real 100% uptime. For anyone else coming across this, I implemented the following. I am using the default NLB deployment method, given I don't have the AWS LB Controller installed. Please see @Roberdvs's comment above if you are using the LB Controller.

```yaml
controller:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    type: "LoadBalancer"
    externalTrafficPolicy: "Local"
  # https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/2366
  # https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/2366#issuecomment-1788923154
  updateStrategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  # Add a pause to make time for the pod to be registered in the AWS NLB target group before proceeding with the next
  # https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/1834#issuecomment-781530724
  # https://alexklibisz.com/2021/07/20/speed-limits-for-rolling-restarts-in-kubernetes#round-3-set-minreadyseconds-maxunavailable-to-0-and-maxsurge-to-1
  minReadySeconds: 180
  # Add sleep on preStop to allow for graceful shutdown with AWS NLB
  # https://github.com/kubernetes/ingress-nginx/issues/6928
  # https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/2366#issuecomment-1118312709
  lifecycle:
    preStop:
      exec:
        command: ["/bin/sh", "-c", "sleep 240; /wait-shutdown"]
```

Note: I removed

My versions:
Was thinking about this again... has anyone thought about modifying the ALB controller to set a finalizer on the pods within the TG, and using that as a hook to prevent termination until the pod has been successfully removed from the TG? This would hinge on the SIGTERM not getting sent until after all finalizers have been removed, which I would expect to be true but would need to verify.
@michaelsaah That sounds like an interesting approach to me. I think it is worth someone trying it out. As you said, it really just depends on whether k8s will wait to send the SIGTERM until the finalizers are removed.
Any update? I have the same problem, and I think that it can also be related to the autoscaler. Please can you take a look at kubernetes/autoscaler#6679?
Describe the bug
When a pod is Terminating, it receives a SIGTERM signal asking it to finish up its work, after which the pod is deleted. At the same time that the pod starts terminating, the aws-load-balancer-controller receives the updated object, which triggers it to remove the pod from the target group and initiate draining.
Both of these processes - the signal handling at the kubelet level and the removal of the pod's IP from the TG - are decoupled from one another, and the SIGTERM might be handled before, or at the same time as, the target in the target group starts draining.
As a result, the pod might become unavailable before the target group has even started its own draining process. This can result in dropped connections, as the LB is still trying to send requests to the already-shut-down pod. The LB will in turn reply with 5xx responses.
Steps to reproduce
Expected outcome
Environment
All our ingresses have
v2.2.4
Additional Context:
We've been relying on Pod-Graceful-Drain, which unfortunately forks this controller and intercepts and breaks k8s controller internals.
You can achieve a pretty good result as well using a `sleep` as `preStop`, but that's not reliable at all - due to the fact that it's just a guessing game whether your traffic will be drained after X seconds - and it requires statically linked binaries to be mounted in each container, or the existence of sleep in the operating system. I believe this is not only an issue with this controller, but with k8s in general. So any hints and already existing tickets would be very welcome.