
Kong routes requests to targets deleted via Kong Ingress controller #6312

Closed
zackery-parkhurst opened this issue Jul 10, 2024 · 4 comments
Labels: bug (Something isn't working), pending author feedback

Comments

@zackery-parkhurst

zackery-parkhurst commented Jul 10, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Whenever a scale-down event happens on a Kubernetes pod, whether from an HPA scaling down or a deployment deleting pods as part of its update strategy, Kong sends requests to the pod after it has already been deleted. The result is a request that hangs for 60 seconds until Kong eventually gives up and responds with a 504.

Notice in these logs that the pod is deleted and the Kong Ingress Controller then updates the configuration in Kong, but a request comes through in between and hangs for 60 seconds.

[screenshot: logs showing the pod deletion followed by the KIC configuration update]

Expected Behavior

Whenever a pod is deleted, whether through an HPA downscale event or a deployment rollout, Kong should immediately stop using the target IP.

However, there is a delay between the pod being deleted and Kong updating its configuration, and any requests that come through during that time just hang and time out.

Steps To Reproduce

1. Install Kong Ingress Controller
2. Create an HPA for a service (a minimal sketch follows these steps)
3. Have the HPA scale down the service's pods
4. Have a request come into Kong after a pod has been deleted but before Kong updates its configuration
5. Witness Kong use the stale IP
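
For reference, a minimal sketch of the kind of HPA used in step 2; the names and thresholds are illustrative assumptions, not taken from this issue:

```yaml
# Hypothetical HPA that scales the example deployment down when CPU is low,
# triggering the scale-down events described above.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
```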

Kong Ingress Controller version

KIC image - kong/kubernetes-ingress-controller:3.0
Kong image - kong:3.5

Kubernetes version

Server Version: v1.28.9-eks-036c24b

Anything else?

The only other thing I would point out is that Kong does not appear to sync the configuration until after the pod has been deleted. That is odd, as I thought Kong watches the Endpoints objects for changes, and the Endpoints object is updated the moment the pod is marked for deletion.

So theoretically Kong should update itself before the pod is actually gone: a pod has a default termination grace period of 30 seconds, and the Endpoints object is updated as soon as the pod is marked for deletion, which should trigger Kong to start updating at that point.

On a final note: whether Kong updates its configuration as soon as the pod is marked for deletion or only after it is fully deleted, there is always some latency between the pod being deleted (or marked for deletion) and KIC updating its upstream targets and syncing that with Kong. So technically there is always a window in which a request can come in, hit a stale target, and sit waiting for 60 seconds (or however long the timeout is).

How can this be avoided, or at least rectified so it does not cause problems? Our issue is that Kong spends 60 seconds waiting to time out on an IP that no longer exists; clients end up timing out, or Kong times out and returns a 504.

zackery-parkhurst added the bug (Something isn't working) label on Jul 10, 2024
@randmonkey
Contributor

@zackery-parkhurst KIC does not watch for HPA events; it watches Services and Endpoints. Once KIC notices that a pod is deleted (or marked as not ready, which is reflected as a change on the related Endpoints), it updates its own cache immediately. For configuring Kong, however, KIC syncs on a 3-second period, since a full sync of the Kong configuration is a heavy operation. You can set CONTROLLER_PROXY_SYNC_SECONDS to configure the period at which configuration is applied to Kong.
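
A minimal sketch of setting that variable on the controller container; the Deployment structure and container name here are illustrative assumptions, not taken from this thread:

```yaml
# Hypothetical excerpt of the KIC Deployment: shorten the proxy sync
# period from the default 3 seconds to 1 second.
spec:
  template:
    spec:
      containers:
        - name: ingress-controller
          env:
            - name: CONTROLLER_PROXY_SYNC_SECONDS
              value: "1"
```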

@zackery-parkhurst
Author

@randmonkey Thank you for the response.

I understand that KIC watches the Endpoints to update the upstream targets. It's the 3-second sync period that is causing my issues.

When a pod is deleted or marked for deletion, it takes KIC ~3 seconds to update Kong, which means any requests that come in during that time are sent to a target that is being shut down.

And if the request does not finish before the app shuts down its server, the request just hangs until the Kong timeout is reached and Kong kills it with a 504.

Thank you for the information on CONTROLLER_PROXY_SYNC_SECONDS. We could shorten it, but that would probably increase the load on Kong and therefore increase the latency Kong takes to process requests, and it would still leave a gap from when the Endpoints change to when KIC actually updates the gateway, i.e. the 1 second it would then take KIC to sync changes to Kong.

So what would be the recommended way to handle this situation, if there is no way to prevent Kong from sending a request to a pod that is being shut down/terminated?
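
One general Kubernetes mitigation for this window, sketched here as an assumption rather than as a recommendation made in this thread, is to delay the application's shutdown with a preStop hook so the pod keeps serving while KIC and Kong catch up to the Endpoints change:

```yaml
# Hypothetical pod spec for the backend application: sleep before SIGTERM
# so requests routed during the KIC/Kong sync delay can still complete.
# Names and the 10-second value are illustrative; the image must provide
# a `sleep` binary for this exec hook to work.
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "10"]
```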

Last thing:
I tried researching how to accomplish this but could not find any way to do it other than manually setting upstream targets via the Admin API. Right now we use Ingress objects that point at a ClusterIP Service, and Kong automatically updates the target endpoints. Is there a way to have KIC configure upstream targets with either the cluster IP or the Service name from the Ingress object instead?
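
For context, a minimal sketch of the kind of configuration being asked about, assuming KIC's `ingress.kubernetes.io/service-upstream` Service annotation is the relevant mechanism (an assumption, not confirmed in this thread):

```yaml
# Hypothetical Service annotated so Kong targets the cluster IP instead of
# individual pod IPs; pod-level load balancing is then left to kube-proxy.
apiVersion: v1
kind: Service
metadata:
  name: example-app
  annotations:
    ingress.kubernetes.io/service-upstream: "true"
spec:
  selector:
    app: example-app
  ports:
    - port: 80
      targetPort: 8080
```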

@zackery-parkhurst
Author

That is exactly what I was looking for. Thank you!
