When multiple pods are selected: Requests are being client-side throttled #44

Closed
dgzlopes opened this issue Nov 14, 2022 · 7 comments

@dgzlopes
Member

When trying to instantiate a PodDisruptor with a selector that matches 15 pods, I get the following message from time to time:

I1114 11:50:30.814928   48040 request.go:601] Waited for 4.391415667s due to client-side throttling, not priority and fairness, request: PATCH:https://kubernetes.docker.internal:6443/api/v1/namespaces/k6-cloud-crocospans/pods/grafana-agent-metrics-0/ephemeralcontainers

I wonder if we could be more gentle with our request pattern. Also, I wonder if this could be a problem in huge namespaces.

@pablochacin
Collaborator

pablochacin commented Nov 14, 2022

@dgzlopes Thanks for opening this issue. Could you please elaborate on this comment:

I wonder if we could be more gentle with our request pattern.

@dgzlopes
Member Author

dgzlopes commented Nov 14, 2022

I wonder if we could be more gentle with our request pattern.

I was reading https://kubernetes.io/docs/concepts/cluster-administration/flow-control/

And yeah, I'm just wondering whether, instead of relying on the throttling that the client provides, we should slow down the frequency of requests when there are many of them. If someone has increased the limits of their API Server, or is using an earlier Kubernetes version, this behavior could be dangerous.
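
For context, when QPS/Burst are left unset, client-go falls back to a token-bucket limiter with defaults of 5 QPS and a burst of 10, which is what produces the "client-side throttling" waits above. A tiny illustrative sketch (not the extension's code) of how 15 near-simultaneous requests end up queued behind that limiter:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/util/flowcontrol"
)

func main() {
	// Same defaults the REST client falls back to: 5 requests/s, burst of 10.
	limiter := flowcontrol.NewTokenBucketRateLimiter(5, 10)

	start := time.Now()
	for i := 0; i < 15; i++ {
		// Accept blocks until a token is available, just like the client
		// does before sending each request.
		limiter.Accept()
	}
	fmt.Printf("15 requests admitted after %v\n", time.Since(start))
}
```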

@pablochacin
Collaborator

That is interesting. The extension makes relatively few requests: one to check whether the pod already has the ephemeral container, one to patch the pod if it doesn't, and then it waits until the container is ready.

I suspect the problem is in this check, which uses a watch; maybe the call to the API Server can be optimized there.
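
For reference, a rough sketch of that per-pod flow with client-go (not the extension's actual code; the xk6-agent container name and agent:latest image are placeholders):

```go
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

func injectAgent(ctx context.Context, client kubernetes.Interface, namespace, podName string) error {
	pod, err := client.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
	if err != nil {
		return err
	}

	// 1. Check whether the agent's ephemeral container is already there.
	for _, c := range pod.Spec.EphemeralContainers {
		if c.Name == "xk6-agent" { // hypothetical container name
			return nil
		}
	}

	// 2. PATCH the "ephemeralcontainers" subresource to add it.
	patch := []byte(`{"spec":{"ephemeralContainers":[{"name":"xk6-agent","image":"agent:latest"}]}}`)
	_, err = client.CoreV1().Pods(namespace).Patch(
		ctx, podName, types.StrategicMergePatchType, patch,
		metav1.PatchOptions{}, "ephemeralcontainers",
	)
	if err != nil {
		return err
	}

	// 3. Wait until the container is ready, e.g. with a watch on the pod
	//    (the check mentioned above; omitted here).
	return nil
}
```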

@pablochacin
Collaborator

I hadn't looked properly at the message. It points to a PATCH operation, which is executed when the agent is injected (before the watch I mentioned in my previous comment). So the issue comes from the number of concurrent agent injections, not only from the number of concurrent watches.

@pablochacin
Collaborator

pablochacin commented Nov 17, 2022

Regarding this issue, it is important to notice that it may affect the time it takes to inject all the agents into the targets, but it is inconsequential in most cases. There are several alternatives:

  1. Document it as a warning message that can be ignored
  2. Prevent the message from being displayed (even when the throttling occurs)
  3. Increase the QPS parameter in the client configuration. However, this value cannot be arbitrarily high, so the issue will eventually reappear when injecting agents into a large number of pods
  4. Implement custom logic for limiting the number of concurrent injections (a sketch follows below)

Those actions are not mutually exclusive, and none of them guarantees that the problem will not eventually arise.
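
As a minimal sketch of option 4, the number of concurrent injections could be capped with golang.org/x/sync/errgroup; injectAgent below is a hypothetical stand-in for the check/PATCH/watch sequence discussed earlier:

```go
package main

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// injectAgent is a placeholder for the per-pod check/PATCH/watch sequence.
func injectAgent(ctx context.Context, pod string) error {
	return nil
}

// injectAll injects the agent into all target pods, running at most
// maxConcurrent injections at a time to keep the request rate low.
func injectAll(ctx context.Context, pods []string, maxConcurrent int) error {
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(maxConcurrent)

	for _, pod := range pods {
		pod := pod // capture loop variable
		g.Go(func() error {
			return injectAgent(ctx, pod)
		})
	}

	return g.Wait()
}
```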

@pablochacin
Collaborator

Following the discussion in the Kubernetes community around this issue, it seems that client-side throttling is no longer needed, since server-side "Priority and Fairness" has been enabled by default since v1.20, and the client-side throttling should therefore be disabled. As this feature is still enabled in the client, the simplest way to disable it is to set the QPS parameter in the client configuration to a value high enough to prevent client-side throttling in the most common usage scenarios.

Considering that the throttling happens when the agents are injected, a QPS limit of 50 with a burst of 100 should be sufficient for most cases.
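
A sketch of that mitigation, assuming the client is built from a kubeconfig via a rest.Config (the exact wiring in the extension may differ):

```go
package main

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// newClient builds a clientset with the proposed limits so the default
// client-side limiter (5 QPS, burst 10) does not throttle the injection
// requests.
func newClient(kubeconfigPath string) (*kubernetes.Clientset, error) {
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
	if err != nil {
		return nil, err
	}

	config.QPS = 50
	config.Burst = 100

	return kubernetes.NewForConfig(config)
}
```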

@pablochacin
Collaborator

Fixed by #55
