Vector making api requests to Kubernetes API server without using resource_version #16797
Comments
Opened a PR against kube-rs to allow setting a resource version (for list requests only, for now).
Thanks @nabokihms!
Thanks @nabokihms !
Thanks @nabokihms
The kube-rs changes have been merged into the master branch. Once the new version of the client is out, I will make the corresponding changes to vector.
The new option has been added to vector. Under the hood, this option makes vector use resource_version=0 for all initial list requests.
Problem
This splits part of issue #16753 out into its own issue.
Context
We use vector to deliver kubernetes_logs to our kafka cluster, where they are later processed and ingested into Humio. Vector is deployed as a daemon set in our kubernetes clusters (each with >1000 nodes running).
We recently had an outage in one of our kubernetes clusters (with ~1100 nodes running). The ETCD leader node failed, which resulted in a cascading failure in which pods made 1000x API calls to our API server and eventually brought the kubernetes control plane down entirely.
In the process of remediation, we identified vector as one of the candidates that were hammering the API server. Shutting down vector along with a few other daemon sets eventually reduced the traffic on the Control Plane components, which allowed the ETCD nodes to recover.
Issue: resource_version not set when making API requests to Kube API server
Based on issue #7943, resource_version was set to 0 in vector 0.18 - 0.20. PR #11714 adopted kube-rs and dropped the change from #9974. When looking at the audit logs from kube-api-server, we don't see resource_version set in the request URL, which makes us wonder if this is a regression.
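To make the difference concrete, here is a minimal Rust sketch of the two request shapes. It is not Vector's or kube-rs's actual code. The helper `pod_list_url` and the node name `node-1` are hypothetical, and the `spec.nodeName` field selector is an assumption about how a per-node agent would scope its pod list. On the wire the query parameter is spelled `resourceVersion`. When it is omitted, the API server has to answer a list with the most recent data, which means a quorum read from etcd. With `resourceVersion=0` the API server may answer from its own cache instead.

```rust
/// Hypothetical helper (not part of Vector or kube-rs): builds the path and
/// query string of a pod list request scoped to a single node.
/// Assumes `node` only contains URL-safe characters.
fn pod_list_url(node: &str, resource_version: Option<&str>) -> String {
    // "%3D" is the URL-encoded "=" inside the field selector value.
    let mut url = format!("/api/v1/pods?fieldSelector=spec.nodeName%3D{}", node);
    if let Some(rv) = resource_version {
        // Only append the parameter when a resource version was requested.
        url.push_str("&resourceVersion=");
        url.push_str(rv);
    }
    url
}

fn main() {
    // No resourceVersion: the list must reflect the most recent state (etcd read).
    println!("{}", pod_list_url("node-1", None));
    // resourceVersion=0: any version is acceptable, so a cached answer is allowed.
    println!("{}", pod_list_url("node-1", Some("0")));
}
```

In an audit log, the second form is the one that carries `resourceVersion=0` in the request URI. The first form is consistent with what we describe above, where the parameter is absent.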
Sample request in audit logs:
Version
vector 0.27.0 (x86_64-unknown-linux-gnu 5623d1e 2023-01-18)
References
#7943
#16753