Purpose of kube_pod_status_ready{condition="unknown"}
#2380
Comments
As far as I can tell, that seems to (still) be a valid value for a PodCondition. You can find (a little bit) more about it in Pod conditions.
I suspected that; I'm just surprised I hadn't actually seen that status in any of our metrics, since we get containers in that state fairly regularly. It's unfortunate that KSM exposes these like this — the cardinality it causes is extreme.
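To illustrate the fan-out (pod and namespace names here are made up for the example): each pod produces one `kube_pod_status_ready` series per condition value in the exposition output, so two of the three series per pod are always zeros:

```
kube_pod_status_ready{namespace="default",pod="web-0",condition="true"} 1
kube_pod_status_ready{namespace="default",pod="web-0",condition="false"} 0
kube_pod_status_ready{namespace="default",pod="web-0",condition="unknown"} 0
```

Across a large cluster this triples the series count for the metric relative to the information it carries.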
On the one hand, I don't think it would be wise to omit such label value given that it seems to still be a valid status. On the other hand, I can see how this might not be of interest and could impact cardinality. I wonder if we could have some kind of metric cardinality enforcement mechanism like we have for kube-apiserver, though I'm not sure if it would be too much just for this one metric/label. There are probably better ways to solve this, though I can't think of anything right now.
You mean you often have containers with […]?
KSM has a lot of status metrics similar to this one, with many label + value combinations per pod that can explode cardinality and cost, especially if you use a metrics vendor that bills by cardinality/active series. So I think some broader cardinality enforcement might be generally useful. As for the Ready status, I might be mixing […]
The metrics that have conditions are some of our most problematic ones as well. I've been thinking of two ways to address them: […]
Some of these metrics are marked as stable, so removing a label might not be something we would want to do. However, I can't tell whether kube-state-metrics really follows the Kubernetes metrics stability framework.
Ah. Another option might be: […]
For 2, one of our metrics vendors supports dropping data based on metric value, which we've implemented for many of our high-cardinality metrics that are mostly zeros. So far, many of them tend to fall into two camps: […]

When we were adding these drops, the main drawback was that gauges/math might not work without the zero value, but that can be worked around. For 3, I had wondered whether anyone had proposed or questioned how overloaded some of these status metrics become with the number of labels/label combinations that end up being created. I don't think there'd be a huge downside to breaking these up into individual metrics vs. the current layout. Would another option be a runtime flag to simply not emit the metrics when they are zero?
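On the gauge-math caveat: once zero-valued series are dropped, queries that test for `== 0` silently match nothing. A common workaround pattern (a sketch, assuming the standard `kube_pod_info` info metric is still emitted for every pod) is to derive the missing zeros by set difference instead:

```promql
# Pods whose Ready condition is NOT currently reported as true:
# take all known pods, then subtract those with a surviving "ready" series.
kube_pod_info
  unless on(namespace, pod)
    (kube_pod_status_ready{condition="true"} == 1)
```

This works because `unless` keeps only the left-hand series that have no match on the right, recovering the "zero" cases without the zero-valued series ever being stored.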
Dropping zeros would be very useful. If there were an option to enable this in KSM, it would cut out roughly 55% of the series that we don't use.
Agree. A metric like […]
This adds a new flag `--metric-drop-value` that allows metrics with a value matching the flag to not be emitted. This is useful for reducing cardinality of metrics where many of the values are 0. In larger clusters, the size of the returned metrics can be quite large, which can be costly to process and store in some environments. The plumbing adds support for filtering series by label name, label value, or value, but this PR only sets up filtering by value. The idea for this feature was discussed in kubernetes#2380. Related kubernetes#2116
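The PR's actual plumbing isn't shown in this thread; as a rough sketch of value-based series filtering (all names here are hypothetical, not kube-state-metrics' real API), dropping every series whose value matches the configured drop value could look like:

```go
package main

import "fmt"

// Series is a simplified stand-in for an exported metric series:
// a label set plus a sample value. Hypothetical type, for illustration only.
type Series struct {
	Labels map[string]string
	Value  float64
}

// dropByValue keeps only the series whose value differs from dropValue,
// mirroring the proposed --metric-drop-value behaviour (drop value 0
// would suppress the always-zero condition series).
func dropByValue(in []Series, dropValue float64) []Series {
	out := make([]Series, 0, len(in))
	for _, s := range in {
		if s.Value != dropValue {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	series := []Series{
		{Labels: map[string]string{"condition": "true"}, Value: 1},
		{Labels: map[string]string{"condition": "false"}, Value: 0},
		{Labels: map[string]string{"condition": "unknown"}, Value: 0},
	}
	kept := dropByValue(series, 0)
	// Only the condition="true" series survives the filter.
	fmt.Println(len(kept))
}
```

A filter like this would sit on the write path, before series are rendered into the exposition response, so the dropped series are never serialized at all.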
I do wonder if there is some use case for this structure that we might be missing. But at scale, the way these metrics are laid out is egregious.
/triage accepted
/remove-kind bug
/assign @dgrisonnet
This adds a new flag `--metric-keep-true` that allows series carrying a false/0 value to be dropped. This is useful for reducing cardinality of metrics where many of the values are 0. In larger clusters, the size of the returned metrics can be quite large, which can be costly to process and store in some environments. The plumbing adds support for filtering series by label name, label value, or value, but this PR only sets up filtering specifically for metrics with "condition" labels and various state metrics (e.g. kube_pod_status_phase). The idea for this feature was discussed in kubernetes#2380. Related kubernetes#2116
What happened:
What is the purpose of the unknown condition on the kube_pod_status_ready metric? Searching through my metrics, across a fairly high number of pods and nodes, I never see that status being true. So my initial impression is that this needlessly wastes cardinality in a way that is very hard to avoid.
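One quick way to check whether the unknown value ever fires in a given cluster (a sketch — adjust the selector and range to your setup):

```promql
# One row per condition value; if the condition="unknown" row is 0,
# those series never carried a signal and are pure overhead.
max by (condition) (kube_pod_status_ready)
```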
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): 1.28