kube_pod_container_state_started is not returned for some pods #1467

Closed
AnastasiaBlack opened this issue Apr 28, 2021 · 3 comments · Fixed by #1519
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

AnastasiaBlack commented Apr 28, 2021

What happened:
We run many short-lived pods (jobs); the time from container start to termination ranges from 2-3 seconds to a couple of minutes per pod. When we query kube_pod_container_state_started{pod="neededPodName"} in Prometheus, it returns a result for some pods but nothing for others, while other queries, including kube_pod_start_time, return results for all of the pods.

What you expected to happen:
kube_pod_container_state_started returns a result for every pod, not just for some of them.

How to reproduce it (as minimally and precisely as possible):
Run multiple pods with short-lived containers, then query kube_pod_container_state_started in Prometheus: some of the pods return no result.

Anything else we need to know?:
If we inspect the pods with kubectl describe pod [pod_name], the values we need (container start times) are present. Querying kube_pod_start_time for each pod returns the expected result (the timestamp when the pod started). What we want now is the timestamp when each container started.

Environment:

  • kube-state-metrics version: v2.0.0
  • Kubernetes version (use kubectl version): v1.19.6

harjas27 commented May 3, 2021

This metric is taken from the State.Running.StartedAt field in the pod's container status.
Within 2-3 seconds, the container state changes from Waiting -> Running -> Terminated. Whenever the state is updated, the metrics for that pod are overwritten in the metrics map. Once the state is Terminated, State.Running is no longer set, so the metric is not emitted. This might be why the metric is not collected for some of the pods.
As a fix, we can update

Value: float64((cs.State.Running.StartedAt).Unix()),

to take the value from either cs.State.Running.StartedAt or cs.State.Terminated.StartedAt depending on the state of the container.
@lilic

AnastasiaBlack (author) commented

That would be great!

AnastasiaBlack (author) commented

Hello! Is there any news on this issue? Will it be implemented in the future?
