
Support kube_pod_ready_time metric #1465

Closed
sgrzemski opened this issue Apr 26, 2021 · 23 comments · Fixed by #1938
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@sgrzemski
Contributor

What would you like to be added:

Hello kube-state-metrics team!
I am a happy user of your metrics software. I would like kube-state-metrics to also report the time when a pod becomes ready (passes its readiness probes). According to the docs, there are already a couple of gauges in seconds: kube_pod_start_time, kube_pod_container_state_started, etc.

Why is this needed:

I would like to be able to measure the time needed for a container to become fully operational and healthy. I already have the created and start timestamps, so a simple delta query in Prometheus would do the trick if a metric reporting the ready time were implemented.

Describe the solution you'd like

Query the Kubernetes API to get the ready timestamp.
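The delta could then be a one-line PromQL query. A sketch, assuming the new gauge is named kube_pod_ready_time and exports a Unix timestamp in seconds like the existing kube_pod_created gauge:

```promql
# Hypothetical: seconds from pod creation to the pod passing readiness,
# assuming a kube_pod_ready_time gauge (Unix timestamp in seconds).
kube_pod_ready_time - kube_pod_created
```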

Additional context

@sgrzemski sgrzemski added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 26, 2021
@lilic
Member

lilic commented Apr 29, 2021

Hey 👋 Can you explain where this API is at? If k8s API reports this, it sounds good to me.

Note that ContainerState is the only thing that reports StartedAt. I haven't looked into StartTime if that can be used somehow.

@sgrzemski
Contributor Author

Pardon my delay, I was off for some time.
I took a look at the code and it looks like kube-state-metrics uses Pod objects, Pod.Status.StartTime specifically, to create the kube_pod_start_time metric. However, according to Pod Lifecycle docs, PodStatus should have an array of PodConditions, containing the following information:

  • PodScheduled: the Pod has been scheduled to a node.
  • ContainersReady: all containers in the Pod are ready.
  • Initialized: all init containers have completed successfully.
  • Ready: the Pod is able to serve requests and should be added to the load balancing pools of all matching Services.

Those come with two useful properties called:

  • lastProbeTime: Timestamp of when the Pod condition was last probed.
  • lastTransitionTime: Timestamp for when the Pod last transitioned from one status to another.

This information should be enough to form a kube_pod_ready_time metric, and a simple PromQL query would then give the time needed for the pod to become ready.

@sgrzemski
Contributor Author

sgrzemski commented May 17, 2021

I've patched the v1.9.8 release with some additional code to report both the ContainersReady and Ready timestamps. Note that the transition between states can happen multiple times (e.g. a pod can stop passing its readiness probes). Quoting the Pod Lifecycle docs:

Pod is evaluated to be ready only when both the following statements apply:

All containers in the Pod are ready.
All conditions specified in readinessGates are True.

I will change those metrics to report the latest timestamp, matching your current naming convention (kube_pod_status_ready_time and kube_pod_status_containers_ready_time), and prepare a PR.

@brancz
Member

brancz commented Jun 7, 2021

I'm pretty sure the kubelet exposes metrics about the readiness probes. I think it's the kubelet's responsibility to expose this.

@szymon-grzemski

I'm pretty sure the kubelet exposes metrics about the readiness probes. I think it's the kubelet's responsibility to expose this.

I am running 1.19+ and I can see kubelet_pod_start_duration_seconds_bucket, _sum, and _count in Prometheus, but they are node-level metrics, not per-pod.

@slamdev

slamdev commented Aug 10, 2021

@lilic @brancz any plans to merge this? Looking forward to using this metric.

@kevinwubert

I was looking for something just like this! Was there something blocking this from getting merged into kube-state-metrics v2?

@fpetkovski
Contributor

It looks like the PR has gone stale. Would you be interested in wrapping up the work?

@sgrzemski
Contributor Author

Would love to! I will update in the next couple of days.

@SpectralHiss

SpectralHiss commented Dec 15, 2021

👍 This is useful for accurately calculating total pod start time, instead of trying to infer it from the ready count or similar; in my particular case I am trying to benchmark the effect of some Istio sidecar settings on startup time.
Any updates, @sgrzemski?

@PrayagS

PrayagS commented Feb 16, 2022

Looking forward to testing out this feature. @sgrzemski Are we stuck somewhere?

My team has a similar use case where we're trying to figure out the time it takes a pod to be scheduled to a particular node. We can get hold of the time when the pod transitioned to PodScheduled and report that as a metric.
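The same delta pattern would cover the scheduling case. A sketch, assuming a hypothetical kube_pod_status_scheduled_time gauge that exports the PodScheduled transition timestamp in seconds:

```promql
# Hypothetical: seconds from pod creation until the PodScheduled
# condition turned True, per pod.
kube_pod_status_scheduled_time - kube_pod_created
```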

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 17, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 16, 2022
@stat-johan

This would be really nice to have, @sgrzemski

@fpetkovski
Contributor

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jul 1, 2022
@fpetkovski
Contributor

/remove-lifecycle stale

@qingguee

Looking forward to this PR being merged. I find that kube_pod_status_ready has a ~2s delay in showing the ready status compared with the Ready timestamp from the pod conditions. I have reported this in #1830, but there has been no response yet.

So we can't rely on kube_pod_status_ready if we want to calculate pod startup time. Using sampled metrics to calculate durations is always imprecise, so we need a metric that returns the ready timestamp taken from the pod conditions.

@sumanthkumarc

This would be a really great metric to have. Helps us to understand the time taken for services to come up in cluster.

@max-rocket-internet

The metric would be incredibly valuable! For example to know:

  • Seconds until pod is scheduled
  • Seconds until pod is Ready

@coleary-hyperscience

I'm still pretty new to Prometheus, but I'm using this query to collect an almost equivalent metric. Please let me know if you find this useful or foresee any issues with it: `sort_desc(max(sum_over_time(kube_pod_status_phase{namespace=~"$namespace", phase="Pending"}[$__range])/4) by (pod))`
This returns the approximate (30-second accuracy) time, in minutes, that each pod spent in the Pending state.

@vijaynidhi85

I'm still pretty new to Prometheus, but I'm using this query to collect an almost equivalent metric. Please let me know if you find this useful or foresee any issues with it: `sort_desc(max(sum_over_time(kube_pod_status_phase{namespace=~"$namespace", phase="Pending"}[$__range])/4) by (pod))` This returns the approximate (30-second accuracy) time, in minutes, that each pod spent in the Pending state.

Could I know why the /4 is required?

@coleary-hyperscience

Could I know why the /4 is required?

For sure: it looks to me like kube_pod_status_phase is sampled 4 times every minute (i.e. a 15-second scrape interval), so dividing the sum of those samples by 4 converts the sample count into minutes.

Not a great way to do it, but it seems to be working for me, at least until the PR gets merged.
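For reference, the same workaround can report seconds instead of minutes by making the scrape-interval assumption explicit: with a 15-second interval there are 4 samples per minute, so multiplying the Pending sample count by 15 is equivalent to dividing by 4 and reading minutes:

```promql
# Approximate seconds each pod spent Pending, assuming a 15s scrape
# interval (each Pending sample represents roughly 15 seconds).
sort_desc(
  max by (pod) (
    sum_over_time(kube_pod_status_phase{namespace=~"$namespace", phase="Pending"}[$__range]) * 15
  )
)
```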

@jkdihenkar

Hi @sgrzemski, can you share the export of the dashboard you've plotted based on these metrics?
