
[receiver/kubeletstats] Add k8s.container.cpu.node.utilization metric #32295

Merged
merged 3 commits into open-telemetry:main from k8s_container_metrics on May 31, 2024

Conversation

ChrsMark
Member

@ChrsMark ChrsMark commented Apr 10, 2024

Description:

At the moment, we calculate the `k8s.container.cpu_limit_utilization` as a ratio of the container's limits at
`cpuMetrics.LimitUtilization(mb, currentTime, value/r.cpuLimit)`.

Similarly, we can calculate the CPU utilization as a ratio of the whole node's allocatable CPU if we divide by the total number of the node's cores.

We can retrieve this information from the Node's `Status.Capacity`, for example:

$ k get nodes kind-control-plane -ojsonpath='{.status.capacity}'
{"cpu":"8","ephemeral-storage":"485961008Ki","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"32564732Ki","pods":"110"}

Performance concerns

In order to get the Node's capacity we need an extra call to the k8s API to fetch the Node object.
Something to consider here is the performance impact that this extra API call would bring. We can always choose to have this metric disabled by default and clearly state in the docs that it comes with an extra API call to fetch the Node of the Pods.
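
For illustration, enabling such an opt-in metric would look roughly like the snippet below. This is only a sketch following the receiver's usual `metrics:` pattern; the exact option names (for example the `k8s_api_config` block) should be checked against the receiver README:

```yaml
receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: serviceAccount
    endpoint: "${env:K8S_NODE_NAME}:10250"
    # Assumed option: API access is needed to read the Node's capacity.
    k8s_api_config:
      auth_type: serviceAccount
    metrics:
      # Optional metric, disabled by default.
      k8s.container.cpu.node.utilization:
        enabled: true
```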

The good thing is that the kubeletstats receiver targets only one node, so I believe it's a safe assumption to fetch only the current node, because all the observed Pods will belong to that single local node. Correct me if I'm missing anything here.

In addition, instead of performing the API call explicitly on every single scrape, we can use an informer and leverage its cache. I can change this patch in that direction if we agree on this.
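
As a rough illustration of the informer approach, here is a minimal client-go sketch that watches only the local Node and reads its CPU capacity from the cache. It is an assumption-laden sketch, not the actual implementation (the PR ends up reusing the shared node informer helper moved out of the k8sattributes processor):

```go
import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// watchNodeCapacity builds a shared informer scoped to a single node and logs
// the node's CPU capacity whenever the Node object appears in the cache.
func watchNodeCapacity(client kubernetes.Interface, nodeName string, stopCh <-chan struct{}) {
	factory := informers.NewSharedInformerFactoryWithOptions(
		client,
		10*time.Minute, // resync period; between resyncs reads are served from the cache
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			// Watch only the local node instead of every node in the cluster.
			opts.FieldSelector = fmt.Sprintf("metadata.name=%s", nodeName)
		}),
	)
	nodeInformer := factory.Core().V1().Nodes().Informer()
	nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			node := obj.(*corev1.Node)
			cpu := node.Status.Capacity[corev1.ResourceCPU]
			fmt.Printf("node %s CPU capacity: %d cores\n", node.Name, cpu.Value())
		},
	})
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
}
```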

Would love to hear others' opinions on this.

Todos

✅ 1) Apply this change behind a feature gate, as indicated at #27885 (comment)
✅ 2) Use an Informer instead of direct API calls.

Link to tracking Issue:
ref: #27885

Testing:

I experimented with this approach and the results look correct. To verify this, I deployed a stress Pod on my machine to consume a target CPU of 4 cores:

apiVersion: v1
kind: Pod
metadata:
  name: cpu-stress
spec:
  containers:
  - name: cpu-stress
    image: polinux/stress
    command: ["stress"]
    args: ["-c", "4"]

And then the collected `container.cpu.utilization` for that Pod's container was at 0.5, as expected, given that my machine (node) comes with 8 cores in total:

![cpu-stress](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/3abe4a0d-6c99-4b4e-a704-da5789dde01b)

Unit test is also included.

Documentation:

Added: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32295/files#diff-8ad3b506fb1132c961e8da99b677abd31f0108e3f9ed6999dd96ad3297b51e08

@jinja2
Contributor

jinja2 commented May 29, 2024

It looks like we'd be introducing more combinations of utilization metrics for different resource types (cpu/memory/ephemeral_storage) for both k8s.container and k8s.pod. Are pre-computed utilization metrics for different limits, like we have in this receiver for container, pod, and now node, useful to users?

I understand one of the arguments for doing the utilization calculation in this receiver has been that users want all resource usage metrics from the same receiver; currently users need to run both the kubeletstats (reports usage for containers) and the k8scluster (reports the node's capacity/allocatable) receivers to be able to calculate the utilization. Would we instead consider reporting the node's capacity metric from the kubeletstats receiver instead of doing utilization calculations in the receiver? With the node's capacity/allocatable and the pod's/container's resource usage (we have to make sure the node id is available for filtering) reported from the kubeletstats receiver, the user has the option to calculate the utilization in their backend observability platform. Imo, this gives users more flexibility in deciding what they consider as the limit, or whether they want to track it per container or per pod.

@ChrsMark
Member Author

Are pre-computed utilization metrics for different limits, like we have in this receiver for container, pod, and now node, useful to users?

From my point of view, yes :)

Would we instead consider reporting the node's capacity metric from the kubeletstats receiver instead of doing utilization calculations in the receiver?

Wouldn't that mean that we will have the same metric potentially emitted from 2 different receivers at the same time?

In general, that was already mostly covered back in #24905 (comment), when the k8s.container.cpu_limit_utilization and the rest were proposed and added.

From my perspective agreeing on unambiguous metrics (even as optional) that we can provide ootb is the goal here. If we pass the responsibility to users/backends to do such calculations/namings then we kind of start losing the vendor/backend agnostic idea.

Note: the k8s.pod.cpu_limit/request_utilization metrics are not provided at all if there are containers with no limit/request set. That means the utilization metric as a ratio against the node's "limit" is the one that will always be there.

@ChrsMark
Member Author

Based on the discussions at #32295 (comment) I have pushed a change to rename the metric to k8s.container.cpu.node.utilization.
That's better aligned with the SemConv guidelines, and it also preserves `k8s.container.cpu.node.*` as a namespace in case we want to add an additional metric in the future to cover the allocatable-related ratio.

@andrzej-stencel @povilasv @TylerHelmuth @dmitryax @jinja2 please take another look.

@ChrsMark ChrsMark force-pushed the k8s_container_metrics branch 2 times, most recently from 9ac3346 to 42fc09e Compare May 30, 2024 07:39
@andrzej-stencel andrzej-stencel changed the title [receiver/kubeletstats] Add k8s.container.cpu.node_limit_utilization metric [receiver/kubeletstats] Add k8s.container.cpu.node.utilization metric May 30, 2024
@ChrsMark ChrsMark force-pushed the k8s_container_metrics branch 3 times, most recently from 3b543df to 8b224eb Compare May 30, 2024 18:38
@jinja2
Contributor

jinja2 commented May 30, 2024

Wouldn't that mean that we will have the same metric potentially emitted from 2 different receivers at the same time?

I think it is acceptable to have some metrics reported by both the k8scluster and kubeletstats receivers. The k8scluster receiver is run as a deployment, so usually only one instance of the receiver is doing the watching/reporting for all nodes/pods in the cluster. We have seen missed updates with massive clusters or clusters with a higher churn rate, etc. The kubeletstats receiver reporting the limit/request and/or node capacity/allocatable for the local node and pods would be a beneficial addition for users who do not want to run both receivers if they are only looking for metrics related to resource usage, and also a more scalable alternative to the k8scluster receiver.

From my perspective agreeing on unambiguous metrics (even as optional) that we can provide ootb is the goal here. If we pass the responsibility to users/backends to do such calculations/namings then we kind of start losing the vendor/backend agnostic idea.

Tbh, I don't think this pre-computed metric adds much value for k8s users. If the goal is to make sure that users have the option of comparing usage against the node's capacity, then reporting just the capacity instead of the pre-computed utilization is a more flexible option, because it allows k8s admins and users to calculate "utilization" as they see fit for their k8s setup. I can think of a few scenarios where a k8s admin would calculate usage against the node's capacity/allocatable, but these are not in line with the metric being added here. For example, when looking at a node's utilization to determine whether an admin needs to add more capacity, I am more likely to sum the resource requests of all pods on the node and compare that against the node's allocatable, rather than look at the actual usage of the containers, which kind of does not matter: even if a container is using only 1Gi out of a 10Gi request, the node has still reserved that 10Gi and it is not usable by any other container on the node. As an application developer deploying to a k8s cluster, I am more likely to look at the utilization wrt my pod's request/limit instead of the node's capacity. And if I have not set a limit on my container, the amount of resource such a container is technically allowed to use is the node's allocatable - sum(requests of all other pods on this node). These are just some examples of how a k8s user/admin might think of a limit as imposed by the node's availability of resources. I agree that there is probably a good use case to measure usage of a container's cpu wrt the node's capacity, but there are plenty of such combinations when we try to compute utilization wrt the k8s node's available resources which might be equally or more useful imo, and a good way to accommodate these different scenarios is to keep the metrics more flexible instead of pre-computing.
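
For concreteness, here is a rough sketch of the request-based calculation described above; the types and names are purely illustrative and not part of any receiver's API:

```go
// Illustrative only: node "utilization" computed as the sum of pod CPU
// requests over the node's allocatable CPU, rather than actual container usage.
func requestBasedNodeUtilization(podCPURequestsCores []float64, nodeAllocatableCores float64) float64 {
	if nodeAllocatableCores == 0 {
		return 0 // allocatable unknown; avoid division by zero
	}
	var requested float64
	for _, r := range podCPURequestsCores {
		requested += r
	}
	return requested / nodeAllocatableCores
}
```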

@ChrsMark
Member Author

ChrsMark commented May 31, 2024

I agree that there is probably a good use case to measure usage of a container's cpu wrt the node's capacity, but there are plenty of such combinations when we try to compute utilization wrt the k8s node's available resources which might be equally or more useful imo, and a good way to accommodate these different scenarios is to keep the metrics more flexible instead of pre-computing.

Thanks for the additional information!

I guess users are always able to apply such calculations in their back-ends if they want to, even with the metrics that are provided today. As such, I don't see these ideas as blockers for this optional metric.

For reference, this one has been used in Kibana for years now, so being able to retrieve it ootb through the Collector would only be of benefit. Having the utilization calculated against a hard limit is a valuable, non-relative indicator to observe over time, build alerts/reports on top of, etc. It can also come in handy to quickly drill down to specific workloads while investigating infra resource utilization issues.

Something not to underestimate is that query aggregations like those mentioned can sometimes become "expensive", so I see value in having the option to do such pre-calculations on the edge. I would be more than happy to see any additional ones that we deem appropriate in the future.

Ultimately, providing this kind of metric as optional is flexible enough: users that want to calculate them in their back-ends can do so, while those that want them pre-calculated on the edge still have the option for that.

…metric

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
@andrzej-stencel
Member

I agree with @ChrsMark that this metric is valuable.

The discussion around naming and exact meaning (computing against the node's capacity vs. its allocatable CPU) would have been much easier if the semantic conventions were already specified for k8s. However, I don't think we should block adding new, useful metrics to the collector until semantic conventions exist for them. This is also not what we have been doing historically.

Ideally, I would have each metric created by this receiver (and other receivers) document its semantic conventions status, or whether a semantic convention even exists for the metric. That would make it easier for users to reason about a specific metric's stability. Given that the receiver as a whole is in beta stability, we would like users to be aware of the fact that not all metrics in the component might be equally stable. If we want to do this, it should be a separate PR that does not block this one or others.

@andrzej-stencel andrzej-stencel merged commit 4dd718b into open-telemetry:main May 31, 2024
162 checks passed
@github-actions github-actions bot added this to the next release milestone May 31, 2024
TylerHelmuth added a commit that referenced this pull request Jun 14, 2024
**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
This PR adds the `k8s.pod.cpu.node.utilization` metric.
Follow up from
#32295 (comment)
(cc @TylerHelmuth) .

**Link to tracking Issue:** <Issue number if applicable>
Related to
#27885.

**Testing:** <Describe what testing was performed and which tests were
added.> Adjusted the respective unit test to cover this metric as well.

**Documentation:** <Describe the documentation added.> Added

Tested with a single container Pod:


![podCpu](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/9a0069c2-7077-4944-93b6-2dde00979bf3)

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Tiffany Hrabusa <30397949+tiffany76@users.noreply.github.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
t00mas pushed a commit to t00mas/opentelemetry-collector-contrib that referenced this pull request Jun 18, 2024
…-telemetry#33390)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
This PR adds the `k8s.pod.cpu.node.utilization` metric.
Follow up from
open-telemetry#32295 (comment)
(cc @TylerHelmuth) .

**Link to tracking Issue:** <Issue number if applicable>
Related to
open-telemetry#27885.

**Testing:** <Describe what testing was performed and which tests were
added.> Adjusted the respective unit test to cover this metric as well.

**Documentation:** <Describe the documentation added.> Added

Tested with a single container Pod:


![podCpu](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/9a0069c2-7077-4944-93b6-2dde00979bf3)

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Tiffany Hrabusa <30397949+tiffany76@users.noreply.github.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
andrzej-stencel pushed a commit that referenced this pull request Jul 10, 2024
…ion` metrics (#33591)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
Similar to
#32295
and
#33390,
this PR adds the `k8s.{container,pod}.memory.node.utilization` metrics.


**Link to tracking Issue:** <Issue number if applicable>
#27885

**Testing:** <Describe what testing was performed and which tests were
added.> Added unit test.

**Documentation:** <Describe the documentation added.> Added

### Manual testing

1. Using the following target Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "8070591Ki"
      limits:
        memory: "9070591Ki"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "800M", "--vm-hang", "4"]
```
2. 

![memGood](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/fae04b30-59ca-4d70-8446-f54b5a085cf7)

On a node with 32.5G of memory, the 800Mb container/Pod consumes
`0.8/32.5 = 0.0246... ≈ 0.025`.

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Jul 11, 2024
…etry#32591)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
This PR implements some basic refactoring required for
open-telemetry#32295.
Doing this in a standalone PR was requested at
open-telemetry#32295 (comment).

The change moves a library method called `newNodeSharedInformer` from
the `k8sattributes` processor package to a generic one since it can be
re-used by multiple various components.

Note: I'm not sure if a changelog is required for this change, so I left
it out for now.

**Link to tracking Issue:** <Issue number if applicable>

**Testing:** <Describe what testing was performed and which tests were
added.>

**Documentation:** <Describe the documentation added.>

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Jul 11, 2024
…ic (open-telemetry#32295)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

At the moment, we calculate the `k8s.container.cpu_limit_utilization` as
[a ratio of the container's
limits](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/kubeletstatsreceiver/documentation.md#k8scontainercpu_limit_utilization)
at
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/867d6700c31446172e6998e602c55fbf7351831f/receiver/kubeletstatsreceiver/internal/kubelet/cpu.go#L30.

Similarly we can calculate the cpu utilization as ratio of the whole
node's allocatable cpu, if we divide by the total number of node's
cores.

We can retrieve this information from the Node's `Status.Capacity`, for
example:

```console
$ k get nodes kind-control-plane -ojsonpath='{.status.capacity}'
{"cpu":"8","ephemeral-storage":"485961008Ki","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"32564732Ki","pods":"110"}
```




## Performance concerns

In order to get the Node's capacity we need an API call to the k8s API
in order to get the Node object.
Something to consider here is the performance impact that this extra API
call would bring. We can always choose to have this metric as disabled
by default and clearly specify in the docs that this metric comes with
an extra API call to get the Node of the Pods.

The good thing is that the `kubeletstats` receiver targets only one node, so
I believe it's a safe assumption to fetch only the current node because
all the observed Pods will belong to the one single local node. Correct
me if I'm missing anything here.

In addition, instead of performing the API call explicitly on every
single `scrape` we can use an informer instead and leverage its cache. I
can change this patch to this direction if we agree on this.

Would love to hear others' opinions on this.

## Todos

✅ 1) Apply this change behind a feature gate as it was indicated at
open-telemetry#27885 (comment)
✅  2) Use an Informer instead of direct API calls.

**Link to tracking Issue:** <Issue number if applicable>
ref:
open-telemetry#27885

**Testing:** <Describe what testing was performed and which tests were
added.>

I experimented with this approach and the results look correct. In order
to verify this I deployed a stress Pod on my machine to consume a target
cpu of 4 cores:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-stress
spec:
  containers:
  - name: cpu-stress
    image: polinux/stress
    command: ["stress"]
    args: ["-c", "4"]
```

And then the collected `container.cpu.utilization` for that Pod's
container was at `0.5` as expected, given that my machine (node) comes
with 8 cores in total:


![cpu-stress](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/3abe4a0d-6c99-4b4e-a704-da5789dde01b)

Unit test is also included.

**Documentation:** <Describe the documentation added.>

Added:
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32295/files#diff-8ad3b506fb1132c961e8da99b677abd31f0108e3f9ed6999dd96ad3297b51e08

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Jul 11, 2024
…-telemetry#33390)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
This PR adds the `k8s.pod.cpu.node.utilization` metric.
Follow up from
open-telemetry#32295 (comment)
(cc @TylerHelmuth) .

**Link to tracking Issue:** <Issue number if applicable>
Related to
open-telemetry#27885.

**Testing:** <Describe what testing was performed and which tests were
added.> Adjusted the respective unit test to cover this metric as well.

**Documentation:** <Describe the documentation added.> Added

Tested with a single container Pod:


![podCpu](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/9a0069c2-7077-4944-93b6-2dde00979bf3)

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Tiffany Hrabusa <30397949+tiffany76@users.noreply.github.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Jul 11, 2024
…ion` metrics (open-telemetry#33591)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
Similar to
open-telemetry#32295
and
open-telemetry#33390,
this PR adds the `k8s.{container,pod}.memory.node.utilization` metrics.


**Link to tracking Issue:** <Issue number if applicable>
open-telemetry#27885

**Testing:** <Describe what testing was performed and which tests were
added.> Added unit test.

**Documentation:** <Describe the documentation added.> Added

### Manual testing

1. Using the following target Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "8070591Ki"
      limits:
        memory: "9070591Ki"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "800M", "--vm-hang", "4"]
```
2. 

![memGood](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/fae04b30-59ca-4d70-8446-f54b5a085cf7)

On a node with 32.5G of memory, the 800Mb container/Pod consumes
`0.8/32.5 = 0.0246... ≈ 0.025`.

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>