Improve CPU performance measurement precision #132

mergenci · 2023-09-04T10:43:40Z

Description of your changes

This PR addresses issues raised in [1] and comments that follow. Results are reported in [2].

Switch from node_exporter's per-node CPU metrics to cadvisor's per-container metrics.
Query CPU metrics only over the experiment duration.
Very short-running experiments may not produce enough measurements for performance to be reported. Handle such cases by also querying outside the experiment duration.

I have:

Run make reviewable test to ensure this PR is ready for review. (I used build submodule commit a6e25a to work around a golangci-lint issue.)

How has this code been tested

I ran the following command against my cluster, which has the necessary software (UXP, provider-aws, Prometheus) installed.

cd cmd/perf
go run main.go --mrs manifests/ecr.yaml=5 --provider-pods "$(kubectl get pods -o name | grep provider-aws | cut -d/ -f2)" --provider-namespace upbound-system --node "$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'):9100" --step-duration 1s

Signed-off-by: Cem Mergenci <cmergenci@gmail.com>

cmd/perf/internal/quantify.go

So far, we've been using node_exporter's metrics, which reported metrics per node. We switch to using cadvisor's per-container metrics, which measure provider performance in isolation.

So far, we've been reporting CPU utilization as a percentage of total CPU resources. 100% CPU utilization on an 8-CPU machine meant all cores were fully utilized. We switch to reporting CPU utilization as a percentage of one CPU. Full utilization of an 8-CPU machine is now reported as 800%.

Signed-off-by: Cem Mergenci <cmergenci@gmail.com>

mergenci added 3 commits September 1, 2023 14:23

Add missing space to sample command invocation.

3d4a958

Signed-off-by: Cem Mergenci <cmergenci@gmail.com>

Clarify error message returned on assertion failure.

be99402

Signed-off-by: Cem Mergenci <cmergenci@gmail.com>

Calculate duration in a method.

066f909

Signed-off-by: Cem Mergenci <cmergenci@gmail.com>

mergenci requested review from ulucinar and sergenyalcin as code owners September 4, 2023 10:43

mergenci mentioned this pull request Sep 4, 2023

Inherit golangci-lint version from build submodule. #133

Merged

1 task

sttts reviewed Sep 5, 2023

cmd/perf/internal/quantify.go (outdated)

mergenci force-pushed the cpu-measurement-precision branch from 61e104b to 0197340 on September 6, 2023 11:46

mergenci force-pushed the cpu-measurement-precision branch from 0197340 to 7a82110 on September 6, 2023 11:55
