adds WatchList Latency to APIResponsivenessPrometheus #2764

p0lyn0mial
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

WatchList latency is gathered at the 50th, 90th, and 99th percentiles of request duration for watch-list requests, broken down by group, resource, and scope.

The new metric (kubernetes/kubernetes#120490) allows for comparing watch-list requests with standard list requests and measuring performance of the new requests in general.
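
As a rough illustration of the breakdown described above, the sketch below folds per-quantile samples into one latency record per (group, resource, scope). The `sample`, `key`, and `latency` types and the `groupByRequestKind` helper are hypothetical names invented for this example, and the numbers are made up; this is not the code added by the PR.

```go
package main

import "fmt"

// sample is a hypothetical, simplified stand-in for one query result row.
type sample struct {
	Group, Resource, Scope string
	Quantile               float64 // 0.5, 0.9, or 0.99
	Seconds                float64
}

// key identifies one kind of watch-list request.
type key struct{ Group, Resource, Scope string }

// latency holds the three gathered percentiles, in seconds.
type latency struct{ Perc50, Perc90, Perc99 float64 }

// groupByRequestKind collects the quantile samples that share the same
// group, resource, and scope into a single latency record.
func groupByRequestKind(samples []sample) map[key]latency {
	out := make(map[key]latency)
	for _, s := range samples {
		k := key{s.Group, s.Resource, s.Scope}
		l := out[k]
		switch s.Quantile {
		case 0.5:
			l.Perc50 = s.Seconds
		case 0.9:
			l.Perc90 = s.Seconds
		case 0.99:
			l.Perc99 = s.Seconds
		}
		out[k] = l
	}
	return out
}

func main() {
	// Illustrative values only, not measured results.
	samples := []sample{
		{"", "pods", "namespace", 0.5, 0.12},
		{"", "pods", "namespace", 0.9, 0.30},
		{"", "pods", "namespace", 0.99, 0.75},
	}
	byKind := groupByRequestKind(samples)
	fmt.Printf("%+v\n", byKind[key{"", "pods", "namespace"}])
}
```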

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 8, 2024
@k8s-ci-robot k8s-ci-robot requested review from mborsz and wojtek-t July 8, 2024 14:24
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 8, 2024
Member

@wojtek-t wojtek-t left a comment

Small comments; overall it looks the way I wanted it to.

watchListLatencyMetricName = "apiserver_watch_list_duration_seconds"
// watchListLatencyQuery placeholders must be replaced with (1) quantile (2) metric name (3) query window size
watchListLatencyQuery = "histogram_quantile(%.2f, sum(rate(%v_bucket{}[%v])) by (group, version, resource, scope, le))"

Member

Looking into test results:
https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/perf-tests/2764/pull-perf-tests-clusterloader2/1810319004401668096/artifacts/APIResponsivenessPrometheus_simple_load_2024-07-08T14:50:01Z.json

I see only one entry for watchlist (pod list on namespace scope).

Is that expected? What is issuing this request?

Contributor Author

Good question.
Do we have the metrics available so that I could execute the Prometheus query manually?
Could it be that the run was using a server with the watchlist feature enabled?

Contributor Author

> could it be that the run was using a server with enabled watchlist feature ?

It looks like this was the case: the most recent run doesn't have any watchlist entries (the feature was turned off on the server some time ago).

https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/perf-tests/2764/pull-perf-tests-clusterloader2/1818234814688399360/artifacts/APIResponsivenessPrometheus_simple_load_2024-07-30T11:06:18Z.json

Member

So I was wondering why we've seen only one such entry, but I guess it's because it's disabled in client-go by default (and was enabled only in KCM). So that makes sense.

}

func (ex *fakeQueryExecutor) Query(query string, _ time.Time) ([]*model.Sample, error) {

Member

nit: remove empty line

@wojtek-t wojtek-t self-assigned this Jul 26, 2024
@p0lyn0mial
Contributor Author

/test perf-tests-clusterloader2

@k8s-ci-robot
Contributor

@p0lyn0mial: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-perf-tests-benchmark-kube-dns
  • /test pull-perf-tests-clusterloader2
  • /test pull-perf-tests-clusterloader2-e2e-gce-scale-performance-manual
  • /test pull-perf-tests-clusterloader2-kubemark
  • /test pull-perf-tests-util-images
  • /test pull-perf-tests-verify-all-python
  • /test pull-perf-tests-verify-dashboard
  • /test pull-perf-tests-verify-lint
  • /test pull-perf-tests-verify-test

The following commands are available to trigger optional jobs:

  • /test pull-perf-tests-100-adhoc
  • /test pull-scheduler-perf
  • /test soak-tests-capz-windows-2019

Use /test all to run the following jobs that were automatically triggered:

  • pull-perf-tests-clusterloader2
  • pull-perf-tests-clusterloader2-kubemark
  • pull-perf-tests-verify-all-python
  • pull-perf-tests-verify-lint
  • pull-perf-tests-verify-test

In response to this:

/test perf-tests-clusterloader2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@p0lyn0mial
Contributor Author

/test pull-perf-tests-clusterloader2

@p0lyn0mial p0lyn0mial force-pushed the upstream-api-responsiveness-watch-list branch from 71544a3 to bead663 Compare July 30, 2024 11:33
@wojtek-t
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 30, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: p0lyn0mial, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 30, 2024
@k8s-ci-robot k8s-ci-robot merged commit c76b37d into kubernetes:master Jul 30, 2024
7 checks passed