docs: update metrics info for site #633

Merged
merged 1 commit into from
Jul 30, 2024

Conversation

googs1025
Member

update metrics info for site

fix: #613

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 30, 2024
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jul 30, 2024

netlify bot commented Jul 30, 2024

Deploy Preview for kubernetes-sigs-jobset ready!

Name Link
🔨 Latest commit 49e89c5
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-jobset/deploys/66a8375246773100084ef53b
😎 Deploy Preview https://deploy-preview-633--kubernetes-sigs-jobset.netlify.app

@googs1025
Member Author

Prometheus Metrics

JobSet exposes Prometheus metrics for monitoring the health of the controller.

Installation Examples

The following example shows how to install the Prometheus Operator for the JobSet system.

JobSet controller health

Use the following metrics to monitor the health of the JobSet controller:

| Metric name | Type | Description | Labels |
| --- | --- | --- | --- |
| `controller_runtime_reconcile_errors_total` | Counter | The total number of reconciliation errors encountered by each controller. | `controller`: name of controller (i.e. use value `jobset` to obtain metrics for jobset controller) |
| `controller_runtime_reconcile_time_seconds` | Histogram | The latency of a reconciliation attempt in seconds. | `controller`: name of controller (i.e. use value `jobset` to obtain metrics for jobset controller) |
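
As a rough illustration of how the two controller metrics above might be queried, the sketch below builds PromQL expressions for the reconcile error rate and a reconcile latency quantile. The helper names, the `5m` window, and the `0.99` quantile are assumptions chosen for illustration and are not part of JobSet. The `_bucket` suffix follows the standard Prometheus histogram naming convention.

```python
def reconcile_error_rate(controller: str = "jobset", window: str = "5m") -> str:
    """PromQL for the per-second rate of reconciliation errors of one controller.

    The default window of 5m is an assumption; pick one that fits your scrape interval.
    """
    return (
        f'sum(rate(controller_runtime_reconcile_errors_total'
        f'{{controller="{controller}"}}[{window}]))'
    )


def reconcile_latency_quantile(
    quantile: float = 0.99, controller: str = "jobset", window: str = "5m"
) -> str:
    """PromQL for a latency quantile computed from the reconcile-time histogram.

    Histograms are exposed as <name>_bucket series with an `le` label,
    which histogram_quantile() needs to be preserved by the aggregation.
    """
    return (
        f'histogram_quantile({quantile}, sum by (le) ('
        f'rate(controller_runtime_reconcile_time_seconds_bucket'
        f'{{controller="{controller}"}}[{window}])))'
    )


if __name__ == "__main__":
    print(reconcile_error_rate())
    print(reconcile_latency_quantile())
```

The resulting strings can be pasted into the Prometheus expression browser or used as the expression of an alerting rule.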

JobSet metrics

Use the following metrics to monitor the health of the JobSets created by the JobSet controller:

| Metric name | Type | Description | Labels |
| --- | --- | --- | --- |
| `jobset_failed_total` | Counter | The total number of failed JobSets. | `jobset_name`: name of jobset |
| `jobset_completed_total` | Counter | The total number of completed JobSets. | `jobset_name`: name of jobset |
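
To show how the `jobset_name` label can be used, here is a minimal sketch that reads Prometheus text-format output and totals the two counters per JobSet. The sample data, the JobSet names `demo-a` and `demo-b`, and the `summarize` helper are hypothetical. The regex is deliberately simple and assumes label values contain no `}` characters.

```python
import re
from collections import defaultdict

# Hypothetical scrape output; real values come from the controller's metrics endpoint.
SAMPLE = """\
# TYPE jobset_failed_total counter
jobset_failed_total{jobset_name="demo-a"} 2
# TYPE jobset_completed_total counter
jobset_completed_total{jobset_name="demo-a"} 5
jobset_completed_total{jobset_name="demo-b"} 1
"""

# Matches lines such as: jobset_failed_total{jobset_name="demo-a"} 2
SAMPLE_RE = re.compile(
    r'^(jobset_failed_total|jobset_completed_total)'
    r'\{[^}]*jobset_name="([^"]*)"[^}]*\}\s+([0-9.eE+-]+)'
)


def summarize(exposition: str) -> dict:
    """Return {jobset_name: {"failed": n, "completed": m}} from scraped text."""
    totals = defaultdict(lambda: {"failed": 0.0, "completed": 0.0})
    for line in exposition.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # skips comments, blank lines, and unrelated metrics
        metric, name, value = match.groups()
        key = "failed" if metric == "jobset_failed_total" else "completed"
        totals[name][key] += float(value)
    return dict(totals)


if __name__ == "__main__":
    for name, counts in sorted(summarize(SAMPLE).items()):
        print(name, counts)
```

In practice a PromQL query grouped by `jobset_name` would usually replace this kind of manual parsing; the sketch only illustrates what the label distinguishes.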

@googs1025
Member Author

To make this easier to read, I pasted the markdown directly here.

@danielvegamyhre
Contributor

/lgtm
/approve

Thanks for keeping the docs up to date!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 30, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danielvegamyhre, googs1025

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 30, 2024
@k8s-ci-robot k8s-ci-robot merged commit 09968fc into kubernetes-sigs:main Jul 30, 2024
12 checks passed
@danielvegamyhre danielvegamyhre mentioned this pull request Aug 19, 2024
20 tasks
Successfully merging this pull request may close these issues.

add monitoring metrics for jobset
3 participants