This repository has been archived by the owner on Sep 12, 2023. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add proposal for Prometheus metrics coverage #77
Add proposal for Prometheus metrics coverage #77
Changes from 4 commits
cf4c0b1
3b2daaa
4893e7a
49a0e8a
1aa73e7
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Want to make sure the scope. This is outside operator. By default cadvisor expose the metrics and user can use these by their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I think it's good to document this here so we know that we don't need to report these metrics by ourselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@terrytangyuan Forgot to mention this one. Do you think it is more appropriate to make this
Gauge
as well? Do you want to represent the history failures or the current failed jobs?Can we list the metrics label in this doc as well? This is important and useful, too. Like we can combine
pending jobs
running jobs
andfailed jobs
into one metric job_status{status="pending/failed/running"}, WDYT?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep it as it is for now so that the metrics are consistent for metrics with past tense v.s. metrics with present continuous tense. Currently there are no labels yet as it's hard to differentiate metrics with two different tenses and choose different metric types for those metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I am thinking if we should change to
job_duration_from_created_to_complated_seconds_total
. Another thing is seems it would be good to usecomplete
deleted
as labels, but duration requires two and it would be a little bit hard to query. I think adding labels into metrics to distinguish them makes sense.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am following the naming practice outlined here: https://prometheus.io/docs/practices/naming/. I prefer the current naming without label as it's more intuitive but we can certainly revisit/revise later.