-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement SBD watchdog and msgwait metrics #174
Conversation
@MalloZup How are these kind of metric being formatted on other cases or maybe on other exporters? I think the usage of label more flexible and elegant, but at the same time it might make harder for the consumer. Imagine that we use label and implement only 2 metrics. When a third label is introduced we might break dashboards that might not have consistent filters, while a new metric would be ignored. With that my vote is for separated metrics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MalloZup as I mentioned in the JIRA issue, the way I would go about this is to add a single metric with both a device
and a type
label, and add all the timeouts.
There is no real downside to expose also the other two types that you have left out.
The final result should be something like this:
# TYPE ha_cluster_sbd_device_timeout gauge
ha_cluster_sbd_device_timeout{type="watchdog",device="/dev/vdc"} 5
ha_cluster_sbd_device_timeout{type="allocate",device="/dev/vdc"} 2
ha_cluster_sbd_device_timeout{type="loop",device="/dev/vdc"} 1
ha_cluster_sbd_device_timeout{type="msgwait",device="/dev/vdc"} 10
@diegoakechi for these kind of use cases, the best practice is exactly the opposite of using separate metrics. See https://prometheus.io/docs/practices/instrumentation/#use-labels |
Sure, in that case lets go this way. Also, make sure to include all the metrics, so we avoid any filtering problem in the future. Thanks @stefanotorresi |
@stefanotorresi I'm ok to use the
as follow-up with @yan-gao discussion we don't need to expose the other 2 metrics since they are more confusing then helping thx for reviews. I will rework this during the day. |
okay, sounds good to me! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I suggest just a few minor tweaks to the labels and names.
5c0ec06
to
428cb9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx should be ok now
todo:
Note:
I have implemented the metric like
I didn't found any useful doc about the
loop
andallocate
metric so I even wondered if we should expose them. (In fact I haven't but if they are meant to be useful, we should definitely document this in the suse-doc)See:
https://documentation.suse.com/sle-ha/15-SP2/html/SLE-HA-all/cha-ha-storage-protect.html#sec-ha-storage-protect-watchdog-timings
@diegoakechi cc
@nick-wang cc
I personally think that if we are exposing only 2 metrics we can create for each one a new metric without label.
If we want to expose 4 then I will refactor accordingly with a
type
label. I am ok with both directions