I'm not sure where to put this request (I can also submit a PR for discussion), but lustre_disk_io_total would be nicer if it were exposed as a histogram instead of a counter. I guess it depends on what people are using the metric for.
@jcftang This is a perfectly fine place to put it. We appreciate the feedback for sure. I think we avoided histograms in general for this first-ish pass through Lustre data, but if you think it'd be more valuable as a histogram it's absolutely something we can investigate.
The first use case is to see the frequency of block sizes. This metric sort of provides that already, but it doesn't really fit with what Prometheus's histogram_quantile() function expects (I may be doing something wrong with my query). This would help in identifying whether the system is reasonably set up on the OST side.
The second use case is to get a histogram of the latencies for each block size, though I'm not sure whether this data is available from Lustre. It could serve as an indicator of whether the system is behaving as expected.
Having those two things would help quite a bit in tuning and identifying behaviours on the OST side, and also in seeing roughly what users are doing.
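To illustrate why a plain per-size counter doesn't plug into histogram_quantile(), here is a rough Python sketch. It assumes the counter gives one independent count per block size, whereas Prometheus histograms use cumulative buckets (each bucket counts everything at or below its upper bound, ending with +Inf). The sample counts and the function names to_cumulative and estimate_quantile are made up for illustration; this is not the exporter's code, and the interpolation only approximates what histogram_quantile() does.

```python
import math


def to_cumulative(per_size_counts):
    """Convert {block_size_bytes: count} into cumulative (upper_bound, count) buckets."""
    buckets = []
    running = 0
    for size in sorted(per_size_counts):
        running += per_size_counts[size]
        buckets.append((float(size), running))
    # Prometheus histograms always end with a +Inf bucket holding the total.
    buckets.append((math.inf, running))
    return buckets


def estimate_quantile(q, buckets):
    """Estimate the q-quantile by linear interpolation inside the matching bucket."""
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            in_bucket = count - lower_count
            if math.isinf(upper_bound) or in_bucket == 0:
                # Nothing to interpolate against; fall back to the last finite bound.
                return lower_bound
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical per-size I/O counts (bytes -> number of I/Os), not real Lustre data.
sample = {4096: 10, 65536: 30, 1048576: 60}
cumulative = to_cumulative(sample)
median_size = estimate_quantile(0.5, cumulative)
```

With counts in this cumulative shape, the median block size can be estimated directly; with the raw per-size counts, the same query would give misleading results because each bucket would be missing the observations from the smaller sizes.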