Add openmetrics exemplar support #320
Conversation
Is there any way you can give me a high-level overview of exemplars and how a typical application, with one of the official Prometheus client SDKs, is using them? I'm finding it hard to conceptualize, and based on the changes so far, I think we need to step back and talk about the feature first before trying to design the interface to it. |
Absolutely, I'll write down what I know about it here, let me know if there are any other details that you would like to know.
Yes, I find it easier to reason about something after writing some code, so I started with that, not intending this to be the solution, but just something to have a discussion around.

**Use case**

Exemplars are a Prometheus invention that can be used to attach additional data to some samples for a metric. An exemplar consists of the recorded value, a timestamp for when it happened, and additional labels; the timestamp is optional (I haven't tried what happens if it is not set, but I'm guessing the rendered position would then be the collection time rather than exactly when it happened). This is then used in e.g. Grafana to show little dots on graphs built over these metrics. When hovering over these dots, the additional labels are shown. The most common use for this is to add a trace ID to the exemplar labels, so that Grafana can add a link into the tracing system. This is one of the main trace discoverability features when using Tempo as a tracing backend. An example screenshot of how it is displayed in Grafana:

**Exporting**

For Prometheus to be able to ingest exemplars, they need to be exported with the metric. Each label set for a metric can expose one exemplar, so in Prometheus there will be at most one exemplar per metric + label set per collection. If multiple exemplars are written to the same metric + label set within the same collection, the later one just overwrites the earlier one. (I would say that exemplars make the most sense with Prometheus histograms, where there can be one exemplar per bucket, since the bucket is also a label.) The format they are exposed in is defined in OpenMetrics.

**Other implementations**

I've mostly looked at how this is implemented in the Prometheus golang client, e.g. how it is used on histograms. It is backed by an atomic value store that just gets overwritten. The value that is stored is found here.

**Other thoughts**

I noticed now that exemplars should not be implemented for summaries and gauges, only for histograms and counters, and for counters the OpenMetrics documentation is a bit vague. IMO it would be enough to implement this only for histograms, but that is still the hardest one, so we might as well implement it for both. I'm not sure what happens if a metric doesn't get its exemplar updated and gets collected several times. I think Prometheus handles that, but I need to do some more research. |
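To make the shape of the data concrete, here is a minimal sketch (in Rust, with invented type names, not anything from this PR) of what a single exemplar carries per the description above:

```rust
use std::time::SystemTime;

/// Hypothetical representation of one exemplar: the observed value, an
/// optional timestamp for when it was observed, and a small set of extra
/// labels (most commonly a trace ID) that is separate from the metric's
/// own label set.
struct Exemplar {
    value: f64,
    timestamp: Option<SystemTime>,
    labels: Vec<(String, String)>, // e.g. [("trace_id", "abc123")]
}
```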
Just a note: this is still on my backlog to review, things have just been a bit hectic for me lately. 😅 |
No worries and no stress, but thanks for letting me know! I'm also on discord most of the time if you want to discuss something when you look at this |
So, my main question after reading your explanation (thank you for that!): do the exemplars have to be logically related? Which is to say... if I have two concurrent tasks/threads/whatever emitting metrics and both of them hit the random number generator lottery of "you should track these metrics as exemplars", do all the metrics they touch need to have the same exemplar label (trace ID, etc) or could some of the metrics have trace ID 1, and some have trace ID 2, etc? Like, in terms of what the exemplar values are when the scrape endpoint is observed after both of those tasks/threads/whatever have finished and emitted all their metrics. |
They can be different; there is no connection between exemplars across different metrics. |
Alright, that's good news. 👍🏻 Depending on the behavior necessary, it seems like it could be possible to get away with sampling a value at the point of actually rendering the metrics. That is, every time the metrics are rendered -- which is just when we get a scrape request, or our interval to hit the push gateway ticks -- we collect the outstanding histogram samples and pick one of them to be our new exemplar. Avoiding new exporter-specific methods seems like the highest priority item in my mind. We should ideally be able to just collect exemplars with people using the existing macros as-is. |
Sorry for being so slow to reply. Yes, getting the exemplar on render would definitely be enough, and it sounds like you are on to something smart, but I don't understand how exactly 😅. How would one add an exemplar if there were no exporter-specific method for it? |
Right, so my thought is that the render logic would essentially be responsible for figuring out if it was time to sample a new exemplar for each unique histogram. So we'd have the exemplar value itself, and probably a timestamp for "when was this exemplar observed?". In the render logic, where it checks to see if it needs to consume any more raw samples from the underlying histogram storage, we'd see how long ago we last captured an exemplar for the given histogram. If we've exceeded our timeout, then we take one of the samples we just consumed and make it our exemplar, and update our timestamp. Explained more contextually: We start with an exporter that has a default initial state: no metrics observed yet, etc. We'll refer to the time that these actions/operations occur with the
So, overall, I'm forcing some design decisions here for the sake of explaining my idea:
|
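A rough sketch of the render-time approach described above, in Rust; the types, the keying scheme, and the idea that the render path hands over the freshly consumed raw samples are all assumptions made up for illustration:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Last exemplar captured for one histogram. Only the value is tracked here;
/// where the exemplar's extra labels (e.g. a trace ID) come from is the open
/// question discussed in the following comments.
struct ExemplarState {
    value: f64,
    captured_at: Instant,
}

/// Hypothetical piece of exporter state consulted from the render path.
struct ExemplarSampler {
    timeout: Duration,
    // Keyed by whatever identifies the histogram + its label set.
    state: HashMap<String, ExemplarState>,
}

impl ExemplarSampler {
    /// Called while rendering, with the raw samples just consumed for one
    /// histogram. If the previous exemplar is older than `timeout`, replace
    /// it with one of the freshly consumed samples and reset the timestamp.
    fn maybe_sample(&mut self, key: &str, samples: &[f64]) {
        let now = Instant::now();
        let stale = self
            .state
            .get(key)
            .map_or(true, |s| now.duration_since(s.captured_at) >= self.timeout);

        if stale {
            if let Some(&value) = samples.first() {
                self.state
                    .insert(key.to_owned(), ExemplarState { value, captured_at: now });
            }
        }
    }
}
```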
But how do we register the exemplar labels with the value? They need to be added when recording the value, as they contain data that is not known by the metric itself, most commonly a trace ID that refers to some external tracing system. That trace ID will be different for every value recorded (as long as there are not multiple metrics recorded during the same request, which I thought your initial question was about). |
Are the labels for exemplars only meant for exemplars, period? Like, would you not typically include, say, a trace ID label unless you wanted it to be an exemplar? |
Yes, exactly, it's a separate label set from the regular label set for the metric itself. The common use case is to attach a trace ID to a specific observation, and what I mean is a distributed trace ID that comes from outside, like from Jaeger, Tempo, Zipkin, etc.

Usually when you build some kind of web server with incoming HTTP requests, there will be some ingress gateway that initiates a trace, creates a trace ID, and attaches it to the HTTP request as headers. Your web server then reads these, attaches a bunch of "spans" to it, and uploads those to whatever tracing backend you have. A trace ID can span multiple web servers, if they make requests to each other within the same initial request.

The exemplar is then used to find a trace in the tracing backend from a graph built from metrics, usually on latency histograms: when you see your latency graph and want to figure out why some requests take a long time, you click on the little exemplar dot representing a specific observed request that took a long time to process.

Using the golang client, the definition would look something like this, specifying the metric's labels and buckets:

```go
var (
	histogram = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "foo_latency",
			Buckets: []float64{0.01, 0.1, 1, 10},
		},
		[]string{"path"},
	)
)
```

Then when observing a value, it would be something like this, very simplified, but getting the trace ID from the request and adding it to the exemplar label set:

```go
histogram.With(prometheus.Labels{"path": "/foo"}).(prometheus.ExemplarObserver).ObserveWithExemplar(
	latency, prometheus.Labels{"traceID": req.Header.Get("x-trace-id")},
)
```

This is what the Prometheus exporter will render, just to clarify what it looks like:
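An illustrative OpenMetrics exposition with exemplars on two buckets (the values, timestamps, and trace IDs here are made up):

```
# TYPE foo_latency histogram
foo_latency_bucket{path="/foo",le="0.01"} 0
foo_latency_bucket{path="/foo",le="0.1"} 0
foo_latency_bucket{path="/foo",le="1.0"} 1 # {traceID="0e50f7274b9baa35"} 0.67 1625476437.123
foo_latency_bucket{path="/foo",le="10.0"} 2 # {traceID="16d8b53ffea5917b"} 7.9 1625476438.456
foo_latency_bucket{path="/foo",le="+Inf"} 2
foo_latency_sum{path="/foo"} 8.57
foo_latency_count{path="/foo"} 2
```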
In this example there are two different requests that have been recorded with an exemplar, one in bucket le=1 and one in bucket le=10, with their exact value, when they were observed, and this extra set of labels that includes their trace ID. (There can be at most one observed exemplar per metric + metric label set + bucket; if two are observed for the same one, the later one just replaces the first.) Sorry if I'm poorly explaining things that you already know, I'm just trying to describe the flow of how these things are used in the kind of area I work in. |
Hey @tobz, I lost the momentum a bit here, but would appreciate some feedback. I'm happy to continue on this if I know what way to go. |
Hey @fredr! I'm no stranger to losing momentum. :) I've started getting more serious about planning out the remaining work for `metrics`.

With that said, I think my biggest concern, sort of right from the beginning: it doesn't feel right for there to be macros/methods/etc that are specific to Prometheus. I'm certainly not against third-party exporters having their own specific macros or something of the sort... but it's not the design pattern I want to promote in `metrics` itself.

What I would want to see is figuring out a way to make the determination of when exemplars should be tracked something that functions more like a scoped behavior, i.e. a function that takes a closure, and changes a thread-local to influence the behavior of any code running in the closure. Alternatively, and maybe I'm still not entirely understanding how exemplars are typically tracked/triggered, some sort of approach that is deterministic, i.e. for any metric that has a specific label key, which is configured as part of the exporter itself, sample updates to metrics with that label at a configurable rate to derive exemplars.

With an approach as described above, it maintains one of the original design goals of `metrics`. I'm happy to continue providing feedback on the design as long as it's along the lines described above, because that represents both the most ergonomic and the easiest-to-maintain path, in my eyes. If executing on what I've posted above feels like it would consume too much time, I totally understand. |
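A minimal sketch of the scoped, thread-local style of API being described here; the names are invented and this is not an actual `metrics` API, just an illustration of the shape:

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical thread-local holding the exemplar labels (e.g. a trace ID)
    // that should be attached to metric updates made inside the scope.
    static EXEMPLAR_LABELS: RefCell<Option<Vec<(String, String)>>> = RefCell::new(None);
}

/// Run `f` with the given exemplar labels active. Any histogram/counter
/// updates made inside the closure could consult the thread-local and attach
/// the labels as an exemplar, without any exporter-specific calls at the
/// instrumentation site.
fn with_exemplar_labels<R>(labels: Vec<(String, String)>, f: impl FnOnce() -> R) -> R {
    EXEMPLAR_LABELS.with(|cell| *cell.borrow_mut() = Some(labels));
    let result = f();
    EXEMPLAR_LABELS.with(|cell| *cell.borrow_mut() = None);
    result
}

fn main() {
    with_exemplar_labels(vec![("trace_id".into(), "abc123".into())], || {
        // Metric updates recorded here would be eligible to become exemplars,
        // e.g. a latency observation on a histogram.
    });
}
```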
That sounds great. Do you already know if that will have any implications for how buckets are registered for Prometheus, or any other upcoming Prometheus-related changes?
Alright, my thinking behind the current PoC implementation and suggestions was to keep all things that are specific to Prometheus in the prometheus-exporter, but you have a good point that you then wouldn't get those specific features out of the box when using libraries that expose metrics via this lib.
I'm not sure how to implement that, so I'll have to do some digging. If there are any such implementations in this or other crates that you know about, I would appreciate any pointers.
Thanks, I'll keep at it whenever I have time to spare; this would be a really useful feature for trace discoverability. |
I will close this PR as I currently struggle to find the time to put into this, and maybe someone else wants to pick it up. |
This is a work-in-progress implementation for #175
Opening this as a draft to have a discussion about whether this is a valid approach to go forward with, or if there is a different path that is cleaner/better.
I've tried to keep all changes to the prometheus exporter, as this is a prometheus-specific feature, but have had to add downcasting to the recorder, so that a handle to the specific recorder can be fetched. But maybe there is some more generic api that can be used. I'm not very familiar with other metrics systems, but maybe it is common to add additional data to observations?
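For context on the downcasting pattern: getting from a trait object back to a concrete recorder generally goes through `std::any::Any`. This is a generic, self-contained sketch, not the PR's actual code, and the trait here merely stands in for the real recorder trait:

```rust
use std::any::Any;

// Stand-in trait for a recorder; the `as_any` hook is what makes downcasting
// to a concrete exporter type possible.
trait Recorder {
    fn as_any(&self) -> &dyn Any;
}

struct PrometheusRecorder {
    // Exporter-specific state (exemplar storage, etc.) would live here.
}

impl Recorder for PrometheusRecorder {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Only succeeds when the installed recorder really is the Prometheus one.
fn prometheus_handle(recorder: &dyn Recorder) -> Option<&PrometheusRecorder> {
    recorder.as_any().downcast_ref::<PrometheusRecorder>()
}
```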
There is an example of how to increment a counter and record a sample in a histogram with exemplars. This is extra problematic for histograms, as we need to know which bucket to assign the exemplar to.
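On the bucket question: for an upper-bound ("le") bucket layout, the exemplar is attached to the smallest bucket that contains the observed value, which is just a search over the bucket bounds. A small generic sketch (not code from this PR):

```rust
/// Return the index of the first bucket whose upper bound is >= value,
/// or None if the value only falls into the implicit +Inf bucket.
fn bucket_for(value: f64, upper_bounds: &[f64]) -> Option<usize> {
    upper_bounds.iter().position(|&le| value <= le)
}

fn main() {
    let buckets = [0.01, 0.1, 1.0, 10.0];
    assert_eq!(bucket_for(0.67, &buckets), Some(2)); // falls in the le=1.0 bucket
    assert_eq!(bucket_for(42.0, &buckets), None);    // only the +Inf bucket
}
```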
I'm also thinking about what would be a good macro API for this. Since an exemplar is just another set of labels, we can't just add it as another parameter.
E.g. with something like the following, it's hard to know which labels should be metric labels and which should be exemplar labels:
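A purely hypothetical macro call, shown only to make the ambiguity concrete (this is not syntax the `metrics` macros accept for exemplars):

```rust
// Hypothetical syntax only: is "path" a metric label and "trace_id" an
// exemplar label, or the other way around? Nothing in the call says.
histogram!("foo_latency", latency, "path" => "/foo", "trace_id" => trace_id);
```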
We could probably say that exemplar labels are always passed as expressions, so something along those lines would probably be possible (I haven't written any macros, but I'm guessing this is possible).
It might still be confusing, as it is easy to mix the different label sets up.
Also, where would these macros live? I don't think we can add them to metrics-macros (behind a feature), since that would be a cyclic dependency if it needed to depend on the PrometheusRecorder.