Define Exemplar requirements in the Metrics SDK spec #1797
Comments
Requirements for SDK + Metric Exemplars

Here's a set of requirements for Metric Exemplars, based on some prototype exemplar sampling work I've done as well as a look at existing Exemplar implementations. This is for discussion (for now); I'll formalize it into a PR once the aggregator section in the SDK spec is a bit more fleshed out, since this relies on aggregators.

Basics
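As context for the rest of this proposal, here is a minimal sketch of the exemplar shape described in the metrics data model; the class and field names are illustrative only, not a proposed SDK API:

```java
import java.util.Map;

// Illustrative sketch of an exemplar record as described in the metrics data
// model: a raw measurement plus its timestamp, the active span context (if
// any), and the attributes that were dropped by the aggregation's label set.
public final class Exemplar {
  private final double value;
  private final long epochNanos;
  private final String traceId; // null when no sampled trace was active
  private final String spanId;
  private final Map<String, String> filteredAttributes;

  public Exemplar(double value, long epochNanos, String traceId, String spanId,
                  Map<String, String> filteredAttributes) {
    this.value = value;
    this.epochNanos = epochNanos;
    this.traceId = traceId;
    this.spanId = spanId;
    this.filteredAttributes = filteredAttributes;
  }

  public double getValue() { return value; }
  public long getEpochNanos() { return epochNanos; }
  public String getTraceId() { return traceId; }
  public String getSpanId() { return spanId; }
  public Map<String, String> getFilteredAttributes() { return filteredAttributes; }
}
```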
Sampling
Built-in Implementations

The following built-in samplers SHOULD be provided with easy configuration:
Prometheus Exporter

When exporting to Prometheus, the following should happen:
Prototype implementation can be found here.
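For reference, OpenMetrics lets a Prometheus exporter attach an exemplar to a histogram bucket line roughly as shown below; the metric name, trace/span ids, and values are made up for illustration:

```text
# TYPE request_latency_seconds histogram
request_latency_seconds_bucket{le="0.1"} 8 # {trace_id="4bf92f3577b34da6a3ce929d0e0e4736",span_id="00f067aa0ba902b7"} 0.043 1618000000.0
request_latency_seconds_bucket{le="1.0"} 12 # {trace_id="7d3efb1b173fecfa6a3ce929d0e0e473",span_id="53995c3f42cd8ad8"} 0.721 1618000001.5
request_latency_seconds_bucket{le="+Inf"} 12
request_latency_seconds_count 12
request_latency_seconds_sum 4.2
```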
@jsuereth nice summary! Here is my feedback:
- Similar to Span Limits, I think we will want limits on the maximum number of samples we allow (per bucket, per time-series data point, etc.). This could be useful for sync instruments where users are taking too many samples, or for the pull-exporter scenario where we don't want to hold samples for too long (e.g. if the scraper stopped pulling for hours). A sketch of such a limit configuration follows this list.
- When we need to "merge" histograms based on interpolation (whenever a lossless merge is not available), samples can actually go to the new buckets with 100% confidence, because we have the raw information such as the duration.
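To make the first point concrete, such limits might be expressed as a small configuration object in the SpanLimits style; the names below are hypothetical, not an agreed-upon API:

```java
// Hypothetical exemplar limits, mirroring the shape of SpanLimits.
// Nothing here is an agreed-upon SDK API; it only illustrates the idea of
// capping how many samples an aggregator will hold at once.
public final class ExemplarLimits {
  private final int maxExemplarsPerBucket;    // cap per histogram bucket
  private final int maxExemplarsPerDataPoint; // cap per time-series data point

  public ExemplarLimits(int maxExemplarsPerBucket, int maxExemplarsPerDataPoint) {
    this.maxExemplarsPerBucket = maxExemplarsPerBucket;
    this.maxExemplarsPerDataPoint = maxExemplarsPerDataPoint;
  }

  public int getMaxExemplarsPerBucket() { return maxExemplarsPerBucket; }
  public int getMaxExemplarsPerDataPoint() { return maxExemplarsPerDataPoint; }
}
```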
I prefer when "Sampling" means something statistical is taking place, and the word "Exemplar" explicitly suggests a selection technique that is not sampling. Thus, Sampling should be an option and instead of "Always-off sampling" or "No-sampling", maybe just "No exemplars". Instead of "Preserve-latest-with-sampled-trace", maybe "Latest exemplars". When it comes to sampling, open-telemetry/oteps#148 has recommendations for using exemplars to convey sample events with a |
Do we prefer to model MIN/MAX as exemplars (e.g. a cumulative sum of 100, with MAX 5 and MIN 2)?
Do we allow users to control what data to report with exemplars (e.g. I want the trace id / span id and all the items in the baggage vs. I just need the trace id / span id)?
I like this phrasing. When proposing defaults I'll use it.
Yes, I'm working on reservoir sampling in the Java Metrics prototype right now so we can see how well it does in practice. Specifically, right now Prometheus (and OpenCensus) sample with a "take-latest-per-histogram-bucket" approach (for histogram aggregation). I like the idea of reservoir sampling, and I like the idea of it being the default. The only question in my mind is whether we should have a "sample like OpenCensus/Prometheus" hook here.
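For concreteness, here is a minimal sketch of a fixed-size measurement reservoir (plain Algorithm R); this is not the Java prototype's actual code, just an illustration of the sampling technique being discussed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Fixed-size reservoir sampling (Algorithm R): once the reservoir is full,
// each newly offered measurement replaces a random slot with probability
// maxSize / numberSeen, so every measurement has an equal chance of being kept.
public final class MeasurementReservoir {
  private final int maxSize;
  private final List<Double> samples = new ArrayList<>();
  private long seen = 0;

  public MeasurementReservoir(int maxSize) {
    this.maxSize = maxSize;
  }

  public synchronized void offer(double measurement) {
    seen++;
    if (samples.size() < maxSize) {
      samples.add(measurement); // still filling the reservoir
      return;
    }
    long slot = ThreadLocalRandom.current().nextLong(seen);
    if (slot < maxSize) {
      samples.set((int) slot, measurement); // keep with probability maxSize/seen
    }
  }

  public synchronized List<Double> collectAndReset() {
    List<Double> result = new ArrayList<>(samples);
    samples.clear();
    seen = 0;
    return result;
  }
}
```

In an SDK, each stored sample would also carry the span context and filtered attributes (as in the sketch near the top of the thread), and a separate "take-latest-per-bucket" reservoir could slot behind the same interface to preserve the OpenCensus/Prometheus behavior.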
This is a good point. Want to call out a few things:
So, I don't think baggage labels on Exemplars are initially important here, but it's a good use case to follow up on. From my view, that's some kind of …
I think MIN/MAX could be exemplars (possibly with labels denoting this). However, I don't think that should be the default behavior, and it makes consuming the data a bit harder. I'd prefer reservoir sampling and knowing your min/max are the true min/max, BUT we could encode min/max into exemplars.
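If we did encode min/max that way, it could look something like the snippet below, reusing the illustrative `Exemplar` sketch from earlier in the thread; the `otel.exemplar.kind` label is purely hypothetical:

```java
import java.util.List;
import java.util.Map;

// Hypothetical: expose a data point's min and max as two labeled exemplars,
// using the illustrative Exemplar class sketched earlier in this thread.
final class MinMaxAsExemplars {
  static List<Exemplar> encode(double min, double max, long epochNanos) {
    return List.of(
        new Exemplar(min, epochNanos, null, null, Map.of("otel.exemplar.kind", "min")),
        new Exemplar(max, epochNanos, null, null, Map.of("otel.exemplar.kind", "max")));
  }
}
```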
Proposal here: https://github.com/jsuereth/opentelemetry-specification/tree/wip-exemplar (blocked on #1804).
What are you trying to achieve?
The metrics data model specification already covers Exemplars here.
The goal is to have the SDK specification support exemplars.
Related to #1260.