Prometheus histograms #1165
Conversation
Instead of letting the thread die, we log the exception with a stack trace to ease debugging.
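A minimal sketch of that idea, assuming a plain threading-based interval loop; the function and logger names here are illustrative, not the agent's actual internals:

```python
import logging
import threading

logger = logging.getLogger("elasticapm.metrics")


def _run_interval(func, interval, stop_event):
    # Illustrative only: call func() every `interval` seconds and log any
    # exception with its stack trace instead of letting it kill the thread.
    while not stop_event.wait(interval):
        try:
            func()
        except Exception:
            # logger.exception() records the traceback of the active exception
            logger.exception("Exception in interval timer function")


# e.g. collect metrics every 30 seconds in a daemon thread
stop = threading.Event()
threading.Thread(target=_run_interval, args=(lambda: None, 30, stop), daemon=True).start()
```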
The spec for histograms has been merged in elastic/apm-server#5360
💚 Build Succeeded
This looks great. 👍
It seems like some users will likely want to modify the defaults on these histogram metrics, specifically the buckets. Are we planning to make that configurable, or are we going to just point users to the prometheus bridge?
@basepi it already is configurable in the API call when creating a histogram, see e.g. the tests: https://github.com/elastic/apm-agent-python/pull/1165/files#diff-478c95bc12f712e1c8be230780ea39fe00e2213dd8162c81656d4199c0881e35R91
Generally, the metrics API isn't considered public, but I think we should reconsider it at some point, as it covers most use cases now. Forcing people to use the …
Yeah, we should probably document that and make some public API entry points (since we're largely trying to move away from interacting with the Client object directly). I created an issue: #1166
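For reference, a rough sketch of using that API directly; the `self.histogram(...)` call with a `buckets` argument is the one the bridge code in this PR uses, while the metric set class, metric name, and the `update()` call are illustrative assumptions, since the API isn't public:

```python
from elasticapm.metrics.base_metrics import MetricsSet


class CustomMetricSet(MetricsSet):
    """Illustrative only: registering this set with the client's metrics
    registry remains an internal detail, as discussed above."""

    def record_request_size(self, size_bytes):  # hypothetical helper
        hist = self.histogram(
            "custom.request_size",  # hypothetical metric name
            buckets=(100, 1_000, 10_000, float("inf")),
        )
        hist.update(size_bytes)  # assumption: method for recording one observation
```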
I've only really looked closely at the Prometheus bits and how things are being sent to the server. I think that we should consolidate on one approach for converting bucket ranges to single values. Please take a look at the comments in the apm-server code I linked and let me know if you think that's a reasonable approach.
sample_pos = 0
prev_val = 0
counts = []
values = []
name = self._registry.client.config.prometheus_metrics_prefix + name
while sample_pos < len(samples):
    sample = samples[sample_pos]
    if "le" in sample.labels:
        values.append(float(sample.labels["le"]))
        counts.append(sample.value - prev_val)
        prev_val = sample.value
        sample_pos += 1
    else:
        # we reached the end of one set of buckets/values, this is the "count" sample
        self.histogram(name, unit=unit, buckets=values, **sample.labels).val = counts
        prev_val = 0
        counts = []
        values = []
        sample_pos += 3  # skip sum/created samples

METRIC_MAP = {
    "counter": _prom_counter_handler,
    "gauge": _prom_gauge_handler,
    "summary": _prom_summary_handler,
    "histogram": _prom_histogram_handler,
}
In APM Server we have some code for handling OpenTelemetry Metrics, following Prometheus's histogram_quantile method: https://github.com/elastic/apm-server/blob/f7c7b0c6873d40f533c072b3d9751c8fe285e63e/processor/otel/metrics.go#L191-L252
I think we should probably follow the same approach here:
- for +Inf "le", use the preceding bucket's value
- for the first bucket only: if it has a negative "le", use the value as-is; otherwise use half its value (midpoint to zero)
- for all other buckets, use the midpoint from that bucket's value to the preceding bucket's
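A small Python sketch of those three rules, for illustration only (not the agent's actual code; it assumes the upper bounds are sorted and that at least one finite bucket precedes +Inf):

```python
import math


def bucket_midpoints(upper_bounds):
    """Convert Prometheus bucket upper bounds ("le" values) into single
    representative values, following the three rules above."""
    values = []
    for i, le in enumerate(upper_bounds):
        if math.isinf(le):
            # +Inf bucket: use the preceding bucket's upper bound
            values.append(upper_bounds[i - 1])
        elif i == 0:
            # first bucket: negative bound as-is, otherwise midpoint to zero
            values.append(le if le < 0 else le / 2.0)
        else:
            # all other buckets: midpoint between this bound and the preceding one
            values.append((le + upper_bounds[i - 1]) / 2.0)
    return values


# bucket_midpoints([0.5, 1.0, float("inf")]) -> [0.25, 0.75, 1.0]
```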
Is that the same strategy that metricbeat is using to convert prometheus histograms?
Is that storage format optimizing for tdigest or hdrhistogram-based percentile aggregations?
Is that the same strategy that metricbeat is using to convert prometheus histograms?
Good question. Looking at https://github.com/elastic/beats/blob/master/x-pack/metricbeat/module/prometheus/collector/histogram.go, no - there's some similarity, but it's not the same. Also I miswrote the rules above, now corrected.
Metricbeat is doing the following:
- for +Inf "le", interpolate by adding half the difference between the previous two buckets to the second last bucket.
- for the first bucket, use the midpoint from the bucket's value to zero. I think that's probably a bug, as it could lead to incorrect ordering of histogram values: e.g. if you have le=-100 and then le=-99, the first one will report as -50 and the second as -99.5 (see the sketch after this list)
- for all other buckets, use the midpoint from that bucket's value to the preceding bucket's
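A comparable sketch of the Metricbeat scheme, mainly to make the ordering issue with negative bounds concrete (illustrative only; the real implementation is in the histogram.go file linked above):

```python
import math


def metricbeat_midpoints(upper_bounds):
    # Illustrative sketch of the Metricbeat conversion described in the list above.
    values = []
    for i, le in enumerate(upper_bounds):
        if math.isinf(le):
            # extrapolate: second-last bound plus half the width of the last finite bucket
            values.append(upper_bounds[i - 1] + (upper_bounds[i - 1] - upper_bounds[i - 2]) / 2.0)
        elif i == 0:
            # midpoint from the first bound to zero (problematic for negative bounds)
            values.append(le / 2.0)
        else:
            # midpoint between this bound and the preceding one
            values.append((le + upper_bounds[i - 1]) / 2.0)
    return values


# metricbeat_midpoints([-100.0, -99.0, float("inf")]) -> [-50.0, -99.5, -98.5]
# the first bucket's value (-50.0) ends up greater than the second's (-99.5)
```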
@exekias WDYT about changing Metricbeat's implementation to what I outlined in #1165 (comment)? Or otherwise, can you elaborate on the rationale behind the current implementation? We should probably all do the same thing. In retrospect, it would be helpful if Elasticsearch accepted bucket ranges 😅
Is that storage format optimizing for tdigest or hdrhistogram-based percentile aggregations?
t-digest. Prometheus histograms are general and may have negative values, which won't work with HDRHistogram.
At some point, we've discussed whether to encode in _meta if the histogram is optimized for tdigest or hdrhistogram. Do you remember where we landed?
Seems relevant, as the default transaction duration metrics are stored with hdrhistogram in mind, whereas we need to use tdigest for prometheus to account for negative values.
That was in an email thread with Lens folks. I think everyone agreed that it was a reasonable idea, but I haven't taken the idea any further yet. Lens will default to (like Elasticsearch) tdigest, and there's an issue open to support hdrhistogram: elastic/kibana#98499.
I've thought a little more about that since, and I think we would only ever end up setting it for known fields like transaction.duration.histogram. I'm not aware of any metrics APIs that restrict to positive-value histograms, which would be necessary to default to hdrhistogram instead. If there are any such metrics APIs, then it would be a reasonable extension, but for now I think we should just assume tdigest is going to be used for application metrics.
I never thought about negative le values, is that a possibility at all? I think we can adjust the Metricbeat implementation if so. FYI @ChrsMark
If the underlying measurement can be negative (energy flow, temperature, monetary values etc), the bucket limits also have to be negative, right?
I did a bit more research, and negative buckets are indeed a possibility, although they are so poorly supported that I don't think they can be used for real: https://github.com/prometheus/client_golang/blob/cb5c8ff250fc376cc4088d824a262565fde83612/prometheus/histogram.go#L50-L56
Anyway I'm happy to adjust the implementation, as it won't change anything for positive buckets.
Any thoughts on why to put +Inf on the last bucket value? My intent with interpolating a higher value was for it to move the needle as much as the previous midpoints do. Again... if many values are ending up in the +Inf bucket, that's considered a bad config.
Any thoughts on why to put +Inf on the last bucket value? My intent with interpolating a higher value was for it to move the needle as much as the previous midpoints do. Again... if many values are ending up in the +Inf bucket, that's considered a bad config.
I assume this question is for me, based on the description in my initial comment. It's not +Inf on the last bucket, but the second-last bucket's upper bound. I mainly chose that following what Prometheus's histogram_quantile function does:
If a quantile is located in the highest bucket, the upper bound of the second highest bucket is returned.
My thinking is that since we don't have any insight into that last bucket, reporting anything greater than the second-last bucket's upper limit risks overstating the value. Reporting the second-last bucket's upper limit is a bit more conservative, ensuring that we report a valid lower bound for percentile queries that fall beyond the configured buckets. For example, with buckets le=0.5, 1, +Inf, observations that land in the +Inf bucket are reported as 1.
Thanks for the detailed explanation, I've created elastic/beats#26903 to adjust the Beats implementation 👍
elasticapm/metrics/base_metrics.py
Outdated
if counter is not noop_metric:
    val = counter.val
    if val or not counter.reset_on_collect:
        samples[labels].update({name: {"value": val, "type": "counter"}})
I think we should omit "type": "counter" if the counter is resetting; just omit "type" altogether for now. Counter metrics are expected to be monotonically increasing.
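A hedged sketch of that suggestion applied to the snippet above (the surrounding names come from the diff; the restructuring itself is only illustrative):

```python
if counter is not noop_metric:
    val = counter.val
    if val or not counter.reset_on_collect:
        sample = {"value": val}
        if not counter.reset_on_collect:
            # only monotonically increasing counters get labelled as counters
            sample["type"] = "counter"
        samples[labels].update({name: sample})
```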
@@ -375,6 +396,46 @@ def val(self, value):
        self._val, self._count = value


class Histogram(BaseMetric):
    DEFAULT_BUCKETS = (0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10, float("inf"))
@axw what are the default buckets that APM Server creates for transaction duration metrics?
Too many to list! APM Server uses HDRHistogram with 2 significant figures by default.
changes LGTM FWIW :) thanks
* catch and log exceptions in the interval timer function: instead of letting the thread die, we log the exception with a stack trace to ease debugging.
* implement histogram metrics and a prometheus histogram bridge: the spec for histograms has been merged in elastic/apm-server#5360
* trying to debug failure that only happens on CI...
* adapt prometheus histograms to use centroids instead of upper limit buckets
* move midpoint calculation into base metrics
What does this pull request do?
The spec for histograms has been merged in elastic/apm-server#5360. This implements the agent-side histogram, as well as a Prometheus bridge.
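A rough usage sketch of the bridge; the prometheus_metrics config flag is an assumption based on the prometheus_metrics_prefix setting visible in the handler code above, so the exact option names may differ:

```python
import elasticapm
from prometheus_client import Histogram

# assumption: the bridge is switched on via an agent config option
client = elasticapm.Client(
    service_name="my-service",
    prometheus_metrics=True,
)

# metrics defined via prometheus_client are collected by the bridge and
# reported as agent histogram metrics (names get prometheus_metrics_prefix)
request_duration = Histogram(
    "request_duration_seconds",
    "Time spent processing a request",
    buckets=(0.05, 0.1, 0.25, 0.5, 1, 2.5, float("inf")),
)
request_duration.observe(0.42)
```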