
Prometheus histograms #1165

Merged: 9 commits merged into elastic:master on Jul 19, 2021

Conversation

@beniwohli (Contributor):

What does this pull request do?

The spec for histograms has been merged in elastic/apm-server#5360. This implements the agent-side histogram, as well as a Prometheus bridge.

This also catches and logs exceptions in the interval timer function: instead of letting the thread die, we log the exception with a stack trace to ease debugging.

@beniwohli beniwohli added this to the 7.14 milestone Jun 14, 2021
@beniwohli beniwohli requested review from basepi and axw June 14, 2021 12:15
@apmmachine (Contributor) commented on Jun 14, 2021:

💚 Build Succeeded


Build stats

  • Start Time: 2021-07-19T09:55:26.732+0000

  • Duration: 25 min 18 sec

  • Commit: ff607b1

Test stats 🧪

  • Failed: 0
  • Passed: 9521
  • Skipped: 8604
  • Total: 18125

Trends 🧪: Build Times and Tests (charts omitted)

💚 Flaky test report

Tests succeeded.


@basepi (Contributor) left a comment:

This looks great. 👍

It seems like some users will likely want to modify the defaults on these histogram metrics, specifically the buckets. Are we planning to make that configurable, or are we going to just point users to the prometheus bridge?

@beniwohli (Contributor, Author):

@basepi it's already configurable in the API call when creating a histogram; see e.g. the tests: https://github.com/elastic/apm-agent-python/pull/1165/files#diff-478c95bc12f712e1c8be230780ea39fe00e2213dd8162c81656d4199c0881e35R91

Generally, the metrics API isn't considered public, but I think we should reconsider that at some point, as it covers most use cases now. Forcing people to use the prometheus_client if they aren't migrating away from Prometheus in the first place feels like a crutch.
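
For illustration, a rough sketch of what the linked test exercises, using the (currently internal) metrics API; `metrics_set`, the metric name, and the numbers below are assumptions for the example, not code from the PR:

# hypothetical usage sketch; assumes `metrics_set` is an elasticapm MetricsSet
# instance obtained from the agent's metrics registry
histo = metrics_set.histogram(
    "checkout.duration",                       # made-up metric name
    unit="s",
    buckets=[0.01, 0.1, 1, 10, float("inf")],  # custom bucket upper bounds
)
# the Prometheus bridge in this PR assigns per-bucket counts directly:
histo.val = [42, 7, 1, 0, 0]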

@basepi (Contributor) commented on Jun 14, 2021:

Yeah we should probably document that and make some public API entry-points (since we're largely trying to move away from interacting with the Client object directly). I created an issue: #1166

@axw (Member) left a comment:

I've only really looked closely at the Prometheus bits and how things are being sent to the server. I think that we should consolidate on one approach for converting bucket ranges to single values. Please take a look at the comments in the apm-server code I linked and let me know if you think that's a reasonable approach.

Comment on lines +89 to +115
# body of _prom_histogram_handler (see METRIC_MAP below): converts the
# cumulative Prometheus bucket samples into per-bucket counts
sample_pos = 0
prev_val = 0
counts = []
values = []
name = self._registry.client.config.prometheus_metrics_prefix + name
while sample_pos < len(samples):
    sample = samples[sample_pos]
    if "le" in sample.labels:
        values.append(float(sample.labels["le"]))
        counts.append(sample.value - prev_val)
        prev_val = sample.value
        sample_pos += 1
    else:
        # we reached the end of one set of buckets/values, this is the "count" sample
        self.histogram(name, unit=unit, buckets=values, **sample.labels).val = counts
        prev_val = 0
        counts = []
        values = []
        sample_pos += 3  # skip sum/created samples


METRIC_MAP = {
    "counter": _prom_counter_handler,
    "gauge": _prom_gauge_handler,
    "summary": _prom_summary_handler,
    "histogram": _prom_histogram_handler,
}
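
For intuition, the cumulative-to-per-bucket conversion performed by the loop above looks like this (sample numbers are made up):

# Prometheus exposes cumulative bucket counts; the handler stores the deltas
les        = [0.1, 0.5, 1.0, float("inf")]   # "le" labels of the bucket samples
cumulative = [4, 9, 10, 12]                  # sample.value for each bucket
per_bucket = [c - p for c, p in zip(cumulative, [0] + cumulative[:-1])]
assert per_bucket == [4, 5, 1, 2]            # what ends up in Histogram.val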
@axw (Member) commented on Jun 15, 2021:

In APM Server we have some code for handling OpenTelemetry Metrics, following Prometheus's histogram_quantile method: https://github.com/elastic/apm-server/blob/f7c7b0c6873d40f533c072b3d9751c8fe285e63e/processor/otel/metrics.go#L191-L252

I think we should probably follow the same approach here (a small sketch follows the list below):

  • for +Inf "le", use the preceding bucket's value
  • for the first bucket only: if it has a negative "le", use the value as-is; otherwise use half its value (midpoint to zero)
  • for all other buckets, use the midpoint from that bucket's value to the preceding bucket's
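
A minimal sketch of those three rules (an illustrative helper, not code from apm-server or this PR):

import math

def bucket_value(les, i):
    """Representative value for bucket i, given sorted "le" upper bounds ending in +Inf."""
    le = les[i]
    if math.isinf(le):
        return les[i - 1]                    # +Inf bucket: preceding bucket's upper bound
    if i == 0:
        return le if le < 0 else le / 2.0    # first bucket: as-is if negative, else midpoint to zero
    return (les[i - 1] + le) / 2.0           # otherwise: midpoint to the preceding upper bound

# e.g. les = [0.1, 0.5, 1.0, float("inf")] yields 0.05, 0.3, 0.75, 1.0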

Member:

Is that the same strategy that metricbeat is using to convert prometheus histograms?
Is that storage format optimized for tdigest or hdrhistogram-based percentile aggregations?

@axw (Member) replied on Jun 15, 2021:

Is that the same strategy that metricbeat is using to convert prometheus histograms?

Good question. Looking at https://github.com/elastic/beats/blob/master/x-pack/metricbeat/module/prometheus/collector/histogram.go, no - there's some similarity, but it's not the same. Also I miswrote the rules above, now corrected.

Metricbeat is doing the following:

  • for +Inf "le", interpolate by adding half the difference between the previous two buckets to the second last bucket.
  • for the first bucket, use the midpoint from the bucket's value to zero. I think that's probably a bug, as it could lead to incorrect ordering of histogram values: e.g. if you have le=-100 then le=-99, the first one will report as -50 and the second as -99.5 (see the sketch after this list).
  • for all other buckets, use the midpoint from that bucket's value to the preceding bucket's
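
To make the ordering concern concrete, a small sketch comparing the two first-bucket rules on the negative-bucket example above (illustrative numbers only):

first_le, second_le = -100.0, -99.0

metricbeat_first = first_le / 2.0            # midpoint to zero -> -50.0
proposed_first = first_le                    # negative bound kept as-is -> -100.0
second_value = (first_le + second_le) / 2.0  # midpoint between bounds -> -99.5

# Metricbeat's rule makes the first bucket (-50.0) report *greater* than the
# second (-99.5), breaking ordering; the proposed rule keeps -100.0 < -99.5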

@exekias WDYT about changing Metricbeat's implementation to what I outlined in #1165 (comment)? Or otherwise, can you elaborate on the rationale behind the current implementation? We should probably all do the same thing. In retrospect, it would be helpful if Elasticsearch accepted bucket ranges 😅

Is that storage format optimized for tdigest or hdrhistogram-based percentile aggregations?

t-digest. Prometheus histograms are general and may have negative values, which won't work with HDRHistogram.

Member:

At some point we discussed whether to encode in _meta if the histogram is optimized for tdigest or hdrhistogram. Do you remember where we landed?
Seems relevant, as the default transaction duration metrics are stored with hdrhistogram in mind, whereas we need to use tdigest for Prometheus to account for negative values.

Member:

That was in an email thread with the Lens folks. I think everyone agreed that it was a reasonable idea, but I haven't taken it any further yet. Lens will default to tdigest (like Elasticsearch), and there's an open issue to support hdrhistogram: elastic/kibana#98499.

I've thought a little more about that since, and I think we would only ever end up setting it for known fields like transaction.duration.histogram. I'm not aware of any metrics APIs that restrict to positive-value histograms, which would be necessary to default to hdrhistogram instead. If there are any such metrics APIs, then it would be a reasonable extension, but for now I think we should just assume tdigest is going to be used for application metrics.

Comment:

I never thought about negative le values; is that a possibility at all? I think we can adjust the Metricbeat implementation if so. FYI @ChrsMark

@beniwohli (Contributor, Author):

If the underlying measurement can be negative (energy flow, temperature, monetary values etc), the bucket limits also have to be negative, right?

Comment:

I did a little more research, and negative buckets are indeed a possibility, although they're so poorly supported that I don't think they can be used in practice: https://github.com/prometheus/client_golang/blob/cb5c8ff250fc376cc4088d824a262565fde83612/prometheus/histogram.go#L50-L56

Anyway, I'm happy to adjust the implementation, as it won't change anything for positive buckets.

Any thoughts on why to put +Inf on the last bucket value? My intent with interpolating a higher value was to move the needle as much as the previous midpoints do. Again, if many values end up in the +Inf bucket, that's considered a bad config.

Member:

Any thoughts on why to put +Inf on the last bucket value? My intent with interpolating a higher value was to move the needle as much as the previous midpoints do. Again, if many values end up in the +Inf bucket, that's considered a bad config.

I assume this question is for me, based on the description in my initial comment. It's not +Inf on the last bucket, but the second last bucket's upper bound. I mainly chose that following what Prometheus's histogram_quantile function does:

If a quantile is located in the highest bucket, the upper bound of the second highest bucket is returned.

My thinking is that since we don't have any insight into that last bucket, reporting anything greater than the second last bucket's upper limit risks overstating the value. Reporting the second last bucket's upper limit is a bit more conservative, ensuring that we report a valid lower bound for percentile queries that fall beyond the configured buckets.

Comment:

Thanks for the detailed explanation; I've created elastic/beats#26903 to adjust the Beats implementation 👍

if counter is not noop_metric:
    val = counter.val
    if val or not counter.reset_on_collect:
        samples[labels].update({name: {"value": val, "type": "counter"}})
Member:

I think we should omit "type": "counter" if the counter is resetting; just omit "type" altogether for now. Counter metrics are expected to be monotonically increasing.
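
A minimal sketch of that suggestion applied to the snippet above (just illustrating the idea, not the final change):

if counter is not noop_metric:
    val = counter.val
    if val or not counter.reset_on_collect:
        sample = {"value": val}
        if not counter.reset_on_collect:
            # only non-resetting counters are truly monotonic, so only they keep the type
            sample["type"] = "counter"
        samples[labels].update({name: sample})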

@@ -375,6 +396,46 @@ def val(self, value):
        self._val, self._count = value


class Histogram(BaseMetric):
    DEFAULT_BUCKETS = (0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10, float("inf"))
Member:

@axw what are the default buckets that APM Server creates for transaction duration metrics?

Member:

Too many to list! APM Server uses HDRHistogram with 2 significant figures by default.

@beniwohli beniwohli merged commit dec91b4 into elastic:master Jul 19, 2021
@axw (Member) left a comment:

changes LGTM FWIW :) thanks

beniwohli added a commit to beniwohli/apm-agent-python that referenced this pull request Sep 14, 2021
* catch and log exceptions in the interval timer function

Instead of letting the thread die, we log the exception with
a stack trace to ease debugging.

* implement histogram metrics and a prometheus histogram bridge

The spec for histograms has been merged in
elastic/apm-server#5360

* trying to debug failure that only happens on CI...

* adapt prometheus histograms to use centroids instead of upper limit buckets

* move midpoint calculation into base metrics