Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify preferred aggregation temporality; give Views a selection strategy #2314

Closed
wants to merge 22 commits into from

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented Feb 4, 2022

Changes

Clarify "preferred aggregation temporality" and introduce a mechanism to configure, via View, whether the preferred aggregation temporality is honored.

Sum, Histogram, and ExpontentialHistogram data points add a new "strategy" allowing them to honor or override the Reader/Exporter's preference. UpDownCounter instruments (sync and async), in particular, will override the exporter preference and choose cumulative by default. A view configuration can change this.

Note: a hypothetical strategy to override the preference and choose delta is not introduced, though it conceptually makes sense. This could be added later if anyone finds a use.

I propose the name "Stateless" as the name of a preference that requires no long term state, which means to use Delta temporality for synchronous instruments and leave asynchronous instruments in their natural Cumulative form. In a future PR, I will propose adding a new preference called Stateless, described as:

- **Stateless**: The Metric Exporter prefers Delta temporality for
synchronous instruments and Cumulative temporality for asynchronous
instruments to avoid long-term memory.

Part of #2065

@jmacd jmacd requested review from reyang, pirgeo and a team February 4, 2022 00:01
Copy link
Member

@pirgeo pirgeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying this. I generally like it, I am just not entirely bought into the idea that cumulative->delta conversion is something that most people would not want (see my comment there)

specification/metrics/api.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Outdated Show resolved Hide resolved
@arminru arminru added area:sdk Related to the SDK spec:metrics Related to the specification/metrics directory labels Feb 4, 2022
@jmacd
Copy link
Contributor Author

jmacd commented Feb 4, 2022

In open-telemetry/opentelemetry-go#2350, I wrote the PR "removes a bunch of complexity that the Metrics SIG group decided not to specify for the default SDK."

Maintaining Cumulative-to-Delta conversion in the SDK has a definite cost and as far as I know is not useful to very many. We are debating a scenario where a user has a non-OTLP protocol that supports Delta measurements and Gauge measurements but does not support Cumulative measurements. The precedent we know about (StatsD) would use Gauge for these measurements. I polled the NewRelic engineers during that conversation, who have a protocol that is strictly based on Deltas, and they confirmed that they would convert Cumulatives to Gauges. @jack-berg could you confirm?

The test question we should answer is if you have a number like Heap Usage, which is modeled as an OTel UpDownCounter, and you have a protocol that does Gauge and Delta, would you really like to see your Heap Usage turned into a Delta? I believe the answer is no.

We had this discussion in November before the current Spec was written, before I removed Subtract() support from the OTel-Go implementation. I had implemented Subtract() for Sum points but not for Histogram points, and the total complexity for this (nearly useless, IMO) feature did not warrant keeping it.

The current specification is very unclear, and to my reading never required an implementation to implement Cumulative-to-Delta translation:

[OpenTelemetry SDK](../overview.md#sdk)
authors MAY choose the best idiomatic design for their language:

* Whether to treat the temporality settings as recommendation or enforcement.
  For example, if the temporality is set to Delta, would the SDK want to perform
  Cumulative->Delta conversion for an [Asynchronous
  Counter](./api.md#asynchronous-counter), or downgrade it to a
  [Gauge](./datamodel.md#gauge), or keep consuming it as Cumulative due to the
  consideration of [memory
  efficiency](./supplementary-guidelines.md#memory-management)?
* Refer to the [supplementary
  guidelines](./supplementary-guidelines.md#aggregation-temporality), which have
  more context and suggestions.

I don't believe I'm actually changing the specification here, except to introduce the term "Stateless" to describe a preference for no changes of temporality, which is neither always Cumulative nor always Delta.

@jmacd
Copy link
Contributor Author

jmacd commented Feb 4, 2022

No one has challenged the idea that "Stateless" be a good name for the preference to not modify temporality in an SDK.

There are a number of reviewers who seem to think Cumulative-to-Delta translation is needed for backends that support only Delta temporality. I emphatically do not believe this is true. In any case, if this were true it might lead to a specification change that requires SDKs support Cumulative-to-Delta, which is clearly not included in the current specification. In that case "Delta" is the natural name for a preference to convert all data into delta temporality.

You might point to the existence of the OTC cumulativetodelta processor as proof that someone wants this. The use that we know of for that code was a Prometheus-to-Cloudwatch pipeline, where the Cumulative values are first turned into Deltas and then turned into Rates. The point here is that Deltas were not the end goal of that transformation, and CloudWatch does not directly support Deltas. You can convert Cumulative to Delta to Rate data for backends that support only Gauges, but here we are discussing a very esoteric feature that belongs in a collector processor, not in the SDK.

Co-authored-by: Georg Pirklbauer <georg.pirklbauer@dynatrace.com>
@jmacd
Copy link
Contributor Author

jmacd commented Feb 4, 2022

I put this PR together mainly because:

(1) I thought we had decided against requiring Cumulative-to-Delta. I we want to change that to a requirement, we should open a new PR.
(2) I think the "stateless" preference should be called out as it's the one that requires no long term memory

There is another potential preference that my employer finds appealing, it's like Stateless except for UpDownCounter. We prefer UpDownCounter in cumulative form because otherwise we would have to read old data or maintain checkpoints to know the current value of the metric, and the current value of an UpDownCounter metric is semantically useful.

This preference could be named "StatelessMonotonic", since it is stateless except for the one synchronous non-monotonic instrument. Why prefer to use Delta for Counter and Histogram, but not UpDownCounter, you ask? The totals of a Counter or Histogram are meaningful totals, but because these counts are monotonic there is little-to-no semantic loss in the use of Deltas.

@jack-berg
Copy link
Member

Maintaining Cumulative-to-Delta conversion in the SDK has a definite cost and as far as I know is not useful to very many. We are debating a scenario where a user has a non-OTLP protocol that supports Delta measurements and Gauge measurements but does not support Cumulative measurements. The precedent we know about (StatsD) would use Gauge for these measurements. I polled the NewRelic engineers during that conversation, who have a protocol that is strictly based on Deltas, and they confirmed that they would convert Cumulatives to Gauges. @jack-berg could you confirm?
The test question we should answer is if you have a number like Heap Usage, which is modeled as an OTel UpDownCounter, and you have a protocol that does Gauge and Delta, would you really like to see your Heap Usage turned into a Delta? I believe the answer is no.

This PR's timing has really convenient timing because I've actually just been looking into this exact scenario more closely. As mentioned, our metrics system uses delta, and we do convert both cumulative non-monotonic sum data and delta non-monotonic sum to gauges. And as you suggest, having heap usage represented by asynchronous up down counter turned into a delta is problematic, since it means that in order to determine the current heap usage at a point in time Tn in time you have to sum up all the delta values from T0 -> Tn. And this doesn't work with gauge data because when gauges undergo temporal aggregation, the last value wins semantics means that the deltas required to compute current heap usage are dropped.

I agree with this change 👍.

Copy link
Member

@pirgeo pirgeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After considering this in more detail, here are some more thoughts:
I am not sure if we can lump together async UpDownCounter and async Counter instruments in this PR. Lets start with the UpDownCounter:

I think what is throwing me off is that the async UpDownCounter's current value is called "cumulative". It is the value of something that is observed at the time the callback is called. The "cumulative" wording seems to refer to the fact that the changes are made by adding and subtracting values from the underlying value, which is later observed. The UpDownCounter itself does not have a temporality, it just reports the current value of the underlying object(s), just like a gauge does. Are values always "cumulative" just because they are created and updated in an additive fashion? To me, it does not seem to have anything to do with the export temporality.

With that being said, the UpDownCounter DOES feel very similar to a gauge. There are very fine nuances between exporting these as a gauge vs. exporting them as a non-monotonic cumulative sum (to put it into OTLP terms, and to confirm my understanding that this is what UpDownCounters would be exported as). I understand that most vendors do not care about the possibility of adding together the heap size of two processes (something using a Sum would provide out of the box) or provide ways of adding two gauges together anyway. To conclude, and to actually bring this thread back to the question about Cumulative->Delta conversion, I also agree that this conversion is probably not necessary for UpDownCounters. UpDownCounters (especially async ones) are conceptually so close to Gauges that I agree that only exporting the total value recorded by the callback (the cumulative/current value) makes sense.

For the Async Counter however, this is a different story, and I think we need to focus our attention on that. Consider something like CPU time or page faults. If I want Deltas in my backend, I'd want the delta to the previous value, and not the total value on every export. In this case I definitely want to be able to add CPU times together to aggregate them. In this case language like "CPU time since the last export" makes sense again (which is admittedly not entirely true for Heap size or queue length).
Therefore, I am a strong proponent of keeping the Delta conversion at least for Async Counters. Dropping support for it will render Async counter data mostly useless for delta-only backends (exporting them as gauges without calculating a delta will lead to an ever-increasing metric).

specification/metrics/api.md Outdated Show resolved Hide resolved
specification/metrics/api.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Show resolved Hide resolved
specification/metrics/sdk.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Outdated Show resolved Hide resolved
@jack-berg
Copy link
Member

FYI, @pirgeo makes a really good point about this change improving the behavior of async up down counter but making the experience worse for async counters. Still interested in this topic - taking some more time to think carefully about the various options and the repercussions.

@jmacd
Copy link
Contributor Author

jmacd commented Feb 11, 2022

Thank you @pirgeo and @jack-berg. I agree with the points raised and am willing to reconsider my position.
It sounds like three distinct preferences are real and potentially useful:

  1. Cumulative: All cumulative
  2. StatelessMonotonic: Delta Histogram, Delta Counter, Cumulative Async Counter, Cumulative UpDownCounter
  3. DeltaMonotonic: Delta Histogram, Delta Counter, Cumulative UpDownCounter, Delta Async Counter, Cumulative UpDownCounter

It seems like the following preferences are less useful, potentially not needed:

  1. Delta: All delta
  2. FullyStateless: Delta UpDownCounter, ...

@jack-berg
Copy link
Member

@jmacd I think there might be some typos in three preferences you mention: StatelessMonotonic doesn't mention Async UpDownCounter, and DeltaMonotonic mentions CumulativeUpDownCounter twice. Here's my take on the most useful aggregation temporality semantics for a delta backend. It would be super useful if there was an aggregation temporality strategy that would produce this.

Instrument Export Temporality + Aggregation Use Case Explanation
Histogram Delta Histogram HTTP Server Duration Delta histograms include use min / max, minimize memory usage in SDK
Sync Counter Delta Sum Count of processed bytes Useful to analyze diff of bytes processed, minimize memory usage in SDK
Async Counter Delta Sum Process CPU Time Useful to know diff in CPU time, not cumulative CPU time
Sync UpDownCounter Cumulative Sum Items in a queue Useful to know size of queue, not change in size of queue
Async UpDownCounter Cumulative Sum Process Memory Usage Useful to analyze current memory usage, not change in memory usage
Gauge Gauge Temperature Temporality not applicable

In light of this conversation and the push to stabilize the metric SDK, I'm worried that the SDK's language around MetricReader and MetricExporter may lead to SDK design that makes it difficult to allow aggregation temporality strategies like we're discussing here.

The metric SDK spec says that MetricReader and MetricExporter have methods to access their preferred aggregation temporality. This lends itself to SDK design like java's in which MetricExporter#getPreferredTemporality() returns an enum type called AggregationTemporality whose values are DELTA and CUMULATIVE, and which could not be extended to include the other strategies.

If we want to support these types of strategies I think it's important to adjust the SDK to rename "preferred temporality" for MetricReader and MetricExporter to something like "aggregation temporality strategy". Aggregation temporality strategies at SDK launch might only include "ALWAYS_DELTA" and "ALWAYS_CUMULATIVE", but could be extended with additional strategies later.

@jmacd
Copy link
Contributor Author

jmacd commented Feb 14, 2022

@jack-berg Thank you. Yes, I left a lot of bytes out of my summary, and you filled it in nicely. Speaking as a vendor now, we would be glad if users configured the temporality preference / strategy that you outlined.

@alanwest
Copy link
Member

The test question we should answer is if you have a number like Heap Usage, which is modeled as an OTel UpDownCounter, and you have a protocol that does Gauge and Delta, would you really like to see your Heap Usage turned into a Delta? I believe the answer is no.

I agree that a delta non-monotonic sum is not useful here. I believe this is true irrespective of sync vs. async. That is, I can't think of an UpDownCounter use case where I'd ever want things turned into a delta.

Maybe a slightly different spin on things - instead of expanding on the notion on preferred temporality, what if UpDownCounters were exempt from temporality much like gauges. From the perspective of the OTLP data model, measurements from UpDownCounters would always result in a non-monotonic sum with temporality unspecified (if unspecified is not allowed then I guess just cumulative).

@alanwest
Copy link
Member

alanwest commented Mar 3, 2022

@jmacd Thank you for the deep thought you've put into to this effort.

I just want to make sure I'm reading your table correctly ... I'm not following the distinction between the "Stateless" and "Selected Temporality, Stateless Preference" columns. Is the former simply communicating which temporality results in no need for the SDK to maintain state - whereas the "Selected Temporality, Stateless Preference" column represents the actual effect of the Stateless Temporality setting?

Or put another way, setting PreferredTemporality=Stateless means the SDK would actually need to maintain state for synchronous UpDownCounters?

Co-authored-by: Alan West <3676547+alanwest@users.noreply.github.com>
@reyang
Copy link
Member

reyang commented Mar 4, 2022

We've discussed this during the Friday triage meeting. This PR is considered as "very important and nice to have, but NOT necessarily blocking the stable release of the SDK spec".

@jmacd will scope down this PR, leaving "stateless" out. We can take this change in the initial SDK spec release.

@reyang reyang added the release:allowed-for-ga Editorial changes that can still be added before GA since they don't require action by SIGs label Mar 4, 2022
@reyang reyang added this to the Metrics API/SDK Stable Release milestone Mar 4, 2022
@reyang reyang removed this from the Metrics API/SDK Stable Release milestone Mar 4, 2022
@jmacd
Copy link
Contributor Author

jmacd commented Mar 4, 2022

I removed a single line from this PR:

- **Stateless**: The Metric Exporter prefers Delta temporality for
synchronous instruments and Cumulative temporality for asynchronous
instruments to avoid long-term memory.

PTAL.

@jmacd jmacd changed the title Metrics SDK: Clarify specification for preferred aggregation temporality Clarify preferred aggregation temporality; give Views a selection strategy Mar 4, 2022
@jmacd
Copy link
Contributor Author

jmacd commented Mar 4, 2022

@reyang pointed out an inconsistency in the OTLP exporter SDK spec which refers to the aggregation temporality, not to the preference. I consider this to be an accidental omission. See the fix: c5a5a74

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@@ -501,7 +501,7 @@ pattern](https://en.wikipedia.org/wiki/Asynchronous_method_invocation) and
See the [general requirements for asynchronous instruments](#asynchronous-instrument-api).

Note: Unlike [Counter.Add()](#add) which takes the increment/delta value, the
callback function reports the absolute value of the counter. To determine the
callback function reports the current value of the counter. To determine the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understand the change, but would prefer this in a separate PR since is about just "clarification" for API.

@@ -672,25 +679,53 @@ The SDK SHOULD provide a way to allow `MetricReader` to respond to
idiomatic approach, for example, as `OnForceFlush` and `OnShutdown` callback
functions.

### Preferred aggregation temporality
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too complicated for our users. I would like to think of a simpler solution.

Can we have our MetricReader support a configuration option that says exactly for each type of instrument what to produce?

I feel that we complicate things here because we have this "notion" of preferred, because of these we are trying to say "you may prefer delta, but you don't know what you want, I will give you cumulative for this".

I think this is too complicated, have we consider that every "exporter" instead of having a "preferred" to actually come with a concrete temporality <type, temporality> list so they have full control, and we don't try to be smarter like "Preferred".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have our MetricReader support a configuration option that says exactly for each type of instrument what to produce?

That was one of the initial proposals, the "5-way" function as it has been called in the review thread above. I think that's fine as a means of configuring the exporters that support both options. If we accept the 5-way function, which seems easy for the in-memory exporter, and fix the output temporality for the console and OTLP exporters to always cumulative, then:

(1) vendors that want always-delta or sometimes-delta will provide implementations of the 5-way function they prefer
(2) Does this mean we can erase the concept of preferred temporality entirely?

@@ -1,4 +1,4 @@
# OpenTelemetry Metrics Exporter - OTLP
1# OpenTelemetry Metrics Exporter - OTLP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1# OpenTelemetry Metrics Exporter - OTLP
# OpenTelemetry Metrics Exporter - OTLP

@jmacd
Copy link
Contributor Author

jmacd commented Mar 4, 2022

I've been convinced that we should eliminate the concept of preferred aggregation temporality, see #2314 (comment)

@jmacd
Copy link
Contributor Author

jmacd commented Mar 5, 2022

See #2404

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sdk Related to the SDK release:allowed-for-ga Editorial changes that can still be added before GA since they don't require action by SIGs spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants