
Metrics: Scope the Views API #466

Closed
jmacd opened this issue Feb 13, 2020 · 12 comments · Fixed by #1730
Labels
area:sdk (Related to the SDK) · priority:p1 (Highest priority level) · release:required-for-ga (Must be resolved before GA release, or nice to have before GA) · spec:metrics (Related to the specification/metrics directory)

Comments

@jmacd
Contributor

jmacd commented Feb 13, 2020

The OpenCensus system gave us a precedent for a "Views" API over metrics, allowing users to configure how instruments are aggregated, including which operator to use and which dimensions to use.

We have debated whether the developers, who are responsible for defining metric instruments, should have the ability to declare aggregations directly, at the point of definition. As an alternative, we have discussed excluding this ability from the instrument declaration and requiring all aggregations to be configured through a Views API.

In the 0.4 specification we are considering removing WithRecommendedKeys, which is the only existing facility in the specification that lets a user configure aggregation.

As a whole, the Views API needs to be scoped for OpenTelemetry. I believe this should be considered an SDK API, meaning the default SDK will implement this facility. Will a Views API be written for every language? (I assume not.) Will some languages expect to export raw events, to allow a Views API to execute in the OTel collector? (I assume so.)
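
For concreteness, here is a hypothetical sketch of what an SDK-level views configuration could look like; the `View` class, `configure_views` function, and the instrument and aggregation names are illustrative assumptions, not anything defined by the specification.

```python
# Hypothetical sketch only: nothing here is part of the OpenTelemetry specification.
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class View:
    instrument_name: str         # which instrument this view applies to
    aggregation: str             # e.g. "sum", "histogram", "last_value"
    dimensions: Tuple[str, ...]  # label keys to aggregate over; other keys are dropped


def configure_views(views: Sequence[View]) -> None:
    """Register views with a hypothetical SDK; instruments without a
    matching view could instead export raw events for the collector."""
    for v in views:
        print(f"{v.instrument_name}: {v.aggregation} over {list(v.dimensions)}")


configure_views([
    View("http.server.duration", "histogram", ("http.method", "http.status_code")),
])
```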

@jmacd jmacd added the spec:metrics Related to the specification/metrics directory label Feb 13, 2020
@jmacd jmacd added this to the v0.5 milestone Feb 13, 2020
@toumorokoshi
Member

I'm not confident in my perspective on every issue here, but I wanted to add some thoughts regardless:

Static vs Dynamic / API driven configuration

Is there some way that both can be supported? It's certainly true that many existing solutions do not provide a way to dynamically configure the aggregation, but it seems to me that the common developer workflow would be:

  1. As a developer, I may choose a cardinality-conservative aggregation but pass more labels than I need. Since cardinality on metrics generally correlates with cost, starting conservative is a pragmatic option.
  2. As issues arise, I will need to use more of the labels to better understand the drivers of my metric. I will probably turn this on temporarily, since it helps me debug an issue at that point in time, and turn it off once the issue is investigated.

Thus I feel that having a static way to define the aggregation initially, for the normal operating case, while allowing dynamic overrides for deeper debugging, is the most desirable scenario.
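
As a rough sketch of that workflow, under a purely hypothetical API (none of these names exist in the spec): a conservative default set of label keys, with a temporary widening applied while debugging.

```python
# Hypothetical sketch of "conservative default, temporary override";
# none of these names come from the OpenTelemetry spec.
DEFAULT_DIMENSIONS = ("env", "service")                    # low cardinality, always on
DEBUG_DIMENSIONS = DEFAULT_DIMENSIONS + ("customer_id",)   # high cardinality, debug only

active_dimensions = {"http.requests": DEFAULT_DIMENSIONS}

def enable_debug_view(metric_name: str) -> None:
    """Temporarily widen the label keys used to aggregate a metric."""
    active_dimensions[metric_name] = DEBUG_DIMENSIONS

def disable_debug_view(metric_name: str) -> None:
    """Drop back to the conservative default once the issue is understood."""
    active_dimensions[metric_name] = DEFAULT_DIMENSIONS
```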

Will a Views API be written for every language

I feel it's valuable to ensure consistent availability of such APIs across languages. It would be quite a shock to rewrite a service in a different language, only to find that some flexibility in your telemetry was lost in the process.

Why is it expected that a views API would not exist in every language?

APIs in the collector

Definitely, the collector is the right place for this type of dynamic dimensionality configuration. I'd almost go so far as to say we shouldn't implement dynamic dimensionality anywhere except the collector, but I imagine that would make things very difficult for those who want to adopt views but can't use the collector for some reason.

Naming: Aggregation over Views?

Just a random thought: Views felt to me like a somewhat intuitive name, but I think aggregations is actually a very eloquent way of describing this feature.

@SergeyKanzhelev
Member

One important note about views: it's valuable to allow multiple views to be defined on a single metric, each with a different set of dimensions and aggregation interval. Each view is defined by and for a separate exporter.
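
A hypothetical illustration of that point, with made-up exporter and instrument names: the same instrument feeding two exporters through two differently configured views.

```python
# Hypothetical: two views over the same instrument, each feeding a
# different exporter with its own dimensions and aggregation interval.
views = [
    {"instrument": "http.server.duration", "aggregation": "histogram",
     "dimensions": ["http.method"], "exporter": "prometheus", "interval_s": 60},
    {"instrument": "http.server.duration", "aggregation": "sum",
     "dimensions": ["http.method", "http.route"], "exporter": "otlp", "interval_s": 10},
]
```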

@jmacd
Contributor Author

jmacd commented Feb 20, 2020

@rghetia and @c24t will take ownership of this.

@jmacd
Contributor Author

jmacd commented Feb 20, 2020

I think aggregations actually is a very eloquent way of describing this feature.

+1

@jmacd
Contributor Author

jmacd commented Feb 20, 2020

having a static way to initially define the aggregation in

Would you accept having to call a second API to configure the default aggregation keys in your SDK? There is a question about keeping or removing the WithRecommendedKeys option, which is provided to allow you to set the aggregation dimensions at the point where you allocate a metric instrument.
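
To make the contrast concrete, here is a hypothetical sketch of the two places the default aggregation keys could be set; neither `new_counter` nor `set_default_aggregation_keys` comes from the spec.

```python
# Hypothetical sketch contrasting where the default aggregation keys
# could be configured; neither call below is defined by the spec.
default_keys = {}

def new_counter(name, recommended_keys=()):
    """(a) Keys declared with the instrument, WithRecommendedKeys-style."""
    if recommended_keys:
        default_keys[name] = tuple(recommended_keys)
    return name

def set_default_aggregation_keys(name, keys):
    """(b) A separate, SDK-level call made later by the application owner."""
    default_keys[name] = tuple(keys)

new_counter("http.requests", recommended_keys=("env", "service"))           # option (a)
set_default_aggregation_keys("http.requests", ("env", "service", "route"))  # option (b)
```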

@c24t c24t assigned rghetia and c24t Feb 20, 2020
@toumorokoshi
Member

Would you accept having to call a second API to configure the default aggregation keys in your SDK?

I'm thinking through this a bit. It would be nice to have a way to define the aggregation I want by default, and do it in a single call with the construction of the metric. I would imagine this could work by passing an aggregation to the metric constructor itself.

One major reason is that in Python, annotating functions with decorators that define a metric is common. So it would be nice to have a single definition:

@counter("foo.bar", labels=("env", "service"), Aggregation=MinMaxSumCountAggregator())
def perform_work():
     ...

(Sorry I know that isn't precisely the interface in the spec, I'm just giving a conceptual example)


@tsloughter
Member

+1 to being in the SDK. The API allows developers to record metrics for their logic but then the end user decides what, if anything, they want to do with those recordings.

The only exception is the counter. I like the counter as an optimization by the developer for the times when a count is really all that makes sense for a metric. In many languages it can be optimized to a very efficient update function that runs inline and skips any collection of metric recordings and aggregation logic.

Allowing the raw events (except for counters, in my opinion) to be sent to the collector for aggregation, with views being defined there, is a nice idea too. A compact protocol for sending those is a must in that case.

So I disagree with allowing an aggregation on a counter through the API, as in @toumorokoshi's Python example. Or at the very least, if aggregation is to be part of the API, an instrument should have to be a measure to take an aggregation that isn't a sum.
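
A hypothetical illustration of the counter fast path mentioned above (not an actual SDK implementation): the hot path is a plain in-memory sum, with no per-event recording and no configurable aggregation involved.

```python
# Hypothetical illustration only: a counter add() that is an inline
# accumulation, skipping any recording/aggregation machinery.
from collections import defaultdict
from threading import Lock

_counters = defaultdict(float)
_lock = Lock()

def counter_add(name, value=1.0):
    """Inline update; the only aggregation a counter needs is a sum."""
    with _lock:
        _counters[name] += value

counter_add("http.requests")
```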

@toumorokoshi
Member

This is a tricky situation. I understand that exposing some things via the API is undesirable, and that one goal here is to reduce the amount of work and background knowledge required on the API consumer side.

I think it's very common for the one who instruments to be opinionated in how the data is aggregated.

Within Zillow Group, we have a global aggregation configuration setting for timing metrics (similar to statsite's max, min, avg, pXX). A common ask from users is to have different aggregations for their specific metrics. They have some understanding of the behavior of the metric (e.g. high volume) that dictates their choice of aggregation.

It could be a two-call operation, 1. instrument and 2. choose the aggregation, but how would that look to the user? One thing I like about our current trace API is that the user basically needs to know nothing about how the SDK is actually configured. I can configure the span processor and exporter in a generic fashion, and the user has control over what to instrument, plus a way to tell the exporter that a specific trace/span must be emitted (the TraceFlags). If possible, I'd like the metric side of things to be similarly hands-off.

@c24t
Member

c24t commented Apr 9, 2020

Linking the views OTEP here: open-telemetry/oteps#89.

@carlosalberto carlosalberto modified the milestones: v0.5, v0.6 Jun 9, 2020
@andrewhsu andrewhsu added release:required-for-ga Must be resolved before GA release, or nice to have before GA area:api Cross language API specification issue area:sdk Related to the SDK priority:p2 Medium priority level and removed area:api Cross language API specification issue labels Jul 28, 2020
@andrewhsu andrewhsu assigned jkwatson and bogdandrutu and unassigned rghetia, c24t and jkwatson Oct 20, 2020
@jsuereth
Contributor

My $.02 here: we have three use cases:

  1. I think having a default aggregation / "view" for instruments is a wonderful productivity boon for users, especially around synchronous instruments that need to be aggregated before reporting at known intervals.
  2. I think there's a use case where the default aggregation is NOT the one the user wants. In scenarios where the author of the instrument understands the full semantics and wants a different default aggregation, they likely need a mechanism to do so. In OpenCensus, since all aggregation required configuration, this was a solved problem.
  3. I think it'd be highly useful for folks to report MULTIPLE metric data streams with different aggregations from the same instrument. This may be more important if I consume some instrumentation library and want to reconfigure its default aggregation to work better with my backend. As @toumorokoshi points out, this could also be a temporary aggregation to diagnose a problem in production.

OpenCensus views solved 1 and 2, but I think struggled a bit with 3. Whatever solutions OTel provides, I think we should target these 3 use cases, and possibly with 3 mechanisms (not one). I.e. I don't think a "views" API should be used to solve use case 3. I also don't think an API which solves 3 should be the only way to deal with a bad default aggregation (use case 2).

Can we split the discussion into two problems/solutions:

  • An API to allow users to choose a different default aggregation. This should be an acceptable option if we find an Instrument type that could be aggregated in more than one way. (Use case 2)
  • An API/Configuration/Processing that allows users to provide additional aggregations on existing instruments. (Use case 3)
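
A hypothetical sketch of keeping those two mechanisms separate; the names below are illustrative assumptions, not anything from the spec.

```python
# Hypothetical sketch of the two separate mechanisms; illustrative names only.

# Use case 2: the instrument author picks a non-default aggregation at the
# point of definition (pseudocode, hence commented out):
#   wait_time = meter.new_measure("queue.wait_time", default_aggregation="last_value")

# Use case 3: the application owner (or the collector) layers extra views
# on an existing instrument, without touching the instrumentation:
extra_views = [
    {"instrument": "queue.wait_time", "aggregation": "histogram", "dimensions": ["queue.name"]},
    {"instrument": "queue.wait_time", "aggregation": "count", "dimensions": []},
]
```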

@tsloughter
Member

Agreed.

Also, OpenCensus solved (3) well; it was one of its strengths in my opinion. For example, you could aggregate the request latency measure as a count of the number of requests.
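
Roughly how that looked with the OpenCensus Python stats API (sketched from memory, so treat the details as approximate): a latency measure exposed through an additional view that simply counts recordings.

```python
from opencensus.stats import aggregation as aggregation_module
from opencensus.stats import measure as measure_module
from opencensus.stats import view as view_module

# A latency measure recorded by the instrumentation.
latency_ms = measure_module.MeasureFloat(
    "request_latency", "End-to-end request latency", "ms")

# A view that ignores the recorded values and just counts the recordings,
# i.e. the number of requests.
request_count_view = view_module.View(
    "request_count",
    "Number of requests",
    [],                                   # no tag keys / dimensions
    latency_ms,
    aggregation_module.CountAggregation())
```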

This issue was closed.