How to handle case-insensitive instrument name collisions in metric SDK #3539

MrAlias · 2023-06-05T17:36:33Z

The metric API defines instrument names as being case insensitive:

They are case-insensitive, ASCII strings.

And the metric SDK "MUST aggregate data from identical Instruments together in its export pipeline," where "[t]he term identical applied to Instruments describes instances where all identifying fields are equal." An instrument's name, kind, unit, description, and number type are all considered identifying.

This implies that instruments with the names request_counter, request_Counter, and Request_Counter (assuming all other identifying fields are equal) all need to be aggregated together. However, downstream, instrument names can be case-sensitive. What should the instrument name be for the produced metric stream? Should it just be the first name registered?
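To make the identity rule concrete, here is a minimal sketch of how the identifying fields could be compared, assuming the name is the only field that is case-folded. `InstrumentIdentity`, `key`, and `identical` are hypothetical names used for illustration and are not taken from any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstrumentIdentity:
    """Hypothetical identity record; not a type from any OpenTelemetry SDK."""

    name: str
    kind: str
    unit: str
    description: str
    number_type: str

    def key(self) -> tuple:
        # Only the name is case-folded; every other identifying field is
        # compared exactly as given.
        return (self.name.lower(), self.kind, self.unit,
                self.description, self.number_type)


def identical(a: InstrumentIdentity, b: InstrumentIdentity) -> bool:
    """Two instruments are identical when all identifying fields are equal."""
    return a.key() == b.key()


names = ["request_counter", "request_Counter", "Request_Counter"]
instruments = [
    InstrumentIdentity(n, "counter", "1", "Number of requests", "int64")
    for n in names
]
# All three spellings collapse to one identity under this comparison.
all_identical = all(identical(instruments[0], other) for other in instruments[1:])
```

Under this comparison the three spellings share one identity, which is exactly why the question of which spelling to export comes up.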


jack-berg · 2023-06-05T18:14:47Z

However, downstream, instrument names can be case-sensitive.

Where can they be case-sensitive downstream?

Should it just be the first name registered?

This is what the Java implementation does. The Python implementation converts all metric names to lowercase (see more in discussion here).

MrAlias · 2023-06-05T18:49:26Z

However, downstream, instrument names can be case-sensitive.

Where can they be case-sensitive downstream?

There's no restriction on the OTLP metric name to be case-sensitive, right?

https://github.com/open-telemetry/opentelemetry-proto/blob/793cfbd0a0604cd042fc9ddc4cc1cd60c51c4229/opentelemetry/proto/metrics/v1/metrics.proto#L165-L166

It could be the case that if you run your app and send request_counter, and on restart send request_Counter (i.e., it then becomes the first registered), OTLP will transport both and a back end would interpret them differently.

MrAlias · 2023-06-05T18:52:54Z

Should it just be the first name registered?

This is what the Java implementation does. The Python implementation converts all metric names to lowercase (see more in discussion here).

Hmm, this seems problematic. If a user in Java is able to measure requestCount but in Python they get requestcount, their back-end system will have two metric streams for data they want to be the same.

It seems like an under-specified area of the specification. Is there a backwards compatible way we can specify the behavior here so all SDKs produce the same telemetry?
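As a rough illustration of the mismatch, the sketch below models the two behaviors described in this thread: keeping the first-registered spelling, and lowercasing every name. The function names are hypothetical and neither function is actual Java or Python SDK code.

```python
def first_seen_names(requested):
    """Keep the first spelling registered for each case-insensitive name."""
    seen = {}
    for name in requested:
        seen.setdefault(name.lower(), name)
    return sorted(seen.values())


def lowercased_names(requested):
    """Normalize every name to lowercase before export."""
    return sorted({name.lower() for name in requested})


# The same application code, instrumented in two languages.
requested = ["requestCount", "RequestCount"]

first_seen_export = first_seen_names(requested)   # ['requestCount']
lowercased_export = lowercased_names(requested)   # ['requestcount']
```

A back end that compares names case-sensitively would store `requestCount` and `requestcount` as separate metric streams.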

jack-berg · 2023-06-05T19:05:25Z

There's no restriction on the OTLP metric name to be case-sensitive, right?

Given that, according to the API, metric names are "case-insensitive, ASCII strings", I think a consumer that treats them as case-sensitive would risk unexpected behavior.

I don't see any reference to case sensitivity of metric name in the metric data model document, or metric proto definitions... Perhaps this was an oversight?

MrAlias · 2023-06-05T19:07:22Z

I don't see any reference to case sensitivity of metric name in the metric data model document, or metric proto definitions... Perhaps this was an oversight?

Yeah, I think some clarification is needed here.

carlosalberto · 2023-06-26T12:28:10Z

A clarification would be great IMHO

MrAlias · 2023-06-27T15:21:50Z

Notes from the specification SIG meeting:

Ideally we push the data as far down the processing/transmitting pipeline as possible. For example, if there is an exporter that restricts naming, ideally the data gets to that exporter and it is handled there
We should provide a sane default and have users be able to change the behavior with a view if they want.

The proposal we settled on is:

Handle instrument names that are equal case-insensitively but conflict when evaluated with case sensitivity by using the first name seen for the data stream
Log a warning on subsequent name conflicts

This will mean the Python implementation would need to change. @open-telemetry/python-approvers thoughts?
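A minimal sketch of that proposal, assuming a registry keyed by the lowercased name. `Meter`, `Counter`, and `create_counter` here are simplified, hypothetical stand-ins and not the API of any OpenTelemetry SDK.

```python
import logging

logger = logging.getLogger("name_conflict_sketch")


class Counter:
    """Simplified counter that remembers the name it was first created with."""

    def __init__(self, name):
        self.name = name
        self.value = 0

    def add(self, amount):
        self.value += amount


class Meter:
    """Hypothetical meter that resolves case-only name conflicts."""

    def __init__(self):
        self._counters = {}  # lowercased name -> first Counter registered

    def create_counter(self, name):
        key = name.lower()
        existing = self._counters.get(key)
        if existing is None:
            # First registration: its spelling names the data stream.
            existing = self._counters[key] = Counter(name)
        elif existing.name != name:
            # Same name in a different case: reuse the first instrument and warn.
            logger.warning(
                "instrument name %r conflicts with %r; using %r",
                name, existing.name, existing.name,
            )
        return existing


meter = Meter()
first = meter.create_counter("request_counter")
second = meter.create_counter("Request_Counter")  # logs a warning
first.add(1)
second.add(2)
```

Both handles feed one stream named `request_counter`, so the second caller gets an instrument whose name differs from the one requested.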

jack-berg · 2023-06-27T16:23:36Z

Log a warning on subsequent name conflicts

Is the same name in a different case considered a conflict?

MrAlias · 2023-06-27T18:18:02Z

Log a warning on subsequent name conflicts

Is the same name in a different case considered a conflict?

As far as I can tell, yes. The name is defined as a case-insensitive string. Therefore, names that differ only by case are equivalent, but not identical. The second instrument the user requests will return an instrument other than what the user asked for if the name normalization strategy mentioned above is followed.

ocelotl · 2023-07-06T19:40:39Z

Notes from the specification SIG meeting:

Ideally we push the data as far down the processing/transmitting pipeline as possible. For example, if there is an exporter that restricts naming, ideally the data gets to that exporter and it is handled there

We should provide a sane default and have users be able to change the behavior with a view if they want.

The proposal we settled on is:

Handle instrument names that are equal case-insensitively but conflict when evaluated with case sensitivity by using the first name seen for the data stream

Log a warning on subsequent name conflicts

This will mean the Python implementation would need to change. @open-telemetry/python-approvers thoughts?

Should we change our implementation now? Or is there a plan to add the results of the discussion to the spec and once this is added to the spec should we reimplement?

MrAlias · 2023-07-06T20:00:57Z

Should we change our implementation now? Or is there a plan to add the results of the discussion to the spec and once this is added to the spec should we reimplement?

There is a plan to update the specification, I just need to find the time to open the PR. I would wait until that is merged before changing.

MrAlias · 2023-07-06T20:50:55Z

Looking at the existing duplicate instrument registration guidelines:

[...]
3. Otherwise (e.g., use of multiple units), the SDK SHOULD pass through the data by reporting both Metric objects and emit a generic warning describing the duplicate instrument registration.

I'm wondering if, instead of unifying the data in the SDK, we should just export two streams (as would be done in the case of different units) and emit a warning.

@jack-berg thoughts?
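For comparison, this alternative could look roughly like the sketch below, which keeps one stream per exact spelling and warns once for each case-only variant. `pass_through` and its `(name, value)` input shape are assumptions made for illustration only.

```python
import logging
from collections import defaultdict

logger = logging.getLogger("pass_through_sketch")


def pass_through(measurements):
    """Sum (name, value) pairs per exact, case-sensitive name."""
    streams = defaultdict(int)
    first_spelling = {}  # lowercased name -> first spelling observed
    warned = set()       # spellings already reported, to warn once each

    for name, value in measurements:
        streams[name] += value
        first = first_spelling.setdefault(name.lower(), name)
        if first != name and name not in warned:
            warned.add(name)
            logger.warning(
                "duplicate instrument registration: %r and %r differ only by case",
                first, name,
            )
    return dict(streams)


result = pass_through([
    ("request_counter", 1),
    ("Request_Counter", 2),
    ("request_counter", 3),
])
# result == {"request_counter": 4, "Request_Counter": 2}
```

Here nothing is merged in the SDK, so any unification is left to whatever consumes the two streams.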


MrAlias · 2023-07-25T20:43:17Z

Based on this, JS is currently not differentiating between case-insensitive instrument names. I think their current behavior is to export multiple streams based on my reading of that issue.

jack-berg · 2023-07-25T21:26:13Z

I'm in favor of using the first value seen, as proposed here, for the following reasons:

1. If the name argument of the API is a case-sensitive string type (i.e. not some language-specific case-insensitive variant), then taking the first is the least surprising thing to do given the API specification. Producing two metric streams is unexpected; surely that isn't the spirit of the specification, even if it's currently ambiguous. I can make the case for converting to lowercase, since that's what an implementation might do when converting to OTLP if the API was structured to accept a case-insensitive string variant. Still, it's more surprising than taking the first.
2. The lack of case-sensitivity language in the metric data model or proto is an accidental omission. See that the first version of the metric API spec included the case-insensitive instrument name language. At that time, the metric data model lacked the details it has today, missing any description of the timeseries model or metric name. To me, it's clear that the omission is an accidental consequence of the spec being written by multiple authors in a piecemeal fashion. We should fix the bug rather than cement it as a feature.

dyladan · 2023-07-26T18:49:45Z

What would the guidance be for SDKs like JS that already made a different choice? Should we coalesce on the new behavior as a bug fix and just accept that it is breaking, or should we flag this for 2.0?

edit: to be clear, we didn't make this choice consciously. we simply missed it while building the metric sdk

MrAlias · 2023-07-26T19:16:16Z

What would the guidance be for SDKs like JS that already made a different choice? Should we coalesce on the new behavior as a bug fix and just accept that it is breaking, or should we flag this for 2.0?

edit: to be clear, we didn't make this choice consciously. we simply missed it while building the metric sdk

I see this as a bug fix.

First off, there shouldn't be any public interfaces or functions that change from this. Only the export behavior.

Specifically for the duplicate stream case: the duplicate instrument registration section already specifies that identical instruments need to be aggregated into a single data stream:

To accommodate the recommendations from the data model, the SDK MUST aggregate data from identical Instruments together in its export pipeline.

If an implementation is not unifying these streams, I think it then follows that merging the streams is a bug fix.

As for the Python case, where all names are being changed to lowercase, it is a bit more of a grey area. I don't know of any place that specifically says the casing of an instrument name cannot be changed in the specification, but I expect we want to preserve the casing a user provides. I could see it as a bug to not do so.

- Refactor the "Duplicate instrument registration" section
- Clarify how to handle when instrument names differ by only their casing:
  1. Return the first-seen instrument name for all conflicting instrument names
  2. Log a warning

Resolves #3539


MrAlias added question Question for discussion spec:metrics Related to the specification/metrics directory labels Jun 5, 2023

github-actions bot assigned jmacd Jun 5, 2023

MrAlias mentioned this issue Jun 5, 2023

metric identity and aggregation for duplicate instrument registration open-telemetry/opentelemetry-go#3835

Closed

carlosalberto added the priority:p1 Highest priority level label Jun 26, 2023

MrAlias mentioned this issue Jul 12, 2023

Return the same agg for identical instruments open-telemetry/opentelemetry-go#4201

Closed

MrAlias added a commit to MrAlias/opentelemetry-specification that referenced this issue Jul 17, 2023

Specify how to handle instrument name conflicts

7c5143e

Resolves open-telemetry#3539

MrAlias mentioned this issue Jul 17, 2023

Specify how to handle instrument name conflicts #3606

Closed

This was referenced Jul 26, 2023

Specify how to handle instrument name conflict #3626

Merged

Verify compliant metric SDK specification implementation: MeterProvider/Resolving duplicate instrument registration conflicts open-telemetry/opentelemetry-go#3653

Closed

MrAlias added this to Go: Metric SDK (GA) Jul 27, 2023

github-project-automation bot moved this to Todo in Go: Metric SDK (GA) Jul 27, 2023

MrAlias moved this from Todo to In Progress in Go: Metric SDK (GA) Jul 27, 2023

MrAlias unassigned jmacd Jul 27, 2023

MrAlias self-assigned this Jul 27, 2023

MrAlias mentioned this issue Aug 7, 2023

metrics: Define name syntax #3643

Closed

carlosalberto closed this as completed in #3626 Aug 9, 2023

github-project-automation bot moved this from In Progress to Done in Go: Metric SDK (GA) Aug 9, 2023

dyladan mentioned this issue Aug 11, 2023

Metric names are case sensitive open-telemetry/opentelemetry-js#4057

Closed

lzchen mentioned this issue Sep 28, 2023

Metric naming conventions reasoning open-telemetry/opentelemetry-python#3207

Open
