Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to handle case-insensitive instrument name collisions in metric SDK #3539

Closed
MrAlias opened this issue Jun 5, 2023 · 16 comments · Fixed by #3626
Closed

How to handle case-insensitive instrument name collisions in metric SDK #3539

MrAlias opened this issue Jun 5, 2023 · 16 comments · Fixed by #3626
Assignees
Labels
priority:p1 Highest priority level question Question for discussion spec:metrics Related to the specification/metrics directory

Comments

@MrAlias
Copy link
Contributor

MrAlias commented Jun 5, 2023

The metric API defines instrument names as being case insensitive:

They are case-insensitive, ASCII strings.

And the metric SDK "MUST aggregate data from identical Instruments together in its export pipeline." Where, "[t]he term identical applied to Instruments describes instances where all identifying fields are equal." Where an instrument's name, kind, unit, description, and number type are all considered identifying.

This implies that an instruments with names request_counter, request_Counter, and Request_Counter (assuming all other identifying fields are equal) need to all be aggregated together. However, downstream, instrument names can be case-sensitive. What should the instrument name be for the produced metric stream? Should it just be the first name registered?

@jack-berg
Copy link
Member

However, downstream, instrument names can be case-sensitive.

Where can they be case sensitive down stream?

Should it just be the first name registered?

This is what the java implementation does. The python implementation converts all metric names to lower case (see more in discussion here).

@MrAlias
Copy link
Contributor Author

MrAlias commented Jun 5, 2023

However, downstream, instrument names can be case-sensitive.

Where can they be case sensitive down stream?

There's no restriction on the OTLP metric name to be case-sensitive, right?

https://github.com/open-telemetry/opentelemetry-proto/blob/793cfbd0a0604cd042fc9ddc4cc1cd60c51c4229/opentelemetry/proto/metrics/v1/metrics.proto#L165-L166

It could be the case that if you run your app and send request_counter and on restart send request_Counter (i.e. it then becomes the first registered), the OTLP will transport both and a backend would interpret them differently.

@MrAlias
Copy link
Contributor Author

MrAlias commented Jun 5, 2023

Should it just be the first name registered?

This is what the java implementation does. The python implementation converts all metric names to lower case (see more in discussion here).

Hmm, this seems problematic. If a user in Java is able to measure requestCount but in Python they get requestcount their back-end system will have two metric streams for data they want to be the same.

It seems like an under-specified area of the specification. Is there a backwards compatible way we can specify the behavior here so all SDKs produce the same telemetry?

@jack-berg
Copy link
Member

There's no restriction on the OTLP metric name to be case-sensitive, right?

Given that according to the API, metric names are "They are case-insensitive, ASCII strings.", I think a consumer which treats them as case sensitive would risk unexpected behavior.

I don't see any reference to case sensitivity of metric name in the metric data model document, or metric proto definitions... Perhaps this was an oversight?

@MrAlias
Copy link
Contributor Author

MrAlias commented Jun 5, 2023

I don't see any reference to case sensitivity of metric name in the metric data model document, or metric proto definitions... Perhaps this was an oversight?

Yeah, I think some clarification is needed here.

@carlosalberto carlosalberto added the priority:p1 Highest priority level label Jun 26, 2023
@carlosalberto
Copy link
Contributor

A clarification would be great IMHO

@MrAlias
Copy link
Contributor Author

MrAlias commented Jun 27, 2023

Notes from the specification SIG meeting:

  • Ideally we push the data as far down the processing/transmitting pipeline as possible. For example, if there is an exporter that restricts naming, ideally the data gets to that exporter and it is handled there
  • We should provide a sane default and have users be able to change the behavior with a view if they want.

The proposal we settled on is:

  • Handle case-insensitive instrument name conflicts that conflict when evaluated with case sensitivity for data streams by using the first value seen
  • Log a warning on subsequent conflicting name conflicts

This will mean the Python implementation would need to change. @open-telemetry/python-approvers thoughts?

@jack-berg
Copy link
Member

Log a warning on subsequent conflicting name conflicts

Is the same name in a different case considered a conflict?

@MrAlias
Copy link
Contributor Author

MrAlias commented Jun 27, 2023

Log a warning on subsequent conflicting name conflicts

Is the same name in a different case considered a conflict?

As far as I can tell, yes. The name is defined as a case-insensitive string. Therefore, names that differ only by case are equivalent, but not identical. The second instrument the user requests will return an instrument other than what the user asked for if the name normalization strategy mentioned above is followed.

@ocelotl
Copy link
Contributor

ocelotl commented Jul 6, 2023

Notes from the specification SIG meeting:

  • Ideally we push the data as far down the processing/transmitting pipeline as possible. For example, if there is an exporter that restricts naming, ideally the data gets to that exporter and it is handled there
  • We should provide a sane default and have users be able to change the behavior with a view if they want.

The proposal we settled on is:

  • Handle case-insensitive instrument name conflicts that conflict when evaluated with case sensitivity for data streams by using the first value seen
  • Log a warning on subsequent conflicting name conflicts

This will mean the Python implementation would need to change. @open-telemetry/python-approvers thoughts?

Should we change our implementation now? Or is there a plan to add the results of the discussion to the spec and once this is added to the spec should we reimplement?

@MrAlias
Copy link
Contributor Author

MrAlias commented Jul 6, 2023

Should we change our implementation now? Or is there a plan to add the results of the discussion to the spec and once this is added to the spec should we reimplement?

There is a plan to update the specification, I just need to find the time to open the PR. I would wait until that is merged before changing.

@MrAlias
Copy link
Contributor Author

MrAlias commented Jul 6, 2023

Looking at the existing duplicate instrument registration guidelines:

[...]
3. Otherwise (e.g., use of multiple units), the SDK SHOULD pass through the data by reporting both Metric objects and emit a generic warning describing the duplicate instrument registration.

I'm wondering if instead of unifying the data in the SDK, just export two streams (like would be done in the case of different units) and emit a warning?

@jack-berg thoughts?

@MrAlias
Copy link
Contributor Author

MrAlias commented Jul 25, 2023

Based on this, JS is currently not differentiating between case-insensitive instrument names. I think their current behavior is to export multiple streams based on my reading of that issue.

@jack-berg
Copy link
Member

I'm in favor or using the first value seen, as proposed here, for the following reasons:

  • IF the name argument of the API is a case sensitive string type (i.e. not some language specific case insensitive variant), then taking the first is the least surprising thing to do given the API specification. Producing two metric streams is unexpected - surely that isn't the spirit of the specification even if its currently ambiguous. I can make the case for converting to lower case, since that's what an implementation might do when converting to OTLP if the API was structured to accept a case insensitive string variant. Still, its more surprising than taking the first.
  • The lack of case sensitive language in the metric data model or proto is an accidental omission. See that the first version of the metric API spec included the case insensitive instrument name language. At that time, the metric data model lacked the details it has today, missing any description of the timeseries model or metric name. To me, its clear that the omission is a accidental consequence of the spec being written by multiple authors in a piecemeal fashion. We should fix the bug rather than cement it as a feature.

@dyladan
Copy link
Member

dyladan commented Jul 26, 2023

What would the guidance be for SDKs like JS that already made a different choice? Should we coalesce on the new behavior as a bug fix and just accept that it is breaking, or should we flag this for 2.0?

edit: to be clear, we didn't make this choice consciously. we simply missed it while building the metric sdk

@MrAlias
Copy link
Contributor Author

MrAlias commented Jul 26, 2023

What would the guidance be for SDKs like JS that already made a different choice? Should we coalesce on the new behavior as a bug fix and just accept that it is breaking, or should we flag this for 2.0?

edit: to be clear, we didn't make this choice consciously. we simply missed it while building the metric sdk

I see this as a bug fix.

First off, there shouldn't be any public interfaces or functions that change from this. Only the export behavior.

Specifically for the duplicate stream case: the duplicate instrument registration section already specifies that identical instruments need to aggregated into a single data stream:

To accommodate the recommendations from the data model, the SDK MUST aggregate data from identical Instruments together in its export pipeline.

If an implementation is not unifying these streams, I think it then follows that merging the streams is a bug fix.

As for the Python case, where all names are being changed to lowercase, it is a bit more of a grey area. I don't know of any place that specifically says the casing of an instrument name cannot be changed in the specification, but I expect we want to preserve the casing a user provides. I could see it as a bug to not do so.

carlosalberto pushed a commit that referenced this issue Aug 9, 2023
- Refactor the "Duplicate instrument registration" section
- Clarify how to handle when instrument names differ by only their
casing:
1. Return the first-seen instrument name for all conflicting instrument
names
  2. Log a warning

Resolves #3539
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:p1 Highest priority level question Question for discussion spec:metrics Related to the specification/metrics directory
Projects
No open projects
6 participants