Proposal for flattening attributes from OTLP messages #2736

pirgeo · 2022-08-17T14:54:46Z

Changes

Adds a new document, attribute-precedence.md, which is intended to give guidance on how to flatten out attributes when transforming OTLP to a flat set of attributes with unique keys (e.g. Prometheus/OpenMetrics labels).
This document is intended as a supplementary guideline.

Note that the guidelines are valid for semantic convention attributes as well as for custom attributes, where producers can put whatever they want. These custom attributes might overwrite other custom attributes or semantic convention attributes.

Related issues #2535
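As a rough illustration of the overwrite-based flattening this PR proposes, the sketch below merges three attribute levels so that the most specific level wins, assuming the precedence order the new document gives for traces (span over scope over resource). The function name `flatten_attributes`, the plain-dict inputs, and the sample values are assumptions for illustration only; they are not part of the proposed document or of any OpenTelemetry API.

```python
from typing import Dict, Mapping


def flatten_attributes(
    resource: Mapping[str, str],
    scope: Mapping[str, str],
    span: Mapping[str, str],
) -> Dict[str, str]:
    """Merge attribute levels into one flat dict; more specific levels win."""
    flat: Dict[str, str] = {}
    # Apply the least specific level first so that later updates overwrite it.
    for level in (resource, scope, span):
        flat.update(level)
    return flat


# Hypothetical sample data: attribute2 exists on both resource and scope.
flat = flatten_attributes(
    resource={"service.name": "checkout", "attribute2": "resource-attribute-2"},
    scope={"attribute2": "scope-attribute-2"},
    span={"http.method": "GET"},
)
print(flat)
```

In this sketch the resource value of `attribute2` is dropped, which is exactly the lossy behavior discussed in the review threads.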

yurishkuro · 2022-08-17T16:05:25Z

specification/common/attribute-precedence.md

+### Traces
+
+```
+Span.attributes > ScopeSpans.scope.attributes > ResourceSpans.resource.attributes


What is a ScopeSpan?

ScopeSpans is the wrapper combining multiple spans and a Scope: https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto#L64

I see. If I were implementing a flattening scheme, I would keep ALL attributes, e.g. by adding distinguishable prefix for each category, instead of letting them override each other.
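A minimal sketch of the prefixing idea from this comment, assuming each level is given a short name that is prepended to every key. The helper `flatten_with_prefixes`, the level names, and the `.` separator are hypothetical choices, not something the PR or the spec defines.

```python
from typing import Dict, Mapping


def flatten_with_prefixes(levels: Mapping[str, Mapping[str, str]]) -> Dict[str, str]:
    """Flatten without loss by prefixing every key with its level name."""
    flat: Dict[str, str] = {}
    for level_name, attributes in levels.items():
        for key, value in attributes.items():
            # Every key is renamed, so values from different levels never collide.
            flat[f"{level_name}.{key}"] = value
    return flat


# Hypothetical sample data with the same key on two levels.
prefixed = flatten_with_prefixes(
    {
        "resource": {"attribute2": "resource-attribute-2"},
        "scope": {"attribute2": "scope-attribute-2"},
        "span": {},
    }
)
print(prefixed)
```

Both values survive here, but neither is stored under the original key `attribute2`.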

@yurishkuro yes. Another way of putting this: Why flatten in the first place?

The fact that OpenMetrics specifies a "flat" set of attributes does not mean we should flatten resource and scope-level attributes. OpenMetrics specifies how to join resource attributes using the target_info metric, and @dashpole has #2703 in progress, specifying how to join scope attributes using an opentelemetry_scope_info metric.
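To make the contrast with flattening concrete, here is a rough sketch of joining resource attributes at query time instead of copying them onto every series at export time. It only simulates the idea in plain Python: the function `join_target_info`, the join keys `job` and `instance`, and the sample labels are all assumptions of this sketch, not an API of OpenMetrics or Prometheus.

```python
from typing import Dict, Iterable, Mapping, Tuple


def join_target_info(
    series_labels: Mapping[str, str],
    target_info_labels: Mapping[str, str],
    copy_labels: Iterable[str],
    join_on: Tuple[str, ...] = ("job", "instance"),
) -> Dict[str, str]:
    """Copy selected resource labels from a target_info series onto one series."""
    joined = dict(series_labels)
    # Join only when both label sets identify the same target.
    same_target = all(
        key in series_labels and series_labels[key] == target_info_labels.get(key)
        for key in join_on
    )
    if not same_target:
        return joined
    for name in copy_labels:
        if name in target_info_labels:
            # Design choice of this sketch: existing series labels are kept,
            # resource labels only fill in keys that are missing.
            joined.setdefault(name, target_info_labels[name])
    return joined


# Hypothetical label sets: resource attributes live only on target_info.
target_info = {"job": "checkout", "instance": "10.0.0.1:8080", "service_version": "1.2.3"}
series = {"job": "checkout", "instance": "10.0.0.1:8080", "route": "/cart"}
joined = join_target_info(series, target_info, copy_labels=["service_version"])
print(joined)
```

With this approach the stored series keeps only its own labels, and resource attributes are attached on demand.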

Another way of putting this: Why flatten in the first place?

No, I don't have an issue with flattening, it may be necessary due to the limitations of a target telemetry platform. But it does not mean that flattening should be a lossy transformation, which is what you're proposing.

The big argument against adding a prefix is that it changes the attribute key. Users cannot query for the attribute they added, since the name changed. This impacts semantic conventions as well: Should all semantic conventions be prefixed? I feel like we would need to do that to stay consistent.

But it does not mean that flattening should be a lossy transformation, which is what you're proposing.

That is true, but this is the idea behind this proposal: If you want to, you can overwrite attributes. If you don't want to overwrite attributes, you can rename them (e.g., add a prefix explicitly). In either case, you will get the attributes that you defined, and don't have to go looking for the renamed version.

I haven't thought of doing renaming instead of overwriting, but it is worth considering as an option.

A couple thoughts that may help:

Attribute conflicts are likely going to be very rare. AFAIK, there isn't currently any real semantic convention that uses the same attribute name on different levels.

There are existing implementations that flatten and overwrite, see e.g. the Zipkin exporter in the Collector, so we have a precedent.

I do like the fact that prefixing makes flattening a non-destructive operation. However, in real-world use cases you are likely mostly interested in the most specific value of the attribute, recorded at the innermost level, so overwriting seems to be a natural behavior.

If we could have real examples where attribute names can conflict, it would help to make a decision. The responsible_team example is confusing because we already have a convention for that: it is called service.namespace and is supposed to be recorded on the Resource only. Are there any other examples that we can look at?

If opinions are split on this it is also possible to make this behavior configurable (i.e. to overwrite or to prefix) but I would try to avoid this complication if possible.

If you want to, you can overwrite attributes.

This is a good point. I buy the logic that conflicts are unlikely to be common, and thus that this proposal is going to work reasonably well in most situations. If a conflict does arise, the user has an escape hatch with the ability to wrap the exporter with logic that adds a prefix to the key in conflict.
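The escape hatch mentioned here could look roughly like the sketch below: the innermost value keeps the original key, and only values that would otherwise be overwritten are stored under a prefixed key. The function `flatten_prefixing_conflicts`, the level names, and the `.` separator are hypothetical; nothing in the PR prescribes them.

```python
from typing import Dict, Mapping, Sequence, Tuple


def flatten_prefixing_conflicts(
    levels: Sequence[Tuple[str, Mapping[str, str]]],
) -> Dict[str, str]:
    """Flatten levels given most specific first, prefixing only shadowed keys."""
    flat: Dict[str, str] = {}
    for level_name, attributes in levels:
        for key, value in attributes.items():
            if key not in flat:
                # First (most specific) occurrence keeps the original key.
                flat[key] = value
            else:
                # A less specific value is kept under a prefixed key instead of
                # being dropped. Caveat: the prefixed name could itself clash
                # with a real attribute of the same name.
                flat[f"{level_name}.{key}"] = value
    return flat


# Hypothetical sample data using the responsible_team example from this PR.
result = flatten_prefixing_conflicts(
    [
        ("span", {"responsible_team": "java-service-team"}),
        ("scope", {}),
        ("resource", {"responsible_team": "dev", "service.name": "checkout"}),
    ]
)
print(result)
```

Queries on `responsible_team` still return the most specific value, while the resource-level value remains available under `resource.responsible_team`.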

scheler · 2022-08-17T20:01:00Z

specification/common/attribute-precedence.md

+attribute2: scope-attribute-2         # overwrites attribute2 on resource
+attribute3: resource-attribute-3      # from the resource, not overwritten
+attribute4: data-point-2-attribute-4  # overwrites attribute4 from the scope
+```


Asking just for my understanding: can you give a real-life example of an attribute that could be present both at the resource level and at the level of a span/metric/log_record?

One thing that comes to mind is a responsible_team. In the resource attribute, you would set the department that is responsible for the respective product, and you can overwrite it on Spans/Metrics if the code producing them is maintained by a specific team within that department. That way, the resource would be the "fallback" but you can be more specific if you want to.

Thanks, I used to think we would never have the same attribute in both the resource and the signals since they serve different purposes, but it looks like there can be a few valid cases.

I want to question and understand your example though. The team responsible for running my service pod could be devops (specified at resource level), but the team responsible for the application is an engineering/dev team (specified in the span). Don't you want to capture both?

I agree that they should never overlap, but since it's possible to put whatever you want in the attributes, it might happen.

For the example, I thought about it more like the team that developed the code. The team running the code is separate in my opinion. responsible_team might not be the best name for the attribute, maybe separate attributes like dev_team and devops_team would be better if you want to distinguish them.

I was thinking of an attribute value of responsible_team=dev (or dev_team=dev) on the resource for all the code that was written before instrumentation was added. All code written (and instrumented) later will add the actual team as an attribute on the span/metric/log (e.g. responsible_team=java-service-team/dev_team=java-service-team).
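The fallback behavior described in this comment can be sketched with `collections.ChainMap` from the Python standard library, which looks keys up in the first mapping and falls through to the next one. The helper `effective_attributes` and the sample span attributes are hypothetical and only illustrate the example from this thread.

```python
from collections import ChainMap
from typing import Dict, Mapping


def effective_attributes(
    span_attrs: Mapping[str, str], resource_attrs: Mapping[str, str]
) -> Dict[str, str]:
    """Return span attributes, falling back to the resource for missing keys."""
    # ChainMap searches its mappings left to right, so span values shadow
    # resource values that share the same key.
    return dict(ChainMap(span_attrs, resource_attrs))


resource = {"responsible_team": "dev"}

# Span from code written before instrumentation was added: no team attribute.
legacy = effective_attributes({"http.method": "GET"}, resource)

# Span from newer code that sets the actual team.
newer = effective_attributes(
    {"http.method": "GET", "responsible_team": "java-service-team"}, resource
)
print(legacy["responsible_team"], newer["responsible_team"])
```

The legacy span reports the resource-level team, while the newer span reports the team set on the span itself.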

@scheler does this answer your question?

@pirgeo since there are no examples from the existing set of attribute semantic conventions that overlap between resource and signal, it would be good to add some text to the specification on how to deal with it if we see valid use cases in the future. Maybe as part of #2753.

github-actions · 2022-08-30T04:03:54Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.


tigrannajaryan · 2022-08-18T17:32:44Z

specification/common/attribute-precedence.md

+### Traces
+
+```
+Span.attributes > ScopeSpans.scope.attributes > ResourceSpans.resource.attributes



jack-berg · 2022-09-08T21:17:14Z

specification/common/attribute-precedence.md

+non-hierarchical representation (e.g., Prometheus/OpenMetrics labels).
+In the case of OpenMetrics, the set of labels is flat and must have unique
+labels only
+(<https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#labelset>).


As @jmacd points out here, there is ongoing work in #2703 to solve this in a different way specifically for Prometheus/OpenMetrics. It would be better to use Zipkin as an example here.

github-actions · 2022-09-16T04:05:12Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

joaopgrassi · 2022-09-20T08:31:11Z

Not stale

github-actions · 2022-10-14T04:04:32Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions · 2022-10-22T03:49:58Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Proposal for flattening attributes from OTLP messages

5f391b2

pirgeo requested review from a team August 17, 2022 14:54

github-actions bot assigned carlosalberto Aug 17, 2022

add changelog entry

66e04f3

yurishkuro reviewed Aug 17, 2022


arminru added area:sdk Related to the SDK spec:miscellaneous For issues that don't match any other spec label labels Aug 17, 2022

jmacd mentioned this pull request Aug 17, 2022

Add the short_name scope attribute #2702

Closed

scheler reviewed Aug 17, 2022


pirgeo mentioned this pull request Aug 22, 2022

Clarify scope attributes for zipkin and jaeger exporters #2744

Open

github-actions bot added the Stale label Aug 30, 2022

Merge branch 'main' into attribute-flattening

4c16f3b

tigrannajaryan reviewed Aug 30, 2022


arminru removed the Stale label Sep 2, 2022

Update specification/common/attribute-precedence.md

c631230

Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>

jsuereth mentioned this pull request Sep 7, 2022

Refine which attributes of Resource contribute to Metric Identity. #2775

Open

jack-berg reviewed Sep 8, 2022


tigrannajaryan mentioned this pull request Sep 9, 2022

Add attribute value precedence of scope attribute in the spec #2774

Closed

github-actions bot added the Stale label Sep 16, 2022

github-actions bot removed the Stale label Sep 21, 2022

pirgeo and others added 3 commits September 23, 2022 11:01

use zipkin as an example

69d8501

Merge branch 'main' into attribute-flattening

5f6f526

Merge branch 'main' into attribute-flattening

89ebba0

github-actions bot added the Stale label Oct 14, 2022

Merge branch 'main' into attribute-flattening

c51a2d3

github-actions bot closed this Oct 22, 2022

tigrannajaryan mentioned this pull request Jul 19, 2023

Add data structures to model entity events as log records. open-telemetry/opentelemetry-collector-contrib#23565

Closed

arminru mentioned this pull request Nov 14, 2023

Allow Prometheus exporter to add resource attributes to metric attributes #3761

Merged



