Support map values and nested values for attributes #376

Closed
tigrannajaryan opened this issue Dec 6, 2019 · 74 comments · Fixed by open-telemetry/opentelemetry-proto#157
Labels
area:api (Cross language API specification issue), area:semantic-conventions (Related to semantic conventions), enhancement (New feature or request), needs discussion (Need more information before all suitable labels can be applied), release:after-ga (Not required before GA release, and not going to work on before GA)
Comments

@tigrannajaryan
Member

tigrannajaryan commented Dec 6, 2019

After #368 gets merged we will have support for array values.

If we add support for maps and nesting, it will allow representing arbitrary nested data structures in attribute values when needed.

This will apply to span and resource attributes.

@jmacd
Contributor

jmacd commented Dec 6, 2019

Can we explicitly state that this applies to resources too? I believe that span attributes and resources (which are only specified in the proto, currently) are specified with the same structure.

@Oberon00
Member

Are there any use cases for arbitrary nesting? I think (multi)maps would be useful to store, e.g., HTTP headers, but what would be the rationale for arbitrary nesting?

@shengxil

shengxil commented May 5, 2020

An arbitrarily nested map can represent classified values, e.g. {"http" : {"url":...,"method":...}} or {"sql" : {"query":...,"engine":...}}. It can also host vendor-specific data like {"aws": {"account_id":...}} in situations where Resource isn't the right place, e.g. client-side metrics.

@jmacd
Contributor

jmacd commented May 5, 2020

In #579, Tigran's example seems to contain a use-case. The resource of "application B" is a set of key-value attributes.

shengxil added a commit to shengxil/opentelemetry-specification that referenced this issue May 12, 2020
tigrannajaryan pushed a commit to tigrannajaryan/exp-otelproto that referenced this issue Jun 8, 2020
## Summary

This adds support for arrays and maps to attribute values, including support for
nested values.

This is a breaking protocol change.

Resolves: open-telemetry/opentelemetry-specification#376

## Motivation

There are several reasons for this change:

- The API defines that attributes values
  [may contain arrays of values](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#set-attributes).
  However the protocol has no way of representing array values.

- We intend to support Log data type in the protocol, which also requires array
  values (it is a Log Data Model requirement). In addition, Log data type
  requires support of key-value lists (maps) as attribute values, including
  nested values.

- There are long-standing requests to support nested values, arrays and maps for
  attributes:
  open-telemetry/opentelemetry-specification#376
  open-telemetry/opentelemetry-specification#596

This change introduces AnyValue. AnyValue can represent arbitrary numeric,
boolean, string, arrays or maps of values and allows for nesting. AnyValue can
represent any data that can be represented in JSON.

AttributeKeyValue now uses AnyValue to store the "value" part.
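
For concreteness, here is a sketch of what the oneof-based AnyValue schema could look like. This is an illustration of the approach, not necessarily the exact merged definition; field names and numbers are assumptions, and the actual schema lives in the linked proto PR:

```proto
// Sketch of the oneof-based proposal. Exactly one case of the oneof is
// set per value, so unused cases cost no memory on the wire or in RAM.
message AnyValue {
  oneof value {
    string string_value = 1;
    bool bool_value = 2;
    int64 int_value = 3;
    double double_value = 4;
    ArrayValue array_value = 5;
    KeyValueList kvlist_value = 6;
  }
}

// A wrapper message is needed because oneof fields cannot be repeated.
message ArrayValue {
  repeated AnyValue values = 1;
}

// A map, represented as a list of key-value pairs. Arbitrary nesting
// falls out naturally: each value is itself an AnyValue.
message KeyValueList {
  repeated AttributeKeyValue values = 1;
}

message AttributeKeyValue {
  string key = 1;
  AnyValue value = 2;
}
```

With this shape, {"aws": {"account_id": "123"}} becomes an AttributeKeyValue whose value is a kvlist_value containing one nested AttributeKeyValue.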

Note: below "Current" refers to the state of the "master" branch before this
PR/commit is merged. "Proposed" refers to the schema suggested in this
PR/commit.

## Performance

This change has a negative impact on performance (compared to the current OTLP state):

```
BenchmarkEncode/Current/Trace/Attribs-8              	     813	   1479588 ns/op
BenchmarkEncode/Proposed/Trace/Attribs-8             	     417	   2873476 ns/op
BenchmarkEncode/OpenCensus/Trace/Attribs-8           	     162	   7354799 ns/op

BenchmarkDecode/Current/Trace/Attribs-8              	     460	   2646059 ns/op	 1867627 B/op	   36201 allocs/op
BenchmarkDecode/Proposed/Trace/Attribs-8             	     246	   4827671 ns/op	 2171734 B/op	   56209 allocs/op
BenchmarkDecode/OpenCensus/Trace/Attribs-8           	     154	   7560952 ns/op	 2775949 B/op	   76166 allocs/op
```

However, I do not think this is important for most applications. Serialization
CPU and memory usage is going to be a tiny portion of consumed resources for
most applications, except certain specialized ones.

For perspective, I am also showing OpenCensus in the benchmark to make it
clear that we are still significantly faster than it despite becoming slower
compared to the current state.

More importantly, performance critical applications can use Gogo ProtoBuf
generator (Collector does use it), which _gains_ performance due to this change:

```
BenchmarkEncode/Current(Gogo)/Trace/Attribs-8        	    1645	    705385 ns/op
BenchmarkEncode/Proposed(Gogo)/Trace/Attribs-8       	    1555	    698771 ns/op

BenchmarkDecode/Current(Gogo)/Trace/Attribs-8        	     537	   2241570 ns/op	 2139634 B/op	   36201 allocs/op
BenchmarkDecode/Proposed(Gogo)/Trace/Attribs-8       	     600	   2053120 ns/op	 1323287 B/op	   46205 allocs/op
```

With Gogoproto, the proposed approach uses 40% less memory than the current schema.

After considering all tradeoffs and alternatives (see below) I believe this
proposal is the best overall approach for OTLP. It is idiomatic ProtoBuf, easy
to read and understand, is future-proof to adding new attribute types, has
enough flexibility to represent simple and complex attribute values for all
telemetry types, and can be made fast by custom code generation for applications
where it matters.

Note: all performance measurements are done for Go implementation only (although
it is expected that other languages should exhibit somewhat similar behavior).

## Alternatives Considered

I also designed and benchmarked several alternative schemas, see below.

### Adding array value to AttributeKeyValue

This is the simplest approach. It doubles down on the current OTLP protocol
approach and simply adds an "array_values" field to AttributeKeyValue, e.g.:

```proto
message AttributeKeyValue {
  // all existing fields here.

  // A list of values. "key" field of each element in the list is ignored.
  repeated AttributeKeyValue array_values = 7;
}
```

This eliminates the need for a separate AnyValue message and has lower CPU
usage because it requires fewer indirections and fewer memory allocations per
value. However, it is semantically incorrect since the elements of the array
must actually be values, not key-value pairs, which this schema violates. It
also uses more memory than the proposed approach:

```
BenchmarkEncode/Proposed/Trace/Attribs-8             	     400	   2869055 ns/op
BenchmarkEncode/MoreFieldsinAKV/Trace/Attribs-8      	     754	   1540978 ns/op

BenchmarkDecode/Proposed/Trace/Attribs-8             	     250	   4790010 ns/op	 2171741 B/op	   56209 allocs/op
BenchmarkDecode/MoreFieldsinAKV/Trace/Attribs-8      	     420	   2806918 ns/op	 2347827 B/op	   36201 allocs/op
```

It will become even worse memory-wise if in the future we need to add more data
types to attributes. This approach does not scale for future needs and is
semantically wrong.

### Fat AnyValue instead of oneof

In this approach AnyValue contains all possible value types (similar to how
AttributeKeyValue is currently defined):

```proto
message AnyValue {
    ValueType type = 1;
    bool bool_value = 2;
    string string_value = 3;
    int64 int_value = 4;
    double double_value = 5;
    repeated AnyValue list_values = 6;
    repeated AttributeKeyValue kvlist_values = 7;
}
message AttributeKeyValue {
    string key = 1;
    AnyValue value = 2;
}
```

This simplifies the schema; however, it results in a significantly bigger
AnyValue in memory. In the vast majority of cases attribute values are strings.
Integer and boolean values are also used, although significantly less frequently
than strings. Floating point numbers, arrays and maps are likely going to be
vanishingly rare in attributes. If we kept all these value types in AnyValue we
would pay the cost of all these fields although almost always only the string
value would be present.

Here are benchmarks comparing the proposed schema and the fat-AnyValue schema,
using string and integer attributes in spans:

```
BenchmarkEncode/Proposed/Trace/Attribs-8             	     415	   2894513 ns/op	  456866 B/op	   10005 allocs/op
BenchmarkEncode/FatAnyValue/Trace/Attribs-8          	     646	   1885003 ns/op	  385024 B/op	       1 allocs/op

BenchmarkDecode/Proposed/Trace/Attribs-8             	     247	   4872270 ns/op	 2171746 B/op	   56209 allocs/op
BenchmarkDecode/FatAnyValue/Trace/Attribs-8          	     343	   3423494 ns/op	 2988081 B/op	   46205 allocs/op
```

Memory usage with this approach is much higher; it is also not future-proof and
will become worse as we add more types.

### AnyValue plus ExoticValue

This is based on the fat AnyValue approach, but rarely used value types are
moved to a separate ExoticValue message that may be referenced from AnyValue if
needed:

```proto
message AnyValue {
    ValueType type = 1;
    bool bool_value = 2;
    string string_value = 3;
    int64 int_value = 4;
    ExoticValue exotic_value = 5;
}
message ExoticValue {
    double double_value = 1;
    repeated AnyValue array_values = 2;
    repeated AttributeKeyValue kvlist_values = 3;
}
message AttributeKeyValue {
    string key = 1;
    AnyValue value = 2;
}
```

While this improves performance (particularly, it lowers memory usage for the
most frequently used attribute types), it is awkward and sacrifices too much
readability and usability for small performance gains. For the rare cases it is
also slow and uses even more memory, so its edge-case behavior is not desirable.

### Using different schema for log data type

I also considered using a different message definition for LogRecord
attributes. This would make it possible to eliminate some of the requirements
that we do not yet formally have for Span attributes (particularly the need to
support maps of nested values).

However, this does not help much in terms of performance, makes Span and
LogRecord attributes non-interchangeable, and significantly increases code
bloat in applications that need to work with both Spans and Log records.
tigrannajaryan pushed a commit to tigrannajaryan/exp-otelproto that referenced this issue Jun 8, 2020
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Jun 8, 2020
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Jun 8, 2020
## Summary

This adds support for arrays and maps to attribute values, including support for
nested values.

This is a breaking protocol change.

Resolves: open-telemetry/opentelemetry-specification#376
Resolves: open-telemetry#106

## Motivation

There are several reasons for this change:

- The API defines that attributes values
  [may contain arrays of values](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#set-attributes).
  However the protocol has no way of representing array values.

- We intend to support Log data type in the protocol, which also requires array
  values (it is a Log Data Model requirement). In addition, Log data type
  requires support of key-value lists (maps) as attribute values, including
  nested values.

- There are long-standing requests to support nested values, arrays and maps for
  attributes:
  open-telemetry/opentelemetry-specification#376
  open-telemetry/opentelemetry-specification#596

This change introduces AnyValue. AnyValue can represent arbitrary numeric,
boolean, string, arrays or maps of values and allows nesting. AnyValue can
represent any data that can be represented in JSON.

AttributeKeyValue now uses AnyValue to store the "value" part.

Note: below "Current" refers to the state of the "master" branch before this
PR/commit is merged. "Proposed" refers to the schema suggested in this
PR/commit.

## Performance

This change has a negative impact on the performance when using canonical Go
ProtoBuf compiler (compared to current OTLP state):

```
BenchmarkEncode/Current/Trace/Attribs-8              	     813	   1479588 ns/op
BenchmarkEncode/Proposed/Trace/Attribs-8             	     417	   2873476 ns/op
BenchmarkEncode/OpenCensus/Trace/Attribs-8           	     162	   7354799 ns/op

BenchmarkDecode/Current/Trace/Attribs-8              	     460	   2646059 ns/op	 1867627 B/op	   36201 allocs/op
BenchmarkDecode/Proposed/Trace/Attribs-8             	     246	   4827671 ns/op	 2171734 B/op	   56209 allocs/op
BenchmarkDecode/OpenCensus/Trace/Attribs-8           	     154	   7560952 ns/op	 2775949 B/op	   76166 allocs/op
```

However, I do not think this is important for most applications. Serialization
CPU and Memory usage is going to be a tiny portion of consumed resources for
most applications, except certain specialized ones.

For the perspective I am also showing OpenCensus in the benchmark to make it
clear that we are still significantly faster than it despite becoming slower
compared to the current state.

More importantly, performance critical applications can use Gogo ProtoBuf
compiler (Collector does use it), which _gains_ performance due to this change:

```
BenchmarkEncode/Current(Gogo)/Trace/Attribs-8        	    1645	    705385 ns/op
BenchmarkEncode/Proposed(Gogo)/Trace/Attribs-8       	    1555	    698771 ns/op

BenchmarkDecode/Current(Gogo)/Trace/Attribs-8        	     537	   2241570 ns/op	 2139634 B/op	   36201 allocs/op
BenchmarkDecode/Proposed(Gogo)/Trace/Attribs-8       	     600	   2053120 ns/op	 1323287 B/op	   46205 allocs/op
```

With Gogo compiler proposed approach uses 40% less memory than the current
schema.

After considering all tradeoffs and alternates (see below) I believe this
proposal is the best overall approach for OTLP. It is idiomatic ProtoBuf, easy
to read and understand, is future-proof to adding new attribute types, has
enough flexibility to represent simple and complex attribute values for all
telemetry types and can be made fast by custom code generation for applications
where it matters using Gogo ProtoBuf compiler.

Note: all performance measurements are done for Go implementation only (although
it is expected that other languages should exhibit somewhat similar behavior).

## Alternates Considered

I also designed and benchmarked several alternate schemas, see below.

### Adding array value to AttributeKeyValue

This is the simplest approach. It doubles down on the current OTLP protocol
approach and simply adds "array_values" field to AttributeKeyValue, e.g.:

```proto
message AttributeKeyValue {
  // all existing fields here.

  // A list of values. "key" field of each element in the list is ignored.
  repeated AttributeKeyValue array_values = 7;
}
```

This eliminates the need to have a separate AnyValue message and has lower CPU
usage because it requires less indirections and less memory allocations per
value. However, this is semantically incorrect since the elements of the array
must actually be values not key-value pairs, which this schema violates. It also
uses more memory than the proposed approach:

```proto
BenchmarkEncode/Proposed/Trace/Attribs-8             	     400	   2869055 ns/op
BenchmarkEncode/MoreFieldsinAKV/Trace/Attribs-8      	     754	   1540978 ns/op

BenchmarkDecode/Proposed/Trace/Attribs-8             	     250	   4790010 ns/op	 2171741 B/op	   56209 allocs/op
BenchmarkDecode/MoreFieldsinAKV/Trace/Attribs-8      	     420	   2806918 ns/op	 2347827 B/op	   36201 allocs/op
```

Memory-wise it will become even worse if we need to add more data types to
attributes in the future. This approach does not scale to future needs and is
semantically wrong.

### Fat AnyValue instead of oneof

In this approach AnyValue contains all possible field values (similarly to how
AttributeKeyValue is currently):

```proto
message AnyValue {
    ValueType type = 1;
    bool bool_value = 2;
    string string_value = 3;
    int64 int_value = 4;
    double double_value = 5;
    repeated AnyValue list_values = 6;
    repeated AttributeKeyValue kvlist_values = 7;
}
message AttributeKeyValue {
    string key = 1;
    AnyValue value = 2;
}
```

This simplifies the schema; however, it results in a significantly bigger
AnyValue in memory. In the vast majority of cases attribute values are strings.
Integer and boolean values are also used, although significantly less
frequently than strings. Floating point numbers, arrays and maps are likely to
be vanishingly rare in attributes. If we keep all of these value types in
AnyValue we pay the cost of all of the fields, even though almost always only
the string value is present.

Here are benchmarks comparing the proposed schema with the fat-AnyValue schema,
using string and integer attributes in spans:

```
BenchmarkEncode/Proposed/Trace/Attribs-8             	     415	   2894513 ns/op	  456866 B/op	   10005 allocs/op
BenchmarkEncode/FatAnyValue/Trace/Attribs-8          	     646	   1885003 ns/op	  385024 B/op	       1 allocs/op

BenchmarkDecode/Proposed/Trace/Attribs-8             	     247	   4872270 ns/op	 2171746 B/op	   56209 allocs/op
BenchmarkDecode/FatAnyValue/Trace/Attribs-8          	     343	   3423494 ns/op	 2988081 B/op	   46205 allocs/op
```

Memory usage with this approach is much higher, and it will get worse as we add
more types.

### AnyValue plus ExoticValue

This is based on the fat-AnyValue approach, but rarely used value types are
moved to a separate ExoticValue message that AnyValue references when needed:

```proto
message AnyValue {
    ValueType type = 1;
    bool bool_value = 2;
    string string_value = 3;
    int64 int_value = 4;
    ExoticValue exotic_value = 5;
}
message ExoticValue {
    double double_value = 1;
    repeated AnyValue array_values = 2;
    repeated AttributeKeyValue kvlist_values = 3;
}
message AttributeKeyValue {
    string key = 1;
    AnyValue value = 2;
}
```

While this improves performance (in particular, it lowers memory usage for the
most frequently used attribute types), it is awkward and sacrifices too much
readability and usability for small performance gains. For the rare value types
it is also slow and uses even more memory, so its edge-case behavior is
undesirable.

### Using different schema for log data type

I also considered using different message definitions for LogRecord and Span
attributes. This would make it possible to drop some of the requirements that
we do not yet formally have for Span attributes (particularly the need for maps
of nested values).

However, this does not help much in terms of performance, makes Span and
LogRecord attributes non-interchangeable, and significantly bloats the code of
applications that need to work with both Spans and Log records.
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Jun 8, 2020
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Jun 8, 2020
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Jun 8, 2020
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Jun 8, 2020
## Summary

This adds support for arrays and maps to attribute values, including support for
nested values.

This is a breaking protocol change.

Resolves: open-telemetry/opentelemetry-specification#376
Resolves: open-telemetry#106

## Motivation

There are several reasons for this change:

- The API defines that attributes values [may contain arrays of values](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#set-attributes).
  However the protocol has no way of representing array values. We need to add
  such capability.

- We intend to support Log data type in the protocol, which also requires array
  values (it is a Log Data Model requirement). In addition, Log data type
  requires support of key-value lists (maps) as attribute values, including
  nested values.

- There are long-standing requests to support nested values, arrays and maps for
  attributes:
  open-telemetry/opentelemetry-specification#376
  open-telemetry/opentelemetry-specification#596

This change introduces AnyValue. AnyValue can represent arbitrary numeric,
boolean, string, arrays or maps of values and allows nesting.

AttributeKeyValue now uses AnyValue to store the "value" part.

Note: below "Current" refers to the state of the "master" branch before this
PR/commit is merged. "Proposed" refers to the schema suggested in this
PR/commit.

## Performance

This change has a negative impact on the performance when using canonical Go
ProtoBuf compiler (compared to current OTLP state):

```
BenchmarkEncode/Current/Trace/Attribs-8              	     813	   1479588 ns/op
BenchmarkEncode/Proposed/Trace/Attribs-8             	     417	   2873476 ns/op
BenchmarkEncode/OpenCensus/Trace/Attribs-8           	     162	   7354799 ns/op

BenchmarkDecode/Current/Trace/Attribs-8              	     460	   2646059 ns/op	 1867627 B/op	   36201 allocs/op
BenchmarkDecode/Proposed/Trace/Attribs-8             	     246	   4827671 ns/op	 2171734 B/op	   56209 allocs/op
BenchmarkDecode/OpenCensus/Trace/Attribs-8           	     154	   7560952 ns/op	 2775949 B/op	   76166 allocs/op
```

However, I do not think this is important for most applications. Serialization
CPU and Memory usage is going to be a tiny portion of consumed resources for
most applications, except certain specialized ones.

For the perspective I am also showing OpenCensus in the benchmark to make it
clear that we are still significantly faster than it despite becoming slower
compared to the current state.

More importantly, performance critical applications can use Gogo ProtoBuf
compiler (Collector does use it), which _gains_ performance due to this change:

```
BenchmarkEncode/Current(Gogo)/Trace/Attribs-8        	    1645	    705385 ns/op
BenchmarkEncode/Proposed(Gogo)/Trace/Attribs-8       	    1555	    698771 ns/op

BenchmarkDecode/Current(Gogo)/Trace/Attribs-8        	     537	   2241570 ns/op	 2139634 B/op	   36201 allocs/op
BenchmarkDecode/Proposed(Gogo)/Trace/Attribs-8       	     600	   2053120 ns/op	 1323287 B/op	   46205 allocs/op
```

With the Gogo compiler, the proposed approach uses 40% less memory than the
current schema.

After considering all tradeoffs and alternatives (see below) I believe this
proposal is the best overall approach for OTLP. It is idiomatic ProtoBuf, easy
to read and understand, is future-proof for adding new attribute types, has
enough flexibility to represent simple and complex attribute values for all
telemetry types, and can be made fast, where it matters, by custom code
generation using the Gogo ProtoBuf compiler.

Note: all performance measurements are done for Go implementation only (although
it is expected that other languages should exhibit somewhat similar behavior).

## Alternatives Considered

I also designed and benchmarked several alternative schemas, described below.

### Adding array value to AttributeKeyValue

This is the simplest approach. It doubles down on the current OTLP protocol
approach and simply adds an "array_values" field to AttributeKeyValue, e.g.:

```proto
message AttributeKeyValue {
  // all existing fields here.

  // A list of values. "key" field of each element in the list is ignored.
  repeated AttributeKeyValue array_values = 7;
}
```

This eliminates the need for a separate AnyValue message and has lower CPU
usage because it requires fewer indirections and fewer memory allocations per
value. However, it is semantically incorrect, since the elements of the array
should be values, not key-value pairs, which this schema violates. It also
uses more memory than the proposed approach:

```
BenchmarkEncode/Proposed/Trace/Attribs-8             	     400	   2869055 ns/op
BenchmarkEncode/MoreFieldsinAKV/Trace/Attribs-8      	     754	   1540978 ns/op

BenchmarkDecode/Proposed/Trace/Attribs-8             	     250	   4790010 ns/op	 2171741 B/op	   56209 allocs/op
BenchmarkDecode/MoreFieldsinAKV/Trace/Attribs-8      	     420	   2806918 ns/op	 2347827 B/op	   36201 allocs/op
```

It will become even worse memory-wise if we need to add more data types to
attributes in the future. This approach does not scale to future needs and is
semantically wrong.

### Fat AnyValue instead of oneof

In this approach AnyValue contains all possible value fields (similar to how
AttributeKeyValue currently works):

```proto
message AnyValue {
    ValueType type = 1;
    bool bool_value = 2;
    string string_value = 3;
    int64 int_value = 4;
    double double_value = 5;
    repeated AnyValue list_values = 6;
    repeated AttributeKeyValue kvlist_values = 7;
}
message AttributeKeyValue {
    string key = 1;
    AnyValue value = 2;
}
```

This results in a significantly bigger AnyValue in memory. In the vast
majority of cases, attribute values of produced telemetry are strings (see
e.g. the semantic conventions for evidence). Integer and boolean values are
also used, although significantly less frequently than strings. Floating-point
numbers, arrays, and maps are likely to be vanishingly rare in attributes. If
we keep all these value types in AnyValue, we pay the cost of all these fields
even though almost always only the string value is present.

Here are benchmarks comparing the proposed schema and the fat-AnyValue schema,
using string and integer attributes in spans:

```
BenchmarkEncode/Proposed/Trace/Attribs-8             	     415	   2894513 ns/op	  456866 B/op	   10005 allocs/op
BenchmarkEncode/FatAnyValue/Trace/Attribs-8          	     646	   1885003 ns/op	  385024 B/op	       1 allocs/op

BenchmarkDecode/Proposed/Trace/Attribs-8             	     247	   4872270 ns/op	 2171746 B/op	   56209 allocs/op
BenchmarkDecode/FatAnyValue/Trace/Attribs-8          	     343	   3423494 ns/op	 2988081 B/op	   46205 allocs/op
```

Memory usage with this approach is much higher, and it will get worse as we
add more types.

### AnyValue plus ExoticValue

This is based on the fat-AnyValue approach, but rarely used value types are
moved to a separate ExoticValue message that AnyValue references when needed:

```proto
message AnyValue {
    ValueType type = 1;
    bool bool_value = 2;
    string string_value = 3;
    int64 int_value = 4;
    ExoticValue exotic_value = 5;
}
message ExoticValue {
    double double_value = 1;
    repeated AnyValue array_values = 2;
    repeated AttributeKeyValue kvlist_values = 3;
}
message AttributeKeyValue {
    string key = 1;
    AnyValue value = 2;
}
```

While this improves performance (particularly memory usage for the most
frequently used attribute types), it is awkward and sacrifices too much
readability and usability for small performance gains. For the rare cases it
is also slow and uses even more memory, so its edge-case behavior is
undesirable.

### Using different schema for log data type

I also considered using different message definitions for LogRecord and Span
attributes. This would allow eliminating some of the requirements that we do
not yet formally have for Span attributes (particularly the need to have maps
of nested values).

However, this does not help much in terms of performance, makes Span and
LogRecord attributes non-interchangeable, and significantly increases code
bloat in applications that need to work with both Spans and Log records.
@mwear
Copy link
Member

mwear commented Jun 12, 2020

Semantically, I agree that the dotted string notation is equivalent to a map, although I'd like to point out that, at least from the tracing-client perspective, dotted strings are a slightly more efficient representation.

Consider the following representations for the key-value pair 'http.method': 'GET'.

Dotted string representation
{ 'http.method': 'GET' }
To represent this we need 1 map and 2 strings; 3 total objects.

Map representation
{ 'http': { 'method': 'GET' } }
This requires 2 maps and 3 strings; 5 total objects.

Furthermore, most tracing backends do not support nested attributes (as far as I know), and will need to flatten them into dotted strings. This is something that will either have to be done in the tracing clients during export, or by the backends on ingest.

While I recognize that this proposal has some advantages with regard to semantics and the data that can be represented, it does introduce complexity into tracing clients and backends. I'm not saying we shouldn't pursue it, but we should discuss what the actual benefits are, and whether the added complexity is worth the tradeoff.
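
The flattening step described above can be sketched in a few lines; this is a hypothetical helper (not part of any SDK or backend), assuming attributes arrive as string-keyed dicts:

```python
def flatten(attrs: dict, prefix: str = "") -> dict:
    """Flatten nested attribute maps into dotted-string keys,
    e.g. {'http': {'method': 'GET'}} -> {'http.method': 'GET'}."""
    out = {}
    for key, value in attrs.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested maps, extending the dotted prefix.
            out.update(flatten(value, dotted))
        else:
            out[dotted] = value
    return out

print(flatten({'http': {'method': 'GET', 'url': '/x'}}))
# {'http.method': 'GET', 'http.url': '/x'}
```

A backend that does not support nested attributes could apply such a transform on ingest; clients could equally run it at export time, at the cost described above.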

tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Jun 15, 2020
## Summary

This adds support for arrays and maps to attribute values, including support for
nested values.

This is a breaking protocol change.

Resolves: open-telemetry/opentelemetry-specification#376
Resolves: open-telemetry#106

## Motivation

There are several reasons for this change:

- The API defines that attribute values [may contain arrays of values](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#set-attributes).
  However, the protocol has no way of representing array values. We need to add
  such a capability.

- We intend to support the Log data type in the protocol, which also requires
  array values (a Log Data Model requirement). In addition, the Log data type
  requires support for key-value lists (maps) as attribute values, including
  nested values.

- There are long-standing requests to support nested values, arrays and maps for
  attributes:
  open-telemetry/opentelemetry-specification#376
  open-telemetry/opentelemetry-specification#596

@mwear
Copy link
Member

mwear commented Jun 15, 2020

I'd also like to add that with the dotted-string notation, tracing clients can reduce runtime string allocations for attribute keys to zero by introducing constants for semantic conventions (and any other commonly used keys). We would lose this ability by switching to nested maps.

I should also clarify that I am completely ok with array support. It's the nested map support that I have reservations about.
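
For illustration, the constant-key pattern described above could look like this (a minimal Python sketch; the constant and function names are illustrative, not actual SDK definitions):

```python
# With flat dotted-string keys, attribute keys can be interned once as
# module-level constants, so recording attributes on the hot path
# allocates no new key strings. (Names are illustrative.)
HTTP_METHOD = "http.method"
HTTP_STATUS_CODE = "http.status_code"

def record_http_attributes(attrs: dict, method: str, status_code: int) -> None:
    # Reuses the constant key objects; no per-call key construction.
    attrs[HTTP_METHOD] = method
    attrs[HTTP_STATUS_CODE] = status_code

attrs = {}
record_http_attributes(attrs, "GET", 200)
# attrs is now {"http.method": "GET", "http.status_code": 200}
```

With nested maps, by contrast, the same data would require building intermediate map objects at runtime, which is the overhead being pointed out.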

@tigrannajaryan
Copy link
Member Author

@bogdandrutu can you please clarify why this was reopened?

@bogdandrutu
Copy link
Member

@tigrannajaryan because of last week's discussion and the concerns raised by @mwear

@Oberon00
Copy link
Member

Oberon00 commented Jun 24, 2020

Was closing it even intentional? I can't remember any final decision in this issue. At least it's not documented here?

tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Nov 30, 2022
Resolves open-telemetry#376

Use cases where this is necessary or useful:

1. Specify more than one resource in the telemetry: open-telemetry#579
2. Data coming from external source, e.g. AWS Metadata: open-telemetry#596 (comment)
   or open-telemetry#376 (comment)
3. Capturing HTTP headers: open-telemetry#376 (comment)
4. Structured stack traces: open-telemetry#2841
5. Payloads as attributes: open-telemetry/oteps#219 (comment)

This is a draft PR to see what the change looks like.

If this PR is merged it will be nice to follow it up with:
- A standard way of flattening maps and nested objects when converting from OTLP
  to formats that don't support maps/nested objects.
- Recommendations for semantic conventions to use/not use complex objects.
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Dec 14, 2022
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Feb 9, 2023
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Feb 9, 2023
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Feb 14, 2023
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Jan 10, 2024
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Jan 10, 2024
@austinlparker
Copy link
Member

Closed by #3858

@adriangb
Copy link

adriangb commented Feb 5, 2025

I have to say I'm quite disappointed with this outcome.

IMO it's not a huge ask for backends to flatten nested attributes to dot notation if they don't want to support them, at least compared to #4333, which is a much bigger change for backends, and where my comment of "but this will make it harder for backends" was dismissed with "that sounds like a backend problem". Concretely, a good chunk of (the majority of? I don't have hard numbers) new backends seem to be built on top of ClickHouse, which has JSON support. Other modern open-source query engines like DuckDB and DataFusion all have JSON support. I can't help but feel that a lot of decisions in OpenTelemetry are driven by the needs of existing vendors.

The differences between what is supported at the SDK level vs. the spec level vs. the protocol level are also extremely confusing, e.g. see open-telemetry/opentelemetry-rust#2471 (comment)

@pellared
Copy link
Member

pellared commented Feb 5, 2025

@adriangb, You can also check this #2888

I can't help but feel that a lot of decisions in OpenTelemetry are driven by the needs of existing vendors.

I just want to call out that some vendors were supporting the change (extending the standard attributes), so please do not put everyone into the "existing vendors" bucket.

For instance, as far as I remember, all @open-telemetry/go-maintainers (Google, Elastic, Cisco, Splunk) were in favor of this change. I think we lost our will after a lot of the feedback seemed to be against the change.

Given #3858 I am not sure if there is room for changing the decision. Personally, I approved this PR (even though I wanted to extend the standard attributes) as we were circling back and forth during the Spec SIG meetings without any progress. We were not able to come up with an agreement on how to evolve the attributes. @open-telemetry/technical-committee, opinions?

@pellared
Copy link
Member

pellared commented Feb 6, 2025

As far as I see, the OTel Collector does support heterogeneous map and array values, nested values, bytes for attributes in all signals (including metrics, right?). For reference:

@mx-psi, am I correct? If so, are you experiencing any issues connected with it? If not, then maybe we should really revisit it and do it in all languages? According to the specification, this would be a bug (specification non-compliance) in the Collector. If the Collector decides to keep the current design, then I do not understand why all languages (including OTel Go) would have to be different.

@open-telemetry/technical-committee, @open-telemetry/governance-committee, I do not feel great about this given PRs like:

However, maybe after one year it should be retrospected, given that we have not stabilized "extended" attributes support for logs in many languages. By this time we have simplified many things related to the Logs signal (e.g. made Events part of the Logs API and SDK). Please consider whether this issue should be reopened (re-triaged?).

@mx-psi
Copy link
Member

mx-psi commented Feb 6, 2025

As far as I see, the OTel Collector does support heterogeneous map and array values, nested values, bytes for attributes in all signals (including metrics, right?). For reference:

@mx-psi, am I correct? If so then are you experiencing any issues connected with it? If not then maybe we should really revisit it and do it in all languages?

I am not aware of any issues, but I think we should ask the component owners using this. Here are some GitHub searches that show usages in the wild of pcommon.Map-related functions. Note that there are uses of this type that are not intended to construct "complex attributes"; in particular, the log body may be a map per the data model:

The only interesting use case I have found is parsing XML in attributes in OTTL. Maybe @TylerHelmuth @evan-bradley can talk about their experience here?

According to the specification it would be a bug (specification incompliance) in the collector. If the collector would decide to keep the current design, then I do not understand why all languages (including OTel Go) would have to be different.

I believe this is because the proto actually supports these: https://github.com/open-telemetry/opentelemetry-proto/blob/6a4f4e4fd867eda646ae14604dbc773e2516826a/opentelemetry/proto/common/v1/common.proto#L36-L37, and pdata is mostly a wrapper over the proto. If the Collector is not spec-compliant, is the proto also not specification-compliant then?

@adriangb
Copy link

adriangb commented Feb 6, 2025

Thanks @pellared. I'm sorry if I was harsh to all here.

One more thing I want to point out is that some things are moving from tracing to the Logs API in part because of this limitation: in https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/#tools nested Log bodies are being logged, and I suspect that Logs are being used because this data could not be sent via a span (this is a suspicion; I did not spend the time to dredge through the history of this decision). It's a shame to move things to a system that was initially designed for backwards compatibility with legacy systems just because of artificial limitations on span attributes.

@pellared
Copy link
Member

pellared commented Feb 6, 2025

I suspect that Logs are being used because this data could not be sent via a span

I think that:

  • not every event is inside a span/trace.
  • it may also be about delivering events before the span ends (probably this could be solved differently)

@adriangb
Copy link

adriangb commented Feb 6, 2025

it may be also about delivering events before the span ends (probably this could be solved differently)

Totally agreed, I think this is an issue. I think I mentioned this before, but we've created a concept of a "pending span": a span we send out with 0 duration as soon as it is opened, so that we can show long-lived spans before they are closed. We would be very happy for there to be some sort of convention along the lines of "send a log/event with X data to notify a backend that a span has started".

not every event is inside a span/trace

This I'm a bit confused by. I understand that you may not have a span currently running at the point when you want to emit an event, but what is stopping you from opening a span just to do that? Here's what we do:

```python
import logfire

logfire.info('an event/log')
```

Which behind the scenes we transform into (in reality it is more complicated: we hook in deeper to mark this as a "log span", make the duration 0, and avoid overheads):

```python
span = trace.start_current_span('an event/log')
span.end()
```

But that allows emitting events without a span currently active and without a new type of signal. Am I missing your point?

If anything this reinforces my point: if we allowed span attributes the full flexibility that log attributes have, we could go in the opposite direction and merge events into spans instead of merging events into logs (a signal initially designed for compatibility with legacy systems).

@pellared
Copy link
Member

pellared commented Feb 6, 2025

@adriangb, can you summarize and leave a comment here please: #4393?

I understand that you may not have a span currently running at the point when you want to emit an event, but what is stopping you from opening a span just to do that?

It is possible, but it comes with unnecessary performance overhead and data noise. I see it as an ugly/hacky workaround, not a proper solution.

@adriangb
Copy link

adriangb commented Feb 6, 2025

It is possible but it comes with unnecessary performance overhead and data noise. I see it as an ugly/hacky workaround and not a proper solution.

I don't think it's that much overhead, or much of a hack. It still goes out as a single thing over the wire. Users don't know the difference. It's not hard for backends to handle. And if there were agreement on going in that direction, it would be trivial to add the right APIs to the SDKs to do this with zero overhead and zero ambiguity.

@adriangb
Copy link

adriangb commented Feb 6, 2025

@adriangb, can you summarize and leave a comment here please: #4393?

Done, thanks

@TylerHelmuth
Copy link
Member

@mx-psi OTTL has a plethora of ways to assign map values to attributes for any signal. We allow this because pdata allows it.
