Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TODOs in Jaeger exporter spec #1374

Merged
merged 9 commits into from
Jan 26, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ New:

- Add `cloud.infrastructure_service` resource attribute
([#1112](https://github.com/open-telemetry/opentelemetry-specification/pull/1112))

Updates:

- Versioning and stability guarantees for OpenTelemetry clients([#1291](https://github.com/open-telemetry/opentelemetry-specification/pull/1291))
Expand Down Expand Up @@ -44,6 +44,8 @@ Updates:
- Resource SDK: Reverse (suggested) order of Resource.Merge parameters, remove
special case for empty strings
([#1345](https://github.com/open-telemetry/opentelemetry-specification/pull/1345))
- Trace Exporters: Fix TODOs in Jaeger exporter spec
([#1374](https://github.com/open-telemetry/opentelemetry-specification/pull/1374))

## v0.7.0 (11-18-2020)

Expand Down
109 changes: 103 additions & 6 deletions specification/trace/sdk_exporters/jaeger.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,15 +3,32 @@
This document defines the transformation between OpenTelemetry and Jaeger Spans.
Jaeger accepts spans in two formats:

* Thrift `Batch`, defined in [jaeger-idl/.../jaeger.thrift](https://github.com/jaegertracing/jaeger-idl/blob/master/thrift/jaeger.thrift)
* Protobuf `Batch`, defined in [jaeger-idl/.../model.proto](https://github.com/jaegertracing/jaeger-idl/blob/master/proto/api_v2/model.proto)
* Thrift `Batch`, defined in [jaeger-idl/.../jaeger.thrift](https://github.com/jaegertracing/jaeger-idl/blob/master/thrift/jaeger.thrift), accepted via UDP or HTTP
* Protobuf `Batch`, defined in [jaeger-idl/.../model.proto](https://github.com/jaegertracing/jaeger-idl/blob/master/proto/api_v2/model.proto), accepted via gRPC

See also:

* [Jaeger APIs](https://www.jaegertracing.io/docs/latest/apis/)
* [Reference implementation of this translation in the OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector/blob/master/translator/trace/jaeger/traces_to_jaegerproto.go)

## Summary

The following table summarizes the major transformations between OpenTelemetry
and Jaeger.

TBD
| OpenTelemetry | Jaeger Thrift | Jaeger Proto | Notes |
| ------------------------ | ---------------- | ---------------- | ----- |
| Span.TraceID | Span.traceIdLow/High | Span.trace_id | See [IDs](#ids) |
| Span.ParentID | Span.parentSpanId | as SpanReference | See [Parent ID](#parent-id) |
| Span.SpanID | Span.spanId | Span.span_id | |
| Span.Name | Span.operationName | Span.operation_name | |
| Span.Kind | Span.tags["span.kind"] | same | See [SpanKind](#spankind) for values mapping |
| Span.StartTime | Span.startTime | Span.start_time | See [Unit of time](#unit-of-time) |
| Span.EndTime | Span.duration | same | Calculated as EndTime - StartTime. See also [Unit of time](#unit-of-time) |
| Span.Attributes | Span.tags | same | See [Attributes](#attributes) for data types for the mapping. |
| Span.Events | Span.logs | same | See [Events](#events) for the mapping format. |
| Span.Links | Span.references | same | See [Links](#links) |
| Span.Status | Add to Span.tags | same | See [Status](#status) for tag names to use. |

## Mappings

Expand All @@ -36,9 +53,66 @@ OpenTelemetry Span's `InstrumentationLibrary` MUST be reported as span `tags` to
| `InstrumentationLibrary.name`|`otel.library.name`|
| `InstrumentationLibrary.version`|`otel.library.version`|

### Attribute
### IDs

Trace and span IDs in Jaeger are random sequences of bytes. However, Thrft model
represents IDs using `i64` type, or in case of a 128-bit wide Trace ID as two `i64`
fields `traceIdLow` and `traceIdHigh`. The bytes MUST be converted to/from unsigned
ints using Big Endian byte order, e.g. `[0x10, 0x00, 0x00, 0x00] == 268435456`.
The unsigned ints MUST be converted to `i64` by re-interpreting the existing
64bit value as signed `i64`. For example (in Go):

```go
var (
id []byte = []byte{0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
unsigned uint64 = binary.BigEndian.Uint64(id)
signed int64 = int64(unsigned)
)
fmt.Println("unsigned:", unsigned)
fmt.Println(" signed:", signed)
// Output:
// unsigned: 18374686479671623680
// signed: -72057594037927936
```

### Parent ID

Jaeger Thrift format allows capturing parent ID in a top-level Span field.
Jaeger Proto format does not support parent ID field; instead the parent
MUST be recorded as a `SpanReference` of type `CHILD_OF`, e.g.:

```python
SpanReference(
ref_type=opentracing.CHILD_OF,
trace_id=span.context.trace_id,
span_id=parent_id,
)
```

This span reference MUST be the first in the list of references.

### SpanKind

OpenTelemetry `SpanKind` field MUST be encoded as `span.kind` tag in Jaeger span,
except for `SpanKind.INTERNAL`, which SHOULD NOT be translated to a tag.

| OpenTelemetry | Jaeger |
| ------------- | ------ |
| `SpanKind.CLIENT`|`"client"`|
| `SpanKind.SERVER`|`"server"`|
| `SpanKind.CONSUMER`|`"consumer"`|
| `SpanKind.PRODUCER`|`"producer"` |
| `SpanKind.INTERNAL`| do not add `span.kind` tag |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collector currently adds "span.kind=internal" in this case [1]. Can you please clarify why it should not be doing this? I assume because "internal" is not a valid value that Jaeger can expect to see?

[1] https://github.com/open-telemetry/opentelemetry-collector/blob/ef40657447c038889ba8f666aab3f1f075a56b9e/translator/trace/jaeger/traces_to_jaegerproto.go#L368

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine either way, there's no special processing for "internal". We can allow both, or for consistency require "internal".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference is to not include internal since it increases data size without adding any useful info.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do not mind I would prefer not to change Collector's translation since I am worried that it may break people's stuff who rely on the current behavior (unlikely but possible). So, if using "internal" does not result in any problems for Jaeger let's allow it.


### Unit of time

TBD
In Jaeger Thrift format the timestamps and durations MUST be represented in
microseconds (since epoch for timestamps). If the original value in OpenTelemetry
is expressed in nanoseconds, it MUST be rounded or truncated to microseconds.

In Jaeger Proto format the timestamps and durations MUST be represented
with nanosecond precision using `google.protobuf.Timestamp` and
`google.protobuf.Duration` types.

### Status

Expand All @@ -54,9 +128,32 @@ The following table defines the OpenTelemetry `Status` to Jaeger `tags` mapping.

### Error flag

When Span `Status` is set to `ERROR` an `error` tag MUST be added with the
When Span `Status` is set to `ERROR`, an `error` span tag MUST be added with the
Boolean value of `true`. The added `error` tag MAY override any previous value.

### Attributes

OpenTelemetry Span `Attribute`(s) MUST be reported as `tags` to Jaeger.

Primitive types MUST be represented by the corresponding types of Jaeger tags.

Array values MUST be serialized to string like a JSON list as mentioned in
[semantic conventions](../../overview.md#semantic-conventions).

### Links

OpenTelemetry `Link`(s) MUST be converted to `SpanReference`(s) in Jaeger,
using `FOLLOWS_FROM` reference type. The Link's attributes cannot be represented
Comment on lines +145 to +146
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collector implementation specifically calls out that the parent span reference should be represented as CHILD_OF and must be at the first place because it is important for Jaeger backend [1]. Is this correct? If so let's also emphasize here.

[1] https://github.com/open-telemetry/opentelemetry-collector/blob/ef40657447c038889ba8f666aab3f1f075a56b9e/translator/trace/jaeger/traces_to_jaegerproto.go#L289

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you're describing is the behavior needed to represent ParentID, and I will clarify that it must be the first in the list. But this is a section for additional Links.

in Jaeger explicitly. The exporter MAY additionally convert `Link`(s) to span `Log`(s):

* use Span start time as the timestamp of the Log
* set Log tag `event=link`
* set Log tags `trace_id` and `span_id` from the respective `SpanContext`'s fields
* store `Link`'s attributes as Log tags

Span references generated from `Link`(s) MUST be added _after_ the span reference
generated from [Parent ID](#parent-id), if any.

### Events

Events MUST be converted to Jaeger Logs. OpenTelemetry Event's `time_unix_nano` and `attributes` fields map directly to Jaeger Log's `timestamp` and `fields` fields. Jaeger Log has no direct equivalent for OpenTelemetry Event's `name` and `dropped_attributes_count` fields but OpenTracing semantic conventions specify some special attribute names [here](https://github.com/opentracing/specification/blob/master/semantic_conventions.md#log-fields-table). OpenTelemetry Event's `name` and `dropped_attributes_count` fields should be added to Jaeger Log's `fields` map as follows:
Expand Down