Simplified span2 format #1499

Closed · codefromthecrypt opened this issue Jan 19, 2017 · 80 comments
Labels: enhancement, model (Modeling of traces)

@codefromthecrypt (Member)

codefromthecrypt commented Jan 19, 2017

This is a proposal for a simpler json representation for tracers to use when reporting spans. It is a narrowing of scope from what was formerly known as Zipkin v2 (#939).

Scope

The scope is write-only, and json-only. For example, this doesn't imply dropping support for using the same span ID across client and server RPC calls. It does imply converting the new format to the existing model at the point of collection.

Format

The following describes the simplified fields. Note that traceId is fixed-width rather than variable-width: for 64-bit trace ids, simply pad left with 0s.

{
  "traceId": 16or32lowerHexCharacters,
  "parentId": 16lowerHexCharactersOrAbsentForRoot,
  "id": 16lowerHexCharacters,
  "kind": enum(CLIENT|SERVER|Absent),
  "name": stringOrAbsent,
  "timestamp": uint53EpochMicrosOrAbsentIfIncomplete,
  "duration": uint53MicrosOrAbsentIfIncomplete,
  "localEndpoint": existingEndpointTypeOrAbsent,
  "remoteEndpoint": existingEndpointTypeOrAbsent,
  "annotations": [
    {"timestamp": uint53EpochMicros, "value": string},
    ...
  ],
  "tags": {
    string: string,
    ...
  },
  "debug": trueOrAbsent,
  "shared": trueOrAbsent
}

Span name and timestamps are allowed to be missing to support late data, which is an edge case when spans are reported twice due to timeouts. Like previous versions, an empty span or service name defers the decision (usually to the other side of a shared span).

This format applied to a typical client span would look like this:

{
  "kind": "CLIENT",
  "traceId": "00000000000000005af7183fb1d4cf5f",
  "parentId": "6b221d5bc9e6496c",
  "id": "352bff9a74ca9ad2",
  "name": "query",
  "timestamp": 1461750040359130,
  "duration": 63874,
  "localEndpoint": {
    "serviceName": "zipkin-server",
    "ipv4": "172.19.0.3",
    "port": 9411
  },
  "remoteEndpoint": {
    "serviceName": "mysql",
    "ipv4": "172.19.0.2",
    "port": 3306
  },
  "tags": {
    "sql.query": "select distinct foo from bar"
  }
}

Impact on tracers

Tracers are the primary customers of this format, and it gives them closer alignment between field names and common operations. For example, the following is a "Turing complete" span api with an almost direct mapping to this format.

start()
name(String name)
kind(Kind kind)
remoteEndpoint(Endpoint endpoint)
annotate(String value)
tag(String key, String value)
finish()
flush() <-- for reporting spans that didn't start and finish locally
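
To make the mapping concrete, here is a minimal, hypothetical java sketch of such an api (annotate and the endpoint setters are elided for brevity; MutableSpan and its helpers are illustrative names, not Zipkin's actual types):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; not Zipkin's actual recording api.
public class MutableSpan {
  public enum Kind { CLIENT, SERVER }

  String traceId, parentId, id, name;
  Kind kind;
  long timestampMicros, durationMicros; // 0 means unset
  final Map<String, String> tags = new LinkedHashMap<>();

  public MutableSpan name(String name) { this.name = name; return this; }
  public MutableSpan kind(Kind kind) { this.kind = kind; return this; }
  public MutableSpan tag(String key, String value) { tags.put(key, value); return this; }

  /** Records the start time, which maps to the "timestamp" field. */
  public MutableSpan start() {
    timestampMicros = clock();
    return this;
  }

  /** Computes "duration" relative to start, then reports the finished span. */
  public void finish() {
    durationMicros = clock() - timestampMicros;
    report(this);
  }

  /** Reports without a duration, for spans that didn't start and finish locally. */
  public void flush() {
    report(this);
  }

  // Stand-in for a real microsecond-precision clock.
  static long clock() { return System.currentTimeMillis() * 1000; }

  // Stand-in for enqueueing onto an async reporter.
  static void report(MutableSpan span) { System.out.println("reporting: " + span.name); }
}
```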

It is important to note that this format does not in any way drop support for the existing one, so old tracers can continue to operate.

Impact on servers

Collectors will decode and convert the new spans into the existing span model, backfilling "cs" and "cr" annotations and the "sa" binary annotation to make it appear as if it were sent in the old format. In order to do so, they need to know how to read the format. Choosing only json makes this simpler.
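
For illustration only, a hedged sketch of that backfill for a client span; V1Annotation and the method shape are hypothetical stand-ins, and the real collector code differs:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the v1 annotation model.
class V1Annotation {
  final long timestampMicros;
  final String value;    // e.g. "cs", "cr"
  final String service;  // endpoint reduced to a service name for brevity

  V1Annotation(long timestampMicros, String value, String service) {
    this.timestampMicros = timestampMicros;
    this.value = value;
    this.service = service;
  }
}

class ClientSpanBackfill {
  /** Derives "cs" and "cr" from a v2 client span's timestamp and duration. */
  static List<V1Annotation> annotations(long timestampMicros, long durationMicros, String localServiceName) {
    List<V1Annotation> result = new ArrayList<>();
    result.add(new V1Annotation(timestampMicros, "cs", localServiceName));
    if (durationMicros != 0) {
      result.add(new V1Annotation(timestampMicros + durationMicros, "cr", localServiceName));
    }
    // The v2 remoteEndpoint would similarly become the "sa" binary annotation.
    return result;
  }
}
```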

For http, we can add an endpoint or accept a media type header more specific than json. We could also choose a heuristic route. Kafka and similar systems which don't have headers would need a heuristic approach.

A heuristic approach would be to look for fields that exist only in the new format (for example, a top-level localEndpoint or tags), and use their presence to decide which parsing rules to apply.
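
A hypothetical sketch of such a detector; which fields to probe is an assumption here, and a real implementation would inspect parsed json keys rather than substring-match:

```java
class FormatDetector {
  // Naive sketch: substring checks can false-positive on tag values, so a
  // real collector would look at actual top-level json keys instead.
  static boolean looksLikeSpan2(String json) {
    // "localEndpoint" and top-level "tags" exist only in the proposed format;
    // the old format nests endpoints inside (binary) annotations.
    return json.contains("\"localEndpoint\"") || json.contains("\"tags\"");
  }
}
```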

FAQ

Why just json?

This effort is geared toward making tracer development simpler, with the least scope needed to accomplish that. While thrift was supported in the old model, and has its merits, it is not a great format for people getting started, and it is very hard to debug. Those looking to reduce the size of data can compress the spans before they are sent. Those who really like thrift can use the old model.

Why not special-case the tracer's endpoint?

The tracer's endpoint (localEndpoint) doesn't change from operation to operation, so you could reasonably ask why this isn't uploaded separate from the span data. The reason is that it lowers the scope of work. Many tracers report by adding spans to a queue flushed occasionally to a message producer. By keeping each span self-contained, this type of logic can be reused with no changes at all.

Why not change the query api, too?

This format is simpler because it removes the need to attach the endpoint of the tracer to each annotation or tag. This simplification is possible in tracers because they have a one-to-one relationship with the traced service. At query time, a span can include both a client and a server, so it cannot be represented in this format.

Why not just remove support for the old model?

#939 started as an effort to change the semantics of spans to single-host, not just for upload, but also for processing, storage and query. After a year of discussion, it became clear this is not a practical or pragmatic first step. Reasons include the benefits of the current model and a general lack of interest. This is a tactical alternative that provides many of the wins without the work of a big-bang refactor.

What about async spans?

We recently started supporting async/one-way spans via "cs" -> "sr" annotations. Notice the span begins and ends with the request side of the RPC (there's no response sent by the server or received by the client). This translates to the new model as: clientSideSpan.start().flush(), then serverSideSpan.start().flush().
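
In terms of the hypothetical recording api sketched earlier, that translation might read as follows:

```java
public class OneWayExample {
  public static void main(String[] args) {
    // Client side: start marks the send; flushing reports the span without a
    // duration, which back-fills as a lone "cs" annotation in the old model.
    MutableSpan client = new MutableSpan()
        .kind(MutableSpan.Kind.CLIENT)
        .name("send")
        .start();
    client.flush();

    // Server side (ids arrive via propagation): start marks the receive;
    // flushing again reports without a duration, i.e. a lone "sr".
    MutableSpan server = new MutableSpan()
        .kind(MutableSpan.Kind.SERVER)
        .name("recv")
        .start();
    server.flush();
  }
}
```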

@codefromthecrypt (Member Author)

cc @openzipkin/instrumentation-owners @openzipkin/core

@bvillanueva-mdsol

@adriancole what does Absent mean in "kind"? What use case does it stand for? Does it mean we put 'Absent' as the value, or omit the field entirely?

@bvillanueva-mdsol

@adriancole We were talking about async traces with @jcarres-mdsol. Is it safe to assume that if finishTimestamp is empty and kind is CLIENT, an async process is being started with that span?

@codefromthecrypt (Member Author)

codefromthecrypt commented Jan 20, 2017 via email

@codefromthecrypt (Member Author)

We were talking about async traces with @jcarres-mdsol. Is it safe to assume that if finishTimestamp is empty and kind is CLIENT, an async process is being started with that span?

The semantics of one-way spans are currently the same as RPC, except there's no response. This was the least-effort approach we could come up with. There's an example of this in a recent pull request, #1497.

To do this using the new format in a backwards-compatible way, the key is being able to flush a span (report it to zipkin without attempting to calculate its duration locally).

ex. span.kind(client).start() --> translates into a "cs" annotation if you flush at this point
-- propagated to the server then..
ex. span.kind(server).start() --> translates into an "sr" annotation if you flush at this point

Let me know if this doesn't answer your question

@bvillanueva-mdsol

Thanks for the clarification @adriancole

@eirslett (Contributor)

eirslett commented Jan 20, 2017

Note that 64-bit integers are tricky in JavaScript, which doesn't have them.
Interestingly, the number of microseconds since 1970 currently fits in about 51 bits, while JavaScript can represent whole numbers exactly only up to 53 bits. (Just a coincidence.) So it will silently work until epoch microseconds outgrow 53 bits (around the year 2255), at which point we'd hit our own flavor of the "year 2000 problem".
It might be safer to store timestamps as strings (possibly compact/encoded), even if that could impact performance?

@codefromthecrypt (Member Author)

codefromthecrypt commented Jan 21, 2017 via email

@codefromthecrypt (Member Author)

codefromthecrypt commented Jan 21, 2017 via email

@codefromthecrypt (Member Author)

@eirslett and anyone else just circling back.. would any of the below help on the date thing?

It is certainly the case that a bare "timestamp" implies seconds since the epoch. Would replacing "timestamp" with "micros" or adding a suffix clear up enough ambiguity to be worth its characters?

  • rename startTimestamp to startMicros or startTimestampMicros
  • rename finishTimestamp to finishMicros or finishTimestampMicros

@codefromthecrypt (Member Author)

PS: I had an offline request somewhere about the size impact of this. Spans that include more than one annotation or tag will be smaller than before, due to de-duplication of the local endpoint.

It won't be as small as using single-character field names, but it isn't optimizing for that anyway.

Comparing compressed json against the same data compressed in thrift is an exercise we could do, but I don't think size is the major feature here; simplicity is.

@shakuzen (Member)

Would replacing "timestamp" with "micros" or adding a suffix clear up enough ambiguity to be worth its characters?

I personally think unambiguous field names are best, so startTimestampMicros and finishTimestampMicros sound good to me. This way, the field names largely document themselves.

I don't see much value in optimizing for size here, especially considering compression. The marginal performance that I presume would be gained would not be worth the loss in clarity. And if such performance were a significant factor, JSON is already the wrong choice.

@codefromthecrypt (Member Author)

thinking about picking this up

@basvanbeek (Member)

One thing to think about... maybe define this format in proto3, as it supports a canonical JSON encoding. That way we keep a simple roll-your-own JSON model, but at the same time can handle higher-performance binary encoding/decoding.

@codefromthecrypt (Member Author)

codefromthecrypt commented Jul 2, 2017 via email

@codefromthecrypt (Member Author)

So the first wave will be implementing the collection side, and a storage implementation for elasticsearch. This can flex as we review the implementation.

Note: we aim to reuse the hex encoding of IDs as it exists today (lower-hex). This makes sure existing log integration patterns don't break (e.g. logging usually adds the trace ID to the context, and that is lower-hex, the same as in json and headers). This is particularly useful in storage backends that contain both logging and tracing data (e.g. Elasticsearch).
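
For example, a minimal java sketch of that fixed-width lower-hex encoding (helper names are illustrative):

```java
public class HexIds {
  /** Encodes a 64-bit ID as a fixed-width, zero-padded lower-hex string. */
  static String toLowerHex(long id) {
    // In java, %x formats a long using its unsigned two's-complement bits.
    return String.format("%016x", id);
  }

  public static void main(String[] args) {
    System.out.println(toLowerHex(0x5af7183fb1d4cf5fL)); // 5af7183fb1d4cf5f
    // A 64-bit trace id in the 32-character field is left-padded with zeros:
    System.out.println("0000000000000000" + toLowerHex(0x5af7183fb1d4cf5fL));
  }
}
```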

Note part deux: usually, collectors receive single-host spans, but we'll need to address shared spans sooner or later. Kafka pipelines exist (theoretically) that send merged traces directly to storage (e.g. via zipkin-sparkstreaming). Also, people sometimes save off json (which has merged spans) and POST it later. While not a data format issue, it is an integration issue that will bite us much as it bit stackdriver, when a storage layer like ES stores single-host spans. https://github.com/GoogleCloudPlatform/stackdriver-zipkin/blob/master/translation/src/main/java/com/google/cloud/trace/zipkin/translation/SpanTranslator.java

I'll do my best to keep all operational notes like above close to the code, so that nothing is surprising.

@SergeyKanzhelev

SergeyKanzhelev commented Jul 3, 2017

What do you think of making this JSON two-layered: an envelope with the basic correlation fields, plus a versioned and typed payload? Something along these lines:

{
  "traceId": 32lowerHexCharacters,
  "parentId": 16lowerHexCharactersOrAbsentForRoot,
  "kind": enum(SPAN_CLIENT|SPAN_SERVER|SPAN_Local|ANNOTATION),
  "debug": trueOrAbsent,
  "data": 
    {
      "id": 16lowerHexCharacters,
      "name": string,
      "finishTimestamp": uint53EpochMicros,
      "localEndpoint": existingEndpointType,
      "remoteEndpoint": existingEndpointTypeOrAbsent,
      "tags": {
        string: string,
        ...
      },
      "annotations": [
        {"timestamp": uint53EpochMicros, "value": string},
        ...
      ],
    }
}

This way you can address scenarios like the following:

  • You can send an annotation separately from the span, without needing to specify the end time:
    {
      "traceId": 32lowerHexCharacters,
      "parentId": 16lowerHexCharactersOrAbsentForRoot,
      "kind": ANNOTATION,
      "debug": trueOrAbsent,
      "data":
      {
          "name": string,
          "localEndpoint": existingEndpointType,
          "tags": {
            string: string,
            ...
          }
      }
    }
  • Some annotations could be moved out of the span in the future, so they can be indexed and queried more easily
  • Versioning of span types may be easier in the future: just encode the version in the kind field
  • Other types like traces/exceptions/etc. may be supported as payloads in the future. Those types would be correlated with traces without the need to change the UI and tools. If the type is unknown, a tool can simply show the JSON

@codefromthecrypt (Member Author)

codefromthecrypt commented Jul 4, 2017 via email

@SergeyKanzhelev

@adriancole thanks for the clarification. My understanding of the schema was wrong; I thought only the fields marked "Absent" could be omitted:

"kind": enum(CLIENT|SERVER|Absent),
"remoteEndpoint": existingEndpointTypeOrAbsent,
"debug": trueOrAbsent

So I wasn't clear on how you can report a span name for the runaway annotation. The intent of the proposed envelope was to move the fields strictly required for correlation one level up, and to put all situation-specific fields inside the data object. Using kind, you can strongly type those data objects further.

Also, that proposal didn't change any data formats (beyond extending kind to a string), so I thought it might be incorporated now to make versioning easier in the future.

Anyway, I see your point that this proposal doesn't solve any major problems you have immediate plans for, and it creates unnecessary complexity and overhead.

I really like how this schema simplifies the semantics of a span!

@codefromthecrypt (Member Author)

codefromthecrypt commented Jul 4, 2017 via email

@codefromthecrypt (Member Author)

updated about absent

@SergeyKanzhelev

What about the name? Will you be able to report it when you have a remote annotation only? Or will it just be empty?

@codefromthecrypt (Member Author)

good point. we can accept an absent span name and use the same semantics as we currently do for an empty one (which is to defer the naming choice). updated

@codefromthecrypt (Member Author)

ok made a tracking issue for this #1644

@yurishkuro (Contributor)

@adriancole what about the old format that was blocking openapi?

@codefromthecrypt (Member Author)

codefromthecrypt commented Aug 14, 2017 via email

@codefromthecrypt (Member Author)

nearing the finish line here. The one remaining task is honing the java api for span recording. We have an opportunity to make it easier and also match the schema 1-1. That would include the following:

  • Change IDs to validated strings
    • This doesn't mean we don't validate: an ID is still 16 or 32 hex chars
    • This should be provably more efficient than repeatedly converting back and forth from longs
    • We can tell the difference between an ID that was propagated as 128-bit and one that had all high bits unset (helpful when looking at instrumentation)
    • Storage backends that keep fixed binary or uint64 can continue to do so (we'd just convert on the way in)
  • Change IPs to validated strings
    • this is cheaper than repeatedly validating, and avoids packing and unpacking InetAddress instances
    • since we don't repeat endpoints anymore, integer encoding isn't as necessary
    • strings are a lot easier to read and less work to debug than byte arrays or integers

By doing the above, the java struct looks exactly like the openapi struct. This makes generated code in any language look just like our native code, which means porting should be easier. This also makes everything easier to troubleshoot, as there are no conversions between types that don't always "toString" well.

Before, zipkin had to pay special attention to memory size, as things were repeated a lot. While IDs are definitely longer this way (16 chars take more memory than a long), this seems a reasonable tradeoff. Moreover, if that were a big deal, instrumentation could build a custom struct more easily with v2 than it can today.
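
A minimal sketch of what a validated-string id check could look like (a hypothetical helper, not the actual zipkin2 code):

```java
class IdValidation {
  /** Checks an id is 16 or 32 lower-hex characters while keeping it a string. */
  static String checkLowerHex(String field, String value) {
    if (value.length() != 16 && value.length() != 32) {
      throw new IllegalArgumentException(field + " should be 16 or 32 characters: " + value);
    }
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if ((c < '0' || c > '9') && (c < 'a' || c > 'f')) {
        throw new IllegalArgumentException(field + " should be lower-hex: " + value);
      }
    }
    return value;
  }
}
```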

cc @nicmunroe @anuraaga @shakuzen @jplock @llinder @devinsba

@codefromthecrypt (Member Author)

oh yeah, one more thing on ^^: it makes the struct gson-, jackson-, and moshi-friendly, i.e. very hard to get serialization wrong.

@codefromthecrypt (Member Author)

On the java struct having validated strings: it seems sensible, so I will try it out, probably in a PR against the elasticsearch WIP.

@codefromthecrypt (Member Author)

started on trace identifiers as strings here: #1721

@codefromthecrypt (Member Author)

The java api change to validated strings for IPs and IDs resulted in 2x efficiency gains in benchmarks (#1721). It is incidentally also easier to write your own codec for. However, when we write to storage, we should still use the formal library, as it normalizes IPv6, which makes string comparisons more likely to work.
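
To illustrate why normalization matters for string comparison, here is a small self-contained example using only the JDK (the zipkin libraries use their own normalization; this just shows the effect):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class NormalizeIp {
  public static void main(String[] args) throws UnknownHostException {
    // Two textual forms of the same IPv6 address.
    String a = "2001:DB8::1";
    String b = "2001:db8:0:0:0:0:0:1";
    // getByName parses an address literal without a DNS lookup;
    // getHostAddress prints a canonical textual form.
    String canonicalA = InetAddress.getByName(a).getHostAddress();
    String canonicalB = InetAddress.getByName(b).getHostAddress();
    System.out.println(a.equals(b));                   // false
    System.out.println(canonicalA.equals(canonicalB)); // true
  }
}
```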

@codefromthecrypt (Member Author)

I've thought about it, and the "v2 library" will be able to encode the v1 format (at least json). This is important because, for example, if this is used in brave or similar, we want tracers to be able to upgrade independently of the system. E.g. Encoder.LEGACY_JSON would spit out the old format (with no dependency on v1 jars). This will happen after #1726

indrekj added a commit to salemove/zipkin-ruby-opentracing that referenced this issue Apr 13, 2018
Libraries are slowly migrating over to a next version span format. The
new format is a lot easier to implement and understand.

More info: openzipkin/zipkin#1499
codefromthecrypt added the enhancement and model (Modeling of traces) labels on Oct 23, 2018
abesto pushed a commit to abesto/zipkin that referenced this issue Sep 10, 2019
This adds an internal copy of a span conversion utility mentioned in
issue openzipkin#1499. This is starting internal to ease review and allow
incremental progress. The first consumer will be dependency linking.
abesto pushed a commit to abesto/zipkin that referenced this issue Sep 10, 2019
This adds an internal copy of a span json codec issue openzipkin#1499. This starts
internal to ease review and allow incremental progress. The first
consumer will be Elasticsearch, as this format removes nested queries.

Note: this change also introduces json serialization of Span2, which
allows future use in Spark.
abesto pushed a commit to abesto/zipkin that referenced this issue Sep 10, 2019
This drops the internal type of DependencyLinkSpan in favor of the Span2
type coming in openzipkin#1499. Doing so now gives us practice, solves a few bugs
along the way. When Span2 becomes non-internal, the change will be a
simple package rename.
abesto pushed a commit to abesto/zipkin that referenced this issue Sep 10, 2019
This accepts the json format from openzipkin#1499 on current transports. It does
so by generalizing format detection from the two Kafka libraries, and
a new `SpanDecoder` interface. Types are still internal, but this allows
us to proceed with other work in openzipkin#1644, including implementing reporters
in any language.

Concretely, you can send a json list of span2 format as a Kafka or Http
message. If using http, use the /api/v2/spans endpoint like so:

```bash
$ curl -X POST -s localhost:9411/api/v2/spans -H'Content-Type: application/json' -d'[{
  "timestamp_millis": 1502101460678,
  "traceId": "9032b04972e475c5",
  "id": "9032b04972e475c5",
  "kind": "SERVER",
  "name": "get",
  "timestamp": 1502101460678880,
  "duration": 612898,
  "localEndpoint": {
    "serviceName": "brave-webmvc-example",
    "ipv4": "192.168.1.113"
  },
  "remoteEndpoint": {
    "serviceName": "",
    "ipv4": "127.0.0.1",
    "port": 60149
  },
  "tags": {
    "error": "500 Internal Server Error",
    "http.path": "/a"
  }
}]'
```
@jorgheymans (Contributor)

@adriancole it's late and i don't have the energy to scrutinize the 77 comments on this issue, but i have a hunch it can be closed, given that v2 has been out in the open for quite a while. Unless you object, i'll close it tomorrow.

@codefromthecrypt (Member Author)

codefromthecrypt commented May 21, 2020 via email

@jorgheymans (Contributor)

🎵 Another one bites the dust 🎵

@jcchavezs and at your request here's the Mercury emoji ☿️
