Zipkin v2 span model #939

Closed
codefromthecrypt opened this issue Feb 3, 2016 · 36 comments
Labels
enhancement, model (Modeling of traces)

Comments

@codefromthecrypt
Member

We should not try to do all things as zipkin v2; the large number of interesting ideas may lead us to never even start. I believe that if we pare down to basics, creating a model that only includes features that work in zipkin 1 backends, we can start and finish something.

For example, zipkin 1 only really supports String -> String binary annotations, so let's make them String -> String for v2 even if they become fancier later.

Zipkin v1 allows "contributory" spans, where instrumentation is allowed to place multiple hosts into the same span. Let's prevent that and use a portability technique Google Cloud Trace uses: it allows multiple spans with the same id, but they vary on span.kind. For example, core server annotations really imply span.kind=server, where "sr" is span.startTs and "ss" is span.endTs.

We could probably literally use the Google Cloud Trace model as zipkin v2, as it is almost exactly what we need.

https://cloud.google.com/trace/api/reference/rest/v1/projects.traces#TraceSpan

If we do something like this, I'm confident we can get v2 out in relatively short order, and still not prevent more interesting (and severe) change from occurring in later versions of zipkin.
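To make the split described above concrete, here's a minimal sketch of how v1 core annotations could fold into a kind plus start/end timestamps. This is illustrative only; CoreAnnotationMapper and V2SpanDraft are hypothetical names, not part of the proposal.

// Hypothetical illustration of the split described above: core annotations
// imply a span.kind, with "cs"/"sr" as the start and "cr"/"ss" as the end.
class CoreAnnotationMapper {

  // Mutable holder used only for this sketch.
  static final class V2SpanDraft {
    String kind;
    Long startTsMicros;
    Long endTsMicros;
  }

  static void apply(String value, long timestampMicros, V2SpanDraft draft) {
    switch (value) {
      case "cs": draft.kind = "CLIENT"; draft.startTsMicros = timestampMicros; break;
      case "cr": draft.kind = "CLIENT"; draft.endTsMicros = timestampMicros; break;
      case "sr": draft.kind = "SERVER"; draft.startTsMicros = timestampMicros; break;
      case "ss": draft.kind = "SERVER"; draft.endTsMicros = timestampMicros; break;
      default: break; // not a core annotation; leave it as a regular annotation
    }
  }
}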

@codefromthecrypt
Member Author

here's an issue where binary annotations were a bit confusing (as they allow repeats on keys) openzipkin/brave#136

@yurishkuro
Contributor

here's an issue where binary annotations were a bit confusing (as they allow repeats on keys)

Is it fair to say that ^^^ is only confusing because the UI does not associate binary annotations with the "span kind" that contains them? For example, I made a fix that adds the service name to SA/CA annotations:

[screenshot: SA/CA annotations showing the service name in the UI]

Another common scenario I've seen with repeated keys is http.url, which is sent by both client and server, and the value may or may not be the same.

If we adopt the suggestion that (spanID, spanKind, emittingService) combination is unique (which I think implies ditching core annotations and moving the Endpoint on the Span itself), then seemingly identical binary annotations may still be distinguishable by the service that emits them.
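To make that uniqueness rule concrete, here's a minimal sketch of the kind of lookup key a backend could deduplicate on under that suggestion. SpanKey and its fields are hypothetical names used only for illustration.

import java.util.Objects;

// Hypothetical key for the suggested uniqueness rule: one span record per
// (traceId, spanId, spanKind, emitting service). Names and types are illustrative.
final class SpanKey {
  final String traceId;
  final String spanId;
  final String kind;        // e.g. "CLIENT" or "SERVER"
  final String serviceName; // the emitting service

  SpanKey(String traceId, String spanId, String kind, String serviceName) {
    this.traceId = traceId;
    this.spanId = spanId;
    this.kind = kind;
    this.serviceName = serviceName;
  }

  @Override public boolean equals(Object o) {
    if (!(o instanceof SpanKey)) return false;
    SpanKey that = (SpanKey) o;
    return traceId.equals(that.traceId) && spanId.equals(that.spanId)
        && kind.equals(that.kind) && serviceName.equals(that.serviceName);
  }

  @Override public int hashCode() {
    return Objects.hash(traceId, spanId, kind, serviceName);
  }
}

With such a key, two seemingly identical tags reported by the client and the server land in distinct records rather than colliding within one span.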

@codefromthecrypt
Member Author

The above concern on repeating keys is more comprehensively reviewed here: #989 (comment)

If the span included a copy of the endpoint, it would be possible to support this, though let's bear in mind this isn't a popular feature.

codefromthecrypt pushed a commit that referenced this issue Feb 23, 2016
Kafka messages have binary payloads and no key. The binary contents are
serialized TBinaryProtocol thrift messages. This change peeks at the
first bytes to see if the payload is a List of structs or not, reading
accordingly.

This approach would need a revision if we ever add a Struct field to
Span. However, that is unlikely. At the point we change the structure of
Span, we'd likely change other aspects which would make it a different
struct completely (see #939). In such case, we'd add a key to the kafka
message of the span version, and not hit the code affected in this
change.

Fixes #979
@codefromthecrypt
Member Author

When we get to actually implementing this, we should be very explicit about the safety concerns and any issues that remain.

For example, even if a span is single-endpoint, do we accept updates to it? If we do, there's a chance of high-maintenance concerns, such as a client sending different names for the same span.

@codefromthecrypt
Member Author

codefromthecrypt commented Nov 30, 2016

Here's my take on the zipkin v2 model: simplified, focused on single-host spans, not including any new features, and removing troublesome or unsupported ones. Zipkin v3 could introduce features not present in our storage or rest api, but I believe it is best to pare down first.

Here's an example marked up in java.

class Span {
  class TraceId {
    long hi;
    long lo;
  }

  enum Kind {
    CLIENT,
    SERVER,
    ONE_WAY
  }

  class Annotation {
    long timestampMicros;
    String value;
  }

  @Nullable Kind kind;

  TraceId traceId;
  @Nullable Long parentId;
  long id;

  String name;

  long timestampMicros;
  long durationMicros;

  Endpoint localService;
  @Nullable Endpoint peerService;

  List<Annotation> annotations;
  Map<String, String> tags;

  boolean debug;
}
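A minimal usage sketch of the model above, mirroring the client example that appears later in this thread. The bare field access and the endpoint parameters are assumptions for illustration only; the proposal sketches fields, not a builder API.

import java.util.ArrayList;
import java.util.HashMap;

class SpanUsageSketch {
  // zipkinServerEndpoint and mysqlEndpoint are hypothetical, pre-built Endpoints.
  static Span clientSpan(Endpoint zipkinServerEndpoint, Endpoint mysqlEndpoint) {
    Span span = new Span();
    span.kind = Span.Kind.CLIENT;
    span.id = 0x352bff9a74ca9ad2L;            // traceId/parentId omitted for brevity
    span.name = "query";
    span.timestampMicros = 1461750040359000L; // start time, epoch micros
    span.durationMicros = 5000L;              // 5 milliseconds
    span.localService = zipkinServerEndpoint; // the tracer's own host
    span.peerService = mysqlEndpoint;         // what "sa" expressed in v1
    span.annotations = new ArrayList<>();     // no "cs"/"cr": kind plus timing imply them
    span.tags = new HashMap<>();
    span.tags.put("jdbc.query", "select ...");
    return span;
  }
}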

@codefromthecrypt
Member Author

@openzipkin/cassandra @openzipkin/elasticsearch @openzipkin/instrumentation-owners @openzipkin/core finally getting around to this. As mentioned in the previous comment, I don't think it is useful to try to make the model fancier; rather, we should pare it down to the smallest model that supports our most used features. Later models can be fancier after we stabilize.

The model above can support our existing rest api, as well as be heuristically derived from existing spans. For example, similar heuristics are already in place to export our current zipkin spans into Google Stackdriver Trace, which as mentioned earlier has a similar single-host model.

If reactions to the proposal I put forward are mostly positive, then I can open an issue to sort out the details and any impact on storage, translation, propagation, etc. This would be similar to the 128-bit change, where it would likely take a couple of months to sort everything out.

@basvanbeek
Member

@adriancole I'm not seeing the operation name as part of the proposed v2 span. Do you want to drop it in favor of consumers using tags for this?

@codefromthecrypt
Member Author

@basvanbeek uhh.. click refresh :)

@nicmunroe

I like this model. Clear, concise, easy to understand what everything is and what it's for.

@jcarres-mdsol
Contributor

How do you know if you should sample this trace?
Would it make sense for tags to have a timestamp?

@mansu

mansu commented Nov 30, 2016

Does it make sense to merge annotations and tags into a single structure? I feel tags should also be a list.

@codefromthecrypt
Member Author

codefromthecrypt commented Dec 1, 2016 via email

@codefromthecrypt
Member Author

codefromthecrypt commented Dec 1, 2016 via email

@codefromthecrypt
Member Author

codefromthecrypt commented Dec 1, 2016

Examples will help a lot. Here's a literal translation of an existing client span into the proposed simplified format. Note: one subtle change is suffixing Micros to our units. This helps with things like logstash etc., which expect timestamps to be millis or otherwise (meaning you can add a normal timestamp without clashing!)

current

{
  "traceId": "5af7183fb1d4cf5f",
  "name": "query",
  "id": "352bff9a74ca9ad2",
  "parentId": "6b221d5bc9e6496c",
  "timestamp": 1461750040359000,
  "duration": 5000,
  "annotations": [
    {
      "endpoint": {
        "serviceName": "zipkin-server",
        "ipv4": "172.19.0.3",
        "port": 9411
      },
      "timestamp": 1461750040359000,
      "value": "cs"
    },
    {
      "endpoint": {
        "serviceName": "zipkin-server",
        "ipv4": "172.19.0.3",
        "port": 9411
      },
      "timestamp": 1461750040364000,
      "value": "cr"
    }
  ],
  "binaryAnnotations": [
    {
      "key": "jdbc.query",
      "value": "select distinct `zipkin_spans`.`trace_id` from `zipkin_spans` join `zipkin_annotations` on (`zipkin_spans`.`trace_id` = `zipkin_annotations`.`trace_id` and `zipkin_spans`.`id` = `zipkin_annotations`.`span_id`) where (`zipkin_annotations`.`endpoint_service_name` = ? and `zipkin_spans`.`start_ts` between ? and ?) order by `zipkin_spans`.`start_ts` desc limit ?",
      "endpoint": {
        "serviceName": "zipkin-server",
        "ipv4": "172.19.0.3",
        "port": 9411
      }
    },
    {
      "key": "sa",
      "value": true,
      "endpoint": {
        "serviceName": "mysql",
        "ipv4": "172.19.0.2",
        "port": 3306
      }
    }
  ]
}

simplified

{
  "kind": "CLIENT",
  "traceId": "5af7183fb1d4cf5f",
  "parentId": "6b221d5bc9e6496c",
  "id": "352bff9a74ca9ad2",
  "name": "query",
  "timestampMicros": 1461750040359000,
  "durationMicros": 5000,
  "localEndpoint": {
    "serviceName": "zipkin-server",
    "ipv4": "172.19.0.3",
    "port": 9411
  },
  "remoteEndpoint": {
    "serviceName": "mysql",
    "ipv4": "172.19.0.2",
    "port": 3306
  },
  "tags": {
    "jdbc.query": "select distinct `zipkin_spans`.`trace_id` from `zipkin_spans` join `zipkin_annotations` on (`zipkin_spans`.`trace_id` = `zipkin_annotations`.`trace_id` and `zipkin_spans`.`id` = `zipkin_annotations`.`span_id`) where (`zipkin_annotations`.`endpoint_service_name` = ? and `zipkin_spans`.`start_ts` between ? and ?) order by `zipkin_spans`.`start_ts` desc limit ?"
  }
}

@codefromthecrypt
Member Author

Hopefully it is visible in the example that the bookend annotations "cs" and "cr" are not needed in the simplified model, yet carry the same signal (as you can tell from start, duration, and type).

@codefromthecrypt
Member Author

PS: details, but I'm flexible on how the address annotations are mapped to the top level.

e.g. localService vs localEndpoint vs endpoint

@bhs

bhs commented Dec 1, 2016

@adriancole I'm not sure what sort of appetite there is for breaking changes here, but ignoring compatibility, I would suggest...

  • Moving anything RPC-role-related (the peer endpoint, the CLIENT/SERVER, etc) into the tags
  • A silly optimization: factoring everything about the local VM / local process / etc out of the Span per se and moving it into the message that reports a set of spans from that VM/process/etc. It will be the same for all Spans in that reporting message, so why represent it over and over again?
  • Store the sampling rate somewhere (or its reciprocal which is probably smaller to represent) so that there's a principled way to infer statistically reasonable data about the pre-sampling stream. Note that this would need to propagate in-band somehow. (FWIW, in Dapper this was part of the same int32 that held the "sampled" bit)
  • Related to the sampling rate, I would get rid of the debug flag

@codefromthecrypt
Member Author

codefromthecrypt commented Dec 1, 2016 via email

@codefromthecrypt
Member Author

codefromthecrypt commented Dec 1, 2016 via email

@devinsba
Member

devinsba commented Dec 1, 2016

I think it would be useful to see what a full V1 trace would look like in this simplified format. The client one you posted above looks good to me. It simplifies a bunch of things I've had gripes with (endpoint on binary annotation).

@codefromthecrypt
Member Author

codefromthecrypt commented Dec 6, 2016

TL;DR: I've been thinking about how we avoid breaking v3 in the process of doing v2. If we change the tags to permit storage of string, number, and bool values (as opposed to rejecting them), then the model can upgrade into a more featureful v3 system, particularly with advanced query support.


I think there's definitely a need to keep a v3 in mind when doing v2. How we can eventually support things on our backlog, such as analysis or linked traces, is important. It is equally important that we can "upgrade into" future features (i.e. usually by adding fields, and at worst removing some). Being future-conscious without paralysis or over-engineering today is a very tough balance.

I know many full-bore systems keep mildly-typed tag values for filter, aggregation, or analysis reasons. For example, AWS X-Ray and Google's trace proto have a 3-type value of string, bool, or number. These can be mapped into json without qualifiers (even though there are some limitations, like quoting big numbers). As long as there are only 3 types, we can keep the tags map simple. A basic type is crucial, as it allows tags to be a dict vs a list of tuples, which is more arduous to map to standard types.

For example, even though we can't currently query on them, we can still permit basic-type tags like so, notably allowing a number:

  "tags": {
    "jdbc.result_count": 43,
    "jdbc.query": "select distinct `zipkin_spans`.`trace_id` from `zipkin_spans` join `zipkin_annotations` on (`zipkin_spans`.`trace_id` = `zipkin_annotations`.`trace_id` and `zipkin_spans`.`id` = `zipkin_annotations`.`span_id`) where (`zipkin_annotations`.`endpoint_service_name` = ? and `zipkin_spans`.`start_ts` between ? and ?) order by `zipkin_spans`.`start_ts` desc limit ?"
  }

Another analysis concern is not having to parse the annotations to figure out what they are. One way to accomplish that is to only send a map. However, it is difficult to reverse-engineer a user-level description out of that, and the lack of one leads to a very poor zipkin experience of what I call "json pooping", e.g. you have a message that is just a json blob. One neat thing I noticed about Google's trace proto is that it supports both: the annotation still has a primary description field, and you can later add classification tags for use in filtering or whatnot. In other words, the feature is something you can "upgrade into", because the annotation type just gets an "extra" field of tags (as opposed to having its primary field replaced by tags).

Ex. In both v1 and v2, annotations look like this (notably v2 excludes the implicit endpoint):

{"timestamp":1478624611519000,"value":"mr"}

In v3, we could add or parse classifying tags on it, without breaking existing code:

{"timestamp":1478624611519000,"value":"mr", "tags": { "type": "messaging", "direction": "received"}}

@codefromthecrypt
Member Author

Another update: I was working through brave4 and noticed all the conversion can be done mechanically. This means reporters and collectors could easily be changed to support the simplified form now, even if storage, query, etc. aren't updated for a while. In other words, we can do "version 2" now, without breaking the api.
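For illustration, here's roughly what that mechanical mapping looks like for the client/server cases under the proposed model. V1Annotation and V1BinaryAnnotation are tiny stand-ins for the existing model, not real Zipkin types; this is a sketch of the idea, not the actual converter.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SimplifiedToV1Sketch {

  // Minimal stand-ins for the existing v1 model, used only in this sketch.
  static final class V1Annotation {
    final long timestamp; final String value; final Endpoint endpoint;
    V1Annotation(long timestamp, String value, Endpoint endpoint) {
      this.timestamp = timestamp; this.value = value; this.endpoint = endpoint;
    }
  }

  static final class V1BinaryAnnotation {
    final String key; final String value; final Endpoint endpoint;
    V1BinaryAnnotation(String key, String value, Endpoint endpoint) {
      this.key = key; this.value = value; this.endpoint = endpoint;
    }
  }

  // Core annotations implied by kind plus timing.
  static List<V1Annotation> coreAnnotations(Span in) {
    List<V1Annotation> result = new ArrayList<>();
    if (in.kind == Span.Kind.CLIENT) {
      result.add(new V1Annotation(in.timestampMicros, "cs", in.localService));
      result.add(new V1Annotation(in.timestampMicros + in.durationMicros, "cr", in.localService));
    } else if (in.kind == Span.Kind.SERVER) {
      result.add(new V1Annotation(in.timestampMicros, "sr", in.localService));
      result.add(new V1Annotation(in.timestampMicros + in.durationMicros, "ss", in.localService));
    }
    return result;
  }

  // Tags become binary annotations on the local endpoint; the peer becomes
  // an address annotation (in v1 json the value is boolean true; a string is
  // used here only to keep the stand-in simple).
  static List<V1BinaryAnnotation> binaryAnnotations(Span in) {
    List<V1BinaryAnnotation> result = new ArrayList<>();
    for (Map.Entry<String, String> tag : in.tags.entrySet()) {
      result.add(new V1BinaryAnnotation(tag.getKey(), tag.getValue(), in.localService));
    }
    if (in.peerService != null && in.kind == Span.Kind.CLIENT) {
      result.add(new V1BinaryAnnotation("sa", "true", in.peerService)); // server address
    } else if (in.peerService != null && in.kind == Span.Kind.SERVER) {
      result.add(new V1BinaryAnnotation("ca", "true", in.peerService)); // client address
    }
    return result;
  }
}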

The only thing it requires is that instrumentation using the new format doesn't share span ids across a network boundary. I think starting with instrumentation (well, technically codec) is a nice way to ease in the change without a big bang.

Thoughts?

@codefromthecrypt
Member Author

PS: I noted a while ago that Stackdriver has a heuristic for splitting client+server spans into single-host spans. That's here. Uber also has a heuristic for the same task here.

When we get to the point of implementing the v2 model, it will likely be the case that storage isn't upgraded, but the UI is. In such a case, we can use a splitter similar to one of the above to represent traces in the new format even if the "raw span" is still the old model.
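A crude sketch of what such a splitter could look like against the proposed model: group a shared span's annotations by the service that reported them and emit one single-host span per group. SharedAnnotation and the method name are hypothetical; the real Stackdriver/Uber heuristics handle far more cases (address annotations, missing endpoints, clock skew), so this only shows the idea.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SharedSpanSplitterSketch {

  // Hypothetical stand-in for a v1 annotation with its reporting endpoint.
  static final class SharedAnnotation {
    final long timestampMicros; final String value; final Endpoint endpoint;
    SharedAnnotation(long timestampMicros, String value, Endpoint endpoint) {
      this.timestampMicros = timestampMicros; this.value = value; this.endpoint = endpoint;
    }
  }

  static List<Span> splitByService(List<SharedAnnotation> annotations) {
    Map<String, Span> byService = new LinkedHashMap<>();
    Map<String, Long> maxTs = new LinkedHashMap<>();
    for (SharedAnnotation a : annotations) {
      String service = a.endpoint.serviceName;
      Span span = byService.computeIfAbsent(service, s -> new Span());
      span.localService = a.endpoint; // trace/span ids and tags omitted for brevity
      if ("cs".equals(a.value) || "cr".equals(a.value)) span.kind = Span.Kind.CLIENT;
      if ("sr".equals(a.value) || "ss".equals(a.value)) span.kind = Span.Kind.SERVER;
      if (span.timestampMicros == 0 || a.timestampMicros < span.timestampMicros) {
        span.timestampMicros = a.timestampMicros;
      }
      maxTs.merge(service, a.timestampMicros, Math::max);
    }
    for (Map.Entry<String, Span> e : byService.entrySet()) {
      e.getValue().durationMicros = maxTs.get(e.getKey()) - e.getValue().timestampMicros;
    }
    return new ArrayList<>(byService.values());
  }
}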

@codefromthecrypt
Member Author

The mythical single-host span

From a trace instrumentation point of view, spans have always been single-host.
That's because the host of the span is the tracer, and the tracer only reports
its spans. With that point of view in mind, being required to add endpoint to
all spans feels laborious and inefficient. This has led to people asking to
change the model. I've gone down this path, too, though in the process have
grown an appreciation for the existing model. Here's why:

  • RPC spans require no navigation to see the parties involved
    If you look at the span data presented by the Zipkin api, when RPC data is
    grouped into a span, it is very easy to tell the timeline of activity between
    the client and the server. The UI takes advantage of this implicitly. When you
    click on a span, you automatically see the causal events in the same place:
    a client request caused the server response. When client and server are reported
    into different span IDs, there's a higher tooling burden. For example, I find it
    very hard to compare timestamps when they are nested into different json objects. Other
    tools like clock skew adjusters would have a different burden than today, too.

  • RPC spans permit third parties
    When modeling ancillary behavior, such as an intermediary, the current span
    model offers more options. For example, a reverse proxy could note a passthrough
    event with annotations as opposed to a new span. The code to do this is simpler,
    and the result would end up in the same json/UI panel. On the other hand,
    most tools are not written assuming there's N members in the same span: The
    approach of starting a new span for an instrumented proxy is still preferred,
    regardless of whether the proxy's span is propagated to the caller or not.

  • RPC spans will exist for a while!
    Even if we move to single-host spans, and address tooling gaps, we have to
    expect dual-host spans to coexist for a while. This means the canonical model
    either needs to synthesize fake spans from multi-host ones, or work with both
    models indefinitely. Synthesized IDs are mostly harmless, but not 100% harmless.
    It is likely that someone will question a span ID that never existed.

Where to go from here?

It seems most sensible to formalize a reporting representation even if the real
model needs to address multi-host for a while. The reporting representation can
be translated into the old one, and may eventually become its own model. In the
meantime, there's a backlog of work needed to really support single-host spans.
This includes making the UI detect RPCs that span different IDs, backfilling
tests and logic for clock skew correction and dependency linking, and probably
other things. Once the whole system works with single-host spans, we could then
consider formalizing it as the "one model", and switch storage etc accordingly.

@jcarres-mdsol
Contributor

Is the V2 API dependent on solving single-host spans?

@codefromthecrypt
Member Author

@jcarres-mdsol we can make a V2 api with any type of representation (e.g. if you are thinking about polishing things in the api, such as query parameters, etc.). This issue is more about the model.

We can add a v2 POST/PUT endpoint which has a single-host json model/representation, and that could be used by instrumentation immediately (since they don't report timing information from multiple hosts anyway). The implementation would simply translate it to the existing model before sinking to storage.

@jcarres-mdsol
Contributor

I think I'm still confused on this one: are single-host spans a better representation than spans shared among services?

@codefromthecrypt
Member Author

codefromthecrypt commented Jan 19, 2017

@jcarres-mdsol

I think I'm still confused on this one: are single-host spans a better representation than spans shared among services?

:) you asked whether it is a better representation. It is a better representation when you primarily deal with one host, which is the case with tracers.

e.g. when we write client tracing code, we might now do the following:

setType(client)
start()
addTag("http.path", "/foo")
setRemoteService(remoteService)
finish()

to map that to the existing span model, it becomes

addAnnotation(startTimestamp, "cs", localService)
addAnnotation(endTimestamp, "cr", localService)
addBinaryAnnotation("http.path", "/foo", localService)
addBinaryAnnotation("sa", true, remoteService)

it is very awkward to explain ^^ in docs..

using the proposed span model, it becomes:

setType(client)
setLocalService(serviceOfTracer)
setRemoteService(remoteService)
setTimestamp(startTimestamp)
setDuration(endTimestamp - startTimestamp)
addTag("http.path", "/foo")

I think the latter is a better representation for tracers to use because...

  • it is less jargony (tags vs "binary annotations")
  • it is simpler (string -> string, as opposed to a list of (string, blob, type, endpoint))
  • it is less chatty (no needless repetition of the same endpoint on every annotation or binary annotation)
  • it removes ambiguity (sometimes people use the wrong binary annotation type)

All these things feed into easier understanding and docs when writing tracers. That means less support time, and people set up for success as opposed to confusion. We'll have fewer people with bugs (e.g. where they leave out binaryAnnotation.host, which happens), because such bugs are now impossible.

Brownfield tracers do not need to change. People who already learned the existing format can run with it. The winners here are new users and people who support zipkin. We make it easier for new tracers to get started. We offer them a simplified upload format, devoid of nuanced traps or cognitive burden.

The cost of doing this is a parser on the server side. This is primarily the collectors (e.g. we can say only http and kafka) which accept the simplified format. In the OpenApi/swagger spec, for example, we'd define an alternate supported representation, over the same endpoint or over a different one. It can coexist, and I think it is worth the work even if the only beneficiaries were documentation and new users.

@codefromthecrypt
Member Author

note to all in the thread :)

This issue was created because zipkin v2 never happened, as it had too much scope. The issue initially discussed tackling the complexity in the model. It is still that.

As this issue nears a year in duration with no activity beyond discussion, I've suggested we pare it down further:

Allow a simpler representation format for tracers. Accept this format and convert to the existing one on collection.

I think the V2 word is scarier than it needs to be. This is a feature, just like a propagation format would be for tracers. The parties with the most responsibility for this feature are those who maintain Zipkin's docs and collector code. Not doing this means we continue to punt a known complexity problem further down the road.

If folks feel strongly we shouldn't do this, those folks need to roll up their sleeves to help when questions and concerns come up during onboarding, e.g. contributing to docs, gitter, and issue resolution when people get confused about what a binary annotation is and send the wrong data. If we prefer this route, please say so and help me with the burden of today.

@jcarres-mdsol
Contributor

I am all for this to happen. I think I was confused by all the shared-spans vs single-host discussion, etc.
We are talking about simplifying the json sent to the http API, so the earlier the better.

@codefromthecrypt
Member Author

codefromthecrypt commented Jan 19, 2017 via email

@jcarres-mdsol
Contributor

Yes, please; the current discussion will be confusing to anyone.
I'm very much looking forward to a new format.

@codefromthecrypt
Member Author

codefromthecrypt commented Jan 19, 2017 via email

@nicmunroe

FWIW I'm personally in favor of the simplified single-host spans. Anything that makes it more difficult for instrumentors to accidentally introduce bugs due to confusion/complexity/etc. is a good thing. And being simpler lowers the barrier to entry and eases the learning curve.

@codefromthecrypt
Member Author

A reminder that things are easier if we don't do mixed types for tags (e.g. a map whose values are string or bool). Not only does this not work in Elasticsearch mappings (first type wins), but it is also complex in json schema: openzipkin/zipkin-api#27

@jorgheymans
Contributor

Closing for the same reason as #1499 (comment). This is an issue thread about the design of the zipkin v2 model, which has long since been released.
