Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize format for "simple" baggages (a.k.a "tags") #13

Closed
bogdandrutu opened this issue Sep 16, 2017 · 14 comments
Closed

Standardize format for "simple" baggages (a.k.a "tags") #13

bogdandrutu opened this issue Sep 16, 2017 · 14 comments

Comments

@bogdandrutu
Copy link
Contributor

Background:

Open Questions:

    1. Use a different header than "Trace-Context" or extend that format?
    1. What types do we support: key -> string? value -> (string, int64, bool) ?
    1. Size restrictions: Per key or value? Per entire set of baggages/tags? None?
@yurishkuro
Copy link
Member

yurishkuro commented Sep 16, 2017

  • +1 for different header(s). I actually liked, conceptually, the idea in Microsoft spec that only unique request ID (aka span ID) needs a dedicated header, all other values like traceID and sampling flags can be just part of the baggage. But it makes the basic tracing case more verbose.
  • what types to support - given that this is a spec for a text protocol, I don't think the discussion of types is relevant, because it only manifests at the API level for accessing the values. From this spec pov, baggage keys and values are just strings.
  • I would leave size restrictions out of the spec, at least in the first iterations - the implementations can apply their own rules to that.

Other open questions:

  • do we combine all baggage items into one header or split them?
    • Microsoft spec combines them iirc, which implicitly puts extra restrictions on the size
    • in Jaeger we split each baggage item into individual header, but I don't quite like the encoding scheme we're using where the baggage key is part of the header name, which puts restrictions on the character set available for the keys
    • one possible encoding is:
      • Baggage-Size: <count of items>
      • Baggage0: <key>=<value>
      • ...
      • Baggage<n-1>: <key>=<value>
  • are the implementations allowed to do drop the baggage, e.g. if it gets too large?
    • can all if it be dropped? (imo yes)
    • can some of it be dropped? (imo yes)
  • what is the encoding of the keys and values?
    • e.g. \n is not legal in the header value, = above is a reserved char, , is a reserved char in HTTP spec, so some encoding is certainly required
  • we might need to define some approach to name spacing of baggage keys, to avoid different implementations overriding each other's metadata

@yurishkuro
Copy link
Member

btw, I am obviously biased towards "baggage" because of OpenTracing, but I think "tags" is way too generic and ambiguous term, while "baggage" is unusual enough to not be confused with anything else.

@CodingFabian
Copy link

I do not understand why separation of concerns is not applied in OpenTracing, and I wonder why this pollution is proposed to infect the idea of a trace context spec.
What is really needed for vendor interoperability of tracing is a clear definition of how to understand an active trace and how to continue it, preserving the trace.
The actual semantics of what this trace is has been subject to the tracing system used.

In my opinion the whole idea of baggage transport is seriously flawed, and I seriously had hoped that this spec will be the remedy, reducing it to the minimal need of understanding a trace and being able to continue it.

If you want a tldr of why I dislike baggage or tags, then here is the focal point:
Baggage or Tags lack semantics. They consume bandwidth without explaining anybody what they mean. Because they are "optional" in most specs, this means they are unusable for interoperability. And due to their optional nature they are unusable for control flow. The way it is used now is a vendor specific quirk of tracers, which decide to carry some meaningless key value pairs along, while sending other vendor specific key value pairs to the trace system.

@yurishkuro
Copy link
Member

@CodingFabian

I do not understand why separation of concerns is not applied in OpenTracing, and I wonder why this pollution is proposed to infect the idea of a trace context spec.

Instrumentation for tracing consists of 90% of general purpose distributed context propagation, and 10% of the actual tracing. There are initiatives to separate DCP into its own layer (e.g. https://github.com/tracingplane). We discussed it at the inception of OpenTracing (which actually started as "Distributed Context Propagation" project), but decided that we want the API to be easy for tracing first, and simply expose the DCP (baggage) API which would have to be implemented anyway, since tracing cannot exist without DCP.

I think the same reasoning applies here. It's certainly possible to limit the scope of this spec to only the tracing piece, except that there have been already many questions "what about this" which can be easily solved with general-purpose baggage spec.

It is not clear from your comment if you think the notion of general purpose DCP itself flawed, or any specific implementations of it. If it's the former, then I can say this:

  1. http://www.cs.brown.edu/~rfonseca/pubs/fonseca-tracing-1973.pptx
  2. http://pivottracing.io/mace15pivot.pdf
  3. at Uber we have several projects in flight that are using baggage for propagating things like tenancy, auth & claims, fault instructions for chaos engineering, business flow identifiers, markers for capacity planning, cost accounting, contextualized metrics.

Baggage or Tags lack semantics.

Yes, that is the main feature of baggage. Actually, they do have semantics for components that care about specific items. And components that don't care don't need to know the semantics, they only need to pass the data along to the next layer. If every baggage item had semantics that every one of 5000 microservices understood, we'd need to change all 5000 microservices if we want to add one more baggage item, even if only 1% of them actually care about interpreting it.

@yurishkuro
Copy link
Member

Added another open question to my #13 (comment) above regarding name spacing of baggage keys.

@codefromthecrypt
Copy link

just like RFCs (we we aren't even at the stage of) change a bunch of times, we can expect whatever we do here to change. I would like to follow-up with @CodingFabian to figure out what we can do around this.

For experimentation sake, the Correlation-Context seems on track to help as long as whatever we hope to achieve with this ends up in some sort of tests (ex maybe some python or other easy-access language tests here in this repo).

bikeshedding the name a bit, I think Correlation-Context is associated with the trace-context header, maybe trace-context-meta isn't a bad choice as it also prefix-matches (assuming it is sent iff trace-context is). meta is bad, just throwing ideas

@bhs
Copy link
Contributor

bhs commented Sep 26, 2017

Truly just throwing this out there in the spirit of sharing ideas (translation: I'm not sure I like what I'm about to propose, either)...

Another way to look at this is that there's one set of trace context fields that are for tracer-internal purposes, and potentially another set of trace context fields for tracing-user purposes. The "internal" fields are basically what's been discussed elsewhere in this repo: trace-ids, span-ids, and sampling bits. The "user" fields would be things like census-style accounting entities, security tokens, and so on. The latter maps almost 1:1 with the opentracing "baggage" concept, of course, though this is a different way of framing it.

Separating the "user" fields from the "internal" fields seems safer from an interference and isolation standpoint. I wonder (and this is the most questionable part here) if we could use precisely the same format or meta-format for the "user" and "internal" fields – could be an elegant unification. (Though I suspect it won't be compact enough for the tracer-internal state)

@tylerbenson
Copy link

@bhs While I like the idea of avoiding potential naming collisions between trace internals and user space, if this gets popular and becomes more standardized, I also see a potential for a third grouping of collisions from frameworks/libraries. That may be something to consider when trying to decide to split or not.

@bhs
Copy link
Contributor

bhs commented Sep 29, 2017

@tylerbenson that's a great point. I still like the idea of segmenting state that's specific to tracer internals... wouldn't want it to get lost or truncated because of overzealous user behavior.

@wu-sheng
Copy link
Contributor

@bogdandrutu Since we have Correlation Context, and @SergeyKanzhelev suggest for ttl property. For me, baggages are somthing ttl=infinite. Do we still this separated head?

Maybe you will say, baggages are somthing of application data, Correlation Context are something about tracer propagation. That is the only possible difference for me, now.

@christian-schwarzbauer
Copy link

hi everybody,

I'd like to add our point of view at Dynatrace to this discussion, if I may. :-)
Generally saying, our tracing concepts are pretty similar, but we also have some "internal tracing data" that is only useful for our tools.

So I absolutely vote for something like the baggage concept to propagate tool-internal information - which mostly consists of various IDs for us.
I also kinda like the name bagagge as suggested by @yurishkuro and keeping the "Trace-Context" prefix as @adriancole suggested, would lead to something like "Trace-Context-Baggage" which would sound ok to me.

As the majority of this thread seems to do, I also suggest to separate the headers between "general tracing data" and "internal tracing data" (= baggage).
I also suggest to only have one header for all the baggage -> should make it easier to get it through firewalls/ESBs and such if there's only 2 specific headers.
In any case, if there are multiple tools/vendors involved in one trace, they must not interfere with one another.

So from my point of view the basic behavior guidelines should be:

  • if there is bagagge coming in, it has to be passed along
  • tools may modify their own bagagge, but must not touch bagagge from other tools
  • baggage may be dropped, but only if really really necessary

I also don't see the need to specify the exact encoding, but general tracecontext rules could be recommended: e.g. use '-' as separator, base16 encoding of numeric values
I would just go for something simple like

<tool1_name>=<tool1_bagagge>;<tool2_name>=<tool2_bagagge>;...

and a specific tool's bagagge could look like this (but it's actually up to the tool)

01-127-session42-87e9a6f1	// ';' and '=' may not be used and have to be encoded by tool itself (e.g. via url-encoding)

Of course there should be no tool name collision, but I guess that should be doable. :-)

For a binary representation there should of course also be size information included, which I don't see necessary for the HTTP header representation.
I also would not include required version information, that's - again - up to the tool to implement on its own.

Having said all that an actual bagagge HTTP header could IMHO simply look like this:

Trace-Context-Baggage: uber=05-taxi31337-27;dynatrace=01-127-session42-87e9a6f1

@yurishkuro sorry for the Uber sample, that was the first name that came to my mind ;-)

@codefromthecrypt
Copy link

codefromthecrypt commented Oct 3, 2017 via email

@JonathanMace
Copy link

Another question to add:

Suppose I have some async components, or suppose I propagate my baggage in RPC responses. How do I combine a key that has two different values for a tag?


FWIW here's the paper we wrote about baggage: http://cs.brown.edu/people/jcmace/papers/mace2017layered.pdf

The paper's a bit roundabout in its message, but is certainly a good resource for this discussion (feel free to contact me to discuss it, especially if you disagree with any of our assertions)

Also not-quite-as-relevant, here's the upcoming SOSP paper on Canopy, Facebook's e2e tracing system:
http://cs.brown.edu/people/jcmace/papers/kaldor2017canopy.pdf


If code matters more than words, I've begun implementing our baggage stuff for other languages (e.g., https://github.com/tracingplane/tracingplane-go/blob/master/atomlayer/api.go is our minimal implementation of TP propagation in Go, at 118 LOC) and instrumenting other platforms (e.g., shamelessly butchering https://github.com/JonathanMace/spring-cloud-sleuth). I'll use the sock shop demo (https://microservices-demo.github.io/) to illustrate some baggage concepts. Also we hacked a goroutine-local context into go (https://github.com/JonathanMace/tracing-framework-go/blob/master/cmd/modify-runtime/modify.go) which is cool.

@SergeyKanzhelev
Copy link
Member

I believe this issue is addressed by introduction of Trace-Context-Ext for logger internal fields and Correlation-Context for user-defined bag of properties. Closing the issue. Please re-open for more discussion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants