Supports reading multiple spans per Kafka message #995
Kafka messages have binary payloads and no key. The binary contents are serialized TBinaryProtocol thrift messages. This change peeks at the first bytes to see if the payload is a List of structs or not, reading accordingly.
This approach would need a revision if we ever add a Struct field to Span. However, that is unlikely. At the point we change the structure of Span, we'd likely change other aspects which would make it a different struct completely (see #939). In such a case, we'd add a key to the kafka message indicating the span version, and not hit the code affected in this change.
Fixes #979
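For readers unfamiliar with the encoding, the peek can be sketched roughly as below. This is an illustration only, not the code in this change: `ThriftPeek`, `looksLikeSpanList`, and `listSize` are hypothetical names. It relies on TBinaryProtocol writing a list as a one-byte element type (12 for struct) followed by a 4-byte big-endian size, while a struct starts with the type byte of its first field. It also assumes the first field written for a single Span is not itself a struct, which is why adding a Struct field to Span could break the check.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not part of Zipkin's API.
final class ThriftPeek {
  // Value of TType.STRUCT in Thrift's binary protocol.
  static final byte TTYPE_STRUCT = 12;

  // A list of structs begins with the element type byte (12).
  // A single struct begins with the type byte of its first field instead.
  static boolean looksLikeSpanList(byte[] payload) {
    if (payload == null || payload.length == 0) {
      throw new IllegalArgumentException("empty payload");
    }
    return payload[0] == TTYPE_STRUCT;
  }

  // List header: 1 byte element type, then a 4-byte big-endian element count.
  static int listSize(byte[] payload) {
    return ByteBuffer.wrap(payload, 1, 4).getInt();
  }
}
```

A consumer would call `looksLikeSpanList` on each message payload and pick the list decoder or the single-span decoder accordingly.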
I must admit this feels very hacky and brittle to me... but it could work as a temporary migration path. In the future, I suggest we should only support collecting a list of spans. For Zipkin v2, we could provide a deprecation warning: log a warning whenever the collector receives a single span rather than a list of spans?
Did anyone do any benchmarks to show that writing/reading Kafka messages with a single span is slower than messages with a batch of spans? I do agree with @eirslett that APIs that accept just a single span instead of a collection are suboptimal. That said, logging a warning would just spam the collector's log; the collector can't do much about what clients submit.
I doubt there is an existing benchmark for any Kafka zipkin reporter, much less single span vs list. I think the key here is that list gives more flexibility for instrumentation, including a means to benchmark and report back.
Personally, I am cool chasing instrumentation to have them switch to list, as a list of one is only several bytes of overhead. Kafka is still newish (except Ruby).
Also happy to add a "log once" message that notes the span id and endpoint of single-span Kafka messages. That could prevent logs from cluttering.
I agree this is hacky; I just don't know a cheaper way to help folks move off single span without breaking them.
Adding a key to the message would seem less hacky, but take more discussion, for example. I was hoping to save that energy for v2 (ack list-only.)
I also think benchmarks would be helpful, but there's a chicken-and-egg problem. Ideally, I would like Prat to respond back when he can try list.
I'm unsurprised about the meh reactions. Anyone feel we shouldn't go down the peek path with the above context in mind?
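The "log once" idea mentioned above could look something like the following sketch. It is only an assumption about how such a warning might be written: `SingleSpanWarning` and `warnOnce` are hypothetical names, and `java.util.logging` stands in for whatever logger the collector actually uses.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only; not part of Zipkin's API.
final class SingleSpanWarning {
  private static final Logger LOG =
      Logger.getLogger(SingleSpanWarning.class.getName());

  // Flips to true the first time a single-span message is seen.
  private final AtomicBoolean warned = new AtomicBoolean();

  // Logs at most once per instance; returns true only when it actually logged.
  boolean warnOnce(long spanId, String endpoint) {
    if (!warned.compareAndSet(false, true)) {
      return false;
    }
    LOG.warning(String.format(
        "Received a single-span Kafka message (span id %016x, endpoint %s); "
            + "please send a list of spans instead",
        spanId, endpoint));
    return true;
  }
}
```

Using `compareAndSet` keeps the check safe when several consumer threads see single-span messages at the same time.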
Don't get me wrong, I'm +1 for this change. I don't like the solution, but I still think it's the best solution - there's no smooth migration path. One possible alternative would be to consume span collections on a separate kafka topic, but then we get lots of additional complexity from handling two topics.
I will try running the collector with this patch to check the perf.
Prateek Agarwal
Is there indeed a lot of setup overhead to use a different topic for a different message format?
From a usability perspective, maintaining multiple topics would be hard, imo. For testing perf and maintaining backwards compat, it would be helpful though.
In my experiment, each trace has 9 spans, which means that without the patch there are 9 kafka messages per trace.
With bundling, all 9 spans are batched up in 2 kafka messages.
As we can see, we straight away get a perf improvement of ~4.5 times. This should So, what I see is that the message consumption rate stays roughly constant at ~350, but we can increase
I've not heard any feedback against this from a kafka transport user, and it clearly will help @prat0318 and move us in the right direction, towards lists as the de facto unit of transport. Merging.