Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add binary protocol for streams described in #1058 #1062

Open
mattrjacobs opened this issue Jan 25, 2016 · 23 comments
Open

Add binary protocol for streams described in #1058 #1062

mattrjacobs opened this issue Jan 25, 2016 · 23 comments

Comments

@mattrjacobs
Copy link
Contributor

Using JSON for streams is comparatively inefficient. Using binary streams would make consumption/transmission less expensive.

I'll document how the protocol/schema more when I've got a prototype that works, but my initial guess is that it'll look fairly similar to a CBOR representation of the existing schema. More to come on this as I get more experience

@dmgcodevil
Copy link
Contributor

You can also take a look at protobuf, with proper use you can have quite
small messages. Just FYI

Using JSON for streams is comparatively inefficient. Using binary streams
would make consumption/transmission less expensive.

I'll document how the protocol/schema more when I've got a prototype that
works, but my initial guess is that it'll look fairly similar to a CBOR
http://cbor.io representation of the existing schema. More to come on
this as I get more experience


Reply to this email directly or view it on GitHub
#1062.

@semper-omnia-paratus
Copy link

I would like to start working on this one. I wonder when you are going to release description of protocol/schema?

@mattrjacobs
Copy link
Contributor Author

Thanks for getting involved @andrey-polyakov!

I haven't formally documented the schema yet. I've done a first pass at generating reasonable JSON for the 3 new streams in 1.5 and there already exists a JSON conversion for the original hystrix-metrics-event-stream.

If you run the hystrix-examples-webapp and invoke the SSE streams, you can see what the structure of that JSON is. I'd love any feedback on how that looks and any improvements that might make translation to a binary format easier.

I also added sample JSON output to the docs here: https://github.com/Netflix/Hystrix/wiki/Metrics-and-Monitoring#sample-streams.

@semper-omnia-paratus
Copy link

I've read the document about metrics, and checked com.netflix.hystrix.contrib.

Exchange format back-end (Jackson/Json) ought to be replaced with CBOR flavour of Jackson at the first place then we can figure the potential of CBOR. Right?

Now my second question, if we break down the task on steps, from where I should start doing actual work?

@mattrjacobs
Copy link
Contributor Author

At the moment, the 4 HTTP streams are all generated by a servlet which turns Java objects into JSON and them publishes them onto the wire as a String.

My understanding is that the term SSE only refers to text, so sending binary is no longer "SSE".

Therefore, I think we're free to implement this however we choose. Given the lack of a standard, however, we should probably also be fairly explicit about how we're choosing to encode the data, and how to decode it on the other side.

I also think that there are a few implementation options. I think the most similar to today is replacing the JSON data writes to the wire with CBOR data writes to the wire. I'd also like to explore the performance of that option against writing to a memory-mapped file. Then a different server could be dedicated to taking that data and putting it on the wire. This is #1063.

But for now, it's a great first step to understand the design space of using a binary protocol from the same process that is generating metrics.

So here's my proposal for work that should get done.

  1. Document schema
  2. Write unit tests of encode/decode targeting that schema
  3. Wire that encoding logic into the existing servlets and make it conditional (maybe output=cbor query param?)
  4. Write a sample client using decode logic that proves we can properly interpret data.

Any steps you can think of that I'm missing?

I'm happy to pitch in on any of these, and also happy for you to dive in and take any of these that are most interesting to you. I'm pretty busy with travel the next 2 weeks, so it's likely that I can help review code, but probably not be writing any.

Thanks for the offer to contribute - I'm excited about getting this built.

@semper-omnia-paratus
Copy link

I will definetely take a closer look in the next two days. Fortunately, we have a stat in BC on Mon.

That is fine as I am happy to contribute. I guess only cordination and reviewing will be required.

Thanks for the prompt reply.

@semper-omnia-paratus
Copy link

Can you elaborate a little what documenting a schema means please?

Then I'll just start from the first thing first) Thanks.

@semper-omnia-paratus
Copy link

As for now, I was able to replace data format back-end with Jackson-dataformat-CBOR and add methods to write CBOR instead.

I modified some test cases to verify correctness (not all of them yet). Specifically, I converted JSON to CBOR using new back-end API.

Currently I started working on recovering HystrixRequestEvents back from CBOR.

Cheers.

@mattrjacobs
Copy link
Contributor Author

@andrey-polyakov For documenting the schema, I meant describing what you expect data to look like going over the wire.

As an example, I can imagine encoding command keys directly in binary by treating them as strings. I can also imagine treating them as enums, and providing 2 types of messages:

  • a dictionary of -> int. these messages contain mappings and clients should hold onto these
  • actual command events. instead of serializing the key as a string, serialize the enum value and expect the client can look it up.

If you could describe how you think messages should look and how clients should behave in English, that should allow for a design discussion before we get to the implementation.

@semper-omnia-paratus
Copy link

@mattrjacobs I pushed some changes to my branch here.

Specifically, I was able to serializeHystrixRequestEvents and deserialize it to a new class. I created that new calss because it is hard to instantiate the original class.
UPDATE:
I haven't started documenting anything but the proto I mentioned gave me some ideas.

As for schema, my impression after a few trials is that better to be minimalistic keeping everyting except string values binary. Let us keep it as close as possible to JSON versions but throw away field names. I guess it is easier and logical to keep all the values, their order and the way they are nested. Given said that it is about writing binary values in a certain order. Also, I am thinking about a unique format ID being writted at the very begining of any document.

To give a more detailed format description I need to get something working, I guess. You can check how raw data looks like in those unit tests I wrote.

Let me know what you think. Thanks.

@semper-omnia-paratus
Copy link

It is possible to pack boolean fields together to use less memory. In one byte there may be sitting six boolean values though I am not sure if it is worth it.

@semper-omnia-paratus
Copy link

@mattrjacobs Weekend is the time when I work on open-source and it is approaching so that would be very nice if you gave me some feedback. Thanks.

@mattrjacobs
Copy link
Contributor Author

Sorry for the delay. From what I can tell, this looks great. Do you have any measurements on how the wire size compares to the JSON version?

@semper-omnia-paratus
Copy link

That's fine. Thank you! I will continue in the following two days and release some figures.

I am going to do what I described in the plan then. Deserializers and unit tests should keep me busy for a while.

What do you think about packing booleans together as bits?

@semper-omnia-paratus
Copy link

Here is an example of CBOR beating JSON with 98 and 249 bytes respectively:

��a1x�Bayerische Motoren Werkea2���
������a4��h���a1x�Let it be long one pleasea2������a4���������

which is equivalent of
{"name":"Bayerische Motoren Werke","events":["SHORT_CIRCUITED","FALLBACK_FAILURE","FALLBACK_REJECTION"],"latencies":[104],"collapsed":{"name":"SUCCESS","count":23}},{"name":"Let it be long one please","events":["SUCCESS"],"latencies":[65536]}]

@mattrjacobs
Copy link
Contributor Author

That seems very promising. Nice work!

As far as boolean packing, I think efficient encoding should be the prime concern, so I so no reason not to, unless it makes the implementation unduly hard.

@semper-omnia-paratus
Copy link

Yep, numbers are appealing though CBOR engine is not stable.

I've used some work arounds to make things work and this is why some field names are still there. The engine does not always work as expected.

@semper-omnia-paratus
Copy link

@mattrjacobs Here is update on my work in the sandbox I got:

This is what I've done:

  • I have wired CBORGenerator to HystrixMetricsStreamServlet, so I observe CBOR actual messages though the format was not modified and it is still overly verbose because of the field names. No conditional key to switch back to JSON as I just ported whole thing to CBOR so far.
  • To prove that CBOR back-end can do (de)serialization some work done in hystrix-metrics-event-stream.
  • Other changes

I put it on hold for a while and I am wondering where to go from this point? Optimizing the format to CBOR is a huuuge task thus I just need to confirm that it is a feasible goal.

Cheers!

@mattrjacobs
Copy link
Contributor Author

Yeah, I'm still of the opinion that getting a fully-optimized CBOR serialization/deserialization implementation is worthwhile. I'll play around with your branch and give you some more detailed feedback. Can you provide any details on which pieces of the engine were not working as expected?

@semper-omnia-paratus
Copy link

Actually, there were a couple of things described in this ticket. UPDATE: resolved

@agentgt
Copy link
Contributor

agentgt commented Mar 8, 2016

It would be nice to know exactly what performance problem or bottleneck is seen. I ask because I would think making the current HystrixMetricsStreamServlet nonblocking a higher priority (right now it is NOT using HTTP servlet 3.0 async support). It uses blockingqueues and polls.

Using true async support (or I suppose using the rx-netty stream) should reduce latency as well as drastically reduce resources (threads). I would try that before messing with serialization formats particularly since I would imagine most consumers are Javascript (node, browser) and thus JSON would be very fast (and deserialization speed can shockingly be slower than network hops on some of these platforms).

@mattrjacobs
Copy link
Contributor Author

Opened #1123 for async servlets.

Internally at Netflix, it's important to use binary because we have large non-Node systems responsible for processing these streams for analysis. We believe a binary protocol will help efficiency on both sides of the stream. Additionally, since we pay for bytes on the network, minimizing payload size is also important to us.

@semper-omnia-paratus
Copy link

@mattrjacobs Since CBOR supposed to work in key/value format one still have to provide the key name which takes space. It is also quite tedious to work with.

May be Protocol Buffers is a better option?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants