Better support for chunked parsers and streaming #62
Comments
I've had some success writing resumable, internally-length-delimited binary format parsers with `cereal` recently, although I did use `isolate`, since the chunks in question were meant to be small. We can always instrument our byte-parsing monad of choice with our own State monad or manual int-threading for isolation (i.e., before parsing each tag, check whether we have any bytes "remaining"). Then `isolate n subParser` is effectively `lift (evalStateT subParser n) >> modify (subtract n)`.
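To make that concrete, here is a minimal sketch of the State-threading trick; the name `Isolated` and the generic underlying parser monad `p` are illustrative, not part of any of the libraries discussed here:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, modify)

-- 'p' is any underlying byte-parsing monad; the Int state is the number
-- of bytes remaining in the current isolated region.
type Isolated p = StateT Int p

-- Run a sub-parser with a budget of exactly n bytes, then charge those
-- n bytes against the enclosing region's budget.
isolate :: Monad p => Int -> Isolated p a -> Isolated p a
isolate n subParser = do
  r <- lift (evalStateT subParser n)
  modify (subtract n)
  return r
```

Byte-consuming primitives would also decrement the counter, so the "any bytes remaining" check before each tag is just a read of the current state.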
This will make it easier to replace the parser with something else, especially as we work on generated encodings. See also #62. This change also "unrolls" the `anyBits` function into specific functions `getFixed32` and `getFixed64`, to give a more unified API between parsing and building.
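The commit's actual definitions aren't reproduced here, but as a rough illustration of the symmetry this aims for, here is a hedged sketch of one fixed-width primitive written against attoparsec, paired with the corresponding little-endian primitive from `bytestring`'s builder:

```haskell
import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString as B
import qualified Data.ByteString.Builder as BB
import Data.Bits (shiftL, (.|.))
import Data.Word (Word32)

-- Parsing side: a 4-byte little-endian fixed32 (protobuf wire format).
getFixed32 :: A.Parser Word32
getFixed32 = do
  bs <- A.take 4
  -- Fold from the right so the first byte ends up least significant.
  return $! B.foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 bs

-- Building side: the mirror-image primitive.
putFixed32 :: Word32 -> BB.Builder
putFixed32 = BB.word32LE
```

The parsing and building sides then pair up one-to-one (`getFixed32`/`putFixed32`, `getFixed64`/`putFixed64`), which is the unified API the change refers to.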
All decoding benchmarks show significant speedups after this change. The biggest improvement is to decoding packed data, which is 4-5x as fast as before. (See below for a full list of benchmark diffs.)

This parsing monad follows the approach of, e.g., the `store` and `persist` packages. It requires that all data be in a *strict* `ByteString`, and uses simple pointer arithmetic internally to walk through its bytes. This effectively works against #62 (streaming parsers) since it needs to read all the input data before starting the parse. However, that issue has already existed since the beginning of this library for, e.g., submessages; see that bug for more details. So this change doesn't appear to be a regression. We also have freedom to later try out different implementations without changing the API, since `Parser` is opaque as of #294.

The implementation of `Parser` differs from `store` and `persist` by using `ExceptT` to pass around errors internally, rather than exceptions (or closures, as in `attoparsec`). We may want to experiment with this later, but in my initial experiments I didn't see a significant improvement from those approaches.

Benchmark results (the "time" output from Criterion):

flat(602B)/decode/whnf: 13.14 μs (13.02 μs .. 13.29 μs) => 8.686 μs (8.514 μs .. 8.873 μs)
nested(900B)/decode/whnf: 26.35 μs (25.85 μs .. 26.86 μs) => 11.66 μs (11.36 μs .. 11.99 μs)
int32-packed(1003B)/decode/whnf: 36.23 μs (35.75 μs .. 36.69 μs) => 17.31 μs (17.11 μs .. 17.50 μs)
int32-unpacked(2000B)/decode/whnf: 65.18 μs (64.19 μs .. 66.68 μs) => 19.35 μs (19.13 μs .. 19.58 μs)
float-packed(4003B)/decode/whnf: 78.61 μs (77.53 μs .. 79.46 μs) => 19.56 μs (19.40 μs .. 19.76 μs)
float-unpacked(5000B)/decode/whnf: 108.9 μs (107.8 μs .. 110.3 μs) => 22.29 μs (22.00 μs .. 22.66 μs)
no-unused(10003B)/decode/whnf: 571.7 μs (560.0 μs .. 586.6 μs) => 356.5 μs (349.0 μs .. 365.0 μs)
with-unused(10003B)/decode/whnf: 786.6 μs (697.8 μs .. 875.5 μs) => 368.3 μs (361.8 μs .. 376.4 μs)

Also added `isolate` and used it for parsing messages and packed fields. This improved the nested benchmark a bit compared to without it:

nested(900B)/decode/whnf: 14.32 μs (14.08 μs .. 14.57 μs) => 11.66 μs (11.36 μs .. 11.99 μs)

It didn't make a significant difference in the packed benchmark, I think because the effects of using lists currently dominate everything else.
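For context, a rough sketch (not the library's actual definition) of the general shape of such a pointer-based parsing monad, with errors threaded via `ExceptT`:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, throwE)
import Data.Word (Word8)
import Foreign.Ptr (Ptr, plusPtr)
import Foreign.Storable (peek)

-- The parser walks the region of memory between the current position and
-- the end of the strict input buffer.
newtype Parser a = Parser
  { runParser :: Ptr Word8  -- end of input
              -> Ptr Word8  -- current position
              -> ExceptT String IO (Ptr Word8, a)  -- new position + result
  }

-- Read one byte, failing cleanly at end of input.
anyByte :: Parser Word8
anyByte = Parser $ \end pos ->
  if pos >= end
    then throwE "anyByte: end of input"
    else do
      w <- lift (peek pos)
      return (pos `plusPtr` 1, w)
```

In a representation like this, `isolate n` can simply shrink the `end` pointer to `pos `plusPtr` n` for the duration of the sub-parser, which is consistent with the cheap `isolate` described above.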
Currently, when parsing messages we force a strict ByteString for every submessage. It might be better to parse in a more chunked fashion, e.g., for use with lazy ByteStrings or with conduits/pipes.
Counterpoint: regardless, parsing has to read the whole input into a Haskell object that's at least as big as the input data anyway. So it's not clear how much you'd gain from this change.
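For reference, `attoparsec` already exposes an incremental interface (`parse` and `feed` returning `Partial` continuations) that a lazy-`ByteString` or conduit/pipes layer could drive. A sketch, not library code, assuming the chunks are non-empty (feeding an empty chunk signals end of input to attoparsec):

```haskell
import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString as B

parseChunks :: A.Parser a -> [B.ByteString] -> Either String a
parseChunks _ []       = Left "parseChunks: no input"
parseChunks p (c : cs) = go (A.parse p c) cs
  where
    -- Out of chunks: feed an empty string to signal end of input.
    go r [] = A.eitherResult (A.feed r B.empty)
    -- The parser wants more input and we have another chunk.
    go (A.Partial k) (c' : cs') = go (k c') cs'
    -- Already Done or Fail, with chunks left over.
    go r _ = A.eitherResult r
```

A lazy `ByteString` could be fed the same way via `Data.ByteString.Lazy.toChunks`. Note this streams the top-level input but doesn't address isolating submessages, which is the harder part here.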
If we did want this, it might be easier to move from `attoparsec` to another library. Protobufs are a little tricky to parse because the wire format is not naturally delimited; messages are just sequences of tagged field/value pairs, and sub-messages are encoded as a varint length followed by the message bytes (with no "ending" marker).

For example, from the `binary` package we could use `isolate :: Int -> Get a -> Get a`, which restricts a sub-parser to a specific number of bytes, and `isEmpty :: Get Bool`, which detects end-of-input correctly within a call to `isolate`. In comparison:

- `attoparsec` doesn't provide an `isolate` function, AFAIK; currently we mimic it by running a parser on the output of `take :: Int -> Parser ByteString`.
- `cereal` provides an `isolate` function, but it still reads the full span into a single `ByteString`.
- `store`'s `isolate` doesn't work yet for our use case (Functions for peeking sequence-like types with runtime known size, mgsloan/store#40), and the library also lacks support for architecture-independent serialization (Safe support for architecture independent serialization, mgsloan/store#36). See #5 for more discussion.
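A sketch of how the `binary`-based approach could express this framing; `getVarInt` and the `parseField` argument are hypothetical stand-ins, but `isolate` and `isEmpty` are the real `Data.Binary.Get` combinators:

```haskell
import Data.Binary.Get (Get, getWord8, isEmpty, isolate)
import Data.Bits (shiftL, testBit, (.&.), (.|.))
import Data.Word (Word64)

-- Decode a protobuf base-128 varint, least-significant group first.
getVarInt :: Get Word64
getVarInt = go 0 0
  where
    go shift acc = do
      b <- getWord8
      let acc' = acc .|. (fromIntegral (b .&. 0x7f) `shiftL` shift)
      -- The high bit of each byte says whether another group follows.
      if testBit b 7 then go (shift + 7) acc' else return acc'

-- A submessage is a varint length followed by that many bytes of fields,
-- with no end marker: isolate provides the delimiting, and isEmpty
-- reports end-of-input relative to the isolated region.
getDelimited :: Get a -> Get [a]
getDelimited parseField = do
  len <- getVarInt
  isolate (fromIntegral len) (go [])
  where
    go acc = do
      done <- isEmpty
      if done
        then return (reverse acc)
        else do
          f <- parseField
          go (f : acc)
```

Because `isEmpty` is scoped to the innermost `isolate`, nested submessages delimit correctly without any end marker.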