From 8232b9020a39dd92a7f59e1ce54f666462f81700 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Thu, 18 Oct 2018 14:15:20 -0700 Subject: [PATCH 1/4] Fixed typo --- 2018/System.Buffers.SequenceReader/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/2018/System.Buffers.SequenceReader/README.md b/2018/System.Buffers.SequenceReader/README.md index c56fbf8..aa0dd3e 100644 --- a/2018/System.Buffers.SequenceReader/README.md +++ b/2018/System.Buffers.SequenceReader/README.md @@ -38,7 +38,7 @@ Status: **Needs more work** | cost of zeroing out the passed in scratch buffer as it's rarely used. Counter argument from @stephentoub: > If `Peek` returned the number of `T`s actually peeked, why would the span - > need to be zero'd out beforehand? (It looks like it's currently defined to + > need to be zeroed out beforehand? (It looks like it's currently defined to > return a `ReadOnlySpan`, and presumably that logically incorporates that > by being of the correct length?) * `IsNext` is a bit weird. Maybe we should we use `StartsWith` and From b4aa05dc5c9516e3eab1078b7fc0429d18ba080d Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Thu, 18 Oct 2018 14:15:55 -0700 Subject: [PATCH 2/4] Prepare section headers --- 2018/System.Buffers.SequenceReader/README.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/2018/System.Buffers.SequenceReader/README.md b/2018/System.Buffers.SequenceReader/README.md index aa0dd3e..8530779 100644 --- a/2018/System.Buffers.SequenceReader/README.md +++ b/2018/System.Buffers.SequenceReader/README.md @@ -5,7 +5,9 @@ Status: **Needs more work** | [Issue](https://github.com/dotnet/corefx/issues/32588) | [Video](https://www.youtube.com/watch?v=0WN1OXKBMl8) -## Notes +## Session 1 + +### Notes * Two types are being proposed here: - `SequenceReader`: wraps a `ReadOnlySequence` @@ -27,7 +29,7 @@ Status: **Needs more work** | * None of this is accessible to languages that do not support `ref structs`, such as Visual Basic. -## API feedback +### API feedback * `Rewind` seems a bit weird * Can we unify `Advance` and `Rewind` into `Seek()` where the offset can be From ab6feeb64e6abfcd528aef2562b2d680a60d0baf Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Thu, 18 Oct 2018 14:16:12 -0700 Subject: [PATCH 3/4] Add notes for round 2 --- 2018/System.Buffers.SequenceReader/README.md | 43 ++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/2018/System.Buffers.SequenceReader/README.md b/2018/System.Buffers.SequenceReader/README.md index 8530779..4f80d82 100644 --- a/2018/System.Buffers.SequenceReader/README.md +++ b/2018/System.Buffers.SequenceReader/README.md @@ -51,3 +51,46 @@ Status: **Needs more work** | - We shouldn't name them `Advance` as they could conflict with the existing `Advance(long)` method when `T` is `long`. +## Session 2 + +### SequenceReader\ + +* Should we remove the `unamanged` constrained and replace with a `.ctor` check? + - Done for pinning and being able to take a pointer? +* Can we have some sample code that uses a `ReadOnlySpan`-based reader vs. a + `ReadOnlySequence`-based reader? + - We can then compare the per gains + - We can then compare the usability gains +* `AtLastSegment`. The logic outlined in the comment is wrong, you need to honor + position, and not check whether there is next segment. +* `Allocator` seems odd. If we need it, should we promote the delegate or move + it elsewhere? We can't use a `Func` due to the generic instantiation issue + with `Span`. + - Should the allocator be passed in as argument? Storing it as a field might + cause additional memory barrier +* `AreAnyNext` -> `IsNextAny` +* Suggestion from last time was to rename `IsNext` to `StartsWith` but the + latter isn't clear whether it's relative to the current position vs. the + buffer's start. +* We may want to make `advancePastDelimiter` different methods to avoid the perf + issue due to the extra branches. +* The struct has to be passed by `ref` or you have bugs. Also, since it's large, + passing it by value would also tank perf. + +### SequenceReaderExtensions + +* All methods have a limit of reading 128 bytes maximum + - This allows stack allocated buffers to be used and seemed rationale + - Might be too small for specialized scenarios, such as many leading zeros + - We cannot really expose APIs with sizes as this would likely destroy perf +* No APIs can produce strings, we should probably add that as dealing with + things going across buffers is hard and we don't want folks to go through an + intermediary array. + +## BufferExtensions + +* Can't we make this an instance method? + +## UnsafeGetPositionWithinSegment + +* Move to `SequenceMarshal` \ No newline at end of file From 09bc6ece4f8bc18f3e648971ea185799d21e0b53 Mon Sep 17 00:00:00 2001 From: Immo Landwerth Date: Tue, 23 Oct 2018 15:36:19 -0700 Subject: [PATCH 4/4] Address Stephen's feedback on the notes --- 2018/System.Buffers.SequenceReader/README.md | 28 +++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/2018/System.Buffers.SequenceReader/README.md b/2018/System.Buffers.SequenceReader/README.md index 4f80d82..154809b 100644 --- a/2018/System.Buffers.SequenceReader/README.md +++ b/2018/System.Buffers.SequenceReader/README.md @@ -61,6 +61,32 @@ Status: **Needs more work** | `ReadOnlySequence`-based reader? - We can then compare the per gains - We can then compare the usability gains + - From [@stephentoub](https://github.com/stephentoub): + > And specifically (since I'm not sure I was clear in the meeting), here's + > what I was hoping to see. Given some common scenario that reads from / + > parses data from a pipeline: + > + > * Use the BufferReader that only supports span, and have a single method + > (written as we would in production code as best we can) with all of the + > relevant boilerplate that handles interaction both with the pipeline and + > with managing sequences/calls to the span-based reader. + > * Use the BufferReader that supports sequence, and have a single method + > (written as we would in production code as best we can) with all of the + > relevant boilerplate that handles interaction both with the pipeline and + > with the sequence-based reader. + > + > I'd like to us to be able to look at the code the provides the same + > functionality and the benchmarks on it and compare side-by-side both + > usability and performance. If we could do the test as part of kestrel + > (e.g. two different runnable commits off of master) and look at plaintext + > benchmarks and what the kestrel code looks like both ways, even better. + > + > If the version with the span-based reader is totally incomprehensible and + > provides only modest gains, then that leads us one way. If the version + > with the span-based reader is only mildly more cumbersome and provides + > significant gains, then that leads us another way. Etc. I just want to + > make sure we're all in agreement on the goals, seeing the same data, and + > then coming to a joint conclusion. * `AtLastSegment`. The logic outlined in the comment is wrong, you need to honor position, and not check whether there is next segment. * `Allocator` seems odd. If we need it, should we promote the delegate or move @@ -79,7 +105,7 @@ Status: **Needs more work** | ### SequenceReaderExtensions -* All methods have a limit of reading 128 bytes maximum +* All methods have a limit of reading 128 `T`s - This allows stack allocated buffers to be used and seemed rationale - Might be too small for specialized scenarios, such as many leading zeros - We cannot really expose APIs with sizes as this would likely destroy perf