Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

TryGet for ReadOnlysequence #27229

Merged
merged 19 commits into from
Feb 26, 2018
Merged

TryGet for ReadOnlysequence #27229

merged 19 commits into from
Feb 26, 2018

Conversation

pakrym
Copy link

@pakrym pakrym commented Feb 17, 2018

@pakrym pakrym added the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Feb 17, 2018
@@ -22,7 +22,9 @@
<PropertyGroup Condition="'$(Configuration)|$(Platform)' == 'uap-Windows_NT-Debug|AnyCPU'" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)' == 'uap-Windows_NT-Release|AnyCPU'" />
<ItemGroup>
<Compile Include="System\Buffers\ReadOnlySequence_TryGet.cs" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ordering

Can you please move these to the related ItemGroups.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why globbing is better

return true;
}

internal bool TryGetReadOnlyMemory(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: put parameter on the same line

return false;
}

var startIndex = GetIndex(Start.Index);
Copy link
Member

@ahsonkhan ahsonkhan Feb 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use of var, here and elsewhere

@KrzysztofCwalina
Copy link
Member

We should talk about this at the API review. I am not sure we want to keep adding all these "XxxMarshal" classes.

Copy link
Member

@KrzysztofCwalina KrzysztofCwalina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's talk about it at the API review.

@pakrym
Copy link
Author

pakrym commented Feb 20, 2018

@KrzysztofCwalina I wanted to put it into MemoryMarshal but couldn't fix the build because sometimes it comes from System.Private.CoreLib and causes conflicts. I renamed it temporary to be able to see CI green.

@pakrym pakrym force-pushed the pakrym/TryGet branch 2 times, most recently from 1ebda25 to 524228a Compare February 22, 2018 22:38
@@ -508,4 +509,12 @@ public static partial class MemoryMarshal
public static System.Span<T> CreateSpan<T>(ref T reference, int length) { throw null; }
#endif
}

public static partial class SequenceMarshal
Copy link
Member

@ahsonkhan ahsonkhan Feb 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have separate a SequenceMarshal class? Can we use the existing MemoryMarshal?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is exactly the question that I was going to ask you :) see #27229 (comment)

Copy link
Member

@ahsonkhan ahsonkhan Feb 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend adding it on the coreclr side first (which will get mirrored here, in corefx). Only then we will get the tests passing.

See the example of API addition here (in the example @benaadams added TryGetOwnedMemory APIs to MemoryMarshal):

  1. Add new API to coreclr (Add TryGetOwnedMemory coreclr#16455).
  2. The mirror PR will automatically get created to add the change on the corefx side (Mirror changes from dotnet/coreclr #27325).
  3. Wait for a versions-update PR to go through, which points corefx to the coreclr with your change (Update BuildTools, CoreClr, CoreFx, CoreSetup to preview1-02431-03, preview1-26218-05, preview1-26218-04, preview1-26218-02, respectively (release/2.1) #27241).
  4. Expose the API in corefx and add tests (Add TryGetOwnedMemory ref and tests #27288).

couldn't fix the build because sometimes it comes from System.Private.CoreLib

On netfx, your change here (as is) will likely pass since it doesn't depend on coreclr. The other CI legs would continue to fail though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But ReadOnlySequence is not in coreclr at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, good point!

MemoryMarshal is a partial class (common, portable, fast) and its surface area from corelib has to be a superset of what's coming from System.Memory (it contains some fast span specific APIs):
https://github.com/dotnet/corefx/blob/master/src/System.Memory/ref/System.Memory.cs#L502

Therefore, atm, if you add an API to it, it must be implemented in coreclr.

It is a bit tricky, but what if we add these to MemoryMarshal.Sequence.cs (which defines the partial MemoryMarshal class) and include that file here (if IsPartialFacadeAssembly == true):
https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System.Memory.csproj#L153

Any concerns with that approach?

Copy link
Member

@ahsonkhan ahsonkhan Feb 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the fact we don't want to move ReadOnlySequence to coreclr we are going to need a new type for these so they can stay just in System.Memory

For it to stay in System.Memory, do we have to create a new type? Can you explain why using MemoryMarshal would force ReadOnlySequence to move to coreclr (which I understand that we don't want)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System.Memory for netcoreapp is not entirely built from coreclr. We still have types that come from System.Memory. See every unconditional ItemGroup:

Not all of System.Memory but all of MemoryMarshal is implemented in coreclr for netcoreapp.

Copy link
Member

@ahsonkhan ahsonkhan Feb 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all of System.Memory but all of MemoryMarshal is implemented in coreclr for netcoreapp.

My suggestion here suggests an approach to change that. Could that work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That approach will not work. It would only work if you plan to expose these APIs only in the OOB implementation which we do not want to do. MemoryMarshal implementation cannot be split between two assemblies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MemoryMarshal implementation cannot be split between two assemblies.

Yes, you are right. Ignore my suggestion. We will have to use a separate class for this. Thanks for the clarification.

/// </summary>
Memory<T> GetMemory(int minimumLength = 0);
Memory<T> GetMemory(int lengthHint = 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this name. Maybe sizeHint instead.

{
internal bool TryGetMemoryList(out IMemoryList<T> startSegment, out int startIndex, out IMemoryList<T> endSegment, out int endIndex)
{

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Remove extra whitespace

@@ -13,29 +13,35 @@ namespace System
/// </summary>
public readonly struct SequencePosition : IEquatable<SequencePosition>
{
private readonly object _segment;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove whitespace

@@ -165,7 +168,11 @@ private BufferSegment AllocateWriteHeadUnsynchronized(int count)
{
// No free tail space, allocate a new segment
segment = CreateSegmentUnsynchronized();
segment.SetMemory(_pool.Rent(Math.Max(_minimumSegmentSize, count)));

var adjustedToMinimumSize = Math.Max(_minimumSegmentSize, lengthHint);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some comments about what this is doing.

}

/// <summary>
/// Get an array from the underlying <see cref="ReadOnlySequence{T}"/>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Get an array segment...


/// <summary>
/// Index inside segment of memory this <see cref="SequencePosition"/> points to.
/// </summary>
public int Index { get; }
[EditorBrowsable(EditorBrowsableState.Never)]
public int GetInteger() => _index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we fine with these still being public?

Copy link
Member

@ahsonkhan ahsonkhan Feb 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you please add tests similar to https://github.com/dotnet/corefx/issues/27224#issue-297985444 so we can validate correctness here?

For instance, are we making sure we remove the bitmask whenever necessary (ToString, etc).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Does this comment still hold? https://github.com/dotnet/corefx/pull/27229/files#diff-b484b08a6c445aba9f2da2e98b714a9fR12

// Properties of this type should not be interpreted by anything but the type that created it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we fine with these still being public?

Yes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, are we making sure we remove the bitmask whenever necessary (ToString, etc).

We are not removing anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these methods need to be public?

We are not removing anything.

Should we remove the bitmask in ToString? https://github.com/dotnet/corefx/pull/27229/files#diff-b484b08a6c445aba9f2da2e98b714a9fR62

Otherwise, the index value will not make sense to users.

Assert.Equal(2, array.Offset);
Assert.Equal(3, array.Count);

Assert.False(SequenceMarshal.TryGetMemoryList(buffer, out var _, out var _, out var _, out var _));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you don't need to do out var _, out _ works too.

Assert.Equal(0, startIndex);
Assert.Equal(3, endIndex);

Assert.False(SequenceMarshal.TryGetArray(buffer, out var array));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: fix use of var

@@ -186,6 +185,15 @@ private BufferSegment AllocateWriteHeadUnsynchronized(int count)
return segment;
}

private int GetSegmentSize(int sizeHint)
{
// First we need to handle case where hint is smaller then minimum segment size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo then

// First we need to handle case where hint is smaller then minimum segment size
var adjustedToMinimumSize = Math.Max(_minimumSegmentSize, sizeHint);
// After that adjust it to fit into pools max buffer size
var adjustedToMaximumSize = Math.Min(_pool.MaxBufferSize, adjustedToMinimumSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment, variable name, and Math.Min/Max operations don't seem to align. Are we sure this is correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, adjustedToMinimumSize because of _minimumSegmentSize, adjustedToMaximumSize because of _pool.MaxBufferSize

if (destination.Length == 0)
{
writeSize = Math.Min(source.Length, bufferWriter.MaxBufferSize);
destination = bufferWriter.GetSpan(writeSize);
destination = bufferWriter.GetSpan(source.Length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From #27007 (comment)

Aside from the first time through the loop, this will always be true (since destination = default). Consider special casing the first iteration outside the loop.

Any thoughts on this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, looks pretty bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seperate the while loop into a different function so this just contains the fast path, then tail calls into the multi write while in else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benaadams it not exactly about that. Comment is about destination.Length check in every iteration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, looks pretty bad.

How so? Something like this seems reasonable to me:

        if (source.Length > 0)
        {
            if (destination.Length == 0)
            {
                destination = bufferWriter.GetSpan(source.Length);
            }

            int writeSize = Math.Min(destination.Length, source.Length);
            source.Slice(0, writeSize).CopyTo(destination);
            bufferWriter.Advance(writeSize);
            source = source.Slice(writeSize);

            while (source.Length > 0)
            {
                destination = bufferWriter.GetSpan(source.Length);

                writeSize = Math.Min(destination.Length, source.Length);
                source.Slice(0, writeSize).CopyTo(destination);
                bufferWriter.Advance(writeSize);
                source = source.Slice(writeSize);
            }
        }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed end to while (false); 😄

Copy link
Member

@benaadams benaadams Feb 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the continue seems redundant.

No need to either clear span (= default ); or get a new span if there is nothing thing left to write; so it just brings the loop condition internal plus a getting new span.

i.e. if there is more in source (so would loop again) its implicit that the destination is full, so you need a new span and no explicit checks are needed

Copy link
Member

@ahsonkhan ahsonkhan Feb 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed end to while (false);

With while(false), I understand why the continue is necessary.

No need to either clear span (= default ); or get a new span if there is nothing thing left to write; so it just brings the loop condition internal plus a getting new span.

I don't understand what you mean. I don't see any unnecessary work in the last iteration of the loop. To me, the continue is necessary for correctness.

Copy link
Member

@benaadams benaadams Feb 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while (source.Length > 0) // if we got here (first time) we already know source.Length > 0
{
    if (destination.Length == 0) // unnecessary check other than first iter
                                 // hence move out, and switch to do loop
                                 // doing prechecks only on destination.Length with if block
    //...
    destination = default;       // unnecessary clear with loop change
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was comparing to current

@pakrym pakrym changed the title [WIP] TryGet for ReadOnlysequence TryGet for ReadOnlysequence Feb 23, 2018
@pakrym pakrym removed the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Feb 23, 2018
@pakrym
Copy link
Author

pakrym commented Feb 24, 2018

@dotnet-bot test Windows x64 Debug Build
@dotnet-bot test Windows x86 Release Build

public class ReadOnlySequenceTryGetTests
{
[Fact]
public void Ctor_Array_Offset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for empty array (i.e. ReadOnlySequence<byte>(new byte[0]))?

I don't see how this PR resolves the issue that @davidfowl mentioned here - https://github.com/dotnet/corefx/issues/27224

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't that's why I said that it's not a but in the issue you linked. This PR adds a way that should be use to get segment/index/array/memory which is TryGet* methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why it isn't a bug. Even though they are marked EditorBrowsableState.Never, these are public methods on a public type that anyone can call.

More importantly, ToString() is not hidden, and hence, if someone calls ToString() or looks at the state of SequencePosition in the debugger, the index value will not make any sense.

var ownedMemory = new CustomMemoryForTest<byte>(new byte[] { 1, 2, 3, 4, 5 }, 0, 5);
var buffer = new ReadOnlySequence<byte>(ownedMemory, 2, 3);
Console.WriteLine(buffer.Start.ToString());
// Actual: System.MemoryTests.CustomMemoryForTest`1[System.Byte][-2147483646]
// Expected: System.MemoryTests.CustomMemoryForTest`1[System.Byte][2]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design of SequencePosition is such that if you couldn't read it's properties outside ReadOnlySequence. You are right that having ToString makes less sense in this case so it should be removed.

We should look into adding DebuggerDisplay for ReadOnlySequence that would decode and display segments and indexes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed to string and renamed fields not to have index or segment in names.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source.Slice(0, writeSize).CopyTo(destination);
bufferWriter.Advance(writeSize);
source = source.Slice(writeSize);
destination = default;
destination = bufferWriter.GetSpan(source.Length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain what this change fixes? What was wrong with the previous implementation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fixes

Aside from the first time through the loop, this will always be true (since destination = default). Consider special casing the first iteration outside the loop.

Without special casing anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also does a superfluous GetSpan via interface when leaving the loop?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but now we are doing a lot of extra work if writeSize == 0 in the first iteration.

All these should be no-op when writeSize == 0 (which only happens if destination.Length == 0).

source.Slice(0, writeSize).CopyTo(destination);
bufferWriter.Advance(writeSize);
source = source.Slice(writeSize);

In terms of optimization, this approach from @benaadams is pretty good since there aren't any unnecessary operations when not needed:
#27229 (comment)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benaadams Ugh, you are right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahsonkhan what was it you thought I should look at?

Copy link
Member

@ahsonkhan ahsonkhan Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method in BuffersExtensions which calls WriteMultiSegment:
https://github.com/dotnet/corefx/pull/27229/files#diff-321812293ad3b9aa1516f25eb68f8fbeR61

public static void Write<T>(this IBufferWriter<T> bufferWriter, ReadOnlySpan<T> source)

Is there a cleaner way to implement this that is just as optimal (or better)?

Comment thread for context: #27229 (comment)

Edit: Original implementation - https://github.com/pakrym/corefx/blob/4b59d6a9ce135f27dcb46b3b73c79665d4ab2620/src/System.Memory/src/System/Buffers/BuffersExtensions.cs#L61

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a cleaner way to implement this that is just as optimal (or better)?

Yes, with a second ref struct that holds the span as state to avoid the virtual calls https://github.com/aspnet/Common/blob/46abbd1a136ae4a4bb93d17759ac4c0bbf59dee4/shared/Microsoft.Extensions.Buffers.Sources/BufferWriter.cs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would help with a single Write, but if you have multiple of those BufferWriter is the way to go.

Copy link
Member

@benaadams benaadams Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, extension Write is for discoverability/convenience (else people will wonder how they write to it); then second struct is for perf. Will likely be other versions like an encoding writer (ala TextWriter/StreamWriter etc)?

@pakrym
Copy link
Author

pakrym commented Feb 26, 2018

@dotnet-bot test Linux x64 Release Build
@dotnet-bot test Windows x64 Debug Build

@pakrym
Copy link
Author

pakrym commented Feb 26, 2018

@ahsonkhan anything else you want to be addressed in this RP?


internal bool TryGetArray(out ArraySegment<T> array)
{
if (Start.GetObject() == null || GetSequenceType() != SequenceType.Array)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to cache the object into a local instead of calling Start.GetObject() multiple times? Or since it is just a lightweight accessor of the private field, it doesn't matter here.

The same would apply to GetInteger.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementations are so tiny, they should just get inlined.

Copy link
Member

@ahsonkhan ahsonkhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pakrym pakrym merged commit aca4676 into dotnet:master Feb 26, 2018
@karelz karelz added this to the 2.1.0 milestone Mar 10, 2018
Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/corefx that referenced this pull request Nov 15, 2019
…et#27229)

Port dotnet/coreclr#27210 to release/3.1

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants