Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make System.Guid readonly #44629

Merged
merged 5 commits into from
Nov 17, 2020

Conversation

GrabYourPitchforks
Copy link
Member

This is a follow-up to #1809. It contains two changes: (a) make System.Guid readonly (+ some minor code cleanup); and (b) expose Unsafe.Unbox<T> for internal use, now that Guid can use it as an implementation detail. :)

One piece of feedback given in #1809 was that we shouldn't use the internal MutableGuid type. I've changed the call sites here to use Unsafe.AsRef, but to be honest I think this is introducing noise and that MutableGuid is a cleaner solution. If we don't like MutableGuid, then perhaps we can clone all of the GUID fields into the existing mutable type GuidResult, then introduce an API GuidResult.ToGuid() which is basically a glorified reinterpret_cast.

I think I hit all the needed places to shim Unsafe.Unbox. From investigating other calls, as long as we have a rudimentary implementation in Unsafe.cs we don't need to enlighten the mono interpreter.

- Also adds Unsafe.Unbox for internal use
@jkotas
Copy link
Member

jkotas commented Nov 13, 2020

we can clone all of the GUID fields into the existing mutable type GuidResult

I like this idea. GuidResult can actually have just 4 ints (instead of the bytes and shorts from Guid). It may make the parsing tiny bit faster - we won't need to split the parsed ints into bytes.

// Compare each element

if (rA != rB) { return false; }
if (Unsafe.Add(ref rA, 1) != Unsafe.Add(ref rB, 1)) { return false; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also change this to do the comparison as longs using Unsafe.ReadUnaligned. We had abandoned PR for that: #35654

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why it was abandoned? I see the alignment discussion on that thread, but there doesn't seem to be any clear reason the PR was rejected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess Bruce had more important things to do than to finish it.

Copy link
Member

@EgorBo EgorBo Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this I guess 🙂

private static bool EqualsCore(in Guid left, in Guid right)
{
    ref byte leftBytes = ref Unsafe.As<int, byte>(ref Unsafe.AsRef(in left._a));
    ref byte rightBytes = ref Unsafe.As<int, byte>(ref Unsafe.AsRef(in right._a));

    // early out if not equal
    if (Unsafe.ReadUnaligned<ulong>(ref leftBytes) != 
        Unsafe.ReadUnaligned<ulong>(ref rightBytes))
        return false;

    return Unsafe.ReadUnaligned<ulong>(ref Unsafe.Add(ref leftBytes, 8)) == 
           Unsafe.ReadUnaligned<ulong>(ref Unsafe.Add(ref rightBytes, 8));
}
G_M41817_IG01:
G_M41817_IG02:
       mov      rax, qword ptr [rcx]
       cmp      rax, qword ptr [rdx]
       je       SHORT G_M41817_IG05
G_M41817_IG03:
       xor      eax, eax
G_M41817_IG04:
       ret
G_M41817_IG05:
       mov      rax, qword ptr [rcx+8]
       cmp      rax, qword ptr [rdx+8]
       sete     al
       movzx    rax, al
G_M41817_IG06:
       ret

a SIMD version of it is slightly faster for equal guids but in most cases, I guess, we can "early out"

Also, this code can be branchless even without the SIMD (https://godbolt.org/z/jjM3Gs) but JIT doesn't optimize cmp(x1,x2) && cmp(x3,x4) to cmp(x1,x2) | cmp(x3,x4) yet, but it probably makes no sense to make Guid.Equals branchless.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using 64-bit comparisons didn't seem to help matters - see #44629 (comment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

| Equal_32Bit | 0.9002 ns | 0.0087 ns | 0.0082 ns |
| Equal_64Bit | 1.1797 ns | 0.0109 ns | 0.0102 ns |

Interesting. #35654 had this case as an improvement.

@benaadams
Copy link
Member

Also needs the AsRef for assignments in Guid.Unix.cs?

@GrabYourPitchforks
Copy link
Member Author

Here's the perf run from Guid.Equals(in Guid a, in Guid b) my machine, where the equality operation was using 32-bit comparisons, 64-bit comparisons, branchless 64-bit comparisons, and 128-bit comparisons (SIMD). This assumes that the equality operator function is not being inlined. The branch predictor has also been primed.

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen 9 3950X, 1 CPU, 32 logical and 16 physical cores
.NET Core SDK=5.0.100
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT
  DefaultJob : .NET Core 5.0.0 (CoreCLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT


|                   Method |      Mean |     Error |    StdDev |
|------------------------- |----------:|----------:|----------:|
|              Equal_32Bit | 0.9002 ns | 0.0087 ns | 0.0082 ns |
|              Equal_64Bit | 1.1797 ns | 0.0109 ns | 0.0102 ns |
|    Equal_64BitBranchless | 0.8910 ns | 0.0054 ns | 0.0050 ns |
|               Equal_Sse2 | 0.9096 ns | 0.0080 ns | 0.0075 ns |
|           NotEqual_32Bit | 1.1562 ns | 0.0082 ns | 0.0073 ns |
|           NotEqual_64Bit | 1.1845 ns | 0.0091 ns | 0.0085 ns |
| NotEqual_64BitBranchless | 0.8972 ns | 0.0091 ns | 0.0086 ns |
|            NotEqual_Sse2 | 0.9080 ns | 0.0069 ns | 0.0058 ns |

I'm not seeing any measurable improvement in using 64-bit comparisons over 32-bit comparisons. The branchless / SIMD versions can be a little better, but now we're operating in the picosecond range. Doesn't seem worth the complexity to me.

@GrabYourPitchforks
Copy link
Member Author

On the plus side, I experimented with replacing Guid.GetHashCode() with one-round AES, since a 128-bit Guid is perfectly sized to mimic one block of plaintext. And this knocked the GetHashCode runtime down to 0.25 nanoseconds on my box. 'Twas a fun experiment, and now the Crypto Board will stare daggers at me for the rest of my days.

@GrabYourPitchforks
Copy link
Member Author

With latest iteration, I tried a hybrid of the "move all the fields to GuidResult" and "is it possible to make GuidResult contain only 4 ints?" approaches. This hybrid approach uses [FieldOffset] so that both the "parse D" and "parse X" logic feel somewhat natural and don't end up polluted with width-changing Unsafe.As calls everywhere.

If this approach isn't preferred, we can investigate alternatives.

@GrabYourPitchforks GrabYourPitchforks marked this pull request as ready for review November 14, 2020 01:37
g._d = (byte)(uintTmp >> 8);
g._e = (byte)uintTmp;
// _d, _e must be stored as a big-endian ushort
result._de = (BitConverter.IsLittleEndian) ? BinaryPrimitives.ReverseEndianness((ushort)uintTmp) : (ushort)uintTmp;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
result._de = (BitConverter.IsLittleEndian) ? BinaryPrimitives.ReverseEndianness((ushort)uintTmp) : (ushort)uintTmp;
result._de = BitConverter.IsLittleEndian ? BinaryPrimitives.ReverseEndianness((ushort)uintTmp) : (ushort)uintTmp;

Nit - multiple places. I believe the usual style for src/libraries is without the extra ( ) in this context.

(BitConverter.IsLittleEndian) ? - 27 occurences under src/libraris
BitConverter.IsLittleEndian ? - 3 occurrences under src/libraries

Copy link
Member Author

@GrabYourPitchforks GrabYourPitchforks Nov 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is an argument to add BinaryPrimitives.htonl, which would encapsulate the check and make the call sites cleaner? Right now I think this API sits on IPAddress, which is the wrong layering.

Edit: We considered this in #29222 and rejected it.

@jkotas
Copy link
Member

jkotas commented Nov 14, 2020

This hybrid approach uses [FieldOffset] so that both the "parse D" and "parse X" logic feel somewhat natural and don't end up polluted with width-changing Unsafe.As calls everywhere

I like this hybrid approach.

@GrabYourPitchforks
Copy link
Member Author

GrabYourPitchforks commented Nov 16, 2020

Latest iteration: removed internal Unsafe.Unbox implementation, removed Unsafe.As wherever I could by re-typing the internal APIs, code cleanup w.r.t. unnecessary parens.

Crossing my fingers and hoping CI cooperates today. :)
(Epilogue narration: Levi's optimism was indeed misplaced.)

@GrabYourPitchforks
Copy link
Member Author

Thanks @jkotas for the feedback. Much appreciated! :)

@GrabYourPitchforks
Copy link
Member Author

/azp run runtime

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@GrabYourPitchforks GrabYourPitchforks merged commit 43f63d7 into dotnet:master Nov 17, 2020
@GrabYourPitchforks GrabYourPitchforks deleted the guid_readonly branch November 17, 2020 04:36
@ghost ghost locked as resolved and limited conversation to collaborators Dec 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants