[API Proposal]: IBitwiseEquatable<T> #75642
Tagging subscribers to this area: @dotnet/area-system-memory
Issue Details
Background and motivation
Most of our vectorized implementations on MemoryExtensions check RuntimeHelpers.IsBitwiseEquatable to determine whether vectorization is feasible. Today, IsBitwiseEquatable is hardcoded to a finite list of primitive types; #75640 extends that further, but overriding Equals / implementing IEquatable<T> opts you out again. Developers can still get the vectorization benefits of these implementations, but it's awkward and unsafe, e.g. with a type like:
```C#
struct MyColor : IEquatable<MyColor> { ... }
```
instead of:
```C#
ReadOnlySpan<MyColor> colors = ...;
MyColor value = ...;
return colors.IndexOf(value);
```
it's something more like:
```C#
ReadOnlySpan<MyColor> colors = ...;
MyColor value = ...;
return MemoryMarshal.Cast<MyColor, int>(colors).IndexOf(Unsafe.As<MyColor, int>(ref value));
```
and most importantly, it's something someone has to write code for on each use rather than it "just work"ing.
If we instead expose a way for a type to be annotated as "You can trust that my Equals override and/or IEquatable implementation are identical to bitwise equality semantics", then we can special-case IsBitwiseEquatable to recognize this annotation, and everything else just lights up.
API Proposal
```C#
namespace System;
public interface IBitwiseEquatable<T> : IEquatable<T>
{
    // no additional methods, it's just a marker interface
}
```
API Usage
```C#
struct MyColor : IBitwiseEquatable<MyColor>
{
    private byte R, G, B, A;
    public bool Equals(MyColor other) => R == other.R && G == other.G && B == other.B && A == other.A;
    public override bool Equals(object obj) => obj is MyColor c && Equals(c);
    public override int GetHashCode() => HashCode.Combine(R, G, B, A);
}
```
Alternative Designs
No response
Risks
It's yet one more thing a developer implementing equality would need to think about. We'd probably want the C# compiler to emit it in some situations for records, and anything we do around #48733 should also factor it in.
|
The biggest issue is that bitwise equality is not just a matter of checking all fields. Floating-point data blocks it too, but the thing that would break it the most is padding between struct fields. Because padding is platform-specific, it's impossible to verify it reliably with an analyzer. I could rather see the VM being able to check it at runtime or going the other way around with the interfaces and having |
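To make the two hazards in the comment above concrete, here is a minimal sketch. The `Padded` type and the `LayoutHazards` class are hypothetical names introduced only for illustration; they are not part of the proposal. The sketch checks whether a struct occupies more bytes than its fields need (which means a raw memory comparison would also compare padding), and whether two floats that `Equals` considers equal have different bit patterns.

```C#
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical type: with sequential layout, most platforms insert
// padding after Tag so that Value is 4-byte aligned.
[StructLayout(LayoutKind.Sequential)]
struct Padded
{
    public byte Tag;
    public int Value;
}

static class LayoutHazards
{
    // True when the struct occupies more bytes than its fields need,
    // i.e. the layout contains padding that a memcmp would also compare.
    public static bool HasPadding() =>
        Unsafe.SizeOf<Padded>() > sizeof(byte) + sizeof(int);

    // True when Equals reports equality but the raw bits differ.
    public static bool EqualButDifferentBits(float a, float b) =>
        a.Equals(b) &&
        BitConverter.SingleToInt32Bits(a) != BitConverter.SingleToInt32Bits(b);

    static void Main()
    {
        Console.WriteLine($"Padded has padding: {HasPadding()}");

        // +0.0 and -0.0 compare equal but differ in the sign bit.
        Console.WriteLine(EqualButDifferentBits(0.0f, -0.0f));

        // Two NaNs with different payloads: float.Equals treats NaN as equal to NaN.
        float otherNaN = BitConverter.Int32BitsToSingle(0x7FC00001);
        Console.WriteLine(EqualButDifferentBits(float.NaN, otherNaN));
    }
}
```

The padding result depends on the platform's layout rules, which is the point being made: a static analyzer cannot assume one answer.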
It wouldn't need to verify perfectly. It could flag provably wrong usage. |
An attribute could force some sort of Auto layout with Pack 1 for it, although it'd be a bit awkward and wouldn't play nice with platforms that don't support unaligned loads. |
Would it be viable for this interface or attribute to indirectly but publicly expose https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/comutilnative.cpp#L1642 ? So that violating |
To be clear, my intent wasn't to say "dear runtime, please always use memcmp regardless of whether it's valid", but rather "dear runtime, I'm ok if you use memcmp instead of my Equals". The latter doesn't require the runtime to use memcmp but rather just lets it, and it wouldn't if the type wasn't amenable, just as it doesn't today in ValueType.Equals if the type's layout isn't amenable. The analyzer I mention would be to help flag cases where they'd obviously have different semantics, e.g. Equals was only comparing one of multiple fields. |
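As an illustration of the kind of mismatch such an analyzer might flag, the sketch below uses a hypothetical `Versioned` struct (not from the thread) whose `Equals` compares only one of its two fields. A byte-wise comparison, standing in here for memcmp, then disagrees with `Equals`.

```C#
using System;
using System.Runtime.InteropServices;

// Hypothetical type whose Equals deliberately ignores one field.
struct Versioned : IEquatable<Versioned>
{
    public int Id;
    public int Revision; // not part of equality

    public Versioned(int id, int revision) { Id = id; Revision = revision; }

    public bool Equals(Versioned other) => Id == other.Id;
    public override bool Equals(object obj) => obj is Versioned v && Equals(v);
    public override int GetHashCode() => Id;
}

static class EqualsMismatch
{
    // Compares the raw bytes of two values, as a memcmp would.
    public static bool BytesEqual(Versioned a, Versioned b) =>
        MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref a, 1))
            .SequenceEqual(MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref b, 1)));

    static void Main()
    {
        var first = new Versioned(1, 1);
        var second = new Versioned(1, 2);

        // Equals ignores Revision, the byte comparison does not.
        Console.WriteLine($"Equals: {first.Equals(second)}");
        Console.WriteLine($"Bytes equal: {BytesEqual(first, second)}");
    }
}
```

A type like this would be a wrong candidate for any bitwise-equality opt-in, because swapping in memcmp would change observable behavior.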
Pattern-match the IL in the Equals method. No public APIs needed, just works, even for existing code. |
That sounds fairly fragile, but if we can make it work well enough on enough cases, sure. Do we have existing examples outside of the JIT/compiler where we do that? |
E.g. ILLinker has quite a bit of IL pattern matching. |
Thanks, sure, I meant examples in the VM...? |
I do not see anything like this in the VM currently. We used to have it in the past (e.g. for CER). It is not that hard to add an ad-hoc IL parser. Or if we want to have all (fragile) optimizations under src\clr\jit, I guess we can figure out a way to teach the JIT to do this sort of analysis. |
Pattern matching would be awesome. Especially for countless structs with just one field. But how will this translate to C#? The following is still not available out of the box: `new ConsoleColor[0].AsSpan().IndexOfAnyExcept(ConsoleColor.Red)` Would it mean the type constraints I found the current version of C# is actually able to distinguish between If
```C#
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int IndexOf<E>(this ReadOnlySpan<E> span, E other) where E : unmanaged, Enum => Unsafe.SizeOf<E>() switch {
    1 => MemoryMarshal.Cast<E, byte  >(span).IndexOf(Unsafe.As<E, byte  >(ref other)),
    2 => MemoryMarshal.Cast<E, ushort>(span).IndexOf(Unsafe.As<E, ushort>(ref other)),
    4 => MemoryMarshal.Cast<E, uint  >(span).IndexOf(Unsafe.As<E, uint  >(ref other)),
    8 => MemoryMarshal.Cast<E, ulong >(span).IndexOf(Unsafe.As<E, ulong >(ref other)),
    _ => throw new InvalidCastException()
};
```
Poke me if you want a copy of mine. |
|
OK you're already considering that. Still, enums are a bit of a lost child imho here again. Here the JIT takes care of it and that is what Is this as a larger issue on the radar anywhere? Or is it considered not possible? Is it just ECMA blocking any improvement in smarts about enums by the compiler/jit/runtime? Could they get |
@jkotas, how strongly do you feel about this approach? Looking at marking this as Notably the two approaches don't seem mutually exclusive, so we could have both the marker interface for explicit opt-in and eventually add some VM support for implicit light up as well. |
If we believe that the pattern matching approach is viable, a marker interface that has to be manually implemented by users is unnecessary baggage. I do not think it makes sense to add this baggage as a quick fix that we will have to live with forever. We are here for the long run. Also, the interface as proposed is not expressive enough. It does not compose using generics. For example, it is not possible to express that |
We'll have many scenarios where the arbitrary IL analysis is "expensive", and potentially more expensive than any of the analysis we're doing today given it will typically require inlining and recursive analysis of each inlinee and I expect this cost will be the most prohibitive factor. Consider for example that The pattern matching will also get tripped up on some code. Consider some of the optimizations we've done in our own code to ensure things are maintainable or performant. That is, we have code that specializes per
I think we're going to hit some limitations no matter which approach we go with. The marker-interface based approach allows opt in, but as you mentioned can't readily represent things like An attribute based approach is slightly more extensible. You could imagine having You could also imagine having some well known There are many options, and none are perfect, but I expect we'll need more than just one given the complexities involved in such checks. |
The analysis should be significantly cheaper than JITing the Equals method and everything called by it.
I am not convinced about it. We do not need to do the optimization for 100% of the cases - we never do. It is enough to address 90+% of the cases and the IL pattern matching should be good enough for that. People should always be able to use workarounds like |
Could you elaborate? It's certainly cheaper than jitting, but it basically requires doing the initial import of the IL and inlining of other methods to get basic control flow handled. That also involves one of the more expensive parts of inlining, which is all the token resolution. |
We will JIT or even inline the Equals method today and we are fine with how much it costs. I am saying that paying a fraction of this cost to perform this analysis is acceptable, it is not prohibitively expensive. |
👍, if we were to do this I think it would be overall simpler to do it in the JIT where we already have the infrastructure to produce some basic blocks and do simple control flow analysis. It may also fit into the long-standing issue/request to be able to "cache" simple heuristics about the compiled or analyzed code. Knowing if an One could imagine such heuristics being cached as part of crossgen as well, much as usage of |
Would maybe introducing this API:
```C#
public static class RuntimeHelpers
{
    public static bool BitwiseEquals<T>(T left, T right) where T : unmanaged;
}
```
and then pattern matching
```C#
public bool Equals(Guid guid) => RuntimeHelpers.BitwiseEquals(this, guid);
```
simplify things here? That way we wouldn't need to detect complex equals logic and the JIT would be able to emit the most efficient thing it can. |
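For readers who want to see what the suggested shape could look like from user code, here is a rough sketch. `BitwiseSketch.BitwiseEquals` is a hypothetical stand-in written with existing `MemoryMarshal` APIs; it is not the proposed `RuntimeHelpers` method and says nothing about how the JIT would recognize or optimize it. The `Rgba` type is likewise an assumption for illustration.

```C#
using System;
using System.Runtime.InteropServices;

static class BitwiseSketch
{
    // Hypothetical stand-in for the suggested RuntimeHelpers.BitwiseEquals<T>;
    // not an existing runtime API. Compares every byte, padding included.
    public static bool BitwiseEquals<T>(T left, T right) where T : unmanaged =>
        MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref left, 1))
            .SequenceEqual(MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref right, 1)));
}

// Hypothetical type that delegates its equality to the helper, so the
// Equals body is a single recognizable call rather than arbitrary logic.
readonly struct Rgba : IEquatable<Rgba>
{
    private readonly byte _r, _g, _b, _a;

    public Rgba(byte r, byte g, byte b, byte a) { _r = r; _g = g; _b = b; _a = a; }

    public bool Equals(Rgba other) => BitwiseSketch.BitwiseEquals(this, other);
    public override bool Equals(object obj) => obj is Rgba c && Equals(c);
    public override int GetHashCode() => HashCode.Combine(_r, _g, _b, _a);
}

static class Program
{
    static void Main()
    {
        var red = new Rgba(255, 0, 0, 255);
        Console.WriteLine(red.Equals(new Rgba(255, 0, 0, 255)));
        Console.WriteLine(red.Equals(new Rgba(0, 0, 255, 255)));
    }
}
```

Because the helper compares raw bytes, it inherits the padding and floating-point caveats raised earlier in the thread; `Rgba` avoids them only because it consists of four adjacent bytes.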
Background and motivation
Most of our vectorized implementations on MemoryExtensions check RuntimeHelpers.IsBitwiseEquatable to determine whether vectorization is feasible. Today, IsBitwiseEquatable is hardcoded to a finite list of primitive types; #75640 extends that further, but overriding Equals / implementing IEquatable<T> opts you out again. Developers can still get the vectorization benefits of these implementations, but it's awkward and unsafe: e.g. with a type like `struct MyColor : IEquatable<MyColor> { ... }`, instead of `colors.IndexOf(value)` it's something more like `MemoryMarshal.Cast<MyColor, int>(colors).IndexOf(Unsafe.As<MyColor, int>(ref value))`,
and most importantly, it's something someone has to write code for on each use rather than it "just work"ing.
If we instead expose a way for a type to be annotated as "You can trust that my Equals override and/or IEquatable implementation are identical to bitwise equality semantics", then we can special-case IsBitwiseEquatable to recognize this annotation, and everything else just lights up.
API Proposal
API Usage
Alternative Designs
It could also be an attribute that could then be applied to the type.
Risks
It's yet one more thing a developer implementing equality would need to think about. We'd probably want the C# compiler to emit it in some situations for records, and anything we do around #48733 should also factor it in. We might also want analyzers that try to flag types which implement IBitwiseEquatable but don't integrate all fields into their Equals, don't do a simple field-by-field comparison or unsafe-cast-memcmp in their Equals, etc.