Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow fine grained deep control of Record Equality #3835

Closed
YairHalberstadt opened this issue Sep 1, 2020 · 18 comments
Closed

Allow fine grained deep control of Record Equality #3835

YairHalberstadt opened this issue Sep 1, 2020 · 18 comments

Comments

@YairHalberstadt
Copy link
Contributor

YairHalberstadt commented Sep 1, 2020

Problem Statement

Currently if you want to modify how record equality works you need to define the Equals method manually. If you need to change something deep in a set of nested records, you need to manually write the equality method for all of them. This commonly occurs if you want to compare collections using value rather than reference equality.

For example lets say you have the following types:

record A(ImmutableArray<B> Bs, string S) {}
record B(ImmutableArray<string> Strings, int I) {}

And you wanted to compare the ImmutableArrays via value equality, you would have to do this:

public static bool Equals(A a1, A a2)
{
     return a1.Bs.Length == a2.Bs.Length && a1.Bs.Zip(a2.Bs, (b1, b2) => Equals(b1, b2)).All(x => x) && a1.S == a2.S;
}

public static bool Equals(B b1, B b2)
{
     return b1.Strings.Length == b2.Strings.Length && b1.Strings.Zip(b2.Strings, (s1, s2) => s1 == s2).All(x => x) && b1.I == b2.I;
}

Which is a lot of boilerplate, none of which is reusable if you then want to e.g. now also compare strings via OrdinalIgnoreCase.

There is no way to take an existing implementation of Equals and to tweak it slightly using custom rules.

Suggested solution

We essentially use a form of the visitor pattern to allow this:

Update IEquatable<T> as follows:

namespace System
{
    public interface IEquatable
    {
        bool Equals(object? other, IEqualityComparer equalityComparer);
        int GetHashCode(IEqualityComparer equalityComparer) => GetHashCode();
    }
    
    public interface IEquatable<T> : IEquatable
    {
        bool Equals(T? other);
        bool Equals(T? other, IEqualityComparer equalityComparer) => Equals(other);
        bool IEquatable.Equals(object? other, IEqualityComparer equalityComparer) => other is T t && Equals(t, equalityComparer);
    }
}

Add an abstract class EqualityComparer :

namespace System
{
    public abstract class EqualityComparer : IEqualityComparer, IEqualityComparer<object>
    {
        public virtual new bool Equals(object? x, object? y)
        {
            if (x is IEquatable equatable)
                return equatable.Equals(y, this);
            return EqualityComparer<object>.Default.Equals(x, y);
        }

        public virtual int GetHashCode(object obj)
        {
            if (obj is IEquatable equatable)
                return equatable.GetHashCode(this);
            return obj.GetHashCode();
        }
    }
}

When Generating records, add an implementation for Equals(T? other, IEqualityComparer equalityComparer). E.g. in our example given above it would generate:

record A(ImmutableArray<B> Bs, string S) : IEquatable<A>
{
    public virtual bool Equals(A? other) => Equals(other, EqualityComparer<object>.Default);
    public virtual bool Equals(A? other, IEqualityComparer equalityComparer)
    {
        return other is not null && equalityComparer.Equals(Bs, other.Bs) && equalityComparer.Equals(S, other.S);
    }
    public override int GetHashCode() => GetHashCode(EqualityComparer<object>.Default);
    public virtual int GetHashCode(IEqualityComparer equalityComparer)
    {
        return HashCode.Combine(equalityComparer.GetHashCode(Bs), equalityComparer.GetHashCode(S));
    }
}

record B(ImmutableArray<string> Strings, int I) : IEquatable<B>
{
    public virtual bool Equals(B? other) => Equals(other, EqualityComparer<object>.Default);
    public virtual bool Equals(B? other, IEqualityComparer equalityComparer)
    {
        return other is not null && equalityComparer.Equals(Strings, other.Strings) && equalityComparer.Equals(I, other.I);
    }
    public override int GetHashCode() => GetHashCode(EqualityComparer<object>.Default);
    public virtual int GetHashCode(IEqualityComparer equalityComparer)
    {
        return HashCode.Combine(equalityComparer.GetHashCode(Strings), equalityComparer.GetHashCode(I));
    }
}

Then it is possible for users to write an IEqualityComparer once off which compares all collections correctly and use it everywhere. For example:

public class EnumerableValueComparer : EqualityComparer
{
    protected EnumerableValueComparer(){}

    public static EnumerableValueComparer Instance = new EnumerableValueComparer();

    public override bool Equals(object? x, object? y)
    {
        if (x is IEnumerable enumerableX && y is IEnumerable enumerableY)
        {
            var enumeratorX = enumerableX.GetEnumerator();
            var enumeratorY = enumerableY.GetEnumerator();
            while (true)
            {
                var anyX = enumeratorX.MoveNext();
                if (anyX != enumeratorY.MoveNext())
                    return false;
                if (!anyX)
                    return true;
                if (!base.Equals(enumeratorX.Current, enumeratorY.Current))
                    return false;
            }
        }
            
        return base.Equals(x, y);
    }

    public override int GetHashCode(object obj)
    {
        if (obj is IEnumerable enumerable)
        {
            var hashCode = 13;
            foreach (var item in enumerable)
            {
                hashCode = HashCode.Combine(hashCode, item is null ? 0 : base.GetHashCode(item));
            }
            return hashCode;
        }
        return base.GetHashCode(obj);
    }
}

Then to compare two instances of A using value equality, you can just do:

a1.Equals(a2, EnumerableValueComparer.Instance.

@333fred
Copy link
Member

333fred commented Sep 1, 2020

When we were designing record equality, we considered extension points to allow for deep customization. However, we considered this outside the purview of records themselves as there's so much variation: I want my List<List<string>> to compare the inner strings with case-insensitive invariant culture comparison, for example. Rather, we think that will be an excellent use for a source generator, which can implement all the extension points and customizations it wants and be closely tied to how the framework implements its classes.

@YairHalberstadt
Copy link
Contributor Author

@333fred

This allows the user to provide all those extension points without requiring source generation to change - in fact it solves this problem generically for all types (not just records) albeit it requires adoption before it's valuable. Records are a great way to instantly drive up that adoption.

@YairHalberstadt
Copy link
Contributor Author

YairHalberstadt commented Sep 1, 2020

I want my List<List<string>> to compare the inner strings with case-insensitive invariant culture comparison, for example.

Easy. Just do this (using my example above):

public class MyComparer : EnumerableValueComparer
{
    public override bool Equals(object? x, object? y)
    {
        if (x is String xString && y is String yString)
            return xString.Equals(yString, StringComparison.InvariantCultureIgnoreCase);
        return base.Equals(x, y);
    }

    public override int GetHashCode(object obj)
    {
        if (obj is string str)
            return str.GetHashCode(StringComparison.InvariantCultureIgnoreCase);
        return base.GetHashCode(obj);
    }
}

@HaloFour
Copy link
Contributor

HaloFour commented Sep 1, 2020

@YairHalberstadt

I don't really see how this suggestion helps with customizing the equality comparison of nested values within a record. You're suggesting the ability to override a single IEqualityComparer<T>, but that single comparer applies to the record as a whole, not to the individual fields, nor to the individual fields of any of the individual fields. How would you propose to provide a comparer that would compare those nested values in a List<List<string>> as case-insensitive without comparing all strings as case-insensitive? I think an appropriate extension point to customize equality at the field level would be significantly more involved.

@YairHalberstadt
Copy link
Contributor Author

How would you propose to provide a comparer that would compare those nested values in a List<List> as case-insensitive without comparing all strings as case-insensitive?

My gut feeling is this proposal covers 90%+ of cases where you need to adjust equality comparisons. I think cases where you need to compare some strings one way but other strings another are much rarer.

@HaloFour
Copy link
Contributor

HaloFour commented Sep 1, 2020

@YairHalberstadt

My gut feeling is this proposal covers 90%+ of cases where you need to adjust equality comparisons. I think cases where you need to compare some strings one way but other strings another are much rarer.

A single IEqualityComparer that happens to know how to compare any two things that is applied recursively through the elements of the record and all of the fields of those elements feels like it would cause a lot of unexpected consequences. These differences in comparison should be a much, much more explicit concern.

And given that nominal records are proposed as a replacement for POCOs, I think the need to compare different elements of type string in a different manner will be far from rare. Names vs. IDs.

@YairHalberstadt
Copy link
Contributor Author

And given that nominal records are proposed as a replacement for POCOs, I think the need to compare different elements of type string in a different manner will be far from rare. Names vs. IDs.

But in such cases you should explicitly control equality to use the correct StringComparison anyway. The suggested mechanism only applies to cases where the default equality is considered good enough.

@HaloFour
Copy link
Contributor

HaloFour commented Sep 1, 2020

@YairHalberstadt

But in such cases you should explicitly control equality to use the correct StringComparison anyway. The suggested mechanism only applies to cases where the default equality is considered good enough.

Then why need this proposal at all? If you're writing your own equality comparers anyway why not reuse them within a custom Equals method?

I'm with @333fred , there's too much variation in this space. Records intended to provide value equality in the same vein as structs. As with structs if you wish to influence that equality it's your responsibility to override and provide your own implementation.

@canton7
Copy link

canton7 commented Sep 1, 2020

I agree on the need: almost all of my record-like objects currently have some form of custom equality for at least one of their members. Without the ability to customise the equality used for each member individually, records have very little benefit for me.

However, I don't like this proposal for a few reasons:

  1. Everyone who compares the two record instances for equality has to remember to pass the correct IEqualityComparer. One missed call site and it all falls down. The record itself should be defining how its members are compared, not the caller.
  2. It's a bit of a blunt hammer. You get to say that all members of a particular type should have the same custom equality, but nothing finer. What if different string members need different equality?
  3. I can't see it playing well with e.g. types which currently accept an IEqualityComparer<T>, like Dictionary. You can't re-use your EnumerableValueComparer for that case because it's not generic. Making it generic adds more boilerplate.
  4. Your comparer will have to resort to reflection if e.g. it wants to compare two generic types, and wants to make sure that the generic type arguments are identical.
  5. Your example comparer will box every collection element, as you've lost the ability to use generics.
  6. If you need more than one type of custom equality, you'll end up writing an equality comparer which is specific to the set of types used in your record, and so lose reusability.

I'd like to see some way of customising per-member equality however. There's a bit of a discussion here, which contains the suggestion:

record A(
    [EqualityComparer(typeof(SomeCustomComparer))] ImmutableArray<B> Bs,
    string S) { }

You can also imagine something like:

record A(ImmutableArray<B> Bs, string S)
{
    [Equality(nameof(B))]
    private static bool Equals(ImmutableArray<B> left, ImmutableArray<B> right) => ...;
}

or:

[Equality(nameof(Bs), typeof(SomeCustomComparer))]
record A(ImmutableArray<B> Bs, string S) { }

Granted you can implement the above with a source generator: maybe that's the best way. You lose uniformity across code bases for something which I suspect is going to be quite common, though.

@theunrepentantgeek
Copy link

I can't help but think that the problem here is that the records are incorrectly structured, not that the compiler needs to do anything different.

record B(ImmutableArray<string> Strings, int I) {}

In this example, the need for strings to be compared in a case-insensitive manner screams to me that there's a semantic type buried in the declaration. I'd solve this by extracting the semantic type and being explicit:

record TimeSeriesId(string id) 
{
    public bool Equals(TimeSeriesId other) => StringComparison.InvariantCultureIgnoreCase.Equals(id, other.id);
}

record B(ImmutableArray<TimeSeriesId> SeriesIds, int I) { }

Chances are that TimeSeriesId would be widely used across the codebase, so the benefits in type safety and code reuse would more than justify the trivial effort for implementation.

In short, rather than inventing complex infrastructure that inevitably will only meet 80% of the needs (at best), I think its better to lean into the type system for this level of control.

Also, it seems to me that code like this

public static bool Equals(A a1, A a2)
{
    return a1.Bs.Length == a2.Bs.Length 
        && a1.Bs.Zip(a2.Bs, (b1, b2) => Equals(b1, b2)).All(x => x) 
        && a1.S == a2.S;
}

is a violation of the law of Demeter, just lurking to cause you future pain.

@YairHalberstadt
Copy link
Contributor Author

@HaloFour @canton7

How would you propose to provide a comparer that would compare those nested values in a List<List> as case-insensitive without comparing all strings as case-insensitive?

It's a bit of a blunt hammer. You get to say that all members of a particular type should have the same custom equality, but nothing finer. What if different string members need different equality?

I stand by my response earlier that this is probably rarely needed. However it is actually rather simple to do:

public class MyComparer : EnumerableValueComparer
{
    public override bool Equals(object? x, object? y)
    {
        if (x is List<List<string>> && y is List<List<string>>)
            return InvariantCultureComparer.Instance.Equals(x, y);
        return base.Equals(x, y);
    }

    public override int GetHashCode(object obj)
    {
        if (obj is List<List<string>>)
            return InvariantCultureComparer.Instance.GetHashCode(obj);
        return base.GetHashCode(obj);
    }
}

public class InvariantCultureComparer : EnumerableValueComparer
{
    public override bool Equals(object? x, object? y)
    {
        if (x is String xString && y is String yString)
            return xString.Equals(yString, StringComparison.InvariantCultureIgnoreCase);
        return base.Equals(x, y);
    }

    public override int GetHashCode(object obj)
    {
        if (obj is string str)
            return str.GetHashCode(StringComparison.InvariantCultureIgnoreCase);
        return base.GetHashCode(obj);
    }
}

@HaloFour

Then why need this proposal at all? If you're writing your own equality comparers anyway why not reuse them within a custom Equals method?

I think you misunderstand the intention behind my proposal. This is less about changing the equality for records you own, but more about modifying equality for records you don't own, or adjusting the default equality for records you own for a particular use case.

@canton7

Everyone who compares the two record instances for equality has to remember to pass the correct IEqualityComparer. One missed call site and it all falls down. The record itself should be defining how its members are compared, not the caller.

That's simple to do:

record A(ImmutableArray<B> Bs, string S)
{
    public bool Equals(A? other) => Equals(other, EnumerableValueComparer.Instance);
}

However as I said above this is less about changing the equality for records you own, but more about modifying equality for records you don't own, or adjusting the default equality for records you own for a particular use case.

It's a bit of a blunt hammer. You get to say that all members of a particular type should have the same custom equality, but nothing finer. What if different string members need different equality?

Responded above.

I can't see it playing well with e.g. types which currently accept an IEqualityComparer, like Dictionary. You can't re-use your EnumerableValueComparer for that case because it's not generic. Making it generic adds more boilerplate.

Since IEqualityComparer<T> is contravariant we can just make EqualityComparer implement IEqualityComparer<object>.

Your comparer will have to resort to reflection if e.g. it wants to compare two generic types, and wants to make sure that the generic type arguments are identical.
Your example comparer will box every collection element, as you've lost the ability to use generics.

As I said above this is less about changing the equality for records you own, but more about modifying equality for records you don't own, or adjusting the default equality for records you own for a particular use case. Therefore I think focusing on functionality over performance is appropriate.

If you need more than one type of custom equality, you'll end up writing an equality comparer which is specific to the set of types used in your record, and so lose reusability.

It's actually quite simple to get reusability by wrapping the EqualityComparer around an array of interfaces like this:

public interface EqualityComparerPart
{
    bool IsApplicable(object? obj);
    bool Equals(object? a, object? b, IEqualityComparer equalityComparer);
    bool GetHashCode(object obj, IEqualityComparer equalityComparer);
}

However this issue is long enough as it is without going into the particulars of that design.

@YairHalberstadt
Copy link
Contributor Author

@theunrepentantgeek

In this example, the need for strings to be compared in a case-insensitive manner screams to me that there's a semantic type buried in the declaration. I'd solve this by extracting the semantic type and being explicit:

That was given as a secondary example, and is clearly a made up use case. Besides you may not own the type being compared. Regardless there are countless cases where I have wanted to equate types not using their equals methods, and have been forced to rewrite reams of code to achieve this.

is a violation of the law of Demeter, just lurking to cause you future pain.

That was just an example of the shortest code you would have to write today, not an example of what code you should write.

@quinmars
Copy link

quinmars commented Sep 2, 2020

A little bit off-topic, but you can use SequenceEqual:

public static bool Equals(B b1, B b2)
{
     return b1.Strings.SequenceEqual(b2.Strings) && b1.I == b2.I;
}

// or case insensitive
public static bool Equals(B b1, B b2)
{
     return b1.Strings.SequenceEqual(b2.Strings, StringComparer.OrdinalIgnoreCase) && b1.I == b2.I;
}

@HaloFour
Copy link
Contributor

HaloFour commented Sep 2, 2020

@YairHalberstadt

This is less about changing the equality for records you own, but more about modifying equality for records you don't own, or adjusting the default equality for records you own for a particular use case.

That feels like an exceptionally rare thing to want to do and something that the language and BCL should not be encouraging. Records should provide their own identity and equality. But, if you really, really need to do this, you can already by providing an IEqualityComparer<T> for each of those records that compares the properties of those records directly, in those exceptionally rare cases. In my opinion the fact that it might be tedious would be a good thing; you should think twice about doing it.

@YairHalberstadt
Copy link
Contributor Author

That feels like an exceptionally rare thing to want to do

You clearly have a blessed existence

@HaloFour
Copy link
Contributor

HaloFour commented Sep 2, 2020

@YairHalberstadt

You clearly have a blessed existence

Considered records haven't shipped yet I think it's kind of odd to be claiming that a feature is required to override how their equality is calculated. 😀

As for overriding the equality of other arbitrary types, sure that's sometimes a necessity. That is why IEqualityComparer<T> exists. My question would be why this interface is now no longer sufficient, and why IEquatable needs to be changed in design and behavior to enable this proposal. Is it tedious to write comparers for every type you're comparing? Probably. But you can reuse the component parts, such as an ImmutableArray<T> (and such comparers already exist.)

Lastly, this proposal requires modifications to the BCL, which are not going to happen by .NET 5.0 when records ship. And the BCL has already expressed concern about adding DIM to commonly-used interfaces, refusing to do so with IEnumerable, and I don't imagine their opinion will be different in this case.

@YairHalberstadt
Copy link
Contributor Author

My question would be why this interface is now no longer sufficient,

IEqualityComparer<T> doesn't compose. It was never sufficient. The aim here is to provide a design for composable equality comparing. The reason records are relevant is that the design requires some level of adoption to be useful, and records provides a cheap way to get that level of adoption.

@YairHalberstadt
Copy link
Contributor Author

Closing as records have shipped and this didn't seem very popular anyway

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants