-
Notifications
You must be signed in to change notification settings - Fork 761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hashcode implementation proposal for DataClassificationSet #4933
Conversation
...braries/Microsoft.Extensions.Compliance.Abstractions/Classification/DataClassificationSet.cs
Outdated
Show resolved
Hide resolved
public override int GetHashCode() => _classifications.GetHashCode(); | ||
public override int GetHashCode() | ||
{ | ||
int hash = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please make this ode conditional of #if NETFRAMEWORK
and then add separate code that uses this API on .NET Core?
https://learn.microsoft.com/en-us/dotnet/api/system.hashcode.add?view=net-8.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if maybe we should compute the hash code just once and store it in a field of the DataClassificationSet struct? I worry about the cost of repeated enumeration, and the fact enumeration of a HashSet is not guarantee to return the sequence in the same order every time. Realistically, since the set is not mutated once initialized, it's highly likely that enumeration will always be consistent. But who knows, maybe somebody will change HashSet in some clever way in the future which breaks this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for feedback!
About first point - makes perfect sense, I'll rewrite to use HashCode.Add.
About the second point - the fact that HashSet is not guaranteed to return the sequence in the same order should not be much of a problem as long as we use commutative operations to calculate the hashcode (such as XOR) - correct? With those, we should be able to calculate the hashcode in order-independent way.
For performance reasons it would definitely make sense to compute the hashcode just once, since it is used as a key in a dictionary. For that we could consider either lazy hashcode calculation in the GetHashCode method or calculate it during object initialization (in the constructor). Because of thread-safety concerns, I would prefer to calculate it in constructor.
I will send a commit that addresses this - please let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix!
Fixes: #4932
Microsoft Reviewers: Open in CodeFlow