Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Override == and hashCode in PbMap #224

Merged
merged 7 commits into from
Apr 2, 2019
Merged

Conversation

szakarias
Copy link
Member

No description provided.

@szakarias szakarias requested a review from sigurdm March 20, 2019 13:40
@@ -9,6 +9,7 @@ environment:
sdk: '>=2.0.0-dev.17.0 <3.0.0'
dependencies:
fixnum: '>=0.9.0 <0.11.0'
quiver: ^2.0.2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer we do not get further dependencies if we can avoid it. I think it is better to duplicate it.
Hopefully we can have a resolution of dart-lang/sdk#11617 someday, and use core library functionality.

if (other.length != length) {
return false;
}
if (other.hashCode != hashCode) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think calculating the hash code during equality testing is slower than just doing structural equality testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much slower if the first key is not in the other map!

It is also exponential in the nesting depth of the PbMaps when the map values contain PbMaps -
you walk the maps for this hashCode call, and again in the != below.

Perhaps you can make a benchmark for PbMaps nested 20x deep to prove we avoid this problem.

}

int get hashCode {
return hashObjects(_wrappedMap.keys
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we should be able to share code with _FieldSet.hashCode and PbList.hashCode

return hashObjects(_wrappedMap.keys
.map((key) => hash2(key.hashCode, _wrappedMap[key].hashCode))
.toList(growable: false)
..sort());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just rely on the default dart map being ordered (it is a LinkedHashMap)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No of course not! - different order of insertion should still result in the same hashcode. I think we can xor the hashcodes of all the entries.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hash2 takes two Objects, and calls their hashCodes.
So the calls the hashCode here are unnecessary.

hashObjects takes an Iterable of Objects and calls the hashCodes.

In effect we are doing a lot of x.hashCode.hashCode.
At best that seems inefficient, and may result in a poor hashCode if the underlying combiner doesn't have good dispersal.
At least we can fix the two calls here.

@rakudrama
Copy link
Contributor

Question: why do we need structural == and hashCode for PbMaps when Dart's corelib Map implementations have identity == and hashCode?

return false;
}
for (final key in keys) {
if (other[key] != this[key]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other[key] could be null.
I'd add a comment that this[key] can never be null, so it is not necessary to check other.containsKey(key), since the test will fail for missing key.

As a TODO - It might be more efficient to check all the keys first, since they are usually smaller than the values.

@sigurdm
Copy link
Collaborator

sigurdm commented Mar 22, 2019

Question: why do we need structural == and hashCode for PbMaps when Dart's corelib Map implementations have identity == and hashCode?

It is a really good question.

It is to fit in with the existing protobuf implementation.
Currently PbList has structural == and hashCode, changing that would be a breaking change. It seems weird that PbMap should not have. Especially given that GeneratedMessage also implements structural == and hashCode that recurses into its fields.

I agree this is a wrong design: protobuf messages are by nature open-ended and structural equality is more or less meaningless (unknown fields cannot be compared). But it seems very late to change that now, and the equality checks have some use in writing tests.

@@ -35,4 +35,29 @@ bool _areByteDataEqual(ByteData lhs, ByteData rhs) {
return _areListsEqual(asBytes(lhs), asBytes(rhs));
}

List<T> sorted<T>(Iterable<T> list) => new List.from(list)..sort();

// Jenkins hash functions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provide a link to where we loaned this from

@@ -35,4 +35,29 @@ bool _areByteDataEqual(ByteData lhs, ByteData rhs) {
return _areListsEqual(asBytes(lhs), asBytes(rhs));
}

List<T> sorted<T>(Iterable<T> list) => new List.from(list)..sort();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is duplicate


// Jenkins hash functions

int _combine(int hash, int value) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either give these a more distinct name or collect them as static methods in a (private) class


expect(m, m2);
expect(m == m2, isTrue);
expect(m.hashCode, m2.hashCode);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a negative test?

@@ -1,4 +1,9 @@
## 16.0.4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps name this a dev-version, we don't need to publish a new protoc version.

return true;
}

@override
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add dartdoc to pbmap explaining how the hash and equals semantics work.

@@ -35,4 +35,27 @@ bool _areByteDataEqual(ByteData lhs, ByteData rhs) {
return _areListsEqual(asBytes(lhs), asBytes(rhs));
}

List<T> sorted<T>(Iterable<T> list) => List.from(list)..sort();
List<T> sorted<T>(Iterable<T> list) => new List.from(list)..sort();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's deprecate this, and make a private version. It has nothing to do in the exports of protobuf.

Copy link
Collaborator

@sigurdm sigurdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@szakarias szakarias merged commit d17fa16 into google:master Apr 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants