Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provider stronger hash codes #39

Closed
davidmorgan opened this issue Nov 11, 2019 · 8 comments
Closed

Provider stronger hash codes #39

davidmorgan opened this issue Nov 11, 2019 · 8 comments
Assignees
Labels
enhancement New feature or request

Comments

@davidmorgan
Copy link

I was looking for ways to ease operator == and hashCode in Dart and found equatable.

It seems a reasonable approach overall.

But, looking at the implementation, it looks like it uses XOR to combine hashes. This would mean, for example, that

class Point {
  final int x = 3;
  final int y = 4;
}

and

class Point {
  final int x = 4;
  final int y = 3;
}

will hash to the same value. Did I miss something?

I think the correct approach would be to use something like

https://en.wikipedia.org/wiki/Jenkins_hash_function

--which is what package:quiver uses

https://github.com/google/quiver-dart/blob/master/lib/src/core/hash.dart

Good news is it should be possible to add this within the current implementation with no changes to how it's used :)

Thanks!

@felangel
Copy link
Owner

Hi @davidmorgan 👋
Thanks for opening an issue!

You bring up a great point and we can definitely make this improvement. I would love to review a pull request if you want to contribute or alternatively, I can try to make this adjustment over the next few days 👍

Thanks again!

@felangel felangel added the enhancement New feature or request label Nov 11, 2019
@davidmorgan
Copy link
Author

I currently have a cold so I'm sticking to email right now ;)

BTW, the reason I was looking into this is I'm considering proposing a lint recommending that people not implement operator== and hashCode by hand; I was looking for libraries to point people to as better. I wrote built_value which is one such library, but I didn't want to just point people to my own library :D

I wonder if you know of any others?

@felangel
Copy link
Owner

felangel commented Nov 14, 2019

@davidmorgan hope you feel better! That's awesome, I think a lint would be really nice to have. I honestly only know of built_value and equatable as well 😛.

BTW I was thinking about your proposal and am wondering if there's a concrete use-case where XOR would not work? I agree that it theoretically doesn't satisfy all cases, but given the props override is part of the class implementation I don't think there would be a problem unless props dynamically builds the list and the order changes at runtime and even then runtimeType is included in the hashCode generation so it shouldn't be a problem.

I created a gist to test this behavior and it seems to work fine unless I'm missing something.

Thoughts?

@felangel felangel added the waiting for response Waiting for follow up label Nov 14, 2019
@felangel felangel self-assigned this Nov 14, 2019
@davidmorgan
Copy link
Author

The problem is that naturally occurring data might lead to a large number of hash collisions.

For example, if you have a Set of Point objects that represent two diagonal lines:

(1, 2), (2, 4), (3, 6), (4, 8), (2, 1), (4, 2), (6, 3), (8, 4)

then these eight objects hash to only four hash codes.

Another natural example would be if you have one field for first name, one field for last name; and then someone called Morgan David hashes to the same code as David Morgan.

Jenkins hash should be very fast to compute--just a little slower than xor by itself--but distributes the hash keys much more evenly, pretty much removing cases where particular patterns in the data can lead to hash collisions. (i.e. the data that leads to hash collisions looks like random data, and you can only hit it either by coincidence--rare enough that it doesn't matter--or by malicious input crafted to cause hash collisions--which is a whole other conversation).

@felangel
Copy link
Owner

Makes sense, thanks for clarifying 💯

@felangel felangel mentioned this issue Nov 17, 2019
3 tasks
@felangel felangel removed the waiting for response Waiting for follow up label Nov 17, 2019
@felangel
Copy link
Owner

Done in #40 and will be released as part of v1.0.0

@dsrenesanse
Copy link
Contributor

Thank you @davidmorgan for improvements!

@davidmorgan
Copy link
Author

Awesome, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants