-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
murmur3 support #150
murmur3 support #150
Conversation
murmur3 was removed until someone had time to document and get their story straight about what the murmur3 codec actually means. It's a bit of a mess (#135 (comment)). I would like to not merge this in until we have the definition of what is meant by murmur3 specced out. Adding test vectors here like we have for some of the other hash functions would be really appreciated. |
the downside of not having something here is that it's going to surround all the use cases with ugliness instead e.g. it's already there: https://github.com/ipfs/go-unixfsnode/blob/main/hamt/util.go#L108-L112 |
) | ||
|
||
func init() { | ||
multihash.Register(multihash.MURMUR3_128, func() hash.Hash { h := murmur3.New128(); return h }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the goal is to make sure we have a description of what unixfsv1 does internally, per A and B, it appears the function desired is New64
, not New128
.
Of course, that's not what multihash indicator code 0x22
is stated to be in the multiformat table today. 🤷
We have the choice between one of several different kinds of ugliness. There is no non-ugly pick as far as I can tell :) There's a couple different pieces of history that are contradictory. If we want to stop avoiding the problem, we have to pick one of them to break.
I think we should probably do the third, there. The indicator codes have a lot more inertia than the textual names do, because they appear in data in the wild. The code names have very minimal inertia by comparison. So I propose we "break" the name and change it to a correct one in order to fix this mess. EDIT: ah, I didn't see another discussion reach the same conclusion, but it appears it has in multiformats/multicodec#236. Super good. We're all in agreement then, sounds like. |
This is updated to naming chosen upstream. Please take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple small nits but otherwise LGTM.
} | ||
|
||
// A wrapper is needed to export the correct size, because murmur3 incorrectly advertises Hash64 as a 128bit hash. | ||
type murmur64 struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe this type name should be as verbose as the constants became, for consistency?
func (murmur64) BlockSize() int { | ||
return 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks slightly funny. I can't really fathom any harm to it, though.
... but I went on a splunk anyway, and I'd say 16 is probably an accurate number. citation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm trying to stay with what it would report natively: https://github.com/spaolacci/murmur3/blob/539464a789e9b9f01bc857458ffe2c5c1a2ed382/murmur.go#L31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh. Remains weird to me that they said it that way, but okay.
No description provided.