-
Notifications
You must be signed in to change notification settings - Fork 11
Conversation
This hash function is heavily used by Filecoin, making Kubo unable to operate with Filecoin data.
Thank you for submitting this PR!
Getting other community members to do a review would be great help too on complex PRs (you can ask in the chats/forums). If you are unsure about something, just leave us a comment.
We currently aim to provide initial feedback/triaging within two business days. Please keep an eye on any labelling actions, as these will indicate priorities and status of your contribution. |
(additional filecoin hashes)
@hsanjuan have you tried this in Kubo ? I don't see a registered hash in https://github.com/multiformats/go-multihash I think that is just gonna error anyway. |
@hsanjuan note that cc @porcuquine @vmx |
I have tried it. It must be registered because this imports go-multihash and works? |
Yes, though I think those parameters are well defined (somewhere :). |
To clarify, things break as soon as one of these CIDs is "seen". This is before the potential step where a block referenced by it is verified (I don't know what error it will throw in that case since these blocks are not retrievable). In general my feeling is that ipfs may not know how to hash a block referenced like this (??), but that doesn't mean the safeguards against "insecure" hashes should kick in. I'm sure this can error at a different place just fine. |
But this is terrible... If a block content addressed with sha256 links to an md5 hash and a murmru "hash", there is nothing wrong with that block itself. @Jorropo @aschmahmann did we graft the verifcid integration incorrectly somewhere...? |
IIUC everything is as it's been in this place for as long as verifcid has existed https://github.com/ipfs/go-blockservice/blob/986faee89be3cef0e742cabda49bdc1d95050d6f/blockservice.go#L232.
@hsanjuan are you sure that's right? Pretty sure this error isn't happening on parse but on before a fetch where you try to pin (i.e. fetch) an entire DAG which includes some blocks go-verifcid doesn't like. Note that even if go-verifcid was fine did kubo would be unable to fetch the over Bitswap due to a failure in https://github.com/ipfs/go-libipfs/blob/6c0f842feed4c8a0ef24d38c94b6e28e622f25d7/bitswap/message/message.go#L222 because there is no hasher registered for go-multihash to use (prefix.Sum calls multihash.Sum which brings us to https://github.com/multiformats/go-multihash/blob/38400376eaefaa6216d075b466037e75437c2838/core/registry.go#L97 It seems like what would happen here if you allowed go-verifcid to allow these hashes you can't understand anyway is that your pin request would just hang until you cancelled it while in the meanwhile the blockservice (and then Bitswap) keep wasting your resources endlessly looking for a block you would never be able to use even if you could find it. AFAICT there isn't anyone who currently benefits from this change and so reverting it would probably be the correct move. If someone actually runs into a problem here that this PR fixes we can always reapply it. In the meanwhile it's safer and more user friendly to error early than to just hang forever looking for a block you can't possibly get or verify. Is this about right @hsanjuan? |
Agreed that it could, for example you could replace most of the calls to go-verifcid.ValidateCIid with something that looks approximately like Maybe it's more reasonable to have a new error type here alongside "potentially insecure hash function" and complaints about hash size being too big or small for "not a registered hash function". Note: a hash being too big isn't an "insecure one" either, just one that's not larger than reasonable and nobody has yet had a use for. This would mean go-verifcid is coupled to some go-multihash internals (i.e. global registry) which is unfortunate, although go-verifcid already is a singleton and has a dependency on go-multihash (although one that could be replaced currently with https://github.com/multiformats/go-multicodec). @hsanjuan do you have a preference for which approach you'd like to take here? |
Under the go-verifcid rules as they were set when go-verifcid was established, these hashes are "good". The primary goal of go-verifcid was to disallow hashes which do not provide second-preimage resistance. Answering Riba's comment directly:
You can verify that some "blob" corresponds to the hash but the padding has to be already applied to the blob. On the other hand, the purpose of go-verifcid can change so I will leave it to you all as the evolution of Kubo's components is outside of my scope. |
This seems like an issue worth fixing too. Is it only POSEIDON_ that needs special params for hashing? If so, there should be nothing preventing me from adding and retrieving blocks using the other two hashing functions? In general this library seems like the wrong place to make the decision even in the case of Poseidon... I would expect a pin request to hang when it tries to fetch a block that is not retrievable. That is normal ipfs behavior. What is unexpected is ipfs to error telling me that a hash is insecure without it being insecure, and without actually asking for it to the blockstore or the network. The "I don't know how to hash" is a better error, but I would expect that ipfs knows how to hash using functions that are actively used in Filecoin land. Overall, since in practice no one is putting data with such hashes on IPFS yet, it does not matter today. We can leave things as they are. Is there an scenario, however, where someone might put such data on IPFS or we may want it to be retrievable over IPFS? Will FVM have any impact on that? |
This hash function is heavily used by Filecoin, making Kubo unable to operate with Filecoin data.