Blocks should be addressed by Multihash, not CID #259
Comments
Note that a CID has the multihash inside of it. This is not about avoiding passing CIDs around; it is about ensuring that we compare multihashes. |
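A minimal sketch of what "compare multihashes" could look like, assuming binary CIDs laid out as in the CID spec (CIDv0 is a bare sha2-256 multihash; CIDv1 is varint version + varint codec + multihash). The helper names (`multihash_of`, `same_block`, and so on) are hypothetical and are not taken from any IPFS library.

```python
import hashlib

# Multicodec table values; every helper name below is hypothetical.
SHA2_256 = 0x12
DAG_PB = 0x70
RAW = 0x55


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0):
    """Decode a varint starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def sha256_multihash(data: bytes) -> bytes:
    digest = hashlib.sha256(data).digest()
    return encode_varint(SHA2_256) + encode_varint(len(digest)) + digest


def cid_v0(data: bytes) -> bytes:
    return sha256_multihash(data)


def cid_v1(codec: int, data: bytes) -> bytes:
    return encode_varint(1) + encode_varint(codec) + sha256_multihash(data)


def multihash_of(cid: bytes) -> bytes:
    """Strip the version and codec, leaving only the multihash."""
    if len(cid) == 34 and cid[0] == 0x12 and cid[1] == 0x20:
        return cid  # CIDv0 is already a bare multihash
    version, pos = decode_varint(cid)
    if version != 1:
        raise ValueError(f"unsupported CID version: {version}")
    _codec, pos = decode_varint(cid, pos)
    return cid[pos:]


def same_block(a: bytes, b: bytes) -> bool:
    """Two CIDs address the same bytes if their multihashes match."""
    return multihash_of(a) == multihash_of(b)
```

With these helpers, `cid_v0(data)` and `cid_v1(DAG_PB, data)` differ as byte strings, yet `same_block` treats them as the same block.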
Note: Unless I'm mistaken, we should consider switching provider records to using multihashes (without breaking backwards compat too much). |
@Stebalien the block interface should still take CIDs (just like @diasdavid said we should still use CIDs on the wire) |
@kevina, @diasdavid I disagree. For bitswap, we should send CIDs because we'll need to do that for DAGSwap (IPLD selectors). For blockstores, blocks really shouldn't understand the concept of CIDs because they're blocks (not at the IPLD level). Yes, we could continue to use CIDs but that's mixing abstraction levels and will tend to lead to bugs (see the associated go-ipfs issue: https://github.com/ipfs/go-ipfs/issues/4189) |
Never mind. It's actually quite useful to have blocks that carry information about their codec/CID without actually being able to interpret the underlying data. |
The datastore (in go, at least) is the layer that uses raw multihashes and doesn't understand CIDs. |
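As a rough illustration of that layering (not the go-ipfs implementation), a blockstore could accept and return blocks that carry a CID while keying the underlying datastore by the raw multihash. `Block`, `MultihashBlockstore`, and the helpers are hypothetical names, and the sketch assumes sha2-256 multihashes and codec codes below 0x80, so that each varint is a single byte.

```python
import hashlib
from typing import Dict, NamedTuple, Optional


class Block(NamedTuple):
    cid: bytes   # binary CID; carries version and codec information
    data: bytes  # opaque bytes; the blockstore never interprets them


def sha256_multihash(data: bytes) -> bytes:
    # 0x12 = sha2-256, 0x20 = 32-byte digest
    return bytes([0x12, 0x20]) + hashlib.sha256(data).digest()


def cid_v1(codec: int, data: bytes) -> bytes:
    # Assumption: codec < 0x80, so it fits in one varint byte.
    return bytes([0x01, codec]) + sha256_multihash(data)


def multihash_of(cid: bytes) -> bytes:
    if len(cid) == 34 and cid[:2] == b"\x12\x20":
        return cid  # CIDv0 is a bare multihash
    if cid[0] != 0x01 or cid[1] & 0x80:
        raise ValueError("sketch only handles CIDv1 with a one-byte codec")
    return cid[2:]


class MultihashBlockstore:
    """Takes CIDs at the interface, stores by multihash underneath."""

    def __init__(self) -> None:
        self._datastore: Dict[bytes, bytes] = {}

    def put(self, block: Block) -> None:
        self._datastore[multihash_of(block.cid)] = block.data

    def get(self, cid: bytes) -> Optional[Block]:
        data = self._datastore.get(multihash_of(cid))
        # The returned block keeps the CID the caller asked for.
        return None if data is None else Block(cid, data)

    def __len__(self) -> int:
        return len(self._datastore)
```

A block stored under a raw CIDv1 can then be fetched through a CIDv0 or a git-raw CIDv1 for the same bytes, and the datastore holds a single entry.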
@Stebalien Yeah, I agree conceptually, but in practice there is near zero overlap. Blocks with the same multihash but different CIDs are nonexistent in practice (please call me out on this if anyone disagrees). |
@whyrusleeping this is going to change drastically if we switch to base32 by default because base32 CIDs will be CIDv1, not CIDv0 (CIDv0, according to @diasdavid, is base58btc only by spec). |
The reason I closed this bug is that it's still useful to have blocks that carry CID information that can't be interpreted (or doesn't need to be, at this point in time). |
@whyrusleeping if the multibase or the multicodec is different, the multihash will still be the same. During the Lisbon Treaty, we were seriously concerned about block duplication (for example, importing git blocks as raw and with the git multicodec). But now, as @Stebalien points out, if we change the defaults to base32, we will see a ton of blocks being referenced by a new CID that will have the same multihash. Also, if we don't compare by multihash, a CIDv0 would never match a CIDv1 for the same block and we would therefore miss fetching it, even though the only thing that changes is the link and not the content. |
The base is only referenced when a CID is printed. IPFS internally just stores
it baseless. What you choose to print it as doesn't matter (and I sincerely
hope the js code doesn't keep them encoded internally).
On Thu, Aug 31, 2017, 9:27 PM David Dias wrote:
> @whyrusleeping if the multibase or the multicodec is different, then the
> multihash will be the same.
> During the Lisbon Treaty, we were seriously concerned about block
> duplication (for example importing git blocks as raw and with git
> multicodec). But now, as @Stebalien points out, if we change the defaults
> to base32, we will see a ton of blocks being referenced by a new CID that
> will have the same multihash.
> Also, if we don't compare by multihash it means that a CIDv0 would never
> match a CIDv1 for the same block and therefore would miss fetching, even
> though the only thing that changes is the link and not the content.
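To illustrate the point that the base only matters when printing, here is a small sketch assuming the multibase prefixes `b` (lowercase base32 without padding) and `f` (lowercase base16). The function names are hypothetical. The same binary CID prints as two different strings, and both decode back to identical bytes.

```python
import base64
import hashlib


def raw_cid_v1(data: bytes) -> bytes:
    """Binary CIDv1: version 1, codec raw (0x55), sha2-256 multihash."""
    return bytes([0x01, 0x55, 0x12, 0x20]) + hashlib.sha256(data).digest()


def to_base32(cid: bytes) -> str:
    # Multibase prefix "b": RFC 4648 base32, lowercase, no padding.
    return "b" + base64.b32encode(cid).decode("ascii").lower().rstrip("=")


def to_base16(cid: bytes) -> str:
    # Multibase prefix "f": lowercase hexadecimal.
    return "f" + cid.hex()


def from_multibase(text: str) -> bytes:
    """Decode either textual form back to the baseless binary CID."""
    prefix, body = text[0], text[1:]
    if prefix == "f":
        return bytes.fromhex(body)
    if prefix == "b":
        padding = "=" * (-len(body) % 8)  # restore stripped padding
        return base64.b32decode(body.upper() + padding)
    raise ValueError(f"unsupported multibase prefix: {prefix!r}")
```

Anything that stores or compares the decoded bytes is unaffected by which base a user happened to see.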
|
We use its Buffer format which is version + multicodec + multihash for storing them on the repo:
But you are right, multibase is not an issue. I knew there was something and led myself to believe that multibase was part of the mix too. The multicodec and CID version are the ones creating different keys (for blocks in the repo) for the same content. |
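A hedged sketch of that duplication, assuming keys laid out as version + multicodec + multihash for CIDv1 and a bare multihash for CIDv0. The function names are hypothetical and a plain dict stands in for the repo. The same bytes referenced three ways produce three keys when keyed by CID and a single key when keyed by multihash.

```python
import hashlib

RAW = 0x55      # multicodec code for raw
GIT_RAW = 0x78  # multicodec code for git-raw


def sha256_multihash(data: bytes) -> bytes:
    # 0x12 = sha2-256, 0x20 = 32-byte digest
    return bytes([0x12, 0x20]) + hashlib.sha256(data).digest()


def cid_v1_key(codec: int, data: bytes) -> bytes:
    # Assumption: version and codec are below 0x80, one byte each.
    return bytes([0x01, codec]) + sha256_multihash(data)


def keyed_by_cid(cids, data: bytes) -> dict:
    """Store the same bytes once per distinct CID."""
    return {cid: data for cid in cids}


def keyed_by_multihash(cids, data: bytes) -> dict:
    """Store the same bytes once per distinct multihash."""
    repo = {}
    for cid in cids:
        # CIDv0 (34 bytes, 0x12 0x20 prefix) is already the multihash.
        is_v0 = len(cid) == 34 and cid[:2] == b"\x12\x20"
        repo[cid if is_v0 else cid[2:]] = data
    return repo


def example_cids(data: bytes):
    """One CIDv0 and two CIDv1s, all pointing at the same bytes."""
    return [
        sha256_multihash(data),
        cid_v1_key(RAW, data),
        cid_v1_key(GIT_RAW, data),
    ]
```

For a git blob imported both as raw and as git-raw, `keyed_by_cid` ends up with separate entries for identical content, while `keyed_by_multihash` collapses them.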
Given a block of data, it's possible to interpret it in different ways assuming different codecs. Currently, in bitswap, the blockstore, and everywhere else for that matter, we track blocks using CIDs instead of using the data's multihash.
That is:
Use-cases/References:
Unfortunately, fixing this will be nontrivial. This will likely end up being part of #255.