This repository has been archived by the owner on Jun 2, 2020. It is now read-only.

Improve CID concept doc for #95
License: MIT
Signed-off-by: Randall Harmon <rjharmon0316@gmail.com>
rjharmon committed Aug 18, 2018
1 parent a92f8ee commit 47bcb02
Showing 1 changed file with 22 additions and 7 deletions.
29 changes: 22 additions & 7 deletions content/guides/concepts/cid.md
@@ -5,18 +5,33 @@ menu:
parent: concepts
---

A *content identifier* is a value that addresses a single piece of content in IPFS. It is mainly a cryptographic hash of the content, but is encoded as a [multihash](https://github.com/multiformats/multihash) and [multicodec](https://github.com/multiformats/multicodec). (Note: older CIDs have a different design — see [version 0](#version-0) below.)
A *content identifier*, or CID, is a label used to point to material in IPFS in a standardized way. It doesn't indicate _where_ the content is stored, but it forms a kind of address out of the content itself. CIDs are consistently compact, regardless of the size of the underlying content.

<!-- TODO: explain more of the details of how CID v1 is composed here. -->
CIDs are based on the content's [cryptographic hash](concepts/hashes). As a result, any difference in the content produces a different CID. Any IPFS node that holds the content can match it against the hash and return the original content.
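
As a rough illustration of why any change in content yields a different identifier, the sketch below hashes two nearly identical byte strings with SHA-256 from Python's standard library. It shows only the hashing step, not full CID construction, and the helper name `content_digest` is made up for this example.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string (hypothetical helper)."""
    return hashlib.sha256(data).hexdigest()


small = content_digest(b"hello IPFS")
changed = content_digest(b"hello IPFS!")         # one extra byte
large = content_digest(b"hello IPFS" * 100_000)  # roughly 1 MB of input

# A one-byte change gives a different digest.
print(small != changed)
# The digest length does not depend on the size of the input.
print(len(small) == len(large))
```

The fixed digest length is what keeps CIDs compact no matter how large the underlying content is.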

You can read up on the details in the [CID spec](https://github.com/ipld/cid). You might also want to check out the [CID inspector](http://cid-utils.ipfs.team/#zb2rhiVd5G2DSpnbYtty8NhYHeDvNkPxjSqA7YbDPuhdihj9L) for an interactive breakdown of CIDs.
## CID formats

CIDs can take a few different forms with different encoding bases or CID versions. Many of the existing IPFS tools still generate v0 CIDs, although the `files` ([MFS](/concepts/mfs)) and `object` operations now use CIDv1 by default.

### Version 0

When IPFS was first designed, we used base 58-encoded multihashes as the content identifiers. (This is simpler, but much less flexible than newer CIDs.) CIDv0 is still used by default for many IPFS operations, so you should generally try to support v0.

If a CID is 46 characters starting with "Qm", it's a CIDv0 (for more details, check the [decoding algorithm](https://github.com/ipld/cid/blob/ef1b2002394b15b1e6c26c30545fd485f2c4c138/README.md#decoding-algorithm) in the CID specification).
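
The rule of thumb above can be sketched in a few lines. The example below is a minimal illustration, not the decoding algorithm from the specification: it builds a CIDv0-style string by base58-encoding a sha2-256 multihash of some raw bytes, then applies the "46 characters starting with Qm" check. The function names `base58_encode` and `looks_like_cidv0` are hypothetical, and hashing raw bytes directly is an assumption made for brevity, so the result will generally not match the CID that IPFS tools produce for the same file.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58 using the Bitcoin alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def looks_like_cidv0(cid: str) -> bool:
    """Apply the rule of thumb: 46 characters starting with "Qm"."""
    return len(cid) == 46 and cid.startswith("Qm")


# A multihash is <hash function code><digest length><digest>.
# 0x12 is the multihash code for sha2-256; 0x20 (32) is its digest length.
digest = hashlib.sha256(b"hello IPFS").digest()
multihash = bytes([0x12, 0x20]) + digest
example = base58_encode(multihash)

print(example)
print(looks_like_cidv0(example))
```

The "Qm" prefix falls out of the encoding: every 34-byte multihash that begins with the bytes `0x12 0x20` base58-encodes to a 46-character string starting with those two letters.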

## Version 1

Version 1 is the latest version of CID. It is used by default for `files` ([MFS](/concepts/mfs)) and `object` operations.
CID v1 contains some leading identifiers that clarify exactly which representation is used, along with the content-hash itself. These include:

* A multibase prefix, specifying the encoding used for the remainder of the CID
* The CID version identifier, which indicates which version of CID is encoded
* The [multicodec](https://github.com/multiformats/multicodec) identifier, indicating the format of the target content. It helps people and software know how to interpret that content after it is fetched

These leading identifiers also provide forward compatibility, allowing different formats to be used in future versions of CID.

You can use the first few bytes of the CID to interpret the remainder of the content address and know how to decode the content after it's fetched from IPFS. For more details, check out the [CID specification](https://github.com/ipld/cid). It includes a [decoding algorithm](https://github.com/ipld/cid/blob/ef1b2002394b15b1e6c26c30545fd485f2c4c138/README.md#decoding-algorithm) and links to existing software implementations for decoding CIDs.
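
To make the layout of those leading identifiers concrete, here is a minimal sketch that assembles a CIDv1 by hand and then reads the identifiers back out. It is not one of the implementations linked from the specification. It assumes the base32 multibase prefix (`b`), the `raw` multicodec (`0x55`), and a sha2-256 multihash (`0x12`), and the helper names `read_varint` and `parse_cidv1_base32` are hypothetical.

```python
import base64
import hashlib
from typing import Dict, Tuple


def read_varint(data: bytes, offset: int) -> Tuple[int, int]:
    """Read an unsigned varint and return (value, next offset)."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, offset
        shift += 7


def parse_cidv1_base32(cid: str) -> Dict[str, object]:
    """Split a base32-encoded CIDv1 string into its leading identifiers."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles the 'b' (base32) prefix")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    raw = base64.b32decode(body)

    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    digest_len, pos = read_varint(raw, pos)
    return {
        "version": version,
        "codec": codec,
        "hash_code": hash_code,
        "digest": raw[pos:pos + digest_len],
    }


# Assemble: <version 1><codec raw = 0x55><sha2-256 = 0x12><length 0x20><digest>
digest = hashlib.sha256(b"hello IPFS").digest()
cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
cid = "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")

parsed = parse_cidv1_base32(cid)
print(cid)
print(parsed["version"], hex(parsed["codec"]), hex(parsed["hash_code"]))
```

A fuller decoder would first check for CIDv0 and support the other multibase prefixes; this sketch skips both to keep the structure visible.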

You might also want to check out the [CID inspector](http://cid-utils.ipfs.team/#zb2rhiVd5G2DSpnbYtty8NhYHeDvNkPxjSqA7YbDPuhdihj9L) for an interactive breakdown of differently-formatted CIDs.

## Version 0

When IPFS was first designed, we used base 58-encoded multihashes as the content identifiers. (This is simpler, but much less flexible than newer CIDs.) CIDv0 is still used by default when adding files and blocks to IPFS, so you should generally try to support it.

The CID specification includes a [decoding algorithm](https://github.com/ipld/cid/blob/ef1b2002394b15b1e6c26c30545fd485f2c4c138/README.md#decoding-algorithm) you can use to distinguish CID v0 from newer versions.
