From f1033eee108d1b5c4aef681111f2b87b77583f81 Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Tue, 20 Jun 2023 21:47:07 +0200
Subject: [PATCH] ipip-404: descope car roots and determinism

Discussions in https://github.com/ipfs/specs/pull/402 illustrated deeper problem with CAR determinism, and we made a decision to remove its aspects from IPIP-402.

Ref. https://github.com/ipfs/specs/pull/402#issuecomment-1598000900
---
 src/http-gateways/trustless-gateway.md | 56 ++++++++++++++++++--------
 src/ipips/ipip-0402.md                 | 39 ++++++++++++++++++
 2 files changed, 78 insertions(+), 17 deletions(-)

diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md
index e0553e224..1df71dbe3 100644
--- a/src/http-gateways/trustless-gateway.md
+++ b/src/http-gateways/trustless-gateway.md
@@ -4,7 +4,7 @@ description: >
   Trustless Gateways are a minimal subset of Path Gateways that allow light IPFS
   clients to retrieve data behind a CID and verify its integrity without
   delegating any trust to the gateway itself.
-date: 2023-04-17
+date: 2023-06-20
 maturity: reliable
 editors:
   - name: Marcin Rataj
@@ -153,18 +153,9 @@ The Body hash MUST match the Multihash from the requested CID.
 
 ### CAR Response
 
-A CAR stream
-([application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car))
-for the requested content type, path and optional `dag-scope` and `entity-bytes` URL parameters.
-
-:::note
-
-By default, block order in CAR response is not deterministic, blocks can
-be returned in different order, depending on implementation choices (traversal,
-speed at which blocks arrive from the network, etc). An opt-in ordered CAR
-responses MAY be introduced in a future, see [IPIP-412](https://github.com/ipfs/specs/pull/412).
-
-:::
+A CAR stream for the requested
+[application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)
+content type, path and optional `dag-scope` and `entity-bytes` URL parameters.
 
 #### CAR version
 
@@ -174,9 +165,40 @@ field MUST match the `version` parameter returned in `Content-Type` header.
 
 #### CAR roots
 
-If the response uses version 1 or 2 of the CAR spec, the
+The behavior associated with the
 [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field
-MAY contain the CID of the terminus of the content path.
+is not currently specified.
+
+Clients MAY ignore it.
+
+:::issue
+
+As of 2023-06-20, the behavior of the `roots` CAR field remains an [unresolved item within the CARv1 specification](https://web.archive.org/web/20230328013837/https://ipld.io/specs/transport/car/carv1/#unresolved-items).
+
+:::
+
+#### CAR determinism
+
+The default CAR header and block order in a CAR response is not specified and is non-deterministic.
+
+Clients MUST NOT assume that CAR responses are deterministic (byte-for-byte identical) across different gateways.
+
+Clients MUST NOT assume that CAR includes CIDs and their blocks in the same order across different gateways.
+
+:::issue
+
+In controlled environments, clients MAY choose to rely on undocumented CAR determinism,
+subject to the agreement of the following conditions between the client and the
+gateway:
+- CAR version
+- content of [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field
+- order of blocks
+- status of duplicate blocks
+
+In the future, there may be an introduction of a convention to indicate aspects
+of determinism in CAR responses. Please refer to
+[IPIP-412](https://github.com/ipfs/specs/pull/412) for potential developments
+in this area.
+
+:::
 
-If implementation prefers to avoid buffering blocks, and return them as soon as
-possible, the field MAY be left empty.
diff --git a/src/ipips/ipip-0402.md b/src/ipips/ipip-0402.md
index 451777f9b..4a83ab1ba 100644
--- a/src/ipips/ipip-0402.md
+++ b/src/ipips/ipip-0402.md
@@ -143,6 +143,45 @@ mention feature detection via OPTIONS -- a separate IPIP?
 OR suggest executing test request and client-side detection the first time a gateway is used.
 -->
 
+#### CAR roots and determinism
+
+As of 2023-06-20, the behavior of the `roots` CAR field remains an [unresolved item within the CARv1 specification](https://web.archive.org/web/20230328013837/https://ipld.io/specs/transport/car/carv1/#unresolved-items):
+
+> Regarding the roots property of the Header block:
+>
+> - The current Go implementation assumes at least one CID when creating a CAR
+> - The current Go implementation requires at least one CID when reading a CAR
+> - The current JavaScript implementation allows for zero or more roots
+> - Current usage of the CAR format in Filecoin requires exactly one CID
+>
+> [..]
+>
+> It is unresolved how the roots array should be constrained. It is recommended
+> that only a single root CID be used in this version of the CAR format.
+>
+> A work-around for use-cases where the inclusion of a root CID is difficult
+> but needing to be safely within the "at least one" recommendation is to use
+> an empty CID: `\x01\x55\x00\x00` (zero-length "identity" multihash with "raw"
+> codec). Since current implementations for this version of the CAR
+> specification don't check for the existence of root CIDs
+> (see [Root CID block existence](https://web.archive.org/web/20230328013837/https://ipld.io/specs/transport/car/carv1/#root-cid-block-existence)),
+> this will be safe as far as CAR implementations are concerned. However, there
+> is no guarantee that applications that use CAR files will correctly consume
+> (ignore) this empty root CID.
+
+Due to the inconsistent and non-deterministic nature of CAR implementations,
+the gateway specification faces limitations in providing specific
+recommendations. Nevertheless, it is crucial for implementations to refrain
+from making implicit assumptions based on the legacy behavior of individual CAR
+implementations.
+
+Due to this, gateway specification changes introduced in this IPIP clarify that:
+- The CAR `roots` behavior is out of scope, and clients MAY ignore it.
+- CAR determinism is not present by default; responses may differ across
+  requests and gateways.
+- Opt-in determinism is possible, but a standardized signaling mechanism does not
+  exist until we have IPIP-412 or similar.
+
 ### Security
 
 This IPIP allows clients to narrow down the amount of data returned as a CAR,