(WIP) feat: dag import and export to and from CAR files #2953
Regarding the xkcd archive I get the following in go-ipfs:
Would be nice to double-check that we converge bit-for-bit.
👍
Can someone in charge of js-ipfs tell me whether it's worth pursuing this PR further to match functionality (I think it's pretty close) and reproduce the excellent tests that Peter has included over there? There are objections in #2745 about code bloat and I don't want to waste time on something that might be rejected on that basis. @achingbrain maybe?
I've updated datastore-car upstream to handle this, and I've imported the sharness test from go-ipfs that covers this, but it seems we're not set up to run sharness in a way that is fully compatible with go-ipfs.
reset_blockstore 0
reset_blockstore 1

mkfifo pipe_testnet
mkfifo pipe_devnet

test_expect_success "fifo import" '
  (
    cat ../t0054-dag-car-import-export-data/lotus_testnet_export_128_shuffled_nulroot.car > pipe_testnet &
    cat ../t0054-dag-car-import-export-data/lotus_devnet_genesis_shuffled_nulroot.car > pipe_devnet &

    do_import 0 \
      pipe_testnet \
      pipe_devnet \
      ../t0054-dag-car-import-export-data/combined_naked_roots_genesis_and_128.car \
    > basic_fifo_import_actual
    result=$?

    wait || true # work around possible trigger of a bash bug on overloaded circleci
    exit "$result"
  )
'
@rvagg the exercise of the GC-lock and the import of FIFOs may not be something you are too interested in testing within js-ipfs. Just raising it here as it was a very important part of making 🗡️ viable.
// TODO: ^ go-car currently attempts to pin roots even if they don't exist in
// the CAR body, need to align behaviour
@rvagg this is done in go-ipfs to allow a copy-less "transactional" operation:
<some source> | stream-dagger <many options> --emit-stdout=car-v0-fifos-xargs | xargs -0 ipfs dag import
What that mode does is print 2 fifo names on stdout. The first fifo contains all the data. The second contains the roots only (because we can derive the roots only once we have streamed everything over). The full `dag import` context serves as a "transaction" of sorts, keeping GC at bay between the lengthy data stream and the pin at the very end.
Whether js-ipfs needs to support this at the same level as go-ipfs is an open question. /cc @mikeal
I'd wait to support this until a later date, if js-ipfs wants to prioritize it.
Yes please - sorry for the misdirection. The sharness tests here aren't run and should really be deleted. The testing strategy is (I really must put this in a doc):

CLI: Tests live in /packages/ipfs/test/cli. All interactions with IPFS core are stubbed so we just ensure that the correct arguments are passed in.

HTTP API: Tests live in /packages/ipfs/test/http-api and are similar to the CLI tests in that we stub out core interactions and inject requests with shot.

Core: Anything non-implementation specific should be considered part of the 'Core APIs'. For example node setup code is not Core, but anything that does useful work, e.g. network/repo/etc interactions would be Core. All Core APIs should be documented in /docs/core-api. All Core APIs should have comprehensive tests in /packages/interface-ipfs-core.

Non-Core: Any non-core API functionality is tested in /packages/ipfs-http-api/tests and /packages/ipfs/tests for
What's the status of this? Do we intend to finish it?
We have a replacement CAR library now, https://github.com/ipld/js-car, so that needs to be integrated here. It uses the new js-multiformats though, so it'll start bringing in some new stack pieces. Is this work needed though? I don't have a feel for whether this kind of parity with go-ipfs is even the goal these days?
had to start this again because so much has moved on, both here and in ipld/multiformats
Adds `ipfs.dag.import` and `ipfs.dag.export` commands to import/export CAR files, i.e. single-file archives that contain blocks and root CIDs.
Supersedes #2953
Fixes #2745
Co-authored-by: achingbrain <alex@achingbrain.net>
Closes: #2745
Ref: ipfs/kubo#7011
Ref: ipfs/kubo#6870
Disclaimer: this is my first hack directly on this repo, so I've had a steep learning curve today figuring this out and I wouldn't be surprised if I have some things very wrong! Guidance gratefully accepted, or perhaps someone else would like to take ownership of this?
I thought it would be nice to widen the conversation about the new `ipfs dag export` and `ipfs dag import` commands being added in go-ipfs. This is approximate parity, minus some minor details (mostly noted in the code). No tests here, it would be good to share the same fixture suite as @ribasushi is building for go-car.

Because datastore-car uses the newer `@ipld/block`, that gets imported here (export uses a different `Block` than import). That creates a bit of space to have duplicated dependencies so that'd be worth checking on and trying to minimise.

Notes:

- `export` supports only a single root CID and applies a full DAG walk to it. In the future it's expected that you'll also apply a selector to it (default would be like a `*`) as well and CARv2 would even store that selector with the root. It produces a "well-formed" CAR, that's deterministic in theory. We need to test the limits of that determinism but assuming we're walking the graph in the same order then it should be identical each time you run it with the same root in js-ipfs or go-ipfs.
- `import` is lax, by design, it accepts a CAR as a "bundle of blocks", perhaps with no root, perhaps with many roots, perhaps with roots that don't even exist in the CAR body. It just dumps the blocks into your store. Where there's a root specified that's found in the body, it'll pin that root in your store. I suspect some details here will be refined in go-car and need to be synced here. For one: go-car currently doesn't accept zero-root CAR files but JS does.

Examples:
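To make the two behaviours in the notes concrete, here is a minimal in-memory sketch. It is not the code in this PR: the block shape (`{ cid, links }` with string CIDs), the `exportCar` / `importCar` names, and the depth-first, link-order traversal are all assumptions made for illustration. Real blocks are binary and the real walk order is whatever go-car and datastore-car agree on.

```javascript
// Hypothetical model: a block is { cid: string, links: string[] } and a
// "CAR" is { roots: string[], blocks: Block[] }. Not the real wire format.

// Export: walk the DAG from a single root and emit each block exactly once.
// Depth-first in link order is an assumption; any fixed order would be
// deterministic for the same root and the same graph.
function exportCar (rootCid, store) {
  const seen = new Set()
  const blocks = []
  const visit = (cid) => {
    if (seen.has(cid)) return
    seen.add(cid)
    const block = store.get(cid)
    if (!block) throw new Error(`missing block ${cid}`)
    blocks.push(block)
    for (const link of block.links) visit(link)
  }
  visit(rootCid)
  return { roots: [rootCid], blocks }
}

// Import: treat the CAR as a bundle of blocks. Store everything, then pin
// only those roots that were actually present in the body.
function importCar (car, store, pins) {
  const inBody = new Set()
  for (const block of car.blocks) {
    store.set(block.cid, block)
    inBody.add(block.cid)
  }
  return car.roots.map((root) => {
    const pinned = inBody.has(root)
    if (pinned) pins.add(root)
    return { root, pinned }
  })
}

// Usage: a small diamond-shaped DAG where two children share one leaf.
const source = new Map([
  ['root', { cid: 'root', links: ['a', 'b'] }],
  ['a', { cid: 'a', links: ['shared'] }],
  ['b', { cid: 'b', links: ['shared'] }],
  ['shared', { cid: 'shared', links: [] }]
])
const car = exportCar('root', source) // block order: root, a, shared, b
car.roots.push('absent') // simulate a root that is not in the CAR body

const target = new Map()
const pins = new Set()
const results = importCar(car, target, pins)
console.log(results)
```

In this sketch a root that is missing from the body is reported but not pinned and does not fail the import, which matches the lax behaviour described above. A CAR with an empty `roots` array would simply store its blocks and pin nothing.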
export:
Will result in a 108M file named xkcd.car containing a UnixFS mirror of XKCD from some point in time.
import:
lazy simulation of multi file import:
import from stdin: