ipfs dag export is slower than ipfs cat #8004
Comments
Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review. Finally, remember to use https://discuss.ipfs.io if you just need general support.
FWIW, @jsign both cat and export have to do a DAG traversal from the blockstore (the CAR is not stored linearly on disk), and export has to write minimally more data, as it includes all blocks (not just leaf nodes) and the length prefixes for CAR blocks. One thing I'd be curious about -- what's the generated CAR size for the 5.0GB flat file?
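For reference, one quick way to answer that size question from the CLI is to stream the export and count bytes (the root CID below is a placeholder for whatever `ipfs add` returned):

```sh
# Byte size of the CAR stream for a given root CID, without writing it to disk
ipfs dag export <root-cid> | wc -c
```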
BTW, I just saw this and thought it was an interesting issue -- probably someone else on the IPFS team will be in charge of addressing it.
Yep, both cases are doing random reads, so that shouldn't be relevant to explaining the difference. Regarding the sizes: the original file is 5368709120 bytes and the CAR output is 5370748072 bytes, so a diff of ~2MiB (UnixFS + CAR overhead).
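As a quick sanity check on that diff, plain shell arithmetic on the two sizes above:

```sh
echo $((5370748072 - 5368709120))   # 2038952 bytes, i.e. ~1.94 MiB of UnixFS + CAR overhead
```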
Yeah, I'm stumped then -- I even went to look at the CAR export and cat traversal methods -- and it's not at all clear to me why one would be faster or slower.
@hannahhoward isn't this kinda-sorta related to ipld/go-ipld-prime#149, or a similar multiple-rehashing problem in the traversal somewhere?
@ribasushi unless I'm missing something, none of these should be running into the hash-on-read problem, so probably not?
Version information:
Description:
I'm trying to get a sense of the throughput of CAR exporting for a DAG, mainly motivated by data preparation for Filecoin onboarding. A quick `ipfs dag export` test on a cloud VM showed ~40MiB/s of throughput, which is pretty slow and most probably related to slow SSDs. To take disk out of the equation, I ran the following experiment with the IPFS repo mounted in RAM:
Term 1:
Term 2:
So, for a 5GiB random file:

- `ipfs add`: 8.4s (609MiB/s)
- `ipfs cat`: 12.09s (423MiB/s)
- `ipfs dag export`: 19.29s (265MiB/s)

These results seem odd, since `ipfs dag export` is slower than `ipfs cat`. Exporting the DAG should be doing less work than unpeeling the UnixFS layer to cat the file. Also, note that the IPFS repo is mounted in RAM, so these throughputs are best-case scenarios.
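For anyone trying to reproduce this, a minimal sketch of the RAM-backed setup follows; the tmpfs mount point, temp file path, and `dd` invocation are illustrative assumptions, not the exact commands from the terminals above:

```sh
# Term 1: run the daemon against a repo that lives on tmpfs,
# so disk speed only matters when `ipfs add` reads the source file.
export IPFS_PATH=/mnt/ramdisk/.ipfs   # assumes a tmpfs is already mounted at /mnt/ramdisk
ipfs init
ipfs daemon

# Term 2 (once the daemon is up): create a 5GiB random file, then time add/cat/export.
dd if=/dev/urandom of=/tmp/rand5g bs=1M count=5120
CID=$(time ipfs add -Q /tmp/rand5g)       # bash: -Q prints only the root CID; timing goes to stderr
time ipfs cat "$CID"        > /dev/null
time ipfs dag export "$CID" > /dev/null
```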
The IPFS config file is just the default created by a fresh `ipfs init` as shown above (so it uses FlatFS).

Other info about my box:
(only relevant for `ipfs add` and not the rest, since the repo is in RAM)

Extra test with a 1GiB file showing the same pattern:
Let me know if there's any pitfall I might be missing, or anything similar!