refactor(gw): faster dir listing - fetch child sizes in parallel #8888
Conversation
Sounds like a good idea, but let's wait for #8853 (review) to land first (and decide on default threshold there).
To avoid having too much on our plate, I'm marking this for go-ipfs 0.14.
// FIXME: Check above. The UnixFS files we're iterating
// (because we use the UnixFS API) should always support
// this.
IIRC we need to keep this check because we've seen dag-pb dirs that link to dag-cbor or dag-json CIDv1s.
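A minimal sketch of the defensive check being discussed; the types and names here are placeholders for illustration, not the actual gateway code. The point is that a dag-pb directory can link to dag-cbor or dag-json nodes, so not every child entry is guaranteed to expose a UnixFS size.

```go
package listing

import "fmt"

// sizer stands in for the subset of behaviour the size column relies on.
type sizer interface {
	Size() (int64, error)
}

// childSizeLabel returns a printable size, falling back to "?" for entries
// that don't expose one (e.g. links to non-UnixFS nodes).
func childSizeLabel(entry interface{}) string {
	s, ok := entry.(sizer)
	if !ok {
		return "?"
	}
	n, err := s.Size()
	if err != nil {
		return "?"
	}
	return fmt.Sprintf("%d B", n)
}
```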
@lidel Your call, but note that this PR wasn't just another performance optimization; it was meant to address the core of the issue (at least as I understand it). We probably didn't groom Steven's original issue description as much as we should have, and the principal objective is still a bit unclear to me; my takeaway was @alanshaw's concern about the GW timing out:
The core of the timeout is that we fetch directory entry metadata sequentially and without a per-entry timeout, as we still do in #8853 (there only with a threshold that skips the fetching entirely if there are too many entries). Note that this timeout will still happen even with a few entries if just one of them is missing:
Browsing http://localhost:8080/ipfs/QmYf9xX1TrQA3vxoQRkYWQT9ugWsdg7fihMg1w7ZLg2UiN even with 4 entries will still time out on the first one that is not available. The objective of this PR (not fully expanded before, sorry) is to attempt to fetch every entry's metadata in parallel, so a single unavailable entry doesn't stall the whole listing.
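A minimal sketch of that best-effort behaviour, assuming a hypothetical fetch callback rather than the actual gateway code: each child's size is fetched in its own goroutine with a per-entry timeout, so a missing block yields an unknown size instead of stalling the whole listing.

```go
package listing

import (
	"context"
	"sync"
	"time"
)

// childSizes returns a name->size map; entries that error or time out are
// simply absent and can be rendered as "?" by the caller.
func childSizes(ctx context.Context, names []string, fetch func(context.Context, string) (int64, error), perEntry time.Duration) map[string]int64 {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		sizes = make(map[string]int64, len(names))
	)
	for _, name := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			// Bound each lookup so one unavailable entry can't block the rest.
			cctx, cancel := context.WithTimeout(ctx, perEntry)
			defer cancel()
			if s, err := fetch(cctx, name); err == nil {
				mu.Lock()
				sizes[name] = s
				mu.Unlock()
			}
		}(name)
	}
	wg.Wait()
	return sizes
}
```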
Quick triage notes:

Two separate problems

In my mind the core issue reported was the unmovable reality of big directories requiring tens of thousands of child-block fetches: even when all blocks are reachable, and even if we did the parallel thing from this PR, a big enough dir would take ages and could hit a timeout on a caching/reverse proxy or the client. That is being addressed in #8853 by skipping the child fetch entirely for directories above some threshold (a rough sketch of that idea follows below).

This PR: parallel child size fetch

This PR is about an additional improvement that can land separately.
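For contrast with the parallel fetch above, here is a rough illustration of the threshold approach attributed to #8853; the names and the default value are assumptions made for this sketch only.

```go
package listing

import "fmt"

const fastDirIndexThreshold = 100 // illustrative only; the real default is decided in #8853

func renderDir(links []string) string {
	if len(links) > fastDirIndexThreshold {
		// Big directory: listing names only avoids tens of thousands of
		// block fetches that could exceed proxy/client timeouts.
		return fmt.Sprintf("%d entries (sizes omitted)", len(links))
	}
	// Small directory: sizes are cheap enough to fetch, ideally in parallel
	// as this PR proposes.
	return fmt.Sprintf("%d entries (sizes fetched)", len(links))
}
```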
@schomatis is the idea to do the listing in "best-effort" fashion: fetch things in parallel and display "?" next to child nodes that errored or timed out? Or just to add a parallel fetch to speed up slow sequential fetches? If it is the latter, perhaps we could simplify this code with simpler batching using […]
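One possible shape for that "simpler batching" idea; the exact mechanism suggested is cut off above, so errgroup with a concurrency limit is only an assumption used for illustration here, with a hypothetical fetch callback.

```go
package listing

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// batchedSizes fetches child sizes at most `limit` at a time and aborts the
// whole batch on the first error (a stricter policy than the best-effort
// variant sketched earlier in this thread).
func batchedSizes(ctx context.Context, names []string, fetch func(context.Context, string) (int64, error), limit int) ([]int64, error) {
	g, gctx := errgroup.WithContext(ctx)
	g.SetLimit(limit) // at most `limit` fetches in flight
	sizes := make([]int64, len(names))
	for i, name := range names {
		i, name := i, name // capture loop variables (pre-Go 1.22 semantics)
		g.Go(func() error {
			s, err := fetch(gctx, name)
			if err != nil {
				return err
			}
			sizes[i] = s
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return sizes, nil
}
```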
If this is the original issue, then this PR would add more complexity than value here. Closing then.
Broken and untested. Just to gather early feedback.
@lidel This is a variant of your optimization, fetching metadata in parallel to avoid the sequential stall. I need to work on it some more (document, test, encapsulate), but I wanted to check (a) whether this is a feature you're interested in and (b) whether this is an additional level of complexity you're comfortable with in this part of the code.