-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support extraction of unixfs content stored in car files #263
Conversation
willscott
commented
Nov 4, 2021
•
edited
Loading
edited
- support symlinks
- round-trip (create then extract) tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small blocker on error handling when making directory otherwise LGTM.
Left suggestions.
Naively writing out files with |
Okay. There's better accounting to not get in trouble with symlinks or paths outside of the chosen directory now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am mostly curious if you intended for this code not to support UnixFS files as roots? It appears that way unless I'm missing something.
|
||
ls := cidlink.DefaultLinkSystem() | ||
ls.TrustedStorage = true | ||
ls.StorageReadOpener = func(_ ipld.LinkContext, l ipld.Link) (io.Reader, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
graphsync's storeutil.LinkSystemForBlockstore does this quite pleasantly -- sets trusted storage, adds a good storage read opener, even includes some optimizations for bytes.Buffer. We should really, really consider putting this in ipld-prime, or somewhere that's obviously accessable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a glue module for this coming in ipld/go-ipld-prime#279 (which I haven't merged yet, but don't actually know what I'm waiting for, either), so we could use that shortly.
cmd/car/car.go
Outdated
{ | ||
Name: "extract", | ||
Aliases: []string{"x"}, | ||
Usage: "Extract the contents of a car", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code, it looks like if you give anything other than a car where the roots are DagPB/Raw, it will error. Perhaps it is worth putting a comment like this ("Extracts the contents of a car when the car encodes UnixFS data")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should think about other conventions that we should try to interpret, like hamts, structured data, etc.
} | ||
pbnode := pbn.(dagpb.PBNode) | ||
|
||
ufn, err := unixfsnode.Reify(ipld.LinkContext{}, pbnode, ls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not using a the Reify feature of LinkSystem for some reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because in extractDir
we look at the dagpb level data in understanding how and when to do the reification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have not entirely wrapped my head around all the code here, but it's possible that you could use NodeReifier
's LinkContext
parameter to pass down relevant hints, either via the .LinkPath
or .LinkNode
or .ParentNode
properties. I'd think one of those should provide enough info for the decision making. Unless the relevant info is super nonlocal.
I have not analyzed if that would make this code easier to reason about or reuse; just offering the possibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be a new coordination with unixfsnode somehow that seems harder to reason about though. I think this is a reasonable setup for applying the reification explicitly when needed.
return nil | ||
} | ||
|
||
pbn, err := ls.Load(ipld.LinkContext{}, cidlink.Link{Cid: root}, dagpb.Type.PBNode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code seems to assume that if you don't have a raw node, you have a root that is a UnixFS Directory, not a regular file (even though extractFile exists as a function). Is this the intent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is generally the case. if i export a single file it'll be implicitly in a directory in order to provide the file name of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw, to compare to other unix stuff, tar
commands do allow single file no-dir contents, iirc. Rare and a bit arcane, but possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay for that to go in a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems to address the concern I had about root escaping, so all paths should strictly only come under the requested output directory