decompressor_tgz: handle symlinks and links (hardlinks) #37

Closed
@pearkes wants to merge 2 commits

@pearkes pearkes commented Oct 20, 2016

This adds support for unpacking symlinks and hard links from the tar.gz archives that go-getter processes. Previously, these entries were copied like regular files, so their empty contents in the tar.gz produced empty files on the destination file system.

I have a few questions:

  1. Is this the right approach for hard links? I assume it won't always be possible, since the target needs to exist before linking; curious whether there are good patterns for dealing with that.
  2. How do we do this in other compressors?
  3. The table-driven tests only compare file contents, and only for single files. Comparing contents is necessary to ensure this functionality works as intended, so should I augment the TestDecompressCase test case to accept multiple MD5s, or write an individual test case outside the table-driven style?

This adds support for re-creating symlinks after untarring tgz
archives.
This adds support for re-creating links after untarring tgz
archives.
Member

@ryanuber ryanuber left a comment


@pearkes Looks like you are generally on the right path. I left a few comments which address 1) and 3). For 2), maybe we implement some internal helpers, like makeLinks(map[string]string) error to start. Then you just accumulate the mapping for hard links and pass it in after you've decompressed everything, and it can be applied to all the decompressors. It would make the tests easier as well since the linker could be tested in isolation.

if (hdr.Typeflag == tar.TypeLink) {
// If the type is a link ("hardlink") we re-write it and
// continue instead of attempting to copy the contents
if err := os.Link(hdr.Linkname, path); err != nil {
Member


I think you may need to accumulate the hard links and iterate over them after unpacking everything. If the hard link comes before the target lexically, this might not work. It's fine for symlinks since they are allowed to be broken. It would be worth it to try that in the test fixtures as well.

@@ -61,6 +61,22 @@ func TestTarGzipDecompressor(t *testing.T) {
multiplePaths,
"",
},

{
"with-symlinks.tar.gz",
Member


This test is still useful but we should definitely augment and check that the symlink actually got written as a symlink (instead of just contents being duplicated).

@schmichael
Member

👋

I started playing around with symlink, hardlink, and device node (:grimacing:) support in a fork because an approach we were taking for LXC in nomad required it. We ended up not needing it, though, so I just pushed the unfinished go-getter branch for posterity.

At first I was afraid supporting links posed a security issue for nomad* -- however it's no different from the existing security issue of absolute paths in archives.

So! Here's my thinking:

  1. Ship this once you fix @ryanuber's comments
  2. Treat tars being able to extract files anywhere on the filesystem as a separate issue/PR to be addressed as needed (same with device nodes which I think require a root user or special capability and will probably only ever be needed in nomad)
  3. Either in this PR or another, refactor tar handling into a shared function that's given an io.Reader wrapped in the appropriate decompressor.

* eg tar contains symlink foo to /etc and then extracts foo/passwd

@langmartin
Contributor

Closing because this PR has been replaced by #192

@langmartin langmartin closed this Nov 13, 2019
@langmartin langmartin deleted the links branch November 13, 2019 19:24