Ensure files and links have missing parent dirs created. Support hard links. #1179

Open
wants to merge 1 commit into
base: main

@komish komish commented Jun 26, 2024

This PR addresses two issues:

  • When extracting the image tarball to the filesystem, we would fail if a file (e.g. etc/foo.conf) was created before its parent directory (e.g. etc/). This has been fairly uncommon, but for cases where this happens, we'll now create the directory if it's missing.

  • We did not handle hard links before, and now we do. We also failed to handle some links properly: links pointing to a path relative to the link itself were not resolved correctly. This PR resolves the link path, and resolves the original file's path relative to the link path when it's a relative reference (e.g. ../). All links are still created using full paths.

openshift-ci bot commented Jun 26, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: komish


@openshift-ci openshift-ci bot added the approved label Jun 26, 2024
}

linkDir := filepath.Dir(newname)
return filepath.Join(linkDir, oldname), newname

I think this is incorrect for hard links: their targets are relative to the root directory of the layer archive. For confirmation, an example of an image with such hard links is docker.io/library/busybox.

Contributor

I'm not sure why busybox would be relevant here, since all our images must be based off of UBI...but if you look at busybox...it doesn't have an os-release file, hence its failure when run against preflight.

~/busybox/etc 
❯ ll
total 20
-rw-r--r--. 1 acornett acornett 306 May 18  2023 group
-rw-r--r--. 1 acornett acornett 114 May 18  2023 localtime
drwxr-xr-x. 1 acornett acornett  82 May 18  2023 network
-rw-r--r--. 1 acornett acornett 494 May 18  2023 nsswitch.conf
-rw-r--r--. 1 acornett acornett 340 May 18  2023 passwd
-rw-------. 1 acornett acornett 136 May 18  2023 shadow

~/busybox 
❯ find . -name os-release
(no-output)


The busybox image is relevant in that it provides an example of how hard links are stored in tar archives and container image layers, and it was an image that I remembered included hard links in its layers. If it has to be UBI-based to matter, feel free to build this and examine the added layers:

FROM registry.access.redhat.com/ubi8
RUN echo hello > /usr/local/hello.txt
RUN ln /usr/local/hello.txt /usr/local/hello2.txt
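
One way to examine how links are stored is to walk the layer with Go's archive/tar and print each link entry's Name and Linkname. The sketch below is only an illustration: `listLinks`, `linkEntry`, and `sampleLayer` are hypothetical names, and the in-memory archive is hand-built to mimic the shape described in this thread (a hard link whose Linkname is relative to the archive root) rather than captured from a real image layer.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// linkEntry records one link found in an archive (hypothetical type).
type linkEntry struct {
	Name     string // path of the link inside the archive
	Linkname string // stored target, exactly as written in the header
	Hard     bool   // true for tar.TypeLink, false for tar.TypeSymlink
}

// listLinks returns every hard link and symbolic link entry read from r.
func listLinks(r io.Reader) ([]linkEntry, error) {
	var links []linkEntry
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return links, nil
		}
		if err != nil {
			return nil, err
		}
		switch hdr.Typeflag {
		case tar.TypeLink:
			links = append(links, linkEntry{hdr.Name, hdr.Linkname, true})
		case tar.TypeSymlink:
			links = append(links, linkEntry{hdr.Name, hdr.Linkname, false})
		}
	}
}

// sampleLayer builds a small in-memory archive for demonstration only.
// The paths are assumptions modeled on the Dockerfile above.
func sampleLayer() (*bytes.Buffer, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	body := []byte("hello\n")
	if err := tw.WriteHeader(&tar.Header{
		Name: "usr/local/hello.txt", Typeflag: tar.TypeReg,
		Mode: 0o644, Size: int64(len(body)),
	}); err != nil {
		return nil, err
	}
	if _, err := tw.Write(body); err != nil {
		return nil, err
	}
	if err := tw.WriteHeader(&tar.Header{
		Name: "usr/local/hello2.txt", Typeflag: tar.TypeLink,
		Linkname: "usr/local/hello.txt", Mode: 0o644,
	}); err != nil {
		return nil, err
	}
	return &buf, tw.Close()
}

func main() {
	layer, err := sampleLayer()
	if err != nil {
		panic(err)
	}
	links, err := listLinks(layer)
	if err != nil {
		panic(err)
	}
	for _, l := range links {
		fmt.Printf("name=%s linkname=%s hard=%v\n", l.Name, l.Linkname, l.Hard)
	}
}
```

Pointing `listLinks` at a file opened from an exported, uncompressed layer tarball should show what a real image stores.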

Contributor Author

👍 @nalind, thanks. I misread my test tar archives - I only treated link origins with .. as relative, and everything else as absolute (or, rather, relative to /). The inverse seems to be what I want (everything not explicitly absolute should be treated as a relative path).

Latest push should implement this.


I think what was previously done (passing the value from the Linkname field to os.Symlink()) was correct for symbolic links. I've only seen "Linkname values are always relative to the root of the archive" with hard links.

Contributor Author

@nalind I believe this resolves correctly now for both symlinks and hardlinks. From what I see, an image that looks like this:

FROM registry.access.redhat.com/ubi8

RUN ln -v /etc/os-release /os-release
RUN ln -sf /etc/os-release /os-release-soft

Has no issue linking the symlinks and hard links. E.g.

[root@2656a86bb595 fs]# ls -l
total 24
lrwxrwxrwx. 1 root root   35 Jun 27 15:52 bin -> /tmp/preflight-472685243/fs/usr/bin
drwxr-xr-x. 1 root root    0 Jun 27 15:52 boot
drwxr-xr-x. 1 root root    0 Jun 27 15:52 dev
drwxr-xr-x. 1 root root 1842 Jun 27 15:52 etc
-rw-r--r--. 1 root root    0 Jun 27 15:52 foo
drwxr-xr-x. 1 root root    0 Jun 27 15:52 home
lrwxrwxrwx. 1 root root   35 Jun 27 15:52 lib -> /tmp/preflight-472685243/fs/usr/lib
lrwxrwxrwx. 1 root root   37 Jun 27 15:52 lib64 -> /tmp/preflight-472685243/fs/usr/lib64
drwxr-xr-x. 1 root root    0 Jun 27 15:52 lost+found
drwxr-xr-x. 1 root root    0 Jun 27 15:52 media
drwxr-xr-x. 1 root root    0 Jun 27 15:52 mnt
drwxr-xr-x. 1 root root    0 Jun 27 15:52 opt
lrwxrwxrwx. 1 root root   46 Jun 27 15:52 os-release -> /tmp/preflight-472685243/fs/usr/lib/os-release
lrwxrwxrwx. 1 root root   42 Jun 27 15:52 os-release-soft -> /tmp/preflight-472685243/fs/etc/os-release
drwxr-xr-x. 1 root root    0 Jun 27 15:52 proc
drwxr-xr-x. 1 root root  254 Jun 27 15:52 root
drwxr-xr-x. 1 root root    8 Jun 27 15:52 run
lrwxrwxrwx. 1 root root   36 Jun 27 15:52 sbin -> /tmp/preflight-472685243/fs/usr/sbin
drwxr-xr-x. 1 root root    0 Jun 27 15:52 srv
drwxr-xr-x. 1 root root    0 Jun 27 15:52 sys
drwxr-xr-x. 1 root root   72 Jun 27 15:52 tmp
drwxr-xr-x. 1 root root  100 Jun 27 15:52 usr
drwxr-xr-x. 1 root root  166 Jun 27 15:52 var

You might notice /os-release is hardlinked to /etc/os-release, which itself is a link. I see some issues with links to links to links (spanning link types), but GNU tar seems to exhibit the same issue, and the running container based on the image also sees the broken link. To that end, I'm thinking this is close enough to what we expect to be written to disk for most tasks Preflight needs to execute.

Let me know if you see something different.


I really think that trying to mix the handling of symbolic links with the handling of hard links is adding more complexity than it's removing. For symbolic links, the oldname to pass to os.Symlink() is just the Linkname. For hard links and os.Link(), it's filepath.Join(dst, header.Linkname).

If it's possible that any of the images being analyzed will attempt tricks by having the Linkname refer to something outside of dst, it can get more complicated than that, but I don't know if that's a concern here.

Contributor Author

Fair enough. I've split the two, and as discussed, added a few safeguards around writing symlinks that may already be broken in the archive, or that may point to things outside of our extraction base directory.

I was unable to get rid of the link resolution code even with splitting it out, but overall I think the hardlink logic functions a bit better (I saw a hardlink testcase resolve that was previously broken).

PTAL.
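
For context, one possible shape for such a safeguard is a purely lexical check of where a symlink would point once written. This is an assumption about the general approach, not the code pushed in this PR; `symlinkEscapes` is a hypothetical name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// symlinkEscapes reports whether a symlink written at linkPath with the given
// target would point outside base. Relative targets are resolved against the
// directory containing the link. The check is lexical only: it does not
// follow symlinks that already exist on disk.
func symlinkEscapes(base, linkPath, target string) bool {
	resolved := target
	if !filepath.IsAbs(target) {
		resolved = filepath.Join(filepath.Dir(linkPath), target)
	}
	rel, err := filepath.Rel(base, filepath.Clean(resolved))
	if err != nil {
		// Treat paths that cannot be compared as escaping.
		return true
	}
	return rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	base := "/tmp/preflight-example/fs" // hypothetical extraction directory

	// Relative target that stays inside base.
	fmt.Println(symlinkEscapes(base, base+"/bin", "usr/bin"))
	// Relative target that climbs out of base.
	fmt.Println(symlinkEscapes(base, base+"/etc/x", "../../../../etc/passwd"))
	// Absolute target: on the extraction host this refers to the host's own
	// path, so the lexical check reports it as escaping.
	fmt.Println(symlinkEscapes(base, base+"/os-release-soft", "/etc/os-release"))
}
```

What to do with an escaping link (skip it, rewrite it under the base directory, or fail) is a separate policy decision that this sketch leaves open.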

dcibot commented Jun 26, 2024

@komish komish force-pushed the fix-untar-link-and-file-res branch from 69a1942 to 8ed7767 Compare June 27, 2024 16:12
@coveralls

Coverage Status

coverage: 84.593% (-0.2%) from 84.79%
when pulling 8ed7767 on komish:fix-untar-link-and-file-res
into 282a25b on redhat-openshift-ecosystem:main.

dcibot commented Jun 27, 2024

… links

Signed-off-by: Jose R. Gonzalez <komish@flutes.dev>
@komish komish force-pushed the fix-untar-link-and-file-res branch from 8ed7767 to d4dbf03 Compare June 27, 2024 21:49
@coveralls

Coverage Status

coverage: 84.352% (-0.4%) from 84.79%
when pulling d4dbf03 on komish:fix-untar-link-and-file-res
into 282a25b on redhat-openshift-ecosystem:main.

dcibot commented Jun 27, 2024

@ramperher

from change #1179:

Hi, a fix is coming for the issue that appeared in the last DCI job. If you launch a new job and it fails again here, you can add the following to your PR description to pull in the fix (I'm mentioning this because I don't have permission to edit the PR description in this repo):

build-depends: 31882

Thanks!
