Not recording non-empty directories is problematic #103
Comments
This could be changed at the cost of two things:
I don't know if we're relying on the tarballs that we generate being reproducible anywhere. cc @staticfloat?
Oh, is there a umask set when you're doing this?
Nope, just the default
No, we don't need reproducibility yet. And if we did, we would likely check it through tree hashing, and not through SHA hashing.
Issue #103 points out that omitting directory entries for non-empty directories can confuse some tools that consume tarballs, including Docker, which applies overly restrictive permissions to directories that are not explicitly included in a tarball. This commit changes `Tar.create` and `Tar.rewrite` to produce tarballs which include explicit directory entries for all (non-root) directories. This changes Tar.jl's "canonical format", which is, by design, one-to-one with git tree hashes. However, it does not seem like anyone currently depends on that exact reproducibility, and it seems worth making this change in order to avoid confusing external consumers of our tarballs. Closes #103.
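To make the "explicit directory entries for all (non-root) directories" idea concrete, here is a minimal sketch in Python's standard `tarfile` module rather than in Tar.jl itself. It is not Tar.jl's implementation: the helper names, the plain lexicographic ordering, the `0o755`/`0o644` modes, and the example path `usr/bin/sh` are all assumptions made for illustration.

```python
import io
import posixpath
import tarfile


def entries_with_directories(file_paths):
    """Return (path, is_dir) pairs: every file plus each non-root ancestor directory.

    Sorting by path puts a directory before its contents, because a parent
    path is a strict prefix of its children's paths.
    """
    dirs = set()
    for path in file_paths:
        parent = posixpath.dirname(path)
        while parent:  # stops at the root, which gets no entry
            dirs.add(parent)
            parent = posixpath.dirname(parent)
    return sorted([(d, True) for d in dirs] + [(p, False) for p in file_paths])


def create_tarball(file_paths):
    """Write an in-memory tarball of empty files with explicit directory entries."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, is_dir in entries_with_directories(file_paths):
            info = tarfile.TarInfo(name=path)
            info.type = tarfile.DIRTYPE if is_dir else tarfile.REGTYPE
            info.mode = 0o755 if is_dir else 0o644  # illustrative modes
            tar.addfile(info)  # size 0, so no file data is needed
    return buf.getvalue()


def list_entries(tar_bytes):
    """Return (name, is_dir) for each member, in archive order."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        return [(m.name, m.isdir()) for m in tar.getmembers()]


if __name__ == "__main__":
    for name, is_dir in list_entries(create_tarball(["usr/bin/sh"])):
        print("d" if is_dir else "-", name)
```

With a single file `usr/bin/sh`, the resulting archive also carries entries for `usr` and `usr/bin`, so a consumer never has to invent metadata for them.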
Ok, I think we should go ahead and change this now, so I've made a PR which I'm going to merge.
Oh, in the process I discovered a set of obscure bugs which I fixed: #105. I discovered this by accidentally changing the tarballs we produce to include each directory after the contents of that directory, which none of the tarball consuming functions handled correctly, each broken in a slightly different way. So we inadvertently also made this package more robust.
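The comment above does not say how #105 fixed the consumers, so the following is only an illustration of the general problem, written with Python's standard `tarfile` module: a reader that builds the same tree whether a directory entry arrives before or after its contents. The helper names and the example paths are hypothetical, and the sketch assumes the archive never uses the same path as both a file and a directory.

```python
import io
import tarfile


def make_tarball(entries):
    """Build an in-memory tarball from (path, is_dir) pairs, in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, is_dir in entries:
            info = tarfile.TarInfo(name=path)
            info.type = tarfile.DIRTYPE if is_dir else tarfile.REGTYPE
            tar.addfile(info)  # empty files only, so no data stream is needed
    return buf.getvalue()


def build_tree(tar_bytes):
    """Return nested dicts for directories, with files mapped to None."""
    tree = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        for member in tar:
            parts = member.name.strip("/").split("/")
            node = tree
            for part in parts[:-1]:
                # Create missing ancestors on demand instead of assuming
                # that their entries were seen earlier in the stream.
                node = node.setdefault(part, {})
            if member.isdir():
                # setdefault keeps children that were already recorded.
                node.setdefault(parts[-1], {})
            else:
                node[parts[-1]] = None
    return tree


if __name__ == "__main__":
    dirs_first = make_tarball([("usr", True), ("usr/bin", True), ("usr/bin/sh", False)])
    dirs_last = make_tarball([("usr/bin/sh", False), ("usr/bin", True), ("usr", True)])
    print(build_tree(dirs_first) == build_tree(dirs_last))
```

The key choice is that a late directory entry never replaces a node that already exists; a consumer that resets the node on seeing the directory entry would silently drop the contents it had already read.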
I'm tarring a rootfs directory that looks as follows:
I'll be focusing on the `usr` directory, which contains more directories and files (i.e. it's not empty):
If I create a tarball with Tar.jl and with GNU tar, and then list each tarball's contents using `tar -tvf`, there are some differences. One difference is that Tar.jl's tarball doesn't list the `usr` and `bin` directories themselves, because it doesn't encode those directories' metadata.
GNU Tar:
Tar.jl:
As per the README, this is (currently) by design. That does introduce issues with some tools, though, so I figured I'd create an issue documenting those. One of those tools is Docker, whose `docker import` tool can read a rootfs tarball to create a layer. However, lacking an entry for those directories, the permissions on those directories are "wrong", essentially rendering the created image useless for non-root uses:
As you can see, the permissions of `usr` and `usr/bin` are too restrictive. At the same time, `usr/src` is fine because it's an empty directory which is part of the tarball. Maybe the 'only if empty' rule should be extended to all directories so that tools like `docker import` don't stumble on Tar.jl's tarballs?
Note that `docker import` is probably to blame here too, as unpacking Tar.jl's tarballs using GNU Tar does result in proper permissions (but with weird timestamps, as only `usr/src` was recorded by Tar.jl, with a zero timestamp):
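The gap described in this issue can be checked mechanically. The sketch below uses Python's standard `tarfile` module, not Tar.jl, to report directories that are implied by member paths but have no entry of their own. The helper names and the file `usr/bin/sh` are hypothetical; the directory names follow the example in this issue.

```python
import io
import posixpath
import tarfile


def make_tarball(entries):
    """Build an in-memory tarball from (path, is_dir) pairs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, is_dir in entries:
            info = tarfile.TarInfo(name=path)
            info.type = tarfile.DIRTYPE if is_dir else tarfile.REGTYPE
            tar.addfile(info)  # empty files only, so no data stream is needed
    return buf.getvalue()


def implicit_directories(tar_bytes):
    """Return directories implied by member paths that lack their own entry."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        members = tar.getmembers()
    recorded = {m.name.rstrip("/") for m in members if m.isdir()}
    implied = set()
    for m in members:
        parent = posixpath.dirname(m.name.rstrip("/"))
        while parent:  # walk up to, but not including, the root
            implied.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(implied - recorded)


if __name__ == "__main__":
    # Layout as described in the issue: only the empty directory is recorded.
    only_empty = make_tarball([("usr/bin/sh", False), ("usr/src", True)])
    print(implicit_directories(only_empty))  # ['usr', 'usr/bin']

    # Layout with every directory recorded explicitly.
    explicit = make_tarball(
        [("usr", True), ("usr/bin", True), ("usr/bin/sh", False), ("usr/src", True)]
    )
    print(implicit_directories(explicit))  # []
```

Any directory this reports is one whose permissions and timestamp the extracting tool has to make up, which is where the behavior of different consumers can diverge.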