Create tarballs with contents in deterministic order #210
Comments
I'm pretty sure you cannot reproduce the exact same aci by building it twice.
Of course it depends on the package manager you are using, but none that I'm aware of is designed to create exactly reproducible installs (byte for byte). It's too bad, but I don't think it's a big issue: the goal is to build an immutable aci that you will store, share and reuse. Rebuilding it in exactly the same way is useless in this case. Also note that dgr is designed to make the build reproducible at the layer it manages. That means that, apart from what you do inside build and builder that accesses external resources that may change, everything handled by dgr will be reproducible. Nothing on the host affects the build, since the build itself also runs inside a container.
Your point 3 is weird: the download of an aci is handled directly by rkt, which does not support such a thing. Using dependencies will reduce the size of the aci you are building.
Thank you for looking into this! The issue is really independent from Ubuntu or Debian, though you can create such builds using them.
If dgr created a tarball with reproducible order, independent of locale and system (as with the parameter suggested above), I could easily catch any remaining differences. My point (3) was written with remote storage in mind (think: a webserver serving the images), not any particular local storage such as rkt's. Regarding your description of building minimal layers: I don't see the connection to ordered files in ACIs. On the matter of minimal file sizes, you might indeed want to take a look at how I arrived at a minimal Ubuntu image for Docker, or examine how I build most of my other Docker images for that matter. ;-) This issue is strictly about two things:
I'm not against doing it; it should not be a lot of work on our side.
Thanks! I could fork and modify dgr for that, but I think it's small enough to not warrant that overhead, and would benefit more users than me.
If the manifest is the first file then tools do not need to download the whole image file in order to detect changes or updates. Sorted contents of ACIs allow for easier comparison, and usage of tools such as zsync, and deduplication on the server. The price, sorting by 'tar', is cheap. Parameter 'f' for 'tar' must be followed by the filename for recent versions of 'tar'. Mind the order! closes blablacar#210
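The argument order that the commit message warns about can be sketched as a small Go helper. This is only an illustration under assumptions: `tarArgs` is a hypothetical name, the member names `manifest` and `rootfs` follow the ACI layout, and the list is not taken from dgr's actual invocation. It relies on GNU tar processing members in command-line order, with `--sort=name` sorting directory entries during traversal.

```go
package main

import "fmt"

// tarArgs builds an argument list for GNU tar (>= 1.28) that writes the
// archive dest from the directory dir. Hypothetical helper for illustration;
// not dgr's real code.
func tarArgs(dest, dir string) []string {
	return []string{
		"--sort=name", // sort directory entries by name while traversing
		"-c",          // create a new archive
		"-f", dest,    // -f must be followed directly by the archive name
		"-C", dir,     // resolve the following members relative to dir
		"manifest",    // named first, so it is written first
		"rootfs",      // traversed afterwards, in sorted order
	}
}

func main() {
	// The resulting slice could be handed to exec.Command("tar", args...).
	fmt.Println(tarArgs("image.aci", "target/build"))
}
```

Keeping `-f` and the filename adjacent in the slice makes it hard to slip another flag between them by accident.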
Enables dgr to work on hosts that don't have any tar. (See also blablacar#217.) Sorts the contents to have a reproducible order, but pulls the manifest file to the front for fast access. Sorted contents of ACIs allow for easier comparison, and usage of tools such as zsync, and deduplication on the server. The price, sorting by 'tar', is cheap. zap the "chdir-acrobatique", use `tar -C`: dgr changes paths and performs needless renames, which result in mayhem if the process quits prematurely or the timing were off. The solution is to use tar's `-C` param and transform the filenames. closes blablacar#210
Contents of the `aci/tar` file are currently not ordered in any way. Yet that is needed for deterministic builds.

Please use `tar --sort=name` (available in tar version ≥1.28) when creating the target `image.aci`.

Or, even better: Please expose parameter lists for tar and gnupg so a user can add flags of their choice.

You'd need deterministic builds for:

1. … `--clamp-mtime` for that, too.
2. … `blitznote.com/ubuntu:16.04` and a Git repo `dgr-ubuntu` at a fixed commit, using said image and the build commands the image can be used to build itself to prove that nothing has been changed a posteriori. (This limits the scope of an audit to a few scripts.)
3. … `zsync` to only download differences between an old image and a recently updated one.