Handle rebasing images with empty layers properly #733

Merged 1 commit on Jul 6, 2020

@milas milas commented Jun 22, 2020

Image manifests (schema v2) do not include layer entries for empty
layers, but the history in the config does include an entry for each
empty layer (e.g. one produced by an ENV directive). History entries
for empty layers have the empty_layer field set to true.

As a result, when rebasing and copying from the new base and
original to the rebased image, it's not sufficient to iterate the
layers and pull the history entry at the same position/index from
the config. Instead, the history should be iterated and used to
generate each Addendum while the layers are iterated alongside it;
a layer is included (and the layer iterator advanced) only when the
history item is for a non-empty layer.

Since the history in the config is optional per the OCI spec,
any layers not consumed during the history pass are still added
by iterating over the remaining layers. (This should also guard
against a malformed history.)
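
The pairing described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `History`, `Addendum`, and `pairLayers` are simplified, hypothetical stand-ins (layers are plain strings here), chosen only to show how the layer index advances independently of the history index.

```go
package main

import "fmt"

// History is a simplified, hypothetical stand-in for one entry of the
// config's "history" array.
type History struct {
	CreatedBy  string
	EmptyLayer bool
}

// Addendum pairs a history entry with its layer. Layer is "" when the
// history entry describes an empty layer. (Hypothetical simplified type.)
type Addendum struct {
	Layer   string
	History History
}

// pairLayers walks history and layers together, advancing the layer index
// only for history entries that are not marked as empty.
func pairLayers(layers []string, history []History) []Addendum {
	var out []Addendum
	next := 0 // index of the next unconsumed layer
	for _, h := range history {
		a := Addendum{History: h}
		if !h.EmptyLayer && next < len(layers) {
			a.Layer = layers[next]
			next++
		}
		out = append(out, a)
	}
	// History is optional and may be malformed: append any layers the
	// history pass did not consume, without history metadata.
	for ; next < len(layers); next++ {
		out = append(out, Addendum{Layer: layers[next]})
	}
	return out
}

func main() {
	layers := []string{"base-fs", "test-txt"} // placeholder layer names
	history := []History{
		{CreatedBy: "ADD file in /"},
		{CreatedBy: "CMD [\"sh\"]", EmptyLayer: true},
		{CreatedBy: "ENV VALUE=test", EmptyLayer: true},
		{CreatedBy: "RUN echo > /test.txt"},
	}
	for _, a := range pairLayers(layers, history) {
		fmt.Printf("%-22s layer=%q\n", a.History.CreatedBy, a.Layer)
	}
}
```

With no history at all, the first loop does nothing and the fallback loop emits one entry per layer, which matches the optional-history case described above.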

Example

Imagine a simple Dockerfile:

FROM busybox@sha256:95cf004f559831017cdf4628aaf1bb30133677be8702a8c5f2994629f637a209

ENV VALUE="test"
RUN echo "${VALUE}" > /test.txt

Which results in the config:

{
  "architecture": "amd64",
  "config": {
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "VALUE=test"
    ],
    "Cmd": [
      "sh"
    ],
    "ArgsEscaped": true,
    "OnBuild": null
  },
  "created": "2020-06-22T16:58:15.423225002-04:00",
  "history": [
    {
      "created": "2020-06-02T21:19:57.100247864Z",
      "created_by": "/bin/sh -c #(nop) ADD file:a84c53d2fe5207d17b5110e1eeeff0ab5d7ca831d743070eab1e87dc74129049 in / "
    },
    {
      "created": "2020-06-02T21:19:57.279412246Z",
      "created_by": "/bin/sh -c #(nop)  CMD [\"sh\"]",
      "empty_layer": true
    },
    {
      "created": "2020-06-22T16:58:15.423225002-04:00",
      "created_by": "ENV VALUE=test",
      "comment": "buildkit.dockerfile.v0",
      "empty_layer": true
    },
    {
      "created": "2020-06-22T16:58:15.423225002-04:00",
      "created_by": "RUN /bin/sh -c echo \"${VALUE}\" > /test.txt # buildkit",
      "comment": "buildkit.dockerfile.v0"
    }
  ],
  "os": "linux",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:1be74353c3d0fd55fb5638a52953e6f1bc441e5b1710921db9ec2aa202725569",
      "sha256:3104c9e118b4c9fddbc67791130783194fdebe67de2d083156e4b3fc67c09d8a"
    ]
  }
}
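
To make the mismatch between history length and layer count concrete, here is a small sketch that decodes a config and counts history entries that carry a layer against `rootfs.diff_ids`. The `configFile` struct and `countLayers` helper are hypothetical and cover only the fields used here; the embedded JSON is an abbreviated version of the config above, with placeholder digests.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// configFile models only the parts of the image config used in this sketch.
type configFile struct {
	History []struct {
		CreatedBy  string `json:"created_by"`
		EmptyLayer bool   `json:"empty_layer"`
	} `json:"history"`
	RootFS struct {
		DiffIDs []string `json:"diff_ids"`
	} `json:"rootfs"`
}

// countLayers returns the number of history entries not marked empty_layer
// and the number of diff_ids in the config.
func countLayers(raw []byte) (int, int, error) {
	var cfg configFile
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return 0, 0, err
	}
	nonEmpty := 0
	for _, h := range cfg.History {
		if !h.EmptyLayer {
			nonEmpty++
		}
	}
	return nonEmpty, len(cfg.RootFS.DiffIDs), nil
}

// sampleConfig is abbreviated from the example config; digests are placeholders.
const sampleConfig = `{
  "history": [
    {"created_by": "ADD file in /"},
    {"created_by": "CMD [\"sh\"]", "empty_layer": true},
    {"created_by": "ENV VALUE=test", "empty_layer": true},
    {"created_by": "RUN echo > /test.txt"}
  ],
  "rootfs": {"type": "layers", "diff_ids": ["sha256:placeholder-1", "sha256:placeholder-2"]}
}`

func main() {
	nonEmpty, diffIDs, err := countLayers([]byte(sampleConfig))
	if err != nil {
		panic(err)
	}
	fmt.Printf("history entries with a layer: %d, diff_ids: %d\n", nonEmpty, diffIDs)
}
```

For this config there are four history entries but only two of them correspond to a layer, which is why indexing history by layer position picks the wrong entries.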

Now, rebase the image with crane to a different busybox base:

crane --insecure rebase \
  --original="localhost:5000/rebase-example:original" \
  --old_base="busybox@sha256:95cf004f559831017cdf4628aaf1bb30133677be8702a8c5f2994629f637a209" \
  --new_base="busybox@sha256:aaf6a895ebcb872d39306dbcda13739132674dfb5af43a9c527c5cdcb1c21e20" \
  --rebased="localhost:5000/rebase-example:rebased"

Comparison of the rebased image's config without this change (config.bad) and with it (config.good):

--- a/config.bad
+++ b/config.good
@@ -7,9 +7,20 @@
       "created_by": "/bin/sh -c #(nop) ADD file:5437654e1deb81bd4beeb1c4722443c5be1c2a259e46f51cf19e39698cc631ed in / "
     },
     {
-      "created": "2020-06-02T21:19:57.279412246Z",
+      "created": "2020-06-02T21:20:12.934903021Z",
       "created_by": "/bin/sh -c #(nop)  CMD [\"sh\"]",
       "empty_layer": true
+    },
+    {
+      "created": "2020-06-22T16:58:15.423225002-04:00",
+      "created_by": "ENV VALUE=test",
+      "comment": "buildkit.dockerfile.v0",
+      "empty_layer": true
+    },
+    {
+      "created": "2020-06-22T16:58:15.423225002-04:00",
+      "created_by": "RUN /bin/sh -c echo \"${VALUE}\" > /test.txt # buildkit",
+      "comment": "buildkit.dockerfile.v0"
     }
   ],
   "os": "",

Notice that the config produced without this change (config.bad) was not only missing the history items from "our" image, but also carried the wrong history item, taken from the old base image (in this case, only the created timestamp differs).

JFrog (Artifactory)

Besides messing with the metadata in the config, this actually causes problems with JFrog products (e.g. Artifactory, JFrog Container Registry):

2020/06/22 16:55:22 pushing localhost:8082/docker-local/rebase-example:bad: PUT http://localhost:8082/v2/docker-local/rebase-example/manifests/bad: MANIFEST_INVALID: manifest invalid; map[description:Circuit Breaker Threshold Reached, Breaking Operation. see log output for manifest details.]

The images cannot be pushed. Presumably the registry internally tries to match history entries with layers, skipping empty ones, so it cannot handle a config that has fewer non-empty history entries than layers.

These changes ensure that rebased images can be pushed to these registries and have the benefit of preserving history properly for the general case.

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.



ℹ️ Googlers: Go here for more info.

@milas milas force-pushed the rebase-handle-empty-layers branch from 499323a to 7983230 Compare June 22, 2020 21:58
@codecov-commenter

codecov-commenter commented Jun 22, 2020

Codecov Report

Merging #733 into master will decrease coverage by 0.04%.
The diff coverage is 72.97%.


@@            Coverage Diff             @@
##           master     #733      +/-   ##
==========================================
- Coverage   79.43%   79.39%   -0.05%     
==========================================
  Files         102      102              
  Lines        4668     4683      +15     
==========================================
+ Hits         3708     3718      +10     
- Misses        531      534       +3     
- Partials      429      431       +2     

| Impacted Files | Coverage Δ |
| --- | --- |
| pkg/v1/mutate/rebase.go | 50.00% <66.66%> (+2.17%) ⬆️ |
| pkg/v1/mutate/image.go | 72.51% <88.88%> (+0.64%) ⬆️ |
| pkg/v1/random/image.go | 82.60% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2a1a46d...b34061e.

@milas
Contributor Author

milas commented Jun 22, 2020

@googlebot I signed it!

@googlebot

CLAs look good, thanks!


Collaborator

@jonjohnsonjr jonjohnsonjr left a comment

Small typo, otherwise lgtm, though @imjasonh is more familiar with this code and might have some comments.

Review comment on pkg/v1/mutate/image.go (outdated, resolved)
@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.


Manifest v2 specs do not include layer entries for empty layers,
but the history in the config will include an entry for empty
layers (e.g. produced by `ENV` directive). History entries for
empty layers include `empty_layer` field set to `true`.

As a result, when rebasing and copying from the new base and
original to the rebased image, it's not sufficient to simply
iterate layers and pull from the same position/index in the
history from config. Instead, history should be iterated and
used to generate `Addendum`, while simultaneously iterating
layers, but only including the layer (and advancing the iterator)
if it's for a history item for a non-empty layer.

Since the history in config is optional per the OCI spec [1],
iteration via layers will still happen in the event that not
all layers are consumed during the history pass. (This should
also guard against a malformed history).

[1]: https://github.com/opencontainers/image-spec/blob/master/config.md
@milas milas force-pushed the rebase-handle-empty-layers branch from 4140ded to b34061e Compare June 22, 2020 23:26
@googlebot

CLAs look good, thanks!


@milas
Contributor Author

milas commented Jun 22, 2020

Fixed the typo + adjusted the test to include an empty layer.

Apologies for force-pushing after review had started. I made the CLA bot cranky with a different email address, so I rebased it down to one commit while fixing that.

@imjasonh
Collaborator

imjasonh commented Jul 6, 2020

Sorry for missing this PR, it seems reasonable to me. Thanks for the fix!

@jonjohnsonjr jonjohnsonjr merged commit 92c877e into google:master Jul 6, 2020
@jonjohnsonjr
Collaborator

Thanks!
