
Runtime of oci_image is proportional to the total size of all tars plus the base image #365

Closed
wskinner-dd opened this issue Sep 19, 2023 · 9 comments
Labels
enhancement New feature or request performance

Comments

@wskinner-dd

wskinner-dd commented Sep 19, 2023

The time required to build oci_image seems to be proportional to the total size of the tars plus the size of the base image. This was surprising to me: I expected oci_image to take time proportional to the size of only the modified or added layers. After reading through the source code and running some tests, the cause appears to be related to how crane mutate is used.

Are these runtime characteristics an intended or necessary property of rules_oci or the OCI spec? Or are they due to implementation details of rules_oci and its dependencies like crane?
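For context, the OCI image format itself doesn't seem to require this: a manifest references layers by digest, so appending a layer should in principle only require hashing the new tar and rewriting the (small) manifest and config JSON. A minimal sketch of that idea, using an illustrative in-memory manifest rather than any rules_oci or crane API:

```python
import hashlib

def descriptor(blob: bytes, media_type: str) -> dict:
    # An OCI content descriptor: media type, sha256 digest, and size.
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }

def append_layer(manifest: dict, layer_tar: bytes) -> dict:
    # Appending a layer touches only the new blob and the manifest JSON;
    # existing base layers are referenced by digest and never re-read.
    # (The image config's rootfs.diff_ids would also gain one entry,
    # again requiring a hash over the new layer only; omitted here.)
    new = dict(manifest)
    new["layers"] = manifest["layers"] + [
        descriptor(layer_tar, "application/vnd.oci.image.layer.v1.tar")
    ]
    return new

base_manifest = {"schemaVersion": 2, "layers": []}
m = append_layer(base_manifest, b"big base layer ...")
m = append_layer(m, b"tiny app layer")
print(len(m["layers"]))  # 2
```

So the observed O(total size) behavior would be an implementation property of the build pipeline rather than something forced by the spec.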

How to reproduce the issue

I tested this behavior by comparing the build time of four scenarios:

  1. An image with a 21GB base image, one 3.6GB layer and one tiny layer.
  2. An image with a 21GB base and one tiny layer.
  3. An image with a 1GB base, one 3.6GB layer and one tiny layer.
  4. An image with a 1GB base and one tiny layer.

For each scenario, I built the target, then modified app.py and measured the time required to build again.

$ bazel build //:two_layer_big_image
INFO: Elapsed time: 58.190s, Critical Path: 58.13s

$ bazel build //:one_layer_big_image
INFO: Elapsed time: 21.697s, Critical Path: 21.65s

$ bazel build //:two_layer_small_image
INFO: Elapsed time: 48.928s, Critical Path: 48.88s

$ bazel build //:one_layer_small_image
INFO: Elapsed time: 1.089s, Critical Path: 1.04s 

BUILD

pkg_tar(
    name = "app_tar",
    srcs = ["//:app.py"],
)

pkg_tar(
    name = "big_file",
    srcs = ["//:ubuntu-22.04.3-desktop-amd64.iso"],
)

oci_image(
    name = "two_layer_small_image",
    base = "@python_311",
    tars = [
        ":big_file",
        ":app_tar",
    ],
)

oci_image(
    name = "one_layer_small_image",
    base = "@python_311",
    tars = [
        ":app_tar",
    ],
)

oci_image(
    name = "two_layer_big_image",
    base = "@nct_image",
    tars = [
        ":big_file",
        ":app_tar",
    ],
)

oci_image(
    name = "one_layer_big_image",
    base = "@nct_image",
    tars = [
        ":app_tar",
    ],
)
@wskinner-dd

After reading through the source code, I think I understand the reason this is slower than I expected. Building an oci_image basically does four things:

  • crane push
  • crane mutate --append <tar 1> --append <tar 2> ... --append <tar n>
  • crane config
  • crane pull

I wonder if there may be room to speed up the push and mutate steps. If zot's blob cache can be reused across build invocations, push becomes nearly instantaneous (this won't work with the crane registry).

For crane mutate, it should in theory be possible to append each layer in order and cache the results. So if tar n changes, we could just append it to the cached result of a previous crane mutate --append <tar n - 1>.

@thesayyn
Collaborator

#324 is going to solve this. I can't continue my work on it right now; I'm packed with other stuff.

@thesayyn
Collaborator

Blocked by bazelbuild/bazel#20891

@thesayyn
Collaborator

thesayyn commented Feb 8, 2024

Working on it at bazelbuild/bazel#21263

@thesayyn
Collaborator

thesayyn commented May 8, 2024

This is fixed by a combination of #560, #559, and #558.

@thesayyn thesayyn closed this as completed May 8, 2024
@brett-patterson-ent

@thesayyn Thank you for the great work here! I was trying these changes out from the tip of the 2.x branch and we are seeing huge performance improvements.

I did run into one issue I wanted to bring up; I understand these changes aren't officially released yet, so apologies if this is a known issue. When trying to include tar layers that are generated from a repository rule, the generated image.sh script fails to find them:

realpath: bazel-out/k8-fastbuild/bin/external/_main~debian_packages~apt__jammy_amd64__libgomp1_12_u_20220319_u_1ubuntu1/layer.tar: No such file or directory

I was able to fix this with a patch to include the layers as inputs to the action, even when symlinking:

diff --git a/oci/private/image.bzl b/oci/private/image.bzl
index 93a6668..bca45a6 100644
--- a/oci/private/image.bzl
+++ b/oci/private/image.bzl
@@ -175,8 +175,7 @@ def _oci_image_impl(ctx):
     # If tree artifact symlinks are supported just add tars into runfiles.
     if use_symlinks:
         transitive_inputs = transitive_inputs + ctx.files.tars
-    else:
-        inputs = inputs + ctx.files.tars
+    inputs = inputs + ctx.files.tars

     # add layers
     for (i, layer) in enumerate(ctx.files.tars):

Is this an appropriate fix or am I doing something wrong on my end?

@thesayyn
Collaborator

thesayyn commented May 9, 2024

It's very possible that we broke something. Could you please file an issue with a reproducer? I'd like to take a look.

@brett-patterson-ent

It's very possible that we broke something. Could you please file an issue with a reproducer? I'd like to take a look.

Filed #566, thanks!

@thesayyn
Collaborator

Is this an appropriate fix or am I doing something wrong on my end?

Principled fix is here: #567
