This repository has been archived by the owner on Jun 5, 2024. It is now read-only.

Rework how we cache APKs #77

Merged: 2 commits, Jul 8, 2023

@jonjohnsonjr (Contributor) commented Jul 7, 2023

Fixes chainguard-dev/apko#772
Fixes chainguard-dev/apko#773
Fixes chainguard-dev/apko#780

Rather than caching the entire APK, we will cache the APK sections separately. It is trivial to recombine them using cat to produce the exact original APK, so we don't lose any data.
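
The recombination claim can be checked with a small sketch. It is written in Python rather than apko's Go, it uses stand-in payloads instead of real tar segments, and `split_gzip_members` is a hypothetical helper, not an apko function. It assumes only what the description implies: an APK is several gzip streams stored back to back. Finding each boundary requires inflating the stream, which is why splitting is the costly step, while joining is plain byte concatenation (what cat does).

```python
import gzip
import zlib


def split_gzip_members(blob: bytes) -> list:
    """Split back-to-back gzip streams into one bytes object per stream."""
    members = []
    while blob:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        inflater = zlib.decompressobj(wbits=31)
        inflater.decompress(blob)
        if not inflater.eof:
            raise ValueError("truncated gzip stream")
        # Bytes after the end of this stream belong to the next one.
        rest = inflater.unused_data
        members.append(blob[: len(blob) - len(rest)])
        blob = rest
    return members


# Stand-ins for the sections of an APK (hypothetical contents).
sections = [
    gzip.compress(b"signature bytes"),
    gzip.compress(b"control bytes"),
    gzip.compress(b"data bytes"),
]
apk_like = b"".join(sections)

parts = split_gzip_members(apk_like)
assert parts == sections              # split recovers each section exactly
assert b"".join(parts) == apk_like    # concatenation restores the original
```

Because the split is made on byte boundaries and nothing is recompressed, the rejoined file is bit-for-bit identical to the input.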

Doing this gives us two useful things:

  1. Splitting the APK is relatively expensive, because we have to parse the tar.gz stream to know exactly where to split. We can amortize this across builds by doing it once per APK.
  2. The individual sections have content hashes in the APK data model, so we can use those hashes as keys in the cache (filenames). When we write to the cache, we compute the hashes ourselves, so we get cache invalidation for free.
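
A minimal sketch of a cache keyed by content hash, in Python with hypothetical names (`SectionCache`, `put`, `get`); it is not apko's implementation. SHA-256 and the `.tar.gz` filename suffix are assumptions made for illustration, since the description does not say which hash each section uses. The point it shows is the invalidation argument: lookups use the hash that the package metadata expects, so a changed package simply misses.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Optional


class SectionCache:
    """Stores each APK section in a file named after the hash of its bytes."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, digest: str) -> Path:
        return self.root / (digest + ".tar.gz")

    def put(self, section: bytes) -> str:
        # Hash the bytes we actually received instead of trusting a given key.
        digest = hashlib.sha256(section).hexdigest()
        path = self._path(digest)
        if not path.exists():
            # Write to a temp file, then rename, so readers never see a partial file.
            fd, tmp = tempfile.mkstemp(dir=str(self.root))
            with os.fdopen(fd, "wb") as f:
                f.write(section)
            os.replace(tmp, path)
        return digest

    def get(self, expected_digest: str) -> Optional[bytes]:
        # Look up by the hash the package metadata says we should have.
        try:
            return self._path(expected_digest).read_bytes()
        except FileNotFoundError:
            return None


with tempfile.TemporaryDirectory() as root:
    cache = SectionCache(root)
    old = b"control section, version 1"
    new = b"control section, version 2"
    key = cache.put(old)
    assert cache.get(key) == old
    # A republished package has a different expected hash, so the old entry is never hit.
    assert cache.get(hashlib.sha256(new).hexdigest()) is None
```

Nothing ever needs to be evicted for correctness in this scheme: a stale entry is unreachable because no current metadata refers to its hash.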

Signed-off-by: Jon Johnson <jon.johnson@chainguard.dev>

@jonjohnsonjr (Contributor, Author):

Cold before:

apko publish --keyring-append  --repository-append  --arch amd64    13.19s user 4.11s system 112% cpu 15.437 total

Cold after:

apko publish --keyring-append  --repository-append  --arch amd64    13.59s user 4.08s system 124% cpu 14.236 total

Notably, we aren't waiting for fetchPackage to finish writing the file to disk before we start running ExpandAPK, so the overall execution time drops (15.4s to 14.2s total) even though user and system CPU time are roughly unchanged.
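
The PR does not spell out how the fetch and the expansion are overlapped. One common way to get this effect is a tee: every chunk read from the download is copied to the cache file and handed to the decompressor in the same pass. The sketch below is Python with hypothetical names (`tee_chunks`, `expand_while_caching`) and in-memory buffers standing in for the network and the disk; it handles a single gzip stream only and is not apko's code.

```python
import gzip
import io
import zlib
from typing import BinaryIO, Iterator


def tee_chunks(src: BinaryIO, sink: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield chunks read from src, copying each one to sink on the way."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return
        sink.write(chunk)
        yield chunk


def expand_while_caching(src: BinaryIO, sink: BinaryIO, chunk_size: int = 64 * 1024) -> bytes:
    """Inflate one gzip stream from src while its raw bytes are written to sink."""
    inflater = zlib.decompressobj(wbits=31)  # 31 = gzip header and trailer
    out = bytearray()
    for chunk in tee_chunks(src, sink, chunk_size):
        # Decompression proceeds chunk by chunk; it never waits for the full file.
        out += inflater.decompress(chunk)
    out += inflater.flush()
    return bytes(out)


payload = b"package contents " * 1000
compressed = gzip.compress(payload)
cache_file = io.BytesIO()  # stand-in for the file on disk

result = expand_while_caching(io.BytesIO(compressed), cache_file, chunk_size=512)
assert result == payload                     # consumer saw the full content
assert cache_file.getvalue() == compressed   # cache holds the exact download
```

In Go, the standard library's `io.TeeReader` provides the same pattern.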

Warm before:

apko publish --keyring-append  --repository-append  --arch amd64    13.12s user 4.68s system 223% cpu 7.971 total

Warm after:

apko publish --keyring-append  --repository-append  --arch amd64    10.24s user 2.61s system 187% cpu 6.851 total

Notably, we no longer pay the price of ExpandAPK on cache hits, because we do that once on cache misses and cache the results.

We shave off about a second of execution time (7.97s to 6.85s total), and we also do ~20-30% less work gunzipping APKs.

jonjohnsonjr requested a review from mattmoor on July 7, 2023, 21:23
Review thread on pkg/apk/implementation_test.go (resolved)

This will make it possible to regenerate the Checksum byte slice if
things change for any reason.

Signed-off-by: Jon Johnson <jon.johnson@chainguard.dev>
jonjohnsonjr merged commit eda0bb8 into chainguard-dev:main on Jul 8, 2023