Introduce KOCACHE #269

Merged
merged 4 commits into from
Dec 8, 2021

Collaborator

@jonjohnsonjr jonjohnsonjr commented Dec 16, 2020

Start of #264

Currently, use this by setting KOCACHE=/tmp/ko (or any directory you want). We can default to doing this in the future.

We will cache binaries and metadata under $KOCACHE/bin/<import path>/<platform>:

$ tree /tmp/ko
/tmp/ko
`-- bin
    `-- github.com
        `-- google
            `-- ko
                `-- linux
                    |-- amd64
                    |   |-- buildid-to-diffid
                    |   |-- diffid-to-descriptor
                    |   `-- out
                    |-- arm
                    |   |-- buildid-to-diffid
                    |   |-- diffid-to-descriptor
                    |   `-- out
                    |-- arm64
                    |   |-- buildid-to-diffid
                    |   |-- diffid-to-descriptor
                    |   `-- out
                    |-- ppc64le
                    |   |-- buildid-to-diffid
                    |   |-- diffid-to-descriptor
                    |   `-- out
                    `-- s390x
                        |-- buildid-to-diffid
                        |-- diffid-to-descriptor
                        `-- out
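
As a rough illustration of the layout above, the per-platform cache directory can be derived from three inputs. This is only a sketch: binCacheDir is a hypothetical helper name and not necessarily how the PR builds the path.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// binCacheDir returns the directory that would hold the cached binary and
// its metadata for one import path and one platform.
// Hypothetical helper: the name and signature are illustrative only.
func binCacheDir(kocache, importPath, platform string) string {
	return filepath.Join(
		kocache,
		"bin",
		filepath.FromSlash(importPath),
		filepath.FromSlash(platform),
	)
}

func main() {
	dir := binCacheDir("/tmp/ko", "github.com/google/ko", "linux/amd64")
	// The three files that appear under each platform directory in the tree.
	for _, name := range []string{"out", "buildid-to-diffid", "diffid-to-descriptor"} {
		fmt.Println(filepath.Join(dir, name))
	}
}
```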

In buildid-to-diffid, we maintain a map from the binary's buildid to the diffid we produced from it:

$ cat /tmp/ko/bin/github.com/google/ko/linux/amd64/buildid-to-diffid 
{
  "DpZ76uEtG19-5zUMMduu/ZydApuKMqo03IenzJjld/E9kHVxb9IXx9H09h2JH0/vW8Nmsi20Na9bBDsFDYX": "sha256:fb701193c5e0cedb4291fb7308219cc95ec20afe04e24491be1cedf72c3f7a4a",
  "IL1S_KSUq5wIP90ZkKhZ/j-1pixYk9Al_YB-B02oM/QDXIdXeDJc-14xTSVCTB/7wkOkmLWlnCRO42sVTBS": "sha256:d7640b5ef558e77eb69caac1405be6729dde3c7a4c9ccb6ba7fb4ea3812858ca"
}

You can determine the buildid from the binary:

$ go tool buildid /tmp/ko/bin/github.com/google/ko/linux/amd64/out 
IL1S_KSUq5wIP90ZkKhZ/j-1pixYk9Al_YB-B02oM/QDXIdXeDJc-14xTSVCTB/7wkOkmLWlnCRO42sVTBS

This lets us skip computing the diffid (tar + sha256) entirely.
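
For illustration, the lookup could look roughly like the sketch below. It is not the code in this PR: lookupDiffID is a hypothetical name, and a real caller would read the file from the cache directory and take the build ID from the freshly built binary. The JSON entry is the one shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lookupDiffID parses the contents of a buildid-to-diffid file (a flat JSON
// object) and reports the diffid recorded for buildID, if any.
// Hypothetical helper, shown only to illustrate the cache lookup.
func lookupDiffID(data []byte, buildID string) (string, bool, error) {
	m := map[string]string{}
	if err := json.Unmarshal(data, &m); err != nil {
		return "", false, err
	}
	diffID, ok := m[buildID]
	return diffID, ok, nil
}

func main() {
	data := []byte(`{
  "IL1S_KSUq5wIP90ZkKhZ/j-1pixYk9Al_YB-B02oM/QDXIdXeDJc-14xTSVCTB/7wkOkmLWlnCRO42sVTBS": "sha256:d7640b5ef558e77eb69caac1405be6729dde3c7a4c9ccb6ba7fb4ea3812858ca"
}`)
	buildID := "IL1S_KSUq5wIP90ZkKhZ/j-1pixYk9Al_YB-B02oM/QDXIdXeDJc-14xTSVCTB/7wkOkmLWlnCRO42sVTBS"

	diffID, ok, err := lookupDiffID(data, buildID)
	if err != nil {
		fmt.Println("cache file unreadable, fall back to building the layer:", err)
		return
	}
	if !ok {
		fmt.Println("cache miss: tar and hash the binary")
		return
	}
	fmt.Println("cache hit:", diffID)
}
```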

In diffid-to-descriptor, we maintain a map from diffid to a layer descriptor:

$ cat /tmp/ko/bin/github.com/google/ko/linux/amd64/diffid-to-descriptor 
{
  "sha256:d7640b5ef558e77eb69caac1405be6729dde3c7a4c9ccb6ba7fb4ea3812858ca": {
    "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
    "size": 12162562,
    "digest": "sha256:d4695baddba464a9e730d1d7580ae2aba998ae77980de64b1c4b28a945412519"
  },
  "sha256:fb701193c5e0cedb4291fb7308219cc95ec20afe04e24491be1cedf72c3f7a4a": {
    "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
    "size": 12173723,
    "digest": "sha256:ddc4badd9eacf3b55c9254ca62bd737b3994f8b1c03e99f9873e776889cd2c2b"
  }
}

... which is enough info to HEAD the blob in-registry without having to actually re-gzip it.
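
A sketch of that second step, under some assumptions: the descriptor struct mirrors the three JSON fields shown above, the request targets the registry API's /v2/<name>/blobs/<digest> endpoint, and the function names and the repository name ko/example are made up for the example. The request is built but not sent.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// descriptor mirrors the fields stored per diffid in diffid-to-descriptor.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"`
}

// lookupDescriptor parses a diffid-to-descriptor file and returns the entry
// for diffID, if present. Hypothetical helper for illustration.
func lookupDescriptor(data []byte, diffID string) (descriptor, bool, error) {
	m := map[string]descriptor{}
	if err := json.Unmarshal(data, &m); err != nil {
		return descriptor{}, false, err
	}
	d, ok := m[diffID]
	return d, ok, nil
}

// blobHeadRequest builds a HEAD request for the compressed blob, so its
// existence can be checked without re-gzipping the layer.
func blobHeadRequest(registry, repo string, d descriptor) (*http.Request, error) {
	url := fmt.Sprintf("%s/v2/%s/blobs/%s", registry, repo, d.Digest)
	return http.NewRequest(http.MethodHead, url, nil)
}

func main() {
	data := []byte(`{
  "sha256:d7640b5ef558e77eb69caac1405be6729dde3c7a4c9ccb6ba7fb4ea3812858ca": {
    "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
    "size": 12162562,
    "digest": "sha256:d4695baddba464a9e730d1d7580ae2aba998ae77980de64b1c4b28a945412519"
  }
}`)
	d, ok, err := lookupDescriptor(data, "sha256:d7640b5ef558e77eb69caac1405be6729dde3c7a4c9ccb6ba7fb4ea3812858ca")
	if err != nil || !ok {
		fmt.Println("no cached descriptor, compress the layer instead")
		return
	}
	// "ko/example" is an assumed repository name.
	req, err := blobHeadRequest("http://localhost:1338", "ko/example", d)
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```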

This is a separate structure from buildid-to-diffid so that we can decouple diffid-to-descriptor from go builds and reuse it for the kodata layers (or really anything -- docker does very similar stuff here): #262

@mattmoor @imjasonh

Numbers

Warm + cache miss (and fill) + registry miss:

$ time KOCACHE=/tmp/ko KO_DOCKER_REPO=localhost:1338/ko ko publish --platform=all .

real  0m10.501s
user  0m11.848s
sys   0m4.252s

Warm + cache miss (and fill) + registry hits:

$ time KOCACHE=/tmp/ko KO_DOCKER_REPO=localhost:1338/ko ko publish --platform=all .

real  0m9.456s
user  0m11.410s
sys   0m4.165s

Warm + cache hits + registry miss:

$ time KOCACHE=/tmp/ko KO_DOCKER_REPO=localhost:1338/ko ko publish --platform=all .

real  0m5.721s
user  0m6.478s
sys   0m3.758s

Warm + cache hits + registry hits:

$ time KOCACHE=/tmp/ko KO_DOCKER_REPO=localhost:1338/ko ko publish --platform=all .

real  0m4.071s
user  0m5.216s
sys   0m4.358s

What more can we do? Well... about half of this time is actually spent fetching the base image. If we pull that down to a local registry:

$ crane cp gcr.io/distroless/static:nonroot localhost:1338/distroless:nonroot
$ time KO_DEFAULTBASEIMAGE=localhost:1338/distroless:nonroot KOCACHE=/tmp/ko KO_DOCKER_REPO=localhost:1338/ko ko publish --platform=all .

real  0m1.682s
user  0m3.750s
sys   0m2.630s

Starting towards achieving that in #525

@jonjohnsonjr
Collaborator Author

@mattmoor boilerplate requires /* */ instead of // now?

@jonjohnsonjr jonjohnsonjr marked this pull request as draft December 16, 2020 01:04
@mattmoor
Collaborator

@jonjohnsonjr boilerplate requests whatever is in hack/boilerplate/boilerplate.go.txt IIRC

Collaborator

@mattmoor mattmoor left a comment

If we were to start making tarball.LayerFromOpener optionally return an estargz layer, what about this would need to change?

pkg/build/gobuild.go (outdated, resolved)
hasher := md5.New() //nolint: gosec // No strong cryptography needed.
hasher.Write([]byte(strings.Join(args, " ") + " " + strings.Join(defaultEnv, " ")))

tmpDir = filepath.Join(os.TempDir(), "ko", ip, hashInputs(args, defaultEnv))
Collaborator

should you factor in the working directory? What if I have multiple ko workspaces? 🤔

Comment on lines 323 to 283
if os.Getenv("KO_STABLE_OUTPUT") == "" {
os.RemoveAll(tmpDir)
}
Collaborator

Why cache if it is an error?

Collaborator Author

@jonjohnsonjr jonjohnsonjr Dec 16, 2020

Because we don't want to delete the other metadata in that directory, even though the out binary is bad (or never got built). I think I want to rework this directory structure a bit before we merge it, e.g. maybe buildid-to-diffid and diffid-to-descriptor are global and have the same layout as go env GOCACHE.

Base automatically changed from master to main February 3, 2021 19:56
@github-actions

github-actions bot commented May 5, 2021

This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen. Mark as fresh by adding the
comment /remove-lifecycle stale.

@imjasonh
Member

imjasonh commented May 5, 2021

/remove-lifecycle stale

I still want this.

@jonjohnsonjr
Collaborator Author

I don't think we have a bot that actually obeys us. I've just been manually changing the labels.

@codecov-commenter

codecov-commenter commented Dec 6, 2021

Codecov Report

Merging #269 (75914d7) into main (5640c33) will decrease coverage by 2.72%.
The diff coverage is 8.72%.

@@            Coverage Diff             @@
##             main     #269      +/-   ##
==========================================
- Coverage   50.60%   47.88%   -2.73%     
==========================================
  Files          41       41              
  Lines        2051     2170     +119     
==========================================
+ Hits         1038     1039       +1     
- Misses        838      955     +117     
- Partials      175      176       +1     
Impacted Files Coverage Δ
pkg/build/layer.go 0.00% <0.00%> (ø)
pkg/build/gobuild.go 46.40% <9.77%> (-10.05%) ⬇️
pkg/commands/commands.go 72.72% <0.00%> (-2.28%) ⬇️
pkg/commands/completion.go

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5640c33...75914d7.

@jonjohnsonjr jonjohnsonjr force-pushed the ko-faster branch 4 times, most recently from f6da708 to 4ea7b57 Compare December 7, 2021 00:01
Cache binaries under $KOCACHE/<import path>/<platform>

Cache metadata mapping buildid to diffid and diffid to descriptor.
@jonjohnsonjr jonjohnsonjr marked this pull request as ready for review December 7, 2021 00:17
@jonjohnsonjr
Collaborator Author

Want to take a look and fiddle with it?

pkg/build/gobuild.go (outdated, resolved)
@jonjohnsonjr jonjohnsonjr changed the title POC: KO_META_CACHE layer info Introduce KOCACHE Dec 7, 2021
mattmoor previously approved these changes Dec 7, 2021
@jonjohnsonjr
Collaborator Author

jonjohnsonjr commented Dec 7, 2021

I'm tempted to namespace stuff a little bit like $KOCACHE/bin/<import path> instead of just $KOCACHE/<import path>.

Then we could cache base image metadata under $KOCACHE/img/<repo> or something.

imjasonh previously approved these changes Dec 7, 2021
pkg/build/gobuild.go (outdated, resolved)
pkg/build/gobuild.go (outdated, resolved)
@jonjohnsonjr jonjohnsonjr dismissed stale reviews from imjasonh and mattmoor via 75e0a4c December 7, 2021 18:15
@jonjohnsonjr jonjohnsonjr force-pushed the ko-faster branch 2 times, most recently from a4bb483 to 42179d5 Compare December 7, 2021 18:19
This makes things a little cleaner by having a single place that calls
buildLayer and passing a thunk down into the cache logic to call that on
a cache miss.

Also, remove the debug logging to make the code easier to follow (if you
need to recompile anyway, it's easy enough to add log lines).
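
The thunk pattern described in this commit message can be sketched as follows. This is an illustrative reduction, not the PR's code: layer, layerCache, and get are hypothetical names, and the in-memory map stands in for the on-disk metadata files.

```go
package main

import "fmt"

// layer stands in for whatever the real build step produces.
type layer struct {
	diffID string
}

// layerCache is a toy in-memory cache keyed by build ID.
type layerCache struct {
	byBuildID map[string]layer
}

// get returns the cached layer for buildID. On a miss it calls the thunk,
// stores the result, and returns it, so the caller never needs to know
// whether the expensive build actually ran.
func (c *layerCache) get(buildID string, miss func() (layer, error)) (layer, error) {
	if l, ok := c.byBuildID[buildID]; ok {
		return l, nil
	}
	l, err := miss()
	if err != nil {
		// Errors are returned without being cached.
		return layer{}, err
	}
	c.byBuildID[buildID] = l
	return l, nil
}

func main() {
	cache := &layerCache{byBuildID: map[string]layer{}}
	builds := 0

	// The single call site wraps the expensive step in a closure.
	buildLayer := func() (layer, error) {
		builds++
		return layer{diffID: "sha256:example"}, nil
	}

	for i := 0; i < 3; i++ {
		if _, err := cache.get("some-build-id", buildLayer); err != nil {
			fmt.Println("build failed:", err)
			return
		}
	}
	fmt.Println("layer builds performed:", builds)
}
```

Passing the closure down keeps the hit/miss decision inside the cache, which is what lets the build step have a single caller.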
Collaborator

@mattmoor mattmoor left a comment

🚀
