Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: kaniko dir config option #1997

Merged

Conversation

jdockerty
Copy link
Contributor

@jdockerty jdockerty commented Mar 20, 2022

Fixes #1945

Description

Moving away from a hard-coded /kaniko directory, this was in constants.go, to one which is configurable by the user - this can be done either through a flag, --kaniko-dir, or an environment variable, KANIKO_DIR.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes unit tests
  • Adds integration tests if needed.

See the contribution guide for more details.

Reviewer Notes

  • The code flow looks good.
  • Unit tests and or integration tests added.

Release Notes

  • executor adds a new flag --kaniko-dir or environment variable KANIKO_DIR to override the default directory that kaniko uses.

@jdockerty jdockerty marked this pull request as ready for review March 20, 2022 19:20
@jdockerty jdockerty changed the title Feat/kaniko dir config option feat: kaniko dir config option Mar 20, 2022
@jdockerty
Copy link
Contributor Author

@imjasonh I can't set you as a reviewer myself unfortunately 😢

Could you approve the workflows too, please 😄

@imjasonh imjasonh self-requested a review March 20, 2022 19:27
@imjasonh
Copy link
Collaborator

This looks good! We'll see if the tests pass, and I think we can merge this. 👍

But it's not the whole story either. If a build has KANIKO_DIR set, we need to do something to ensure that the contents of /kaniko in the image (the executor binary, the cred helpers, the Dockerfile, etc.) get moved into KANIKO_DIR so they don't end up in the final image. That shouldn't necessarily block this PR, but I want to make sure we have a plan for making and testing that next change before we proceed.

@jdockerty
Copy link
Contributor Author

Much appreciated @imjasonh

Looks as though it had an issue with the credentials on the PR itself, invalid version: Get "https://proxy.golang.org/github.com/....": net/http: TLS handshake timeout.

Is this what you mean by the cred helpers not being in the correct place? Unsure whether it's a temporary network issue too on the GitHub Actions side, I know they've had a couple of problems recently, with it being a TLS timeout, I'll take a dig into it a little bit later, just to be absolutely sure.

@imjasonh
Copy link
Collaborator

That TLS issue seems like a flake, I'll just keep retrying until that works.

Re: cred helpers etc. The Kaniko image contains all its own important bits in the /kaniko directory today. That's specified in the Dockerfile:

kaniko/deploy/Dockerfile

Lines 52 to 58 in 8651c06

FROM scratch
COPY --from=0 /src/out/executor /kaniko/executor
COPY --from=0 /usr/local/bin/docker-credential-gcr /kaniko/docker-credential-gcr
COPY --from=0 /usr/local/bin/docker-credential-ecr-login /kaniko/docker-credential-ecr-login
COPY --from=0 /usr/local/bin/docker-credential-acr-env /kaniko/docker-credential-acr-env
COPY --from=certs /ca-certificates.crt /kaniko/ssl/certs/
COPY --from=0 /kaniko/.docker /kaniko/.docker

As the executor runs, it fills up its own filesystem with the results of the ongoing build, and after each RUN scans the filesystem for changes since the last time it scanned. This tells it what it needs to put in the new layer. But, it also needs to ignore the contents of /kaniko, because that's not meant for the final image, that's just its own little island inside the container filesystem where it knows Kaniko's own stuff lives.

With your change, all of this should work just fine -- if KANIKO_DIR is unset, the executor will default to /kaniko as the special island directory, and ignore it when it scans for updates. If KANIKO_DIR is set, Kaniko won't ignore /kaniko anymore, and when it sees stuff in it (its own executor binary, for example), it will think those files belong in the final image. We don't want that.

So what we need to do -- it doesn't have to be in this PR, but we should plan to do it soon -- is, at startup, if KANIKO_DIR != "/kaniko", have Kaniko move any existing contents of /kaniko into KANIKO_DIR and rm /kaniko so it doesn't end up in the final image.

Does that make sense? It's all pretty mind-bending. Let me know if I can help explain it better in any way.

@jdockerty
Copy link
Contributor Author

jdockerty commented Mar 21, 2022

I believe that makes sense, so that change (I'll raise another PR later on) would be specifically targeted towards the files within deploy/.

I had a skim over the dockerfile package, but I don't believe anything would need to be altered in here? I.e. there is no alteration required to any instructions there. Although I might be misunderstanding it.

Do you think it might be best served to have another issue for it as well? Feel free to assign me onto that as well as I'll link back to your comment for reference in the resulting PR, so others can see a full conversation history as it evolved.

@imjasonh
Copy link
Collaborator

I believe that makes sense, so that change (I'll raise another PR later on) would be specifically targeted towards the files within deploy/.

I had a skim over the dockerfile package, but I don't believe anything would need to be altered in here? I.e. there is no alteration required to any instructions there. Although I might be misunderstanding it.

Files inside deploy/ wouldn't change. Those files determine how the Kaniko image is produced, and they should remain as-is. The executor binary, cred helpers, etc, should remain in /kaniko in the image.

The change we need is for the executor to understand that if it's not in KANIKO_DIR (because KANIKO_DIR != /kaniko), that it should move there so it doesn't end up in the final image.

That might actually just be as easy as adding os.MkdirAll(os.Getenv("KANIKO_DIR")) and os.Rename("/kaniko", os.Getenv("KANIKO_DIR")) when the executor starts. The bigger part of that change will be ensuring that this doesn't break any existing behavior.

Do you think it might be best served to have another issue for it as well? Feel free to assign me onto that as well as I'll link back to your comment for reference in the resulting PR, so others can see a full conversation history as it evolved.

All of this is part of the overarching work to make /kaniko configurable, so I think it belongs under the current issue.

@jdockerty
Copy link
Contributor Author

jdockerty commented Mar 21, 2022

The change we need is for the executor to understand that if it's not in KANIKO_DIR (because KANIKO_DIR != /kaniko), that it should move there so it doesn't end up in the final image.

That makes a lot of sense, I was getting confused in thinking we'd need to modify the actual image build process itself.

I've added a small check now. The unit tests are passing locally from make test, although make integration-test is seemingly not able to create the test_dir data as expected, even without setting KANIKO_DIR. I'd like to check it from within CI too so I can be sure I'm not missing anything that I shouldn't be.

I believe there's certainly some extra test cases to be done. I'd be hesitant to test the checkKanikoDir function itself, since this would practically be testing the os package again - although I could be wrong here. I'd probably add some more integration tests, since we'd be setting the environment and reading/writing files based on KANIKO_DIR or the --kaniko-dir flag, but I'm not 100% sure how I'd actually go about that yet; I have a general idea, but I'll wait on the CI results before charging ahead.

Copy link
Collaborator

@imjasonh imjasonh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks promising! I'll try to figure out how to run a test where KANIKO_DIR is set to something else, to make sure this works. Thanks for working on this!

cmd/executor/cmd/root.go Outdated Show resolved Hide resolved
Co-authored-by: Jason Hall <jasonhall@redhat.com>
@jdockerty
Copy link
Contributor Author

No worries at all Jason, much appreciated for your help too!

@imjasonh
Copy link
Collaborator

Looking at the integration tests it looks like it should be fairly straightforward to add one that exercises KANIKO_DIR. For example, based on this test:

func TestBuildViaRegistryMirrors(t *testing.T) {

You could do something like:

func TestKanikoDir(t *testing.T) {
	...

	// Build with kaniko
	kanikoImage := GetKanikoImage(config.imageRepo, "Dockerfile_registry_mirror")
	dockerRunFlags := []string{"run", "--net=host"}
	dockerRunFlags = addServiceAccountFlags(dockerRunFlags, config.serviceAccount)
	dockerRunFlags = append(dockerRunFlags, ExecutorImage,
		"-f", dockerfile,
		"-d", kanikoImage,
		"-e", "KANIKO_DIR=/not-kaniko",
		"-c", fmt.Sprintf("git://%s", repo))
	...
}

You can run that with go test ./integration/... -test.run=TestKanikoDir -repo=gcr.io/your-repo

If that build works, then it tell us that setting KANIKO_DIR doesn't break any builds. We should also try to check that the image it produces doesn't contain anything in /not/kaniko or /kaniko. That would give us a good idea whether this

@jdockerty jdockerty force-pushed the feat/kaniko-dir-config-option branch from a77ff5d to 355ae4a Compare March 29, 2022 20:01
@jdockerty
Copy link
Contributor Author

jdockerty commented Mar 29, 2022

That's super useful, much appreciated Jason. I've fixed the mkdir error now too which was there previously, I realised I was passing in os.Getenv to the check, rather than the config.KanikoDir option where we retrieve the environment variable via a function, so that is working nicely now.

If you don't mind, I'd like to quickly sanity check with the CI, as I'm seeing a failure, even on the main branch, with container-diff, so I want to rule out that there is some misconfiguration of docker or otherwise on my machine.

    integration_test.go:414: []integration.diffOutput differ (-got, +want):   []integration.diffOutput{
                {
                        Image1:   "localhost:5000/docker-dockerfile_registry_mirror",
                        Image2:   "localhost:5000/kaniko-dockerfile_registry_mirror",
                        DiffType: "File",
                        Diff: &integration.fileDiffResult{
                                Adds: nil,
        -                       Dels: []integration.fileDiff{{Name: "/etc/machine-id", Size: 33}},
        +                       Dels: nil,
                        },
                }

For some reason, the machine-id is being mounted in, I don't believe it should be though. Especially with kaniko, it might have something to do with my setup, but I'd like to 100% confirm as it does seem rather unrelated, I could quite easily be wrong though.

@imjasonh
Copy link
Collaborator

If you don't mind, I'd like to quickly sanity check with the CI, as I'm seeing a failure, even on the main branch, with container-diff, so I want to rule out that there is some misconfiguration of docker or otherwise on my machine.

                        Diff: &integration.fileDiffResult{
                                Adds: nil,
        -                       Dels: []integration.fileDiff{{Name: "/etc/machine-id", Size: 33}},
        +                       Dels: nil,
                        },
                }

For some reason, the machine-id is being mounted in, I don't believe it should be though. Especially with kaniko, it might have something to do with my setup, but I'd like to 100% confirm as it does seem rather unrelated, I could quite easily be wrong though.

That's definitely weird! It doesn't seem related to your change though, so I wouldn't worry about it for now.

@jdockerty
Copy link
Contributor Author

The tests passed in the CI, so it looks like it was some weird configuration on my end - I'm developing on a VM, so I wonder whether there's any funkiness going on there with docker.

Copy link
Collaborator

@imjasonh imjasonh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good! The integration test especially makes me pretty confident this isn't going to break any existing behavior. 👍

cmd/executor/cmd/root.go Outdated Show resolved Hide resolved
cmd/executor/cmd/root.go Outdated Show resolved Hide resolved
integration/integration_test.go Outdated Show resolved Hide resolved
integration/integration_test.go Show resolved Hide resolved
jdockerty and others added 4 commits March 31, 2022 16:50
Co-authored-by: Jason Hall <jasonhall@redhat.com>
Co-authored-by: Jason Hall <jasonhall@redhat.com>
Co-authored-by: Jason Hall <jasonhall@redhat.com>
@jdockerty
Copy link
Contributor Author

Yeah absolutely, the extra test makes me less anxious about it 😆 Much appreciated for pointing me in the right direction on the example test to follow!

I've added your suggested changes now too, no worries on the styling side. I think I'll actually start implementing what you said in my own stuff, I quite like it 😄

@imjasonh imjasonh merged commit d4cf490 into GoogleContainerTools:main Mar 31, 2022
@jdockerty jdockerty deleted the feat/kaniko-dir-config-option branch March 31, 2022 19:13
@imjasonh imjasonh mentioned this pull request Apr 1, 2022
@shpprfb
Copy link

shpprfb commented Apr 29, 2022

@jdockerty I'm very excited about this new functionality but I'm seeing issues with this PR as others are mentioning in #1945.

  1. --kaniko-dir seems to not do anything at all, I can pass the flag and see /kaniko is still filling up with snapshots and such.
  2. Setting KANIKO_DIR as an env var seems to trigger the codepath in checkKanikoDir (
    if err := os.MkdirAll(dir, os.ModeDir); err != nil {
    return err
    }
    if err := os.Rename(constants.DefaultKanikoPath, dir); err != nil {
    return err
    }
    }
    return nil
    ) for doing the rename, but it always fails with "file exists". This happens whether the destination previously exists (like mounting a volume in kubernetes), or when passing a dirname that I expect kaniko to create for me(although Im not sure what the usefulness of that would be).

I made a quick playground repro of this issue here https://play.golang.com/p/ev1FjrASYAp

Any ideas on how to handle this? Maybe if the destination doesn't exist, the rename is easy and will succeed as long as we don't create it. If the destination does exist, I think we may need to recursively copy the contents of /kaniko there.

@jdockerty jdockerty mentioned this pull request Apr 29, 2022
4 tasks
@jdockerty
Copy link
Contributor Author

jdockerty commented Apr 29, 2022

@shpprfb I do believe I know where the issue is of it not picking it up, it was an oversight by me.

Here it is passing the config options value. I've raised #2067 rather quickly to get my initial thoughts down, which I'll use to investigate further during some free evenings and iterate on it - fixing failing tests etc. Then I'll probably further investigate into some other things with Jason, as I believe I've missed some things, even though the tests previously passed for this feature to be merged.

Thanks for raising this 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

make KanikoDir a configuration option
3 participants