Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(toolkit): improve docker build time in CI #1776

Merged
merged 7 commits into from
Feb 21, 2019
Merged

Conversation

jogold
Copy link
Contributor

@jogold jogold commented Feb 15, 2019

When running in CI, try to pull the latest image first and use it as cache for the build. CI is detected by the presence of the CI environment variable or the --ci command line option.

When pushing docker images, always tag with latest.

Closes #1748

To be discussed: name of env var for detection, CI is set by lots of CI providers (not by CodeBuild though), use a opt-in mechanism instead?

Test (with empty ECR):

$ docker images -a | grep "/cdk/" | awk '{print $3}' | xargs docker rmi # or a more specific grep
$ cdk deploy # should build and push image
$ docker images -a | grep "/cdk/" | awk '{print $3}' | xargs docker rmi --force # or a more specific grep
$ CI=true cdk deploy # should pull, build almost instantly but not push

Pull Request Checklist

  • Testing
    • Unit test added
    • CLI change?: coordinate update of integration tests with team
    • cdk-init template change?: coordinated update of integration tests with team
  • Docs
    • jsdocs: All public APIs documented
    • README: README and/or documentation topic updated
  • Title and Description
    • Change type: title prefixed with fix, feat will appear in changelog
    • Title: use lower-case and doesn't end with a period
    • Breaking?: last paragraph: "BREAKING CHANGE: <describe what changed + link for details>"
    • Issues: Indicate issues fixed via: "Fixes #xxx" or "Closes #xxx"
  • Sensitive Modules (requires 2 PR approvers)
    • IAM Policy Document (in @aws-cdk/aws-iam)
    • EC2 Security Groups and ACLs (in @aws-cdk/aws-ec2)
    • Grant APIs (only if not based on official documentation with a reference)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license.

When running in CI, try to pull the latest image first and use it as cache
for the build. CI is detected by the presence of the `CI` environment variable.

Closes aws#1748
@jogold jogold requested a review from a team as a code owner February 15, 2019 21:41
Copy link
Contributor

@rix0rrr rix0rrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I like the change and the refactorings you made, thanks for that.

  • Can you describe a little more about the rationale behind the change? Are we not trading docker build time for docker pull time? Are we guaranteed that pulling will be faster than building? Why limit this behavior change to CI builds only?

  • Though I like the use of the CI environment variable, I think there needs to be a command-line switch --ci which is threaded through to the build step. When not supplied, it could default to the presence of the environment variable CI. Be aware that --no-ci should also work to disable an autodetected CI build (it'll be the difference between undefined and false in the yargs object).

@@ -95,6 +132,9 @@ async function calculateImageFingerprint(imageId: string) {
// Metadata that has no bearing on the image contents
delete manifest.Created;

// Parent can change when using --cache-from in CI
delete manifest.Parent;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Parent represent here? Please tell me it has nothing to do with disk images

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the sha256 of the docker's parent image. When using --cache-from, the sha256 of the parent gets modified, even for the same final image.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, if it's a parent image, and the parent image changes on every build, then we cannot assume that the "current" image is the same either, can we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case the sha256 of the layers will change normally. I added this when I saw that for the same digest (on ECR) when using --cache-from, calculateImageFingerprint returned a different finger print, resulting in 3 tags on the same image (latest, fingerprint1 and fingerprint2). This needs further investigation. The optimal solution may be a mix between the custom finger print calculation and the docker digest. Looking into this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm removing this delete manifest.Parent, it's a mistake and not needed here. The sha256 of the Parent is also present in the Config.Image key so deleting it in one place only should not have had an effect.

await shell(['docker', 'tag', imageId, qualifiedImageName]);
await shell(['docker', 'tag', imageId, latest]); // Tag with `latest` also
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to describe this change in the PR description as well.

@rix0rrr rix0rrr self-assigned this Feb 18, 2019
@jogold
Copy link
Contributor Author

jogold commented Feb 18, 2019

Can you describe a little more about the rationale behind the change? Are we not trading docker build time for docker pull time? Are we guaranteed that pulling will be faster than building?

We are indeed trading docker build time for docker pull time. Generally pulling will be faster than building for Dockerfiles with OS packages installation and code packages installation (pip, npm, etc.). If nothing has changed in the current commit (in CI), it will be faster to pull than to build. If there are changes, it's more likely to see changes in the code that is copied in the image (COPY/ADD, the bottom layers) and again pulling should be faster than building.

For very simple Dockerfiles (FROM + COPY/ADD, few layers) or for a complete change of the Dockerfile, I can imagine that it could be faster to build.

Why limit this behavior change to CI builds only?

When running locally chances are that we already have a lot of layer cache available. In this case, pulling will almost always be slower than building.

Though I like the use of the CI environment variable, I think there needs to be a command-line switch --ci which is threaded through to the build step. When not supplied, it could default to the presence of the environment variable CI. Be aware that --no-ci should also work to disable an autodetected CI build (it'll be the difference between undefined and false in the yargs object).

OK, I will look into this.

@rix0rrr
Copy link
Contributor

rix0rrr commented Feb 18, 2019

Thanks for the explanation. I'd like to see that rationale captured in one or two sentenced in a code comment somewhere.

@@ -61,6 +61,7 @@ async function parseCommandLineArguments() {
.command('deploy [STACKS..]', 'Deploys the stack(s) named STACKS into your AWS account', yargs => yargs
.option('exclusively', { type: 'boolean', alias: 'e', desc: 'only deploy requested stacks, don\'t include dependencies' })
.option('require-approval', { type: 'string', choices: [RequireApproval.Never, RequireApproval.AnyChange, RequireApproval.Broadening], desc: 'what security-sensitive changes need manual approval' }))
.option('ci', { type: 'boolean', desc: 'Force CI detection. Use --no-ci to disable CI autodetection.', default: undefined })
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rix0rrr The description makes no reference to docker/asset preparation. I thought that this option could be used elsewhere in the future. What about the documentation for this new option? I see that https://github.com/awsdocs/aws-cdk-user-guide/blob/master/doc_source/tools.md is not up to date...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems good. @Doug-AWS will update the tools section once we release.

Please follow this by something like:

if (argv.ci === undefined) {
   argv.ci = process.env.CI !== undefined;
}

To make sure that afterwards argv.ci is well-defined. Saves having to redo the CI-autodetection logic in every place where it's used. Configuring:

default: process.env.CI !== undefined

To force the default value to either true or false might also work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tools is updated. I should have a new public build of the guide by EOW.

@@ -61,6 +61,7 @@ async function parseCommandLineArguments() {
.command('deploy [STACKS..]', 'Deploys the stack(s) named STACKS into your AWS account', yargs => yargs
.option('exclusively', { type: 'boolean', alias: 'e', desc: 'only deploy requested stacks, don\'t include dependencies' })
.option('require-approval', { type: 'string', choices: [RequireApproval.Never, RequireApproval.AnyChange, RequireApproval.Broadening], desc: 'what security-sensitive changes need manual approval' }))
.option('ci', { type: 'boolean', desc: 'Force CI detection. Use --no-ci to disable CI autodetection.', default: undefined })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems good. @Doug-AWS will update the tools section once we release.

Please follow this by something like:

if (argv.ci === undefined) {
   argv.ci = process.env.CI !== undefined;
}

To make sure that afterwards argv.ci is well-defined. Saves having to redo the CI-autodetection logic in every place where it's used. Configuring:

default: process.env.CI !== undefined

To force the default value to either true or false might also work.

let loggedIn = false;

// In CI we try to pull latest first
if (ci === true || (process.env.CI && ci !== false)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do the CI autodetection here, move it up to cdk.ts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I have a script that recreates the 'cdk --help' part of the tools section whenever we have a new release.

'build',
'--quiet',
asset.path];
const command = process.env.CI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to use the variable here.

@rix0rrr rix0rrr merged commit 1060b95 into aws:master Feb 21, 2019
eladb pushed a commit that referenced this pull request Feb 26, 2019
When running in CI, try to pull the latest image first and use it as cache
for the build. CI is detected by the presence of the `CI` environment variable.

Closes #1748
@eladb eladb mentioned this pull request Mar 11, 2019
5 tasks
@jogold jogold deleted the docker-cache branch March 12, 2019 11:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve docker build time in CI environment
3 participants