
chore: speedup docker build time #6787

Closed
wants to merge 8 commits into from

Conversation

jeluard
Contributor

@jeluard jeluard commented May 15, 2024

Motivation

Speed up Docker builds

Description

Tweak the current Docker build process to improve build speed.

This is achieved by:

  • using regular node images for the build layers; this removes the need to install extra tooling via apk on every run
  • documenting how to use a yarn offline mirror to reduce the need to re-download dependencies

Using regular node images comes at a higher download cost (the image is bigger), but it is cached; slim images do not ship with the necessary build tooling.

Note that the current Docker build creates a multi-platform Docker image (arm64 and amd64). We can probably improve things further by having a dedicated Dockerfile for dev usage.

Results

Before

1st run: 5m 22s
Subsequent runs, after code change: 4m 52s

After

1st run: 3m 16s
Subsequent runs, after code change: 2m 32s
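
A rough sketch of how such timings can be reproduced (the image tag and the touched file are illustrative, not part of this PR):

```bash
# Cold build: no layers cached yet
time docker build -t lodestar:bench .

# Simulate a dev cycle: change one source file so that only the layers
# after the sources are copied get invalidated, then rebuild
touch packages/cli/src/index.ts
time docker build -t lodestar:bench .
```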

@jeluard jeluard requested a review from a team as a code owner May 15, 2024 16:04
@jeluard jeluard mentioned this pull request May 15, 2024

codecov bot commented May 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.20%. Comparing base (49c1689) to head (6da58fd).

Additional details and impacted files
```
@@            Coverage Diff            @@
##           unstable    #6787   +/-   ##
=========================================
  Coverage     62.19%   62.20%
=========================================
  Files           571      571
  Lines         60021    60021
  Branches       1973     1976    +3
=========================================
+ Hits          37333    37334    +1
+ Misses        22645    22644    -1
  Partials         43       43
```

@jeluard jeluard marked this pull request as draft May 16, 2024 07:02
Member

@nflaig nflaig left a comment


need to confirm this does not increase the image size

@jeluard
Contributor Author

jeluard commented May 17, 2024

I added an experiment using a lodestar base builder image. This reduces the need to either use a big image or update via apk on every build.
This image would have to be pushed to the ChainSafe Docker Hub beforehand.
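
A minimal sketch of how such a base builder image could be published and then consumed (the file name and image tag below are hypothetical):

```bash
# Build the base builder image once from its own small Dockerfile
# (Dockerfile.builder and the tag are hypothetical names)
docker build -f Dockerfile.builder -t chainsafe/lodestar-builder:node20 .

# Push it so CI and other developers can pull it instead of
# re-installing the apk tooling on every build
docker push chainsafe/lodestar-builder:node20
```

The main Dockerfile's build stages would then start from this image instead of node:20-alpine.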

```
@@ -0,0 +1,2 @@
FROM node:20-alpine
RUN apk update && apk add --no-cache g++ make python3 && rm -rf /var/cache/apk/*
```
Member


What was the argument against using node:20 as the builder image?

Contributor Author


For some unclear reason, it fails to build native libs in the second stage. That might be the original reason for using alpine.
Also, this image would be smaller and offers more opportunity for further tweaks down the road.

Member


I am really not sure it's worth having a separate builder image just for these dependencies, as the layer will always be retrieved from the local cache after the first local build, and even when it is executed it only took 11 seconds on my machine.

The issue I see with using a separate builder image is that right now we automatically use the latest 20.x version, which is good for getting security updates in. There is also extra maintenance to keep it up to date if it's not part of the CI, although we could build this image before running the main docker build.

Contributor Author


Moved back to a single node image.


```bash
yarn config set yarn-offline-mirror .yarn/yarn-offline-mirror
mv ~/.yarnrc ./ # `yarn config` commands are always global, make `.yarnrc` specific to this project
```
Member

@nflaig nflaig May 21, 2024


What if someone has settings in their global yarn config? Can't we just commit a .yarnrc to the project with this setting in it?

Contributor Author


If we do that, the offline mirror will always be enabled. Not sure if we want that by default.

Member


What's the disadvantage, other than the storage increase, which is not a lot?

```bash
> du -sh .yarn
396M    .yarn
```

The advice to move the global yarn config into the current folder is not ideal: what if there are other settings in this file? It means the global yarn config is effectively deleted.

If we want to support the offline mirror, I would just add a .yarnrc to git:

```
yarn-offline-mirror ".yarn/yarn-offline-mirror"
yarn-offline-mirror-pruning true
```
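
With a committed .yarnrc like that, a sketch of how the mirror could be populated and then exercised (assuming yarn classic):

```bash
# First online install populates .yarn/yarn-offline-mirror with tarballs
yarn install

# Subsequent installs can resolve entirely from the mirror; --offline
# fails loudly if any required tarball is missing from it
yarn install --offline
```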

@nflaig
Member

nflaig commented Jun 7, 2024

> Note that the current Docker build creates a multi-platform Docker image (arm64 and amd64). We can probably improve things further by having a dedicated Dockerfile for dev usage.

I don't think this is correct. By default docker will only create an image for the platform it's built on. To create multi-arch images you need to pass --platform linux/amd64,linux/arm64, and it requires an emulator like QEMU to be installed.
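
For reference, a sketch of what such a multi-arch build invocation looks like (the image tag is illustrative):

```bash
# One-time setup: register QEMU emulators for non-native platforms
docker run --privileged --rm tonistiigi/binfmt --install all

# Create and select a buildx builder that supports multi-platform builds
docker buildx create --use --name multiarch

# Build for both architectures; multi-arch manifests cannot be loaded
# into the local image store, so the result is pushed to a registry
docker buildx build --platform linux/amd64,linux/arm64 \
  -t chainsafe/lodestar:local --push .
```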

@jeluard
Contributor Author

jeluard commented Jun 7, 2024

@nflaig Yes, that's what we are doing, see this. If we were not doing multi-platform builds, the Dockerfile could be simplified to a single build stage (w/o the FROM --platform).

@jeluard jeluard marked this pull request as ready for review June 7, 2024 12:01
@nflaig
Member

nflaig commented Jun 7, 2024

> @nflaig Yes, that's what we are doing, see this. If we were not doing multi-platform builds, the Dockerfile could be simplified to a single build stage (w/o the FROM --platform).

Ah, you mean we could remove the middle stage in case we don't need to support arm64 images?

@jeluard
Contributor Author

jeluard commented Jun 7, 2024

In a separate Dockerfile only used for local dev, yes
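
Something along these lines, assuming a hypothetical Dockerfile.dev that drops the cross-compilation stage:

```bash
# Build a dev image for the host platform only, from a separate
# Dockerfile (Dockerfile.dev is hypothetical, not part of this PR)
docker build -f Dockerfile.dev -t lodestar:dev .
```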

```
ARG COMMIT
WORKDIR /usr/app
RUN apk update && apk add --no-cache g++ make python3 py3-setuptools && rm -rf /var/cache/apk/*
```
Member


> using regular node images for the build layers; this removes the need to install extra tooling via apk on every run

This layer will be retrieved from cache after the first run, but rather than reducing build time, a good argument for using the full node:22 image as base is that we reduce the likelihood that system deps are missing.

But overall, this might increase the build time in the CI, as the node:22 image is not cached there and, afaik, is quite large to download.

Contributor Author


I would expect that if download time is an issue for CI, there would be ways to improve this via some caching.
The main goal of those changes is to speed up docker builds during dev cycles.
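
One possible approach (a sketch, not something this PR sets up) is the GitHub Actions cache backend for buildx, which persists layers between CI runs:

```bash
# Reuse layers cached by previous CI runs and save new ones back;
# type=gha only works inside GitHub Actions runners
docker buildx build \
  --cache-from type=gha \
  --cache-to type=gha,mode=max \
  -t lodestar:ci .
```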

Member


> The main goal of those changes is to speed up docker builds during dev cycles.

It's not clear to me then why we need to use a full node:22, as based on my observations the apk install layer will always be retrieved from cache after the first run.

Contributor Author


Hmm, indeed, I can't reproduce what I believe I was seeing during the interop.

Member

@nflaig nflaig Jun 7, 2024


Did some testing with the offline mirror vs. without; the build time seems to be roughly the same. I would assume the reason is that I already have all packages cached, so there shouldn't be a difference.

> Hmm, indeed, I can't reproduce what I believe I was seeing during the interop.

Maybe it still does some checks against the yarn registry? There were definitely request timeouts I noticed during the interop, which we might be able to avoid by using an offline mirror.

@jeluard
Contributor Author

jeluard commented Jun 7, 2024

Doesn't appear to provide value.

@jeluard jeluard closed this Jun 7, 2024