Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: Exploring Docker multi-stage and best practices #19962

Closed
wants to merge 2 commits into from

Conversation

sirreal
Copy link
Member

@sirreal sirreal commented Nov 17, 2017

DO NOT MERGE

This is an exploration.

See:

@matticbot
Copy link
Contributor

@sirreal sirreal requested a review from samouri November 17, 2017 16:04
Dockerfile Outdated
# changes. For local development this should always
# be an empty file and therefore this layer should
# cache well.
# Build step
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combine deps/build

Dockerfile Outdated
COPY ./env-config.sh /tmp/env-config.sh
ENV CALYPSO_ENV=production \
NODE_ENV=production
COPY --from=build /calypso/build ./build
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wishful thinking 😀

Depends other things…

@sirreal sirreal changed the title Build: Exploring multi-step and best practices Build: Exploring Docker multi-stage and best practices Nov 17, 2017
@sirreal
Copy link
Member Author

sirreal commented Nov 19, 2017

As of e870a4d, this setup generates a working image at 344MB, current master builds are around 1.58GB 🎉

wp-calypso-multi-step   latest              fc575cdb8b9b        40 seconds ago       344MB
wp-calypso              latest              1f8408fa4807        2 hours ago          1.58GB

@sirreal
Copy link
Member Author

sirreal commented Nov 30, 2017

needs secrets.json copy ✅ config is copied

@samouri
Copy link
Contributor

samouri commented Dec 6, 2017

Amazing work!! ⭐️
I just tested out this branch and it performed really well!

I'm so exciting about the possibility of using multi-step capabilities so that our Dockerfile can both be easy to understand while staying optimal in terms of size and speed of build.

Is there anything I can do to help move this along?


a question I have looking at the docker file is whether or not we can split it into more layers to help with caching.

For example:

  1. node_modules changes the least frequently and is the biggest
  2. many public assets also change very infrequently
  3. server and build are based on node_modules and change less frequently than the client bundles
  4. finally the client bundles change pretty frequently and could be last

@sirreal
Copy link
Member Author

sirreal commented Dec 7, 2017

Thanks @samouri!

I'd intended this as an exploration and investigation as to whether changes like this could improve our build speed and size. As such, I'll share my findings here.

I believe the largest impact on build size is the base image change for the final layer to node alpine:

FROM node:8.9.1-alpine

Much of the multi-stage build size and speed improvements are negated by the fact that Calypso builds are rather unusual.

I'd expect to build software from source, producing executables which exist and can function without the source being present. Calypso does not appear to behave this way. Notice that at least the following directories are required for running Calypso after it's been built:

wp-calypso/Dockerfile

Lines 35 to 42 in 2927f50

COPY --from=build /calypso/.github /calypso/.github
COPY --from=build /calypso/docs /calypso/docs
COPY --from=build /calypso/config /calypso/config
COPY --from=build /calypso/node_modules /calypso/node_modules
COPY --from=build /calypso/server /calypso/server
COPY --from=build /calypso/build /calypso/build
COPY --from=build /calypso/public /calypso/public
RUN chown -R nobody /calypso

The presence of so many files makes the final instruction in this layer chown -R nobody /calypso the slowest in the Docker build process. It needs to inspect and change properties on all of the files.

If we can avoid the chown nobody step altogether we should get a big speed boost. If the goal is not to use root, running the entire build as a normal user like calypso may be a viable option.

As I understand it, multi-stage builds are primarily worthwhile in cases where there image required to build an executable is large, but the environment required to run and executable and the executable itself are relatively small. That doesn't appear to be the case with Calypso, so I'm not sure how much advantage a multi-stage build has for us.

Conclusion: I've learned a lot and we can make some improvements. Multi-stage probably isn't an important improvement for speed or size given the way Calypso builds work.


Is there anything I can do to help move this along?

I think the way forward here may be to incrementally apply what's been learned in this PR to our existing builds. Things that should have real, noticeable impact:

  • Add anything not strictly necessary to .dockerignore
  • Explicitly COPY files and directories (no COPY . /calypso).
  • Try to avoid chown nobody
  • Change base image to node alpine

Here's a checklist extracted from above

  • Improve .dockerignore
  • chown -R nobody node_modules after install (Build: Cache chown node_modules #20589)
  • Explicit installs over copy . /calypso.
    • chown -R nobody $DIRECTORY for each explicit copy (cache this expensive operation)
    • chown -R nobody $WHATEVER_IS_PRODUCED_BY_BUILD after build (don't perform redundant chown)
  • Check with systems about image change (node-alpine)
  • Check with devs about image change (node-alpine)
  • Why nobody user? If the goal is not-root, can we create normal system user calypso?
  • Ensure Docker infrastructure is Docker 17.05+. Only required for multi-stage and it's not clear this had much advantage.

a question I have looking at the docker file is whether or not we can split it into more layers to help with caching.

For example:

  1. node_modules changes the least frequently and is the biggest
  2. many public assets also change very infrequently
  3. server and build are based on node_modules and change less frequently than the client bundles
  4. finally the client bundles change pretty frequently and could be last

Some of this layer caching should already occur as long as it's not disabled by the actual build command (--no-cache=true). Each instruction should produce an independent cached layer. Please fact check me 🙂

The trick is getting them into the right order, when a layer has changed, all subsequent layers will need to be rebuilt. That said, most of the operations are quite cheap (COPY, with the exception of npm install, which is already an early cached layer.

However, all of the COPY instructions are followed by a single chown -R nobody /calypso, which is very time consuming. If we move to independent copies, each copy could be followed by a chown -R nobody $DIRECTORY, which would cache this expensive operation in a layer.

It would also be great to add a layer to chown -R nobody node_modules immediately after npm install, and exclude node_modules from later chown. I expect this alone would be a major improvement given the problem is the sheer number of files involved, which is mostly in node_modules.

@sirreal
Copy link
Member Author

sirreal commented Dec 7, 2017

Exploring chown -R nobody node_modules caching in #20589

@dmsnell
Copy link
Member

dmsnell commented Jan 14, 2018

@sirreal can you provide a status-update on this awesome PR? Is it worth updating? Worth closing? Is there anything you need to keep iterating on it?

@sirreal
Copy link
Member Author

sirreal commented Jan 15, 2018

Several improvements have been spun off from the learnings here, most significantly #20589 and #21537.

I'd like to revisit this after those are closed, there may be more wins that can be found. Marking as on hold for now.

@sirreal
Copy link
Member Author

sirreal commented Jan 18, 2018

Multi-stage is not currently an option.

@sirreal sirreal force-pushed the try/docker/multi-step branch 2 times, most recently from 42b0c61 to 2192a23 Compare January 26, 2018 23:59
@sirreal
Copy link
Member Author

sirreal commented Jan 27, 2018

I've rebased and updated this to reflect lessons learned, especially #21570. This is now an example of modern Docker best practices applied to Calypso.

@lsl
Copy link
Contributor

lsl commented Mar 13, 2018

FYI, this was merged recently(ish) giving you chown on copy. See updated COPY docs for more. Sorry - not sure which version exactly.

Example usage would be:

COPY --from=build --chown=nobody:nobody /calypso/public /calypso/public

Doesn't seem to add much overhead in my experience - maybe not copying large enough dir's?

By the way, if you want to shave off a few more mb's you could npm install --only=production in the main final stage instead of copying in node_modules used in the build stage. If you want to prevent the double download you could have an extra build stage that just pulls in production deps, which the build stage then adds in dev deps to. That would also prevent any potential cache issues.

@sirreal
Copy link
Member Author

sirreal commented Mar 13, 2018

@lsl Thanks! This branch is a bit outdated right now, but as we reach compatibility with multi-stage builds I do hope to revisit it.

The whole chown issue was gracefully sidestepped by #21962, so while this PR was leveraging COPY --chown, that should all be irrelevant now 🙂

If you have the interested and expertise, please feel free to improve it. I do hope to continue with this in the near future 👍

@alisterscott
Copy link
Contributor

This PR needs a rebase - is this PR still required?

@sirreal
Copy link
Member Author

sirreal commented Aug 6, 2018

Let's close this. There are relevant findings here and improvements to be made, but the exploration here has largely served it's purpose and will continue to be available in this conversation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants