Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Docker image: Multi arch support #7397

Closed
wants to merge 5 commits into from

Conversation

Starbix
Copy link
Contributor

@Starbix Starbix commented May 3, 2020

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
  • Pull request includes a sign off
  • Code style is correct (run the linters)

This PR updates the CircleCI configuration to use docker buildx to build the docker image. This way multi-arch images can be built on amd64. Currently only amd64 and arm64v8 are enabled. But other architectures such as arm/v7 or even s390x can be easily added.

Building for amd64 and arm64v8 took around 1h 30m on a Docker Medium instance on CircleCI.

Closes #4599

In case you want to test it, the multiarch image created by CircleCI is available under starbix/synapse:latest (master branch).

Signed-off-by: Cédric Laubacher cedric@laubacher.io

@Starbix Starbix mentioned this pull request May 3, 2020
Copy link
Contributor

@babolivier babolivier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

It looks like both jobs are almost doing the same thing, with a consequent amount of instructions which would make this config harder to maintain in the long run.

I'd suggest moving that to a bash script (e.g. in .circleci) so that each job can just have one run instruction which could look like .circleci/dockerhub_upload [TAG] or something like that.

@Starbix Starbix force-pushed the multi-arch branch 3 times, most recently from 906824f to eee30ac Compare May 9, 2020 13:48
@Starbix
Copy link
Contributor Author

Starbix commented May 9, 2020

I've added CircleCI commands to remove duplicate code

@babolivier
Copy link
Contributor

This looks globally sane to me, however when I try to run it in CircleCI it seems to be stuck on the pip installs. I'll try to find some time to investigate why.

@babolivier
Copy link
Contributor

Also you can fix the CI by pulling and merging the develop branch into your branch.

@Starbix
Copy link
Contributor Author

Starbix commented May 11, 2020

This looks globally sane to me, however when I try to run it in CircleCI it seems to be stuck on the pip installs. I'll try to find some time to investigate why.

How long did you let it run for? It may just be slow

@babolivier
Copy link
Contributor

It's currently been running for 14min.

@babolivier
Copy link
Contributor

Oh wait it is indeed very slow, the amd64 passed that step, now it's just waiting for the arm64 one. I'll let it run and see what happens.

@Starbix
Copy link
Contributor Author

Starbix commented May 11, 2020

It can take 1.5 hours. Maybe only enabling multi-arch for releases would make sense.

@anoadragon453
Copy link
Member

We've decided this is something we want, but are aware that it would delay docker images after a release is made. We need to see whether this can be sped up at all, but if not we'll probably still do it.

@Starbix
Copy link
Contributor Author

Starbix commented May 26, 2020

Using a native arm64 machine as a self-hosted runner would speed things up.

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started taking a look at this; a few initial comments and questions below.

Incidentally, I found https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/ a helpful introduction here.

Anyway, the problem here is the build time. I think the problem is simply that building all the arm stuff in an emulated arm environment is very slow. (for example: Building wheel for cryptography (PEP 517) takes almost 10 minutes).

I don't think it's appropriate that it should take an hour and a half for amd64 images to be built and published after each release simply to make it easier to run the images on arm, so we can't land this as-is.

A few thoughts on what we can do instead:

  • One option would be to build and push the amd64 and arm images separately, so that the amd64 users aren't made to suffer for the sake of the arm users. I don't know how easy that is to do.
  • We could switch the docker images away from alpine, which would mean we could take advantage of prebuilt manylinux wheels rather than having to build a load of stuff ourselves. Update the dockerfile to be Debian based, not Alpine based #6373 started this, but confused the issue with unrelated changes and eventually got abandoned. A clean start on this would be welcome.
  • Using a native arm64 machine would certainly help, but setting one up is going to be a whole bunch of work; I'd prefer to keep that as a last resort.

steps:
- checkout
- run: docker build -f docker/Dockerfile --label gitsha1=${CIRCLE_SHA1} -t matrixdotorg/synapse:${CIRCLE_TAG} -t matrixdotorg/synapse:${CIRCLE_TAG}-py3 .
- setup_remote_docker:
version: 18.09.3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to pin the docker version here? Indeed my impression is that buildx works better with 19.03?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not responding earlier, I wasn't near a computer for the last 2 weeks.
If I remember correctly, 18.09 is the highest version CircleCI supports and it defaulted to a lower version.
A higher version would be beneficial, as there wouldn't be a need to create context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The newest version available seems to be 19.03 now: https://circleci.com/docs/2.0/building-docker-images/#docker-version
I'll be able to test it and make the other changes this week.

- run: curl --silent -L "https://github.com/docker/buildx/releases/download/<< parameters.buildx_version >>/buildx-<< parameters.buildx_version >>.linux-amd64" > ~/.docker/cli-plugins/docker-buildx
- run: chmod a+x ~/.docker/cli-plugins/docker-buildx
- run: docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
- run: docker context create old-style
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the context called old-style ? (indeed, why do we need to create a context here?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name could be anything, but it was often called old-style in the tutorials I read. Creating a context is needed as CircleCI isn't running a relatively modern version of Docker.

Comment on lines +60 to +65
dockerfile:
type: string
default: docker/Dockerfile
push:
type: boolean
default: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any point in parameterising these? it seems to complicate the commandline without any real benefit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not at this moment. I added these parameters so one could easily expand the pipeline. I can remove the unused parameters.

type: string
default: --label gitsha1=${CIRCLE_SHA1}
steps:
- run: docker buildx build -f << parameters.dockerfile >><<# parameters.push >> --push<</ parameters.push >> --platform << parameters.platforms >> << parameters.parameters >> << parameters.context >>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default output format does not work well in circleci. suggest --progress=plain.

@richvdh
Copy link
Member

richvdh commented Jul 3, 2020

@richvdh richvdh added the X-Awaiting-Changes A contributed PR which needs changes and re-review before it can be merged label Jul 3, 2020
richvdh pushed a commit that referenced this pull request Jul 17, 2020
As mentioned in #7397, switching to a debian base should help with multi-arch work to save time on compiling. This is unashamedly based on #6373, but without the extra functionality. Switch python version back to generic 3.7 to always pull the latest. Essentially, keeping this as small as possible. The image is bigger though unfortunately.
@richvdh
Copy link
Member

richvdh commented Jul 17, 2020

Now that #7839 has landed, I've kicked off another test build at https://app.circleci.com/pipelines/github/matrix-org/synapse/14292/workflows/0c35f3e6-05fe-42e3-a955-30712a8c13fa/jobs/25558. I'll be interested to see how long it takes this time.

@richvdh
Copy link
Member

richvdh commented Jul 21, 2020

I'll be interested to see how long it takes this time.

answer: it failed, for missing header files. Looks like more changes are needed to support arm64 builds here.

@richvdh
Copy link
Member

richvdh commented Aug 6, 2020

I think this is superceded by #7921, so I'm going to close it.

@richvdh richvdh closed this Aug 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
X-Awaiting-Changes A contributed PR which needs changes and re-review before it can be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add aarch64 docker image
4 participants