-
Notifications
You must be signed in to change notification settings - Fork 764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: add support for linux-arm64 build #1214
Conversation
I believe this PR can now be improved with the techniques described here: https://www.docker.com/blog/dockerfiles-now-support-multiple-build-contexts/ Before I do this, is this project still maintained? Is this feature even wanted or likely to be merged? Just checking before I do any more work, since I haven't seen any interaction here. |
@strophy Thanks so much for the contrib! Apologies on the delay -- i saw this PR a while ago but i was unfamiliar with many of the concepts / tools used here so i didn't get around to it until now. Sorry again! Pardon my ignorance, but i wonder how popular is linux-arm64? What's its most common use cases? In general, we're aiming to provide pre-built binaries for the most popular platforms, and so i'm not sure we're willing to pay for the complexity of maintaining for less popular ones, especially if the complexity is high. There are a few things in this PR that i'm not too sure about.. and i wonder if they can be addressed somehow:
Also, curious how is this PR related to #1159 as you've mentioned in the PR description? Do you think this will be a progressive step to getting us M1 support? (and if so how big is the gap there?) In any case, thanks so much for the PR again! Btw, just ran CI and it's failing early. Could you take a look at the failure (link)? Thanks! |
@@ -49,10 +51,11 @@ RUN ./scripts/init_submodules.sh | |||
###################################### | |||
# Stage 2: Copy source files and build | |||
###################################### | |||
FROM node:12.22.6-stretch AS copy-and-build | |||
FROM node:16-bullseye AS copy-and-build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change prevents the following error under arm64 when running the closure compiler as follows:
#0 159.1 /github/grpc-web/packages/grpc-web/node_modules/.bin/google-closure-compiler --js=exports.js --js=../../javascript --js=node_modules/google-closure-library --entry_point=grpc.web.Exports --externs=externs.js --dependency_mode=PRUNE --compilation_level=ADVANCED_OPTIMIZATIONS --generate_exports --export_local_property_definitions --js_output_file=index.js
#0 179.0 npm ERR! code ENOENT
#0 179.0 npm ERR! syscall access
#0 179.0 npm ERR! path /github/grpc-web/packages/grpc-web/node_modules/grpc-web.js
#0 179.1 npm ERR! errno -2
#0 179.1 npm ERR! enoent ENOENT: no such file or directory, access '/github/grpc-web/packages/grpc-web/node_modules/grpc-web.js'
#0 179.1 npm ERR! enoent This is related to npm not being able to find a file.
#0 179.1 npm ERR! enoent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
./bazel-installer.sh && \ | ||
rm ./bazel-installer.sh | ||
RUN wget -nv -O /usr/local/bin/bazelisk \ | ||
https://github.com/bazelbuild/bazelisk/releases/download/v$BAZELISK_VERSION/bazelisk-linux-$TARGETARCH && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of bazelisk is required, since bazel-$BAZEL_VERSION-installer-linux-$arch
does not exist for arm64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
May i ask you a favor to list out all the dependency updates you've made in this PR in a bullet-point summary? (Feel free to mention all other notable changes you made as well!)
@@ -19,22 +19,24 @@ | |||
# Stage 1: Fetch binaries | |||
###################################### | |||
# node:... Docker image is based on buildpack-deps:stretch | |||
FROM buildpack-deps:stretch AS prepare | |||
FROM buildpack-deps:bullseye AS prepare |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was made to match base image versions with the requirement to use node16:bullseye
below
|
||
ARG BUILDIFIER_VERSION=1.0.0 | ||
ARG BUILDIFIER_VERSION=5.1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is required because Buildifier 1.0.0 does not exist for arm64
Hello, I have refactored the PR to use I have documented the various build errors that occur if I do not bump the dependency versions. It might be possible to resolve some of these in another way, but I am tempted to just use newer versions that take non-x86 targets into account rather than debug dependencies. I still occasionally face the following issues:
|
Hi thanks for the reply!
Oh aha i see!! Thanks for providing the context! Actually, historically i've always been using This is actually something an interesting point that you've brought up because i've noticed the mismatch between build-time v.s. runtime dependency before, but i didn't know if there's an alternative way of specifying build-time dependency other than using
I really appreciate you finding an alternative solution using the buildx here! Although, i'm a bit hesitant in doing this across the board (other than just in this workflow), because we have a bunch of documentations in various sites that we need to update too (example) if we decide to do that, and i'm not yet convinced that docker buildx bake + HCL is broadly adopted enough for that purpose.. (maybe it is? :)) For that reason, may I ask you to "fork" the workflow into a separate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks so much for the updates! Just a few comments! 😃
Mainly, as mentioned in a previous comment, would you mind forking out the workflow changes into a separate file for now? And will consider merging as we adopt buildx + hcl more broadly. Thanks!!
UPDATE: Ahh there's a CI error too :)
Step 7/35 : RUN case $TARGETARCH in arm64) arch="aarch_64"; ;; amd64) arch="x86_64"; ;; *) echo "ERROR: Machine type $TARGETARCH not supported."; ;; esac && curl -sSL https://github.com/protocolbuffers/protobuf/releases/download/v$PROTOBUF_VERSION/protoc-$PROTOBUF_VERSION-linux-$arch.zip -o protoc.zip && unzip -qq protoc.zip && cp ./bin/protoc /usr/local/bin/protoc
---> Running in 4f88b2ef6fea
ERROR: Machine type not supported.
[protoc.zip]
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of protoc.zip or
protoc.zip.zip, and cannot find protoc.zip.ZIP, period.
Service 'prereqs' failed to build: The command '/bin/sh -c case $TARGETARCH in arm64) arch="aarch_64"; ;; amd64) arch="x86_64"; ;; *) echo "ERROR: Machine type $TARGETARCH not supported."; ;; esac && curl -sSL https://github.com/protocolbuffers/protobuf/releases/download/v$PROTOBUF_VERSION/protoc-$PROTOBUF_VERSION-linux-$arch.zip -o protoc.zip && unzip -qq protoc.zip && cp ./bin/protoc /usr/local/bin/protoc' returned a non-zero code: 9
@@ -49,10 +51,11 @@ RUN ./scripts/init_submodules.sh | |||
###################################### | |||
# Stage 2: Copy source files and build | |||
###################################### | |||
FROM node:12.22.6-stretch AS copy-and-build | |||
FROM node:16-bullseye AS copy-and-build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
./bazel-installer.sh && \ | ||
rm ./bazel-installer.sh | ||
RUN wget -nv -O /usr/local/bin/bazelisk \ | ||
https://github.com/bazelbuild/bazelisk/releases/download/v$BAZELISK_VERSION/bazelisk-linux-$TARGETARCH && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
May i ask you a favor to list out all the dependency updates you've made in this PR in a bullet-point summary? (Feel free to mention all other notable changes you made as well!)
Thanks for reviewing! I've made the requested changes, but any testing is incredibly frustrating due to random failures running Bazelisk. I can't reproduce it reliably, but it only occurs when running the build of a whole file, stopping the build just before the problematic step and then running the command directly inside the partially built image always works perfectly, so there is no easy way for me to read the contents of the error log. I think it is best if we pause here and wait for the release of Bazel 5.2.0 which should contain some fixes relating to running in an emulator, allowing us to remove some of the hacks in this PR. I think it's also best if we stop working on caching solutions and do that in a separate PR, since we can now build arm64 in a separate experimental workflow, speed is no longer a priority. List of updated dependencies:
I'm unable to spend much more time working on this, and I don't know how your CI system (kokoro?) works. From the log output I can see it is using the Docker classic builder to attempt to process a Dockerfile that requires buildx to build, so some changes will be necessary here. Also interesting to note how building from a compose file will likely be handled in the future here, tl;dr it will parse the compose yml file, generate .hcl files and delegate everything to |
Oh ok i see!! Very sorry to hear about the frustrations! I'm perfectly fine to pause here and wait for the 5.2.0 release! 😃
Sounds great to me too! Speed is indeed not an issue here at all (since we only do this every x months really..) 😃
Thanks so much for the list of deps! I'll edit the PR summary to add these 😃
Do you think the Bazel 5.2.0 will help resolve the issue here? I think our kokoro CI essentially just runs the basic and interop scripts. Do they work locally for you?
Oh wow! That's awesome! Would probably get a great improvement when it happens! |
Hi @sampajano Bazel 5.2.0 was released with a lot of ARM fixes, but I am still seeing the same error ( |
@strophy Thanks so much for checking again! Indeed the Zig solution is already checked in and seems to be a more self-contained change overall (and seems easy to apply to all existing platforms too). 😃 Thank you so much for the contributions here and iterations again! 😃 I'll close this PR for now but i think whatever discussion/change we have here would help provide a great context should we ever need to reevaluate our build method! 😃 |
This PR (my first here!) contains GitHub Actions CI to build
linux-arm64
artifacts in CI, partially addressing some of the issues raised in #1159.While writing this I was unable to use (deprecated)
docker-compose
or the newerdocker compose
ordocker bake
system to build a compose file withdepends_on
. I think this may be due to different behaviour between Docker classic builds and buildx, but buildx is required for multi-arch images. As a result, I had to explicitly build and push each Dockerfile manually and in sequence so running the action will now also push updatedprereq
andprotoc_plugin
images to Docker Hub. Together with running in an emulator, this is quite slow (although caching improves this on subsequent runs), so I'm open to suggestions for improvements. It might also make sense to separate out building docker images and building a release, depending on how often you release.Several underlying versions and dependencies had to be bumped in order to build successfully on arm64, and I switched from
bazel
tobazelisk
, which was also necessary to build.Dependency Updates:
node:16.15.0-bullseye
prepare
stage now usesbuildpack-deps:bullseye
for build toolchain consistency, although I think this bump is not strictly necessary5.1.0
because1.0.0
does not support arm644.1.0
replaced with Bazelisk1.11.0
(installs latest Bazel, currently5.2.0
)google-closure-compiler
JS dependency bumped to~20220301.0.0
to avoid build errorgoogle-protobuf
JS dependency bumped to3.19.4
to match existing dependency used inprereqs
image