Add support to build aarch64 wheels and artifacts #8196

Closed
wants to merge 2 commits into from

janaknat

@janaknat janaknat commented Jan 8, 2021

Add support to build kokoro artifacts and python wheels

@google-cla

google-cla bot commented Jan 8, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Jan 8, 2021
@janaknat
Author

janaknat commented Jan 8, 2021

@googlebot I signed it!

@jtattermusch
Contributor

@dlj-NaN can you please review?

@janaknat
Author

I've updated to build python 3.9 artifacts also.

@dlj-NaN
Contributor

dlj-NaN commented Jan 27, 2021

+Josh for a second set of eyes.

Where is this expected to be run?

@janaknat
Author

@dlj-NaN In the internal ci (Kokoro?) with arm64 hosts. EC2 has Graviton2 (*6g) instances.

@jtattermusch
Contributor

I think we should add support for building aarch64 linux wheels sooner rather than later and this PR is a great first step towards that, but the protobuf team won't be able to make these wheels "official" until we also have some test coverage for protobuf python on aarch64.

Here's what's IMHO our best way forward:

  • we'd actually build the ARM64 linux wheels by crosscompiling them (because that's easier for us to run on the existing kokoro infrastructure and doesn't require collecting build results from multiple machines after the build is done).
  • once we manually verify the crosscompiled build works fine, we'd merge the PR that adds the build support, but we wouldn't publish the arm64 wheels into any official channels (= so we would know we can build them and the infrastructure for that is ready, but we wouldn't publish them officially since at this point they'd be untested)
  • over time we'd add some testing for protobuf python on ARM (either using an emulator or using real ARM hardware, depending on what's easier. while testing on real hardware seems important for projects like gRPC, running protobuf tests on an emulator would also work since protobuf's tests are deterministic, single threaded and relatively simple).
  • once we have both the ability to build ARM64 wheels and some (basic should be enough) test coverage for protobuf python on ARM64, we can proceed and make the python ARM64 wheels official. Once the wheels are official they can be uploaded to PyPI with each protobuf release (just as the x64 wheels are). Of course this step needs to be approved by the protobuf team.
  • this shouldn't really affect the existing release process as multiple wheels are already being uploaded in that step and the ARM64 wheels would come from the same source (as they'd be crosscompiled).

@jtattermusch
Contributor

> @dlj-NaN In the internal ci (Kokoro?) with arm64 hosts. EC2 has Graviton2 (*6g) instances.

While the EC2 Graviton instances will be useful for running gRPC tests (and perhaps for protobuf tests as well, but we still need to evaluate the best approach here), I think it makes sense to try to build ARM64 wheels by crosscompiling them. The reasons why that seems better are explained in my other comment.

Have you experimented with wheel crosscompilation? (It shouldn't be very different from building on a real ARM64 linux machine).

@janaknat
Author

I haven't experimented with cross compilations. Though I have compiled under emulation.

Since v1.8.0, cibuildwheel can build wheels for non-native architectures using CIBW_ARCHS_LINUX. See https://cibuildwheel.readthedocs.io/en/stable/options/#archs for more details.

I've got a working PR for scikit-image using cibuildwheel and Github Actions (x86_64): scikit-image/scikit-image#5197

Though protobuf is structured differently, would running under emulation with cibuildwheel work as a solution?
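
For reference, here is a minimal sketch of how an emulated aarch64 build could be launched with cibuildwheel from a small Python helper. It assumes cibuildwheel 1.8.0 or newer is installed and that QEMU user-mode emulation is already registered on the x86_64 host. The helper names and the `cp39-*` build selector are illustrative and not part of this PR.

```python
import os
import subprocess


def cibuildwheel_invocation(archs="aarch64", build="cp39-*", output_dir="wheelhouse"):
    """Return (command, environment) for a Linux cibuildwheel run.

    Hypothetical helper: CIBW_ARCHS_LINUX, CIBW_BUILD, --platform and
    --output-dir are cibuildwheel options; the defaults are illustrative.
    """
    env = dict(os.environ)
    env["CIBW_ARCHS_LINUX"] = archs  # non-native archs are built under emulation
    env["CIBW_BUILD"] = build        # limit the run to the selected Python versions
    command = ["cibuildwheel", "--platform", "linux", "--output-dir", output_dir]
    return command, env


def build_wheels(project_dir="."):
    """Run cibuildwheel in project_dir; raises CalledProcessError on failure."""
    command, env = cibuildwheel_invocation()
    subprocess.run(command, cwd=project_dir, env=env, check=True)
```

Whether this fits protobuf depends on how long the emulated native build takes, since the whole compile would run under QEMU.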

Contributor

@jtattermusch jtattermusch left a comment

> I haven't experimented with cross compilations. Though I have compiled under emulation.
>
> Since v1.8.0, cibuildwheel can build wheels for non-native architectures using CIBW_ARCHS_LINUX. See https://cibuildwheel.readthedocs.io/en/stable/options/#archs for more details.
>
> I've got a working PR for scikit-image using cibuildwheel and Github Actions (x86_64): scikit-image/scikit-image#5197
>
> Though protobuf is structured differently, would running under emulation with cibuildwheel work as a solution?

Building the wheel under an emulator is one of the options, but the biggest concern here is the speed of build. Protobuf's codebase is not small so the concern is whether the python wheel build will become very slow.
If the wheel build under emulator takes a few minutes, it would be a nice option, but if it takes e.g. 30mins, that wouldn't be acceptable I think. If we do use an emulator, we should just build off of quay.io/pypa/manylinux2014_aarch64

@@ -0,0 +1,7 @@
FROM quay.io/pypa/manylinux2014_aarch64
Contributor

I haven't tried that out yet, but dockcross/manylinux2014-aarch64 should be a good start for being able to build aarch64 wheels on an x64 machine.

https://github.com/dockcross/dockcross

IIURC, the build process should be very similar to what's currently in place.

Author

@jtattermusch I've been trying to run with dockcross/manylinux2014-aarch64. I've run into issues with the protoc binary built. Error:

../src/protoc: line 117: /io/protobuf/src/.libs/lt-protoc: cannot execute binary file
../src/protoc: line 117: /io/protobuf/src/.libs/lt-protoc: Success
Generating google/protobuf/descriptor_pb2.py...

I checked the issues tab and found #3912 with a similar error signature. But that looks like it has been taken care of. Any suggestions?

Author

@dlj-NaN Any suggestions on the cross compilation? Ran into the binary file issue. It continues with 'Success' but abruptly fails after.

Contributor

Cross-compilation is not something I would expect to work as-is.

If everything runs under the target platform (hardware or emulator) -- and this means not just the Python build, but everything, starting from ./configure -- then the build steps should work. It sounds like your investigation was based on building the regular protobuf runtime (the part built by make) for a different target (probably x86_64?).

A cross-build would instead have to build the protobuf runtime both for the target and the host running the build. In theory, this is possible... but it is very difficult to do (as you see).

Author

@janaknat janaknat Feb 9, 2021

@dlj-NaN Currently I am running everything under dockcross/manylinux2014-aarch64.

In kokoro/release/python/linux/build_artifacts.sh if I set DOCKER_IMAGE to dockcross/manylinux2014-aarch64, multibuild will pull it in and use it to run everything from configure, to wheel building, to repair.

I had to change ./configure to ./configure --host=aarch64-linux-gnu (used while building protoc) in the pre-build step that multibuild runs from kokoro/release/python/linux/config.sh.

Immediately after that, running python setup.py build_py produces the error. I've made sure that the PROTOC environment variable is set to pwd/src/protoc; setup.py looks for it before anything else.

The host is x86_64 and dockcross cross compiles for aarch64 using manylinux2014-aarch64.

@@ -17,11 +17,16 @@ if [ $# -lt 3 ]; then
exit 1
fi

ARCH=`uname -m`
Contributor

I suspect this file is now obsolete (I'm not 100% sure though): #8250

@jtattermusch
Contributor

I've also experimented with crosscompiling the python wheels using the dockcross/manylinux2014-aarch64 image. This is what I've done (some steps are very similar to what @janaknat has tried)

  • set DOCKER_IMAGE=dockcross/manylinux2014-aarch64 in the top level script
  • in config.sh I ran ./configure --host=aarch64 instead of just ./configure
  • Since the make build now builds protoc binary that targets aarch64, I've created a trivial shell qemu-arm wrapper for protoc and set the PROTOC=path_to_my_wrapper before invoking python setup.py build_py (this step only runs python codegeneration and running protoc process under an emulator that's already available in the dockcross docker image is fast and simple).
  • In the python setup.py bdist_wheel --cpp_implementation --compile_static_extension command, I added --plat-name=manylinux2014_aarch64 argument.

This way the binary wheel build succeeds and all the native code gets crosscompiled correctly (both libprotobuf.a and the python extension) and all of the changes above would be quite easy to integrate into the existing script for building python wheels.
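
The protoc wrapper from the third step can be sketched roughly as follows. The original was described as a trivial shell script; this is a Python rendering for illustration only. The emulator name `qemu-aarch64`, the `QEMU_BIN` and `REAL_PROTOC` environment variables, and the function names are all assumptions, not taken from the actual script.

```python
import os
import sys


def emulated_protoc_argv(protoc_args, real_protoc=None, emulator=None):
    """Build the argv that runs the aarch64 protoc under a user-mode emulator."""
    # REAL_PROTOC and QEMU_BIN are hypothetical variable names for this sketch.
    # real_protoc should point at the ELF binary, not a libtool wrapper script.
    real_protoc = real_protoc or os.environ["REAL_PROTOC"]
    emulator = emulator or os.environ.get("QEMU_BIN", "qemu-aarch64")
    return [emulator, real_protoc, *protoc_args]


def main():
    # Replace the current process so protoc's exit status reaches setup.py unchanged.
    argv = emulated_protoc_argv(sys.argv[1:])
    os.execvp(argv[0], argv)
```

Saved as an executable script that calls `main()`, its path is what PROTOC would point to before running python setup.py build_py.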

Unfortunately there are two issues I've discovered:

  • the binary wheel produced by the crosscompilation process above contains native .so files that are incorrectly named - the files should be named e.g. google/protobuf/pyext/_message.cpython-39-aarch64-linux-gnu.so, but instead they are named google/protobuf/pyext/_message.cpython-39-x86_64-linux-gnu.so (it's just the file name that's wrong, the binary itself is an aarch64 ELF as it should be). AFAIK this suffix comes from sysconfig.get_config_var('EXT_SUFFIX') and I haven't yet found an easy way to override it (e.g. a cmdline arg that would enable me to set it).

  • the last step in building the wheel is running auditwheel repair, but it seems that auditwheel doesn't work in a cross-architecture environment (you need to be running on aarch64 to be able to audit and repair an aarch64 wheel). Since running auditwheel is quite fast and it's a separate build step, it would be possible to first build the wheel in the crosscompilation environment and then run auditwheel under an aarch64 emulator, but unfortunately that complicates the build process a bit.

I think both of the issues I've run into are fixable, but the question is whether the fix would overly complicate the wheel build or not.
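
To make the first issue concrete, here is a small sketch for confirming that a mis-named .so really is an aarch64 binary, by reading the `e_machine` field of its ELF header. The function names are hypothetical; the field offset and the machine codes (62 for x86_64, 183 for aarch64) come from the ELF specification. The last line prints the suffix of whichever interpreter runs the sketch, which is the value mentioned above.

```python
import struct
import sysconfig

# e_machine values defined by the ELF specification.
ELF_MACHINES = {62: "x86_64", 183: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the architecture named by the e_machine field of an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) is 1 for little-endian and 2 for big-endian files.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_ident and e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})")


def elf_machine_of_file(path: str) -> str:
    """Read just enough of the file at path to identify its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))


# Extension-module suffix reported by the interpreter running this sketch.
print(sysconfig.get_config_var("EXT_SUFFIX"))
```

Calling `elf_machine_of_file` on the extension inside the unpacked wheel would show the real target regardless of what the file name claims.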

@jtattermusch
Contributor

As an aside, to test whether the native extension works I came up with this little script:
https://gist.github.com/jtattermusch/6d56350c6d19e21bf06633e6ef2667b3
(note that the protobuf wheel has two modes of operation and by default, the C++ acceleration seems to be off and the pure python impl is used).
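
Separately from the linked gist (whose contents are not reproduced here), a minimal way to see which implementation is active is sketched below. It relies on `google.protobuf.internal.api_implementation`, which is an internal module and may change between releases; the helper name is hypothetical.

```python
def active_protobuf_implementation():
    """Return the active backend name, or None if protobuf cannot be imported."""
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    # Type() reports the backend in use, e.g. 'python' or 'cpp'.
    return api_implementation.Type()


print(active_protobuf_implementation())
```

Setting the environment variable PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp before starting Python requests the C++ implementation, which is the mode that exercises the native extension.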

@jtattermusch
Contributor

Update: looks like I've been able to successfully crosscompile an aarch64 wheel (in a way that's reasonably clean and integrates well with the existing script for building python wheels) and the wheel seems to work well on a real ARM machine and it passes the auditwheel show check.

I'll try to put together a PR that demonstrates what I've done soon (hopefully tomorrow).

@jtattermusch
Contributor

Ok, looks like I've been able to solve all the remaining challenges in crosscompiling the python wheels and I now have a ready-to-review PR: #8280

@TeBoring
Contributor

TeBoring commented Jun 1, 2021

Assuming this is no longer needed?

@TeBoring TeBoring closed this Jun 1, 2021