feat: implement scripts for binary release build #932

Merged
andygrove merged 10 commits into apache:main on Sep 19, 2024

Conversation

parthchandra
Contributor

Which issue does this PR close?

Closes #721

Rationale for this change

Allows us to publish artifacts to maven

What changes are included in this PR?

Scripts and a Dockerfile that build the native binaries in a Docker container and include them in an uber jar.

How are these changes tested?

Locally.

@parthchandra parthchandra marked this pull request as draft September 10, 2024 17:48
@parthchandra
Contributor Author

@andygrove FYI. This will build an uber jar but does not have the script to deploy. That script can be a different PR.
Note: This includes support for macOS binaries, but that part does not work yet because the build breaks when compiling BLAKE3. The macOS build is skipped if the Xcode library is not provided.

@parthchandra
Contributor Author

MacOS build hits this - BLAKE3-team/BLAKE3#180. Will try the suggested solutions.

-t "comet-rm:$IMGTAG" \
--build-arg HAS_MACOS_SDK=${HAS_MACOS_SDK} \
--build-arg MACOS_SDK=${MACOS_SDK} \
--load \
Member

I had to make this change to get this to work (or at least get further along) on linux, based on the comment at docker/buildx#59 (comment).

Suggested change
- --load \
+ --push \

Member

The next issue I hit was:

------
 > exporting to image:
------
ERROR: failed to solve: failed to push comet-rm:latest: push access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
Cleaning up ...
Error response from daemon: No such container: comet-arm64-builder-container

Contributor Author

--push will try to push to Docker Hub, and we don't want to do that since this is a temporary image. --load will load the image into your local containerd store, but if that does not work for you, let me look for a workaround.

@parthchandra
Contributor Author

@andygrove I changed the script to build a different image for each architecture instead of a single multi-arch image. It makes things much simpler at the cost of having multiple images (and a small increase in building time). It also removes the need to have the local container store and will also work with a custom docker backend as long as the backend supports docker build.
I'm hoping this addresses some of the authentication issues you are seeing.
Also, I have noticed I get some network errors doing the build inside a container when on a VPN. Maybe we could try when not on a VPN?
Anyway, could you take this for a spin?
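
A minimal sketch of the per-architecture approach described above, written as a dry run that only prints the commands. The `comet-rm-<arch>` image names, the `IMGTAG` default, and the Dockerfile path are assumptions for illustration and are not taken from the PR's script.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one `docker build` command per architecture.
# Image names, IMGTAG default, and the Dockerfile path are hypothetical.

IMGTAG="${IMGTAG:-latest}"

# Print the build command for a single architecture (amd64 or arm64).
build_cmd_for_arch() {
  local arch="$1"
  echo "docker build --platform linux/${arch} -t comet-rm-${arch}:${IMGTAG} -f comet-rm/Dockerfile ."
}

# One single-arch image per architecture instead of one multi-arch image.
for arch in amd64 arm64; do
  build_cmd_for_arch "$arch"
done
```

Because each image targets a single platform, a plain `docker build` is enough and nothing has to be pushed to a registry or loaded from a multi-arch builder.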

@parthchandra
Contributor Author

@andygrove @viirya For the binary builder I chose to use Ubuntu 20.04 as the base image because that is the image we currently use for our published docker images.
Ubuntu 20.04 has glibc 2.31, which means the binaries will be incompatible with many Red Hat-based releases that ship an older glibc. CentOS 7, for instance, has glibc 2.17.
(See: https://gist.github.com/wagenet/35adca1a032cec2999d47b6c40aa45b1)

Should we consider using an older version of Ubuntu?
(BTW I tried to build with an older version of glibc but the build kept failing for one reason or the other so I abandoned that effort).
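
To make the compatibility concern concrete, here is a rough sketch of how a host's glibc version could be compared with the build image's version. The `version_at_least` helper and the `BUILD_GLIBC` variable are hypothetical names; `getconf GNU_LIBC_VERSION` is only available on glibc-based systems, so the script falls back to "unknown" elsewhere.

```shell
#!/usr/bin/env bash
# Succeeds when version $2 is at least version $1 (e.g. required=2.31, actual=2.35).
version_at_least() {
  local required="$1" actual="$2"
  # sort -V orders version strings; if the smaller one is `required`, then actual >= required.
  [ "$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)" = "$required" ]
}

# glibc version of the build image (Ubuntu 20.04, per the comment above).
BUILD_GLIBC="2.31"

# On glibc systems getconf prints e.g. "glibc 2.31"; keep only the version number.
host_glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
host_glibc="${host_glibc:-unknown}"

if [ "$host_glibc" = "unknown" ]; then
  echo "could not determine host glibc version"
elif version_at_least "$BUILD_GLIBC" "$host_glibc"; then
  echo "host glibc $host_glibc is at least $BUILD_GLIBC"
else
  echo "host glibc $host_glibc is older than $BUILD_GLIBC; the native library may fail to load"
fi
```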

@viirya
Member

viirya commented Sep 13, 2024

Hmm, I think for OSS Comet we don't have a restriction on the supported glibc version for platform compatibility. Glibc 2.31 seems to have been released in 2020. I think it is old enough for the compatibility of our binary release. For example, CentOS 7 is already EOL (https://blog.centos.org/2023/04/end-dates-are-coming-for-centos-stream-8-and-centos-linux-7/)

Ubuntu 20.04 looks like a reasonable choice.

I personally wouldn't want to spend too much effort on resolving issues with building on older versions of Ubuntu.

@andygrove
Member

I ran the scripts locally and they seem to have worked.

I ran this command:

./dev/release/build-release-comet.sh -r https://github.com/parthchandra/datafusion-comet.git -b binary-build

The resulting jar file contains the following native libs:

% jar tvf  spark/target/comet-spark-spark3.4_2.12-0.3.0-SNAPSHOT.jar | grep libcomet
149504624 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/darwin/aarch64/libcomet.dylib
52964152 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/linux/aarch64/libcomet.so
56773320 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/linux/amd64/libcomet.so
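
A check like the one above could be scripted. The sketch below reads a jar listing on stdin and prints any expected Linux library that is missing. The `missing_native_libs` function name is hypothetical, and the expected paths are copied from the listing above.

```shell
#!/usr/bin/env bash
# Reads a jar listing on stdin and prints each expected native library that is absent.
missing_native_libs() {
  local listing expected
  listing="$(cat)"
  for expected in \
    org/apache/comet/linux/aarch64/libcomet.so \
    org/apache/comet/linux/amd64/libcomet.so; do
    # -F matches a fixed string, -q suppresses output.
    if ! printf '%s\n' "$listing" | grep -Fq "$expected"; then
      echo "$expected"
    fi
  done
}

# Usage (JAR is a placeholder for the built artifact):
#   jar tf "$JAR" | missing_native_libs
```

Empty output means both Linux libraries were found in the jar.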

@parthchandra
Contributor Author

The artifact

149504624 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/darwin/aarch64/libcomet.dylib

seems to be a leftover from a manual run. The script does not build macOS binaries at the moment.

@parthchandra parthchandra marked this pull request as ready for review September 16, 2024 20:40
@parthchandra
Contributor Author

@andygrove thank you for testing! This is ready for review.

Member

@andygrove andygrove left a comment

LGTM. Thanks @parthchandra!

@andygrove andygrove requested a review from viirya September 16, 2024 22:18
endif

# build native libs for arm64 architecture Linux/MacOS on a Linux/arm64 machine/container
core-arm64-libs:
Member

There are two core-arm64-libs targets?

Contributor Author

Removed.

Makefile Outdated
Comment on lines 52 to 55
ifdef $(HAS_OSXCROSS)
cd native && cargo zigbuild -j 1 --target aarch64-apple-darwin --release
endif
cd native && cargo build -j 2 --release
Member

So for MacOSX build, we need to run both cargo zigbuild and cargo build?

Contributor Author

Oops. This was a mistake. I experimented with zigbuild for macOS. Removed.

Makefile Outdated
# build native libs for arm64 architecture Linux/MacOS on a Linux/arm64 machine/container
core-arm64-libs:
# if the environment variable HAS_OSXCROSS is defined
ifdef $(HAS_OSXCROSS)
Member

Do we need MacOS X SDK installed for HAS_OSXCROSS case?

Contributor Author

This part is a placeholder for future work to enable macOS. The macOS SDK has to be provided to the Dockerfile as input, and the build-release-comet script will copy it into the release builder's Docker image.
I removed the option because the build did not succeed, but left the work in place so we can fix this later. I can remove it if that makes things clearer.

# See the License for the specific language governing permissions and
# limitations under the License.
#
ARG HAS_MACOS_SDK="false"
Member

https://hub.docker.com/r/messense/cargo-zigbuild claims they have MacOS X SDK pre-installed in their docker image. Can we reuse it to use MacOS X SDK for Comet build?

Contributor Author

I did not try the docker image from zigbuild (yet). I will try it and if it works, then we can remove the HAS_OSXCROSS portions entirely.
Follow up issue: #947

Contributor Author

I tried the zigbuild docker image and the build failed. I'll investigate the failure in the followup.

@parthchandra
Contributor Author

@viirya Any further comments?

@@ -46,6 +46,22 @@ format:
./mvnw compile test-compile scalafix:scalafix -Psemanticdb $(PROFILES)
./mvnw spotless:apply $(PROFILES)

# build native libs for amd64 architecture Linux/MacOS on a Linux/amd64 machine/container
core-amd64-libs:
cd native && cargo build -j 2 --release
Member

Don't we need to specify target for this?

Contributor Author

This will build the binary for the same architecture as the machine. So no need to specify target.
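
As a quick way to confirm which target a plain `cargo build` will produce, the host triple can be read from `rustc -vV`. This is a small sketch; the `host_triple` helper is a hypothetical name, and it assumes no `build.target` override is configured for Cargo.

```shell
#!/usr/bin/env bash
# Extracts the host triple from `rustc -vV` output supplied on stdin.
host_triple() {
  # rustc -vV prints a line such as "host: x86_64-unknown-linux-gnu".
  sed -n 's/^host: //p'
}

# Usage:
#   rustc -vV | host_triple
```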

Comment on lines +52 to +55
ifdef HAS_OSXCROSS
rustup target add x86_64-apple-darwin
cd native && cargo build -j 2 --target x86_64-apple-darwin --release
endif
Member

Since L51 is not in an else block, if HAS_OSXCROSS is set, will we additionally build the library for x86_64-apple-darwin? I.e., two libraries for core-amd64-libs?

Contributor Author

Yes. For the amd64 architecture, we build one for Linux and one for macOS.

./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-3.5 -P scala-2.13 -DskipTests install

echo "Installed to local repo: ${LOCAL_REPO}"

Member

Do we need to remove the created docker image/container after installation?

Contributor Author

The container is removed in the cleanup part of the script which is invoked on exit or error.
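
The cleanup-on-exit pattern mentioned here is commonly implemented with a shell `trap`. The following is a generic sketch and not the PR's actual script: `run_with_cleanup` is a hypothetical helper, and the container removal is shown only as a comment.

```shell
#!/usr/bin/env bash
# Runs the given command in a subshell and always invokes cleanup afterwards.
# The function body is wrapped in ( ... ), so the EXIT trap is scoped to the subshell.
run_with_cleanup() (
  cleanup() {
    echo "Cleaning up ..."
    # A real script would remove the builder container here, for example:
    #   docker rm -f "$CONTAINER_NAME" || true
  }
  # EXIT fires whether the command below succeeds or fails.
  trap cleanup EXIT
  "$@"
)

# Usage:
#   run_with_cleanup ./build-step.sh
```

The exit status of the wrapped command is preserved, so callers can still detect a failed build after cleanup has run.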

@viirya
Member

viirya commented Sep 19, 2024

Looks good to me, with a few minor questions.

@andygrove andygrove merged commit fa275f1 into apache:main Sep 19, 2024
74 checks passed
Development

Successfully merging this pull request may close these issues.

Create binary releases
3 participants