feat: implement scripts for binary release build (#932)
* feat: implement scripts for binary release build

* Install to temp local maven repo and updates for MacOS

* newline

* Use independent docker images for different architectures instead of
a multi-arch image

* update docs and cleanup

* remove unused code

* fail build script on error

* Build all profiles

* remove duplicate target from makefile

---------

Co-authored-by: Andy Grove <agrove@apache.org>
parthchandra and andygrove authored Sep 19, 2024
1 parent 9dfd6d1 commit fa275f1
Showing 6 changed files with 422 additions and 0 deletions.
16 changes: 16 additions & 0 deletions Makefile
@@ -46,6 +46,22 @@ format:
	./mvnw compile test-compile scalafix:scalafix -Psemanticdb $(PROFILES)
	./mvnw spotless:apply $(PROFILES)

# build native libs for amd64 architecture Linux/MacOS on a Linux/amd64 machine/container
core-amd64-libs:
	cd native && cargo build -j 2 --release
ifdef HAS_OSXCROSS
	rustup target add x86_64-apple-darwin
	cd native && cargo build -j 2 --target x86_64-apple-darwin --release
endif

# build native libs for arm64 architecture Linux/MacOS on a Linux/arm64 machine/container
core-arm64-libs:
	cd native && cargo build -j 2 --release
ifdef HAS_OSXCROSS
	rustup target add aarch64-apple-darwin
	cd native && cargo build -j 2 --target aarch64-apple-darwin --release
endif

core-amd64:
	rustup target add x86_64-apple-darwin
	cd native && RUSTFLAGS="-Ctarget-cpu=skylake -Ctarget-feature=-prefer-256-bit" CC=o64-clang CXX=o64-clang++ cargo build --target x86_64-apple-darwin --release
53 changes: 53 additions & 0 deletions dev/release/README.md
@@ -81,6 +81,50 @@ python3 generate-changelog.py 0.0.0 HEAD 0.1.0 > ../changelog/0.1.0.md
Create a PR against the _main_ branch to add this change log and once this is approved and merged, cherry-pick the
commit into the release branch.

### Build the jars

#### Setup to do the build
The build process requires Docker. Download the latest Docker Desktop from https://www.docker.com/products/docker-desktop/.
If you have multiple Docker contexts, switch to the Docker Desktop context before building. For example:

```shell
$ docker context ls
NAME DESCRIPTION DOCKER ENDPOINT ERROR
default Current DOCKER_HOST based configuration unix:///var/run/docker.sock
desktop-linux Docker Desktop unix:///Users/parth/.docker/run/docker.sock
my_custom_context * tcp://192.168.64.2:2376

$ docker context use desktop-linux
```
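
To confirm which context is active before starting the build, the `*` marker in the listing can be checked mechanically. The following is a minimal sketch and not part of the release scripts: `active_context` is a hypothetical helper that reads `docker context ls`-style text on stdin and prints the name that carries the `*` marker. It assumes the marker appears as the second whitespace-separated field, as in the listing above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the release scripts): print the name of the
# context marked with '*' in `docker context ls`-style output read from stdin.
# Assumes the marker is the second whitespace-separated field of its row.
active_context() {
  # Skip the header row (NR > 1), then report the first row whose second field is '*'.
  awk 'NR > 1 && $2 == "*" { print $1; exit }'
}

# Typical use on a machine with Docker installed:
#   docker context ls | active_context
```

If the printed name is not the Docker Desktop context, run `docker context use` as shown above before building.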
#### Run the build script
The `build-release-comet.sh` script creates a Docker image for each architecture (`comet-rm-arm64` and `comet-rm-amd64`)
and uses each image to build the platform-specific binaries. These builder images are rebuilt every time the script is run.
The script optionally accepts a repository and branch to build the binaries from. Note that the local git repo is not
used to build the native binaries, but it is used to build the final uber jar.

```shell
Usage: build-release-comet.sh [options]

This script builds comet native binaries inside docker images. The images are named
"comet-rm-arm64" and "comet-rm-amd64" and are generated by this script

Options are:

-r [repo] : git repo (default: https://github.com/apache/datafusion-comet.git)
-b [branch] : git branch (default: release)
-t [tag] : tag for the comet-rm docker images to use for building (default: "latest").
-h : show this usage message
```

Example:

```shell
cd dev/release && ./build-release-comet.sh && cd ../..
```
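
The option handling described in the usage text can be sketched as a standalone function. This is an illustrative sketch only: `parse_build_opts` is a hypothetical name, the defaults are copied from the usage text above, and the fork URL and branch name in the example call are made up.

```shell
#!/bin/bash
# Sketch of the -r/-b/-t option handling described in the usage text.
# parse_build_opts is a hypothetical name; defaults mirror the usage text.
parse_build_opts() {
  REPO="https://github.com/apache/datafusion-comet.git"
  BRANCH="release"
  IMGTAG="latest"
  # Reset getopts state so the function can be called more than once.
  local opt OPTIND=1
  while getopts "r:b:t:" opt; do
    case "$opt" in
      r) REPO="$OPTARG" ;;
      b) BRANCH="$OPTARG" ;;
      t) IMGTAG="$OPTARG" ;;
      *) echo "Invalid option. Run with -h for help." >&2; return 1 ;;
    esac
  done
}

# Example: build from a fork and a non-default branch (both values are made up).
parse_build_opts -r "https://github.com/example/datafusion-comet.git" -b "branch-0.1"
echo "Building binaries from $REPO/$BRANCH using image tag $IMGTAG"
```

Any option that is omitted keeps its default, so running the script with no arguments builds the `release` branch of the Apache repository with images tagged `latest`.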

#### Build output
The build output is installed to a temporary local Maven repository. The build script prints the location of this
repository when it finishes. You will need this location later, when deploying the artifacts to a staging repository.
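
To check what was installed, the jars under the printed location can be listed. The following is a minimal sketch, not part of the release scripts: `list_staged_jars` is a hypothetical helper, and the path passed to it is whatever location the build script printed. Because the local repository also receives the dependencies Maven downloads during the build, the listing will likely contain more than the Comet jars.

```shell
#!/bin/bash
# Hypothetical helper (not part of the release scripts): list the jar files
# found under a local Maven repository directory, in sorted order.
list_staged_jars() {
  local repo_dir="$1"
  if [ ! -d "$repo_dir" ]; then
    echo "Not a directory: $repo_dir" >&2
    return 1
  fi
  find "$repo_dir" -type f -name '*.jar' | sort
}

# Typical use, with the location printed by the build script:
#   list_staged_jars /path/printed/by/the/script | grep comet
```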

### Tag the Release Candidate

Tag the release branch with `0.1.0-rc1` and push to the `apache` repo
@@ -105,6 +149,15 @@ Run the create-tarball script on the release candidate tag (`0.1.0-rc1`) to crea
GH_TOKEN=<TOKEN> ./dev/release/create-tarball.sh 0.1.0 1
```

### Publish the maven artifacts
#### Setup maven
##### One time project setup
Set up your project in the ASF Nexus Repository by following the instructions at https://infra.apache.org/publishing-maven-artifacts.html
##### Release Manager Setup
Set up your development environment by following the instructions at https://infra.apache.org/publishing-maven-artifacts.html

TODO: build and publish a release candidate to nexus.

### Start an Email Voting Thread

Send the email that is generated in the previous step to `dev@datafusion.apache.org`.
202 changes: 202 additions & 0 deletions dev/release/build-release-comet.sh
@@ -0,0 +1,202 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null && pwd)"
COMET_HOME_DIR=$SCRIPT_DIR/../..

function usage {
local NAME=$(basename $0)
cat <<EOF
Usage: $NAME [options]
This script builds comet native binaries inside docker images. The images are named
"comet-rm-arm64" and "comet-rm-amd64" and are generated by this script
Options are:
-r [repo] : git repo (default: ${REPO})
-b [branch] : git branch (default: ${BRANCH})
-t [tag] : tag for the comet-rm docker images to use for building (default: "latest").
-h : show this usage message
EOF
exit 1
}

function cleanup()
{
if [ $CLEANUP != 0 ]
then
echo Cleaning up ...
if [ "$(docker ps -a | grep comet-arm64-builder-container)" != "" ]
then
docker rm comet-arm64-builder-container
fi
if [ "$(docker ps -a | grep comet-amd64-builder-container)" != "" ]
then
docker rm comet-amd64-builder-container
fi
CLEANUP=0
fi
}

trap cleanup SIGINT SIGTERM EXIT

CLEANUP=1

REPO="https://github.com/apache/datafusion-comet.git"
BRANCH="release"
MACOS_SDK=
HAS_MACOS_SDK="false"
IMGTAG=latest

while getopts "b:hr:t:" opt; do
case $opt in
r) REPO="$OPTARG";;
b) BRANCH="$OPTARG";;
t) IMGTAG="$OPTARG" ;;
h) usage ;;
\?) echo "Invalid option. Run with -h for help." >&2; exit 1 ;;
esac
done

echo "Building binaries from $REPO/$BRANCH"

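WORKING_DIR="$SCRIPT_DIR/comet-rm/workdir"
# Make sure the working directory exists so that cp copies into it rather than creating a file named "workdir"
mkdir -p "$WORKING_DIR"
cp "$SCRIPT_DIR/../cargo.config" "$WORKING_DIR"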

# TODO: Search for Xcode (Once building macos binaries works)
#PS3="Select Xcode:"
#select xcode_path in `find . -name "${MACOS_SDK}"`
#do
# echo "found Xcode in $xcode_path"
# cp $xcode_path $WORKING_DIR
# break
#done

if [ -f "${WORKING_DIR}/${MACOS_SDK}" ]
then
HAS_MACOS_SDK="true"
fi

BUILDER_IMAGE_ARM64="comet-rm-arm64:$IMGTAG"
BUILDER_IMAGE_AMD64="comet-rm-amd64:$IMGTAG"

# Build the docker image in which we will do the build
docker build \
--platform=linux/arm64 \
-t "$BUILDER_IMAGE_ARM64" \
--build-arg HAS_MACOS_SDK=${HAS_MACOS_SDK} \
--build-arg MACOS_SDK=${MACOS_SDK} \
"$SCRIPT_DIR/comet-rm"

docker build \
--platform=linux/amd64 \
-t "$BUILDER_IMAGE_AMD64" \
--build-arg HAS_MACOS_SDK=${HAS_MACOS_SDK} \
--build-arg MACOS_SDK=${MACOS_SDK} \
"$SCRIPT_DIR/comet-rm"

# Clean previous Java build
pushd $COMET_HOME_DIR && ./mvnw clean && popd

# Run the builder container for each architecture. The entrypoint script will build the binaries

# AMD64
echo "Building amd64 binary"
# Test the command directly: with "set -e" a separate "$?" check would never be reached on failure
if ! docker run \
  --name comet-amd64-builder-container \
  --memory 24g \
  --cpus 6 \
  -it \
  --platform linux/amd64 \
  $BUILDER_IMAGE_AMD64 "${REPO}" "${BRANCH}" amd64
then
  echo "Building amd64 binary failed."
  exit 1
fi

# ARM64
echo "Building arm64 binary"
# Test the command directly: with "set -e" a separate "$?" check would never be reached on failure
if ! docker run \
  --name comet-arm64-builder-container \
  --memory 24g \
  --cpus 6 \
  -it \
  --platform linux/arm64 \
  $BUILDER_IMAGE_ARM64 "${REPO}" "${BRANCH}" arm64
then
  echo "Building arm64 binary failed."
  exit 1
fi

echo "Building binaries completed"
echo "Copying to java build directories"

JVM_TARGET_DIR=$COMET_HOME_DIR/common/target/classes/org/apache/comet
mkdir -p $JVM_TARGET_DIR

mkdir -p $JVM_TARGET_DIR/linux/amd64
docker cp \
comet-amd64-builder-container:"/opt/comet-rm/comet/native/target/release/libcomet.so" \
$JVM_TARGET_DIR/linux/amd64/

if [ "$HAS_MACOS_SDK" == "true" ]
then
mkdir -p $JVM_TARGET_DIR/darwin/x86_64
docker cp \
comet-amd64-builder-container:"/opt/comet-rm/comet/native/target/x86_64-apple-darwin/release/libcomet.dylib" \
$JVM_TARGET_DIR/darwin/x86_64/
fi

mkdir -p $JVM_TARGET_DIR/linux/aarch64
docker cp \
comet-arm64-builder-container:"/opt/comet-rm/comet/native/target/release/libcomet.so" \
$JVM_TARGET_DIR/linux/aarch64/

if [ "$HAS_MACOS_SDK" == "true" ]
then
mkdir -p $JVM_TARGET_DIR/darwin/aarch64
docker cp \
comet-arm64-builder-container:"/opt/comet-rm/comet/native/target/aarch64-apple-darwin/release/libcomet.dylib" \
$JVM_TARGET_DIR/darwin/aarch64/
fi

# Build final jar
echo "Building uber jar and publishing it locally"
pushd $COMET_HOME_DIR

GIT_HASH=$(git rev-parse --short HEAD)
LOCAL_REPO=$(mktemp -d /tmp/comet-staging-repo-XXXXX)

./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-3.4 -P scala-2.12 -DskipTests install
./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-3.4 -P scala-2.13 -DskipTests install
./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-3.3 -P scala-2.12 -DskipTests install
./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-3.3 -P scala-2.13 -DskipTests install
./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-3.5 -P scala-2.12 -DskipTests install
./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-3.5 -P scala-2.13 -DskipTests install

echo "Installed to local repo: ${LOCAL_REPO}"

popd
91 changes: 91 additions & 0 deletions dev/release/comet-rm/Dockerfile
@@ -0,0 +1,91 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
ARG HAS_MACOS_SDK="false"

FROM ubuntu:20.04 AS base

USER root

# For apt to be noninteractive
ENV DEBIAN_FRONTEND=noninteractive
ENV DEBCONF_NONINTERACTIVE_SEEN=true

ENV LC_ALL=C
# Install prerequisites for rust
RUN export LC_ALL=C \
&& apt-get update \
&& apt-get install --no-install-recommends -y \
ca-certificates \
build-essential \
curl \
wget \
git \
llvm \
clang \
libssl-dev \
lzma-dev \
liblzma-dev \
openssh-client \
cmake \
cpio \
libxml2-dev \
patch \
bzip2 \
libbz2-dev \
zlib1g-dev


# Install rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
RUN cargo install cargo2junit

# Stage to add OSXCross if MacOSSDK is provided
FROM base AS with-macos-sdk-true
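ARG MACOS_SDK
# TARGETPLATFORM is set automatically by BuildKit but must be redeclared to be visible inside this stage
ARG TARGETPLATFORM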

COPY workdir/$MACOS_SDK /opt/xcode/

RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
rustup target add aarch64-apple-darwin; \
elif [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
rustup target add x86_64-apple-darwin; \
fi

# Build OSXCross
# Stay in /opt/osxcross: both the tarballs directory and build.sh live there
RUN cd /opt && git clone --depth 1 https://github.com/tpoechtrager/osxcross.git \
&& cd /opt/osxcross \
&& ./tools/gen_sdk_package_pbzx.sh /opt/xcode/${MACOS_SDK} \
&& cp /opt/osxcross/*.tar.xz tarballs \
&& UNATTENDED=1 ./build.sh
ENV PATH="/opt/osxcross/target/bin:${PATH}"
# Use osxcross toolchain for cargo
COPY workdir/cargo.config /root/.cargo/config
ENV HAS_OSXCROSS="true"

# Placeholder Stage if MacOSSDK is not provided
FROM base AS with-macos-sdk-false
RUN echo "Building without MacOS"


FROM with-macos-sdk-${HAS_MACOS_SDK} AS final

COPY build-comet-native-libs.sh /opt/comet-rm/build-comet-native-libs.sh
WORKDIR /opt/comet-rm

ENTRYPOINT [ "/opt/comet-rm/build-comet-native-libs.sh"]