Add hive 4.0 image #218

Merged
4 changes: 4 additions & 0 deletions .github/workflows/ci.yml
@@ -35,6 +35,10 @@ jobs:
- image: hive3.1-hive
platforms: linux/amd64,linux/arm64
test: hive3.1-hive
- image: hive4.0-hive
# Haven't added `linux/arm64` platform as test image fails with `The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64/v3) and no specific platform was requested`
platforms: linux/amd64
Contributor Author

Haven't added the `linux/arm64` platform as the test image had been complaining about a different arch:

The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64/v3) and no specific platform was requested

https://github.com/trinodb/docker-images/actions/runs/12242033317/job/34148454814?pr=218#step:5:837

Member


add a code comment I guess

Member
@hashhar hashhar Dec 10, 2024


On CI this is expected; what happens when you build the image locally? If it builds, then you should allow all platforms here: CI will use emulation to build the arm version of the image, but then devs who have arm machines will get arm images instead of having to emulate amd64 images.

On CI the node running the build is not an arm runner, hence the warning.
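To try that suggestion locally, a minimal sketch: pick the platform matching the host and feed it to a buildx build. The image name and context path come from this PR; the helper itself is hypothetical, and it prints the command instead of running it so the arch-to-platform mapping is easy to inspect.

```shell
# Hypothetical helper for building the image for the host's native platform,
# as suggested above; prints the buildx command rather than executing it.
ARCH=$(uname -m)
case "$ARCH" in
    x86_64)        PLATFORM=linux/amd64 ;;
    aarch64|arm64) PLATFORM=linux/arm64 ;;
    *) echo "unsupported arch: $ARCH" >&2; exit 1 ;;
esac
echo "docker buildx build --platform $PLATFORM -t testing/hive4.0-hive:latest testing/hive4.0-hive"
```

If the local build succeeds on an arm machine, that supports re-enabling `linux/arm64` in the CI matrix.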

Contributor Author


Ya, I realized that about the runner spec. However, it's not just a warning: the image test is failing on arm. Please advise if there is a way to run only the amd64 test and not arm.

Member


Feel free to merge without the arm image.

The fix isn't to skip tests on arm but rather to find what is failing. The logs here are almost useless; I'll take a look at this and submit a follow-up PR.

test: hive4.0-hive
- image: hdp3.1-hive-kerberized
test: hdp3.1-hive
# TODO add test https://github.com/trinodb/trino/issues/14543
1 change: 1 addition & 0 deletions .github/workflows/release.yml
@@ -91,6 +91,7 @@ jobs:
testing/spark3-hudi
testing/polaris-catalog
testing/unity-catalog
testing/hive4.0-hive
)
referenced_images=("${skipped_images[@]}" "${single_arch[@]}" "${multi_arch[@]}")
make meta
19 changes: 19 additions & 0 deletions bin/test.sh
@@ -57,6 +57,15 @@ function run_hive_transactional_tests() {
true
}

function check_hive4() {
environment_compose exec hiveserver2 beeline -u jdbc:hive2://localhost:10000 -e 'SELECT 1;' >/dev/null 2>&1
}

function run_hive4_tests() {
environment_compose exec hiveserver2 beeline -u jdbc:hive2://localhost:10000 -e 'SHOW DATABASES;' &&
true
}

function check_spark() {
environment_compose exec spark curl --http0.9 -f http://localhost:10213 -o /dev/null
}
@@ -186,6 +195,16 @@ for ARCH in "${platforms[@]}"; do
test true
elif [[ ${ENVIRONMENT} == "kerberos" ]]; then
run_kerberos_tests
elif [[ ${ENVIRONMENT} == *"hive4"* ]]; then
# wait until hiveserver is started
retry check_hive4

# run tests
set -x
set +e
sleep 10
run_hive4_tests
# TODO add transactional hive tests
elif [[ ${ENVIRONMENT} == *"hive"* ]]; then
# wait until hadoop processes are started
retry check_hadoop
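The `retry` helper used by the new `hive4` branch is defined elsewhere in bin/test.sh; a minimal sketch of the kind of polling loop it implies follows. The attempt count and delay are assumptions, not the repo's actual values.

```shell
# Sketch of a retry/polling helper like the one used above; the real
# definition lives elsewhere in bin/test.sh, so the attempt count and
# delay here are assumptions.
function retry() {
    local attempts=30 delay=5 i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

retry true && echo "hiveserver2 is up"
```

With such a helper, `retry check_hive4` keeps re-running the beeline probe until HiveServer2 accepts connections or the attempts are exhausted.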
7 changes: 7 additions & 0 deletions etc/compose/hive4.0-hive/docker-compose.yml
@@ -0,0 +1,7 @@
version: '2.0'
services:
hiveserver2:
hostname: hiveserver2
image: testing/hive4.0-hive:latest$ARCH
environment:
- SERVICE_NAME=hiveserver2
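The `$ARCH` suffix in the image tag above is substituted by the test harness, which loops over platforms in bin/test.sh. A small sketch of how the tag resolves; the suffix values are assumptions about what the harness exports (empty for a multi-arch manifest).

```shell
# Hypothetical illustration of how `testing/hive4.0-hive:latest$ARCH`
# resolves; the suffix values are assumptions about what bin/test.sh exports.
resolve_tag() {
    local arch_suffix="$1"
    echo "testing/hive4.0-hive:latest${arch_suffix}"
}
resolve_tag "-amd64"   # testing/hive4.0-hive:latest-amd64
resolve_tag ""         # testing/hive4.0-hive:latest
```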
26 changes: 26 additions & 0 deletions testing/hive4.0-hive/Dockerfile
@@ -0,0 +1,26 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM apache/hive:4.0.0
Member


Does the base image include some arch-specific code? If not, we can make this multi-arch by inlining the base image.

I can take a look at this as follow-up.

Member


Why not use 4.0.1?

Contributor Author


Nothing speaks against it, but I am not sure which major/minor releases we test.
If we want to use the latest one, I can change it to 4.0.1.


# TODO replace with aws sdk v2 by following https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/aws_sdk_upgrade.html
ARG AWS_JAVA_SDK_BUNDLE_VERSION=1.12.367
ARG HADOOP_AWS_VERSION=3.3.6

USER root
RUN apt-get -y update
RUN apt install curl -y

# Install the AWS SDK so we can access S3; the bundle version must match the one the hadoop-aws jar was built against
RUN mkdir -p /opt/hive/auxlib && \
curl -fLsS -o /opt/hive/auxlib/aws-java-sdk-bundle-$AWS_JAVA_SDK_BUNDLE_VERSION.jar https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/$AWS_JAVA_SDK_BUNDLE_VERSION/aws-java-sdk-bundle-$AWS_JAVA_SDK_BUNDLE_VERSION.jar && \
curl -fLsS -o /opt/hive/auxlib/hadoop-aws-$HADOOP_AWS_VERSION.jar https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/$HADOOP_AWS_VERSION/hadoop-aws-$HADOOP_AWS_VERSION.jar
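The two `curl` calls above compose their download URLs from the build args. Spelled out with the versions from this diff, purely as an illustration of the expansion:

```shell
# Expands the Maven Central URLs the Dockerfile's curl commands fetch,
# using the ARG values from the diff above.
AWS_JAVA_SDK_BUNDLE_VERSION=1.12.367
HADOOP_AWS_VERSION=3.3.6
echo "https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/$AWS_JAVA_SDK_BUNDLE_VERSION/aws-java-sdk-bundle-$AWS_JAVA_SDK_BUNDLE_VERSION.jar"
echo "https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/$HADOOP_AWS_VERSION/hadoop-aws-$HADOOP_AWS_VERSION.jar"
```

Bumping either ARG changes both the jar filename and its path segment, so the pair stays consistent.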