[NSE-273] Support spark311 for branch 1.1.1 (#319)
* [NSE-262] fix remainder loss in decimal divide (#263)

* fix decimal divide int issue

* correct cpp uts

* use const reference

Co-authored-by: Yuan <yuan.zhou@outlook.com>
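
For illustration, a minimal C++ sketch of the remainder-loss pattern fixed here, assuming decimals stored as scaled 64-bit integers (not the engine's actual divide kernel):

    #include <cstdint>
    #include <iostream>

    int main() {
      // Decimals as scaled integers: 7.0 and 2.0 at scale 1 are 70 and 20.
      int64_t a = 70, b = 20;
      // Naive integer division truncates: 70 / 20 = 3, so the true
      // quotient 3.5 loses its remainder.
      int64_t lossy = a / b;
      // Up-scaling the dividend first preserves it:
      // (70 * 10) / 20 = 35, i.e. 3.5 at scale 1.
      int64_t exact = (a * 10) / b;
      std::cout << lossy << " vs " << exact << std::endl;
      return 0;
    }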

* [NSE-261] ArrowDataSource: Add S3 Support (#270)

Closes #261

* [NSE-196] clean up configs in unit tests (#271)

* remove testing config

* remove unused configs

* [NSE-265] Reserve enough memory before UnsafeAppend in builder (#266)

* change the UnsafeAppend to Append

* fix buffer builder in shuffle

The shuffle builder uses the UnsafeAppend API for better performance. It
tries to reserve enough space based on the results of the last record batch,
which may be buggy when a dense record batch follows a sparse one.

This patch adds the fixes below:
- add Reset() after Finish() in the builder
- reserve length for offset_builder in the binary builder

A further cleanup of the reservation logic is still needed.

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
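
A sketch of the intended pattern against stock Arrow C++ builder APIs (the engine's own shuffle builder differs in detail, so treat this as an illustrative outline):

    #include <arrow/api.h>

    #include <memory>
    #include <vector>

    arrow::Status AppendBatch(arrow::Int64Builder* builder,
                              const std::vector<int64_t>& values,
                              std::shared_ptr<arrow::Array>* out) {
      // UnsafeAppend skips capacity checks, so capacity must be reserved
      // from the *current* batch; sizing it from the previous batch breaks
      // when a dense record batch follows a sparse one.
      ARROW_RETURN_NOT_OK(builder->Reserve(static_cast<int64_t>(values.size())));
      for (int64_t v : values) {
        builder->UnsafeAppend(v);  // valid only inside the reserved region
      }
      ARROW_RETURN_NOT_OK(builder->Finish(out));
      builder->Reset();  // defensive reset, mirroring the "Reset() after Finish()" fix
      return arrow::Status::OK();
    }

Note that for variable-width builders such as arrow::BinaryBuilder, Reserve() only sizes the offsets; the value bytes need a separate ReserveData() call, which is the offsets-versus-data distinction behind the second fix above.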

* [NSE-274] Comment to trigger tpc-h RAM test (#275)

Closes #274

* bump cmake to 3.16 (#281)

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* [NSE-276] Add option to switch Hadoop version (#277)

Closes #276

* [NSE-119] clean up on comments (#288)

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* [NSE-206] Update installation guide and configuration guide. (#289)

* [NSE-206] Update installation guide and configuration guide.

* Fix numaBinding setting issue and update the description for protobuf

* [NSE-206] Fix Prerequisite and Arrow Installation Steps. (#290)

* [NSE-245] Adding columnar RDD cache support (#246)

* Adding columnar RDD cache support

Signed-off-by: Chendi Xue <chendi.xue@intel.com>

* Directly save reference, only convert to Array[Byte] when calling by BlockManager

Signed-off-by: Chendi Xue <chendi.xue@intel.com>

* Add DeAllocator at construction to make sure this instance is released once it is deleted by the JVM

Signed-off-by: Chendi Xue <chendi.xue@intel.com>

* Delete cache by adding a release call in InMemoryRelation

Since unpersist only deletes the RDD object, our deAllocator apparently was
not being called. We now added a release function to InMemoryRelation's
clearCache(); we may need to find a new approach for 3.1.0.

Signed-off-by: Chendi Xue <chendi.xue@intel.com>

* [NSE-207] fix issues found from aggregate unit tests (#233)

* fix incorrect input in Expand

* fix empty input for aggregate

* fix only result expressions

* fix empty aggregate expressions

* fix res attr not found issue

* refine

* fix count distinct with null

* fix groupby of NaN, -0.0 and 0.0

* fix count on multiple cols with null in WSCG

* format code

* support normalize NaN and 0.0

* revert and update

* support normalize function in WSCG
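
Several of these group-by fixes come down to key normalization: 0.0 and -0.0 compare equal but differ bitwise, and NaN has many bit patterns, so hashing raw bits would split one logical group. A minimal C++ sketch of the normalization idea (illustrative only, not the engine's codegen):

    #include <cmath>
    #include <limits>

    double NormalizeKey(double v) {
      // Collapse every NaN payload onto one canonical NaN key.
      if (std::isnan(v)) return std::numeric_limits<double>::quiet_NaN();
      // -0.0 == 0.0 compares equal but has a different bit pattern;
      // force positive zero so hash-based grouping agrees with equality.
      if (v == 0.0) return 0.0;
      return v;
    }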

* [NSE-206] Update documents and License for 1.1.0 (#292)

* [NSE-206] Update documents and remove duplicate parts

* Modify documents by comments

* [NSE-293] fix unsafemap with key = '0' (#294)

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* [NSE-257] fix multiple slf4j bindings (#291)

* [NSE-297] Disable incremental compiler in GHA CI (#298)

Closes #297

* [NSE-285] ColumnarWindow: Support Date input in MAX/MIN (#286)

Closes #285
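
For context: Arrow's Date32 type stores whole days since the Unix epoch in a 32-bit integer, so once the window kernel accepts the type, MAX/MIN over dates reduces to integer comparison. A hedged sketch:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Date32 values are days since 1970-01-01; caller guarantees non-empty input.
    int32_t MaxDate32(const std::vector<int32_t>& days) {
      return *std::max_element(days.begin(), days.end());
    }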

* [NSE-304] Upgrade to Arrow 4.0.0: Change basic GHA TPC-H test target OAP Arrow branch (#306)

* [NSE-302] remove exception (#303)

* [NSE-273] support spark311 (#272)

* support spark 3.0.2

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* update to use spark 302 in unit tests

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* support spark 311

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix missing dep

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix broadcastexchange metrics

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix arrow data source

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix sum with decimal

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix c++ code

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* adding partial sum decimal sum

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix hashagg in wscg

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix partial sum with number type

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix AQE shuffle copy

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix shuffle redundant read

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix rebase

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* fix format

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* avoid unnecessary fallbacks

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* on-demand scala unit tests

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* clean up

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

* [NSE-311] Build reports errors (#312)

Closes #311

* [NSE-257] fix the dependency issue on v2

Co-authored-by: Rui Mo <rui.mo@intel.com>
Co-authored-by: Hongze Zhang <hongze.zhang@intel.com>
Co-authored-by: JiaKe <ke.a.jia@intel.com>
Co-authored-by: Wei-Ting Chen <weiting.chen@intel.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Hong <hong2.wang@intel.com>
7 people authored May 13, 2021
1 parent d3b6271 commit 2b69e5d
Showing 429 changed files with 16,338 additions and 3,837 deletions.
65 changes: 0 additions & 65 deletions .github/workflows/report_ram_log.yml

This file was deleted.

39 changes: 23 additions & 16 deletions .github/workflows/tpch.yml
@@ -18,14 +18,23 @@
 name: Native SQL Engine TPC-H Suite
 
 on:
-  pull_request
+  issue_comment:
+    types: [created, edited]
 
 jobs:
   ram-usage-test:
-    if: ${{ contains(github.event.pull_request.labels.*.name, 'RAM Report') }}
+    if: ${{ github.event.issue.pull_request && startsWith(github.event.comment.body, '@github-actions ram-usage-test') }}
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v2
+    - name: Checkout Pull Request
+      env:
+        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+      run: |
+        PR_URL="${{ github.event.issue.pull_request.url }}"
+        PR_NUM=${PR_URL##*/}
+        echo "Checking out from PR #$PR_NUM based on URL: $PR_URL"
+        hub pr checkout $PR_NUM
     - name: Set up JDK 1.8
       uses: actions/setup-java@v1
       with:
@@ -42,15 +51,15 @@ jobs:
       run: |
         cd /tmp
         git clone https://github.com/oap-project/arrow.git
-        cd arrow && git checkout arrow-3.0.0-oap && cd cpp
+        cd arrow && git checkout arrow-4.0.0-oap && cd cpp
         mkdir build && cd build
         cmake .. -DARROW_JNI=ON -DARROW_GANDIVA_JAVA=ON -DARROW_GANDIVA=ON -DARROW_PARQUET=ON -DARROW_CSV=ON -DARROW_HDFS=ON -DARROW_FILESYSTEM=ON -DARROW_WITH_SNAPPY=ON -DARROW_JSON=ON -DARROW_DATASET=ON -DARROW_WITH_LZ4=ON -DARROW_JEMALLOC=OFF && make -j2
         sudo make install
         cd ../../java
         mvn clean install -B -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -P arrow-jni -am -Darrow.cpp.build.dir=/tmp/arrow/cpp/build/release/ -DskipTests -Dcheckstyle.skip
     - name: Run Maven tests - BHJ
       run: |
-        mvn test -B -pl native-sql-engine/core/ -am -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -DmembersOnlySuites=com.intel.oap.tpc.h -DtagsToInclude=com.intel.oap.tags.BroadcastHashJoinMode -DargLine="-Xmx1G -XX:MaxDirectMemorySize=500M -Dio.netty.allocator.numDirectArena=1"
+        mvn test -B -P full-scala-compiler -Dbuild_arrow=OFF -pl native-sql-engine/core/ -am -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -DmembersOnlySuites=com.intel.oap.tpc.h -DtagsToInclude=com.intel.oap.tags.BroadcastHashJoinMode -DargLine="-Xmx1G -XX:MaxDirectMemorySize=500M -Dio.netty.allocator.numDirectArena=1"
       env:
         MALLOC_ARENA_MAX: "4"
         MAVEN_OPTS: "-Xmx1G"
@@ -59,7 +68,7 @@ jobs:
         ENABLE_TPCH_TESTS: "true"
     - name: Run Maven tests - SMJ
       run: |
-        mvn test -B -pl native-sql-engine/core/ -am -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -DmembersOnlySuites=com.intel.oap.tpc.h -DtagsToInclude=com.intel.oap.tags.SortMergeJoinMode -DargLine="-Xmx1G -XX:MaxDirectMemorySize=500M -Dio.netty.allocator.numDirectArena=1"
+        mvn test -B -P full-scala-compiler -Dbuild_arrow=OFF -pl native-sql-engine/core/ -am -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -DmembersOnlySuites=com.intel.oap.tpc.h -DtagsToInclude=com.intel.oap.tags.SortMergeJoinMode -DargLine="-Xmx1G -XX:MaxDirectMemorySize=500M -Dio.netty.allocator.numDirectArena=1"
       env:
         MALLOC_ARENA_MAX: "4"
         MAVEN_OPTS: "-Xmx1G"
@@ -69,14 +78,12 @@ jobs:
     - run: |
         cml-publish /tmp/comment_image_1.png --md > /tmp/comment.md
         cml-publish /tmp/comment_image_2.png --md >> /tmp/comment.md
-    - run: echo "::set-output name=event_path::${GITHUB_EVENT_PATH}"
-      id: output-envs
-    - uses: actions/upload-artifact@v2
-      with:
-        name: comment_content
-        path: /tmp/comment.md
-    - uses: actions/upload-artifact@v2
-      with:
-        name: pr_event
-        path: ${{steps.output-envs.outputs.event_path}}
-
+    - name: Run Maven tests - Report
+      run: |
+        mvn test -B -P full-scala-compiler -Dbuild_arrow=OFF -Dbuild_protobuf=OFF -pl native-sql-engine/core/ -am -DmembersOnlySuites=com.intel.oap.tpc.h -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -DtagsToInclude=com.intel.oap.tags.CommentOnContextPR -Dexec.skip=true
+      env:
+        PR_URL: ${{ github.event.issue.pull_request.url }}
+        MAVEN_OPTS: "-Xmx1G"
+        COMMENT_CONTENT_PATH: "/tmp/comment.md"
+        GITHUB_TOKEN: ${{ github.token }}
+        ENABLE_TPCH_TESTS: "true"
9 changes: 5 additions & 4 deletions .github/workflows/unittests.yml
@@ -60,6 +60,7 @@ jobs:
         ctest -R
 
   scala-unit-test:
+    if: ${{ github.event.issue.pull_request && startsWith(github.event.comment.body, '@github-actions scala-unit-test') }}
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v2
@@ -82,8 +83,8 @@ jobs:
     - name: Install Spark
       run: |
         cd /tmp
-        wget http://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-hadoop2.7.tgz
-        tar -xf spark-3.0.0-bin-hadoop2.7.tgz
+        wget http://archive.apache.org/dist/spark/spark-3.0.2/spark-3.0.2-bin-hadoop2.7.tgz
+        tar -xf spark-3.0.2-bin-hadoop2.7.tgz
     - name: Install OAP optimized Arrow (C++ libs)
       run: |
         cd /tmp
@@ -100,9 +101,9 @@ jobs:
         cd arrow-data-source
         mvn clean install -DskipTests -Dbuild_arrow=OFF
         cd ..
-        mvn clean package -am -pl native-sql-engine/core -DskipTests -Dbuild_arrow=OFF
+        mvn clean package -P full-scala-compiler -am -pl native-sql-engine/core -DskipTests -Dbuild_arrow=OFF
         cd native-sql-engine/core/
-        mvn test -DmembersOnlySuites=org.apache.spark.sql.travis -am -DfailIfNoTests=false -Dexec.skip=true -DargLine="-Dspark.test.home=/tmp/spark-3.0.0-bin-hadoop2.7" &> log-file.log
+        mvn test -P full-scala-compiler -DmembersOnlySuites=org.apache.spark.sql.travis -am -DfailIfNoTests=false -Dexec.skip=true -DargLine="-Dspark.test.home=/tmp/spark-3.0.0-bin-hadoop2.7" &> log-file.log
         echo '#!/bin/bash' > grep.sh
         echo "module_tested=0; module_should_test=1; tests_total=0; while read -r line; do num=\$(echo \"\$line\" | grep -o -E '[0-9]+'); tests_total=\$((tests_total+num)); done <<<\"\$(grep \"Total number of tests run:\" log-file.log)\"; succeed_total=0; while read -r line; do [[ \$line =~ [^0-9]*([0-9]+)\, ]]; num=\${BASH_REMATCH[1]}; succeed_total=\$((succeed_total+num)); let module_tested++; done <<<\"\$(grep \"succeeded\" log-file.log)\"; if test \$tests_total -eq \$succeed_total -a \$module_tested -eq \$module_should_test; then echo \"All unit tests succeed\"; else echo \"Unit tests failed\"; exit 1; fi" >> grep.sh
         bash grep.sh