ci(tests): add mini dataset to stateful test #6599

Closed
wants to merge 34 commits into from
Changes from 3 commits
34 commits
656b734
add stateful test of mini dataset
ZeaLoVe Jul 11, 2022
50acb29
Merge branch 'datafuselabs:main' into stateful-mini-dataset
Jul 13, 2022
070fb8f
fix config format
ZeaLoVe Jul 13, 2022
0bd8db3
pass secret through action inputs
ZeaLoVe Jul 13, 2022
fe373ed
add execute permission and fix yaml lint
ZeaLoVe Jul 13, 2022
c0de1b0
execute permission to scripts file
ZeaLoVe Jul 13, 2022
eb0e7f5
Fix Permission denied for shell
Xuanwo Jul 13, 2022
2d25087
set planner v2 every statements
ZeaLoVe Jul 14, 2022
479c0fe
Merge branch 'main' into stateful-mini-dataset
mergify[bot] Jul 14, 2022
bf9a646
Merge branch 'main' into stateful-mini-dataset
mergify[bot] Jul 15, 2022
143035b
Fix value not wrapped correctly
Xuanwo Jul 15, 2022
5e63ad0
Make sure repo related envs exported
Xuanwo Jul 15, 2022
a81baa7
debug
Xuanwo Jul 15, 2022
2451034
Try fix
Xuanwo Jul 15, 2022
d56690e
Fix env
Xuanwo Jul 15, 2022
f66966f
Try to use IAM
Xuanwo Jul 16, 2022
ce414fc
Another try
Xuanwo Jul 16, 2022
e2754e4
Don't need to load
Xuanwo Jul 16, 2022
ae6ae8c
Merge remote-tracking branch 'origin/main' into stateful-mini-dataset
Xuanwo Jul 18, 2022
fb5b8ff
Merge remote-tracking branch 'origin/main' into stateful-mini-dataset
Xuanwo Jul 28, 2022
d6cdb6c
Use https to load data
Xuanwo Jul 28, 2022
212ed25
Increate timeout for stateful tests
Xuanwo Jul 28, 2022
2c330bf
Merge branch 'main' into stateful-mini-dataset
mergify[bot] Jul 28, 2022
54f07f5
fix mini ontime result
ZeaLoVe Jul 29, 2022
a9d9fc2
fix ontime result error
ZeaLoVe Jul 29, 2022
f2997df
upload stdout when stateful test failed
ZeaLoVe Jul 29, 2022
fb45d8f
Merge branch 'main' into stateful-mini-dataset
mergify[bot] Jul 30, 2022
8f2e133
comment out the sql with error and fix result file
ZeaLoVe Aug 1, 2022
8ef5455
Merge branch 'stateful-mini-dataset' of https://github.com/ZeaLoVe/da…
ZeaLoVe Aug 1, 2022
b2451f7
suite name change to 04
ZeaLoVe Aug 1, 2022
e24f292
add chmod to sh files
ZeaLoVe Aug 1, 2022
cbd5461
Merge branch 'main' into stateful-mini-dataset
mergify[bot] Aug 1, 2022
ea94647
add order by to ensure consistent order of results
ZeaLoVe Aug 1, 2022
5c89082
Merge branch 'stateful-mini-dataset' of https://github.com/ZeaLoVe/da…
ZeaLoVe Aug 1, 2022
3 changes: 3 additions & 0 deletions .github/actions/test_stateful_standalone_linux/action.yml
@@ -39,6 +39,9 @@ runs:
aws --endpoint-url http://127.0.0.1:9900/ s3 cp tests/data/ontime_200.parquet s3://testbucket/admin/data/ontime_200_v1.parquet

- name: Run Stateful Tests with Standalone mode (ubuntu-latest only)
  env:
    REPO_AWS_ACCESS_KEY_ID: ${{ secrets.REPO_ACCESS_KEY_ID }}
    REPO_AWS_SECRET_ACCESS_KEY: ${{ secrets.REPO_SECRET_ACCESS_KEY }}
  shell: bash
  run: |
    bash ./scripts/ci/ci-run-stateful-tests-standalone-s3.sh
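
The step above only helps if both REPO_AWS_* variables actually reach the test script. If either arrives empty, the COPY INTO in the test would be sent with blank credentials. A minimal sketch of a fail-fast check that a script could run first; `require_env` is a hypothetical helper and is not part of this PR or of the existing CI scripts.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): return non-zero if any of the
# named environment variables is unset or empty.
require_env() {
  local name
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable
    # whose name is stored in $name, or an empty string if it is unset.
    if [ -z "${!name:-}" ]; then
      echo "missing required environment variable: $name" >&2
      return 1
    fi
  done
}
```

A test script could then start with `require_env REPO_AWS_ACCESS_KEY_ID REPO_AWS_SECRET_ACCESS_KEY || exit 1`, so a missing secret shows up as one clear message instead of a failed load.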
56 changes: 56 additions & 0 deletions tests/suites/1_stateful/04_mini_dataset/00_0000_mini_ontime.result
@@ -0,0 +1,56 @@
count()
99999
DayOfWeek c
1 16773
2 16651
7 15551
3 13671
5 13159
4 13153
6 11041
DayOfWeek c
1 4279
2 3637
7 3571
3 2788
5 2191
4 1880
6 1813
Origin c
ORD 2036
DTW 1817
DFW 1719
MSP 1506
ATL 877
LGA 816
CVG 603
BOS 556
MEM 509
RDU 371
Carrier c3
FL 221.4065829978688
OH 208.2369273484498
NW 207.8941134667772
MQ 205.4136942462831
HA 54.34782608695652
Carrier c3
HA 5713.526570048309
MQ 12316.754450296687
FL 17308.666824532324
OH 12599.259602036094
NW 11370.336373327018
Year avg(DepDelay)
2006 12.16979169791698
avg(c1)
99999.0
OriginCityName DestCityName c
San Diego, CA Los Angeles, CA 622
Los Angeles, CA San Diego, CA 619
Kahului, HI Honolulu, HI 605
Honolulu, HI Kahului, HI 592
Chicago, IL Minneapolis, MN 586
Minneapolis, MN Chicago, IL 565
New York, NY Boston, MA 523
Boston, MA New York, NY 522
New York, NY Raleigh/Durham, NC 488
Lihue, HI Honolulu, HI 486
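
A quick consistency check on the expected results above: the first DayOfWeek query filters only on Year, and the file reports a single year (2006), so its per-day counts should add up to the count() row. The sketch below sums them with awk; the rows are copied from the result file and the variable names are illustrative.

```shell
#!/usr/bin/env bash

# Rows copied from the first "DayOfWeek c" block of the result file,
# in the form "<DayOfWeek> <count>".
day_counts='1 16773
2 16651
7 15551
3 13671
5 13159
4 13153
6 11041'

# Sum the second column.
total=$(printf '%s\n' "$day_counts" | awk '{ sum += $2 } END { print sum }')
echo "total=$total"   # prints total=99999, matching the count() row
```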
26 changes: 26 additions & 0 deletions tests/suites/1_stateful/04_mini_dataset/00_0000_mini_ontime.sh
@@ -0,0 +1,26 @@
#!/usr/bin/env bash

CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. "$CURDIR"/../../../shell_env.sh

echo "drop table if exists ontime_mini;" | $MYSQL_CLIENT_CONNECT
## Create table
cat $CURDIR/../ddl/ontime.sql | sed 's/ontime/ontime_mini/g' | $MYSQL_CLIENT_CONNECT

## Load data
echo "COPY INTO ontime_mini FROM 's3://repo.databend.rs/dataset/stateful/ontime_2006_100000.csv' credentials=(aws_key_id='$REPO_AWS_ACCESS_KEY_ID' aws_secret_key='$REPO_AWS_SECRET_ACCESS_KEY') FILE_FORMAT = ( type = 'CSV' field_delimiter = ',' record_delimiter = '\n' skip_header = 1 );" | $MYSQL_CLIENT_CONNECT

## Run test
echo 'SELECT DayOfWeek, count(*) AS c FROM ontime_mini WHERE (Year >= 2000) AND (Year <= 2008) GROUP BY DayOfWeek ORDER BY c DESC;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT DayOfWeek, count(*) AS c FROM ontime_mini WHERE (DepDelay > 10) AND (Year >= 2000) AND (Year <= 2008) GROUP BY DayOfWeek ORDER BY c DESC;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT Origin, count(*) AS c FROM ontime_mini WHERE (DepDelay > 10) AND (Year >= 2000) AND (Year <= 2008) GROUP BY Origin ORDER BY c DESC LIMIT 10;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT IATA_CODE_Reporting_Airline AS Carrier, count() FROM ontime_mini WHERE (DepDelay > 10) AND (Year = 2007) GROUP BY Carrier ORDER BY count() DESC;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(CAST(DepDelay > 10, Int8)) * 1000 AS c3 FROM ontime_mini WHERE Year = 2007 GROUP BY Carrier ORDER BY c3 DESC;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(CAST(DepDelay > 10, Int8)) * 1000 AS c3 FROM ontime_mini WHERE (Year >= 2000) AND (Year <= 2008) GROUP BY Carrier ORDER BY c3 DESC;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime_mini WHERE (Year >= 2000) AND (Year <= 2008) GROUP BY Carrier;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT Year, avg(DepDelay) FROM ontime_mini GROUP BY Year;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT avg(c1) FROM ( SELECT Year, Month, count(*) AS c1 FROM ontime_mini GROUP BY Year, Month ) AS a;' |$MYSQL_CLIENT_CONNECT
echo 'SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime_mini GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;' |$MYSQL_CLIENT_CONNECT

## Clean table
echo "drop table ontime_mini all;" | $MYSQL_CLIENT_CONNECT
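
Two of the GROUP BY queries above have no ORDER BY, so their row order is not guaranteed and a byte-for-byte comparison with the .result file can fail intermittently. A later commit in this PR ("add order by to ensure consistent order of results") addresses this. A minimal sketch of the idea, assuming a hypothetical `run_sql` wrapper; the fallback to `cat` is only there so the sketch runs without a server and is not how shell_env.sh defines the client.

```shell
#!/usr/bin/env bash

# MYSQL_CLIENT_CONNECT is normally exported by shell_env.sh.
# Assumption for this sketch: fall back to `cat`, which just echoes the SQL.
MYSQL_CLIENT_CONNECT="${MYSQL_CLIENT_CONNECT:-cat}"

# Hypothetical wrapper: send one SQL statement to the client command.
run_sql() {
  # Left unquoted on purpose: the variable holds a command plus arguments.
  printf '%s\n' "$1" | $MYSQL_CLIENT_CONNECT
}

# An explicit ORDER BY makes the output order deterministic.
run_sql 'SELECT Year, avg(DepDelay) FROM ontime_mini GROUP BY Year ORDER BY Year;'
```

The wrapper also removes the repeated `echo ... | $MYSQL_CLIENT_CONNECT` pattern, which keeps each test line down to the statement itself.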