refactor(tpc): add tpc-ds tests #9467

Merged
merged 1 commit into from
Jul 1, 2024


cpcloud
Member

@cpcloud cpcloud commented Jun 28, 2024

First 27 TPC-DS queries running against DuckDB, Trino, Snowflake, and DataFusion, sans the ones that require ROLLUP.

More fail on DataFusion than the others. Those are marked with the appropriate xfail markers.

@cpcloud cpcloud added this to the 9.2 milestone Jun 28, 2024
@cpcloud cpcloud added the tests Issues or PRs related to tests label Jun 28, 2024
@cpcloud cpcloud requested review from jcrist and gforsyth June 28, 2024 14:10
@cpcloud cpcloud force-pushed the tpc-ds-tests branch 5 times, most recently from 22a07b6 to 53747a4 Compare June 29, 2024 12:42
@cpcloud cpcloud added datafusion The Apache DataFusion backend duckdb The DuckDB backend snowflake The Snowflake backend trino The Trino backend labels Jun 29, 2024
@cpcloud
Member Author

cpcloud commented Jul 1, 2024

If this PR is too large, I can split it into a separate PR for each backend and disable the runs until the refactoring is done.

@gforsyth
Member

gforsyth commented Jul 1, 2024

Nah, I'm most of the way through reviewing it

Member

@gforsyth gforsyth left a comment

I think the changeset here is good. Do we want to keep the tpch and tpc-ds snapshots around? Since we're comparing results against hand-written SQL, I don't know that we're getting much out of them.

@cpcloud
Member Author

cpcloud commented Jul 1, 2024

They're gone.

@cpcloud
Member Author

cpcloud commented Jul 1, 2024

The empty result sets are questionable, but I couldn't find a scale factor (up to sf=10) where all queries were non-empty. I believe that q17 is supposed to be empty, based on the fact that it's always empty for the larger scale factors.

So, basically what I did was selectively allow empty queries in the tpc_test marker.
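As a rough illustration of the idea (not the code in this PR), a decorator that allows empty results only for selected queries might look like the sketch below. The name `tpc_test_sketch`, the `result_is_empty` flag, and the toy queries are all hypothetical; the real `tpc_test` marker may be shaped quite differently.

```python
import functools


def tpc_test_sketch(*, result_is_empty=False):
    """Hypothetical stand-in for a TPC test marker (an assumption, not ibis code)."""

    def decorator(query_fn):
        @functools.wraps(query_fn)
        def wrapper(expected_rows):
            rows = list(query_fn())
            if result_is_empty:
                # Queries explicitly allowed to be empty must actually be empty.
                assert not rows, "expected an empty result set"
            else:
                # Every other query must return at least one row.
                assert rows, "unexpectedly empty result set"
            # Compare against the rows from the reference (hand-written) query.
            assert rows == list(expected_rows)
            return rows

        return wrapper

    return decorator


@tpc_test_sketch(result_is_empty=True)
def q17():
    # Toy query that returns nothing, mirroring the q17 case discussed above.
    return []


@tpc_test_sketch()
def q03():
    # Toy query with one made-up row.
    return [("brand_a", 10)]
```

With this shape, a query that silently starts returning no rows fails loudly unless it has been opted in, which keeps the "empty is fine" decision visible at each test definition.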

@gforsyth
Member

gforsyth commented Jul 1, 2024

I believe that q17 is supposed to be empty, based on the fact that it's always empty for the larger scale factors.

lol, I have an optimization for O(1) compute for query 17 at any scale factor

@gforsyth
Member

gforsyth commented Jul 1, 2024

snowflake is passing:

🐚 pytest -m snowflake ibis/backends/tests/tpc/ds/test_queries.py -v
================================ test session starts ================================
platform linux -- Python 3.10.14, pytest-8.2.2, pluggy-1.5.0 -- /nix/store/1lj814h1vbfi6p7m6vr2rvcvkdqxc426-python3-3.10.14-env/bin/python3.10
cachedir: .pytest_cache
hypothesis profile 'dev' -> deadline=None, max_examples=50, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase(PosixPath('/home/gil/github.com/ibis-project/ibis/.hypothesis/examples'))
Using --randomly-seed=3164630463
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/gil/github.com/ibis-project/ibis
configfile: pyproject.toml
plugins: hypothesis-6.104.2, snapshot-0.9.0, anyio-4.4.0, randomly-3.15.0, mock-3.14.0, benchmark-4.0.0, timeout-2.3.1, cov-5.0.0, repeat-0.9.3, clarity-1.0.1, pytest_httpserver-1.0.10, xdist-3.6.1
collected 540 items / 513 deselected / 27 selected                                  

ibis/backends/tests/tpc/ds/test_queries.py::test_15[snowflake] PASSED    [  3%]
ibis/backends/tests/tpc/ds/test_queries.py::test_04[snowflake] PASSED    [  7%]
ibis/backends/tests/tpc/ds/test_queries.py::test_07[snowflake] PASSED    [ 11%]
ibis/backends/tests/tpc/ds/test_queries.py::test_23[snowflake] XFAIL     [ 14%]
ibis/backends/tests/tpc/ds/test_queries.py::test_03[snowflake] PASSED    [ 18%]
ibis/backends/tests/tpc/ds/test_queries.py::test_20[snowflake] PASSED    [ 22%]
ibis/backends/tests/tpc/ds/test_queries.py::test_24[snowflake] PASSED    [ 25%]
ibis/backends/tests/tpc/ds/test_queries.py::test_18[snowflake] XFAIL     [ 29%]
ibis/backends/tests/tpc/ds/test_queries.py::test_10[snowflake] PASSED    [ 33%]
ibis/backends/tests/tpc/ds/test_queries.py::test_16[snowflake] PASSED    [ 37%]
ibis/backends/tests/tpc/ds/test_queries.py::test_26[snowflake] PASSED    [ 40%]
ibis/backends/tests/tpc/ds/test_queries.py::test_05[snowflake] XFAIL     [ 44%]
ibis/backends/tests/tpc/ds/test_queries.py::test_17[snowflake] PASSED    [ 48%]
ibis/backends/tests/tpc/ds/test_queries.py::test_02[snowflake] PASSED    [ 51%]
ibis/backends/tests/tpc/ds/test_queries.py::test_08[snowflake] PASSED    [ 55%]
ibis/backends/tests/tpc/ds/test_queries.py::test_12[snowflake] PASSED    [ 59%]
ibis/backends/tests/tpc/ds/test_queries.py::test_01[snowflake] PASSED    [ 62%]
ibis/backends/tests/tpc/ds/test_queries.py::test_25[snowflake] PASSED    [ 66%]
ibis/backends/tests/tpc/ds/test_queries.py::test_27[snowflake] PASSED    [ 70%]
ibis/backends/tests/tpc/ds/test_queries.py::test_21[snowflake] PASSED    [ 74%]
ibis/backends/tests/tpc/ds/test_queries.py::test_13[snowflake] PASSED    [ 77%]
ibis/backends/tests/tpc/ds/test_queries.py::test_06[snowflake] PASSED    [ 81%]
ibis/backends/tests/tpc/ds/test_queries.py::test_22[snowflake] XFAIL     [ 85%]
ibis/backends/tests/tpc/ds/test_queries.py::test_09[snowflake] PASSED    [ 88%]
ibis/backends/tests/tpc/ds/test_queries.py::test_19[snowflake] PASSED    [ 92%]
ibis/backends/tests/tpc/ds/test_queries.py::test_11[snowflake] PASSED    [ 96%]
ibis/backends/tests/tpc/ds/test_queries.py::test_14[snowflake] XFAIL     [100%]

============= 22 passed, 513 deselected, 5 xfailed in 124.88s (0:02:04) =============

Member

@gforsyth gforsyth left a comment

Boy do I not understand what ascending is about, but it's clearly intentional.

This looks good, and makes it easy for other contributors to tackle individual tpc-ds queries.

:shipit:

@gforsyth gforsyth merged commit d2dff68 into ibis-project:main Jul 1, 2024
80 checks passed
@cpcloud cpcloud deleted the tpc-ds-tests branch July 1, 2024 23:26
Labels
datafusion The Apache DataFusion backend duckdb The DuckDB backend snowflake The Snowflake backend tests Issues or PRs related to tests trino The Trino backend

2 participants