Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: unexpected error with EXPLAIN (VEC) on TPC-DS query 45 #47045

Closed
yuzefovich opened this issue Apr 4, 2020 · 2 comments
Closed

sql: unexpected error with EXPLAIN (VEC) on TPC-DS query 45 #47045

yuzefovich opened this issue Apr 4, 2020 · 2 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Apr 4, 2020

If we set vectorize=on and run EXPLAIN (VEC) on query 45 of TPC-DS benchmark we get an unexpected error:

ERROR: unable to vectorize execution plan: at or near ")": syntax error
SQLSTATE: 42601
DETAIL: source SQL:
SET ROW ((substr(@3, 1:::INT8, 5:::INT8) IN ('80348':::STRING, '81792':::STRING, '83405':::STRING, '85392':::STRING, '85460':::STRING, '85669':::STRING, '86197':::STRING, '86475':::STRING, '88274':::STRING)) OR (@5 = ANY CAST(NULL AS )))
                                                                                                                                                                                                                                          ^

and I'm quite puzzled why this would be occurring.

Jira issue: CRDB-5047

@yuzefovich yuzefovich added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Apr 4, 2020
@yuzefovich
Copy link
Member Author

The problem here seems related to #40677 - we have a subquery which we need to see the concrete values from. What's different in this case is that EXPLAIN (DISTSQL) and EXPLAIN (OPT) run ok.

root@:26257/tpcds> explain (distsql) SELECT ca_zip, ca_city, sum(ws_sales_price) FROM web_sales, customer, customer_address, date_dim, item WHERE ws_bill_customer_sk = c_customer_sk
AND c_current_addr_sk = ca_address_sk AND ws_item_sk = i_item_sk AND ( substr(ca_zip, 1, 5) IN ( '85669', '86197', '88274', '83405', '86475', '85392', '85460', '80348', '81792' ) OR
i_item_id IN ( SELECT i_item_id FROM item WHERE i_item_sk IN ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ) ) ) AND ws_sold_date_sk = d_date_sk AND d_qoy = 1 AND d_year = 2000 GROUP BY ca_zi
p, ca_city ORDER BY ca_zip, ca_city LIMIT 100;
  automatic |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    url
------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    true    | https://cockroachdb.github.io/distsqlplan/decode.html#eJzEVl1vo0YUfe-vuJqXgHaqMHyDFIm08e66cnBqO2qjykLETL2o2HhnxkrTKP-9msFg4xhWzkv84svhHu6ZO-cOvCD-vUAhGvx5N7oexqDdDKez6e8jHaaD0eDXGSzS5L98g-X_IhfPGPh2pT3xhKcF5cmG5Quqw-fJ-Bae6GOFYlhsuShXlO2jJM0yRjnHkKWCJlm-wpALuoI_vg4mA9DU74knj3lRJA2J_wNXsDi81uE6vgFNYozRtVDP3eWldRGZt0t84oksU2XkdVzf1fj2kQum1YskGBwdZBsufMd1gwsMF75LAk8FvunZKrBsw6lu2V4VOFZgVoHtGiowLNtXAfEC80LXYTwBbScgz1SNXYf3oGrjQVcavSrdxGBhcDB4GAjBQCwMRMYBBtPCYAa6_DXr5mWRJarbavFNXKdkyffyGa6A7IFnmjK4AtMwDB2-TMb3d_DLw7EFYDy5GUxO3RgNb4czIIaBMFqXGY3TFeUo_AsRNMdow8oF5bxkEnpRCcPsXxQaGOXrzVZIeI7RomQUhS9I5KKgKESz9LGgE5pmlF3KB2dUpHmhHrth-Splz9GxyRBG00265iH8jDAab0UIEcGRhyNioPkrRuVW7AtykS4pCskrfp8oclqU3LoOIWanCLNTxL72dl2yjDKaterOJfNHKSdW8jXl334r8zVll2Z7IeMY6vmIpNXCMBzGMx-Ds4vqSVFWD8NwOpsM4y-N51uIGpkWokamjcjZaSNyBluIGsY2Imewhag5bZDd6EUOXMF1_HD-2DXrtt50AIPXRGTfILLPJAcJQROa-wSzRnVdb3kERxaObBw5nWax3utYq3-MOlzbLcR-rxD7tJD6FdEW8jkvBGUhaBEh8thqXKgOr8jbHVw1fDz5pFO-85FD5xwN3Rn7736kbretu6B_Cy0i-hXLl9-EFpkdZsaRq7bDx1HQexx77_WUd9pTzddJh7srcWb30ex_ZLf9U912lWq_6bmlGk0OOm9Xre9cU3BOk6-XS0aXqSjZZdCWUxXB6Dp-SOLxLInvR6OdBdqQJaHp_a10SveL2DhH1bRkQr6EjWNJn3BEPjWtkI2pRI7yVS7UR0qngLM-BSaUb8o1p8dbfvLRhtxnmi1p5RtebtmC3rFyocpUl2PFU0BGuajumtXFcK1uKYGHZHIG2Twmm71kt7-y1Ut2-sn2GeQ3sp0zZL8hu71kv1-2dwb5TWW_lxy0yMYxOeglE6OfTfo9RsgRff760_8BAAD__5ZFW10=
(1 row)
root@:26257/tpcds> explain (opt) SELECT ca_zip, ca_city, sum(ws_sales_price) FROM web_sales, customer, customer_address, date_dim, item WHERE ws_bill_customer_sk = c_customer_sk AND
c_current_addr_sk = ca_address_sk AND ws_item_sk = i_item_sk AND ( substr(ca_zip, 1, 5) IN ( '85669', '86197', '88274', '83405', '86475', '85392', '85460', '80348', '81792' ) OR i_item_id IN ( SELECT i_item_id FROM item WHERE i_item_sk IN ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ) ) ) AND ws_sold_date_sk = d_date_sk AND d_qoy = 1 AND d_year = 2000 GROUP BY ca_zip, ca_city ORDER BY ca_zip, ca_city LIMIT 100;
                                                                        text
-----------------------------------------------------------------------------------------------------------------------------------------------------
  limit
   ├── sort
   │    └── group-by
   │         ├── inner-join (hash)
   │         │    ├── inner-join (hash)
   │         │    │    ├── inner-join (cross)
   │         │    │    │    ├── scan customer_address
   │         │    │    │    ├── scan item
   │         │    │    │    └── filters
   │         │    │    │         └── or
   │         │    │    │              ├── substr(ca_zip, 1, 5) IN ('80348', '81792', '83405', '85392', '85460', '85669', '86197', '86475', '88274')
   │         │    │    │              └── any: eq
   │         │    │    │                   ├── project
   │         │    │    │                   │    └── scan item
   │         │    │    │                   │         └── constraint: /116
   │         │    │    │                   │              ├── [/2 - /3]
   │         │    │    │                   │              ├── [/5 - /5]
   │         │    │    │                   │              ├── [/7 - /7]
   │         │    │    │                   │              ├── [/11 - /11]
   │         │    │    │                   │              ├── [/13 - /13]
   │         │    │    │                   │              ├── [/17 - /17]
   │         │    │    │                   │              ├── [/19 - /19]
   │         │    │    │                   │              ├── [/23 - /23]
   │         │    │    │                   │              └── [/29 - /29]
   │         │    │    │                   └── i_item_id
   │         │    │    ├── inner-join (cross)
   │         │    │    │    ├── scan customer
   │         │    │    │    ├── select
   │         │    │    │    │    ├── scan date_dim
   │         │    │    │    │    └── filters
   │         │    │    │    │         ├── d_qoy = 1
   │         │    │    │    │         └── d_year = 2000
   │         │    │    │    └── filters (true)
   │         │    │    └── filters
   │         │    │         └── c_current_addr_sk = ca_address_sk
   │         │    ├── scan web_sales
   │         │    └── filters
   │         │         ├── ws_bill_customer_sk = c_customer_sk
   │         │         ├── ws_item_sk = i_item_sk
   │         │         └── ws_sold_date_sk = d_date_sk
   │         └── aggregations
   │              └── sum
   │                   └── ws_sales_price
   └── 100

I guess in those cases we do not look at the expression in the filter like we do in EXPLAIN (VEC) when trying to actually set up an ExprHelper.

@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@yuzefovich
Copy link
Member Author

EXPLAIN (VEC) on Q45 now works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
None yet
Development

No branches or pull requests

2 participants