Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate semi-join with empty right side #13803

Closed
findepi opened this issue Aug 23, 2022 · 2 comments · Fixed by #14335
Closed

Eliminate semi-join with empty right side #13803

findepi opened this issue Aug 23, 2022 · 2 comments · Fixed by #14335
Labels
enhancement New feature or request

Comments

@findepi
Copy link
Member

findepi commented Aug 23, 2022

trino> EXPLAIN SELECT * FROM tpch.tiny.nation WHERE nationkey IN (SELECT 1 WHERE false);
                                                         Query Plan
----------------------------------------------------------------------------------------------------------------------------
                                                         Query Plan
----------------------------------------------------------------------------------------------------------------------------
 Fragment 0 [SINGLE]
     Output layout: [nationkey, name, regionkey, comment]
     Output partitioning: SINGLE []
     Output[columnNames = [nationkey, name, regionkey, comment]]
     │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
     │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
     └─ InnerJoin[criteria = ("nationkey" = "subquery"), hash = [$hashvalue, $hashvalue_7], distribution = REPLICATED]
        │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
        │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
        │   Distribution: REPLICATED
        ├─ LocalExchange[partitioning = ROUND_ROBIN]
        │  │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), $hashvalue:bigint]
        │  │   Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
        │  └─ Project[]
        │     │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), $hashvalue_4:bigint]
        │     │   Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
        │     │   $hashvalue_4 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("nationkey"), 0))
        │     └─ Values[]
        │            Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
        │            Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
        └─ Project[]
           │   Layout: [subquery:bigint, $hashvalue_7:bigint]
           │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
           │   $hashvalue_7 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("subquery"), 0))
           └─ Aggregate[type = FINAL, keys = [subquery]]
              │   Layout: [subquery:bigint]
              │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
              └─ LocalExchange[partitioning = HASH, hashColumn = [$hashvalue_5], arguments = ["subquery"]]
                 │   Layout: [subquery:bigint, $hashvalue_5:bigint]
                 │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
                 └─ Project[]
                    │   Layout: [subquery:bigint, $hashvalue_6:bigint]
                    │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
                    │   $hashvalue_6 := combine_hash(bigint '0', COALESCE("$operator$hash_code"("subquery"), 0))
                    └─ Aggregate[type = PARTIAL, keys = [subquery]]
                       │   Layout: [subquery:bigint]
                       │   Estimates:
                       └─ Values[]
                              Layout: [subquery:bigint]
                              Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}

the WHERE nationkey IN (SELECT 1 WHERE false) should get simplified to WHERE false.

@albericgenius
Copy link
Contributor

@findepi i am not sure it is right solution to reused RemoveRedundantJoin here.
Or I should create a new Rule.
Welcome for any comments, I will update asap.
Thanks

@findepi
Copy link
Member Author

findepi commented Sep 28, 2022

let's have a separate rule

note that SemiJoinNode doesn't do filtering, it only projects whether there is match, so a rule would need to match FilterNode + SemiJoinNode

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

Successfully merging a pull request may close this issue.

2 participants