Add additional tests for outer join push downs #14841
                hasBehavior(SUPPORTS_JOIN_PUSHDOWN_WITH_VARCHAR_EQUALITY),
                joinOverTableScans);

        // multiple bigint predicates
-        assertThat(query(session, "SELECT n.name, c.name FROM nation n JOIN customer c ON n.nationkey = c.nationkey and n.regionkey = c.custkey"))
+        assertThat(query(session, format("SELECT n.name, c.name FROM nation n %s customer c ON n.nationkey = c.nationkey and n.regionkey = c.custkey", joinOperator)))
The next part (// inequality) is a tricky one. I'm not sure why, but queries with most of the inequality operators start to be fully pushed down with OUTER joins (I'm working on this to understand why).
This part is added now as well. The problem was that we treat an INNER join in this case as a CROSS JOIN, and pushdown is disabled for such cases:
public class PushJoinIntoTableScan
        implements Rule<JoinNode>
{
    // ... (rest of the rule elided)

    @Override
    public Result apply(JoinNode joinNode, Captures captures, Context context)
    {
        if (joinNode.isCrossJoin()) {
            return Result.empty();
        }
        // ... (rest of the method elided)
    }
}
"the problem was that we treat INNER join in this case as CROSS JOIN"
A join with no equi conditions gets planned as
FilterNode
  - CrossJoin
    - Source A
    - Source B
"and disable push down for such cases:"
That is a safety measure.
But we need to match plan patterns like the one above and run join pushdown for these as well.
I guess @wendigo may be working on this. I remember explaining this to him.
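The plan shape described above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: `Node`, `Scan`, `Join`, `Filter` and `pushableJoinCondition` are simplified stand-ins invented for illustration, not Trino's plan node classes or rule API, and treating "no equi criteria" as the cross-join test is a simplification.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified plan model. These are NOT Trino's real plan classes.
final class FilterOverCrossJoinSketch
{
    interface Node {}

    record Scan(String table) implements Node {}

    // Simplification: a join without equi-join criteria is treated as a cross join.
    record Join(Node left, Node right, List<String> equiCriteria) implements Node
    {
        boolean isCrossJoin()
        {
            return equiCriteria.isEmpty();
        }
    }

    record Filter(Node source, String predicate) implements Node {}

    // Returns the filter predicate when the plan is Filter -> CrossJoin -> (Scan, Scan),
    // the shape that could be offered to a connector as a join with that condition.
    static Optional<String> pushableJoinCondition(Node node)
    {
        if (!(node instanceof Filter)) {
            return Optional.empty();
        }
        Filter filter = (Filter) node;
        if (!(filter.source() instanceof Join)) {
            return Optional.empty();
        }
        Join join = (Join) filter.source();
        if (!join.isCrossJoin()) {
            return Optional.empty();
        }
        if (!(join.left() instanceof Scan) || !(join.right() instanceof Scan)) {
            return Optional.empty();
        }
        return Optional.of(filter.predicate());
    }
}
```

A bare cross join with no filter on top yields an empty result here, which matches the safety measure mentioned above: only the filtered shape carries a condition worth handing to the connector.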
It looks like Postgres doesn't support FULL OUTER JOINs with inequality operators.
So I disabled pushdown of this join type for the Postgres client. I'm not sure whether that belongs in this PR or a separate one, and maybe we could disable it in a smarter way: not for all such joins, but only for those with inequality operators.
@@ -970,6 +970,10 @@ public Optional<PreparedQuery> implementJoin(
            Map<JdbcColumnHandle, String> leftAssignments,
            JoinStatistics statistics)
    {
+        if (joinType == JoinType.FULL_OUTER) {
+            // FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
This suggests that in the future we should look into making the joinType available to isSupportedJoinCondition, so that for FULL_OUTER we can check that the joinCondition is one Postgres supports and allow pushdown in those cases.
(No change requested, but maybe we should create an issue about this possible future enhancement.)
Issue added #14929
@DataProvider
public Object[][] joinOperators()
{
    if (hasBehavior(SUPPORTS_JOIN_PUSHDOWN_WITH_FULL_JOIN)) {
When the connector does not have the behavior SUPPORTS_JOIN_PUSHDOWN_WITH_FULL_JOIN, the test should verify that FULL JOIN pushdown is indeed not supported.
Otherwise the connectors' declarations won't be tested for truthfulness (and will be wrong).
fixed
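The point of the request above is that a declared behavior should be checked in both directions. A minimal sketch of that idea, in plain Java: `DeclaredBehaviorSketch` and `assertDeclarationIsTruthful` are hypothetical names for illustration and are not part of BaseJdbcConnectorTest or QueryAssertions.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, for illustration only.
final class DeclaredBehaviorSketch
{
    // Fails when the connector's declaration and the observed plan disagree, in either direction.
    static void assertDeclarationIsTruthful(boolean declaredSupported, BooleanSupplier joinWasPushedDown)
    {
        boolean pushedDown = joinWasPushedDown.getAsBoolean();
        if (declaredSupported && !pushedDown) {
            throw new AssertionError("declared as supported, but the join stayed in the engine");
        }
        if (!declaredSupported && pushedDown) {
            throw new AssertionError("declared as unsupported, but the join was pushed down");
        }
    }
}
```

With only the first check, a connector that declares "unsupported" while actually pushing the join down would never be caught.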
@@ -970,6 +970,10 @@ public Optional<PreparedQuery> implementJoin(
            Map<JdbcColumnHandle, String> leftAssignments,
            JoinStatistics statistics)
    {
+        if (joinType == JoinType.FULL_OUTER) {
+            // FULL JOIN is only supported with merge-joinable or hash-joinable join conditions
+            return Optional.empty();
This looks like it disables a lot of functionality. However, "merge- or hash-joinable" conditions sounds like "equality and inequality", so all the comparison expressions?
Can we support outer join pushdown with some conditions?
Yes, we can have FULL OUTER JOIN only for equality (=);
for all the others, like <, <=, <>, IS DISTINCT FROM, we get an exception.
I created an issue for this: #14929
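The finer-grained check discussed in this thread (the enhancement tracked in #14929) could look roughly like the sketch below. It only encodes what the thread reports, namely that FULL OUTER JOIN worked with `=` and failed with the other operators; it does not claim to capture PostgreSQL's complete rule. The `JoinType` and `Operator` enums and `canPushDown` are local, hypothetical stand-ins, not the SPI types or the connector's real method.

```java
import java.util.List;

// Hypothetical sketch, assuming the behavior reported in this thread.
final class FullJoinConditionSketch
{
    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    enum Operator { EQUAL, NOT_EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL, IS_DISTINCT_FROM }

    // Other join types are left alone; FULL OUTER is allowed only when every condition is an equality.
    static boolean canPushDown(JoinType joinType, List<Operator> conditionOperators)
    {
        if (joinType != JoinType.FULL_OUTER) {
            return true;
        }
        return !conditionOperators.isEmpty()
                && conditionOperators.stream().allMatch(operator -> operator == Operator.EQUAL);
    }
}
```

Compared with returning Optional.empty() for every FULL_OUTER join, this would keep pushdown for the equality-only case while still avoiding the failing queries.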
26713ea to 38a77bc
some comments.

        // join over join
        assertThat(query(session, "SELECT * FROM nation n, region r, customer c WHERE n.regionkey = r.regionkey AND r.regionkey = c.custkey"))
                .isFullyPushedDown();
    }
}

@DataProvider
public Object[][] joinOperators()
"Operators" is the execution term while "type" is the SQL term, maybe? Would joinTypes be better (especially since there's no separate join operator for each of these "operators")?
(No change requested, just seeking opinions from others.)
7769acc to 33e66f3
@DataProvider
public Object[][] joinOperators()
{
    return new Object[][] {{JOIN}, {LEFT_JOIN}, {RIGHT_JOIN}, {FULL_JOIN}};
It seems all values of JoinOperator are listed here, so how about JoinOperator.values() + DataProviders#toDataProvider?
fixed
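For readers without the Trino test utilities at hand, here is a plain-Java sketch of what the suggestion amounts to: deriving the data provider rows from the enum's values instead of listing them by hand, and relying on toString() when formatting the query. The enum mirrors the JoinOperator quoted in this thread; the `value` field name, the wrapper class `JoinOperatorSketch` and the `joinQuery` helper are assumptions for illustration, and the stream pipeline is a stand-in for DataProviders#toDataProvider rather than its implementation.

```java
import java.util.Arrays;

// Illustrative sketch only; the wrapper class and joinQuery are hypothetical.
final class JoinOperatorSketch
{
    enum JoinOperator
    {
        JOIN("JOIN"),
        LEFT_JOIN("LEFT JOIN"),
        RIGHT_JOIN("RIGHT JOIN"),
        FULL_JOIN("FULL JOIN");

        private final String value;

        JoinOperator(String value)
        {
            this.value = value;
        }

        // Returning the SQL spelling lets the constant be dropped straight into format("... %s ...").
        @Override
        public String toString()
        {
            return value;
        }
    }

    // One single-element row per enum constant, the shape a TestNG data provider returns.
    static Object[][] joinOperators()
    {
        return Arrays.stream(JoinOperator.values())
                .map(operator -> new Object[] {operator})
                .toArray(Object[][]::new);
    }

    static String joinQuery(JoinOperator operator)
    {
        return String.format("SELECT n.name, c.name FROM nation n %s customer c ON n.nationkey = c.nationkey", operator);
    }
}
```

Adding a constant to the enum then extends the test matrix automatically, with no second list to keep in sync.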
public String toString()
{
    return value;
nice trick
public enum JoinOperator
{
    JOIN("JOIN"),
    LEFT_JOIN("LEFT JOIN"),
    RIGHT_JOIN("RIGHT JOIN"),
    FULL_JOIN("FULL JOIN"),
Is it a 1-to-1 mapping with io.trino.spi.connector.JoinType?
Have you considered reusing JoinType?
Have you considered verifying that JoinOperator contains all values of JoinType?
Do you anticipate that other values can be present here in the future?
I think this operator is more about how the JOIN appears in the actual query string.
For example, in the future or for some connectors (in case of some bugs) we could write LEFT OUTER JOIN instead of, or in addition to, LEFT JOIN (even though this is the same type of join).
So I'd prefer to keep these things separate.
I don't know how to tackle already-defined enums and their string representation elegantly.
And this does not seem likely to change frequently or any time soon.
So maybe at least add some verification (JoinOperator.values.size() vs JoinType.values().size())?
"or for some connectors (in case of some bugs) we could add/write LEFT OUTER JOIN instead/additionally to LEFT JOIN."
I don't see how to do that easily with the current implementation, but I think it does not matter here/now.
"don't see how to do that easily with current implementation, but I think it does not matter here/now."
You can just add it to the enum:
...
LEFT_OUTER_JOIN("LEFT OUTER JOIN"),
...
"so may be at least some verify (JoinOperator.values.size() vs JoinType.values().size())?"
For me it's not a mapping to actual joins. Maybe another name would help, like JoinOperatorStringRepresentation, I don't know.
It's important to decouple the following three things:
1. Join types on the plan level (represented by JoinNode.Type)
2. Join types on the SPI level (represented by JoinType)
3. Mapping of SPI join types to SQL strings (represented by JoinOperator here)
There's no reason for a 1:1 mapping between 1 and 2.
There's also no reason for a 1:1 mapping between 2 and 3, e.g. LEFT OUTER can appear in SQL text as LEFT or LEFT OUTER; both are the same thing.
It would be useful to verify that JoinType values are a subset of JoinOperator, but it sounds premature: it's only a problem when someone implements a new join node, adds a plan optimizer rule to push it down to the table scan, and implements it in some connector, all without adding tests.
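For reference, the subset check mentioned above is small when written out. The sketch below is hypothetical throughout: the `JoinType` enum is a local stand-in, and the spelling-to-type map is invented for illustration, since the JoinOperator in this PR carries only SQL strings and no link to a join type. Several spellings may map to one type, in line with the point that there is no 1:1 mapping.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a "every join type has at least one SQL spelling" check.
final class JoinTypeCoverageSketch
{
    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    // Invented mapping from SQL spelling to join type; two spellings share LEFT_OUTER.
    static Map<String, JoinType> sqlSpellings()
    {
        Map<String, JoinType> spellings = new LinkedHashMap<>();
        spellings.put("JOIN", JoinType.INNER);
        spellings.put("LEFT JOIN", JoinType.LEFT_OUTER);
        spellings.put("LEFT OUTER JOIN", JoinType.LEFT_OUTER);
        spellings.put("RIGHT JOIN", JoinType.RIGHT_OUTER);
        spellings.put("FULL JOIN", JoinType.FULL_OUTER);
        return spellings;
    }

    // Join types that no spelling covers; an empty result means the subset property holds.
    static Set<JoinType> uncoveredJoinTypes(Map<String, JoinType> spellings)
    {
        Set<JoinType> missing = EnumSet.allOf(JoinType.class);
        missing.removeAll(spellings.values());
        return missing;
    }
}
```

Checking coverage rather than comparing sizes stays valid when extra spellings such as LEFT OUTER JOIN are added.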
33e66f3 to c4054d0
LGTM % #14841 (comment).
54a94a5 to db7755e
lgtm
This is easier to use than isNotFullyPushedDown in cases where you only need to verify that some PlanNode still exists and was not consumed by the connector, instead of verifying the exact shape of some sub-plan.
This uncovers a bug in the Postgres connector: pushing down FULL OUTER join queries with inequality join conditions fails with an error like "FULL JOIN is only supported with merge-joinable or hash-joinable join conditions". So FULL OUTER join pushdown is disabled for the Postgres connector at the moment.
assertJoinConditionallyPushedDown is simpler to use and more generic, as it isn't affected by the exact plan shape.
db7755e to fcee24d
Just reworded + rebased (since a new release was done) to avoid logical conflicts, if any exist.
Let's wait with the merge.
I think we are OK to move forward now.
Description
In "Implement Join pushdown for JDBC connectors" we implemented join pushdown for JDBC connectors.
As part of that, a test BaseJdbcConnectorTest#testJoinPushdown was added, which verifies that LEFT, RIGHT and FULL joins get pushed down with all possible operators (=, !=, <, <=, >, >=, IS DISTINCT FROM, IS NOT DISTINCT FROM).
This PR is an initial attempt to add more test cases for outer join pushdowns.
Non-technical explanation
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: