-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter out Iceberg splits that do not match predicates not expressible by tuple domain #9830
Filter out Iceberg splits that do not match predicates not expressible by tuple domain #9830
Conversation
af3f1f3
to
c27e6c9
Compare
Failing tests are |
@@ -511,7 +511,7 @@ private PlanRoot doPlanQuery() | |||
private void planDistribution(PlanRoot plan) | |||
{ | |||
// plan the execution on the active nodes | |||
DistributedExecutionPlanner distributedPlanner = new DistributedExecutionPlanner(splitManager, metadata, dynamicFilterService); | |||
DistributedExecutionPlanner distributedPlanner = new DistributedExecutionPlanner(splitManager, metadata, dynamicFilterService, new TypeAnalyzer(sqlParser, metadata)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inject TypeAnalyzer
from guice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about that but I noticed that in the doPlanQuery
method it is created the same way so I thought it is better to be consisted. On the other hand I can replace both occurrences with an injected value.
WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense to change both and not construct TypeAnalyzer
manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be a prep commit
|
||
public LayoutConstraintEvaluator(Metadata metadata, TypeAnalyzer typeAnalyzer, Session session, TypeProvider types, Map<Symbol, ColumnHandle> assignments, Expression expression) | ||
{ | ||
this.assignments = assignments; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rnn
and ImmutableMap.copyOf
(now that the class is public, it's more important)
.../trino-main/src/main/java/io/trino/sql/planner/iterative/rule/LayoutConstraintEvaluator.java
Outdated
Show resolved
Hide resolved
core/trino-spi/src/main/java/io/trino/spi/connector/ConstraintEvaluator.java
Outdated
Show resolved
Hide resolved
core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorSplitManager.java
Outdated
Show resolved
Hide resolved
@@ -2429,9 +2429,7 @@ public void testSplitPruningForFilterOnPartitionColumn() | |||
verifySplitCount("SELECT * FROM " + tableName + " WHERE regionkey < 2", 2); | |||
verifySplitCount("SELECT * FROM " + tableName + " WHERE regionkey < 0", 0); | |||
verifySplitCount("SELECT * FROM " + tableName + " WHERE regionkey > 1 AND regionkey < 4", 2); | |||
|
|||
// TODO(https://github.com/trinodb/trino/issues/9309) should be 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark your PR as fixing this issue (and #7608 too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do I do that? I already have #9309 in the description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -158,7 +159,8 @@ public void testQueryDividedIntoSplitsFirstSplitHasRightTime() | |||
null, | |||
new PrometheusTableHandle("default", table.getName()), | |||
null, | |||
(DynamicFilter) null); | |||
(DynamicFilter) null, | |||
null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass ALWAYS_TRUE (aka EMPTY)
(here & below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, again I followed DynamicFiltering example ;)
@@ -203,12 +207,21 @@ private Visitor( | |||
dynamicFilter = dynamicFilterService.createDynamicFilter(session.getQueryId(), dynamicFilters, node.getAssignments(), typeProvider); | |||
} | |||
|
|||
ConstraintEvaluator constraintEvaluator = ConstraintEvaluator.EMPTY; | |||
Optional<ConstraintEvaluator> existingEvaluator = filter.map(FilterNode::getPredicate) | |||
.filter(expression -> !DynamicFilters.isDynamicFilter(expression)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to disassemble the expression
into conjuncts, filtering out dynamic filters
Expression nonDynamicPredicate = [ExpressionUtils.]filterConjuncts(metadata, predicate, expression -> !DynamicFilters.isDynamicFilter(expression));
core/trino-main/src/main/java/io/trino/sql/planner/DistributedExecutionPlanner.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitSource.java
Show resolved
Hide resolved
7778b5a
to
5394f6b
Compare
@findepi please take another look ;) |
core/trino-spi/src/main/java/io/trino/spi/connector/ConstraintEvaluator.java
Outdated
Show resolved
Hide resolved
import java.util.Map; | ||
import java.util.Set; | ||
|
||
public interface ConstraintEvaluator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some javadoc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can give it a try ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just:
Represents a constraint passed to {@link ConnectorSplitManager#getSplits}.
If {@link ConstraintEvaluator#isCandidate} returns false given actual partial column values binding, for given split,
it means split can be safely dropped without processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think i know why we do not pass Constraint
into split manager -- we don't want to pass a TupleDomain so that connector doesn't want to intersect DynamicFilter's domain with Constraint's domain.
but then, the two classes: Constraint
and ConstraintEvaluator
have naturally similar names, and the second looks like the evaluator of the first.
- i am not sure about the naming. maybe we can find a better name, like
TestableConstraint
or something - should
Constraint
reuse this new class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @losipiuk ^
core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/ExtractSpatialJoins.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
5394f6b
to
60c6a8e
Compare
60c6a8e
to
e3ba940
Compare
core/trino-main/src/main/java/io/trino/sql/planner/DistributedExecutionPlanner.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/testing/LocalQueryRunner.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergSplitSource.java
Outdated
Show resolved
Hide resolved
plugin/trino-kudu/src/test/java/io/trino/plugin/kudu/TestKuduIntegrationDynamicFilter.java
Outdated
Show resolved
Hide resolved
testing/trino-benchmark/src/main/java/io/trino/benchmark/AbstractOperatorBenchmark.java
Outdated
Show resolved
Hide resolved
Nothing important |
e3ba940
to
49491a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm except 49491a2#r745486692
nits
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package io.trino.sql.planner.iterative.rule; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
io.trino.sql.planner
, as it's going to be used outside of iterative optimizer
{ | ||
Map<ColumnHandle, NullableValue> bindings = new HashMap<>(); | ||
for (IcebergColumnHandle partitionColumn : identityPartitionColumns) { | ||
Object partitionValue = deserializePartitionValue( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bump?
49491a2
to
fc3644d
Compare
1ad71f2
to
32dc400
Compare
@@ -203,12 +212,19 @@ private Visitor( | |||
dynamicFilter = dynamicFilterService.createDynamicFilter(session.getQueryId(), dynamicFilters, node.getAssignments(), typeProvider); | |||
} | |||
|
|||
Constraint constraint = filterPredicate | |||
.map(predicate -> filterConjuncts(metadata, predicate, expression -> !DynamicFilters.isDynamicFilter(expression))) | |||
.map(expression -> new LayoutConstraintEvaluator(metadata, typeAnalyzer, session, typeProvider, node.getAssignments(), expression)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you use predicate
here again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
Constraint constraint = filterPredicate | ||
.map(predicate -> filterConjuncts(metadata, predicate, expression -> !DynamicFilters.isDynamicFilter(expression))) | ||
.map(expression -> new LayoutConstraintEvaluator(metadata, typeAnalyzer, session, typeProvider, node.getAssignments(), expression)) | ||
.map(evaluator -> new Constraint(TupleDomain.all(), evaluator::isCandidate, evaluator.getArguments())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment
// we are interested only in functional predicate here so we set the summary to ALL.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitSource.java
Show resolved
Hide resolved
32dc400
to
dd64c8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message nits:
Inject type analyzer int SqlQueryExecution instead of creating it
"int" -> "into"
Filter out splits that do not match predicates not expressible by tup…
is too long
{ | ||
try (SetThreadName ignored = new SetThreadName("Query-%s", stateMachine.getQueryId())) { | ||
this.slug = requireNonNull(slug, "slug is null"); | ||
this.metadata = requireNonNull(metadata, "metadata is null"); | ||
this.typeOperators = requireNonNull(typeOperators, "typeOperators is null"); | ||
this.sqlParser = requireNonNull(sqlParser, "sqlParser is null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the sqlParser
parameter is unused now, remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used, in line 193. it is passed to analyze
method
@@ -167,6 +190,9 @@ public IcebergSplitSource( | |||
continue; | |||
} | |||
} | |||
if (!partitionMatchesConstraint(identityPartitionColumns, partitionValues.get(), constraint)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Supplier's get
is always getting called here, which kinda defeats the purpose. You can add an constraint.predicate().isPresent()
check first, before calling partitionMatchesConstraint
, or pass the entire Supplier to partitionMatchesConstraint
rather than calling get
here to prolong constructing the Map until actually necessary.
…le domain fixes trinodb#9309 fixes trinodb#7608 The idea is to push constraint down to split source and not generate splits that would be filtered out by this constraint's predicate.
dd64c8f
to
a901d2c
Compare
@alexjo2144 @losipiuk @findepi comments addressed |
fixes #9309
fixes #7608
The idea is to push predicate down to split source and not generate splits
that would be filtered out by this predicate.