Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConnectorExpression pushdown #7994

Merged
merged 2 commits into from
Mar 2, 2022

Conversation

assaf2
Copy link
Member

@assaf2 assaf2 commented May 20, 2021

For #18

@cla-bot
Copy link

cla-bot bot commented May 20, 2021

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Assaf Bern.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@alec-heif
Copy link
Contributor

hi @assaf2 are you still planning to work on this PR? this would be very useful for my organization for being able to push down more complex queries into postgres.

@cla-bot cla-bot bot added the cla-signed label Jun 20, 2021
@assaf2
Copy link
Member Author

assaf2 commented Jun 20, 2021

hi @assaf2 are you still planning to work on this PR? this would be very useful for my organization for being able to push down more complex queries into postgres.

@alec-heif good to hear, yes I'm on it

@assaf2 assaf2 force-pushed the connector_expression_pushdown branch 2 times, most recently from 8dadc26 to c1c043a Compare June 20, 2021 14:00
@findepi
Copy link
Member

findepi commented Jun 21, 2021

@martint do you have a doc or some writeup outlining your thinking about desired end-goal for ConnectorExpression-based pushdowns, especially predicate pushdown?

Besides the end-goal I would love to help us define intermediate steps that will allow us split the work into smaller pieces, without boiling the ocean.
It seems to me the ideal scenario is when we can gradually migrate from TupleDomain to ConnectorExpression
This requires to marry the two (e.g. be able to embed TupleDomain or Domain in ConnectorExpression.

Defining those intermediate steps should not only help reach the end-goal sooner, but also would allow deliver partial value to users as we go.

cc @losipiuk

@rzeyde-varada
Copy link
Contributor

Greetings,
We'll be happy to help with the development of this feature - so please let us know the next implementation steps.
Our use case is complex predicates' pushdown (e.g. using string-related functions over some of the columns):
https://trinodb.slack.com/archives/CGJS76ZSR/p1607419450007700

@martint martint self-requested a review July 11, 2021 16:17
@willzgw
Copy link
Contributor

willzgw commented Aug 18, 2021

Hi everyone, may I know where we are with this PR?
Recently, we have a requirement to push time_floor(__time, 'period') down to Druid queries.
I did some research and found that we need a new class like PushAggregationIntoTableScan to support this feature.
Maybe I'm in a wrong way. I am not sure.
Could I get your guidance?

int dfaStatesLimit = SystemSessionProperties.getRe2JDfaStatesLimit(session);
int dfaRetries = SystemSessionProperties.getRe2JDfaRetries(session);
Re2JRegexp re2JRegExp = new Re2JRegexp(dfaStatesLimit, dfaRetries, slice);
Type type = new Re2JRegexpType(dfaStatesLimit, dfaRetries);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obtain the type object from TypeManager via getType(Re2JRegexpType.RE2J_REGEXP_SIGNATURE) instead of constructing an instance manually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to add TypeManager to PlanOptimizers's @Inject constructor and to pass it from there. I had to change ~15 files including tests, but then I got a runtime error since PlanOptimizers is bound at CoordinatorModule, while TypeManager is not bound yet.
I couldn't find another way to obtain TypeManager, any suggestions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what we need to implement this more generically, without having to hardcode any knowledge about how regexp functions work:

protected Optional<Expression> translateCall(Call call)
{
    if (LIKE_FUNCTION_NAME.equals(call.getName())) {
        return translateLike(call.getArguments().get(0), call.getArguments().get(1));
    }

    QualifiedName name = QualifiedName.of(call.getName());
    FunctionCallBuilder builder = new FunctionCallBuilder(metadata)
            .setName(name);

    List<TypeSignature> argumentTypes = call.getArguments().stream()
            .map(argument -> argument.getType().getTypeSignature())
            .collect(toCollection(ArrayList::new));

    ResolvedFunction resolved = metadata.resolveFunction(name, TypeSignatureProvider.fromTypeSignatures(argumentTypes));
    for (int i = 0; i < call.getArguments().size(); i++) {
        Expression expression = ConnectorExpressionTranslator.translate(session, call.getArguments().get(i), metadata, variableMappings, literalEncoder);
        Type expectedType = resolved.getSignature().getArgumentTypes().get(i);
        Type actualType = metadata.getType(argumentTypes.get(i));

        if (!actualType.equals(expectedType)) {
            verify(typeCoercion.canCoerce(actualType, expectedType));
            expression = new Cast(expression, toSqlType(expectedType));
        }

        builder.addArgument(expectedType, expression);
    }

    return Optional.of(builder.build());
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
I just made a single change to your suggestion - replaced if (!actualType.equals(expectedType)) { with if (!typeCoercion.isTypeOnlyCoercion(actualType, expectedType)) { because, for example, at
testCrossJoinLargeBuildSideDynamicFiltering() we get $internal$dynamic_filter_function with a varchar(9) argument while its expectedType = varchar. Is this change ok by you?

In addition, are you aware of an equivalently generic way to implement io.trino.sql.planner.ConnectorExpressionTranslator.SqlToConnectorExpressionTranslator#visitFunctionCall which still refers to JoniRegexp and Re2JRegexp specifically?

Comment on lines 129 to 131
List<Expression> arguments = call.getArguments().stream()
.map(argument -> ConnectorExpressionTranslator.translate(session, argument, metadata, literalEncoder, Optional.empty()))
.collect(Collectors.toList());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This translation can be done inside the if block below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The translator currently supports many functions.
If I'll do so, I'll need to create a FunctionCall only inside the if, so only regexp_like and LIKE functions will be supported. Therefore, I'll also need to change io.trino.sql.planner.ConnectorExpressionTranslator.SqlToConnectorExpressionTranslator#visitFunctionCall to convert only regexp functions (because otherwise we'll translate only in one direction).
Do you want me to narrow the translation this way?

@jerryleooo
Copy link
Member

This for the hard work, not sure if #9067 belongs to this issue?

@assaf2 assaf2 force-pushed the connector_expression_pushdown branch 2 times, most recently from 82d9db9 to 3d6ce2e Compare February 17, 2022 12:29
@assaf2 assaf2 requested a review from findepi February 17, 2022 12:35
@assaf2 assaf2 force-pushed the connector_expression_pushdown branch 2 times, most recently from 7d9ae24 to 30e2310 Compare February 17, 2022 15:10
@@ -249,14 +269,24 @@ private boolean arePlansSame(FilterNode filter, TableScanNode tableScan, PlanNod
node.isUpdateTarget(),
node.getUseConnectorNodePartitioning());

Expression remainingDecomposedPredicate;
if (connectorExpression.isEmpty() || remainingConnectorExpression.isEmpty() || remainingConnectorExpression.equals(connectorExpression)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If connectorExpression is empty (i.e., either there wasn't a expression in the first place or the translation from Expression->ConnectorExpression failed), it is still possible that remainingConnectorExpression is not empty (i.e., if the connector synthesizes an expression). In that case, we need to combine decomposedPredicate.getRemainingExpression() with the result from applyFilter().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

( followed up here: #7994 (comment) )

@assaf2 assaf2 force-pushed the connector_expression_pushdown branch from 30e2310 to f121df1 Compare February 20, 2022 09:43
@assaf2 assaf2 requested a review from martint February 20, 2022 11:14
@assaf2 assaf2 force-pushed the connector_expression_pushdown branch from f121df1 to 0835585 Compare February 20, 2022 11:38
remainingDecomposedPredicate = ExpressionUtils.combineConjuncts(plannerContext.getMetadata(), translatedExpression, decomposedPredicate.getRemainingExpression());
}
else {
remainingDecomposedPredicate = translatedExpression;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conversion

 Optional<ConnectorExpression> connectorExpression = new ConnectorExpressionTranslator.SqlToConnectorExpressionTranslator(session, remainingExpressionTypes, plannerContext)
                .process(decomposedPredicate.getRemainingExpression());

is lossy.

i.e. decomposedPredicate.getRemainingExpression() contains full information
but then connectorExpression contains partial information
remainingConnectorExpression contains part of connectorExpression, so again not a full information
so then translatedExpression cannot supersede remainingDecomposedPredicate

i think SqlToConnectorExpressionTranslator should work like a DomainTranslator, i..e it should spit out (a) translated result and (b) remaining expression

@martint do you have a suggestion how to handle this for now, so that SqlToConnectorExpressionTranslator change can be a follow up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is lossy.

That should not be the case. The translation should either succeed and be equivalent to the original expression, or fail and return Optional.empty().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed @martint it's not a problem, as it's not lossy.
Although it should be. If a conjunct cannot be translated, we still should push down the others that can. cc @hashhar
But that's something we can follow up on.

@findepi findepi force-pushed the connector_expression_pushdown branch from 1978d9f to 110c2ce Compare March 1, 2022 09:11
@findepi
Copy link
Member

findepi commented Mar 1, 2022

Squashed fixups. Previous CI run https://github.com/trinodb/trino/actions/runs/1909288734 has some failures but apparently not related. let's see CI this time.

@assaf2 assaf2 force-pushed the connector_expression_pushdown branch from 110c2ce to a96f5ee Compare March 1, 2022 15:00
@findepi findepi merged commit 7ed64c7 into trinodb:master Mar 2, 2022
@findepi
Copy link
Member

findepi commented Mar 2, 2022

Merged, thanks!

@findepi findepi mentioned this pull request Mar 2, 2022
@slhmy
Copy link

slhmy commented Mar 2, 2022

Really excieted to see this PR merged.

@github-actions github-actions bot added this to the 372 milestone Mar 2, 2022
Optional<Expression> translatedValue = translate(session, value);
Optional<Expression> translatedPattern = translate(session, pattern);
if (translatedValue.isPresent() && translatedPattern.isPresent()) {
return Optional.of(new LikePredicate(translatedValue.get(), translatedPattern.get(), Optional.empty()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martint Shouldn't this use LikeFunctions#LIKE_PATTERN_FUNCTION_NAME, so that we don't have to re-run DesugarLike?

cc @wendigo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

10 participants