Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL LIKE predicate pushdown for varchar/nvarchar columns. #18441

Merged

Conversation

vlad-lyutenko
Copy link
Contributor

Description

LIKE predicate pushdown for varchar/nvarchar columns.

splited from #18140

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jul 27, 2023
@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch from ee965a1 to e723911 Compare July 27, 2023 14:23
@vlad-lyutenko vlad-lyutenko changed the title LIKE predicate pushdown for varchar/nvarchar columns. MySQL LIKE predicate pushdown for varchar/nvarchar columns. Jul 27, 2023
@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch 2 times, most recently from 3143dc5 to 7321d87 Compare July 28, 2023 11:52
@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch 5 times, most recently from 865b0ea to 2f46f73 Compare July 28, 2023 14:17
@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch from 2f46f73 to 0e730a8 Compare July 31, 2023 09:23
@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch from 0e730a8 to 5cf783e Compare July 31, 2023 12:39
return Optional.of(new ParameterizedExpression(format("%s LIKE %s ESCAPE %s", value.get().expression(), pattern.get().expression(), escape.get().expression()), parameters.build()));
}

return Optional.of(new ParameterizedExpression(format("%s LIKE %s", value.get().expression(), pattern.get().expression()), parameters.build()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we have a String builder which could capture the %s LIKE %s and optionally could add the ESCAPE clause towards the end and return a new ParameterizedExpression.

@@ -283,9 +284,53 @@ public void testAddNotNullColumn()
}
}

@Test
public void testLikePredicatePushdownWithCollation()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to test them on top of a view created with collate ?

Copy link
Contributor Author

@vlad-lyutenko vlad-lyutenko Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this PR (#18140) we have such test, which is executed on top of view, I want to add there such test case, so need to merge it (#18140) first

.with(argumentCount().matching(count -> count == 2 || count == 3))
.with(argument(0).matching(expression().capturedAs(LIKE_VALUE).with(type().matching(VarcharType.class::isInstance))))
.with(argument(1).matching(expression().capturedAs(LIKE_PATTERN).with(type().matching(VarcharType.class::isInstance))));
// TODO add capture on ESCAPE argument and type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be resolved before merging?
Right now we rely on what context.defaultRewrite(escapeExpression) returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We capture first 2 arguments, and third optional we took from arguments.

I cannot add third capture, because in this case - pattern matching will not be able to capture call with 2 arguments.

I know that it's optional -

   public static Property<Call, ?, ConnectorExpression> argument(int argument)
    {
        checkArgument(0 <= argument, "Invalid argument index: %s", argument);
        return Property.optionalProperty(format("argument(%s)", argument), call -> {
            if (argument < call.getArguments().size()) {
                return Optional.of(call.getArguments().get(argument));
            }
            return Optional.empty();
        });
    }

But it didn't work, maybe some bug in parsing.
I am investigating this and will create issue

Copy link
Contributor Author

@vlad-lyutenko vlad-lyutenko Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I think I need to remove todo, because it's not a bug, it's just the way how things works:
Inside WithPattern


@Override
    public <C> Stream<Match> accept(Object object, Captures captures, C context)
    {
        //TODO remove cast
        BiFunction<? super T, C, Optional<?>> property = (BiFunction<? super T, C, Optional<?>>) propertyPattern.getProperty().getFunction();
        Optional<?> propertyValue = property.apply((T) object, context);
        return propertyValue.map(value -> propertyPattern.getPattern().match(value, captures, context))
                .orElseGet(Stream::of);
    }

So if propertyValue is empty or propertyPattern.getPattern().match(value, captures, context) didn't match - it's the same, we didn't capture anything and returning empty stream and stop the flow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is no such thing like "optional" capture argument - if you add it it will be captured or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR could be done as follow up - #18481

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simpler solution would be to just split into two rules - one for LIKE and one for LIKE with ESCAPE. Like how we have in the existing DSL based rules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing Pattern so that it can accept partial matches without explicitly asking for partial matches looks risky.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it would result in some redundant code right ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR however consider this comment resolved since the result is that the TODO is applicable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hashhar thx! refactored to separate rules
cc @Praveen2112

@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch 3 times, most recently from 1c1522e to dcda949 Compare August 1, 2023 11:24
@@ -30,6 +30,7 @@
import java.sql.PreparedStatement;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests where column is case insensitive and push down does not happen

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have it in BaseMysqlConnectorTest:

    @Test
    public void testPredicatePushdown()
    {
        // varchar like
        assertThat(query("SELECT regionkey, nationkey, name FROM nation WHERE name LIKE '%ROM%'"))
                .matches("VALUES (BIGINT '3', BIGINT '19', CAST('ROMANIA' AS varchar(255)))")
                .isNotFullyPushedDown(FilterNode.class);

@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch from dcda949 to 7aaed5e Compare August 1, 2023 12:47
Copy link
Member

@Praveen2112 Praveen2112 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

% changes

import static io.trino.spi.type.BooleanType.BOOLEAN;
import static java.lang.String.format;

public class RewriteLikeEscapeWithCaseSensitivity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to mention CaseSensitivity ? If we are going to use this rule for other JDBC related connectors ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I think it's important for developer to understand looking at name that we rely on case sensitivity in this rule.
He could miss it from code.

@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch from 7aaed5e to c0d111e Compare August 1, 2023 13:28
@vlad-lyutenko vlad-lyutenko force-pushed the vlad-lyutenko/mysql-like-pushdown branch from c0d111e to 7a2300b Compare August 1, 2023 15:43
@Praveen2112 Praveen2112 merged commit a470785 into trinodb:master Aug 2, 2023
@Praveen2112
Copy link
Member

Thanks for working on this

@github-actions github-actions bot added this to the 423 milestone Aug 2, 2023
Comment on lines +66 to +72
ConnectorExpression capturedValue = captures.get(LIKE_VALUE);
if (capturedValue instanceof Variable variable) {
JdbcColumnHandle columnHandle = (JdbcColumnHandle) context.getAssignment(variable.getName());
Optional<CaseSensitivity> caseSensitivity = columnHandle.getJdbcTypeHandle().getCaseSensitivity();
if (caseSensitivity.orElse(CASE_INSENSITIVE) == CASE_INSENSITIVE) {
return Optional.empty();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when ! capturedValue instanceof Variable, we don't know whether it's safe to pushdown and we should not do it

.with(functionName().equalTo(LIKE_FUNCTION_NAME))
.with(type().equalTo(BOOLEAN))
.with(argumentCount().equalTo(3))
.with(argument(0).matching(expression().capturedAs(LIKE_VALUE).with(type().matching(VarcharType.class::isInstance))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per https://github.com/trinodb/trino/pull/18441/files#r1431506745, the pattern should match only variables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

6 participants