-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DSL for JDBC expression rewrites #11125
Conversation
cc @slhmy |
@@ -317,7 +316,7 @@ public PostgreSqlClient( | |||
.addDefaultRules() | |||
// TODO allow all comparison operators for numeric types | |||
.add(new RewriteComparison(RewriteComparison.ComparisonOperator.EQUAL, RewriteComparison.ComparisonOperator.NOT_EQUAL)) | |||
.add(new RewriteLike()) | |||
.map("$like_pattern(value: varchar, pattern: varchar): boolean").to("value LIKE pattern") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get away with writing the AST manually in java instead of defining an antlr grammar/using a visitor to create the AST?
In my opinion, this doesn't seem too bad to manually write. But maybe others feel differently 🤷♀️
new CallPattern(
"$like_pattern",
List.of(
new ExpressionCapture("value", type("varchar")),
new ExpressionCapture("pattern", type("varchar"))),
Optional.of(type("boolean"))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe adding a test case for a more complex example would show the benefits of the DSL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DSL syntax here is similar to SQL, so hopefully feels familiar, with notable differences:
- call parameters have explicit types, call result has optional explicit type (this is central for function pattern matching)
- $ sign is allowed in identifiers, so that we don't need quotes
In my view, your example already proves the value.
The had to use CallPattern
, List
, ExpressionCapture
, type
, Optional
for the same effect, without adding any actual value for readability (quite contrary, IMO).
09ebbe7
to
05b99d5
Compare
(rebased) |
05b99d5
to
4d1d095
Compare
4d1d095
to
d05294f
Compare
rebased and fixed codestyle issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything except GenericRewrite#rewrite
looks good to me % comments.
...bc/src/main/java/io/trino/plugin/jdbc/expression/JdbcConnectorExpressionRewriterBuilder.java
Outdated
Show resolved
Hide resolved
...bc/src/main/java/io/trino/plugin/jdbc/expression/JdbcConnectorExpressionRewriterBuilder.java
Show resolved
Hide resolved
* limitations under the License. | ||
*/ | ||
|
||
grammar ConnectorExpressionPattern; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't pretend I'm very familiar with ANTLR - I'll review this in detail later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...bc/src/main/java/io/trino/plugin/jdbc/expression/JdbcConnectorExpressionRewriterBuilder.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/expression/GenericRewrite.java
Show resolved
Hide resolved
return (TypePattern) invokeParser(typePattern, ConnectorExpressionPatternParser::standaloneType); | ||
} | ||
|
||
public Object invokeParser(String input, Function<ConnectorExpressionPatternParser, ParserRuleContext> parseFunction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me + Antlr usage matches existing code in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3c10495
to
7a5f232
Compare
Introduce DSL for building function pushdown rules in JDBC. As we are building up a registry of allowed function pushdowns, the declaration verbosity matters. This commit adds support for defining rewrite rules tersely. As a usage example, a previously hand-written `RewriteLike` rule is now replaced with a mere one-liner: ``` .map("$like_pattern(value: varchar, pattern: varchar): boolean").to("value LIKE pattern") ```
7a5f232
to
a5d3deb
Compare
; | ||
|
||
IDENTIFIER | ||
: (LETTER | '_' | '$') (LETTER | DIGIT | '_' | '$')* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that these are not SQL identifiers (no delimited vs non-delimited), so they should be treated as already-canonicalized. Also, that implies that the character set is potentially anything that can go inside a delimited SQL identifier (i.e., any UTF-8 character). This is going to be hard to do without introducing some kind of escaping for whitespace or other characters.
Concretely, if a SQL query does something like:
SELECT "123 some function"(a, b, c) FROM t
(i.e., it calls a function named 123 some function
)
That name needs to be representable in this grammar. The following pattern would obviously not work due to 1) whitespace 2) leading numbers in the function name:
123 some function(a : varchar, b : varchar, c : varchar) : bigint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that these are not SQL identifiers (no delimited vs non-delimited)
yes, they aren't
. so they should be treated as already-canonicalized.
in a sense of lower/upper-case?
yes, it should be case-sensitive
Also, that implies that the character set is potentially anything that can go inside a delimited SQL identifier (i.e., any UTF-8 character). This is going to be hard to do without introducing some kind of escaping for whitespace or other characters.
For such a case we will introduce quotes to allow a name be anything.
We don't have such need yet, but once we have, we'll do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that this is convenience syntax to simplify most of the cases. It doesn't need to support all the cases. For example, it doesn't support varargs today. It can be made to support varargs, and we may or may not choose to add that, depending on demand. "99%" cases are function names that are ordinary words, with fixed number of arguments, potentially with overloads for different types.
Introduce DSL for building function pushdown rules in JDBC.
As we are building up a registry of allowed function pushdowns, the
declaration verbosity matters. This commit adds support for defining
rewrite rules tersely.
As a usage example, a previously hand-written
RewriteLike
rule is nowreplaced with a mere one-liner:
Relates to #18