Support inserting into table having CHECK constraint in Delta Lake #15396
number
    : MINUS? INTEGER_VALUE  #integerLiteral
what about decimals and scientific notation?
Left a TODO comment. Let me know if we should support them in v1.
plugin/trino-delta-lake/src/main/antlr4/io/trino/plugin/deltalake/expression/SparkExpression.g4
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/expression/AstVisitor.java
...-delta-lake/src/main/java/io/trino/plugin/deltalake/expression/SparkExpressionConverter.java
...rino-delta-lake/src/main/java/io/trino/plugin/deltalake/expression/ComparisonExpression.java
...rino-delta-lake/src/test/java/io/trino/plugin/deltalake/expression/TestSparkExpressions.java
What about LIKE and function calls, like substring?
@findepi Left a TODO comment in g4 file.
CI hit #13199
Sounds good, unless mini-grammar is not the way to go at all. Can we have alternative approach PR where we explore Coral-based translation and see how hard it would be to get it working?
@findepi Coral doesn't support parsing Spark SQL as far as I know. Are you suggesting to try Hive translation anyway?
b4fa35c to 31efb02
Rebased on upstream to resolve conflicts.
31efb02 to 596b07d
Rebased on upstream to include engine and SPI change.
let's add this as a commit message comment, covering alternatives we considered and rejected
no
feel free to squash the fixups
596b07d to f792ef3
@findepi Squashed commits and updated the commit message.
Reviewed Parsing (up to and excluding "SparkExpressionConverter.java").
I think it would be a good idea to have explicit parser tests for String literals (both positive and negative)
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/expression/Identifier.java
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/expression/Literal.java
...rino-delta-lake/src/main/java/io/trino/plugin/deltalake/expression/ComparisonExpression.java
...no-delta-lake/src/main/java/io/trino/plugin/deltalake/expression/SparkExpressionBuilder.java
f792ef3 to caaa9a3
public void testUnsupportedStringLiteral()
{
    assertParseFailure("r'raw literal'");
    assertParseFailure("r\"'\\n' represents dnewline character.\"");
dnewline -> newline
what's unsupported here? the r prefix?
for assertParseFailure can you assert on the exact exception message being thrown?
    return (SparkExpression) invokeParser(expressionPattern, SparkExpressionParser::standaloneExpression);
}
catch (Exception e) {
    throw new ParsingException("Cannot parse Spark expression: " + expressionPattern);
throw new ParsingException("Cannot parse Spark expression [%s]: %s".formatted(expressionPattern, firstNonNull(e.getMessage(), e)), e);
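The suggestion above does two things: it passes the original exception along as the cause, and it falls back to the exception itself when its message is null. A minimal, JDK-only sketch of that pattern follows. The `ParsingException` class, the `WrapExample` class, and the `parse` helper are hypothetical stand-ins, not the PR's code, and `Objects.requireNonNullElse` is used here in place of Guava's `firstNonNull`.

```java
import java.util.Objects;

// Hypothetical stand-in for the connector's ParsingException;
// only the (message, cause) constructor matters for this sketch.
class ParsingException
        extends RuntimeException
{
    ParsingException(String message, Throwable cause)
    {
        super(message, cause);
    }
}

final class WrapExample
{
    private WrapExample() {}

    // Hypothetical entry point: it rethrows the given failure so the wrapping can be observed.
    static Object parse(String expression, RuntimeException failure)
    {
        try {
            throw failure;
        }
        catch (RuntimeException e) {
            // Use the message when present, otherwise the exception itself (formatted via toString)
            Object detail = Objects.requireNonNullElse(e.getMessage(), e);
            // Passing e as the cause keeps the original stack trace available to callers
            throw new ParsingException("Cannot parse Spark expression [%s]: %s".formatted(expression, detail), e);
        }
    }
}
```

With this shape, a test can assert on the full message and still inspect `getCause()` for the underlying parser error.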
@@ -51,9 +52,15 @@ public static String toTrinoExpression(String sparkExpression)
         }
     }

-    private static SparkExpression createExpression(String expressionPattern)
+    @VisibleForTesting
+    static SparkExpression createExpression(String expressionPattern)
expressionPattern -> expression (it's definitely not a pattern)
    return new Formatter().process(expression, null);
}

public static class Formatter
private
@Override
protected String visitLogicalExpression(LogicalExpression node, Void context)
{
    return "(%s %s %s)".formatted(process(node.getLeft(), context), node.getOperator().toString(), process(node.getRight(), context));
You may be able to avoid some parentheses in the output by gluing together a series of ANDs or ORs
@Override
protected String visitLogicalExpression(LogicalExpression node, Void context)
{
    return stream(flatten(node))
            .map(expression -> process(expression, context))
            .collect(Collectors.joining(node.getOperator().toString(), "(", ")"));
}

private Iterable<SparkExpression> flatten(LogicalExpression root)
{
    return Traverser.<SparkExpression>forTree(node -> {
                if (node instanceof LogicalExpression logicalExpression && logicalExpression.getOperator() == root.getOperator()) {
                    return ImmutableList.of(logicalExpression.getLeft(), logicalExpression.getRight());
                }
                return ImmutableList.of();
            })
            .depthFirstPreOrder(root);
}
not sure it's worth it
It causes a stack overflow. Let me handle in follow-up.
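One plausible reason for the stack overflow: a pre-order traversal starting at the root yields the root itself, and every nested node with the same operator, so processing the traversal output re-enters the same visit method on the same node. The delimiter in the suggestion also has no surrounding spaces. The sketch below shows one way to collect only the operands instead. `Expr`, `Column`, `Logical`, and `FlattenSketch` are hypothetical, simplified stand-ins for the PR's classes, and the operator is a plain string for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-ins for the PR's expression classes
interface Expr {}

record Column(String name)
        implements Expr {}

record Logical(String operator, Expr left, Expr right)
        implements Expr {}

final class FlattenSketch
{
    private FlattenSketch() {}

    static String format(Expr expression)
    {
        if (expression instanceof Logical logical) {
            List<Expr> terms = new ArrayList<>();
            collectTerms(logical, logical.operator(), terms);
            // Every collected term is either a leaf or a Logical with a different operator,
            // so each recursive format call works on a strictly smaller expression
            return terms.stream()
                    .map(FlattenSketch::format)
                    .collect(Collectors.joining(" " + logical.operator() + " ", "(", ")"));
        }
        return ((Column) expression).name();
    }

    // Expands nested nodes that share the operator; such nodes are never added to terms themselves
    private static void collectTerms(Expr node, String operator, List<Expr> terms)
    {
        if (node instanceof Logical logical && logical.operator().equals(operator)) {
            collectTerms(logical.left(), operator, terms);
            collectTerms(logical.right(), operator, terms);
        }
        else {
            terms.add(node);
        }
    }
}
```

This version is still recursive, so a very deeply nested expression could exhaust the stack; it only removes the self-reentry on the same node.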
@Override
protected String visitLongLiteral(LongLiteral node, Void context)
{
    return String.valueOf(node.getValue());
i'd prefer Long.toString, emphasizing the value is known to be long (and thus safe to output as trino expression)
similarly, Boolean.toString above
    return new SparkExpressionBuilder().visit(tree);
}
catch (StackOverflowError e) {
    throw new IllegalArgumentException("expression pattern is too large (stack overflow while parsing)");
rm "pattern"
import static org.assertj.core.api.AssertionsForClassTypes.assertThatThrownBy;
import static org.testng.Assert.assertEquals;

public class TestStringLiteral
This is a parser test, not a StringLiteral test
TestSparkExpressions (or however the parsing class gets renamed)
import static io.trino.spi.StandardErrorCode.NOT_SUPPORTED;

public final class SparkExpressions
Can call this class SparkExpressionParser?
the antlr generated class would want to be called eg SparkExpressionBaseParser (just rename grammar file and update grammar SparkExpression; line)
assertThatThrownBy(() -> onDelta().executeQuery("INSERT INTO default." + tableName + " VALUES (" + invalidInput + ")"))
        .hasMessageMatching("(?s).* CHECK constraint .* violated by row with values.*");
assertThatThrownBy(() -> onTrino().executeQuery("INSERT INTO delta.default." + tableName + " VALUES (" + invalidInput + ")"))
        .hasMessageContaining("Check constraint violation");
what is the exact failure message?
The format is "Check constraint violation: " + constraint (constraint = Trino expression). It's tested in TestCheckConstraint.
Use ANTLR approach because LinkedIn Coral doesn't support translating Spark SQL yet. Additionally, extract relevant tests from TestDeltaLakeCheckConstraintsCompatibility.
caaa9a3 to 21734a0
CI hit #14441
Description
Supported expressions are intentionally limited in this PR.
The 1st commit came from #14964. Please leave review comments for the 1st commit in #14964.
Release notes
(x) Release notes are required, with the following suggested text: