-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug to allow parenthesis query after INSERT INTO
#1157
Conversation
INSERT INTO
Hi @jonathanlehto -- can you please fix the CI tests? I think one needs |
Apologies, I may not be as familiar with MySQL syntax, but it looks like the test case error is due to asserting that the parser will fail the REPLACE statement. From comparing the INSERT and REPLACE docs, it looks like that statement should be succeeding (I could very well be wrong). If that test case should fail, I can get another update in. I just want to make sure we are on the same page first! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bearing with me @jonathanlehto
I left some questions about this PR
src/parser/mod.rs
Outdated
// make sure we are not about to consume a query | ||
let mut after_columns = vec![]; | ||
match self.peek_nth_token(1).token { | ||
Token::Word(w) if w.keyword != Keyword::SELECT => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking for select seems incorrect here
What about a query like
INSERT INTO tbla (cola) (VALUES (1))
The Hive manual https://cwiki.apache.org/confluence/display/hive/languagemanual+dml doesn't appear to say anything about this syntax
Standard syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
Hive extension (multiple inserts):
FROM from_statement
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1
[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2]
[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2] ...;
FROM from_statement
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1
[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2]
[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2] ...;
Hive extension (dynamic partition inserts):
INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
I may be missing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From reading those docs, it looks like the PARTITION keyword is required. I moved the after_columns parsing into that statement. I don't know why I didn't do that originally, but I think this should resolve the situation. I added another test case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have mis-communicated this. I went ahead and confirmed that the failing test case is incorrect.
Test case:
#[test]
fn test_replace_into_select() {
let sql = r#"REPLACE INTO t1 (a, b, c) (SELECT * FROM t2)"#;
assert!(Parser::parse_sql(&GenericDialect {}, sql).is_err());
}
I created a local mysql table and successfully ran these commands.
mysql> INSERT INTO tab (col) (SELECT col FROM tab);
Query OK, 1 row affected (0.02 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> REPLACE INTO tab (col) (SELECT col FROM tab);
Query OK, 2 rows affected (0.02 sec)
Records: 2 Duplicates: 0 Warnings: 0
However, you can not wrap values in mysql, copied below. From what I can tell, only Trino / Athena allow values statement to be wrapped.
mysql> INSERT INTO tab (col) VALUES (1);
Query OK, 1 row affected (0.02 sec)
mysql> INSERT INTO tab (col) (VALUES (1));
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(1))' at line 1
Unfortunately, I don't have the time to check every Dialect that is supported, but I know that Athena / Trino will succeed and at least MySQL will fail.
In this case, do we only want to succeed a parenthesis values statement in common, and fail for every other dialect? There seems to be another bug here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rebased and updated this with what my best guess for a solution would be. I only allow VALUES as a subquery in the GenericDialect. Let me know your thoughts
@@ -106,6 +106,9 @@ fn parse_insert_values() { | |||
} | |||
|
|||
verified_stmt("INSERT INTO customer WITH foo AS (SELECT 1) SELECT * FROM foo UNION VALUES (1)"); | |||
|
|||
// allow parenthesis query after insert into | |||
verified_stmt("INSERT INTO tbla (cola) (SELECT cola FROM tblb)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this syntax mean? Is it the same as
INSERT INTO tbla (cola) SELECT cola FROM tblb
(no parens?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe so, well at the least, running an explain statement in Athena (Trino) yields no difference between
INSERT INTO tbla (cola) SELECT cola FROM tblb
and
INSERT INTO tbla (cola) (SELECT cola FROM tblb)
And registers both as valid queries. From what I recall, I think you can generally wrap any select statements in parenthesis
🤔 it seems like the CI is still failing on this PR |
7f8bce9
to
5c946a0
Compare
Pull Request Test Coverage Report for Build 8238393022Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
It appears the tests are still failing on this PR |
I updated the mr to fix the lint and codestyle errors. I was hoping to get your thoughts on this comment, so, if you have some time, I would love to know if you think that implementation is okay! |
|
||
let is_generic = dialect_of!(self is GenericDialect); | ||
if !is_generic { | ||
if let SetExpr::Values(_) = *subquery.body { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it is applying a semantic check: https://github.com/sqlparser-rs/sqlparser-rs?tab=readme-ov-file#syntax-vs-semantics
I think any such errors should be generated downstream. Can you please remove this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jonathanlehto -- sorry for the delay in reviewing -- the time I can spend on this crate is quite limited and I don't have the bandwidth to study problems in depth typically.
I took a look over this PR again and left some feedback. Let me know what you think
fn test_insert_into_values() { | ||
let sql = r#"INSERT INTO t1 (a) VALUES(1)"#; | ||
|
||
assert!(Parser::parse_sql(&GenericDialect {}, sql).is_ok()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please put these tests in common with other similar tests, like this:
Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look |
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. |
Fixing a bug where the columns after hive check would consume a query