Fix status line display of the executed SQL command #432
a9dbd7b to d1c7950
- return re.findall(r'[\w]+', statement)[0].upper()
+ statement = to_statement(expression)
+ return str(statement.token_first(skip_ws=True, skip_cm=True)).upper()
This is where the previous naive, regular-expression-based way of finding the canonical SQL command of an expression was replaced by a more elaborate method, based entirely on sqlparse's ability to skip whitespace and comments in the tokenized SQL statement.
All the other changes in this patch merely revolve around compensating for this improvement.
- cmd._exec_and_print = MagicMock()
+ cmd._exec = Mock()
  cmd.process_iterable(sql.splitlines())
- self.assertListEqual(cmd._exec_and_print.mock_calls, [
+ self.assertListEqual(cmd._exec.mock_calls, [
This noise, and the subsequent changes, were needed to move the testing probe to a different spot. The corresponding tests verify that what goes in (the SQL statement entered by the user) yields the correct outcome when submitted to the database.
Previously, the probe sat on `_exec_and_print`, by mocking it and verifying the calls on this mock. Now, the probing has moved to the lower-level `_exec` method. This changes the execution behavior of the test suite, as the `_exec_and_print` method will now be executed fully [1], unlocking the possibility to test its effects, which was not possible before.
Footnotes
1. That is the reason why the test suite also needs to supply a mocked cursor now, because the ingredients of `_exec_and_print` need it.
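The probe move described above can be sketched as follows. This is a simplified illustration only: the `Shell` class, its `status` attribute, and the use of `cursor.rowcount` are hypothetical and do not reproduce crash's actual command class. The point is that mocking the low-level `_exec` lets `_exec_and_print` run for real, which in turn requires a mocked cursor.

```python
from unittest.mock import Mock, call


class Shell:
    """Hypothetical stand-in for the command class; not crash's real code."""

    def __init__(self, cursor):
        self.cursor = cursor
        self.status = None

    def _exec(self, statement):
        # Low-level method: submits the statement to the database.
        self.cursor.execute(statement)
        return True

    def _exec_and_print(self, statement):
        # Higher-level method: executes, then builds a status line.
        if self._exec(statement):
            command = statement.split()[0].upper()
            self.status = f"{command} OK, {self.cursor.rowcount} row(s)"

    def process(self, statement):
        self._exec_and_print(statement)


cursor = Mock()
cursor.rowcount = 1  # needed because _exec_and_print now really runs

shell = Shell(cursor)
shell._exec = Mock(return_value=True)  # the probe sits on the low-level method
shell.process("select 1")

print(shell._exec.mock_calls)
print(shell.status)
```

Had the probe stayed on `_exec_and_print`, the status line logic would never have run, and nothing about it could have been asserted.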
Looks okay to me, but I think we should look into replacing sqlparse with a parser generated from the CrateDB ANTLR grammar, to ensure it supports our SQL dialect.
See https://github.com/antlr/antlr4/blob/4.6/doc/python-target.md
Thanks. I've diverted this into GH-433. Please adjust the wording / emphasis where you see fit.
`sqlparse.sql.Statement.token_first()` provides the `skip_cm` option already:

> if *skip_cm* is ``True`` (default: ``False``), comments are ignored too.

Using `statement.get_type()` would be even more DWIM:

> The returned value is a string holding an upper-cased reprint of the first DML or DDL keyword. If the first token in this group isn't a DML or DDL keyword "UNKNOWN" is returned. Whitespaces and comments at the beginning of the statement are ignored.

However, that would introduce a regression: non-DML/DDL keywords like `BEGIN` would yield `UNKNOWN`.
d1c7950 to 13a6485
Not sure whether it's related, but just in case: https://github.com/crate/crate-alerts/issues/626
crate/crash#432 causes some undesired output:

VALUES (1, 'one'), (2, 'two'), (3, 'three')
+------+-------+
| col1 | col2  |
+------+-------+
|    1 | one   |
|    2 | two   |
|    3 | three |
+------+-------+
VALUES (1, 'ONE'), (2, 'TWO'), (3, 'THREE')
3 rows in set (0.004 sec)
Problem
When prefixing an SQL statement with an SQL comment (both per-line `--` and block `/* */`), crash mistreats that comment as an SQL command when printing its status line.
Full Example:
Solution
Because crash already uses the sqlparse SQL parser, it wasn't too difficult to re-use its tokenizer for extracting the proper canonical SQL command from the SQL expression.
Review FYI
The fundamental change is very slim, but the patch grew a bit in size because the software tests and corresponding mocking techniques needed to be adjusted for better testability, and for backward compatibility with existing function interfaces (`str` vs. `sql.Statement`).
In a few of those spots the diff also looks a bit strange, but the changes should be sound. Please take this into consideration when reviewing. When in doubt, look at the new code, not the diff. Thanks!
Details
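`sqlparse.sql.Statement.token_first()` provides the `skip_cm` option already:

> if *skip_cm* is ``True`` (default: ``False``), comments are ignored too.

Using `statement.get_type()` would be even more DWIM:

> The returned value is a string holding an upper-cased reprint of the first DML or DDL keyword. If the first token in this group isn't a DML or DDL keyword "UNKNOWN" is returned. Whitespaces and comments at the beginning of the statement are ignored.

However, that would introduce a regression: parsing non-DML/DDL keywords like `BEGIN` would yield `UNKNOWN`, so we decided against using it for now. If you think it would be the better option, please advise accordingly.

/cc @surister, @seut, @mfussenegger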