Fix status line display of the executed SQL command #432

amotl · 2024-03-16T15:41:42Z

Problem

When an SQL statement is prefixed with an SQL comment (either a per-line -- comment or a /* */ block comment), crash mistakes that comment for the SQL command when printing the status line.

Full Example:

$ echo '/*foo*/ select 1' | crash
CONNECT OK
+---+
| 1 |
+---+
| 1 |
+---+
FOO 1 row in set (0.001 sec)

Solution

Because crash already uses the sqlparse SQL parser, its tokenizer could be reused with little effort to extract the canonical SQL command from the SQL expression.

$ echo '/*foo*/ select 1' | crash
CONNECT OK
+---+
| 1 |
+---+
| 1 |
+---+
SELECT 1 row in set (0.003 sec)

Review FYI

The fundamental change is very small, but the patch grew because the software tests and their mocking techniques had to be adjusted for better testability, and because existing function interfaces had to stay backward-compatible (str vs. sql.Statement).

In a few of those spots, the diff looks a bit strange, but the changes should be sound. Please take this into account when reviewing. When in doubt, look at the new code, not the diff. Thanks!

Details

sqlparse.sql.Statement.token_first() provides the skip_cm option already:

if skip_cm is True (default: False), comments are ignored too.

Using statement.get_type() would be even more DWIM:

The returned value is a string holding an upper-cased reprint of
the first DML or DDL keyword. If the first token in this group
isn't a DML or DDL keyword "UNKNOWN" is returned.

Whitespaces and comments at the beginning of the statement
are ignored.

However, that would introduce a regression: non-DML/DDL keywords like BEGIN would be reported as UNKNOWN, so we decided against using it for now. If you think it would be the better option, please advise.

/cc @surister, @seut, @mfussenegger

amotl · 2024-03-16T17:16:20Z

crate/crash/command.py

-    return re.findall(r'[\w]+', statement)[0].upper()
+    statement = to_statement(expression)
+    return str(statement.token_first(skip_ws=True, skip_cm=True)).upper()


This is where the previous, naive regular-expression approach to finding the canonical SQL command of an expression was replaced by a more robust one, which relies entirely on sqlparse's ability to skip whitespace and comments in the tokenized SQL statement.

All other changes in this patch merely accommodate this improvement.

amotl · 2024-03-17T07:38:41Z

tests/test_commands.py

-        cmd._exec_and_print = MagicMock()
+        cmd._exec = Mock()
        cmd.process_iterable(sql.splitlines())
-        self.assertListEqual(cmd._exec_and_print.mock_calls, [
+        self.assertListEqual(cmd._exec.mock_calls, [


That noise, and the subsequent changes, were needed to move the testing probe to a different spot. The corresponding tests verify that the input (the user's SQL statement) yields the correct outcome when submitted to the database.

Previously, the probe sat on _exec_and_print: the tests mocked that method and verified the calls on the mock. Now, the probe has moved to the lower-level _exec method. This changes the execution behavior of the test suite, because the _exec_and_print method is now executed fully ¹, which makes it possible to test its effects for the first time.

Footnotes

¹ That is the reason why the test suite now also needs to supply a mocked cursor: the internals of _exec_and_print need it.
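The idea of moving the probe one level down can be sketched with a toy class. Everything here (the Shell class, its attributes, the message format) is hypothetical and only mirrors the method names mentioned above; it is not crash's actual implementation.

```python
from unittest.mock import Mock, call


class Shell:
    """Hypothetical stand-in for the command class under test."""

    def __init__(self, cursor):
        self.cursor = cursor
        self.messages = []

    def _exec(self, statement):
        # Low-level step: would talk to the database.
        self.cursor.execute(statement)
        return True

    def _exec_and_print(self, statement):
        # Higher-level step: runs the statement, then reports on it.
        if self._exec(statement):
            self.messages.append(f"{self.cursor.rowcount} row(s)")

    def process(self, statement):
        self._exec_and_print(statement)


# The cursor must be mocked too, because _exec_and_print now really runs.
cursor = Mock()
cursor.rowcount = 1

shell = Shell(cursor)
shell._exec = Mock(return_value=True)  # the probe sits on _exec
shell.process("select 1")

print(shell._exec.mock_calls)
print(shell.messages)
```

With the probe on _exec, the test can still check which statement was submitted (shell._exec.mock_calls) and, in addition, observe what _exec_and_print did with the result (shell.messages).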

mfussenegger

looks okay to me but I think we should look into replacing sqlparse with a parser generated from the CrateDB antlr grammar, to ensure it supports our SQL dialect.

See https://github.com/antlr/antlr4/blob/4.6/doc/python-target.md

amotl · 2024-03-18T10:27:00Z

we should look into replacing sqlparse with a parser generated from the CrateDB antlr grammar

Thanks. I've diverted this into GH-433. Please adjust the wording / emphasis where you see applicable.

BaurzhanSakhariev · 2024-03-19T07:23:07Z

Not sure whether it's related but just in case https://github.com/crate/crate-alerts/issues/626

crate/crash#432 causes some undesired output:

VALUES (1, 'one'), (2, 'two'), (3, 'three')
+------+-------+
| col1 | col2  |
+------+-------+
|    1 | one   |
|    2 | two   |
|    3 | three |
+------+-------+
VALUES (1, 'ONE'), (2, 'TWO'), (3, 'THREE') 3 rows in set (0.004 sec)

amotl requested a review from matriv March 16, 2024 15:41

amotl force-pushed the amo/fix-statusline-command branch 3 times, most recently from a9dbd7b to d1c7950 Compare March 16, 2024 17:00

amotl marked this pull request as ready for review March 16, 2024 17:14

mfussenegger approved these changes Mar 18, 2024

amotl mentioned this pull request Mar 18, 2024

Use CrateDB's ANTLR grammar in SQL parser #433

Open

surister approved these changes Mar 18, 2024

amotl force-pushed the amo/fix-statusline-command branch from d1c7950 to 13a6485 Compare March 18, 2024 17:14

amotl merged commit c151975 into master Mar 18, 2024
21 checks passed

amotl deleted the amo/fix-statusline-command branch March 18, 2024 17:18

mfussenegger mentioned this pull request Mar 19, 2024

Pin crash <0.31.3 crate/crate#15716

Closed

amotl mentioned this pull request Mar 19, 2024

Fix status line display of the executed SQL command, part 2 #434

Merged
