More dialect checking, fixes, inheritance cleanup #2942

barrywhart · 2022-03-29T10:25:35Z

Brief summary of the change made

- Add back some runtime checks for replace() that were deleted in the prior PR.
- Add a check that the segment type matches the base dialect when replacing a segment.
- Add a check that, when replacing a segment that defines both match_grammar and parse_grammar, the replacement defines both of them or neither of them.
- Manually clean up dialects to use inheritance where appropriate.
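
As a rough illustration of the two replace() checks summarized above, here is a minimal sketch. ToySegment, ToyDialect, and the example segment classes are hypothetical stand-ins, not sqlfluff's real classes, and the "defines in its own class body" test via vars() is an assumption about how such a check could be written, not the code in this PR.

```python
class ToySegment:
    """Hypothetical stand-in for a segment class; not sqlfluff's BaseSegment."""

    type = "base"
    match_grammar = None
    parse_grammar = None


class ToyDialect:
    """Hypothetical registry that mimics only the checks described above."""

    def __init__(self):
        self._library = {}

    def add(self, **kwargs):
        self._library.update(kwargs)

    def replace(self, **kwargs):
        for name, new in kwargs.items():
            if name not in self._library:
                raise ValueError(f"{name!r} is not already registered")
            old = self._library[name]
            # Check 1: the replacement must keep the same segment type.
            if old.type != new.type:
                raise ValueError(
                    f"Cannot replace {name!r}: type {new.type!r} != {old.type!r}"
                )
            # Check 2: if the original has both grammars, the replacement must
            # define both in its own class body, or neither (inheriting both).
            if old.match_grammar is not None and old.parse_grammar is not None:
                own = {"match_grammar", "parse_grammar"} & set(vars(new))
                if len(own) == 1:
                    raise ValueError(
                        f"Cannot replace {name!r}: defines only {own.pop()}"
                    )
            self._library[name] = new


class AnsiSelect(ToySegment):
    type = "select_statement"
    match_grammar = "ansi match"
    parse_grammar = "ansi parse"


class InheritsBoth(AnsiSelect):
    pass  # defines neither grammar itself, so it is accepted


class OverridesOne(AnsiSelect):
    match_grammar = "custom match"  # defines only one grammar, so it is rejected


dialect = ToyDialect()
dialect.add(SelectStatementSegment=AnsiSelect)
dialect.replace(SelectStatementSegment=InheritsBoth)
```

In this sketch, calling dialect.replace(SelectStatementSegment=OverridesOne) raises ValueError, as does any replacement whose type differs from the registered segment's type.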

Are there any other side effects of this change that we should be aware of?

Pull Request checklist

Please confirm you have completed any of the necessary steps below.
Included test cases to demonstrate any code changes, which may be one or more of the following:
- .yml rule test cases in test/fixtures/rules/std_rule_cases.
- .sql/.yml parser test cases in test/fixtures/dialects (note YML files can be auto generated with tox -e generate-fixture-yml).
- Full autofix test cases in test/fixtures/linter/autofix.
- Other.
Added appropriate documentation for the change.
Created GitHub issues for any relevant followup/future enhancements if appropriate.

barrywhart · 2022-03-29T10:42:27Z

src/sqlfluff/core/dialects/base.py

@@ -152,7 +152,28 @@ def replace(self, **kwargs: DialectElementType):
        for n in kwargs:
            if n not in self._library:  # pragma: no cover
                raise ValueError(f"{n!r} is not already registered in {self!r}")
-            self._library[n] = kwargs[n]
+
+            # To replace a segment, the replacement must either be a


This code was in the segment() decorator I removed in the previous PR. That code is still useful, so I moved it to the still-used replace() function.

barrywhart · 2022-03-29T12:11:32Z

src/sqlfluff/core/dialects/base.py

+                if self._library[n].type != cls.type:
+                    raise ValueError(  # pragma: no cover
+                        f"Cannot replace {n!r} because 'type' property does not "
+                        f"match: {cls.type} != {self._library[n].type}"
+                    )


These lines are actually new. I had wondered if the type property was always consistent when replacing a parent segment, and I found several cases where it wasn't. This was presumably a mistake/accident. Several, but not all, of these mismatches were in the Exasol dialect.

Note that when using segment inheritance (now the recommended practice when replacing segments), the type property can be inherited, which avoids mismatches.
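
To make the inheritance point concrete, here is a small sketch using plain Python classes. The class names and the type_matches helper are hypothetical. They only show why a subclass cannot drift from its parent's type, while a class redefined from scratch can.

```python
class AnsiInsertStatementSegment:
    """Hypothetical stand-in for the base dialect's segment."""

    type = "insert_statement"


class StandaloneInsertStatementSegment:
    """Redefined from scratch: the type string is retyped by hand and can drift."""

    type = "insert_table_statement"


class InheritedInsertStatementSegment(AnsiInsertStatementSegment):
    """Subclass with no type of its own, so it cannot disagree with the base."""


def type_matches(base, replacement):
    # Same comparison as the new check: the two type strings must be equal.
    return base.type == replacement.type
```

Here type_matches() fails for the standalone class and succeeds for the subclass, which has no "type" entry in its own class namespace.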

barrywhart · 2022-03-29T12:12:00Z

src/sqlfluff/dialects/dialect_bigquery.py

-    type = "select_statement"
-    match_grammar = ansi_dialect.get_segment(
-        "UnorderedSelectStatementSegment"
-    ).match_grammar.copy()


With inheritance, no need to copy if not modifying.
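
A sketch of the distinction, assuming a toy grammar object: a subclass that changes nothing simply shares the parent's grammar, while a subclass that modifies it should copy first so the parent is left untouched. ToyGrammar and its copy(insert=...) signature are hypothetical and only loosely modeled on the copy() calls in the diff.

```python
import copy


class ToyGrammar:
    """Hypothetical grammar object; it only models a list of elements."""

    def __init__(self, *elements):
        self.elements = list(elements)

    def copy(self, insert=None):
        # Shallow-copy the object, then give the copy its own element list.
        new = copy.copy(self)
        new.elements = self.elements + list(insert or [])
        return new


class AnsiSelect:
    match_grammar = ToyGrammar("SELECT", "FROM")


class UnmodifiedSelect(AnsiSelect):
    pass  # inherits match_grammar; no copy needed


class ExtendedSelect(AnsiSelect):
    # Modifying the grammar: copy so AnsiSelect.match_grammar is not mutated.
    match_grammar = AnsiSelect.match_grammar.copy(insert=["QUALIFY"])
```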

barrywhart · 2022-03-29T12:12:39Z

src/sqlfluff/dialects/dialect_bigquery.py

-    match_grammar = ansi_dialect.get_segment(
-        "WildcardExpressionSegment"
-    ).match_grammar.copy(
+    match_grammar = ansi.WildcardExpressionSegment.match_grammar.copy(


This PR eliminates the use of get_segment() in dialects in favor of direct attribute access, which is cleaner.

barrywhart · 2022-03-29T12:13:37Z

src/sqlfluff/dialects/dialect_postgres.py

@@ -1038,13 +1031,10 @@ class SelectClauseSegment(BaseSegment):
        enforce_whitespace_preceding_terminator=True,
    )

-    parse_grammar = Ref("SelectClauseSegmentGrammar")


This is now inherited from the base class.

barrywhart · 2022-03-29T12:14:11Z

src/sqlfluff/dialects/dialect_postgres.py

-            OneOf("PRECEDING", "FOLLOWING"),
-        ),
-    )
+    _frame_extent = ansi.FrameClauseSegment._frame_extent


I confirmed this is identical to ANSI, so I removed the duplication.

Does it not get it automatically then through inheritance? So is this line needed?

Did you answer this one @barrywhart ? Other than that good to go I think.

It's similar to a question you asked previously. Once the class has been created, FrameClauseSegment._frame_extent exists. But not during the class definition. On line 2754, we need to use it 3 times, so it's shorter to assign it to a variable rather than repeating the full reference 3 times:

OneOf(ansi.FrameClauseSegment._frame_extent, Sequence("BETWEEN", ansi.FrameClauseSegment._frame_extent, "AND", ansi.FrameClauseSegment._frame_extent)),

I think I shared a Stack Overflow link earlier -- if you go back and read that, they explain how this works. It's definitely different than most other OO languages like C++ and Java. The behavior is arguably "simpler", but makes code inside a class more verbose sometimes.
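
The scoping behavior described above can be reproduced with plain Python, independent of sqlfluff. In this sketch (Base, Broken, and Works are hypothetical names), a bare reference to an inherited attribute inside a class body fails, while assigning it to a local name first works.

```python
class Base:
    _frame_extent = ("PRECEDING", "FOLLOWING")


# Inside a class body, attributes of the base class are not in scope as bare
# names, because the base is only consulted after the class has been created.
error = None
try:

    class Broken(Base):
        grammar = (_frame_extent, "BETWEEN", _frame_extent)

except NameError as exc:
    error = str(exc)


class Works(Base):
    # Bind the inherited value to a name in this class body, then reuse it.
    _frame_extent = Base._frame_extent
    grammar = (_frame_extent, "BETWEEN", _frame_extent, "AND", _frame_extent)


class Inherits(Base):
    pass


# After class creation, normal attribute lookup finds the inherited value.
inherited_value = Inherits._frame_extent
```

The local assignment is redundant from the point of view of attribute lookup on the finished class, but it is what makes the short name usable while the class body is still being executed.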

barrywhart · 2022-03-29T12:15:29Z

src/sqlfluff/dialects/dialect_snowflake.py

@@ -1570,16 +1563,6 @@ class AlterWarehouseStatementSegment(BaseSegment):
    )


-class CommentClauseSegment(BaseSegment):


This entire segment definition is identical to the base ANSI dialect, so I removed it.

It has a different comment to show it's not used for views or tables. But yeah agreed not needed.

barrywhart · 2022-03-29T12:18:21Z

src/sqlfluff/dialects/dialect_sparksql.py

@@ -981,7 +972,7 @@ class InsertStatementSegment(BaseSegment):
    https://spark.apache.org/docs/latest/sql-ref-syntax-dml-insert-overwrite-table.html
    """

-    type = "insert_table_statement"
+    type = "insert_statement"


Fixed a type mismatch with the ANSI segment type. Did not use inheritance because the ANSI segment uses parse_grammar but this one doesn't.

codecov · 2022-03-29T12:40:17Z

Codecov Report

Merging #2942 (0d2e531) into main (1845353) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main     #2942   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          164       164           
  Lines        11892     11843   -49     
=========================================
- Hits         11892     11843   -49

Impacted Files	Coverage Δ
src/sqlfluff/core/dialects/base.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_ansi.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_bigquery.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_exasol.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_hive.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_mysql.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_oracle.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_postgres.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_redshift.py	`100.00% <100.00%> (ø)`
src/sqlfluff/dialects/dialect_snowflake.py	`100.00% <100.00%> (ø)`
... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

tunetheweb · 2022-03-29T14:11:33Z

src/sqlfluff/dialects/dialect_postgres.py

-            OneOf("PRECEDING", "FOLLOWING"),
-        ),
-    )
+    _frame_extent = ansi.FrameClauseSegment._frame_extent


Does it not get it automatically then through inheritance? So is this line needed?

tunetheweb · 2022-03-29T14:15:22Z

src/sqlfluff/dialects/dialect_snowflake.py

@@ -1570,16 +1563,6 @@ class AlterWarehouseStatementSegment(BaseSegment):
    )


-class CommentClauseSegment(BaseSegment):


It has a different comment to show it's not used for views or tables. But yeah agreed not needed.

barrywhart · 2022-03-29T17:31:44Z

@tunetheweb: Ok, I added type hints throughout the ANSI dialect, which allowed me to remove almost every use of # type: ignore in the other dialects. The only remaining instances are a handful of cases like the one below, all involving situations where we override the base's match_grammar.terminator. I consider that code hacky, so it makes sense to me that # type: ignore is still needed there. At some point, we could look into providing a core function to override the terminator more cleanly, which would eliminate these remaining uses of # type: ignore.

match_grammar.terminator = match_grammar.terminator.copy(  # type: ignore
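
One possible shape for such a helper, purely as a sketch: ToyGrammar and with_terminator below are hypothetical and are not part of sqlfluff. The idea is to return a modified copy through a typed function instead of assigning to an attribute on the grammar object.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyGrammar:
    """Hypothetical grammar with a typed terminator attribute."""

    elements: Tuple[str, ...]
    terminator: Optional[str] = None


def with_terminator(grammar: ToyGrammar, terminator: Optional[str]) -> ToyGrammar:
    """Return a copy of grammar with a different terminator (hypothetical helper)."""
    return replace(grammar, terminator=terminator)


base = ToyGrammar(elements=("SELECT", "FROM"), terminator="FROM")
derived = with_terminator(base, "WHERE")
```

Because the helper takes and returns a fully typed object, a type checker has nothing to complain about at the call site, and the base grammar is never mutated.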

barrywhart · 2022-03-30T00:18:40Z

@tunetheweb: I added the test we discussed. See test/core/linter_test.py::test_require_match_parse_grammar.

Add back runtime checks for "replace()" that were deleted in prior PR

3c3732a

barrywhart marked this pull request as draft March 29, 2022 10:25

Barry Hart added 3 commits March 29, 2022 07:04

Clean up Snowflake dialect

05d829f

Fix inconsistent segment type names

6e51ddd

More tidying

741e01d

barrywhart changed the title ~~Add back runtime checks for replace() that were deleted in prior PR~~ More dialect checking, fixes, inheritance cleanup Mar 29, 2022

Barry Hart added 3 commits March 29, 2022 07:47

Postgres cleanup

891203e

Clean up sparksql dialect

7b329d7

Revert bad change

23395b7

Remove inheritance, fix segment type

c129e8e

Barry Hart added 2 commits March 29, 2022 08:19

Add back blank line that was removed

46bbc76

Regenerate YML files

e1b853a

barrywhart marked this pull request as ready for review March 29, 2022 12:51

barrywhart requested a review from tunetheweb March 29, 2022 12:51

Require explicit override of both match_grammar/parse_grammar if parent has both

d130bfd

barrywhart marked this pull request as draft March 29, 2022 14:09

Fix bug in PartitionClauseSegment

1f3ca23

barrywhart marked this pull request as ready for review March 29, 2022 14:14

tunetheweb reviewed Mar 29, 2022

Add type hints to ANSI dialect so others won't need "type: ignore"

2990d5c

Barry Hart added 4 commits March 29, 2022 13:32

Fix typo

edbb823

More "type: ignore"

d9c4854

Coverage

d3ef783

Add error-check test as suggested in PR review

f0159fb

tunetheweb reviewed Mar 30, 2022

src/sqlfluff/core/dialects/base.py (outdated, resolved)

tunetheweb reviewed Mar 30, 2022

src/sqlfluff/core/dialects/base.py (outdated, resolved)

Remove no covers

3126512

tunetheweb approved these changes Mar 30, 2022

Merge branch 'main' into bhart-more_dialect_cleanup

0d2e531

barrywhart merged commit 05dea5d into sqlfluff:main Mar 30, 2022

This was referenced Apr 2, 2022

Delimited Refactor #2831

Merged

Core enhancement: Improve consistency across dialects, and predictabiilty in handling structures #2977

Closed
