Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create PostgreSQL parser #327

Merged
merged 51 commits into from
Nov 9, 2024
Merged

Create PostgreSQL parser #327

merged 51 commits into from
Nov 9, 2024

Conversation

loicknuchel
Copy link
Contributor

@loicknuchel loicknuchel commented Oct 17, 2024

Summary by CodeRabbit

  • New Features

    • Enhanced PostgreSQL support for parsing and generating SQL, allowing for improved handling of PostgreSQL syntax.
    • Introduced a new function to filter distinct elements in arrays based on a specified key.
    • Added support for parsing SQL content specifically for the PostgreSQL dialect.
    • New entity social_accounts declared to establish a many-to-many relationship with users.
  • Bug Fixes

    • Updated PostgreSQL converter to enable parsing capabilities, improving input handling.
    • Improved NaN value checking in the AML parser for better accuracy.
  • Documentation

    • Clarified documentation for various entities and types to enhance understanding of the database schema.
  • Tests

    • Added comprehensive test suites for PostgreSQL parsing and building functionalities to ensure robustness.
  • Chores

    • Updated the version number in the configuration and package files to reflect the latest release.

@loicknuchel loicknuchel marked this pull request as ready for review October 29, 2024 21:03
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 21

🧹 Outside diff range and nitpick comments (24)
libs/parser-sql/src/errors.ts (1)

3-4: Improve readability and add documentation.

The function would benefit from:

  1. Multi-line formatting
  2. JSDoc documentation
  3. Input validation

Consider this refactoring:

+/**
+ * Creates a parser error for duplicate definitions in SQL.
+ * @param name - The identifier that is duplicated
+ * @param definedAtLine - The line number where it was first defined (optional)
+ * @param position - The position of the duplicate definition
+ * @returns A parser error with details about the duplication
+ */
 export const duplicated = (name: string, definedAtLine: number | undefined, position: TokenPosition): ParserError =>
-    ({message: `${name} already defined${definedAtLine !== undefined ? ` at line ${definedAtLine}` : ''}`, kind: 'Duplicated', level: ParserErrorLevel.enum.warning, offset: position.offset, position: position.position})
+    {
+        if (!name?.trim()) throw new Error('Name parameter is required')
+        if (!position) throw new Error('Position parameter is required')
+        return {
+            message: `${name} already defined${definedAtLine !== undefined ? ` at line ${definedAtLine}` : ''}`,
+            kind: 'Duplicated',
+            level: ParserErrorLevel.enum.warning,
+            offset: position.offset,
+            position: position.position
+        }
+    }
libs/parser-sql/src/sql.ts (3)

12-13: Consider enhancing error tracking and validation.

The implementation looks good, but consider these improvements:

  1. Add error logging when parsing takes longer than expected
  2. Add validation of the AST structure before building the database
 const start = Date.now()
-if (dialect === DatabaseKind.enum.postgres) return parsePostgresAst(content, opts).flatMap(ast => buildPostgresDatabase(ast, start, Date.now()))
+if (dialect === DatabaseKind.enum.postgres) {
+    const result = parsePostgresAst(content, opts).flatMap(ast => {
+        const duration = Date.now() - start
+        if (duration > 5000) {
+            console.warn(`PostgreSQL parsing took ${duration}ms`)
+        }
+        // Add basic AST validation here
+        return buildPostgresDatabase(ast, start, Date.now())
+    })
+    return result
+}

Line range hint 22-25: Enhance error message and type safety.

The implementation is good, but consider these improvements:

  1. Make the error message more informative by listing supported dialects
  2. Use type narrowing for better type safety
-if (dialect === DatabaseKind.enum.postgres) return generatePostgresDiff(diff)
-return `Can't generate SQL diff for ${dialect} dialect`
+if (dialect === DatabaseKind.enum.postgres) {
+    return generatePostgresDiff(diff)
+}
+const supportedDialects = [DatabaseKind.enum.postgres]
+return `SQL diff generation is only supported for: ${supportedDialects.join(', ')}. Received: ${dialect}`

Line range hint 1-54: Consider future scalability for multiple SQL dialects.

As more SQL dialects are added, this file might become a complex routing point. Consider introducing a dialect registry pattern or strategy pattern to make it more maintainable and scalable.

Example approach:

interface SqlDialect {
  parse(content: string, opts: ParseOptions): ParserResult<Database>
  generate(database: Database): string
  generateDiff(diff: DatabaseDiff): string
}

const dialects: Record<DatabaseKind, SqlDialect> = {
  [DatabaseKind.enum.postgres]: new PostgresDialect(),
  // Add more dialects here
}
libs/aml/resources/full.aml (1)

Documentation needed for empty social_accounts entity

Based on the verification results, the social_accounts entity appears in both the schema and SQL implementations as an empty entity, primarily used for relationship tracking between users. While this pattern is valid and consistently used, adding documentation would improve code clarity.

Suggested improvement:

 namespace social.
 
+# Represents user's social media account connections
+# Currently used only for relationship tracking between users
+# No attributes are stored as account details are managed externally
 social_accounts | entity with no attribute
 
 rel social_accounts <> users | natural many-to-many relation
🔗 Analysis chain

Line range hint 84-88: Consider documenting the purpose of the empty entity.

While the empty entity pattern is valid, it would be helpful to:

  1. Add a documentation comment explaining the purpose of social_accounts
  2. Consider if any common social account attributes should be defined upfront

Here's a suggested improvement:

 namespace social.
 
+# Represents user's social media accounts
+# Currently used only for relationship tracking
 social_accounts | entity with no attribute
 
 rel social_accounts <> users | natural many-to-many relation

Let's verify if this empty entity pattern is used elsewhere in the codebase:


Let me gather more context about the usage and purpose of this entity by searching for related code and documentation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other empty entities to validate this pattern
rg -A 2 "\w+\s+\|\s*entity with no attribute" 

Length of output: 317


Script:

#!/bin/bash
# Search for any comments or documentation related to social_accounts
rg -B 2 "social_accounts" --type-add 'aml:*.aml' --type aml

# Look for any schema definitions or migrations related to social accounts
fd -e sql -e js -e ts . -x rg -l "social.*accounts"

Length of output: 673


Script:

#!/bin/bash
# Check the SQL files for more context about social_accounts
cat ./libs/parser-sql/resources/full.postgres.sql ./demos/ecommerce/source_08_crm_mysql.sql | rg -B 2 -A 2 "social.*accounts"

# Look for any related documentation in markdown files
fd -e md . -x rg -l "social.*accounts"

Length of output: 607


Script:

#!/bin/bash
# Check the documentation in full.md
cat ./libs/aml/resources/full.md | rg -B 2 -A 2 "social.*accounts"

# Look for any test files that might provide context
fd -e test.js -e test.ts -e spec.js -e spec.ts . -x rg -l "social.*accounts"

Length of output: 485

libs/parser-sql/resources/full.postgres.sql (3)

Line range hint 64-71: Consider adding a foreign key constraint for polymorphic associations.

The comments table uses a polymorphic association (item_kind, item_id) without foreign key constraints. This could lead to data integrity issues where item_id references non-existent records.

Consider either:

  1. Adding conditional foreign key constraints using check constraints
  2. Using separate foreign key columns for each referenced table
  3. Creating an intermediate association table

Line range hint 45-52: Inconsistent primary key types between referenced tables.

The post_members table uses UUID for post_id but references cms.posts which uses integer IDs. This will cause a type mismatch error.

Apply this fix:

CREATE TABLE post_members (
-  post_id uuid REFERENCES cms.posts(id),
+  post_id int REFERENCES cms.posts(id),
  user_id int REFERENCES users(id),
  role varchar(10) NOT NULL DEFAULT 'author' CONSTRAINT members_role_chk CHECK (role IN ('author', 'editor')),
  CONSTRAINT post_members_pk PRIMARY KEY (post_id, user_id)
);

Line range hint 1-111: Consider additional performance optimizations.

The schema is well-structured, but consider these improvements:

  1. Add an index on users(email) since it has a UNIQUE constraint
  2. Consider adding an index on comments(created_by) if you frequently query comments by creator
  3. Add an index on post_members(user_id) if you need to find all posts for a user

These indexes would improve query performance for common access patterns. Would you like me to provide the specific SQL statements for these improvements?

libs/models/src/parserResult.ts (1)

64-64: LGTM! Consider adding debug logging.

The additional filter provides protection against malformed token positions, which is good defensive programming. However, silently filtering out positions with undefined offsets might hide underlying parsing issues.

Consider adding debug logging when positions are filtered out to help diagnose potential parsing issues:

-    const pos: TokenPosition[] = positions.filter(isNotUndefined).filter(p => !!p.offset)
+    const pos: TokenPosition[] = positions
+        .filter(isNotUndefined)
+        .filter(p => {
+            if (!p.offset) {
+                console.debug('Filtering out position without offset:', p);
+                return false;
+            }
+            return true;
+        });
libs/aml/resources/full.md (1)

Line range hint 1-168: Consider enhancing documentation clarity and consistency.

A few suggestions to improve the documentation:

  1. Replace informal comments like "looks quite complex but not intended to be used all together" with technical explanations.
  2. Document the usage of unknown type and when it's appropriate.
  3. Standardize schema notation (currently mixing cms.posts and legacy schema.post).
backend/lib/azimutt_web/templates/website/converters/_editors-script.html.heex (1)

Line range hint 1-2: Consider implementing a version management strategy for external dependencies.

The hardcoded versions (@azimutt/aml@0.1.8, @azimutt/parser-sql@0.1.1) ensure stability but might miss important updates. Consider:

  • Using a version range for minor updates (e.g., ^0.1.1)
  • Implementing SRI (Subresource Integrity) hashes for security
  • Setting up automated dependency updates
libs/parser-sql/src/postgresBuilder.test.ts (1)

9-194: Add test cases for error scenarios and edge cases.

Consider adding tests for:

  1. Invalid SQL syntax
  2. Unsupported SQL features
  3. Edge cases like:
    • Very long identifiers
    • Special characters in identifiers
    • Complex nested queries
    • Various JOIN types

Example test cases:

test('invalid SQL syntax', () => {
    expect(() => parse('SELECT * FRO users')).toThrow('Failed to parse SQL')
})

test('identifier length limits', () => {
    const longName = 'a'.repeat(64)
    expect(parse(`CREATE TABLE ${longName} (id int)`)).toEqual(/* ... */)
})
libs/parser-sql/src/postgresAst.ts (3)

25-25: Address the TODO for CreateFunctionStatementAst

Line 25 contains a TODO comment indicating that the type definition for CreateFunctionStatementAst is incomplete. Completing this type will enhance the parser's ability to handle CREATE FUNCTION statements comprehensively.

Would you like assistance in defining the CreateFunctionStatementAst type?


134-147: Ensure consistent naming conventions for enum values

The enum types Operator, OperatorLeft, and OperatorRight mix different naming styles (e.g., symbols like '+' and words like 'Like'). For consistency and readability, consider standardizing the naming conventions.

For example, you could update the Operator enum to use lowercase for SQL keywords:

export type Operator = '+' | '-' | '*' | '/' | '%' | '^' | '&' | '|' | '#' | '<<' | '>>' | '=' | '<' | '>' | '<=' | '>=' | '<>' | '!=' | '||' | '~' | '~*' | '!~' | '!~*' | 'is' | 'like' | 'not like' | 'in' | 'not in' | 'or' | 'and'

This change enhances clarity and aligns with SQL syntax.


150-151: Add documentation for TokenInfo and TokenIssue types

Adding JSDoc comments to TokenInfo and TokenIssue will improve code readability and make it easier for others to understand their purpose and usage.

Suggested additions:

+/**
+ * Contains positional information about a token and any associated parsing issues.
+ */
export type TokenInfo = TokenPosition & { issues?: TokenIssue[] }

+/**
+ * Represents an issue encountered during parsing, such as an error or warning.
+ */
export type TokenIssue = { message: string, kind: string, level: ParserErrorLevel }
libs/parser-sql/src/postgresBuilder.ts (1)

226-226: Typographical error in comment

There's a typo in the comment on line 226. It reads:

// default: AttributeValue.optional(), // now in VIEW

It should likely be:

// default: AttributeValue.optional(), // not in VIEW

This correction clarifies that default values are not applicable in views.

libs/parser-sql/src/postgresParser.test.ts (8)

35-35: Reminder to implement the 'CREATE FUNCTION' test.

There is a TODO comment indicating that the test for 'CREATE FUNCTION' is pending. Implementing this test will help ensure that the parser correctly handles function creation statements.

Would you like assistance in creating this test case?


54-70: Consider implementing and unskipping the 'statements' test.

The 'statements' test is currently skipped, and it contains several TODOs related to complex SQL parsing. Completing and unskipping this test will enhance test coverage and improve parser reliability.

Do you need help addressing the TODOs within this test?


71-79: Consider implementing and unskipping the 'plausible' test.

The 'plausible' test is skipped, and it references SQL scripts from the Plausible Analytics project. Implementing this test will help verify the parser's ability to handle real-world SQL structures.

Would you like assistance in resolving the TODOs and enabling this test?


242-263: Implement tests for 'CREATE FUNCTION' statements.

The createFunction tests are currently skipped and contain placeholder expectations that need to be completed. Implementing these tests will ensure that the parser correctly handles function creation syntax.

Do you need help in writing these tests for different function creation scenarios?


549-559: Simplify helper functions for clarity in 'select' tests.

The helper functions id, int, and op might be unclear to readers unfamiliar with their implementations. Using more descriptive function names or comments can improve code readability.

For example, renaming id to identifier, int to integerLiteral, and op to operation provides immediate clarity on their purposes.


984-1004: Consider using descriptive names for helper functions in complex expression tests.

In the 'complex' expression tests, helper functions like i, n, c, f, p, o, u, g, and l are used. Renaming these functions to more descriptive names would enhance readability and maintainability.

For example:

  • iidentifier
  • nintegerLiteral
  • ccolumnReference
  • ffunctionCall
  • pparameter
  • ooperation
  • uunaryOperation
  • ggrouping
  • llistLiteral

Would you like assistance in refactoring these helper functions?


100-113: Improve readability by formatting long test assertions.

The test assertion for the complex ALTER SEQUENCE statement spans multiple lines with deeply nested objects. Formatting these objects more clearly or using helper functions to construct parts of the expected result can enhance readability.

Consider breaking down the expected result into smaller, well-named constants or using helper functions to build the structure incrementally.


539-544: Add assertions for error cases in 'select' tests.

Currently, the tests for the select statements focus on successful parsing. To improve test coverage, consider adding assertions that check for expected errors when invalid SQL is provided.

For example, testing how the parser handles missing FROM clauses or invalid syntax can help ensure robustness.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 40611cc and 4bef4e1.

📒 Files selected for processing (16)
  • backend/lib/azimutt.ex (1 hunks)
  • backend/lib/azimutt_web/templates/website/converters/_editors-script.html.heex (1 hunks)
  • libs/aml/resources/full.aml (1 hunks)
  • libs/aml/resources/full.json (1 hunks)
  • libs/aml/resources/full.md (1 hunks)
  • libs/models/src/parserResult.ts (1 hunks)
  • libs/parser-sql/resources/complex.postgres.sql (1 hunks)
  • libs/parser-sql/resources/full.postgres.sql (1 hunks)
  • libs/parser-sql/src/errors.ts (1 hunks)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.test.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.ts (1 hunks)
  • libs/parser-sql/src/sql.ts (1 hunks)
  • libs/utils/src/array.ts (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • libs/aml/resources/full.json
  • libs/parser-sql/resources/complex.postgres.sql
🧰 Additional context used
🪛 Biome
libs/parser-sql/src/postgresParser.ts

[error] 107-107: The regular expression includes this negated empty character class.

Negated empty character classes match anything.
If you want to match against [, escape it [.
Otherwise, remove the character class or fill it.

(lint/correctness/noEmptyCharacterClassInRegex)


[error] 111-111: Do not shadow the global "String" property.

Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.

(lint/suspicious/noShadowRestrictedNames)


[error] 159-159: Do not shadow the global "Function" property.

Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.

(lint/suspicious/noShadowRestrictedNames)


[error] 224-224: Do not shadow the global "Set" property.

Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.

(lint/suspicious/noShadowRestrictedNames)


[error] 342-342: This aliasing of this is unnecessary.

Arrow functions inherits this from their enclosing scope.
Safe fix: Use this instead of an alias.

(lint/complexity/noUselessThisAlias)


[error] 1651-1651: isNaN is unsafe. It attempts a type coercion. Use Number.isNaN instead.

See the MDN documentation for more details.
Unsafe fix: Use Number.isNaN instead.

(lint/suspicious/noGlobalIsNan)

🔇 Additional comments (14)
libs/parser-sql/src/sql.ts (2)

7-8: LGTM! Clean import additions for PostgreSQL support.

The new imports follow the existing pattern and are well-organized with related functionality.


Line range hint 17-20: Consider adding database structure validation.

The implementation is clean, but consider adding validation of the database structure before generation to prevent runtime errors.

Let's verify the database structure handling in the generator:

libs/aml/resources/full.aml (1)

82-82: LGTM! Parameter ordering improves readability.

The reordering of parameters in the box type definition (INPUT, OUTPUT, then INTERNALLENGTH) follows a more logical pattern by grouping interface-related parameters first.

libs/parser-sql/resources/full.postgres.sql (3)

Line range hint 110-111: Empty table definition may indicate incomplete implementation.

The social_accounts table is created without any columns. This is unusual and might indicate incomplete implementation.

Let's check if this table is referenced elsewhere:

#!/bin/bash
# Search for references to the social_accounts table
rg -i "from\s+social_accounts|join\s+social_accounts|references\s+social_accounts"

Line range hint 103-103: Incomplete view definition.

The admins view is commented out without a definition. This could cause issues if other parts of the application expect this view to exist.

Let's check for references to this view:

#!/bin/bash
# Search for references to the admins view
rg -i "from\s+admins|join\s+admins"

19-19: Verify the INPUT/OUTPUT functions for the box type.

The box type definition uses lower for both INPUT and OUTPUT functions, which seems incorrect for a geometric type. PostgreSQL typically uses specialized geometric functions for box types.

Let's check the PostgreSQL documentation and other box type usages:

libs/utils/src/array.ts (1)

19-22: Verify integration with SQL parsing.

Since this function is part of a PR introducing PostgreSQL parsing, let's verify its usage in SQL-related functionality.

✅ Verification successful

Integration with SQL parsing is properly implemented

The distinctBy function is correctly integrated within the PostgreSQL parser, specifically in postgresBuilder.ts. It's used to deduplicate attribute paths in SQL expressions, particularly in:

  • Function parameters: e.parameters.flatMap(expressionAttrs)
  • Binary operations: combining attributes from left and right expressions

The implementation is sound as it maintains unique attribute paths by joining them with dots, which is appropriate for SQL column references.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for usage of distinctBy in SQL-related files
echo "Searching for distinctBy usage in SQL parser files..."
rg "distinctBy" --type ts -g "**/parser-sql/**"

Length of output: 611


Script:

#!/bin/bash
# Get more context around the distinctBy usage in postgresBuilder.ts
echo "Getting context around distinctBy usage..."
rg "distinctBy" -B 5 -A 5 libs/parser-sql/src/postgresBuilder.ts

# Check if there are any tests for this functionality
echo -e "\nChecking for related tests..."
rg "expressionAttrs" libs/parser-sql/src/**/*test*

Length of output: 1332

libs/models/src/parserResult.ts (1)

Line range hint 64-71: Verify impact on PostgreSQL parsing.

The position filtering change could affect how PostgreSQL syntax positions are tracked and merged. Let's verify this doesn't impact PostgreSQL parsing accuracy.

✅ Verification successful

Based on the evidence gathered from the shell scripts, I can now provide a final response:

Position filtering change is safe for PostgreSQL parsing

The PostgreSQL parser uses position tracking in a way that's compatible with the position filtering changes:

  • It uses mergePositions from @azimutt/models which already handles undefined positions correctly
  • Position information is created through tokenPosition function which always provides valid offset values
  • The parser wraps results in ParserResult after all position handling is done
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for test files that verify position tracking in PostgreSQL parsing

# Search for PostgreSQL parser test files
echo "Looking for PostgreSQL parser tests:"
fd -e test.ts -e spec.ts postgres

# Search for position merging tests
echo -e "\nLooking for position merging tests:"
rg -l "mergePositions.*postgres" -g "*.{test,spec}.ts"

Length of output: 477


Script:

#!/bin/bash
# Let's examine the PostgreSQL parser tests for position tracking verification

# Check postgresParser.test.ts for position-related tests
echo "=== PostgreSQL Parser Position Tests ==="
rg "position|offset" -C 2 libs/parser-sql/src/postgresParser.test.ts

# Look for actual usage of position tracking in postgres parsing
echo -e "\n=== Position Usage in Postgres Parser ==="
ast-grep --pattern 'function $_($$_) {
  $$$
  position
  $$$
}' libs/parser-sql/src/postgresParser.ts

# Check if the parserResult's position filtering is used in postgres parsing
echo -e "\n=== Position Filtering Usage ==="
rg "filter.*position|filter.*offset" -C 2 libs/parser-sql/src/postgresParser.ts

Length of output: 2435


Script:

#!/bin/bash
# Let's check how positions are actually used in the postgres parser implementation

# Check for position merging or combining in postgres parser
echo "=== Position Handling in Postgres Parser ==="
rg "TokenPosition|ParserResult" -C 3 libs/parser-sql/src/postgresParser.ts

# Look for actual parsing implementations that might use position filtering
echo -e "\n=== Parsing Implementation Details ==="
ast-grep --pattern 'class PostgresParser {
  $$$
}' libs/parser-sql/src/postgresParser.ts

# Check for any test cases that specifically test position information
echo -e "\n=== Position-specific Test Cases ==="
rg "test.*position" -C 2 libs/parser-sql/src/**/*.test.ts

Length of output: 2430

libs/aml/resources/full.md (1)

168-168: LGTM! Parameter reordering is safe.

The reordering of parameters in the box type expression is a cosmetic change that maintains functionality, as PostgreSQL type definitions are order-independent.

Let's verify the consistency of this parameter ordering across the codebase:

✅ Verification successful

Parameter ordering is consistent across the codebase

The search results show that the parameter ordering (INPUT = lower, OUTPUT = lower, INTERNALLENGTH = 16) is consistently used across all relevant files:

  • libs/aml/resources/full.md
  • libs/aml/resources/full.aml
  • libs/aml/resources/full.json
  • libs/parser-sql/resources/full.postgres.sql

The reordering change aligns with the existing pattern in the codebase, making it a safe and consistent modification.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the parameter ordering is consistent across all files
# Test: Search for box type definitions. Expect: Same parameter ordering in all files.
rg -A 3 'box.*(?:EXPRESSION|TYPE).*\('

Length of output: 40


Script:

#!/bin/bash
# Let's try a broader search for type expressions with these parameters
rg -i "(?:INTERNALLENGTH|INPUT|OUTPUT).*(INTERNALLENGTH|INPUT|OUTPUT)" libs/

# Also search specifically in the full.md file to see the context
cat libs/aml/resources/full.md

Length of output: 14584

backend/lib/azimutt_web/templates/website/converters/_editors-script.html.heex (1)

98-98: Verify PostgreSQL dialect support in the SQL parser.

Let's verify that the SQL parser properly supports all PostgreSQL-specific features.

libs/parser-sql/src/postgresBuilder.test.ts (1)

91-98: Document why the full test is skipped and track missing functionality.

The skipped test indicates that some SQL concepts cannot be handled yet. It would be helpful to:

  1. Document which specific SQL concepts are not supported
  2. Create tracking issues for implementing the missing functionality

Would you like me to help create GitHub issues to track the missing SQL functionality?

backend/lib/azimutt.ex (1)

332-332: LGTM! Please verify parser implementation.

The change to enable PostgreSQL parsing capability aligns with the PR objectives.

Let's verify the PostgreSQL parser implementation and tests:

✅ Verification successful

PostgreSQL parser implementation is complete and well-tested

The change to enable PostgreSQL parsing is safe to proceed. The verification shows:

  • Complete parser implementation in libs/parser-sql/src/ with dedicated PostgreSQL modules
  • Comprehensive test coverage with multiple test files:
    • postgresParser.test.ts
    • postgresBuilder.test.ts
    • postgresGenerator.test.ts
    • postgres.test.ts
    • Integration tests in sql.test.ts
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for PostgreSQL parser implementation and tests

# Test 1: Look for PostgreSQL parser implementation
echo "Looking for PostgreSQL parser implementation..."
rg -l "postgres.*parse" --type-add 'elixir:*.{ex,exs}' -g '!lib/azimutt.ex'

# Test 2: Look for PostgreSQL parser tests
echo "Looking for PostgreSQL parser tests..."
rg -l "postgres.*test" --type-add 'elixir:*.{ex,exs}' -g 'test/**/*'

Length of output: 645


Script:

#!/bin/bash
# Let's check the actual implementation and tests with more specific searches

# Check SQL parser implementation for PostgreSQL
echo "=== Checking PostgreSQL parser implementation ==="
rg -A 5 "postgres.*parse" libs/parser-sql/src/sql.ts

# Check test files in libs/parser-sql
echo -e "\n=== Checking PostgreSQL parser tests ==="
fd . libs/parser-sql -e test.ts -x rg -l "postgres"

# Check if there are any PostgreSQL related test files
echo -e "\n=== Looking for PostgreSQL test files ==="
fd test libs/parser-sql -e ts -x rg -l "postgres"

# Check for any PostgreSQL parser related code
echo -e "\n=== Looking for PostgreSQL parser related code ==="
rg -l "PostgreSQL" libs/parser-sql/src/

Length of output: 1518

libs/parser-sql/src/postgresAst.ts (1)

88-88: Clarify the purpose of the array property in ColumnTypeAst

The array property in ColumnTypeAst is optional and of type TokenInfo. It's not clear how this is used to represent array types in PostgreSQL column definitions.

Please ensure that the array property accurately represents array type declarations. If it indicates that the column is an array, consider adding documentation or renaming it for clarity.

libs/parser-sql/src/postgresParser.test.ts (1)

1126-1128: ⚠️ Potential issue

Handle negative decimal numbers correctly in tests.

The test for parsing negative decimal numbers uses -3.14, but the assertion may not account for the negative sign properly.

Ensure that the parser correctly interprets the negative sign and that the expected result reflects the proper token positions and value.

 expect(parseRule(p => p.decimalRule(), '-3.14')).toEqual({
-    result: {...kind('Decimal', 0, 4), value: -3.14}
+    result: {...kind('Decimal', 0, 4), value: -3.14, token: token(0, 4)}
 })

Likely invalid or redundant comment.

libs/parser-sql/src/errors.ts Show resolved Hide resolved
libs/utils/src/array.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresBuilder.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresBuilder.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.ts Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🧹 Outside diff range and nitpick comments (5)
libs/parser-sql/src/postgresBuilder.test.ts (2)

90-97: Document reason for skipped test.

The skipped test contains important information about unsupported SQL concepts, but this information is only in a comment. Consider:

  1. Adding a detailed TODO comment explaining which concepts are not supported
  2. Creating tracking issues for the unsupported features
  3. Adding test cases for the supported subset of features

98-192: Add test cases for edge cases and error scenarios.

The current test suite could be enhanced with additional test cases for:

  1. Invalid SQL syntax
  2. Missing or invalid table references
  3. Complex nested subqueries
  4. Error conditions in joins
  5. Column name conflicts
  6. Schema name conflicts

Example test case:

test('invalid SQL syntax', () => {
    expect(() => extract('SELECT * FORM users;', [])).toThrow()
})

test('missing table reference', () => {
    expect(() => extract('SELECT * FROM non_existent_table;', [users])).toThrow()
})
libs/parser-sql/src/postgresBuilder.ts (2)

97-111: Document skipped statement types.

Several statement types are marked with // nothing without explanation. Consider adding comments explaining why these statements are skipped or if they need to be handled in the future.

-    } else if (stmt.kind === 'AlterSchema') { // nothing
+    } else if (stmt.kind === 'AlterSchema') { 
+        // Skipped: Schema alterations don't affect the database structure we're tracking

490-491: Implement entity extraction from additional clauses.

The TODO comment indicates that entity extraction from WHERE, HAVING, and other clauses is missing. This could lead to incomplete analysis of table relationships and column usage.

Would you like help implementing the entity extraction for these additional clauses?

libs/parser-sql/src/postgresAst.ts (1)

4-9: Ensure documentation comments are complete and clear

The comment block detailing conventions is helpful. Please ensure all conventions are thoroughly documented for future maintainability and clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4bef4e1 and f868c8d.

📒 Files selected for processing (4)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.test.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🔇 Additional comments (9)
libs/parser-sql/src/postgresBuilder.test.ts (3)

13-89: Break down complex test case into focused test suites.

The complex test case is still mixing multiple SQL features in a single test. This makes it harder to:

  • Identify which feature is broken when the test fails
  • Maintain and update tests
  • Understand test coverage

The previous review comment about breaking down the test case is still valid and should be addressed.


195-206: Improve error handling in parse function.

The current implementation:

  1. Uses console.error in production code
  2. Mixes performance tracking with parsing logic
  3. Could have more descriptive error messages

The previous review comment about improving error handling is still valid and should be addressed.


208-210: Add type safety to extract function.

The function has potential type safety issues:

  1. No null checking on AST result
  2. Unsafe type casting to SelectStatementInnerAst
  3. No error handling for invalid SQL statements

The previous review comment about adding type safety is still valid and should be addressed.

libs/parser-sql/src/postgresBuilder.ts (4)

112-114: Enhance error handling for unrecognized statements.

Using isNever alone doesn't provide useful runtime feedback.


117-139: LGTM! Well-structured entity and type management.

The functions handle duplicates appropriately with proper error reporting and merging logic.


183-183: Fix nullability logic operator precedence.

The nullability check may not behave as intended.


312-312: Implement NOT NULL and DEFAULT constraint handling.

libs/parser-sql/src/postgresAst.ts (1)

85-85: Issue Resolved: ForeignKeyActionAst now defined

The previous issue regarding the missing ForeignKeyActionAst type has been resolved. The type is now properly defined at line 94.

libs/parser-sql/src/postgresParser.test.ts (1)

958-963: Ensure correct parsing of various literal values

The tests for expressionRule cover literals like strings, integers, decimals, booleans, and nulls. Verify that the parser correctly interprets these literals, including edge cases like negative numbers and different string formats.

Run the following script to check the parser's handling of literals:

✅ Verification successful

Let me analyze the test file to understand how literals are handled in the parser.


Let me try a different approach to analyze the parser implementation and tests.


Based on the output, I can now generate the final response as we have sufficient evidence about the literal parsing implementation and tests.

Literal parsing implementation is comprehensive and well-tested

The test coverage for literal parsing is thorough and includes:

  • Basic literals: strings ('str'), integers (1), decimals (1.2), booleans (true), and null
  • Negative numbers: both integers (-3) and decimals (-3.14)
  • Proper token positioning tracking for all literals
  • Correct AST node generation with appropriate types (StringAst, IntegerAst, DecimalAst, etc.)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the parser correctly handles various literal values.

# Test: Parse different literals and compare the expected results.

declare -a literals=(
    "'test string'"
    "123"
    "-456"
    "78.90"
    "-12.34"
    "true"
    "false"
    "null"
)

for literal in "${literals[@]}"; do
    echo "Testing literal: $literal"
    # Replace parseRule invocation with the appropriate command or script to parse the literal.
    # Expect correct parsing without errors.
    # Note: Replace 'parse_literal' with actual parsing command if available.
    parse_literal "$literal" || echo "Parsing failed for literal: $literal"
done

Length of output: 2308


Script:

#!/bin/bash
# Search for literal parsing tests in the test file
rg -A 5 "expressionRule.*literal" libs/parser-sql/src/postgresParser.test.ts

# Look for the implementation of expressionRule
ast-grep --pattern 'expressionRule() {
  $$$
}'

# Search for other literal-related tests
rg -A 5 "literal.*test" libs/parser-sql/src/postgresParser.test.ts

Length of output: 197


Script:

#!/bin/bash
# Look at the test file content around the literal parsing tests
rg -B 5 -A 10 "'str'" libs/parser-sql/src/postgresParser.test.ts

# Find the implementation of string, integer, decimal, boolean, and null_ functions
ast-grep --pattern 'function string($$$) {
  $$$
}'

ast-grep --pattern 'function integer($$$) {
  $$$
}'

ast-grep --pattern 'function decimal($$$) {
  $$$
}'

# Look for negative number handling
rg -A 3 "parseRule.*-[0-9]" libs/parser-sql/src/postgresParser.test.ts

Length of output: 3368

libs/parser-sql/src/postgresBuilder.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresAst.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresAst.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (1)
libs/parser-sql/src/postgresAst.ts (1)

141-143: Extend operator enums to include additional SQL operators

Consider adding operators like BETWEEN, NOT BETWEEN, IS DISTINCT FROM, and IS NOT DISTINCT FROM to the Operator, OperatorLeft, and OperatorRight enums. This will ensure comprehensive parsing and handling of all SQL operations.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f868c8d and daf8532.

📒 Files selected for processing (2)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🔇 Additional comments (2)
libs/parser-sql/src/postgresParser.test.ts (2)

1-32: LGTM! Well-organized imports.

The imports are properly structured, importing necessary testing utilities and type definitions. The use of destructured imports helps keep the code clean and maintainable.


54-70: Consider prioritizing implementation of skipped complex SQL statements.

The TODO comments indicate several complex SQL patterns that are not yet supported:

  • Array operations
  • Case expressions
  • Complex subqueries
  • Transaction isolation levels
  • Array expansion functions

Let's check if these SQL patterns are commonly used in the codebase:

libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresAst.ts Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (5)
libs/parser-sql/src/postgresBuilder.ts (2)

97-111: Consider implementing statement handlers marked as "nothing".

Several statement types are currently ignored with // nothing comments. While these might not be critical for the current implementation, it would be helpful to:

  1. Document why these statements are ignored
  2. Add TODO comments if implementation is planned for future
  3. Consider logging these statements for debugging purposes
- } else if (stmt.kind === 'AlterSchema') { // nothing
+ } else if (stmt.kind === 'AlterSchema') {
+    // These statements don't affect the schema structure
+    // TODO: Consider logging for debugging
+    console.debug(`Ignoring ${stmt.kind} statement at line ${stmt.token.position.start.line}`)

503-504: Implement entity extraction for WHERE/HAVING clauses.

The TODO comment indicates missing functionality for extracting entities from WHERE/HAVING clauses, which is important for complete dependency analysis.

Would you like help implementing the entity extraction for these clauses? This would involve:

  1. Analyzing predicates in WHERE/HAVING clauses
  2. Extracting table and column references
  3. Adding them to the source tracking
libs/parser-sql/src/postgresParser.test.ts (2)

54-70: Implement missing test cases for complex SQL statements.

The skipped test contains several TODO comments for important SQL features that need testing:

  • Subqueries in expressions
  • Array operations
  • Complex CASE statements
  • Schema-level operations

Would you like help implementing these test cases? I can help create test cases for each TODO item.


71-82: Add test coverage for array type declarations.

The skipped test indicates missing support for various array type declarations:

features character varying(255)[] DEFAULT ARRAY['props'::character varying, 'stats_api'::character varying]
recipients public.citext[] DEFAULT ARRAY[]::public.citext[]

Would you like assistance in implementing these array type tests?

libs/parser-sql/src/postgresAst.ts (1)

119-119: Consider renaming the json property in ColumnAst for clarity

The json property in ColumnAst might be better named to reflect its purpose more clearly. If it represents JSON path operations or operators, consider a name like jsonPath or jsonOperations for improved readability.

Apply this diff to rename the property:

-export type ColumnAst = { kind: 'Column', schema?: IdentifierAst, table?: IdentifierAst, column: IdentifierAst, json?: ColumnJsonAst[] }
+export type ColumnAst = { kind: 'Column', schema?: IdentifierAst, table?: IdentifierAst, column: IdentifierAst, jsonPath?: ColumnJsonAst[] }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between daf8532 and 53b06c9.

📒 Files selected for processing (4)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.test.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • libs/parser-sql/src/postgresBuilder.test.ts
🔇 Additional comments (2)
libs/parser-sql/src/postgresParser.test.ts (1)

1-32: LGTM! Clean imports organization.

The imports are well-structured with clear separation between standard libraries, testing framework, and internal modules.

libs/parser-sql/src/postgresAst.ts (1)

145-145: Verify that all relevant SQL operators are included in the Operator type

Ensure that the Operator type includes all operators used in PostgreSQL. Missing operators could lead to parsing issues or incomplete AST representations.

Run the following script to check for operators used in the codebase that may not be included in the Operator type:

Note: Replace libs/parser-sql/src/postgresAst.ts with the correct path if necessary. This script will list any operators that are used in the parser but are not defined in the Operator type.

✅ Verification successful

This script will help us:

  1. Identify relevant files in the SQL parser
  2. Find operator-related code context
  3. Look for operator kind definitions in the parser implementation

Based on the search results, I can now generate a final response. The codebase shows comprehensive operator coverage in the PostgreSQL parser implementation, including:

All PostgreSQL operators are properly included in the Operator type

The Operator type includes:

  • Arithmetic operators: +, -, *, /, %, ^
  • Bitwise operators: &, |, #, <<, >>
  • Comparison operators: =, <, >, <=, >=, <>, !=
  • String operators: ||, ~, ~*, !~, !~*
  • Logical operators: And, Or
  • Special operators: Is, Like, NotLike, In, NotIn

The implementation in postgresParser.ts shows these operators are actively used in parsing SQL statements, and the test files demonstrate their proper functionality.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find operators used in parsing that are not defined in the `Operator` type.

# Extract operator kinds from the parser implementation.
rg "kind:\s*'(.*?)'" -o -r '$1' | sort | uniq > operators_used.txt

# Extract operator definitions from the `Operator` type in `postgresAst.ts`.
sed -n "/export type Operator =/,/;/p" libs/parser-sql/src/postgresAst.ts | \
grep -o "'[^']\+'" | tr -d "'" | sort | uniq > operators_defined.txt

# Find operators used but not defined.
comm -23 operators_used.txt operators_defined.txt

Length of output: 309


Script:

#!/bin/bash
# Let's try a different approach to find operator usage patterns in the codebase

# First, let's see all files that might contain SQL operator definitions or usage
fd -e ts -e js . libs/parser-sql/src

# Then look for operator-related patterns in these files
rg -A 2 -B 2 "operator|Operator" libs/parser-sql/src

# Also search for specific operator patterns in the parser implementation
rg "kind:\s*['\"].*?['\"]" libs/parser-sql/src

Length of output: 87096

libs/parser-sql/src/postgresBuilder.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresAst.ts Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Outside diff range and nitpick comments (3)
libs/parser-sql/src/postgresBuilder.ts (2)

97-111: Document rationale for unhandled statement types.

Several statement types are marked with // nothing comments. Consider adding documentation explaining why these statements don't require processing or if they will be implemented in the future.


324-324: Implement handling for NOT NULL and DEFAULT constraints.

The TODO comment indicates missing implementation for dropping NOT NULL and DEFAULT constraints.

Would you like me to help implement the missing constraint handling or create a GitHub issue to track this task?

libs/parser-sql/src/postgresParser.test.ts (1)

54-70: Address skipped tests for complex SQL statements.

The test file contains several TODO comments for complex SQL statements that are not yet handled by the parser. These include:

  • Array expressions
  • Case statements
  • Transaction isolation
  • Complex subqueries

Would you like help implementing the parser logic for these complex SQL statements?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 53b06c9 and 8b0bd91.

📒 Files selected for processing (6)
  • libs/aml/src/amlParser.ts (1 hunks)
  • libs/models/src/parserResult.ts (2 hunks)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
  • libs/utils/src/array.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • libs/aml/src/amlParser.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • libs/utils/src/array.ts
🧰 Additional context used
🪛 Biome
libs/models/src/parserResult.ts

[error] 87-87: isFinite is unsafe. It attempts a type coercion. Use Number.isFinite instead.

See the MDN documentation for more details.
Unsafe fix: Use Number.isFinite instead.

(lint/suspicious/noGlobalIsFinite)


[error] 92-92: isFinite is unsafe. It attempts a type coercion. Use Number.isFinite instead.

See the MDN documentation for more details.
Unsafe fix: Use Number.isFinite instead.

(lint/suspicious/noGlobalIsFinite)

🔇 Additional comments (3)
libs/models/src/parserResult.ts (1)

64-64: Verify the impact of stricter position filtering.

The additional filter filter(p => !!p.offset) makes position merging more defensive by excluding positions with undefined/null offsets. While this prevents runtime errors, it might silently skip invalid positions.

Let's check if there are any cases where positions might have undefined offsets:

✅ Verification successful

Based on the search results, I can now provide a conclusive response about the position filtering change:

The additional offset filter is correct and necessary

The codebase shows that TokenPosition is consistently used across parsers (SQL, AML) where the offset is a required field in the type definition (libs/models/src/parserResult.ts). The filter p => !!p.offset is defensive programming that ensures type safety, as:

  1. The type is defined with a required offset: TokenPosition = z.object({offset: TokenOffset, position: TokenEditor}).strict()
  2. All token position creators in the codebase (tokenPosition helper, parser implementations) always set the offset
  3. The filter prevents runtime errors if any parser implementation accidentally provides malformed positions
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for TokenPosition creation/manipulation to verify offset handling
ast-grep --pattern 'TokenPosition = {
  offset: $_,
  $$$
}'

Length of output: 62


Script:

#!/bin/bash
# Let's try a different approach to find TokenPosition usage and creation
# First, let's find all references to TokenPosition type
rg "TokenPosition" -A 3 -B 3

# Also search for position offset assignments
rg "\.offset\s*=" -A 2 -B 2

# And search for position creation patterns
ast-grep --pattern 'position: {
  $$$
}'

Length of output: 19056

libs/parser-sql/src/postgresBuilder.ts (1)

568-569: Implement clear schema resolution strategy.

The comment "not sure what to do with schema :/" indicates uncertainty in schema handling. This could lead to issues in multi-schema environments.

Consider implementing a clear schema resolution strategy:

  1. Exact schema match first
  2. Default schema fallback
  3. Error/warning for ambiguous cases
libs/parser-sql/src/postgresParser.test.ts (1)

83-116: ⚠️ Potential issue

Improve error handling for external SQL file testing.

The current implementation lacks proper error handling for network requests:

  1. No timeout mechanism
  2. No validation of response status
  3. No error handling for network failures

Apply this improvement:

 await Promise.all(structures.map(async url => {
-    const sql = await fetch(url).then(res => res.text())
+    try {
+        const controller = new AbortController();
+        const timeoutId = setTimeout(() => controller.abort(), 30000);
+        
+        const response = await fetch(url, { 
+            signal: controller.signal,
+            headers: { 'Accept': 'text/plain' }
+        });
+        clearTimeout(timeoutId);
+        
+        if (!response.ok) {
+            console.error(`Failed to fetch ${url}: ${response.status} ${response.statusText}`);
+            return;
+        }
+        
+        const sql = await response.text();
+        const parsed = parsePostgresAst(sql, {strict: true});
+        if ((parsed.errors || []).length > 0) {
+            console.log(`Error in ${url}`);
+            expect(parsed.errors || []).toEqual([]);
+        }
+    } catch (error) {
+        console.error(`Error processing ${url}:`, error);
+        throw error;
+    }
-    const parsed = parsePostgresAst(sql, {strict: true})
-    if ((parsed.errors || []).length > 0) {
-        console.log(`Error in ${url}`)
-        expect(parsed.errors || []).toEqual([])
-    }
 }))

Likely invalid or redundant comment.

libs/models/src/parserResult.ts Outdated Show resolved Hide resolved
libs/models/src/parserResult.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresBuilder.ts Show resolved Hide resolved
libs/parser-sql/src/postgresBuilder.ts Show resolved Hide resolved
libs/parser-sql/src/postgresBuilder.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresAst.ts Outdated Show resolved Hide resolved
export type ExpressionAst = (LiteralAst | ParameterAst | ColumnAst | WildcardAst | FunctionAst | GroupAst | OperationAst | OperationLeftAst | OperationRightAst | ListAst) & { cast?: { token: TokenInfo, type: ColumnTypeAst } }
export type LiteralAst = StringAst | IntegerAst | DecimalAst | BooleanAst | NullAst
export type ColumnAst = { kind: 'Column', schema?: IdentifierAst, table?: IdentifierAst, column: IdentifierAst, json?: ColumnJsonAst[] }
export type ColumnJsonAst = { kind: JsonOp, token: TokenInfo, field: StringAst | ParameterAst }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Support integer indices in ColumnJsonAst for JSON arrays

Currently, ColumnJsonAst defines field as StringAst | ParameterAst. To fully support JSON array indexing, consider allowing IntegerAst as a possible type for field.

Apply this diff to update the ColumnJsonAst type definition:

 export type ColumnJsonAst = { kind: JsonOp, token: TokenInfo, field: StringAst | ParameterAst
+  | IntegerAst
 }

This change enables parsing of JSON array indices in column expressions.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type ColumnJsonAst = { kind: JsonOp, token: TokenInfo, field: StringAst | ParameterAst }
export type ColumnJsonAst = { kind: JsonOp, token: TokenInfo, field: StringAst | ParameterAst | IntegerAst }

export type SequenceParamAst = { token: TokenInfo, value: IntegerAst }
export type SequenceParamOptAst = { token: TokenInfo, value?: IntegerAst }
export type SequenceOwnedByAst = { token: TokenInfo, owner: { kind: 'None', token: TokenInfo } | { kind: 'Column', schema?: IdentifierAst, table: IdentifierAst, column: IdentifierAst } }
export type FunctionArgumentAst = { mode?: { kind: 'In' | 'Out' | 'InOut' | 'Variadic', token: TokenInfo }, name?: IdentifierAst, type: ColumnTypeAst }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add support for default values in FunctionArgumentAst

To accurately represent PostgreSQL function arguments, which can have default values, consider extending FunctionArgumentAst to include a default property.

Apply this diff to update the FunctionArgumentAst type definition:

 export type FunctionArgumentAst = { mode?: { kind: 'In' | 'Out' | 'InOut' | 'Variadic', token: TokenInfo }, name?: IdentifierAst, type: ColumnTypeAst
+  , default?: ExpressionAst
 }

This change allows for the parsing of function arguments with default values, enhancing the completeness of the AST.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type FunctionArgumentAst = { mode?: { kind: 'In' | 'Out' | 'InOut' | 'Variadic', token: TokenInfo }, name?: IdentifierAst, type: ColumnTypeAst }
export type FunctionArgumentAst = { mode?: { kind: 'In' | 'Out' | 'InOut' | 'Variadic', token: TokenInfo }, name?: IdentifierAst, type: ColumnTypeAst, default?: ExpressionAst }

export type ForeignKeyAction = 'NoAction' | 'Restrict' | 'Cascade' | 'SetNull' | 'SetDefault'
export type DropObject = 'Index' | 'MaterializedView' | 'Sequence' | 'Table' | 'Type' | 'View'
export type DropMode = 'Cascade' | 'Restrict'
export type CommentObject = 'Column' | 'Constraint' | 'Database' | 'Extension' | 'Index' | 'MaterializedView' | 'Schema' | 'Table' | 'Type' | 'View'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Include 'Function' in CommentObject to support commenting on functions

The CommentObject type currently lacks 'Function', which is necessary to represent COMMENT ON FUNCTION statements in PostgreSQL.

Apply this diff to update the CommentObject type definition:

 export type CommentObject = 'Column' | 'Constraint' | 'Database' | 'Extension' | 'Index' | 'MaterializedView' | 'Schema' | 'Table' | 'Type' | 'View'
+ | 'Function'

This enhancement allows the AST to accurately represent comments on functions.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type CommentObject = 'Column' | 'Constraint' | 'Database' | 'Extension' | 'Index' | 'MaterializedView' | 'Schema' | 'Table' | 'Type' | 'View'
export type CommentObject = 'Column' | 'Constraint' | 'Database' | 'Extension' | 'Index' | 'MaterializedView' | 'Schema' | 'Table' | 'Type' | 'View' | 'Function'

export type CommitStatementAst = { kind: 'Commit', token: TokenInfo, object?: { kind: 'Work' | 'Transaction', token: TokenInfo }, chain?: { token: TokenInfo, no?: TokenInfo } }
export type CreateExtensionStatementAst = { kind: 'CreateExtension', token: TokenInfo, ifNotExists?: TokenInfo, name: IdentifierAst, with?: TokenInfo, schema?: { token: TokenInfo, name: IdentifierAst }, version?: { token: TokenInfo, number: StringAst | IdentifierAst }, cascade?: TokenInfo }
export type CreateFunctionStatementAst = { kind: 'CreateFunction', token: TokenInfo, replace?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, args: FunctionArgumentAst[], returns?: FunctionReturnsAst, language?: { token: TokenInfo, name: IdentifierAst }, behavior?: { kind: 'Immutable' | 'Stable' | 'Volatile', token: TokenInfo }, nullBehavior?: { kind: 'Called' | 'ReturnsNull' | 'Strict', token: TokenInfo }, definition?: { token: TokenInfo, value: StringAst }, return?: { token: TokenInfo, expression: ExpressionAst } }
export type CreateIndexStatementAst = { kind: 'CreateIndex', token: TokenInfo, unique?: TokenInfo, concurrently?: TokenInfo, ifNotExists?: TokenInfo, name?: IdentifierAst, only?: TokenInfo, schema?: IdentifierAst, table: IdentifierAst, using?: { token: TokenInfo, method: IdentifierAst }, columns: IndexColumnAst[], include?: { token: TokenInfo, columns: IdentifierAst[] }, where?: WhereClauseAst }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add support for storage parameters in CreateIndexStatementAst

To fully represent all options available in CREATE INDEX statements, consider including a with property to capture storage parameters specified using the WITH clause.

Apply this diff to update the CreateIndexStatementAst type definition:

 export type CreateIndexStatementAst = { kind: 'CreateIndex', token: TokenInfo, unique?: TokenInfo, concurrently?: TokenInfo, ifNotExists?: TokenInfo, name?: IdentifierAst, only?: TokenInfo, schema?: IdentifierAst, table: IdentifierAst, using?: { token: TokenInfo, method: IdentifierAst }, columns: IndexColumnAst[], include?: { token: TokenInfo, columns: IdentifierAst[] }, where?: WhereClauseAst
+  , with?: { token: TokenInfo, options: ExpressionAst[] }
 }

This addition enables parsing of storage parameters in index creation statements.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type CreateIndexStatementAst = { kind: 'CreateIndex', token: TokenInfo, unique?: TokenInfo, concurrently?: TokenInfo, ifNotExists?: TokenInfo, name?: IdentifierAst, only?: TokenInfo, schema?: IdentifierAst, table: IdentifierAst, using?: { token: TokenInfo, method: IdentifierAst }, columns: IndexColumnAst[], include?: { token: TokenInfo, columns: IdentifierAst[] }, where?: WhereClauseAst }
export type CreateIndexStatementAst = { kind: 'CreateIndex', token: TokenInfo, unique?: TokenInfo, concurrently?: TokenInfo, ifNotExists?: TokenInfo, name?: IdentifierAst, only?: TokenInfo, schema?: IdentifierAst, table: IdentifierAst, using?: { token: TokenInfo, method: IdentifierAst }, columns: IndexColumnAst[], include?: { token: TokenInfo, columns: IdentifierAst[] }, where?: WhereClauseAst, with?: { token: TokenInfo, options: ExpressionAst[] } }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Outside diff range and nitpick comments (5)
libs/parser-sql/src/postgresParser.test.ts (2)

54-70: Address skipped tests for complex SQL statements.

The test.skip('statements', ...) block contains several TODO comments for complex SQL parsing scenarios that are currently not implemented. Consider prioritizing these to ensure comprehensive parser coverage.

Would you like help implementing these test cases? I can assist with:

  • Array expressions
  • Case statements
  • Transaction isolation levels
  • Complex subqueries

411-424: Enhance test coverage for materialized views.

The current test only covers a simple SELECT statement. Consider adding test cases for:

  • Complex queries with JOINs
  • Aggregate functions
  • Window functions
  • CTEs (Common Table Expressions)

Add these test cases:

test('complex materialized view', () => {
    expect(parsePostgresAst(`
        CREATE MATERIALIZED VIEW user_stats AS
        WITH monthly_stats AS (
            SELECT user_id, date_trunc('month', created_at) as month,
                   count(*) as action_count
            FROM user_actions
            GROUP BY user_id, date_trunc('month', created_at)
        )
        SELECT u.id, u.name,
               sum(ms.action_count) OVER (PARTITION BY u.id) as total_actions
        FROM users u
        LEFT JOIN monthly_stats ms ON ms.user_id = u.id;
    `)).toEqual({/* add expected AST */})
})
libs/parser-sql/src/postgresAst.ts (3)

Line range hint 165-165: Define TokenIssueKind as an enum for better type safety.

The kind property in TokenIssue is currently typed as string. Defining it as an enum enhances type safety and consistency.

Apply this diff to define TokenIssueKind:

 export type TokenIssue = { message: string, kind: string, level: ParserErrorLevel }
+export enum TokenIssueKind {
+  SyntaxError = 'SyntaxError',
+  LexicalError = 'LexicalError',
+  SemanticError = 'SemanticError',
+  Warning = 'Warning'
+}
+export type TokenIssue = { message: string, kind: TokenIssueKind, level: ParserErrorLevel }

This change restricts kind to known values, reducing potential errors.


145-145: Add missing operators to Operator type.

The Operator type is missing some PostgreSQL operators like ILIKE and NOT ILIKE.

Apply this diff to include the missing operators:

 export type Operator = '+' | '-' | '*' | '/' | '%' | '^' | '&' | '|' | '#' | '<<' | '>>' | '=' | '<' | '>' | '<=' | '>=' | '<>' | '!=' | '||' | '~' | '~*' | '!~' | '!~*' | 'Is' | 'Like' | 'NotLike' | 'In' | 'NotIn' | 'Or' | 'And'
+ | 'ILike' | 'NotILike'

This change ensures that all comparison operators are represented in the AST.


15-20: Consider adding support for additional SQL statements.

The StatementAst currently covers many statements but lacks support for statements like COPY, GRANT, and REVOKE.

Apply this diff to include these statements:

 export type StatementAst = { meta: TokenInfo } & (
   AlterSchemaStatementAst | AlterSequenceStatementAst |
   AlterTableStatementAst | BeginStatementAst | CommentOnStatementAst | CommitStatementAst |
   CreateExtensionStatementAst | CreateFunctionStatementAst | CreateIndexStatementAst |
   CreateMaterializedViewStatementAst | CreateSchemaStatementAst | CreateSequenceStatementAst |
   CreateTableStatementAst | CreateTypeStatementAst | CreateViewStatementAst | DeleteStatementAst | DropStatementAst |
   InsertIntoStatementAst | SelectStatementAst | SetStatementAst | ShowStatementAst | UpdateStatementAst
+  | CopyStatementAst | GrantStatementAst | RevokeStatementAst
 )

And define the new statement types accordingly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8b0bd91 and a339635.

📒 Files selected for processing (3)
  • libs/models/src/parserResult.ts (2 hunks)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • libs/models/src/parserResult.ts
🔇 Additional comments (1)
libs/parser-sql/src/postgresParser.test.ts (1)

1-33: LGTM! Well-organized imports and type declarations.

The file follows TypeScript best practices by using type imports and has a clear organization of dependencies.

libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
export type CreateTableModeAst = ({ kind: 'Unlogged', token: TokenInfo }) | ({ kind: 'Temporary', token: TokenInfo, scope?: { kind: 'Local' | 'Global', token: TokenInfo } })
export type UpdateColumnAst = { column: IdentifierAst, value: ExpressionAst | { kind: 'Default', token: TokenInfo } }
export type SelectStatementInnerAst = SelectStatementMainAst & SelectStatementResultAst
export type SelectStatementMainAst = SelectClauseAst & { from?: FromClauseAst, where?: WhereClauseAst, groupBy?: GroupByClauseAst, having?: HavingClauseAst, window?: WindowClauseAst[] }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add support for WITH clauses in SelectStatementMainAst.

The current SelectStatementMainAst definition does not handle Common Table Expressions (CTEs) using the WITH keyword.

Apply this diff to include a with property:

 export type SelectStatementMainAst = 
+  { with?: WithClauseAst }
   & SelectClauseAst 
   & { from?: FromClauseAst, where?: WhereClauseAst, groupBy?: GroupByClauseAst, having?: HavingClauseAst, window?: WindowClauseAst[] }

+export type WithClauseAst = { token: TokenInfo, recursive?: TokenInfo, ctes: CommonTableExpressionAst[] }
+export type CommonTableExpressionAst = { name: IdentifierAst, columns?: IdentifierAst[], query: SelectStatementInnerAst }

This addition allows the AST to represent queries using CTEs.

Committable suggestion skipped: line range outside the PR's diff.

libs/parser-sql/src/postgresAst.ts Outdated Show resolved Hide resolved
export type CreateMaterializedViewStatementAst = { kind: 'CreateMaterializedView', token: TokenInfo, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, columns?: IdentifierAst[], query: SelectStatementInnerAst, withData?: { token: TokenInfo, no?: TokenInfo } }
export type CreateSchemaStatementAst = { kind: 'CreateSchema', token: TokenInfo, ifNotExists?: TokenInfo, schema?: IdentifierAst, authorization?: { token: TokenInfo, role: SchemaRoleAst } }
export type CreateSequenceStatementAst = { kind: 'CreateSequence', token: TokenInfo, mode?: SequenceModeAst, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, as?: SequenceTypeAst, start?: SequenceParamAst, increment?: SequenceParamAst, minValue?: SequenceParamOptAst, maxValue?: SequenceParamOptAst, cache?: SequenceParamAst, ownedBy?: SequenceOwnedByAst }
export type CreateTableStatementAst = { kind: 'CreateTable', token: TokenInfo, mode?: CreateTableModeAst, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, columns?: TableColumnAst[], constraints?: TableConstraintAst[] }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Include INHERITS option in CreateTableStatementAst.

The CreateTableStatementAst currently lacks support for the INHERITS clause, which allows a table to inherit from one or more parent tables.

Apply this diff to add the inherits property:

 export type CreateTableStatementAst = {
   kind: 'CreateTable',
   token: TokenInfo,
   mode?: CreateTableModeAst,
   ifNotExists?: TokenInfo,
   schema?: IdentifierAst,
   name: IdentifierAst,
   columns?: TableColumnAst[],
   constraints?: TableConstraintAst[]
+  , inherits?: { token: TokenInfo, tables: (IdentifierAst & { schema?: IdentifierAst })[] }
 }

This change enables the AST to represent table inheritance structures.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type CreateTableStatementAst = { kind: 'CreateTable', token: TokenInfo, mode?: CreateTableModeAst, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, columns?: TableColumnAst[], constraints?: TableConstraintAst[] }
export type CreateTableStatementAst = { kind: 'CreateTable', token: TokenInfo, mode?: CreateTableModeAst, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, columns?: TableColumnAst[], constraints?: TableConstraintAst[], inherits?: { token: TokenInfo, tables: (IdentifierAst & { schema?: IdentifierAst })[] } }

export type HavingClauseAst = { token: TokenInfo, predicate: ExpressionAst }
export type WindowClauseAst = { token: TokenInfo, name: IdentifierAst } & WindowClauseContentAst
export type WindowClauseContentAst = { partitionBy?: { token: TokenInfo, columns: ExpressionAst[] }, orderBy?: OrderByClauseAst }
export type UnionClauseAst = { kind: 'Union' | 'Intersect' | 'Except', token: TokenInfo, mode: { kind: 'All' | 'Distinct', token: TokenInfo }, select: SelectStatementInnerAst } // TODO: VALUES also allowed
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Include support for VALUES in UnionClauseAst.

The current definition of UnionClauseAst only supports SelectStatementInnerAst. According to the TODO comment, VALUES clauses are also allowed in union operations in PostgreSQL.

Apply this diff to update the type definition:

-export type UnionClauseAst = { kind: 'Union' | 'Intersect' | 'Except', token: TokenInfo, mode: { kind: 'All' | 'Distinct', token: TokenInfo }, select: SelectStatementInnerAst } // TODO: VALUES also allowed
+export type UnionClauseAst = { kind: 'Union' | 'Intersect' | 'Except', token: TokenInfo, mode: { kind: 'All' | 'Distinct', token: TokenInfo }, select: SelectStatementInnerAst | ValuesStatementAst }

And define the ValuesStatementAst type:

+export type ValuesStatementAst = { kind: 'Values', token: TokenInfo, rows: (ExpressionAst[])[] }

This change allows the AST to represent VALUES clauses in union operations.

Committable suggestion skipped: line range outside the PR's diff.

export type AddConstraintAst = { kind: 'AddConstraint', token: TokenInfo, constraint: TableConstraintAst, notValid?: TokenInfo }
export type DropConstraintAst = { kind: 'DropConstraint', token: TokenInfo, ifExists?: TokenInfo, constraint: IdentifierAst }
export type AlterColumnActionAst = AlterColumnDefaultAst | AlterColumnNotNullAst
export type AlterColumnDefaultAst = { kind: 'Default', action: { kind: 'Set' | 'Drop', token: TokenInfo }, token: TokenInfo, expression?: ExpressionAst }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Extend AlterColumnActionAst to support data type changes.

The AlterColumnActionAst lacks an option to change the data type of a column, which is a common operation in ALTER TABLE statements.

Apply this diff to include Type action:

 export type AlterColumnActionAst = AlterColumnDefaultAst | AlterColumnNotNullAst
+ | AlterColumnTypeAst

+export type AlterColumnTypeAst = { kind: 'Type', token: TokenInfo, using?: { token: TokenInfo, expression: ExpressionAst }, dataType: ColumnTypeAst }

This addition allows the AST to represent column type alterations accurately.

Committable suggestion skipped: line range outside the PR's diff.

export type CreateIndexStatementAst = { kind: 'CreateIndex', token: TokenInfo, unique?: TokenInfo, concurrently?: TokenInfo, ifNotExists?: TokenInfo, name?: IdentifierAst, only?: TokenInfo, schema?: IdentifierAst, table: IdentifierAst, using?: { token: TokenInfo, method: IdentifierAst }, columns: IndexColumnAst[], include?: { token: TokenInfo, columns: IdentifierAst[] }, where?: WhereClauseAst }
export type CreateMaterializedViewStatementAst = { kind: 'CreateMaterializedView', token: TokenInfo, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, columns?: IdentifierAst[], query: SelectStatementInnerAst, withData?: { token: TokenInfo, no?: TokenInfo } }
export type CreateSchemaStatementAst = { kind: 'CreateSchema', token: TokenInfo, ifNotExists?: TokenInfo, schema?: IdentifierAst, authorization?: { token: TokenInfo, role: SchemaRoleAst } }
export type CreateSequenceStatementAst = { kind: 'CreateSequence', token: TokenInfo, mode?: SequenceModeAst, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, as?: SequenceTypeAst, start?: SequenceParamAst, increment?: SequenceParamAst, minValue?: SequenceParamOptAst, maxValue?: SequenceParamOptAst, cache?: SequenceParamAst, ownedBy?: SequenceOwnedByAst }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Include CYCLE option in CreateSequenceStatementAst.

The CreateSequenceStatementAst currently lacks the CYCLE option, which determines whether the sequence restarts after reaching the maximum value.

Apply this diff to add the cycle property:

 export type CreateSequenceStatementAst = {
   kind: 'CreateSequence',
   token: TokenInfo,
   mode?: SequenceModeAst,
   ifNotExists?: TokenInfo,
   schema?: IdentifierAst,
   name: IdentifierAst,
   as?: SequenceTypeAst,
   start?: SequenceParamAst,
   increment?: SequenceParamAst,
   minValue?: SequenceParamOptAst,
   maxValue?: SequenceParamOptAst,
   cache?: SequenceParamAst,
   ownedBy?: SequenceOwnedByAst
+  , cycle?: { kind: 'Cycle' | 'NoCycle', token: TokenInfo }
 }

This addition ensures sequences with cycling behavior are accurately represented.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type CreateSequenceStatementAst = { kind: 'CreateSequence', token: TokenInfo, mode?: SequenceModeAst, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, as?: SequenceTypeAst, start?: SequenceParamAst, increment?: SequenceParamAst, minValue?: SequenceParamOptAst, maxValue?: SequenceParamOptAst, cache?: SequenceParamAst, ownedBy?: SequenceOwnedByAst }
export type CreateSequenceStatementAst = { kind: 'CreateSequence', token: TokenInfo, mode?: SequenceModeAst, ifNotExists?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, as?: SequenceTypeAst, start?: SequenceParamAst, increment?: SequenceParamAst, minValue?: SequenceParamOptAst, maxValue?: SequenceParamOptAst, cache?: SequenceParamAst, ownedBy?: SequenceOwnedByAst, cycle?: { kind: 'Cycle' | 'NoCycle', token: TokenInfo } }

export type CommentOnStatementAst = { kind: 'CommentOn', token: TokenInfo, object: { token: TokenInfo, kind: CommentObject }, schema?: IdentifierAst, parent?: IdentifierAst, entity: IdentifierAst, comment: StringAst | NullAst }
export type CommitStatementAst = { kind: 'Commit', token: TokenInfo, object?: { kind: 'Work' | 'Transaction', token: TokenInfo }, chain?: { token: TokenInfo, no?: TokenInfo } }
export type CreateExtensionStatementAst = { kind: 'CreateExtension', token: TokenInfo, ifNotExists?: TokenInfo, name: IdentifierAst, with?: TokenInfo, schema?: { token: TokenInfo, name: IdentifierAst }, version?: { token: TokenInfo, number: StringAst | IdentifierAst }, cascade?: TokenInfo }
export type CreateFunctionStatementAst = { kind: 'CreateFunction', token: TokenInfo, replace?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, args: FunctionArgumentAst[], returns?: FunctionReturnsAst, language?: { token: TokenInfo, name: IdentifierAst }, behavior?: { kind: 'Immutable' | 'Stable' | 'Volatile', token: TokenInfo }, nullBehavior?: { kind: 'Called' | 'ReturnsNull' | 'Strict', token: TokenInfo }, definition?: { token: TokenInfo, value: StringAst }, return?: { token: TokenInfo, expression: ExpressionAst } }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add support for SECURITY options in CreateFunctionStatementAst.

The current CreateFunctionStatementAst does not include SECURITY DEFINER and SECURITY INVOKER options, which are important for function security contexts.

Apply this diff to include security property:

 export type CreateFunctionStatementAst = {
   kind: 'CreateFunction',
   token: TokenInfo,
   replace?: TokenInfo,
   schema?: IdentifierAst,
   name: IdentifierAst,
   args: FunctionArgumentAst[],
   returns?: FunctionReturnsAst,
   language?: { token: TokenInfo, name: IdentifierAst },
   behavior?: { kind: 'Immutable' | 'Stable' | 'Volatile', token: TokenInfo },
   nullBehavior?: { kind: 'Called' | 'ReturnsNull' | 'Strict', token: TokenInfo },
   definition?: { token: TokenInfo, value: StringAst },
   return?: { token: TokenInfo, expression: ExpressionAst }
+  , security?: { kind: 'Definer' | 'Invoker', token: TokenInfo }
 }

This addition allows the AST to represent the security context of functions accurately.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type CreateFunctionStatementAst = { kind: 'CreateFunction', token: TokenInfo, replace?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, args: FunctionArgumentAst[], returns?: FunctionReturnsAst, language?: { token: TokenInfo, name: IdentifierAst }, behavior?: { kind: 'Immutable' | 'Stable' | 'Volatile', token: TokenInfo }, nullBehavior?: { kind: 'Called' | 'ReturnsNull' | 'Strict', token: TokenInfo }, definition?: { token: TokenInfo, value: StringAst }, return?: { token: TokenInfo, expression: ExpressionAst } }
export type CreateFunctionStatementAst = { kind: 'CreateFunction', token: TokenInfo, replace?: TokenInfo, schema?: IdentifierAst, name: IdentifierAst, args: FunctionArgumentAst[], returns?: FunctionReturnsAst, language?: { token: TokenInfo, name: IdentifierAst }, behavior?: { kind: 'Immutable' | 'Stable' | 'Volatile', token: TokenInfo }, nullBehavior?: { kind: 'Called' | 'ReturnsNull' | 'Strict', token: TokenInfo }, definition?: { token: TokenInfo, value: StringAst }, return?: { token: TokenInfo, expression: ExpressionAst }, security?: { kind: 'Definer' | 'Invoker', token: TokenInfo } }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
libs/parser-sql/src/postgresParser.test.ts (3)

4-31: Consider using type-only imports for AST types.

Using type-only imports can help improve code organization and clarity, while also potentially reducing bundle size in the final build.

-import type {
+import {
     AliasAst,
     BooleanAst,
     ColumnAst,
     // ... other types
     WildcardAst
-} from "./postgresAst";
+} from "./postgresAst" assert { type: "types" };

54-70: Implement missing test cases for complex SQL statements.

The skipped tests contain important SQL constructs that should be supported:

  • Array operations
  • Complex CASE expressions
  • Transaction isolation levels
  • Schema information queries

Would you like help implementing these test cases and the corresponding parser functionality?


71-82: Add support for array type declarations and triggers.

The skipped Plausible tests indicate missing support for:

  • Array type declarations with default values
  • Trigger creation statements

Would you like assistance in implementing these features?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4fb1a5b and 05927bb.

📒 Files selected for processing (1)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🔇 Additional comments (1)
libs/parser-sql/src/postgresParser.test.ts (1)

83-115: ⚠️ Potential issue

Improve error handling in external SQL file fetching.

The current implementation lacks:

  1. Timeout handling for fetch operations
  2. Proper validation of response status
  3. Error handling for network failures

Apply this improvement:

 await Promise.all(structures.map(async url => {
-    const sql = await fetch(url).then(res => res.text())
+    try {
+        const controller = new AbortController();
+        const timeoutId = setTimeout(() => controller.abort(), 30000);
+        
+        const response = await fetch(url, { 
+            signal: controller.signal,
+            headers: { 'Accept': 'text/plain' }
+        });
+        clearTimeout(timeoutId);
+        
+        if (!response.ok) {
+            console.error(`Failed to fetch ${url}: ${response.status} ${response.statusText}`);
+            return;
+        }
+        
+        const sql = await response.text();
+        const parsed = parsePostgresAst(sql, {strict: true});
+        if ((parsed.errors || []).length > 0) {
+            console.log(`Error in ${url}`);
+            expect(parsed.errors || []).toEqual([]);
+        }
+    } catch (error) {
+        console.error(`Error processing ${url}:`, error);
+        throw error;
+    }
-    const parsed = parsePostgresAst(sql, {strict: true})
-    if ((parsed.errors || []).length > 0) {
-        console.log(`Error in ${url}`)
-        expect(parsed.errors || []).toEqual([])
-    }
 }))

Likely invalid or redundant comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (9)
libs/parser-sql/src/postgresAst.ts (2)

89-89: Simplify foreign key action structure

The onUpdate and onDelete properties in both TableColumnFkAst and TableFkAst have redundant token information:

onUpdate?: ForeignKeyActionAst & {token: TokenInfo}

Consider moving the token into the ForeignKeyActionAst type itself.

Apply this diff to simplify the structure:

-export type ForeignKeyActionAst = { action: { kind: ForeignKeyAction, token: TokenInfo }, columns?: IdentifierAst[] }
+export type ForeignKeyActionAst = { token: TokenInfo, kind: ForeignKeyAction, columns?: IdentifierAst[] }

Also applies to: 94-94


163-163: Enhance type safety of TokenIssue

The kind property in TokenIssue is typed as string, which could lead to inconsistencies. Consider using an enum to ensure type safety.

Apply this diff:

+export enum TokenIssueKind {
+  SyntaxError = 'SyntaxError',
+  LexicalError = 'LexicalError',
+  SemanticError = 'SemanticError',
+  Warning = 'Warning'
+}

-export type TokenIssue = { message: string, kind: string, level: ParserErrorLevel }
+export type TokenIssue = { message: string, kind: TokenIssueKind, level: ParserErrorLevel }
libs/parser-sql/src/postgresBuilder.ts (3)

193-223: Reduce duplication in view creation logic.

The createView and createMaterializedView functions share significant code. Consider extracting the common logic into a shared helper function:

+function createBaseView(index: number, stmt: { schema?: IdentifierAst, name: IdentifierAst, query: SelectStatementInnerAst, columns?: IdentifierAst[] }, kind: string, entities: Entity[]): Entity {
+    return removeEmpty({
+        schema: stmt.schema?.value,
+        name: stmt.name.value,
+        kind,
+        def: selectInnerToString(stmt.query),
+        attrs: buildViewAttrs(index, stmt.query, stmt.columns, entities),
+        extra: {line: stmt.token.position.start.line, statement: index},
+    })
+}

 function createView(index: number, stmt: CreateViewStatementAst, entities: Entity[]): Entity {
-    return removeEmpty({
-        schema: stmt.schema?.value,
-        name: stmt.name.value,
-        kind: 'view' as const,
-        def: selectInnerToString(stmt.query),
-        attrs: buildViewAttrs(index, stmt.query, stmt.columns, entities),
-        extra: {line: stmt.token.position.start.line, statement: index},
-    })
+    return createBaseView(index, stmt, 'view', entities)
 }

 function createMaterializedView(index: number, stmt: CreateMaterializedViewStatementAst, entities: Entity[]): Entity {
-    return removeEmpty({
-        schema: stmt.schema?.value,
-        name: stmt.name.value,
-        kind: 'materialized view' as const,
-        def: selectInnerToString(stmt.query),
-        attrs: buildViewAttrs(index, stmt.query, stmt.columns, entities),
-        extra: {line: stmt.token.position.start.line, statement: index},
-    })
+    return createBaseView(index, stmt, 'materialized view', entities)
 }

506-507: Consider implementing entity extraction in WHERE and HAVING clauses.

The TODO comment indicates missing functionality for extracting entities from WHERE and HAVING clauses. This could lead to incomplete analysis of table relationships and column usage.

Would you like me to help implement this feature or create a GitHub issue to track this task?


56-70: Enhance error handling and reporting.

While the code collects errors during parsing, it could benefit from more detailed error messages and context. Consider:

  1. Adding error codes for different types of failures
  2. Including more context in error messages (e.g., full statement text for parsing errors)
  3. Implementing a structured error reporting system

Also applies to: 72-115

libs/parser-sql/src/postgresParser.test.ts (4)

4-31: Consider using type-only imports for AST types.

Group the AST type imports into a single type import statement to improve code organization and clarity.

-import type {
+import {
     AliasAst,
     BooleanAst,
     ColumnAst,
     // ... other AST types
     WildcardAst
-} from "./postgresAst";
+} from "./postgresAst" assert { type: "type" };

54-70: Implement tests for complex SQL statements.

The skipped test contains TODO comments for various complex SQL statements that need to be implemented, including:

  • Parameter expressions
  • Array operations
  • Case statements
  • JSON operations
  • Transaction isolation
  • Array expansion

Would you like help implementing these test cases to improve coverage?


71-76: Add missing tests for trigger creation.

The skipped test for 'plausible' indicates that trigger creation functionality needs to be implemented and tested.

Would you like assistance in implementing the trigger creation tests?


1215-1237: Enhance string handling test cases.

Consider adding tests for these edge cases:

  • Unicode characters
  • Special characters
  • Very long strings
  • Mixed escaped and unescaped quotes
test('unicode and special characters', () => {
    expect(parseRule(p => p.stringRule(), "'Hello 世界\\n\\t!'")).toEqual({
        result: string("Hello 世界\\n\\t!", 0)
    })
    expect(parseRule(p => p.stringRule(), "E'Hello\\u0020World'")).toEqual({
        result: {...string("Hello\\u0020World", 0), escaped: true}
    })
})
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 05927bb and 9dc8e7e.

📒 Files selected for processing (3)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🔇 Additional comments (11)
libs/parser-sql/src/postgresAst.ts (5)

1-10: LGTM! Well-structured file with clear conventions.

The file starts with clean imports and provides clear documentation about the AST structure conventions.


28-28: Skip - Security options in CreateFunctionStatementAst

A previous review already suggested adding SECURITY DEFINER and SECURITY INVOKER options.


153-153: Skip - Add Function to CommentObject

A previous review already suggested adding 'Function' to the CommentObject type.


120-120: 🛠️ Refactor suggestion

Enhance JSON field support in ColumnJsonAst

The field property in ColumnJsonAst only supports string and parameter types. Consider adding support for integer indices to handle JSON array access.

Apply this diff:

-export type ColumnJsonAst = { kind: JsonOp, token: TokenInfo, field: StringAst | ParameterAst }
+export type ColumnJsonAst = { kind: JsonOp, token: TokenInfo, field: StringAst | ParameterAst | IntegerAst }

Likely invalid or redundant comment.


149-149: 🛠️ Refactor suggestion

Add support for JSON path operators

The TODO comment indicates missing JSON path operators (#> and #>>). These operators are essential for PostgreSQL JSON path operations.

Apply this diff:

-export type JsonOp = '->' | '->>' // TODO: | '#>' | '#>>'
+export type JsonOp = '->' | '->>' | '#>' | '#>>'

Likely invalid or redundant comment.

libs/parser-sql/src/postgresBuilder.ts (4)

56-70: LGTM! Well-structured database builder function.

The function properly initializes the database, processes statements, and includes metadata in the extra field.


72-115: Consider using a pattern matching approach for better maintainability.

The long if-else chain could be refactored using either a switch statement or a command pattern for better maintainability. However, since there's a past review comment about this, I'll skip suggesting the specific implementation.


141-177: Consider breaking down the table creation logic.

The createTable function handles multiple concerns (PK, indexes, checks, relations). A past review comment already suggested breaking this down into smaller functions.


179-191: Fix the nullability assignment logic.

The expression assigning the null property needs parentheses for correct operator precedence as noted in a past review comment.

libs/parser-sql/src/postgresParser.test.ts (2)

1265-1267: 🛠️ Refactor suggestion

Enhance type safety in helper functions.

The operator function could benefit from stronger type safety:

-function op(kind: Operator, start: number, end?: number): OperatorAst {
+type OperatorKind = '=' | '<' | '>' | '+' | '-' | '*' | '/' | '%';
+function op(kind: OperatorKind, start: number, end?: number): OperatorAst {
     return {kind, token: token(start, end === undefined ? start + kind.length - 1 : end)}
 }

Likely invalid or redundant comment.


102-109: ⚠️ Potential issue

Improve error handling in external SQL file fetching.

The current implementation lacks proper error handling for network requests:

  1. No timeout mechanism
  2. No validation of response status
  3. No error handling for network failures

Apply this improvement:

 await Promise.all(structures.map(async url => {
-    const sql = await fetch(url).then(res => res.text())
+    try {
+        const controller = new AbortController();
+        const timeoutId = setTimeout(() => controller.abort(), 30000);
+        
+        const response = await fetch(url, { 
+            signal: controller.signal,
+            headers: { 'Accept': 'text/plain' }
+        });
+        clearTimeout(timeoutId);
+        
+        if (!response.ok) {
+            console.error(`Failed to fetch ${url}: ${response.status} ${response.statusText}`);
+            return;
+        }
+        
+        const sql = await response.text();
+        const parsed = parsePostgresAst(sql, {strict: true});
+        if ((parsed.errors || []).length > 0) {
+            console.log(`Error in ${url}`);
+            expect(parsed.errors || []).toEqual([]);
+        }
+    } catch (error) {
+        console.error(`Error processing ${url}:`, error);
+        throw error;
+    }
-    const parsed = parsePostgresAst(sql, {strict: true})
-    if ((parsed.errors || []).length > 0) {
-        console.log(`Error in ${url}`)
-        expect(parsed.errors || []).toEqual([])
-    }
 }))

Likely invalid or redundant comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (5)
libs/parser-sql/src/postgresBuilder.ts (2)

97-112: Document ignored statement types.

Several statement types are marked with // nothing comments. Consider adding explanatory comments about why these statements are intentionally ignored.

Apply this diff to improve documentation:

-    } else if (stmt.kind === 'AlterSchema') { // nothing
-    } else if (stmt.kind === 'AlterSequence') { // nothing
-    } else if (stmt.kind === 'Begin') { // nothing
+    } else if (stmt.kind === 'AlterSchema') { // Ignored: Schema alterations don't affect the entity model
+    } else if (stmt.kind === 'AlterSequence') { // Ignored: Sequences are not part of the entity model
+    } else if (stmt.kind === 'Begin') { // Ignored: Transaction controls don't affect the entity model

590-604: Enhance type safety in columnSources function.

The function could benefit from better type narrowing and explicit handling of each expression kind.

Consider using a type discriminator pattern:

function columnSources(c: ExpressionAst, sources: SelectSource[]): SelectColumnSource[] {
+    switch (c.kind) {
+        case 'Column':
+            return handleColumnExpression(c, sources)
+        case 'Wildcard':
+            return []
+        case 'Function':
+            return c.parameters.flatMap(p => columnSources(p, sources))
+        // ... handle other cases
+        default:
+            return []
+    }
}
libs/parser-sql/src/postgresParser.test.ts (3)

4-31: Consider using type-only imports for AST types.

Since these are TypeScript type definitions, using type-only imports would be more explicit and could potentially improve build times:

-import type {
+import {
     AliasAst,
     BooleanAst,
     // ... other AST types
     WildcardAst
-} from "./postgresAst";
+} from "./postgresAst" assert { type: "ts" };

54-70: Implement missing test cases for complex SQL statements.

The skipped test suite contains important test cases for complex SQL statements that are currently not implemented:

  • Array operations
  • Case expressions
  • JSON operations
  • Transaction isolation
  • Complex subqueries

Would you like help implementing these test cases and the corresponding parser functionality?


1254-1276: Expand string handling test cases.

Consider adding tests for these edge cases:

  • Unicode characters
  • Special characters
  • Very long strings
  • Mixed escaped and unescaped quotes

Add these test cases:

test('unicode and special characters', () => {
    expect(parseRule(p => p.stringRule(), "'Hello 世界\\n\\t!'")).toEqual({
        result: string("Hello 世界\\n\\t!", 0)
    })
    expect(parseRule(p => p.stringRule(), "E'Hello\\u0020World'")).toEqual({
        result: {...string("Hello\\u0020World", 0), escaped: true}
    })
})
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 9dc8e7e and d8e8ac0.

📒 Files selected for processing (4)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
  • libs/parser-sql/src/sql.test.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • libs/parser-sql/src/sql.test.ts
🔇 Additional comments (13)
libs/parser-sql/src/postgresAst.ts (7)

1-10: LGTM! Well-documented conventions.

The file structure is clean, and the conventions are clearly documented, providing good guidance for maintaining consistency across the codebase.


11-44: LGTM! Comprehensive statement type definitions.

The statement types provide a thorough representation of PostgreSQL's AST structure.


45-118: LGTM! Well-structured clause type definitions.

The clause types effectively model PostgreSQL's syntax structure.


119-139: LGTM! Comprehensive basic type definitions.

The basic parts effectively model PostgreSQL's fundamental structures.


140-149: LGTM! Complete element type definitions.

The element types comprehensively cover all PostgreSQL data types and special cases.


150-165: LGTM! Comprehensive enum definitions.

The enums effectively capture PostgreSQL's various operators and types.


166-168: LGTM! Well-structured helper types.

The helper types effectively support error tracking and token positioning.

libs/parser-sql/src/postgresBuilder.ts (6)

325-325: Implement handling for NOT NULL and DEFAULT constraints.

The TODO comment indicates missing implementation for dropping NOT NULL and DEFAULT constraints.

Would you like assistance in implementing the missing constraint handling, or should we create a GitHub issue to track this task?


499-506: LGTM! Well-structured type definitions.

The type definitions are clear, well-organized, and properly represent the SQL query structure.


72-116: ⚠️ Potential issue

Improve error handling for unrecognized statement types.

The current implementation uses isNever(stmt) for unhandled cases, which may not provide helpful feedback during runtime.

Apply this diff to improve error handling:

    } else {
-       isNever(stmt)
+       errors.push({
+           message: `Unhandled statement kind: ${stmt.kind}`,
+           line: stmt.token.position.start.line,
+           detail: stmt
+       })
    }

142-178: 🛠️ Refactor suggestion

Break down complex table creation logic.

The createTable function handles multiple concerns (PK, indexes, checks, relations) making it hard to maintain.

Consider breaking it down into smaller, focused functions as suggested in the past review.


572-573: ⚠️ Potential issue

Improve schema resolution strategy.

The comment "not sure what to do with schema :/" indicates uncertainty in schema handling.

Implement a clear schema resolution strategy:

  1. Exact schema match first
  2. Default schema fallback
  3. Error/warning for ambiguous cases

433-437: 🛠️ Refactor suggestion

Improve handling of complex expressions.

Wrapping complex expressions in backticks may not be the best approach for handling these values.

Consider implementing proper type conversion or validation as suggested in the past review.

libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Outside diff range and nitpick comments (7)
libs/parser-sql/src/postgresBuilder.test.ts (3)

10-12: Consider adding more edge cases to basic tests.

The empty input test is good, but consider adding tests for:

  • SQL with only whitespace/comments
  • Invalid SQL syntax
  • Multiple empty statements

92-99: Document unsupported SQL concepts.

The skipped test mentions that some concepts can't be handled, but doesn't specify which ones. Please document:

  1. List of unsupported SQL features
  2. Tracking issue/ticket for implementing these features
  3. Expected timeline for support

100-194: Add error cases to selectEntities tests.

The test suite thoroughly covers successful scenarios but lacks tests for error cases such as:

  • Invalid column references
  • Ambiguous column names in joins
  • Type mismatches in expressions
  • Invalid function calls
libs/parser-sql/src/postgresBuilder.ts (2)

118-128: Enhance error messages for duplicate entities and types.

The current error messages could be more informative. Consider including more context about the conflicting definitions:

-errors.push(duplicated(`Entity ${entityToId(entity)}`, prev.extra?.line ? prev.extra.line : undefined, pos))
+errors.push(duplicated(
+    `Entity ${entityToId(entity)} conflicts with existing definition`,
+    prev.extra?.line ? `Previous definition at line ${prev.extra.line}` : undefined,
+    pos
+))

Also applies to: 130-140


507-512: Implement entity extraction for WHERE and HAVING clauses.

The TODO comment indicates that entity extraction from WHERE and HAVING clauses is missing. This could lead to incomplete schema analysis.

Would you like help implementing the entity extraction for these clauses? I can provide a solution that:

  1. Traverses WHERE/HAVING expressions
  2. Extracts table and column references
  3. Updates the SelectEntities result accordingly
libs/parser-sql/src/postgresParser.test.ts (1)

1304-1306: Enhance type safety in helper functions.

The helper functions for creating AST nodes could benefit from stronger type safety:

-function op(kind: Operator, start: number, end?: number): OperatorAst {
+type OperatorKind = '=' | '<' | '>' | '+' | '-' | '*' | '/' | '%';
+function op(kind: OperatorKind, start: number, end?: number): OperatorAst {
     return {kind, token: token(start, end === undefined ? start + kind.length - 1 : end)}
 }

-function identifier(value: string, start: number, end?: number): IdentifierAst {
+function identifier(value: string, start: number, end?: number): IdentifierAst {
+    if (value.trim().length === 0) {
+        throw new Error('Identifier value cannot be empty');
+    }
     return {kind: 'Identifier', value, token: token(start, end === undefined ? start + value.length - 1 : end)}
 }

Also applies to: 1366-1368

libs/parser-sql/src/postgresAst.ts (1)

147-147: Add support for subqueries in ExpressionAst

To handle expressions that include subqueries, such as IN (SELECT ...), ExpressionAst should include SubqueryAst.

Apply this diff to include SubqueryAst in ExpressionAst:

 export type ExpressionAst = (LiteralAst | ParameterAst | ColumnAst | WildcardAst | FunctionAst | GroupAst | OperationAst | OperationLeftAst | OperationRightAst | ArrayAst | ListAst
+ | SubqueryAst
 ) & { cast?: { token: TokenInfo, type: ColumnTypeAst } }

+export type SubqueryAst = { kind: 'Subquery', select: SelectInnerAst }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d8e8ac0 and 56da7c6.

📒 Files selected for processing (4)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.test.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🔇 Additional comments (7)
libs/parser-sql/src/postgresBuilder.test.ts (3)

13-91: Break down the complex test case.

The complex test case is still monolithic and covers multiple SQL features in a single test.


197-208: Improve error handling in parse function.

The function still uses console.error and needs better error handling structure.


210-212: Add type safety to extract function.

The function still has unsafe type casting and needs proper type guards.

libs/parser-sql/src/postgresBuilder.ts (2)

194-224: LGTM! Well-structured view creation logic.

The view creation functions are well-organized with clear separation between regular and materialized views. The code effectively reuses common logic through the buildViewAttrs function.


433-437: ⚠️ Potential issue

Potential SQL injection risk in expression handling.

Wrapping complex expressions in backticks could be dangerous if these values are later used in SQL queries. Consider using parameterized queries or proper escaping mechanisms.

libs/parser-sql/src/postgresParser.test.ts (1)

1-33: LGTM! Well-organized imports and type definitions.

The imports are properly structured with clear type imports separation and necessary testing utilities.

libs/parser-sql/src/postgresAst.ts (1)

179-180: Verify the correctness of including 'Interval' in OperatorLeft

The inclusion of 'Interval' in OperatorLeft might be inappropriate since INTERVAL is typically used as a type or function rather than a unary operator in PostgreSQL.

Run the following script to check the usage of INTERVAL as an operator:

libs/parser-sql/src/postgresBuilder.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Outdated Show resolved Hide resolved
libs/parser-sql/src/postgresAst.ts Show resolved Hide resolved
export type NameAst = { token: TokenInfo, name: IdentifierAst }
export type FilterAst = { token: TokenInfo, condition: ExpressionAst }
export type ObjectNameAst = { schema?: IdentifierAst, name: IdentifierAst }
export type ExpressionAst = (LiteralAst | ParameterAst | ColumnAst | WildcardAst | FunctionAst | GroupAst | OperationAst | OperationLeftAst | OperationRightAst | ArrayAst | ListAst) & { cast?: { token: TokenInfo, type: ColumnTypeAst } }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add support for CASE expressions in ExpressionAst

CASE expressions are commonly used in SQL and need to be supported in the AST. Currently, ExpressionAst does not include a CaseAst type.

Apply this diff to include CaseAst in ExpressionAst:

 export type ExpressionAst = (LiteralAst | ParameterAst | ColumnAst | WildcardAst | FunctionAst | GroupAst | OperationAst | OperationLeftAst | OperationRightAst | ArrayAst | ListAst
+ | CaseAst
 ) & { cast?: { token: TokenInfo, type: ColumnTypeAst } }

+export type CaseAst = { kind: 'Case', token: TokenInfo, whenClauses: WhenClauseAst[], else?: ExpressionAst }
+export type WhenClauseAst = { token: TokenInfo, condition: ExpressionAst, result: ExpressionAst }

Committable suggestion skipped: line range outside the PR's diff.

// enums
export type Operator = '+' | '-' | '*' | '/' | '%' | '^' | '&' | '|' | '#' | '<<' | '>>' | '=' | '<' | '>' | '<=' | '>=' | '<>' | '!=' | '||' | '~' | '~*' | '!~' | '!~*' | 'Is' | 'IsNot' | 'Like' | 'NotLike' | 'In' | 'NotIn' | 'DistinctFrom' | 'NotDistinctFrom' | 'Or' | 'And'
export type OperatorLeft = 'Not' | 'Interval' | '~'
export type OperatorRight = 'IsNull' | 'NotNull'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Extend OperatorRight to include additional IS operators

OperatorRight currently only includes IsNull and NotNull. To fully support PostgreSQL's IS operators, consider adding IsTrue, IsNotTrue, IsFalse, IsNotFalse, IsUnknown, and IsNotUnknown.

Apply this diff to extend OperatorRight:

-export type OperatorRight = 'IsNull' | 'NotNull'
+export type OperatorRight = 'IsNull' | 'NotNull' | 'IsTrue' | 'IsNotTrue' | 'IsFalse' | 'IsNotFalse' | 'IsUnknown' | 'IsNotUnknown'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type OperatorRight = 'IsNull' | 'NotNull'
export type OperatorRight = 'IsNull' | 'NotNull' | 'IsTrue' | 'IsNotTrue' | 'IsFalse' | 'IsNotFalse' | 'IsUnknown' | 'IsNotUnknown'

export type SelectStatementAst = { kind: 'Select' } & SelectInnerAst
export type SetStatementAst = { kind: 'Set', token: TokenInfo, scope?: SetModeAst, parameter: IdentifierAst, equal?: SetAssignAst, value: SetValueAst }
export type ShowStatementAst = { kind: 'Show', token: TokenInfo, name: IdentifierAst }
export type UpdateStatementAst = { kind: 'Update', token: TokenInfo, only?: TokenAst, schema?: IdentifierAst, table: IdentifierAst, descendants?: TokenAst, alias?: AliasAst, columns: ColumnUpdateAst[], where?: WhereClauseAst, returning?: SelectClauseAst }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add support for FROM clause in UpdateStatementAst

The current definition of UpdateStatementAst lacks a from property, which is essential for supporting updates using joins or subqueries via the FROM clause in PostgreSQL.

Apply this diff to include the from clause:

 export type UpdateStatementAst = { kind: 'Update', token: TokenInfo, only?: TokenAst, schema?: IdentifierAst, table: IdentifierAst, descendants?: TokenAst, alias?: AliasAst, columns: ColumnUpdateAst[], where?: WhereClauseAst, returning?: SelectClauseAst
+  , from?: FromClauseAst
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export type UpdateStatementAst = { kind: 'Update', token: TokenInfo, only?: TokenAst, schema?: IdentifierAst, table: IdentifierAst, descendants?: TokenAst, alias?: AliasAst, columns: ColumnUpdateAst[], where?: WhereClauseAst, returning?: SelectClauseAst }
export type UpdateStatementAst = { kind: 'Update', token: TokenInfo, only?: TokenAst, schema?: IdentifierAst, table: IdentifierAst, descendants?: TokenAst, alias?: AliasAst, columns: ColumnUpdateAst[], where?: WhereClauseAst, returning?: SelectClauseAst, from?: FromClauseAst }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (1)
libs/parser-sql/src/postgresParser.test.ts (1)

34-113: Consider grouping test files by complexity level.

The test organization is good, but could be improved by:

  1. Grouping simple tests first (empty, basic syntax)
  2. Following with moderate complexity tests (single statements)
  3. Ending with complex tests (multiple statements, edge cases)
  4. Moving external file tests to a separate test file

This would make it easier to:

  • Quickly identify failing tests
  • Maintain test cases
  • Add new test cases
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 56da7c6 and 9bb3a4a.

📒 Files selected for processing (3)
  • libs/parser-sql/src/postgresAst.ts (1 hunks)
  • libs/parser-sql/src/postgresBuilder.ts (1 hunks)
  • libs/parser-sql/src/postgresParser.test.ts (1 hunks)
🔇 Additional comments (7)
libs/parser-sql/src/postgresBuilder.ts (7)

56-70: LGTM! Well-structured database initialization with performance tracking.

The function properly handles database initialization, statement processing, and includes useful metadata about parsing performance.


72-116: Consider using a pattern matching approach for better maintainability.

The current if-else chain with many no-op cases makes the code harder to maintain. A pattern matching approach would improve readability and maintainability.


118-140: LGTM! Robust entity and type management.

The functions properly handle:

  • Duplicate detection and merging
  • Error reporting with position tracking
  • Consistent implementation between entity and type handling

142-178: Break down complex table creation logic for better maintainability.

The function handles multiple concerns (PK, indexes, checks, relations) making it hard to maintain.


325-325: Implement handling for NOT NULL and DEFAULT constraints.

The TODO comment indicates missing implementation for these constraints.


573-574: ⚠️ Potential issue

Improve schema resolution strategy.

The comment indicates uncertainty in schema handling which could lead to issues in multi-schema environments.


434-438: 🛠️ Refactor suggestion

Improve handling of complex expressions.

Wrapping complex expressions in backticks may not be the best approach. Consider implementing proper type conversion or validation.

libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresParser.test.ts Show resolved Hide resolved
libs/parser-sql/src/postgresAst.ts Outdated Show resolved Hide resolved
@loicknuchel loicknuchel merged commit 57feaf1 into main Nov 9, 2024
4 of 5 checks passed
@loicknuchel loicknuchel deleted the sql-parser branch November 9, 2024 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant