Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve AML #328

Merged
merged 6 commits into from
Oct 22, 2024
Merged

Improve AML #328

merged 6 commits into from
Oct 22, 2024

Conversation

loicknuchel
Copy link
Contributor

@loicknuchel loicknuchel commented Oct 22, 2024

Summary by CodeRabbit

  • New Features

    • Updated the Azimutt AML library to version 0.1.8, enhancing functionality and performance.
    • Introduced new types for attribute constraints, improving clarity in attribute definitions.
  • Bug Fixes

    • Enhanced error handling for duplicated entities and relations, providing more detailed error messages.
  • Documentation

    • Improved clarity and consistency in error reporting within test cases.
  • Refactor

    • Restructured parser rules and token definitions for better readability and maintainability.
    • Refined handling of attributes and constraints in the AML parser and builder.

@loicknuchel loicknuchel marked this pull request as ready for review October 22, 2024 19:59
Copy link

coderabbitai bot commented Oct 22, 2024

Walkthrough

The changes in this pull request involve updates to the Azimutt AML library, including a version increment from 0.1.7 to 0.1.8 in related files. Key modifications include renaming properties in type definitions, enhancing type specificity, and restructuring the Abstract Syntax Tree (AST) representation. The parser and builder files have been adjusted to accommodate these new definitions, improving error handling and the overall clarity of the code. Test files have also been updated to reflect these changes, ensuring consistency in error reporting and token management.

Changes

File Path Change Summary
backend/lib/azimutt_web/templates/website/converters/_editors-script.html.heex Updated script source for Azimutt AML library from version 0.1.7 to 0.1.8.
libs/aml/package.json Updated version from 0.1.7 to 0.1.8.
libs/aml/src/amlAst.ts Renamed statement to kind, enhanced type specificity, and restructured various types and properties.
libs/aml/src/amlBuilder.ts Modified type checks to use kind, refined attribute constraint handling, and improved error handling.
libs/aml/src/amlGenerator.test.ts Updated error handling and token management, replaced tokenPosition with token.
libs/aml/src/amlParser.test.ts Restructured AST representation, updated parsing rules, and refined attribute handling.
libs/aml/src/amlParser.ts Renamed tokens, restructured parser rules, and improved error handling and utility functions.

Poem

In the garden of code, where the rabbits play,
New versions hop in, brightening the day.
With kind in the air, and types all aligned,
The AST blooms, with clarity combined.
So let’s celebrate changes, both big and small,
For in this code burrow, we’re having a ball! 🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🧹 Outside diff range and nitpick comments (5)
libs/aml/src/amlGenerator.test.ts (1)

Line range hint 432-509: Consider grouping legacy syntax errors by type

The legacy syntax error tests contain multiple similar cases. Consider grouping them by error type (e.g., foreign key syntax, schema notation) to improve test organization and maintenance.

describe('legacy syntax errors', () => {
  describe('foreign key notation', () => {
    test('fk keyword', () => {
      // Test cases for 'fk' -> '->'
    });
    test('table.column notation', () => {
      // Test cases for 'table.column' -> 'table(column)'
    });
  });
});
libs/aml/src/amlBuilder.ts (2)

304-304: Address TODO: Implement error handling for invalid relations

A TODO comment indicates that invalid relations are being ignored. It's important to handle this scenario to prevent silent failures.

Would you like assistance in implementing proper error handling by reporting an error instead of returning undefined? I can help modify the function to add errors to the errors array.


288-291: Improve Readability of 'buildIndexes' Function

The buildIndexes function has complex logic that could be simplified for better readability.

Consider adding inline comments or breaking down complex expressions to enhance understanding.

libs/aml/src/amlParser.test.ts (2)

704-706: Improve error messages in integerRule and decimalRule tests

The tests for integerRule and decimalRule correctly expect errors when parsing invalid inputs like '1.2' and 'bad'. However, the error messages could be more descriptive to aid in debugging.

Consider enhancing the parser to provide clearer error messages specifying why the input is invalid. This can help developers quickly identify and fix issues during testing.


834-838: Extend columnValueRule tests with additional value types

The current tests for columnValueRule cover integers, decimals, and identifiers. To ensure comprehensive coverage, consider adding test cases for other value types such as booleans, nulls, and expressions.

This addition will strengthen the test suite by verifying the parser's ability to handle all possible value types.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between fbb1f52 and 4c4406b.

📒 Files selected for processing (7)
  • backend/lib/azimutt_web/templates/website/converters/_editors-script.html.heex (1 hunks)
  • libs/aml/package.json (1 hunks)
  • libs/aml/src/amlAst.ts (1 hunks)
  • libs/aml/src/amlBuilder.ts (14 hunks)
  • libs/aml/src/amlGenerator.test.ts (15 hunks)
  • libs/aml/src/amlParser.test.ts (4 hunks)
  • libs/aml/src/amlParser.ts (5 hunks)
✅ Files skipped from review due to trivial changes (2)
  • backend/lib/azimutt_web/templates/website/converters/_editors-script.html.heex
  • libs/aml/package.json
🧰 Additional context used
🪛 Biome
libs/aml/src/amlParser.ts

[error] 69-69: The regular expression includes this negated empty character class.

Negated empty character classes match anything.
If you want to match against [, escape it [.
Otherwise, remove the character class or fill it.

(lint/correctness/noEmptyCharacterClassInRegex)


[error] 78-78: Do not shadow the global "String" property.

Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.

(lint/suspicious/noShadowRestrictedNames)


[error] 422-422: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)


[error] 154-154: This aliasing of this is unnecessary.

Arrow functions inherits this from their enclosing scope.
Safe fix: Use this instead of an alias.

(lint/complexity/noUselessThisAlias)


[error] 670-670: isNaN is unsafe. It attempts a type coercion. Use Number.isNaN instead.

See the MDN documentation for more details.
Unsafe fix: Use Number.isNaN instead.

(lint/suspicious/noGlobalIsNan)

🔇 Additional comments (20)
libs/aml/src/amlGenerator.test.ts (5)

3-3: LGTM: Import renamed from tokenPosition to TokenPosition

The import change from tokenPosition to TokenPosition follows TypeScript naming conventions for type imports.


179-181: LGTM: Updated error handling for duplicate inline type

The test case correctly validates the warning message for duplicate type definitions using the new token position format.


193-194: LGTM: Updated error handling for mismatched tokens

The test case properly validates error messages for mismatched tokens with accurate position information.


667-669: LGTM: Well-structured token position helper function

The token function provides a clean interface for creating token positions with optional line and column information. The default values for line and column are sensibly chosen.


580-585: LGTM: Comprehensive legacy warning tests

The test cases thoroughly validate the warning messages for legacy syntax, ensuring proper migration guidance is provided to users.

libs/aml/src/amlAst.ts (4)

10-12: LGTM!

The changes to AmlAst and StatementAst correctly reflect the updated statement types and improve the overall readability of the code.


20-28: LGTM!

The updates to the attribute AST types enhance clarity by using more descriptive names and structures. The separation of attribute constraints into specific types improves type safety and code maintainability.


40-41: LGTM!

The revisions to NamespaceRefAst and EntityRefAst align with the new IdentifierAst type, ensuring consistency across the codebase.


49-51: LGTM!

The adjustments to PropertyAst, PropertyValueAst, and DocAst appropriately utilize IdentifierAst and other new AST types, improving the consistency of the property definitions.

libs/aml/src/amlBuilder.ts (8)

94-99: Ensure Consistent Handling of Statement Kinds

The code checks for stmt.kind to process different statement types.

The changes correctly update the checks from s.statement to s.kind, ensuring consistency with the updated AST structure.


64-69: Efficient Filtering and Processing of Statements

The filtering and processing of AST statements are correctly updated to use s.kind.

These changes improve the clarity and correctness of the code.


351-354: Ensure All Type Kinds Are Handled in 'buildTypeContent'

The function covers all defined TypeContentAst kinds.

The exhaustive checks using if statements and isNever ensure type safety.


367-371: Consistent Handling of Property Values

The buildPropValue function handles different PropertyValueAst kinds consistently.

The code correctly maps each kind to the appropriate return value.


Line range hint 304-315: Consider Additional Validation in 'buildRelation' Function

Ensure that all necessary validations are performed before building a relation.

Would you like to verify if there are other potential edge cases that need handling in this function?


213-213: ⚠️ Potential issue

Syntax Error: Replace 'and' with '&&' in logical conditions

In TypeScript, the logical AND operator is &&, not and. Using and will cause a syntax error.

Apply this diff to fix the syntax error:

- def: e.properties?.flatMap(p => p.key.value === 'view' and p.value and !Array.isArray(p.value) and p.value.kind === 'Identifier' ? [p.value.value.replaceAll(/\\n/g, '\n')] : [])[0],
+ def: e.properties?.flatMap(p => p.key.value === 'view' && p.value && !Array.isArray(p.value) && p.value.kind === 'Identifier' ? [p.value.value.replaceAll(/\\n/g, '\n')] : [])[0],

Likely invalid or redundant comment.


235-235: ⚠️ Potential issue

Syntax Error: Replace 'and' with '&&' in conditional statements

The logical AND operator should be && in TypeScript. Using and will result in a syntax error.

Apply this diff to correct the operator:

- const numType = a.enumValues and a.enumValues.length <= 2 and a.enumValues.every(v => v.kind === 'Integer') ? '(' + a.enumValues.map(stringifyAttrValue).join(',') + ')' : '' // types with num parameter (varchar(10), decimal(2,3)...)
+ const numType = a.enumValues && a.enumValues.length <= 2 && a.enumValues.every(v => v.kind === 'Integer') ? '(' + a.enumValues.map(stringifyAttrValue).join(',') + ')' : '' // types with num parameter (varchar(10), decimal(2,3)...)

Likely invalid or redundant comment.


297-301: Verify Correctness of Entity Reference Building

Ensure that buildEntityRef correctly handles aliases and namespaces.

Run the following script to check all usages of buildEntityRef:

✅ Verification successful

Let me gather more information about the buildEntityRef function and its usage context.

The previous output showed that buildEntityRef is used in three places:

  1. Building source entity reference in relation building
  2. Building reference entity in relation building
  3. The function implementation itself

Let me get more context about these usages and the data structures involved to verify its correctness.


Let me try a different approach to get the relevant code context.

The previous output showed that all the code is in amlBuilder.ts. Let's get more context about the function implementation and its type dependencies to verify its correctness in handling aliases and namespaces.


Let me fix the regex and try one more time to get the type definitions.

From the previous output, we can see that buildEntityRef handles three cases correctly:

  1. Full qualification (database/catalog/schema) - returns the full path without alias
  2. Aliased entities - returns the resolved entity reference with the alias
  3. Default case - uses the provided namespace with the entity name

Let me get the type definitions to confirm the implementation matches the type contracts.


Based on the code analysis, I can now generate the final response.

buildEntityRef Implementation Verified as Correct

The function correctly handles all three cases for entity references:

  1. Fully qualified paths (with database/catalog/schema) are preserved without aliases
  2. Aliased entities are properly resolved using the aliases dictionary
  3. Default case correctly combines the provided namespace with the entity name

The implementation aligns with its usage in both relation building contexts:

  • Source entity references (with namespace inheritance)
  • Reference entity lookups (with empty namespace, which is intentional)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all calls to 'buildEntityRef' and verify parameters.

rg 'buildEntityRef\(' -A 1

Length of output: 1127


Script:

#!/bin/bash
# Get the implementation of buildEntityRef function
ast-grep --pattern 'function buildEntityRef($_) {
  $$$
}'

# Get the EntityRefAst interface definition
ast-grep --pattern 'interface EntityRefAst {
  $$$
}'

# Get any tests related to buildEntityRef
rg 'buildEntityRef' -l

Length of output: 6701


Script:

#!/bin/bash
# Get the content of amlBuilder.ts focusing on buildEntityRef and related code
rg -A 20 -B 5 'function buildEntityRef' libs/aml/src/amlBuilder.ts

# Get the type definitions and interfaces used by buildEntityRef
rg 'type (Entity|EntityRef|Namespace) =' libs/aml/src/amlBuilder.ts
rg 'interface.*Entity.*{' -A 5 libs/aml/src/amlBuilder.ts

Length of output: 1597


Script:

#!/bin/bash
# Get type definitions with fixed regex
rg 'type (Entity|EntityRef|Namespace) = ' libs/aml/src/amlBuilder.ts

# Get interface definitions with fixed regex
rg -A 5 'interface.*Entity' libs/aml/src/amlBuilder.ts

# Get the usage context of EntityRef
rg 'EntityRef' -A 2 -B 2 libs/aml/src/amlBuilder.ts

Length of output: 3845

libs/aml/src/amlParser.test.ts (3)

723-724: Enhance error handling in commentRule parsing

The commentRule test for input 'bad' expects an error due to incorrect syntax. Verify that the error message is informative and guides the developer towards the correct syntax.

Ensure that the parser provides a helpful error message indicating that the expected token is a comment (e.g., # a comment) and not arbitrary text.


845-895: Validate nestAttributes function behavior with nested attributes

The nestAttributes function is being tested with nested attributes. Review the test cases to ensure they cover various nesting scenarios and edge cases.

Ensure that the function correctly nests attributes at different depths and that path constructions are accurate. Consider adding more complex nesting cases if necessary.


28-37: Ensure consistent use of 'kind' property in AST nodes

The changes replace the statement property with kind in the AST nodes within the test cases. Please verify that all occurrences of statement have been updated to kind throughout the codebase to maintain consistency.

Run the following script to check for any remaining instances of statement in AST node definitions:

✅ Verification successful

Let me try a different approach to verify this change by searching for both the old and new patterns in the codebase.


Based on the extensive search results, I can now provide a definitive response about the usage of the kind property in the codebase.

The change from statement to kind in AST nodes is consistent with the codebase

The search results show that:

  1. The kind property is widely used throughout the codebase as a discriminator for different types of AST nodes and other entities
  2. There are no remaining instances of statement being used as a property for AST node type discrimination
  3. The statement property appears in some places but is used for different purposes (like line numbering or SQL statements)

The change aligns with the codebase's conventions where kind is consistently used to identify the type of nodes in AST structures, making the code more uniform and maintainable.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find usages of 'statement' property in AST nodes.

rg --type=typescript "statement\s*:" --glob "!node_modules/*"

Length of output: 98


Script:

#!/bin/bash
# Search for AST nodes with 'statement' or 'kind' property in TypeScript and JavaScript files
rg "(?:statement|kind)\s*:" --type-add 'ts:*.{ts,tsx}' --type=ts -A 2

# Also search for type definitions that might contain these properties
ast-grep --pattern 'interface $_ {
  $$$
  statement: $_
  $$$
}'

ast-grep --pattern 'interface $_ {
  $$$
  kind: $_
  $$$
}'

# Look for test files that might contain AST assertions
fd ".*\.test\.(ts|js|tsx|jsx)$" --exec grep -l "statement\|kind"

Length of output: 174336

Comment on lines +266 to +298
result: {entities: [users, {name: 'posts', extra: {line: 4, statement: 2}}], extra: {}},
errors: [{message: "Expecting: one of these possible Token sequences:\n 1. [Dash]\n 2. [LowerThan]\n 3. [GreaterThan]\nbut found: '\n'", kind: 'NoViableAltException', level: 'error', ...token(39, 39, 5, 5, 15, 15)}]
})
expect(parseAmlTest('users\n id int pk\n\nposts\n author int ->\n')).toEqual({
result: {entities: [users, {name: 'posts', attrs: [{name: 'author', type: 'int'}], extra: {line: 4, statement: 2}}], extra: {}},
errors: [{message: "Expecting token of type --> Identifier <-- but found --> '\n' <--", kind: 'MismatchedTokenException', level: 'error', ...tokenPosition(40, 40, 5, 16, 5, 16)}]
errors: [{message: "Expecting token of type --> Identifier <-- but found --> '\n' <--", kind: 'MismatchedTokenException', level: 'error', ...token(40, 40, 5, 5, 16, 16)}]
})
expect(parseAmlTest('users\n id int pk\n\nposts\n author int -> users\n')).toEqual({
result: {entities: [users, {name: 'posts', attrs: [{name: 'author', type: 'int'}], extra: {line: 4, statement: 2}}], relations: [{src: {entity: 'posts', attrs: [['author']]}, ref: {entity: 'users', attrs: [['id']]}, extra: {line: 5, statement: 2, natural: 'ref', inline: true}}], extra: {}},
})
expect(parseAmlTest('users\n id int pk\n\nposts\n author int -> users(\n')).toEqual({
result: {entities: [users, {name: 'posts', attrs: [{name: 'author', type: 'int'}], extra: {line: 4, statement: 2}}], extra: {}},
errors: [{message: "Expecting: expecting at least one iteration which starts with one of these possible Token sequences::\n <[WhiteSpace] ,[Identifier]>\nbut found: '\n'", kind: 'EarlyExitException', level: 'error', ...tokenPosition(47, 47, 5, 23, 5, 23)}]
result: {entities: [users, {name: 'posts', extra: {line: 4, statement: 2}}], extra: {}},
errors: [{message: "Expecting: expecting at least one iteration which starts with one of these possible Token sequences::\n <[WhiteSpace] ,[Identifier]>\nbut found: '\n'", kind: 'EarlyExitException', level: 'error', ...token(47, 47, 5, 5, 23, 23)}]
})
expect(parseAmlTest('users\n id int pk\n\nposts\n author int -> users(id\n')).toEqual({
result: {entities: [users, {name: 'posts', attrs: [{name: 'author', type: 'int'}], extra: {line: 4, statement: 2}}], relations: [{src: {entity: 'posts', attrs: [['author']]}, ref: {entity: 'users', attrs: [['id']]}, extra: {line: 5, statement: 2, inline: true}}], extra: {}},
errors: [{message: "Expecting token of type --> RParen <-- but found --> '\n' <--", kind: 'MismatchedTokenException', level: 'error', ...tokenPosition(49, 49, 5, 25, 5, 25)}]
errors: [{message: "Expecting token of type --> ParenRight <-- but found --> '\n' <--", kind: 'MismatchedTokenException', level: 'error', ...token(49, 49, 5, 5, 25, 25)}]
})
expect(parseAmlTest('users\n id int pk\n\nposts\n author int -> users(id)\n')).toEqual({
result: {entities: [users, {name: 'posts', attrs: [{name: 'author', type: 'int'}], extra: {line: 4, statement: 2}}], relations: [{src: {entity: 'posts', attrs: [['author']]}, ref: {entity: 'users', attrs: [['id']]}, extra: {line: 5, statement: 2, inline: true}}], extra: {}},
})

expect(parseAmlTest('users\n id int pk\n\nposts\n author int - users(id)\n')).toEqual({
result: {entities: [users, {name: 'posts', attrs: [{name: 'author', type: 'int'}], extra: {line: 4, statement: 2}}], relations: [{src: {entity: 'posts', attrs: [['author']]}, ref: {entity: 'users', attrs: [['id']]}, extra: {line: 5, statement: 2, inline: true}}], extra: {}},
errors: [{message: "Expecting: one of these possible Token sequences:\n 1. [Dash]\n 2. [LowerThan]\n 3. [GreaterThan]\nbut found: ' '", kind: 'NoViableAltException', level: 'error', ...tokenPosition(39, 39, 5, 15, 5, 15)}]
errors: [{message: "Expecting: one of these possible Token sequences:\n 1. [Dash]\n 2. [LowerThan]\n 3. [GreaterThan]\nbut found: ' '", kind: 'NoViableAltException', level: 'error', ...token(39, 39, 5, 5, 15, 15)}]
})
expect(parseAmlTest('users\n id int pk\n\nposts\n author int users(id)\n')).toEqual({
result: {entities: [users, {name: 'posts', attrs: [{name: 'author', type: 'int'}], extra: {line: 4, statement: 2}}, {name: 'id', extra: {line: 5, statement: 3}}], extra: {}},
// TODO handle error better to not generate a fake entity (id)
errors: [
{message: "Expecting token of type --> NewLine <-- but found --> 'users' <--", kind: 'MismatchedTokenException', level: 'error', ...tokenPosition(39, 43, 5, 15, 5, 19)},
{message: "Expecting token of type --> NewLine <-- but found --> '(' <--", kind: 'MismatchedTokenException', level: 'error', ...tokenPosition(44, 44, 5, 20, 5, 20)},
{message: "Expecting token of type --> NewLine <-- but found --> ')' <--", kind: 'MismatchedTokenException', level: 'error', ...tokenPosition(47, 47, 5, 23, 5, 23)}
{message: "Expecting token of type --> NewLine <-- but found --> 'users' <--", kind: 'MismatchedTokenException', level: 'error', ...token(39, 43, 5, 5, 15, 19)},
{message: "Expecting token of type --> NewLine <-- but found --> '(' <--", kind: 'MismatchedTokenException', level: 'error', ...token(44, 44, 5, 5, 20, 20)},
{message: "Expecting token of type --> NewLine <-- but found --> ')' <--", kind: 'MismatchedTokenException', level: 'error', ...token(47, 47, 5, 5, 23, 23)}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider using constants for error messages

The error messages are duplicated across multiple test cases. Consider extracting them into constants to improve maintainability and reduce the risk of typos.

const ERROR_MESSAGES = {
  NO_VIABLE_ALT: "Expecting: one of these possible Token sequences:\n  1. [Dash]\n  2. [LowerThan]\n  3. [GreaterThan]\nbut found:",
  MISMATCHED_TOKEN: "Expecting token of type --> NewLine <-- but found -->"
} as const;

Comment on lines +13 to +17
export type NamespaceStatement = { kind: 'Namespace', line: number, schema?: IdentifierAst, catalog?: IdentifierAst, database?: IdentifierAst } & ExtraAst
export type EntityStatement = { kind: 'Entity', name: IdentifierAst, view?: TokenInfo, alias?: IdentifierAst, attrs?: AttributeAstNested[] } & NamespaceRefAst & ExtraAst
export type RelationStatement = { kind: 'Relation', src: AttributeRefCompositeAst, srcCardinality: RelationCardinalityAst, polymorphic?: RelationPolymorphicAst, refCardinality: RelationCardinalityAst, ref: AttributeRefCompositeAst } & ExtraAst & { warning?: TokenInfo }
export type TypeStatement = { kind: 'Type', name: IdentifierAst, content?: TypeContentAst } & NamespaceRefAst & ExtraAst
export type EmptyStatement = { kind: 'Empty', comment?: CommentAst }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider including ExtraAst in EmptyStatement for consistency.

All other statement types include & ExtraAst to incorporate additional properties like properties, doc, and comment. Including ExtraAst in EmptyStatement would ensure a consistent structure across all statement types and allow for future extensions.

export type AttributeRefAst = EntityRefAst & { attr: AttributePathAst, warning?: TokenInfo }
export type AttributeRefCompositeAst = EntityRefAst & { attrs: AttributePathAst[], warning?: TokenInfo }
export type AttributeValueAst = NullToken | DecimalToken | IntegerToken | BooleanToken | ExpressionToken | IdentifierToken // TODO: add date
export type AttributeValueAst = NullAst | DecimalAst | IntegerAst | BooleanAst | ExpressionAst | IdentifierAst // TODO: add date
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Address the TODO: Add DateAst to AttributeValueAst.

There's a TODO comment indicating that DateAst should be added to the AttributeValueAst union type. If date values are intended to be supported, consider defining DateAst and including it to handle date attributes appropriately.

Would you like assistance in implementing DateAst and integrating it into AttributeValueAst?

Comment on lines +54 to +60
export type ExpressionAst = { kind: 'Expression', token: TokenInfo, value: string }
export type IdentifierAst = { kind: 'Identifier', token: TokenInfo, value: string, quoted?: boolean }
export type IntegerAst = { kind: 'Integer', token: TokenInfo, value: number }
export type DecimalAst = { kind: 'Decimal', token: TokenInfo, value: number }
export type BooleanAst = { kind: 'Boolean', token: TokenInfo, value: boolean }
export type NullAst = { kind: 'Null', token: TokenInfo }
export type CommentAst = { kind: 'Comment', token: TokenInfo, value: string }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider introducing a base interface for AST nodes to reduce duplication.

Currently, each AST node type includes kind: string and token: TokenInfo. To enhance code maintainability and reduce repetition, consider creating a base interface:

interface BaseAst {
  kind: string;
  token: TokenInfo;
}

Then, extend this interface in your AST node types:

export type ExpressionAst = BaseAst & { value: string };
export type IdentifierAst = BaseAst & { value: string; quoted?: boolean };
export type IntegerAst = BaseAst & { value: number };
// ...and so on for other AST node types.

This approach ensures consistency across your AST nodes and facilitates future extensions.

Comment on lines +338 to +339
type: {...namespace, name: a.type.value, values: a.enumValues.map(stringifyAttrValue), extra: {line: a.enumValues[0].token.position.start.line, statement, inline: true}},
position: mergePositions(a.enumValues.map(v => v.token))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Potential Undefined Value: Check if 'a.enumValues' is defined before accessing

There is a risk of a.enumValues being undefined, which would cause a runtime error when accessing a.enumValues[0].

Consider adding a null check or using optional chaining:

- type: {...namespace, name: a.type.value, values: a.enumValues.map(stringifyAttrValue), extra: {line: a.enumValues[0].token.position.start.line, statement, inline: true}},
+ type: {...namespace, name: a.type.value, values: a.enumValues?.map(stringifyAttrValue), extra: {line: a.enumValues?.[0]?.token.position.start.line, statement, inline: true}},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
type: {...namespace, name: a.type.value, values: a.enumValues.map(stringifyAttrValue), extra: {line: a.enumValues[0].token.position.start.line, statement, inline: true}},
position: mergePositions(a.enumValues.map(v => v.token))
type: {...namespace, name: a.type.value, values: a.enumValues?.map(stringifyAttrValue), extra: {line: a.enumValues?.[0]?.token.position.start.line, statement, inline: true}},
position: mergePositions(a.enumValues.map(v => v.token))


// values
const Null = createToken({ name: 'Null', pattern: /null/i })
const Decimal = createToken({ name: 'Decimal', pattern: /\d+\.\d+/ })
const Integer = createToken({ name: 'Integer', pattern: /\d+/, longer_alt: Decimal })
const String = createToken({ name: 'String', pattern: /'([^\\']|\\\\|\\')*'/ })
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Rename the 'String' token variable to avoid shadowing the global 'String' object.

Declaring a variable named String shadows the global String constructor, which can cause confusion and potential bugs. It's advisable to rename it to avoid this shadowing.

Apply this diff to rename the variable:

-const String = createToken({ name: 'String', pattern: /'([^\\']|\\\\|\\')*'/ })
+const StringToken = createToken({ name: 'String', pattern: /'([^\\']|\\\\|\\')*'/ })

Remember to update all references to String throughout the code.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const String = createToken({ name: 'String', pattern: /'([^\\']|\\\\|\\')*'/ })
const StringToken = createToken({ name: 'String', pattern: /'([^\\']|\\\\|\\')*'/ })
🧰 Tools
🪛 Biome

[error] 78-78: Do not shadow the global "String" property.

Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.

(lint/suspicious/noShadowRestrictedNames)

const WhiteSpace = createToken({name: 'WhiteSpace', pattern: /[ \t]+/})
const Identifier = createToken({ name: 'Identifier', pattern: /\b[a-zA-Z_][a-zA-Z0-9_#]*\b|"([^\\"]|\\\\|\\"|\\n)*"/ })
const Expression = createToken({ name: 'Expression', pattern: /`[^`]+`/ })
const Comment = createToken({ name: 'Comment', pattern: /#[^\n]*/ })
const Doc = createToken({ name: 'Doc', pattern: /\|(\s+"([^\\"]|\\\\|\\")*"|([^ ]#|[^#\n])*)/ }) // # is included in doc if not preceded by a space
const DocMultiline = createToken({ name: 'DocMultiline', pattern: /\|\|\|[^]*?\|\|\|/, line_breaks: true })
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix the regular expression in 'DocMultiline' token to avoid the negated empty character class.

The regular expression /\|\|\|[^]*?\|\|\|/ includes a negated empty character class [^], which matches any character and can lead to unexpected behavior. It's recommended to replace [^]*? with [\s\S]*? to correctly match any character, including newlines.

Apply this diff to fix the regular expression:

-const DocMultiline = createToken({ name: 'DocMultiline', pattern: /\|\|\|[^]*?\|\|\|/, line_breaks: true })
+const DocMultiline = createToken({ name: 'DocMultiline', pattern: /\|\|\|[\s\S]*?\|\|\|/, line_breaks: true })
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const DocMultiline = createToken({ name: 'DocMultiline', pattern: /\|\|\|[^]*?\|\|\|/, line_breaks: true })
const DocMultiline = createToken({ name: 'DocMultiline', pattern: /\|\|\|[\s\S]*?\|\|\|/, line_breaks: true })
🧰 Tools
🪛 Biome

[error] 69-69: The regular expression includes this negated empty character class.

Negated empty character classes match anything.
If you want to match against [, escape it [.
Otherwise, remove the character class or fill it.

(lint/correctness/noEmptyCharacterClassInRegex)

integerRule: () => IntegerAst
decimalRule: () => DecimalAst
booleanRule: () => BooleanAst
nullRule: () => NullAst

constructor(tokens: TokenType[], recovery: boolean) {
super(tokens, {recoveryEnabled: recovery})
const $ = this

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Remove unnecessary aliasing of this to $ in the constructor.

In the constructor of AmlParser, aliasing this to $ is unnecessary because arrow functions inherit this from their enclosing scope. You can use this directly instead.

Apply this diff to remove the aliasing:

-const $ = this

And update all occurrences of $ to this within the class.

Committable suggestion was skipped due to low confidence.

Comment on lines +613 to +616
nesting: {depth: 0, token: token(0, 1)},
name: identifier('name', 2),
type: identifier('varchar', 7),
enumValues: [integer(12, 15)]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Clarify handling of varchar(12) in attribute type test

In the attribute type legacy test, the attribute name uses varchar(12) which is valid in both v1 and v2 but has different meanings. This can lead to confusion and misinterpretation in the test case.

Consider adding comments or adjusting the test to clarify the intent and ensure that developers understand the difference in handling varchar(12) between versions. This will improve maintainability and reduce potential misunderstandings.

Comment on lines +396 to +421
nesting: {token: token(0, 1), depth: 0},
name: identifier('item_id', 2),
type: identifier('int', 10),
nullable: token(14, 21),
constraints: [
{kind: 'Index', token: token(23, 27)},
{kind: 'Index', token: token(29, 33), name: identifier('idx', 35)},
{kind: 'Check', token: token(39, 43), predicate: expression('item_id > 0', 45)},
{kind: 'Check', token: token(60, 64), predicate: expression('item_id < 0', 66)},
{
kind: 'Relation',
token: token(81, 92),
refCardinality: {kind: '1', token: token(81, 81)},
polymorphic: {attr: identifier('kind', 82), value: identifier('users', 87)},
srcCardinality: {kind: 'n', token: token(92, 92)},
ref: {schema: identifier('public', 94), entity: identifier('users', 101), attrs: [identifier('id', 107)]}
},
{
kind: 'Relation',
token: token(111, 122),
refCardinality: {kind: '1', token: token(111, 111)},
polymorphic: {attr: identifier('kind', 112), value: identifier('posts', 117)},
srcCardinality: {kind: 'n', token: token(122, 122)},
ref: {entity: identifier('posts', 124), attrs: [identifier('id', 130)]}
}
]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Conflicting check constraints in attribute definition

The attribute item_id has two conflicting check constraints:

  • check(\item_id > 0`)`
  • check(\item_id < 0`)`

These constraints cannot both be true simultaneously, as item_id cannot be both greater than 0 and less than 0 at the same time. This logical error could cause issues during schema validation.

To resolve this conflict, determine the correct constraint for item_id. If the intention is to ensure that item_id is non-zero, consider using item_id != 0. Apply the following diff to correct the constraint:

- {kind: 'Check', token: token(60, 64), predicate: expression('item_id < 0', 66)},
+ {kind: 'Check', token: token(60, 64), predicate: expression('item_id != 0', 66)},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
nesting: {token: token(0, 1), depth: 0},
name: identifier('item_id', 2),
type: identifier('int', 10),
nullable: token(14, 21),
constraints: [
{kind: 'Index', token: token(23, 27)},
{kind: 'Index', token: token(29, 33), name: identifier('idx', 35)},
{kind: 'Check', token: token(39, 43), predicate: expression('item_id > 0', 45)},
{kind: 'Check', token: token(60, 64), predicate: expression('item_id < 0', 66)},
{
kind: 'Relation',
token: token(81, 92),
refCardinality: {kind: '1', token: token(81, 81)},
polymorphic: {attr: identifier('kind', 82), value: identifier('users', 87)},
srcCardinality: {kind: 'n', token: token(92, 92)},
ref: {schema: identifier('public', 94), entity: identifier('users', 101), attrs: [identifier('id', 107)]}
},
{
kind: 'Relation',
token: token(111, 122),
refCardinality: {kind: '1', token: token(111, 111)},
polymorphic: {attr: identifier('kind', 112), value: identifier('posts', 117)},
srcCardinality: {kind: 'n', token: token(122, 122)},
ref: {entity: identifier('posts', 124), attrs: [identifier('id', 130)]}
}
]
nesting: {token: token(0, 1), depth: 0},
name: identifier('item_id', 2),
type: identifier('int', 10),
nullable: token(14, 21),
constraints: [
{kind: 'Index', token: token(23, 27)},
{kind: 'Index', token: token(29, 33), name: identifier('idx', 35)},
{kind: 'Check', token: token(39, 43), predicate: expression('item_id > 0', 45)},
{kind: 'Check', token: token(60, 64), predicate: expression('item_id != 0', 66)},
{
kind: 'Relation',
token: token(81, 92),
refCardinality: {kind: '1', token: token(81, 81)},
polymorphic: {attr: identifier('kind', 82), value: identifier('users', 87)},
srcCardinality: {kind: 'n', token: token(92, 92)},
ref: {schema: identifier('public', 94), entity: identifier('users', 101), attrs: [identifier('id', 107)]}
},
{
kind: 'Relation',
token: token(111, 122),
refCardinality: {kind: '1', token: token(111, 111)},
polymorphic: {attr: identifier('kind', 112), value: identifier('posts', 117)},
srcCardinality: {kind: 'n', token: token(122, 122)},
ref: {entity: identifier('posts', 124), attrs: [identifier('id', 130)]}
}
]

@loicknuchel loicknuchel merged commit b4706a6 into main Oct 22, 2024
6 checks passed
@loicknuchel loicknuchel deleted the aml-fix branch October 22, 2024 20:14
@coderabbitai coderabbitai bot mentioned this pull request Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant