Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser skips tables containing indexes with > 2 included columns #180

Closed
mrmonkington opened this issue Jan 9, 2023 · 6 comments
Closed

Comments

@mrmonkington
Copy link

Describe the bug

simple-ddl-parser==0.29.0

The parser seems to ignore any table with a multicolumn index where the number of included cols > 2.

To Reproduce

Sample program test.py:

import sys
from simple_ddl_parser import DDLParser  # type: ignore
from pprint import pprint

parse_results = DDLParser(sys.stdin.read()).run()
pprint(parse_results)

Input file input.sql:

CREATE TABLE a1 (
  b int,
  c int,
  d int,
  KEY e(b,c)
);
CREATE TABLE a2 (
  b int,
  c int,
  d int,
  KEY e(b,c,d)
);

cat input.sql | python test.py

Outputs:

[{'alter': {},
  'checks': [],
  'columns': [{'check': None,
               'default': None,
               'name': 'b',
               'nullable': True,
               'references': None,
               'size': None,
               'type': 'int',
               'unique': False},
              {'check': None,
               'default': None,
               'name': 'c',
               'nullable': True,
               'references': None,
               'size': None,
               'type': 'int',
               'unique': False},
              {'check': None,
               'default': None,
               'name': 'd',
               'nullable': True,
               'references': None,
               'size': None,
               'type': 'int',
               'unique': False},
              {'check': None,
               'default': None,
               'name': 'KEY',
               'nullable': True,
               'references': None,
               'size': None,
               'type': 'e',
               'unique': False}],
  'constraints': {'checks': None, 'references': None, 'uniques': None},
  'index': [],
  'partitioned_by': [],
  'primary_key': [],
  'schema': None,
  'table_name': 'a1',
  'tablespace': None}]

Note that a2 is not present.

Expected behavior

Two entries in the tables list, e.g.:

[{...
  'table_name': 'a1',
  ...},
{...
  'table_name': 'a2',
  ...},
]

Environment:

  • simple-ddl-parser==0.29.0
  • Python 3.10.9
  • Fedora 37 (kernel 6.0.13)
@xnuinside
Copy link
Owner

xnuinside commented Jan 9, 2023

@mrmonkington hi, thanks for reporting the issue, I think problem in statement KEY, because it is usually PRIMARY KEY (https://www.w3schools.com/sql/sql_primarykey.ASP ). Can you share a doc for DB where is KEY used? Without PRIMARY

in output you can see, that KEY was recognized as a column, not a valid statement {'check': None,
'default': None,
'name': 'KEY',

@mrmonkington
Copy link
Author

Hi @xnuinside

Yes I just noticed myself that the parser was interpreting it a regular column of type e - apologies!

However MySQL allows KEY as a synonym of INDEX and for some reason all of the schemas I'm parsing with your library use this form :(

https://dev.mysql.com/doc/refman/8.0/en/create-table.html#:~:text=KEY%20%7C%20INDEX,other%20database%20systems.

create_definition: {
    col_name column_definition
  | {INDEX | KEY} [index_name] [index_type] (key_part,...)
      [index_option] ...
  | {FULLTEXT | SPATIAL} [INDEX | KEY] [index_name] (key_part,...)
      [index_option] ...
  | [CONSTRAINT [symbol]] PRIMARY KEY
      [index_type] (key_part,...)
      [index_option] ...
  | [CONSTRAINT [symbol]] UNIQUE [INDEX | KEY]
      [index_name] [index_type] (key_part,...)
      [index_option] ...
  | [CONSTRAINT [symbol]] FOREIGN KEY
      [index_name] (col_name,...)
      reference_definition
  | check_constraint_definition
}

Hope that helps!

@xnuinside
Copy link
Owner

got it, I will add support for this statement

@mrmonkington
Copy link
Author

@xnuinside I have noticed that neither KEY nor INDEX is supported in TABLE definitions (i.e. no inline indexes as supported by mySQL and I believe recent-ish msSQL) so I guess this is more work than simply adding a token synonym to the lexer.

I would really like to help implement this, but I am unsure what the output should look like for the purposes of writing some tests.

Please can you confirm for me that it would be correct if an inline index of form:

CREATE TABLE t1 (
  val INT,
  INDEX idx1(val);

...would output the same parsed form as:

CREATE TABLE t1 (
  val INT,
);
CREATE INDEX idx1 ON t1(val);

i.e.

[{'alter': {},
  'checks': [],
  'columns': [{'check': None,
               'default': None,
               'name': 'val',
               'nullable': True,
               'references': None,
               'size': None,
               'type': 'INT',
               'unique': False}],
  'index': [{'columns': ['val'],
             'detailed_columns': [{'name': 'val',
                                   'nulls': 'LAST',
                                   'order': 'ASC'}],
             'index_name': 'idx1',
             'unique': False}],
  'partitioned_by': [],
  'primary_key': [],
  'schema': None,
  'table_name': 't1',
  'tablespace': None}]

?

Thank you!

@cfhowes
Copy link
Contributor

cfhowes commented Jan 24, 2024

@mrmonkington Can you check if this now works after #219 was merged (0.31.3 and later)? I was using MySQL as well and patched up some key parsing. not sure if i fully got your cases though.

@xnuinside
Copy link
Owner

everything works (all samples from issue) on 1.3.1 version, tested, I will close the issues, if needed something else - feel free to open

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants