SQL: slow parsing of certain SQL files #3647

Closed
techee opened this issue Feb 15, 2023 · 0 comments · Fixed by #3654

Comments


techee commented Feb 15, 2023

We got a bug report in Geany (geany/geany-osx#42 (comment)) where Geany stopped responding for a very long time on a not particularly large SQL file (~16000 lines). I bisected the problem to this commit:

4dbda62

Basically the problem isn't the commit itself, but rather that the string passed to this function, which should contain just the current token, seems to contain a huge chunk of the SQL code rather than a single token (I haven't checked the exact size, but it is certainly hundreds of characters long). As this string grows, lookupCaseKeyword() is called every time a character is added in

if (empty_tag
    && KEYWORD_inquiry_directive == lookupCaseKeyword (vStringValue (string), Lang_sql))

which consumes a lot of time. A simple workaround that reduces the parsing time significantly is to check the length of the string against a constant corresponding to the longest keyword:

if (empty_tag && string->length < 20
    && KEYWORD_inquiry_directive == lookupCaseKeyword (vStringValue (string), Lang_sql))

Still, it would be good to investigate the reason behind the long token and fix it properly.
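
To illustrate why this gets so slow: the keyword lookup has to scan the candidate string, and it is re-run after every appended character, so the total work grows quadratically with the length of the runaway token. Below is a minimal standalone sketch of that pattern, not ctags code; fake_lookup() is a hypothetical stand-in for lookupCaseKeyword():

#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for lookupCaseKeyword(): its cost grows with the
 * length of the candidate string (it has to scan/case-fold the whole thing). */
static int fake_lookup (const char *s)
{
    size_t n = strlen (s);
    volatile unsigned sink = 0;
    for (size_t i = 0; i < n; i++)
        sink += (unsigned char) s[i];
    return 0;
}

int main (void)
{
    /* Simulate a token that never ends because the parser missed the
     * closing $$ of a dollar-quoted string. */
    size_t total = 50000;           /* bytes of SQL following the stray $$ */
    char *buf = malloc (total + 1);
    size_t len = 0;

    for (size_t i = 0; i < total; i++)
    {
        buf[len++] = 'a';           /* parser appends one more character */
        buf[len] = '\0';
        fake_lookup (buf);          /* lookup re-run on the whole string */
    }
    /* ~total*total/2 character visits in total: quadratic in file size. */
    free (buf);
    return 0;
}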

I assume @cloudis-ild could send the file causing this issue privately, or I could send it myself if he agrees.

techee added a commit to techee/ctags that referenced this issue Mar 2, 2023
$$ typed anywhere in the code, e.g. by accident, makes the rest of the
code appear to the parser as a dollar-quoted string that can be thousands
of bytes long. In this case lookupCaseKeyword() is called repeatedly on
this ever-increasing string, which consumes a lot of time and makes the
parser appear completely unresponsive for large files.

This patch adds a sanity check so that lookupCaseKeyword() is performed
only for strings shorter than 30 characters (currently the longest inquiry
directive keyword is 21 characters long, so there is a safe extra margin
even if longer keywords are added in the future).

Fixes universal-ctags#3647.
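
A minimal sketch of the check the commit message above describes, reusing the names from the snippets in this issue; the macro name is a hypothetical placeholder and this is not necessarily the exact code of the patch:

/* Hypothetical constant name; 30 is used because the longest inquiry
 * directive keyword is currently 21 characters, leaving a safety margin. */
#define SQL_KEYWORD_MAX_LENGTH 30

if (empty_tag
    && string->length < SQL_KEYWORD_MAX_LENGTH
    && KEYWORD_inquiry_directive == lookupCaseKeyword (vStringValue (string), Lang_sql))
{
    /* ... emit the inquiry directive tag as before ... */
}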
techee added a commit to techee/ctags that referenced this issue Mar 13, 2023 (same commit message as above)