-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance Improvements #1439
Performance Improvements #1439
Conversation
Parallel Test execution Gradle Caching Explicitly request for latest JavaCC 7.0.10
Parallel Test execution Gradle Caching Explicitly request for latest JavaCC 7.0.10
Simplify the Primary Expression Production Try to simple parse without Complex Expressions first, before parsing complex and slow (if supported by max nesting depth) Add Test cases for issues JSQLParser#1397 and JSQLParser#1438 Update Libraries to its latest version Remove JUnit 4 from Gradle
Let Parser timeout after 6 seconds and fail gently Add a special test verifying the clean up after timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the problem lies deeper. So this PR does somehow covers the symptoms but not the reasons. So we need to dig deeper.
id 'pmd' | ||
id 'checkstyle' | ||
|
||
// download the RR tools which have no Maven Repository | ||
id "de.undercouch.download" version "4.1.2" | ||
id "de.undercouch.download" version "latest.release" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious. Does this work the same way maven dependency versioning works like [1.0,2.0)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know about Maven does (In fact I hate Maven. You can't even clean
a project while offline. I went straight from ANT to Gradle, which "just understands and does what I want" instead of forcing me into any dogma). This latest.release
automatically retrieves the Latest Release of a Gradle Plugin.
build.gradle
Outdated
testImplementation 'org.apache.commons:commons-lang3:3.10' | ||
testImplementation 'com.h2database:h2:1.4.200' | ||
testImplementation 'commons-io:commons-io:2.11.0+' | ||
testImplementation 'org.mockito:mockito-core:4.1.0+' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this version annotation 4.1.0+ mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is Gradle Sugar, which allows to use the latest update of a Library following "semantic versioning". In fact I applied it wrong. It should be either testImplementation 'commons-io:commons-io:2.11.+' for fetching latest release of the
2.11minor version or alternatively
testImplementation 'commons-io:commons-io:2.+' for fetching the latest minor version of the 2
major version.
Found that also very handy for quickly updating Log4J2 recently :)
|
||
// first, try to parse fast and simple | ||
try { | ||
CCJSqlParser parser = newParser(sql).withAllowComplexParsing(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If complex parsing is initially switched off, it should not be switched on in this method. Right? However, this is only some kind of hack. The grammar itself is flawed if we need to do something like this. The porblem is the excessive use of Lookups, but if the grammar would be designed that a fixed lookup would be enough, then we do not need a switch like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct in your observation, but I respectfully do not share your conclusion:
-
It is not a
hack
but actually a completely valid approach (to my knowledge the only available method) to implement aIF ... THEN ...
condition in the parser -
While the grammar certainly could be rewritten/optimized, I still claim that SQL is a bit exceptional:
a) we try to be RDBMS/Dialect agnostic and so have to deal with a great flexibility in the Grammar
b) SQL lacks reliableterminated Blocks
, which makes it difficult to defineStop
tokens. What do I mean with this: Obviously everything inside brackets(
)
is a block and the closing brackets)
terminates this block and we know that we can stop parsing an expression at the closing bracket. Unfortunately SQL was "designed" for assistant secretaries, who were supposed to query data without programming skills.
Best illustration is issue #1444:
select doc.fullName from XWikiDocument doc where doc.fullName like :fullName escape '/' and doc.author like :author escape '!'
Is this
select doc.fullName from XWikiDocument doc where ( doc.fullName like :fullName escape '/' )
and ( doc.author like :author escape '!' )
Or is this
select doc.fullName from XWikiDocument doc where doc.fullName like :fullName escape ( '/'
and doc.author like :author escape '!' )
JavaCC by its nature lacks a lot of functionality to deal with those requirements. JavaCC 21 added some features for a reason and I believe ANTLR is there for a reason too.
Long story short:
Yes, you are right: too many lookup
harms us badly. But I am also right: we do need those for parsing certain valid statements (look how functionality has grown since 4.0 and how the "new issue" rate went down since!)
So the suggest approach, while not a beauty at all solves that in a pragmatic way:
It tries to parse first without those slow lookup
and only when this fails, it would activate the lookup
-- because they are needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, if anyone can point me on how to avoid those lookup
without losing functionality, I am all in!
# Conflicts: # src/test/java/net/sf/jsqlparser/statement/select/SelectTest.java
Totally unrelated: We pull the complete ANTLR as dependency for PMD, while we work with JavaCC in order to not use ANTLR -- I find that quite funny :) pmd - The PMD libraries to be used for this project. \--- net.sourceforge.pmd:pmd-java:6.41.0 +--- net.sourceforge.pmd:pmd-core:6.41.0 | +--- org.antlr:antlr4-runtime:4.7.2 | +--- com.beust:jcommander:1.48 | +--- commons-io:commons-io:2.6 | +--- net.sourceforge.saxon:saxon:9.1.0.8 | +--- org.apache.commons:commons-lang3:3.8.1 | +--- org.ow2.asm:asm:9.2 | \--- com.google.code.gson:gson:2.8.5 +--- net.sourceforge.saxon:saxon:9.1.0.8 +--- org.ow2.asm:asm:9.2 +--- commons-io:commons-io:2.6 \--- org.apache.commons:commons-lang3:3.8.1 |
src/test/resources/net/sf/jsqlparser/statement/select/oracle-tests/cast_multiset13.sql
Outdated
Show resolved
Hide resolved
Unused imports
# Conflicts: # build.gradle # src/test/java/net/sf/jsqlparser/statement/select/SelectTest.java
# Conflicts: # build.gradle # src/main/jjtree/net/sf/jsqlparser/parser/JSqlParserCC.jjt # src/test/java/net/sf/jsqlparser/statement/select/SelectTest.java
Reduce test loops to fit intothe timeout
Fixes #1397
Fixes #1438
Try to parse simple and fast first, before parsing complex
Update libraries to its latest versions
Remove JUnit4 Leftover
Add a 6 second timeout to the Parser and let it fail gracefully when a Statement can't be parsed within that time
Add a special test assuring the cleanup of the Parser objects after the timeout (no stale background processes)