Implement SQL Formatter #426
Comments
I'm in complete agreement. I implemented a very simple version of this in my own version control script years ago. I wrote about it here: Exporting Queries for Version Control. Here's the code (it's actually VBScript):
It's very much an 80/20 solution. I think a more polished SQL formatter is called for with this project, but I wanted to validate and second your concerns. By far the most important thing (in my mind) is to get the SQL broken up into separate lines as much as possible, as that's what helps the most when viewing diffs. |
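To make the "break it into separate lines" idea concrete, here is a minimal sketch in Python (not the VBScript from the linked post, and not the add-in's VBA). The keyword list and the function name `break_sql_lines` are assumptions for illustration. It deliberately ignores string literals, nested parentheses, and function arguments, so treat it as an 80/20 illustration only.

```python
import re

# Hypothetical, deliberately short keyword list; a real formatter covers far more.
CLAUSES = ("SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY",
           "INNER JOIN", "LEFT JOIN", "RIGHT JOIN")

# Match the single space that precedes a clause keyword (whole word, any case).
_CLAUSE_RE = re.compile(r" (?=(?:" + "|".join(CLAUSES) + r")\b)", re.IGNORECASE)


def break_sql_lines(sql):
    """Put each major clause and each comma-separated item on its own line."""
    # Collapse all runs of whitespace so the result does not depend on input layout.
    text = " ".join(sql.split())
    # Start a new line before every clause keyword.
    text = _CLAUSE_RE.sub("\n", text)
    # Naively break after every comma (this also splits commas inside
    # function calls or string literals, which a real formatter must avoid).
    return re.sub(r",\s*", ",\n    ", text)


if __name__ == "__main__":
    print(break_sql_lines("SELECT a, b FROM t WHERE a = 1 ORDER BY b;"))
```

Even this crude splitting means a one-column change shows up as a one-line diff instead of a change to a single long line.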
This (work in progress) is a VBA port of https://github.com/doctrine/sql-formatter intended to format SQL queries with better indenting and layout to improve readability in version control. #426
One thing to note: a SQL practice that I've found measurably improves portability and troubleshooting (and speeds up alterations) is to put the comma at the start of each new column or row. We also indent and start all An example of what we do follows; it has made the code a bit longer, but clarity has gone up many times over. E.g.:
SELECT
MSysNavPaneGroupCategories.Name AS CategoryName
, MSysNavPaneGroupCategories.Position AS CategoryPosition
, MSysNavPaneGroupCategories.Flags AS CategoryFlags
, MSysNavPaneGroups.Name AS GroupName
, MSysNavPaneGroups.Flags AS GroupFlags
, MSysNavPaneGroups.Position AS GroupPosition
, MSysObjects.Type AS ObjectType
, MSysObjects.Name AS ObjectName
, MSysNavPaneGroupToObjects.Flags AS ObjectFlags
, MSysNavPaneGroupToObjects.Icon AS ObjectIcon
, MSysNavPaneGroupToObjects.Position AS ObjectPosition
, MSysNavPaneGroupToObjects.Name AS NameInGroup
, MSysNavPaneGroupCategories.Id AS CategoryID
, MSysNavPaneGroups.Id AS GroupID
, MSysNavPaneGroupToObjects.Id AS LinkID
FROM MSysNavPaneGroupCategories
INNER JOIN (
MSysNavPaneGroups
LEFT JOIN (
MSysNavPaneGroupToObjects
LEFT JOIN MSysObjects ON MSysNavPaneGroupToObjects.ObjectID = MSysObjects.Id
) ON MSysNavPaneGroups.Id = MSysNavPaneGroupToObjects.GroupID
) ON MSysNavPaneGroupCategories.Id = MSysNavPaneGroups.GroupCategoryID
WHERE (
((MSysNavPaneGroups.Name) IS NOT NULL)
AND ((MSysNavPaneGroupCategories.Type) = 4)
)
ORDER BY
MSysNavPaneGroupCategories.Name
, MSysNavPaneGroups.Name
, MSysObjects.Type
, MSysObjects.Name;
What this does: while we're troubleshooting, say we want to remove one condition for a moment. We only need to comment out that one item, and we don't need to track commas, except on the very first line. That makes such cases much easier to deal with. Also, yes, I know this code probably doesn't work; it's illustrative in nature. Example comments:
SELECT
MSysNavPaneGroupCategories.Name AS CategoryName
, MSysNavPaneGroupCategories.Position AS CategoryPosition
, MSysNavPaneGroupCategories.Flags AS CategoryFlags
, MSysNavPaneGroups.Name AS GroupName
-- , MSysNavPaneGroups.Flags AS GroupFlags -- We don't want this yet...example.
, MSysNavPaneGroups.Position AS GroupPosition
, MSysObjects.Type AS ObjectType
, MSysObjects.Name AS ObjectName
, MSysNavPaneGroupToObjects.Flags AS ObjectFlags
, MSysNavPaneGroupToObjects.Icon AS ObjectIcon
, MSysNavPaneGroupToObjects.Position AS ObjectPosition
, MSysNavPaneGroupToObjects.Name AS NameInGroup
, MSysNavPaneGroupCategories.Id AS CategoryID
, MSysNavPaneGroups.Id AS GroupID
, MSysNavPaneGroupToObjects.Id AS LinkID
FROM MSysNavPaneGroupCategories
INNER JOIN (
MSysNavPaneGroups
LEFT JOIN (
MSysNavPaneGroupToObjects
-- LEFT JOIN MSysObjects ON MSysNavPaneGroupToObjects.ObjectID = MSysObjects.Id
) ON MSysNavPaneGroups.Id = MSysNavPaneGroupToObjects.GroupID
) ON MSysNavPaneGroupCategories.Id = MSysNavPaneGroups.GroupCategoryID
WHERE (
((MSysNavPaneGroups.Name) IS NOT NULL)
AND ((MSysNavPaneGroupCategories.Type) = 4)
)
ORDER BY
MSysNavPaneGroupCategories.Name
, MSysNavPaneGroups.Name
-- , MSysObjects.Type -- we don't know if this is causing issues, so leaving it out for now.
, MSysObjects.Name;
I know Access's SQL doesn't allow comments (though it seems to sometimes, namely in pass-through queries), but this is an example of why moving the comma and controlling the indent makes a difference (for us). |
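The leading-comma layout described above can be sketched as follows. This is an illustrative Python snippet, not part of the add-in; the names `leading_comma_list` and `comment_out` and the four-space indent are assumptions.

```python
def leading_comma_list(items, indent="    "):
    """Lay out items one per line, with the comma leading every item but the first."""
    lines = []
    for i, item in enumerate(items):
        # The first item gets two spaces in place of ", " so the items line up.
        prefix = indent + ("  " if i == 0 else ", ")
        lines.append(prefix + item)
    return "\n".join(lines)


def comment_out(block, line_index):
    """Prefix one line of the block with a SQL line comment marker."""
    lines = block.split("\n")
    lines[line_index] = "-- " + lines[line_index]
    return "\n".join(lines)


if __name__ == "__main__":
    cols = leading_comma_list(["t.Name AS GroupName",
                               "t.Flags AS GroupFlags",
                               "t.Position AS GroupPosition"])
    # Disabling any item except the first leaves a syntactically valid list,
    # because each remaining line still carries its own comma.
    print(comment_out(cols, 1))
```

The trade-off is the one mentioned in the comment above: only the first item needs special care, since it is the one line without its own comma.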
Worked through several things yesterday on the SQL Formatter class. (Still a work in progress) #426
I am working to replace slower RegEx functions with more performant alternatives. (We will still probably use RegEx for some of the more complex work, but we can improve the performance quite a bit by optimizing some of the simple ones like finding whitespace or boundary characters.) #426
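As a rough illustration of the kind of swap described in that commit, the sketch below finds the next boundary character twice: once with a regular expression and once with a plain character scan. It is written in Python rather than the class's VBA, the boundary set is a hypothetical one, and whether the scan is actually faster depends on the language and runtime, so it would need to be measured rather than assumed.

```python
import re

# Hypothetical boundary set for illustration; the real class may use a different one.
BOUNDARY_CHARS = frozenset(" \t\r\n,;()")
_BOUNDARY_RE = re.compile(r"[ \t\r\n,;()]")


def next_boundary_regex(text, start=0):
    """Index of the next boundary character at or after start, or -1 (regex version)."""
    match = _BOUNDARY_RE.search(text, start)
    return match.start() if match else -1


def next_boundary_scan(text, start=0):
    """Same result as next_boundary_regex, using a plain character scan."""
    for i in range(start, len(text)):
        if text[i] in BOUNDARY_CHARS:
            return i
    return -1


if __name__ == "__main__":
    sample = "SELECT a,b FROM t;"
    # Both approaches should agree at every starting position.
    for pos in range(len(sample) + 1):
        assert next_boundary_regex(sample, pos) == next_boundary_scan(sample, pos)
    print(next_boundary_scan(sample))
```

Keeping both versions side by side makes it easy to confirm that the simpler replacement returns the same positions before the regex version is retired.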
The overhead of performance timing is very low, but this adds an option to disable it when an instance of the class is used internally for testing SQL formatting. #426
Worked through the implementation bugs using some test queries, and ready to integrate this into the add-in! #426
Formatting of queries is now available as an option in the add-in. #426
I finished building out the formatter today, and I am very happy with the performance. After the initial port, I was able to optimize the logic to use faster I have also added some additional functions to verify the tokenization and output layout so we can confirm that everything is working as intended if we make adjustments to the class. At present it produces identical output to the sample query on the original project's home page. Here is what our internal Access query looks like after formatting, right out of the box:
SELECT
MSysNavPaneGroupCategories.Name AS CategoryName,
MSysNavPaneGroupCategories.Position AS CategoryPosition,
MSysNavPaneGroupCategories.Flags AS CategoryFlags,
MSysNavPaneGroups.Name AS GroupName,
MSysNavPaneGroups.Flags AS GroupFlags,
MSysNavPaneGroups.Position AS GroupPosition,
MSysObjects.Type AS ObjectType,
MSysObjects.Name AS ObjectName,
MSysNavPaneGroupToObjects.Flags AS ObjectFlags,
MSysNavPaneGroupToObjects.Icon AS ObjectIcon,
MSysNavPaneGroupToObjects.Position AS ObjectPosition,
MSysNavPaneGroupToObjects.Name AS NameInGroup,
MSysNavPaneGroupCategories.Id AS CategoryID,
MSysNavPaneGroups.Id AS GroupID,
MSysNavPaneGroupToObjects.Id AS LinkID
FROM
(
MSysNavPaneGroupCategories
INNER JOIN MSysNavPaneGroups ON MSysNavPaneGroupCategories.Id = MSysNavPaneGroups.GroupCategoryID
)
LEFT JOIN (
MSysNavPaneGroupToObjects
LEFT JOIN MSysObjects ON MSysNavPaneGroupToObjects.ObjectID = MSysObjects.Id
) ON MSysNavPaneGroups.Id = MSysNavPaneGroupToObjects.GroupID
WHERE
(
(
(MSysNavPaneGroups.Name) Is Not Null
)
AND (
(
MSysNavPaneGroupCategories.Type
)= 4
)
)
ORDER BY
MSysNavPaneGroupCategories.Name,
MSysNavPaneGroups.Name,
MSysObjects.Type,
MSysObjects.Name;
If there were a way to run some test queries against the original project and compare the output, that would help ensure that we are coming up with the same results. Someone probably has a Web implementation of sql-formatter out there, but I wasn't finding anything right off in my initial searches... 🤔 This should be ready to build from the |
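Short of comparing against the original project, one cheap sanity check is to confirm that formatting changed nothing but whitespace. The Python sketch below is an assumption about how such a check could look, not the verification functions mentioned above; the function names are hypothetical. It ignores whitespace inside quoted literals, so it would miss a formatter bug that altered spacing within a string.

```python
def strip_whitespace(sql):
    """Remove every whitespace character so that only the tokens' text remains."""
    return "".join(sql.split())


def only_whitespace_changed(original, formatted):
    """True if the two strings are identical once all whitespace is removed."""
    return strip_whitespace(original) == strip_whitespace(formatted)


if __name__ == "__main__":
    raw = "SELECT a,b FROM t WHERE ((a) Is Not Null);"
    pretty = "SELECT\n    a,\n    b\nFROM\n    t\nWHERE\n    (\n        (a) Is Not Null\n    );"
    print(only_whitespace_changed(raw, pretty))
```

A check like this does not prove the layout matches the original project's output, but it does catch a formatter that drops or rewrites tokens.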
Just for reference, I’ve used VB - SQL 'Select' statement formatter/checker, in case it’s of any use. |
This has been implemented, and seems to be working fine. Closing this issue as completed, and we can open new issues if we encounter any bugs going forward. |
One of the challenges with managing queries in Microsoft Access is that they don't support comments or formatting. The .SQL property returns a large blob of SQL code. I can compare diffs of changes and get a general idea of what has changed, but this would be a whole lot easier to read if it were formatted something like this:
I don't feel like we need to create a massive 10K line project to support every SQL dialect and every formatting option out there. But I do feel like I would benefit from a basic implementation of a SQL formatter that would make queries much more readable at the source level. I started digging into this a couple years ago, but didn't get very far.
Last week I started researching this again, and found what seems to be a happy medium: pretty wide dialect support, yet simple enough to port into a single VBA class. I am using the PHP Doctrine project's sql-formatter, which was forked from another project with very similar goals.
I currently have this about 80% implemented in VBA, and I am about ready to go ahead and move it into this project as I continue to work through the debugging and testing process. (I won't turn it on until I am pretty confident in the output.) I just figured I would create an issue for it where we can discuss any implementation details or other considerations at the outset.
My plan is to add an option to format the SQL output in the *.sql files. I am hoping the performance impact will be pretty minimal, but the readability improvement in the source files will be a definite help for some of my more complex projects.