Support formatting Trino queries #297

Merged
merged 4 commits into from
Jul 16, 2022

verhovsky
Contributor

No description provided.

*/
// TODO
// https://github.com/trinodb/trino/blob/432d2897bdef99388c1a47188743a061c4ac1f34/core/trino-parser/src/main/antlr4/io/trino/sql/parser/SqlBase.g4#L41
const reservedCommands = [
Contributor Author

@verhovsky verhovsky Jul 8, 2022

This needs more work. I mostly just went through the Spark formatter and kept the commands that Trino shares with SparkSQL.

Collaborator

What's the relationship between Spark and Trino?

Contributor Author

There isn't one, I chose it pretty arbitrarily.

Collaborator

In that case it would be better to base it on sql.formatter.ts, which mostly follows standard SQL.

Contributor Author

@verhovsky verhovsky Jul 8, 2022

In my first comment I was referring only to reservedCommands. For the other variables, I replaced each array completely with only Trino-specific functions and keywords. I also included the commands I ran on the Trino code base to extract these lists, in the hope that you would double-check my work before merging it :)

I oversimplified what I did for reservedCommands: I didn't just remove the ones I didn't see in Trino. When I saw a keyword such as "ALTER", I grepped for "ALTER" in the grammar and also added the other Trino "ALTER" statements that SparkSQL doesn't have. Following your suggestion, I read through reservedCommands in sql.formatter.ts, and I had already added every statement in there. The only question I have is about SET and SET SCHEMA: what are those for? Trino has a bunch of "SET" statements, like SET AUTHORIZATION, SET PROPERTIES, SET ROLE, etc., and even UPDATE qualifiedName SET updateAssignment. Should all of those be included in this list? I have no idea what makes sense here.

What really needs to be done is reading through Trino's grammar definition of statement

https://github.com/trinodb/trino/blob/c7b26825218d5d11e9469984977dee6856f362ff/core/trino-parser/src/main/antlr4/io/trino/sql/parser/SqlBase.g4#L41-L172

and figuring out where the formatter should split. Should I just list every top-level keyword in that rule and any keywords that are on their own in each rule? So, for example, for

https://github.com/trinodb/trino/blob/c7b26825218d5d11e9469984977dee6856f362ff/core/trino-parser/src/main/antlr4/io/trino/sql/parser/SqlBase.g4#L98-L100

    | CREATE ROLE name=identifier
        (WITH ADMIN grantor)?
        (IN catalog=identifier)?                                       #createRole

I should add CREATE ROLE, WITH ADMIN and IN? Maybe not the last one (IN) though?
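One way to approach the question above is to pull the leading run of keywords out of each grammar alternative mechanically and then review the list by hand. The following is a minimal sketch only, not the commands actually used in this PR: `leadingKeywords` is a hypothetical helper, the splitting logic ignores nested alternatives, and the second alternative in `sample` is included purely for illustration.

```typescript
// Hypothetical helper (not part of sql-formatter or Trino): collect the
// leading run of ALL-CAPS keywords from each alternative of an ANTLR rule.
function leadingKeywords(ruleBody: string): string[] {
  const results: string[] = [];
  // Assumption: every alternative starts on its own line with "|".
  // Nested alternatives inside parentheses are not handled.
  for (const alternative of ruleBody.split(/^\s*\|/m)) {
    const tokens = alternative.trim().split(/\s+/);
    const keywords: string[] = [];
    for (const token of tokens) {
      // Stop at the first token that is not a bare upper-case keyword,
      // e.g. "name=identifier" or "(WITH".
      if (/^[A-Z_]+$/.test(token)) {
        keywords.push(token);
      } else {
        break;
      }
    }
    if (keywords.length > 0) {
      results.push(keywords.join(' '));
    }
  }
  return results;
}

// The first alternative is the CREATE ROLE rule quoted above; the second
// is an illustrative extra alternative.
const sample = `
    | CREATE ROLE name=identifier
        (WITH ADMIN grantor)?
        (IN catalog=identifier)?                                       #createRole
    | DROP ROLE name=identifier (IN catalog=identifier)?               #dropRole
`;

// Yields the two leading phrases: "CREATE ROLE" and "DROP ROLE".
console.log(leadingKeywords(sample));
```

Note that this only finds the statement-opening phrases. Optional clauses such as WITH ADMIN sit inside parentheses and would still need a manual decision.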

Collaborator

I wouldn't worry too much about these less-used statements like CREATE ROLE, SET AUTHORIZATION, etc. The formatter currently lacks support for many of these utility statements in various dialects, and there are plans to review them for all the dialects. So feel free to add either all of them or none of them.

The SET is for the SET in UPDATE table SET assignments.

Looks like Trino doesn't support SET SCHEMA, so don't add it :)

IN is best not to be added to the reservedCommands list. It would conflict with expressions like NOT IN.
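To illustrate the NOT IN conflict, here is a toy model that starts a new line before every listed command. This is not how sql-formatter tokenizes or formats queries; `breakBeforeCommands` is a hypothetical function written only to show the effect of putting IN in such a list.

```typescript
// Toy model only: NOT the sql-formatter implementation. It starts a new
// line before every phrase listed in `commands`.
function breakBeforeCommands(sql: string, commands: string[]): string {
  // Longest phrases first, so a longer phrase wins over its own prefix.
  // Assumption: the phrases contain no regex metacharacters.
  const sorted = [...commands].sort((a, b) => b.length - a.length);
  const pattern = new RegExp(`\\s*\\b(${sorted.join('|')})\\b`, 'g');
  return sql.replace(pattern, '\n$1').trim();
}

const query = 'UPDATE t SET a = 1 WHERE b NOT IN (1, 2)';

// Without IN, the predicate stays on one line:
//   UPDATE t
//   SET a = 1
//   WHERE b NOT IN (1, 2)
const withoutIn = breakBeforeCommands(query, ['UPDATE', 'SET', 'WHERE']);

// With IN, the NOT IN expression is split across two lines:
//   UPDATE t
//   SET a = 1
//   WHERE b NOT
//   IN (1, 2)
const withIn = breakBeforeCommands(query, ['UPDATE', 'SET', 'WHERE', 'IN']);

console.log(withoutIn);
console.log(withIn);
```

The same sketch shows why SET belongs in the list: it is the point where an UPDATE statement should break.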

@verhovsky verhovsky changed the title from "Support formatting Trino/Presto queries" to "Support formatting Trinoqueries" Jul 9, 2022
@verhovsky verhovsky changed the title from "Support formatting Trinoqueries" to "Support formatting Trino queries" Jul 9, 2022
@inferrinizzard
Collaborator

Additionally, can you also add language tests (and documentation, if possible)?

const language = 'trino';
const format: FormatFn = (query, cfg = {}) => originalFormat(query, { ...cfg, language });

// behavesLikeSqlFormatter(format);
Collaborator

Is there a reason why these tests are commented out?

Contributor Author

Because they fail.

@nene nene merged commit 41ca97a into sql-formatter-org:master Jul 16, 2022
@nene
Collaborator

nene commented Jul 16, 2022

As this PR seemed to have stalled a little and at the same time master has moved forward quite a bit, I decided to accelerate things a bit and merge this in.

I did some cleanup and merged with latest changes in master in #325.

It's not perfect, but neither is the support for all the other dialects. General support for the Trino dialect is now present. Further improvements can be made in future PRs.

@nene
Collaborator

nene commented Jul 16, 2022

Thanks for the effort you've put into this. 👍

@nene
Collaborator

nene commented Jul 16, 2022

This is now released in 9.0.0-beta3
