-
Notifications
You must be signed in to change notification settings - Fork 572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thoughts on SQL Extensions #104
Comments
Welcome! We don't have specific proposals at hand on tackling the extensibility/dialects problem. I know @andygrove intended to work on this, but he's been quite busy outside the project recently. Glad to see you thinking about this! I thought I'd share my thoughts here, in the hope it will help clarify the proposal. I hope that @benesch will comment too, but I guess we'll need Andy's sign-off before settling on a decision. I'd like to learn a bit about your perspective. For example, my interest is in analysing complex sets of queries written in various existing dialects. For that reason I'm interested in a dialect-agnostic parser, where dialects provide a way to pick between conflicting parsing rules. But I believe that in order to be maintainable and usable for other purposes (like for validation or in a query engine) the project needs to grow better support for dialects. As for the prototype you shared, I can't yet say I understand the idea very well or how it'd apply to existing dialect-specific code, for example:
|
The company I work for analyzes complex queries all written in the same dialect and it was pretty trivial to add Going back to the example…
If I were to implement a PR implementing an extension trait and the modifications to be able to parse them, I think it would be necessary to implement the existing dialect specific sql in the new way. First because it'll make testing the ergonomics of the implementation a lot easier, but also because it would be really confusing for a new developer if there were two ways that dialect specific stuff was implemented. After the dialect-specific code was moved into the dialects themselves, I guess the dialects could also be moved out of mainline eventually.
I think I may have over-simplified there, assuming that I could get away with not defining the parsing mechanism on 😄 [edit] |
Thanks, I think our goals are quite aligned!
I wanted to find and re-read the previous discussions on this, in which a number of good points have already been made, before sharing more ideas of my own. It will take a while, because reviewing PRs is higher priority for me.
Yep, that's an option. The question then is how to integrate the extensions in the AST, as Overriding parser's methods also involves copying much of the shared logic - in the joins example you'd copy a page of code just to add one additional join type. Yet, unless we freeze the DialectParser trait, the dialects will not really be independent from the maintenance perspective... We could introduce additional extension points in such cases, but it might be easier to just have some dialect-specific parsing in the mainline parser, enabled conditionally. |
I think in the case of |
Hi @ChristopherMacGown Thanks for suggesting an approach to support specific dialects. It's good to see a discussion happening on this. It certainly is challenging making this extensible without the need to duplicate code. To add to the use case discussion I would imagine many users just want a parser for a specific dialect they are working with (HiveQL in my case). |
I've been playing around with a trait based solution to allowing dialects to carry their own extension definitions. Would something along these lines be accepted?
Obviously the example is a toy, and there would need to be some additional functions defined on the trait to be able to extract the AST from the boxed Trait, but I thought it was better to get feedback before rushing off and implementing.
The text was updated successfully, but these errors were encountered: