Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an AST visitor #78

Merged
merged 1 commit into from
Jun 11, 2019
Merged

Add an AST visitor #78

merged 1 commit into from
Jun 11, 2019

Conversation

benesch
Copy link
Contributor

@benesch benesch commented May 22, 2019

AST visitors can make traversal of an AST much simpler when only part of the AST is interesting. E.g., it's quite easy to write a visitor that answers a question like "are there any aggregate functions in this query?".

This commit is pretty dependent on the exact structure of the AST, so I think it depends on basically all the other PRs I just submitted (#72, #73, #74, #75, #76, and #77). The relevant diff is in the last commit: andygrove@c7c89bd

@coveralls
Copy link

coveralls commented May 22, 2019

Pull Request Test Coverage Report for Build 297

  • 624 of 667 (93.55%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.4%) to 91.835%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/sqlast/visit.rs 624 667 93.55%
Totals Coverage Status
Change from base Build 287: 0.4%
Covered Lines: 4375
Relevant Lines: 4764

💛 - Coveralls

@nickolay
Copy link
Contributor

nickolay commented Jun 3, 2019

I don't have a firm opinion about this:

  1. on one hand this increases the burden on the authors of the future PRs to sqlparser
  2. on the other hand, it probably makes it easier to walk the AST for some consumers

For my own use-case I convert the AST to JSON via serde and analyze it outside of rust, so I can't feel the benefits of (2) on my own, and I didn't find time to get familiar with the work @cswinter and @andygrove do based on sqlparser to see if this would be beneficial to them.

To evaluate how much of a burden is (1) I'd have to implement a new feature on top of this. I expect the type system will make it trivial, and the hardest part would be to update the testcase, so I'm leaning towards giving it a try.

@benesch benesch force-pushed the visitor branch 2 times, most recently from b516313 to fd83f71 Compare June 5, 2019 17:58
@benesch
Copy link
Contributor Author

benesch commented Jun 5, 2019

Yep, those are exactly the tradeoffs! It is kind of a pain to update the visitor as well as the AST, so I'd be happy to wait on this until the API is more stable. (You may have noticed I was particularly slow to rebase this PR.) I think at some point the scales will tip firmly in favor of bundling the visitor with the library, but I'm not totally convinced we've hit that point yet.

/// writing code to traverse the entire AST.
pub trait Visit<'ast> {
fn visit_statement(&mut self, statement: &'ast SQLStatement) {
visit_statement(self, statement)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by the default implementation ... it looks like it recurses straight back into itself. I'm sure I'm missing something obvious here but could you explain this?

Copy link
Contributor Author

@benesch benesch Jun 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing! I stole this pattern from the syn crate's visitor. The goal is to minimize the amount of code that the consumer needs to write.

Basically, there's a trait method with a default implementation that calls a free function (with the same name, hence the confusion), and that free function that does the actual walking of the AST. Here's the first few lines of visit_statement, for example:

pub fn visit_statement<'ast, V: Visit<'ast> + ?Sized>(
     visitor: &mut V,
     statement: &'ast SQLStatement,
 ) {
     match statement {
         SQLStatement::SQLQuery(query) => visitor.visit_query(query),
         SQLStatement::SQLInsert {
             table_name,
             columns,
             source,
         } => visitor.visit_insert(table_name, columns, source),
         // ...
     }
 }

Here's a more concrete example of a visitor that makes use of the free function to avoid reimplementing the AST-walking code:

struct MyVisitor {
    // arbitrary state
}

impl<'ast> sqlparser::visit::Visit<'ast> for MyVisitor<'ast> {
    fn visit_function(&mut self, func: &'ast SQLFunction) {
        // pre-visit
        sqlparser::visit::visit_function(func);
        // post-visit
    }
}

Basically, overriding trait methods is really convenient, because you only have to override the methods whose behavior you want to change. But sometimes you don't want to override the method entirely; you just want to tack on some behavior to the beginning or end. Having a public free function makes this quite ergonomic. You override the trait method, do whatever you need to, and then call the free function to keep recursing when you're ready—or perhaps not at all, if you've already found what you need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks for explaining that to me.

@andygrove andygrove merged commit 7ed6e03 into apache:master Jun 11, 2019
@andygrove
Copy link
Member

Sorry @benesch this somehow broke the master build and I had to revert

@benesch benesch mentioned this pull request Jun 11, 2019
@benesch
Copy link
Contributor Author

benesch commented Jun 11, 2019

No problem, @andygrove—I fixed the merge skew and resubmitted as #114.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants