go/ast: provide AST Rewrite functionality (ast.Walk is not good enough) #17108
Comments
cc: @alandonovan |
This seems like a good addition. The reflect-based implementations in gofmt (and eg) are fearsomely complex, and there are other times I have wanted to use this function. The index is not enough to uniquely identify a subtree of a node. For example, ast.CaseClause contains two slices of Nodes, and ast.BinaryExpr contains two Exprs. I think we need to identify the field too. The most obvious way to do that is by its name. Field numbers, or field offsets, might be marginally more efficient, but would certainly be harder to read, and I'm sure users would thank us for choosing strings when they're debugging. I'm not sure you need the index (or index + field name) in the Apply function, only in ApplyFunc. |
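To illustrate the ambiguity described above, here is a minimal sketch showing that a slice index alone cannot locate a child of an ast.CaseClause, while a (field name, index) pair can. The helpers childLocation and sampleClause are hypothetical names made up for this example; they are not part of go/ast or of any proposed API.

```go
package main

import (
	"fmt"
	"go/ast"
)

// childLocation reports which field of cc holds n and at which slice index.
// It returns ("", -1) if n is not a direct child of cc.
func childLocation(cc *ast.CaseClause, n ast.Node) (field string, index int) {
	for i, e := range cc.List {
		if e == n {
			return "List", i
		}
	}
	for i, s := range cc.Body {
		if s == n {
			return "Body", i
		}
	}
	return "", -1
}

// sampleClause builds a clause roughly equivalent to `case a, b: f; g`.
func sampleClause() *ast.CaseClause {
	return &ast.CaseClause{
		List: []ast.Expr{ast.NewIdent("a"), ast.NewIdent("b")},
		Body: []ast.Stmt{
			&ast.ExprStmt{X: ast.NewIdent("f")},
			&ast.ExprStmt{X: ast.NewIdent("g")},
		},
	}
}

func main() {
	cc := sampleClause()
	// Both children sit at index 1; only the field name tells them apart.
	for _, n := range []ast.Node{cc.List[1], cc.Body[1]} {
		field, index := childLocation(cc, n)
		fmt.Printf("%s[%d]\n", field, index)
	}
}
```

Both lookups yield index 1, which is why an index without a field name is ambiguous for this node type.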
I've also wanted this. Index is sufficient I think if it counts across all children. E.g., in AssignStmt, it counts from 0 to len(Lhs)+len(Rhs)-1. I've had cases where I want to be able to walk up the AST more than one Node, so I maintain a slice of (Parent, Index) tuples. Aside: Index technically isn't necessary, and Java's Compiler API omits it from com.sun.source.util.TreePath. But I think it's handy to have. I'm not sure I understand what addr's dynamic type will be from the description. I'm guessing though that if I'm visiting a *ast.Ident, then it could be either a **ast.Ident (e.g., the name for a ValueSpec) or *ast.Expr (e.g., an identifier in an expression context)? |
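A small sketch of the single flattened index described above, for ast.AssignStmt only: the index runs over Lhs first and then Rhs, from 0 to len(Lhs)+len(Rhs)-1. The names flatChild and sampleAssign are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/token"
)

// flatChild returns the child of as at a single index that counts across
// Lhs and then Rhs. It panics if i is out of range, like a slice access.
func flatChild(as *ast.AssignStmt, i int) ast.Expr {
	if i < len(as.Lhs) {
		return as.Lhs[i]
	}
	return as.Rhs[i-len(as.Lhs)]
}

// sampleAssign builds the statement `a, b = c, d`.
func sampleAssign() *ast.AssignStmt {
	return &ast.AssignStmt{
		Lhs: []ast.Expr{ast.NewIdent("a"), ast.NewIdent("b")},
		Tok: token.ASSIGN,
		Rhs: []ast.Expr{ast.NewIdent("c"), ast.NewIdent("d")},
	}
}

func main() {
	as := sampleAssign()
	for i := 0; i < len(as.Lhs)+len(as.Rhs); i++ {
		fmt.Println(i, flatChild(as, i).(*ast.Ident).Name)
	}
}
```

With this scheme the caller has to know the lengths of the preceding fields to map an index back to a field, which is the trade-off against passing field names.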
The dynamic type would be a pointer to the variable (field or slice element), whatever that may be: Node, Expr, *Ident. |
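As a rough illustration of what "a pointer to the variable" means in practice, the sketch below takes the address of a []*ast.Ident element and of an ast.Expr field and inspects the dynamic type of each address. The functions describe and demo are hypothetical helpers, not part of any proposed API.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/token"
)

// describe reports the dynamic type of an address and, where possible,
// the identifier stored in the variable it points to.
func describe(addr interface{}) string {
	switch p := addr.(type) {
	case **ast.Ident:
		return "**ast.Ident -> " + (*p).Name
	case *ast.Expr:
		// A second type switch is needed to look inside the interface.
		if id, ok := (*p).(*ast.Ident); ok {
			return "*ast.Expr -> " + id.Name
		}
		return fmt.Sprintf("*ast.Expr -> %T", *p)
	}
	return fmt.Sprintf("%T", addr)
}

// demo takes addresses of two variables that both hold an *ast.Ident.
func demo() []string {
	spec := &ast.ValueSpec{Names: []*ast.Ident{ast.NewIdent("x")}}
	bin := &ast.BinaryExpr{X: ast.NewIdent("y"), Op: token.ADD, Y: ast.NewIdent("z")}
	return []string{
		describe(&spec.Names[0]), // element of []*ast.Ident
		describe(&bin.X),         // field of type ast.Expr
	}
}

func main() {
	for _, s := range demo() {
		fmt.Println(s)
	}
}
```

The same identifier node is reached through two different address types, depending on the declared type of the variable that holds it.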
Here's a slightly more thought-through API that provides explicit names for fields. Nice, but definitely somewhat costly:
|
@mdempsky points out that alternatively, index could simply be monotonically increasing. The index of a list element would be the provided index minus the index of the list. A utility function could provide the name of a field for a given index if necessary. That function could be reflection-based since it's probably not speed-critical. |
We probably still want the field name though, so that you can locate the subtree easily in cases like SliceExpr without having to do four separate comparisons---and arguably more importantly, without having to remember to do four separate comparisons. I have found several bugs caused by assuming that a given node was the "correct" subtree of the parent node, when in fact it was another one. Passing the field name will help to make it obvious that this is something you need to think about. |
I think the easier solution is to still reserve indices for nil fields. So for SliceExpr you'd always have 0==X, 1==Low, 2 == High, 3==Max. |
You can't use the same number to indicate both a field index and a slice index. |
I played with this a bit and I have a prototype that could work. Observations: Passing the address of a field is problematic: Many fields that we are interested in are (ast.)Expr, Stmt, and Decl fields, which are interfaces. To "unpack" them we have to type-switch on the address, then deref the address and then type-switch again on the contents; thus requiring two type switches.

Here's a better approach: If we have the parent, field name (or field index), plus slice index if needed, we have all that is necessary because we can use reflect to set a field with this information. Changing/rewriting fields is (probably) much less common than reading (traversing) the tree, thus the more costly set field operation is ok.

In turn, the API can be closer to Walk and thus easier to use:
Open questions:
|
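The reflect-based field update mentioned in the comment above can be sketched roughly as follows. This is only an illustration under assumptions: setChild is a hypothetical helper (it is not the prototype's code), and a negative index is used here to mean "the field is not a slice".

```go
package main

import (
	"fmt"
	"go/ast"
	"go/token"
	"go/types"
	"reflect"
)

// setChild stores n in parent.<field>, or in parent.<field>[index] when
// index >= 0. It panics if the field does not exist or n is not assignable.
func setChild(parent ast.Node, field string, index int, n ast.Node) {
	v := reflect.Indirect(reflect.ValueOf(parent)).FieldByName(field)
	if index >= 0 {
		v = v.Index(index)
	}
	if n == nil {
		v.Set(reflect.Zero(v.Type())) // clear the field or slice element
		return
	}
	v.Set(reflect.ValueOf(n))
}

// sampleCall builds the expression `f(a, b)`.
func sampleCall() *ast.CallExpr {
	return &ast.CallExpr{
		Fun:  ast.NewIdent("f"),
		Args: []ast.Expr{ast.NewIdent("a"), ast.NewIdent("b")},
	}
}

func main() {
	call := sampleCall()
	// Replace the second argument with an integer literal.
	setChild(call, "Args", 1, &ast.BasicLit{Kind: token.INT, Value: "42"})
	// Replace the non-slice field Fun.
	setChild(call, "Fun", -1, ast.NewIdent("g"))
	fmt.Println(types.ExprString(call))
}
```

The reflection cost is paid only when a node is actually replaced, which matches the observation that rewriting is much rarer than traversal.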
On 14 September 2016 at 23:52, Robert Griesemer notifications@github.com
Sounds good. It's interesting that Apply is now a primitive interface for
|
@alandonovan ast.Walk was written w/o a very good understanding of use cases; in fact, I introduced ast.Inspect later because it was much easier to use in many situations. We cannot remove these functions but we could mark them as deprecated. ast.Apply should cover both use cases nicely. |
I didn't mean you should remove or deprecate them, only that you should go through the exercise of implementing the old functions in terms of the new, since it might be revealing. |
@alandonovan I understand. I think it can be done but it will be difficult to guarantee 100% semantic equivalence w/o manually checking each case, or an extensive test suite testing each node type and exit scenario. For one, Walk doesn't invoke the visitor if a node is nil, while Apply probably should (and my prototype currently does) so that the pre/post closures have a chance to rewrite/set the node (that feature will make it possible to implement Walk, while Walk could not be used to implement Apply since it misses nodes). |
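The nil-node behaviour mentioned above is easy to observe with the existing traversal functions. In the sketch below, the slice expression a[:n] has a nil Low field, and ast.Inspect (which is built on ast.Walk) never presents that slot to the callback. The helper visitedKinds is a hypothetical name used only here.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// visitedKinds parses src as an expression and returns the dynamic types of
// all nodes that ast.Inspect hands to the callback, in traversal order.
func visitedKinds(src string) ([]string, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return nil, err
	}
	var kinds []string
	ast.Inspect(expr, func(n ast.Node) bool {
		// Inspect calls the function with nil after a node's children have
		// been visited; that is an end marker, not a nil child field.
		if n != nil {
			kinds = append(kinds, fmt.Sprintf("%T", n))
		}
		return true
	})
	return kinds, nil
}

func main() {
	kinds, err := visitedKinds("a[:n]")
	if err != nil {
		panic(err)
	}
	// The SliceExpr has four child slots (X, Low, High, Max), but only the
	// non-nil ones (X and High) are visited.
	fmt.Println(kinds)
}
```

A rewriting traversal that wants to fill in a missing Low or Max would need to be told about those empty slots, which Walk does not do.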
I think we're talking past each other. Examples of how I use a single index to identify child nodes:
That said, I'm not opposed to field names. |
On 15 September 2016 at 12:48, Matthew Dempsky notifications@github.com
Ah, so you would flatten out the scalar fields and the slices into one |
For a concrete implementation, see: |
I recently started on a tool for vim-go and wanted to easily rewrite AST as well. I've seen this issue and am really looking forward to what we get out of this. Meanwhile, I've copied Also I think, we can introduce a |
@fatih Any reason the version I mentioned above ( griesemer/dotGo2016@f0f16c2 ) wouldn't have worked for you? |
@griesemer at first the API was a little bit confusing for me, so I didn't try to investigate it in depth. But I've tried to give it a look now that you asked for it. There are more steps involved changing a node with

    src := `package main
    type Foo struct{}`
    fset := token.NewFileSet()
    file, _ := parser.ParseFile(fset, "foo.go", src, parser.ParseComments)

I've tried to rename the type name from

1. Apply:

    applyFunc := func(parent ast.Node, name string, index int, n ast.Node) bool {
    	spec, ok := parent.(*ast.GenDecl)
    	if !ok {
    		return true
    	}
    	x, ok := n.(*ast.TypeSpec)
    	if !ok {
    		return true
    	}
    	x.Name.Name = "Bar"
    	spec.Specs[index] = x
    	return true
    }
    rewritten := ast.Apply(file, applyFunc, nil)

2. Walk() from astrewrite:

    rewriteFunc := func(n ast.Node) (ast.Node, bool) {
    	x, ok := n.(*ast.TypeSpec)
    	if !ok {
    		return n, true
    	}
    	x.Name.Name = "Bar"
    	return x, true
    }
    rewritten := astrewrite.Walk(file, rewriteFunc)

With -- Note that my current use case is very simple. There might be other use cases that |
@fatih Thanks for looking into this. A few observations:
To summarize: If you only want to change the name of things, you may not need a rewrite at all. Secondly, while Anyway, thanks again for investigating! |
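For the simple rename case discussed above, a sketch of doing it without any rewrite API might look like the following: ast.Inspect finds the *ast.TypeSpec and its Name is mutated in place. The function renameType is a hypothetical helper written for this example, and it only renames the declaration, not uses of the type elsewhere in the file.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
)

// renameType renames type declarations named from to to by mutating the
// identifier in place. It returns the printed source and the number of
// declarations that were renamed.
func renameType(src, from, to string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "foo.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}
	renamed := 0
	ast.Inspect(file, func(n ast.Node) bool {
		if ts, ok := n.(*ast.TypeSpec); ok && ts.Name.Name == from {
			ts.Name.Name = to // no node is replaced, so no rewrite is needed
			renamed++
		}
		return true
	})
	var buf bytes.Buffer
	if err := printer.Fprint(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), renamed, nil
}

func main() {
	out, n, err := renameType("package main\n\ntype Foo struct{}\n", "Foo", "Bar")
	if err != nil {
		panic(err)
	}
	fmt.Printf("renamed %d declaration(s)\n%s", n, out)
}
```

A rewrite is only required when a node has to be replaced by a different node, for example when its type changes.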
@griesemer Thanks a lot for the detailed explanation. I didn't know that I could just use I'll try to use |
@griesemer it'd be handy if that code was go-gettable, so it is easier to use and evolve. Since you wrote it, perhaps you want to put it up on github under your name? Or I could put it up somewhere for you; let me know. I've pulled it out of package ast for you: https://gist.github.com/josharian/ff74a8451d7d4e7f062d7b9b04c87eac. Not tested yet, but it compiles, and the only modification was adding |
Indeed. Although maybe if we documented that the lifetime of an ApplyCursor is a single call to pre/post, then we could allocate a single ApplyCursor and reuse it. The non-allocation part of the setup of the ApplyCursor should be pretty cheap, I think. One other thing I learned while using the API above is that frequently I don't want to walk the node inserted by InsertAfter. We might want to change the behavior to non-walk (since the user can manually walk it if desired) or add a bool parameter to control the behavior. |
@josharian Yes, if you pass the ApplyCursor info down, it's like passing the apply parameters via the ApplyCursor struct, and it's only allocated once. I like that. Care to make this an official CL as a solution for this issue (assign to me for review)? |
Will do; expect it to take a few days. |
Thanks - no rush. |
(still working on this, not forgotten) |
Change https://golang.org/cl/55790 mentions this issue: |
We don't use the issue tracker for questions; in the future please ask on one of the mailing lists or chat boards.
The go/ast documentation groups the nodes in Decl, Expr, and Stmt, and the grouping is pretty clearly telling you which node fits where. It's also trivial to find out empirically: if you can assign a node x to an ast.Stmt, then x is a statement, etc. |
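The assignability check described above can be done at compile time or with a type switch at run time. The sketch below shows both; kind and classify are hypothetical helper names chosen for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// Compile-time checks: these lines only compile if the node types can be
// assigned to the respective interface.
var (
	_ ast.Stmt = (*ast.AssignStmt)(nil)
	_ ast.Expr = (*ast.Ident)(nil)
	_ ast.Decl = (*ast.FuncDecl)(nil)
)

// kind classifies a node by the go/ast interface it satisfies.
func kind(n ast.Node) string {
	switch n.(type) {
	case ast.Expr:
		return "expression"
	case ast.Stmt:
		return "statement"
	case ast.Decl:
		return "declaration"
	}
	return "other"
}

// classify parses src as a file and maps each node type found in it
// (formatted with %T) to its kind.
func classify(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	kinds := make(map[string]string)
	ast.Inspect(file, func(n ast.Node) bool {
		if n != nil {
			kinds[fmt.Sprintf("%T", n)] = kind(n)
		}
		return true
	})
	return kinds, nil
}

func main() {
	kinds, err := classify("package p\nfunc f() { x := 1; _ = x }")
	if err != nil {
		panic(err)
	}
	for _, t := range []string{"*ast.File", "*ast.FuncDecl", "*ast.AssignStmt", "*ast.Ident"} {
		fmt.Println(t, "is", kinds[t])
	}
}
```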
@griesemer sorry. I deleted the question. will try to find help on the boards |
Is there a reason why the proposed ast.Apply returns a (possibly) modified ast.Node? This seems like a departure from ast.Walk, and I don't understand why it is necessary to return a Node instead of modifying it in place. |
@smasher164 Because ast.Apply may change the type of the node, which cannot be done by modifying a node in place. For instance, an ast.Apply working on a constant expression tree may replace that tree with a single node which is the constant result. |
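The constant-folding case mentioned above shows why a replacement has to be returned rather than written in place: a *ast.BinaryExpr turns into a *ast.BasicLit, which is a different concrete type. The sketch below is a hand-written recursive fold, not ast.Apply; fold, intLit and foldString are hypothetical helpers, only integer addition is handled, and overflow is ignored.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"strconv"
)

// intLit reports the value of e if it is an integer literal.
func intLit(e ast.Expr) (int64, bool) {
	lit, ok := e.(*ast.BasicLit)
	if !ok || lit.Kind != token.INT {
		return 0, false
	}
	v, err := strconv.ParseInt(lit.Value, 0, 64) // base 0 handles 0x, 0o, 0b
	return v, err == nil
}

// fold returns a replacement for e in which sums of integer literals are
// collapsed into a single literal. The result may have a different
// concrete type than e, so the caller must store the returned value.
func fold(e ast.Expr) ast.Expr {
	bin, ok := e.(*ast.BinaryExpr)
	if !ok {
		return e
	}
	bin.X = fold(bin.X)
	bin.Y = fold(bin.Y)
	l, lok := intLit(bin.X)
	r, rok := intLit(bin.Y)
	if lok && rok && bin.Op == token.ADD {
		return &ast.BasicLit{Kind: token.INT, Value: strconv.FormatInt(l+r, 10)}
	}
	return e
}

// foldString parses src as an expression, folds it, and prints the result.
func foldString(src string) (string, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "", err
	}
	return types.ExprString(fold(expr)), nil
}

func main() {
	for _, src := range []string{"1 + 2", "1 + 2 + x"} {
		out, err := foldString(src)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s => %s\n", src, out)
	}
}
```

Note how fold has to assign to bin.X and bin.Y explicitly; a general Apply would do that bookkeeping for every node type.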
Change https://golang.org/cl/74930 mentions this issue: |
Based on work by Robert Griesemer.

Fixes golang#17108

DO NOT REVIEW

Work in progress. Notes:

Needs more tests, particularly of interactions between modifications, like Delete then InsertAfter, Delete then InsertBefore, and so on. Even just the few tests I've added so far have helped to clarify the API.

I've changed Apply to not walk the node inserted by InsertAfter.

Optimize: Need to allocate a single ApplyCursor and reuse it.

TODO: Review API, consider shrinking it a bit. Do we need ApplyCursor.IsFile to be exported? ApplyCursor.Name? And so on.

Should Apply do anything with comments and/or CommentMaps?

Should Apply.Name be renamed Apply.FieldName?

Change-Id: I291bb3f8aba85abdeb728714c08702c082617f54
Change https://golang.org/cl/77811 mentions this issue: |
This has come up again and again. While it's easy to traverse the AST (ast.Walk, ast.Inspect), it's not easy to use those for general tree rewrites without a significant amount of work.
For instance, if we wanted to rewrite all expressions (ast.Expr) of a certain kind into something else, ast.Walk and ast.Inspect would need to consider every node that contains ast.Expr fields separately, rather than just look at nodes that satisfy ast.Expr.
AST rewriters exist in other packages; most notably perhaps in gofmt (for gofmt -r); that one is reflection-based. Reflection-based approaches tend to be general, but also hard to understand, and slower than necessary.
API starting point: