Skip to content

x/tools/go/ast/inspector: add Cursor, to enable partial and multi-level traversals #70859

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
adonovan opened this issue Dec 15, 2024 · 16 comments

Comments

@adonovan
Copy link
Member

adonovan commented Dec 15, 2024

Background: The inspector package enables efficient repeated traversal over the list of ast.Files that make up a package. However, unlike ast.Inspect, it provides no way to inspect just some subtree, which is a common and recurring need. In particular, it is common to need a two-level traversal, in which one first inspects to find each node of type A, then conditionally visits some subtrees of it looking for nodes of type B.

This need could be addressed if the inspector could iterate over a sequence of abstract cursor positions from which the actual ast.Node could be queried. A cursor can then be used as the basis for a new traversal of the subtree rooted at that node, with a different type filter.

As a bonus, the WithStack operation would not be needed because the stack can be derived relatively efficiently from the cursor position by walking back from the current point, skipping from pop to push nodes.

Unfortunately the inspector as defined is stateless, so we can't simply add a Current() Cursor method to it, nor sneak a Cursor among the arguments to the existing callback functions. Cursor would need all the same visitation methods as Inspector, and Inspector's methods would conceptually delegate to the "root" Cursor. However, we can do better than simply duplicating the old API.

Prototype implementation at https://pkg.go.dev/golang.org/x/tools/internal/astutil/cursor.

Proposal: We propose to add the inspector.Cursor type and related methods, plus the supporting edge.Kind enum type.

Summary:

package inspector // golang.org/x/tools/go/ast/inspector

type Cursor struct{...}
func (c Cursor) Enclosing(types ...ast.Node) iter.Seq[Cursor]
func (c Cursor) Child(n ast.Node) Cursor
func (c Cursor) Children() iter.Seq[Cursor]
func (c Cursor) Contains(Cursor) bool 
func (c Cursor) FindNode(n ast.Node) (Cursor, bool)
func (c Cursor) FindByPos(start, end token.Pos) (Cursor, bool)
func (c Cursor) FirstChild() (Cursor, bool)
func (c Cursor) Index() int32
func (c Cursor) Inspect(types []ast.Node, f func(c Cursor, push bool) (descend bool))
func (c Cursor) LastChild() (Cursor, bool)
func (c Cursor) NextSibling() (Cursor, bool)
func (c Cursor) Node() ast.Node
func (c Cursor) Parent() Cursor
func (c Cursor) ParentEdge() (edge.Kind, int)
func (c Cursor) Preorder(types ...ast.Node) iter.Seq[Cursor]
func (c Cursor) PrevSibling() (Cursor, bool)
func (c Cursor) Stack(stack []Cursor) []Cursor
func (c Cursor) String() string

package edge // golang.org/x/tools/go/ast/edge
type Kind uint8
const (
	Invalid Kind = iota // for nodes at the root of the traversal

	// Kinds are sorted alphabetically.
	// Numbering is not stable.
	// Each is named Type_Field, where Type is the
	// ast.Node struct type and Field is the name of the field

	ArrayType_Elt
	ArrayType_Len
	AssignStmt_Lhs
	AssignStmt_Rhs
	...
)
func (k Kind) FieldName() string
func (k Kind) FieldType() reflect.Type
func (k Kind) Get(n ast.Node, idx int) ast.Node
func (k Kind) NodeType() reflect.Type
func (k Kind) String() string

Detailed API

package inspect

// -- existing API -- 

type Inspector

func (*Inspector) Nodes(types []ast.Node, f func(n ast.Node, push bool) (proceed bool)) 
func (*Inspector) Preorder(types []ast.Node, f func(ast.Node))
func (*Inspector) PreorderSeq(types ...ast.Node) iter.Seq[ast.Node]
func (*Inspector) WithStack(types []ast.Node, f func(n ast.Node, push bool, stack []ast.Node) (proceed bool))

// -- new API --

// A Cursor represents an ast.Node. It is immutable.
type Cursor struct {
    in *Inspector
    index int
}    

// Root returns a cursor for the virtual root node,
// whose children are the ast.Files provided to NewInspector.
//
// Its Node and Stack methods returns nil.
func (*Inspector) Root() Cursor

// At returns the Cursor belonging to this Inspector whose [Cursor.Index] is index.
func (*Inspector) At(index int32) Cursor

// Node returns the node at the current cursor position.
func (Cursor) Node() ast.Node

// String returns a description of the current node.
func (Cursor) String() string

// -- traversals --

// All the usual traversals over the inspector can
// be offered as traversals within a subtree rooted at a given
// node, allowing multi-level traversals with independent
// control over type-based node filtering and independent control
// over pruning descent into subtrees. We can also provide fast
// means to obtain the cursor for a particular node or position.

func (Cursor) Preorder(types ...ast.Node) iter.Seq[Cursor]
func (Cursor) Inspect(types []ast.Node, f func(c Cursor, push bool) (descend bool))

// -- navigation in the four cardinal directions --

// ancestors

// Parent returns the parent of the current node.
// It may return the root node (whose Cursor.Node methods returns nil).
func (Cursor) Parent() Cursor

// Enclosing returns an iterator over the nodes enclosing the current node, starting with this Cursor.
// Enclosing must not be called on the Root node (whose [Cursor.Node]) returns nil).
// The types argument, if non-empty, enables type-based filtering of events:
// the sequence includes only nodes whose type matches an element of the types slice.
func (Cursor) Enclosing(types ...ast.Node) iter.Seq[Cursor]

// Stack returns the stack of enclosing nodes,
// from the ast.File down to the current cursor's node.
// It appends to the provided slice, which must be empty,
// but may have spare capacity.
func (Cursor) Stack([]Cursor) []Cursor

// ParentEdge returns the identity of the field in the parent node that holds this cursor's node, and if it is a list, the index within it.
// For example, f(x, y) is a CallExpr whose three children are Idents. f has edge kind edge.CallExpr_Fun and index -1. x and y have kind edge.CallExpr_Args and indices 0 and 1, respectively.
// ParentEdge must not be called on the Root node (whose [Cursor.Node]) returns nil).
// If called on a child of the Root node, it returns (edge.Invalid, -1).
func (c Cursor) ParentEdge() (edge.Kind, int)

// children

func (Cursor) FirstChild() (Cursor, bool)
func (Cursor) LastChild() (Cursor, bool)
func (Cursor) Children() iter.Seq[Cursor]

// siblings

func (Cursor) NextSibling() (Cursor, bool)
func (Cursor) PrevSibling() (Cursor, bool)

// -- search --

func (c Cursor) FindNode(n ast.Node) (Cursor, bool)
func (c Cursor) FindByPos(start, end token.Pos) (Cursor, bool)

Example usage:

var (
   filterA = []ast.Node{(*ast.File)(nil)}
   filterB = []ast.Node{(*ast.BinaryExpr)(nil)}
)


var in *inspect.Inspector = ...

for curFile:= range in.Cursor().Preorder(filterA...) {
    file := curFile.Node().(*ast.File)
    ...
    for curBinary := range c.Preorder(filterB...) {
        binary := curBinary.Node().(*ast.BinaryExpr)
	fmt.Println(c.Stack(nil))
    }
}

See https://go.dev/cl/636656 for the implementation.

@findleyr @timothy-king @griesemer

Summary of changes:

  • Ancestors -> Enclosing, and made reflexive
  • Added Cursor.Index, Inspector.At
  • Added Cursor.Contains
@gopherbot gopherbot added this to the Proposal milestone Dec 15, 2024
@gabyhelp
Copy link

Related Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@adonovan adonovan moved this to Incoming in Proposals Dec 15, 2024
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/636156 mentions this issue: go/ast/inspector: Cursor: partial and multi-level traversals.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/636656 mentions this issue: gopls/internal/analysis/inspect: hack for Cursor access

@findleyr
Copy link
Member

Summarizing discussion from the CL: we're a little concerned about Cursor methods with suboptimal performance, since the Inspector itself was created as an optimization. Specifically:

  • The fact that Cursor.Preorder does not allow for pruning branches of the traversal may unsatisfactory. The advantage of a Cursor is that it allows for nested traversal, and in such cases you almost always want to prune the parent traversal.
  • Using Stack is not as fast as a WithStack traversal, at least in naive benchmarks. In practice it may still be better, if most iterations of the body don't access the stack.

We are going to use the Cursor internally to see which API is the most useful.

gopherbot pushed a commit to golang/tools that referenced this issue Dec 23, 2024
This CL defines the Cursor type, which represents a node in the
traversal done by go/ast/inspector (represented by the index
of its push event). From this, we can derive all the operators for
navigating the AST in the four "cardinal directions", similar
to the DOM of a web browser:

- Parent (the immediate parent)
- Stack (all ancestors)
- Children, FirstChild, and LastChild
- PrevSibling and NextSibling

All operations are O(1) except Parent, which is O(n) for a
node in a list of length n.

In addition, all the usual traversals over the inspector can
be offered as traversals within a subtree rooted at a given
node, allowing multi-level traversals with independent
control over type-based node filtering and independent control
over pruning descent into subtrees. We can also provide fast
means to obtain the cursor for a particular node or position.

- Cursor.Inspect(types []ast.Node, f func(c Cursor, push bool) (bool))
- Cursor.Preorder(types ...ast.Node) iter.Seq[Cursor]
- Cursor.FindNode(n ast.Node) (Cursor, bool)
- Cursor.FindPos(start, end token.Pos) (Cursor, bool)

All of these operations are simple to express using inspector's
existing threaded tree representation, and they make it both
simpler and more efficient to express the kinds of queries
that turn up in our refactoring efforts. For example:
"Find all function literals; then find all defer statements
within them; then go to the previous statement to see
if it is a call to Mutex.Lock" etc.

This CL also includes one token usage of Cursor from an
existing analyzer.

The CursorStack benchmark is significantly slower than the
existing WithStack operation, but I suspect the frequency of
calls to Cursor.Stack is actually much rarer in practice than
the benchmark would suggest, because one typically calls Stack
only after a highly selective filtering. When Stack is called
infrequently, the Cursor-based traversal is about 27% faster
while still providing the option to obtain the stack when
needed.

We will evaluate these operators in the coming weeks and
update proposal golang/go#70859 in light of our experience.
In the meantime, we must use linkname hackery to augment
the existing inspector for use in gopls.

Updates golang/go#70859

Change-Id: I1ec8c1a20dc07ad80dad8f0038c0cf3f8f791050
Reviewed-on: https://go-review.googlesource.com/c/tools/+/636656
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
@adonovan
Copy link
Member Author

  • The fact that Cursor.Preorder does not allow for pruning branches of the traversal may unsatisfactory.

The Inspect method provides pruning, at the cost of the need to deal with the annoying push bool parameter, which in practice has never really been needed. We could remove it in Cursor.Inspect, but that seems needlessly inconsistent with Inspector.Inspect.

The advantage of a Cursor is that it allows for nested traversal, and in such cases you almost always want to prune the parent traversal.

(It's actually fairly uncommon to prune the parent traversal after a nested traversal.)

  • Using Stack is not as fast as a WithStack traversal, at least in naive benchmarks. In practice it may still be better, if most iterations of the body don't access the stack.

It's true that Stack is not as fast as WithStack if you need the stack for every visited node, but one never does: the stack is typically needed only for a small minority of visited nodes, and typically only the parent or perhaps grandparent stack elements are needed. So in practice Cursor is much faster.

@rsc
Copy link
Contributor

rsc commented Feb 13, 2025

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Feb 13, 2025
@aclements
Copy link
Member

An internal version of this is widely used in gopls and is being used in x/tools analysis, so it's already proven to be a good and useful API.

@adonovan , you mentioned there were a few tweaks you've made while using this. Could you update the proposed API to reflect them?

@griesemer and @adonovan are going to sit down "together" and make sure they agree on the API.

@adonovan
Copy link
Member Author

adonovan commented Mar 21, 2025

I've made some minor tweaks to the issue body above. The remaining issues to discuss are:

  1. Additional methods Inspector.At(index int32) and Cursor.Index() int32 (see https://go.dev/cl/657958) to enable compact representations of collections of cursors.
// At returns the cursor at the specified index in the traversal,
// which must have been obtained from [Cursor.Index] on a Cursor
// belonging to the same Inspector.
func (*inspector.Inspector) At(index int32) Cursor

// Index returns the index of this cursor position within the package.
//
// Clients should not assume anything about the numeric Index value
// except that it increases monotonically throughout the traversal.
// The value is intended only for use with [At].
//
// Index must not be called on the Root node.
func (c Cursor) Index() int32

Also, func (Cursor).Contains(Cursor) bool.

  1. Cursor.Ancestors should be made reflexive, since this is what is nearly always wanted (and when not it is easy to work around, unlike the converse situation which is inconvenient). This is however inconsistent with the conclusion of a similar proposal in the x/net/html package (x/net/html: add Node methods Ancestors, ChildNodes, Descendants #62113). Do we need to choose a different name?

  2. Which package should Cursor inhabit? It is currently in a separate package (cursor) out of necessity, because we cannot add new API to the inspector package--but perhaps this is also a virtue? For one thing, it's three letters shorter, and cursor.Cursor is likely to be a frequently mentioned type. However, if they are in separate packages then Inspector cannot mention Cursor, and methods such as Inspector.Root and Inspector.At would have to remain as standalone functions in the cursor package.

  3. Naming, in general.

@adonovan
Copy link
Member Author

Addressing the above:

  1. done (both parts);
  2. done, and Ancestors is renamed to Enclosing;
  3. I like the package name cursor but I think it will be simpler to understand if it all lives in package inspector. So the subject line of this proposal remains unchanged.
  4. any remaining questions of naming? @griesemer

@aclements
Copy link
Member

Would it be worth having a func NewCursor([]*ast.File) Cursor that directly constructs a Cursor, so users who don't need to use the Inspector don't need to first construct an Inspector?

func (c Cursor) FindPos(start, end token.Pos) (Cursor, bool)

It's not clear from the name if this returns a Cursor completely containing [start, end) or completely contained within [start, end), or somehow otherwise overlapping. Maybe we try to capture "cursor containing pos range" in the name?

@adonovan adonovan self-assigned this Apr 16, 2025
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/666056 mentions this issue: internal/astutil/cursor: four API refinements

@adonovan
Copy link
Member Author

Would it be worth having a func NewCursor([]*ast.File) Cursor that directly constructs a Cursor, so users who don't need to use the Inspector don't need to first construct an Inspector?

If we were starting over, I would remove all the methods from Inspector, since the Root Cursor provides everything, so that Inspector becomes just an abstract type representing the traversal data structure. I would also define New as func New[N ast.Node](...N) Cursor so that you could (in principle) use it with non-File trees, not that we have had any need to do so. But we would still want to keep the Inspector type, so that we can pick apart a Cursor as a pair (i *Inspector, index int32), which we need on occasion for efficiency of storage. In short, even if we could have made all its methods disappear, we still couldn't dispense with the Inspector concept, so there is little gained by adding a NewCursor wrapper.

Maybe we try to capture "cursor containing pos range" in the name?

@gri suggested Innermost, but was ok with FindByPos, which is what I changed it to in the most recent CL. I don't have any better ideas.

gopherbot pushed a commit to golang/tools that referenced this issue Apr 17, 2025
1. Eliminate Cursor.Stack: it is redundant wrt
   Cursor.Enclosing, and essentially never needed
   since the stack can easily be computed cheaply
   on demand.

2. Rename FindPos to FindByPos.

3. Remove "push bool" parameter from Inspect callback.
   It has never once been needed; the only times we
   have ever needed to have some effect after visiting
   a node, we have wanted to express the whole recursion
   ourselves (see e.g. freeVisitor in gopls).

4. Add Cursor.Inspector accessor for completeness.

Also, simplify the traversal in unusedparams.

Updates golang/go#70859

Change-Id: I1c5218d1597105d6ffb423cbbb2306d62cabc47d
Reviewed-on: https://go-review.googlesource.com/c/tools/+/666056
Commit-Queue: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
@aclements
Copy link
Member

FindByPos is fine. We should be sure to document that it finds the smallest node fully containing the given range. (@adonovan points out that often start==end, in which case this is the only definition that makes sense.)

@aclements
Copy link
Member

Based on the discussion above, this proposal seems like a likely accept.
— aclements for the proposal review group

The proposal details are in #70859 (comment)

@aclements
Copy link
Member

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— aclements for the proposal review group

The proposal details are in #70859 (comment)

@aclements aclements moved this from Likely Accept to Accepted in Proposals May 8, 2025
@aclements aclements changed the title proposal: x/tools/go/ast/inspector: add Cursor, to enable partial and multi-level traversals x/tools/go/ast/inspector: add Cursor, to enable partial and multi-level traversals May 8, 2025
@aclements aclements modified the milestones: Proposal, Backlog May 8, 2025
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/670835 mentions this issue: go/ast/inspector: publish Cursor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

6 participants