Rewrite `collect_tokens` implementations to use a flattened buffer #77250
Conversation
r? @oli-obk (rust_highfive has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 653edbfd18abfb56203811db0d974b9d5dc89d30 with merge f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa... |
☀️ Try build successful - checks-actions, checks-azure |
Queued f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa with parent 623fb90, future comparison URL. |
Finished benchmarking try commit (f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa): comparison URL. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below. Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
From doing some local profiling, the slowdown appears to be due to the fact that there |
Hmm, why didn't this show up in #76130? |
@petrochenkov: I added logic to skip token collection for attribute targets in more cases (for example, https://github.com/Aaron1011/rust/blob/feature/new-preexp-cfg-tmp/compiler/rustc_parse/src/parser/attr.rs#L541-L555). It looks like this may have masked the fact that token collection was more expensive when we actually performed it. |
Tokens that need to be replaced in #76130 should all be at the same level, right? |
Yes - however, the implementation ended up being significantly more complicated when I tried to do it that way. The issue is that we can have code like this:

```rust
#[derive(MyDerive)]
struct Bar {
    val: [u8; {
        struct Inner {
            #[cfg(FALSE)] field: u8
        }
        0
    }]
}
```

We want the tokens for … The current implementation of … |
I think either of these implementations would be much more complicated than the current approach. I'm going to see if it's possible to eliminate the performance impact of this PR before exploring the alternatives. |
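A minimal sketch of the flattened-buffer idea being discussed, using toy types: `FlatToken` and `replace_range` are illustrative stand-ins, not the actual `rustc_parse` types. The point is that once open/close delimiters are ordinary entries in a flat `Vec`, replacing a nested range (such as a `#[cfg(FALSE)]` field buried inside an anonymous-const block like the one above) is a single `Vec::splice`, regardless of depth.

```rust
#[derive(Clone, Debug, PartialEq)]
enum FlatToken {
    Token(String),   // an ordinary token, represented as text for the sketch
    OpenDelim(char), // e.g. '{', '[', '('
    CloseDelim(char),
}

/// Replace the tokens at `range` (positions in the *flattened* stream) with
/// `new_tokens`, no matter how deeply nested they were in the original tree.
fn replace_range(
    buf: &mut Vec<FlatToken>,
    range: std::ops::Range<usize>,
    new_tokens: Vec<FlatToken>,
) {
    // `Vec::splice` removes `range` and inserts `new_tokens` in its place.
    buf.splice(range, new_tokens);
}

fn main() {
    use FlatToken::*;
    // Flattened form of `{ #[cfg(FALSE)] field: u8 }`, heavily simplified:
    let mut buf = vec![
        OpenDelim('{'),
        Token("#".into()),
        OpenDelim('['),
        Token("cfg".into()),
        OpenDelim('('),
        Token("FALSE".into()),
        CloseDelim(')'),
        CloseDelim(']'),
        Token("field".into()),
        Token(":".into()),
        Token("u8".into()),
        CloseDelim('}'),
    ];
    // Remove the cfg'd-out field (positions 1..11) even though it sits
    // several delimiter levels deep in the original token tree.
    replace_range(&mut buf, 1..11, vec![]);
    assert_eq!(buf, vec![OpenDelim('{'), CloseDelim('}')]);
}
```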
Force-pushed from 653edbf to 5a048c4 (Compare)
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 5a048c438839606ab9e3db8d1055aa7b9c78b7d9 with merge f561c24c1fe21075a1a50561923515524aa1d33c... |
☀️ Try build successful - checks-actions, checks-azure |
Queued f561c24c1fe21075a1a50561923515524aa1d33c with parent 381b445, future comparison URL. |
Instead of trying to collect tokens at each depth, we 'flatten' the stream as we go along, pushing open/close delimiters to our buffer just like regular tokens. Once capturing is complete, we reconstruct a nested `TokenTree::Delimited` structure, producing a normal `TokenStream`.

The reconstructed `TokenStream` is not created immediately - instead, it is produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This closure stores a clone of the original `TokenCursor`, plus a record of the number of calls to `next()`/`next_desugared()`. This is sufficient to reconstruct the tokenstream seen by the callback without storing any additional state. If the tokenstream is never used (e.g. when a captured `macro_rules!` argument is never passed to a proc macro), we never actually create a `TokenStream`.

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the start/end of a delimited group.
* It can be easily extended to allow replacing tokens at an arbitrary 'depth' by just using `Vec::splice` at the proper position. This is important for PR rust-lang#76130, which requires us to track information about attributes along with tokens.
* The lazy approach to `TokenStream` construction allows us to easily parse an AST struct, and then decide after the fact whether we need a `TokenStream`. This will be useful when we start collecting tokens for `Attribute` - we can discard the `LazyTokenStream` if the parsed attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be negligible (see rust-lang#77250 (comment)). There is a small slowdown on a few benchmarks, but it only rises above 1% for incremental builds, where it represents a larger fraction of the much smaller instruction count. There is a ~1% speedup on a few other incremental benchmarks - my guess is that the speedups and slowdowns will usually cancel out in practice.
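A rough sketch, with stand-in types, of the lazy capture described above: `Token`, `TokenCursor`, and `LazyTokenStream` here are simplified placeholders, not the real `rustc_ast`/`rustc_parse` definitions. The key idea is that capturing only records a clone of the cursor and a count of `next()` calls; the actual stream is rebuilt on demand, or never, if nothing asks for it.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Token(String);

#[derive(Clone)]
struct TokenCursor {
    tokens: Vec<Token>,
    pos: usize,
}

impl TokenCursor {
    fn next(&mut self) -> Option<Token> {
        let tok = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        tok
    }
}

/// Everything needed to re-derive the captured tokens later: a clone of the
/// cursor as it was when capturing started, plus the number of `next()` calls
/// the parser made while parsing the AST node.
struct LazyTokenStream {
    start_cursor: TokenCursor,
    num_calls: usize,
}

impl LazyTokenStream {
    /// Replay the cursor only if someone actually asks for the tokens
    /// (e.g. the item is passed to a proc macro).
    fn create_token_stream(&self) -> Vec<Token> {
        let mut cursor = self.start_cursor.clone();
        (0..self.num_calls).filter_map(|_| cursor.next()).collect()
    }
}

fn main() {
    let cursor = TokenCursor {
        tokens: ["struct", "Foo", ";"]
            .iter()
            .map(|s| Token(s.to_string()))
            .collect(),
        pos: 0,
    };
    // Pretend the parser consumed 3 tokens while parsing `struct Foo;`.
    let lazy = LazyTokenStream { start_cursor: cursor, num_calls: 3 };
    // No token stream exists yet; it is only built on demand:
    println!("{:?}", lazy.create_token_stream());
}
```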
Force-pushed from f33fc5b to 593fdd3 (Compare)
@bors r=petrochenkov |
📌 Commit 593fdd3 has been approved by `petrochenkov` |
⌛ Testing commit 593fdd3 with merge b996a41c46a27fac7c931f52cbf04266cb213d12... |
💔 Test failed - checks-actions |
@bors retry |
☀️ Test successful - checks-actions, checks-azure |
It seems like this ended up being a regression on a scattering of real world crates. @Aaron1011 -- do you think that's basically inevitable? I don't think we should revert this regardless. |
That's odd - an earlier perf run appeared net neutral. Right now, we're collecting tokens in more cases than is strictly necessary (builtin attributes other than |
@Aaron1011
I don't think we need a list here, only a single number - index of the position in DFS traversal, or something similar to the number of
|
…, r=petrochenkov Unconditionally capture tokens for attributes. This allows us to avoid synthesizing tokens in `prepend_attr`, since we have the original tokens available. We still need to synthesize tokens when expanding `cfg_attr`, but this is an unavoidable consequence of the syntax of `cfg_attr` - the user does not supply the `#` and `[]` tokens that a `cfg_attr` expands to. This is based on PR rust-lang#77250 - this PR exposes a bug in the current `collect_tokens` implementation, which is fixed by the rewrite.
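A small illustrative example (ordinary user code, not compiler internals) of the `cfg_attr` point above: the `#`, `[`, and `]` tokens of the expanded attribute never appear in the source, so they have to be synthesized during expansion, whereas a plain outer attribute now has its original tokens captured and can reuse them.

```rust
// The user writes this:
#[cfg_attr(test, derive(Debug))]
struct Point {
    x: f64,
    y: f64,
}

fn main() {
    let p = Point { x: 1.0, y: 2.0 };
    println!("{} {}", p.x, p.y);
}

// When `test` is enabled, the item behaves as if it had been written as
// `#[derive(Debug)] struct Point { .. }`. The `#`, `[`, and `]` of that
// attribute were never typed by the user, so the compiler must create those
// tokens itself; it cannot simply reuse captured source tokens.
```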
For the `MiddleDot` case, current behaviour:

- For a case like `1.2`, `sym1` is `1` and `sym2` is `2`, and `self.token` holds `1.2`.
- It creates a new ident token from `sym1` that it puts into `self.token`.
- Then it does `bump_with` with a new dot token, which moves the `sym1` token into `prev_token`.
- Then it does `bump_with` with a new ident token from `sym2`, which moves the `dot` token into `prev_token` and discards the `sym1` token.
- Then it does `bump`, which puts whatever is next into `self.token`, moves the `sym2` token into `prev_token`, and discards the `dot` token altogether.

New behaviour:

- Skips creating and inserting the `sym1` and dot tokens, because they are unnecessary.
- This also demonstrates that the comment about `Spacing::Alone` is wrong -- that value is never used. That comment was added in rust-lang#77250, and AFAICT it has always been incorrect.

The commit also expands comments. I found this code hard to read previously; the examples in comments make it easier.
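For context, a plain-Rust example (not the parser code itself) of the situation the `MiddleDot` case handles: in `t.1.2`, the lexer emits the single float-literal token `1.2`, which the parser must break apart into the two tuple-field accesses `.1` and `.2`.

```rust
fn main() {
    let t = (0u8, (10u8, 20u8, 30u8));
    // To the lexer, `1.2` below is one float-literal token; the parser has to
    // reinterpret it as the field access `.1` followed by `.2`.
    let inner = t.1.2;
    assert_eq!(inner, 30u8);
}
```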