Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite collect_tokens implementations to use a flattened buffer #77250

Merged
merged 1 commit into from
Oct 21, 2020

Conversation

Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Sep 27, 2020

Instead of trying to collect tokens at each depth, we 'flatten' the
stream as we go allong, pushing open/close delimiters to our buffer
just like regular tokens. One capturing is complete, we reconstruct a
nested TokenTree::Delimited structure, producing a normal
TokenStream.

The reconstructed TokenStream is not created immediately - instead, it is
produced on-demand by a closure (wrapped in a new LazyTokenStream type). This
closure stores a clone of the original TokenCursor, plus a record of the
number of calls to next()/next_desugared(). This is sufficient to reconstruct
the tokenstream seen by the callback without storing any additional state. If
the tokenstream is never used (e.g. when a captured macro_rules! argument is
never passed to a proc macro), we never actually create a TokenStream.

This implementation has a number of advantages over the previous one:

  • It is significantly simpler, with no edge cases around capturing the
    start/end of a delimited group.

  • It can be easily extended to allow replacing tokens an an arbitrary
    'depth' by just using Vec::splice at the proper position. This is
    important for PR [WIP] Token-based outer attributes handling #76130, which requires us to track information about
    attributes along with tokens.

  • The lazy approach to TokenStream construction allows us to easily
    parse an AST struct, and then decide after the fact whether we need a
    TokenStream. This will be useful when we start collecting tokens for
    Attribute - we can discard the LazyTokenStream if the parsed
    attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be neglibile (see
#77250 (comment)). There is a
small slowdown on a few benchmarks, but it only rises above 1% for incremental
builds, where it represents a larger fraction of the much smaller instruction
count. There a ~1% speedup on a few other incremental benchmarks - my guess is
that the speedups and slowdowns will usually cancel out in practice.

@rust-highfive
Copy link
Collaborator

r? @oli-obk

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 27, 2020
@Aaron1011
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Sep 27, 2020

⌛ Trying commit 653edbfd18abfb56203811db0d974b9d5dc89d30 with merge f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa...

@Aaron1011
Copy link
Member Author

r? @petrochenkov

@bors
Copy link
Contributor

bors commented Sep 27, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa (f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa)

@rust-timer
Copy link
Collaborator

Queued f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa with parent 623fb90, future comparison URL.

compiler/rustc_parse/src/parser/mod.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/mod.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/mod.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/mod.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/mod.rs Outdated Show resolved Hide resolved
@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 27, 2020
@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (f1e10e3e4f98fcb126962244ad3a9dc5b82e9dfa): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

@Aaron1011
Copy link
Member Author

From doing some local profiling, the slowdown appears to be due to the fact that there Parser.clone() on the successful path for parsing various things (not just when we've already encountered an error). This requires us to clone the entire flattened tokenstream - previously, many of these tokens would have been inside a TokenTree::Delimited, allowing us to just clone an Lrc.

@petrochenkov
Copy link
Contributor

Hmm, why didn't this show up in #76130?

@Aaron1011
Copy link
Member Author

@petrochenkov: I added logic to skip token collection for attribute targets in more cases (for example, https://github.com/Aaron1011/rust/blob/feature/new-preexp-cfg-tmp/compiler/rustc_parse/src/parser/attr.rs#L541-L555). It looks like this may have masked the fact that token collection was more expensive when we actually performed it.

@petrochenkov
Copy link
Contributor

this makes it easy to modify the captured stream at an
arbitrary depth, simply by replacing tokens in the flattened buffer.
This is important for PR #76130, which requires us to track information
about attributes along with tokens.

Tokens that need to be replaced in #76130 should all be at the same level, right?
So, the replacement is not fundamentally impossible, just needs some tree walking to get to the right level?

@Aaron1011
Copy link
Member Author

So, the replacement is not fundamentally impossible, just needs some tree walking to get to the right level?

Yes - however, the implementation ended up being significantly more complicated when I tried to do it that way.

The issue is that we can have code like this:

#[derive(MyDerive)]
struct Bar {
    val: [u8; {
		struct Inner {
			#[cfg(FALSE)] field: u8
		}
		0
	}]
}

We want the tokens for Bar to (eventually) end up with a PreexpTokenStream::OuterAttributes for field: u8. However, there can be arbitrary many nested delimited streams between where we start collecting tokens struct Bar and the nested attributes (#[cfg(FALSE)]).

The current implementation of collect_tokens will just clone the top-level { val: ... } delimited stream, and push it to the buffer - without ever recursing into it. In order to get it to the right struture, we would need to either:

  1. Record a list of positions to allow us to traverse the topmost PreexpTokenStream to the proper position.
  2. Keep track of the modifiations we apply at a particular frame (e.g. replacing #[cfg(FALSE)] field: u8 with PreexpTokenTree::OuterAttributes), and then propagate these changes back up the stack.

I think either of these implementations would be much more complicated than the current approach. I'm going to see if it's possible to eliminate the performance impact of this PR before exploring the alternatives.

@Aaron1011
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Sep 30, 2020

⌛ Trying commit 5a048c438839606ab9e3db8d1055aa7b9c78b7d9 with merge f561c24c1fe21075a1a50561923515524aa1d33c...

@bors
Copy link
Contributor

bors commented Sep 30, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: f561c24c1fe21075a1a50561923515524aa1d33c (f561c24c1fe21075a1a50561923515524aa1d33c)

@rust-timer
Copy link
Collaborator

Queued f561c24c1fe21075a1a50561923515524aa1d33c with parent 381b445, future comparison URL.

@petrochenkov petrochenkov added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 19, 2020
Instead of trying to collect tokens at each depth, we 'flatten' the
stream as we go allong, pushing open/close delimiters to our buffer
just like regular tokens. One capturing is complete, we reconstruct a
nested `TokenTree::Delimited` structure, producing a normal
`TokenStream`.

The reconstructed `TokenStream` is not created immediately - instead, it is
produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This
closure stores a clone of the original `TokenCursor`, plus a record of the
number of calls to `next()/next_desugared()`. This is sufficient to reconstruct
the tokenstream seen by the callback without storing any additional state. If
the tokenstream is never used (e.g. when a captured `macro_rules!` argument is
never passed to a proc macro), we never actually create a `TokenStream`.

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the
  start/end of a delimited group.

* It can be easily extended to allow replacing tokens an an arbitrary
  'depth' by just using `Vec::splice` at the proper position. This is
  important for PR rust-lang#76130, which requires us to track information about
  attributes along with tokens.

* The lazy approach to `TokenStream` construction allows us to easily
  parse an AST struct, and then decide after the fact whether we need a
  `TokenStream`. This will be useful when we start collecting tokens for
  `Attribute` - we can discard the `LazyTokenStream` if the parsed
  attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be neglibile (see
rust-lang#77250 (comment)). There is a
small slowdown on a few benchmarks, but it only rises above 1% for incremental
builds, where it represents a larger fraction of the much smaller instruction
count. There a ~1% speedup on a few other incremental benchmarks - my guess is
that the speedups and slowdowns will usually cancel out in practice.
@Aaron1011 Aaron1011 force-pushed the feature/flat-token-collection branch from f33fc5b to 593fdd3 Compare October 19, 2020 18:01
@Aaron1011
Copy link
Member Author

@bors r=petrochenkov

@bors
Copy link
Contributor

bors commented Oct 19, 2020

📌 Commit 593fdd3 has been approved by petrochenkov

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 19, 2020
@bors
Copy link
Contributor

bors commented Oct 21, 2020

⌛ Testing commit 593fdd3 with merge b996a41c46a27fac7c931f52cbf04266cb213d12...

@bors
Copy link
Contributor

bors commented Oct 21, 2020

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Oct 21, 2020
@Aaron1011
Copy link
Member Author

warning: spurious network error (2 tries remaining): error inflating zlib stream; class=Zlib (5)
warning: spurious network error (1 tries remaining): error inflating zlib stream; class=Zlib (5)

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 21, 2020
@bors
Copy link
Contributor

bors commented Oct 21, 2020

⌛ Testing commit 593fdd3 with merge 22e6b9c...

@bors
Copy link
Contributor

bors commented Oct 21, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: petrochenkov
Pushing 22e6b9c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 21, 2020
@bors bors merged commit 22e6b9c into rust-lang:master Oct 21, 2020
@rustbot rustbot added this to the 1.49.0 milestone Oct 21, 2020
@Mark-Simulacrum
Copy link
Member

It seems like this ended up being a regression on a scattering of real world crates. @Aaron1011 -- do you think that's basically inevitable? I don't think we should revert this regardless.

@Aaron1011
Copy link
Member Author

That's odd - an earlier perf run appeared net neutral.

Right now, we're collecting tokens in more cases than is strictly necessary (builtin attributes other than cfg, cfg_attr, and derive don't need tokens). Some additional refactoring is needed before we can skip capturing tokens in these cases - hopefully, this will regain some of the performance lost by this PR.

@petrochenkov
Copy link
Contributor

@Aaron1011
I'm still interested in trying splicing with the old token collection scheme.

Record a list of positions to allow us to traverse the topmost PreexpTokenStream to the proper position.

I don't think we need a list here, only a single number - index of the position in DFS traversal, or something similar to the number of next calls that you are counting here.
Or a pointer to the group (in some sense, perhaps Lrc or something) + offset in the group.

Right now, we're collecting tokens in more cases than is strictly necessary (builtin attributes other than cfg, cfg_attr, and derive don't need tokens).

fn maybe_needs_tokens here is really a layering violation, we shouldn't look at attribute names during parsing, it's a name resolution's job. I wouldn't want to go further in this direction.

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 24, 2020
…, r=petrochenkov

Unconditionally capture tokens for attributes.

This allows us to avoid synthesizing tokens in `prepend_attr`, since we
have the original tokens available.

We still need to synthesize tokens when expanding `cfg_attr`,
but this is an unavoidable consequence of the syntax of `cfg_attr` -
the user does not supply the `#` and `[]` tokens that a `cfg_attr`
expands to.

This is based on PR rust-lang#77250 - this PR exposes a bug in the current `collect_tokens` implementation, which is fixed by the rewrite.
nnethercote added a commit to nnethercote/rust that referenced this pull request Mar 22, 2024
For the `MiddleDot` case, current behaviour:
- For a case like `1.2`, `sym1` is `1` and `sym2` is `2`, and `self.token`
  holds `1.2`.
- It creates a new ident token from `sym1` that it puts into `self.token`.
- Then it does `bump_with` with a new dot token, which moves the `sym1`
  token into `prev_token`.
- Then it does `bump_with` with a new ident token from `sym2`, which moves the
  `dot` token into `prev_token` and discards the `sym1` token.
- Then it does `bump`, which puts whatever is next into `self.token`,
  moves the `sym2` token into `prev_token`, and discards the `dot` token
  altogether.

New behaviour:
- Skips creating and inserting the `sym1` and dot tokens, because they are
  unnecessary.
- This also demonstrates that the comment about `Spacing::Alone` is
  wrong -- that value is never used. That comment was added in rust-lang#77250,
  and AFAICT it has always been incorrect.

The commit also expands comments. I found this code hard to read
previously, the examples in comments make it easier.
nnethercote added a commit to nnethercote/rust that referenced this pull request Mar 25, 2024
For the `MiddleDot` case, current behaviour:
- For a case like `1.2`, `sym1` is `1` and `sym2` is `2`, and `self.token`
  holds `1.2`.
- It creates a new ident token from `sym1` that it puts into `self.token`.
- Then it does `bump_with` with a new dot token, which moves the `sym1`
  token into `prev_token`.
- Then it does `bump_with` with a new ident token from `sym2`, which moves the
  `dot` token into `prev_token` and discards the `sym1` token.
- Then it does `bump`, which puts whatever is next into `self.token`,
  moves the `sym2` token into `prev_token`, and discards the `dot` token
  altogether.

New behaviour:
- Skips creating and inserting the `sym1` and dot tokens, because they are
  unnecessary.
- This also demonstrates that the comment about `Spacing::Alone` is
  wrong -- that value is never used. That comment was added in rust-lang#77250,
  and AFAICT it has always been incorrect.

The commit also expands comments. I found this code hard to read
previously, the examples in comments make it easier.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants