Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle ASCII identifiers quickly. #94

Merged
merged 1 commit into from
May 29, 2018
Merged

Conversation

nnethercote
Copy link
Contributor

Because they're very common. rustc's lexer has the same optimization.

This speeds up various runs of cargo in rustc-perf, the best by 4%.

I admit to not understanding how proc-macro2 is used when rustc builds cargo. I don't need to recompile rustc for the changes to take effect, speedwise, so somehow proc-macro2 must be both compiled and run while cargo is being compiled.

Because they're very common. rustc's lexer has the same optimization.

This speeds up various runs of `cargo` in rustc-perf, the best by 4%.
@alexcrichton
Copy link
Contributor

Nice!

FWIW @nnethercote there are a lot of inefficiencies in proc-macro2, it's not currently engineered at all to be speedy. That being said it also means there's a lot of opportunities for improvement :)

You're likely seeing this related to custom derive (aka #[derive(Serialize)] for serde). Way under the hood of that proc-macro2 is being used. Eventually proc-macro2 will be implemented with the optimized routines in the compiler so it will be less important to optimize the src/stable.rs file (as it won't be used much of the time).

In general though we were under the assumption that custom derive was such a small fraction of the compilation process that we wouldn't need to optimize it too much here. If that's not the case though that could be problematic...

In any case there's probably much lower hanging fruit in the quote crate we may want to explore before merging this (if you'd like to push on this route of optimization). The quote crate is the primary way tokens are created in procedural macros, but is currently pretty inefficient. The quote! macro currently bottoms out in this large macro which primarily ends in this clause.

That clause at the end of the quote! maro means that if you do quote! { use foo::bar::baz; } it'll stringify and parse all of use, foo, ::, bar, ::, baz, and ;. That's pretty slow! The compiler, additionally, already knows the types of tokens there. I think an easy route of optimization may be to add a clause to that with $first:ident to catch all identifiers. I think that would avoid any usage of the parsing code you've optimized here as I'd guess that most tokens are parsed through that API.


In any case that's a lot of information and there's definitely more to this rabbit hole! I'm fine with whichever strategy you'd like. Would you prefer to land this now and look into those optimizations later?

@dtolnay
Copy link
Owner

dtolnay commented May 28, 2018

I agree, it seems likely that it is all from that macro rule. A Deserialize impl expands to quite a lot of ident tokens.

@nnethercote
Copy link
Contributor Author

Thank you for the detailed explanation.

You are right about it being related to #[derive(Serialize)]; there were serde_derive entries in the stack traces.

In terms of the broader optimization picture: cargo was recently added to the rustc-perf benchmarks. I have looked at a lot of rustc profiles lately and cargo is the only workload for which proc-macros2 has been hot. This change minimizes most of that hotness, and I haven't seen anything from the quote crate showing up in profiles, so I think this is good to land as-is. Though perhaps we should also bump the version number so that the new version can be pulled into rustc?

@nnethercote
Copy link
Contributor Author

I spoke too soon. I see that the strnom.rs file in proc-macro2 is also hot, presumably doing parsing related to custom derive. strnom.rs is a custom version of nom.

Similarly, when compiling style-servo from rustc-perf the synom crate is hot. It is a different customized version of nom, though with similar characteristics to strnom.rs. It's used by the syn crate, which is used by the cssparser-macros crate.

I feel like I've just grabbed ahold of a loose thread hanging out of a tangled ball of wool...

@nnethercote
Copy link
Contributor Author

@alexcrichton: on IRC you gave me a patch for quote that "in theory should eliminate stringifying and parsing any identifiers anywhere". It works well, but appears to not eliminate all identifier parsing.

Measurements, on a "clean incremental, check" build of cargo:

  • Original: 14.5B instructions
  • With the XID patch (this PR): 14.0B
  • With the quote patch: 13.3B
  • With both patches: 13.0B

So this PR still seems worth landing.

@dtolnay
Copy link
Owner

dtolnay commented May 29, 2018

The remaining invocations after the quote patch are probably from Syn parsing the input on the way into the custom derive macro.

@dtolnay dtolnay merged commit 5fc2d84 into dtolnay:master May 29, 2018
@nnethercote nnethercote deleted the avoid-xid branch May 29, 2018 04:11
@nnethercote
Copy link
Contributor Author

@alexcrichton: I could file a PR for your quote change, but it might be easier if you do it given that you are the patch author. Let me know what you'd prefer.

@dtolnay
Copy link
Owner

dtolnay commented May 29, 2018

This patch as written unfortunately requires Rust 1.20+ and quote currently supports 1.15. Maybe there is a different way to do it? I am not planning to bump rustc version requirements until after 2018 edition ships.

@nnethercote
Copy link
Contributor Author

Maybe there is a different way to do it?

I can see how to optimize the identifier parsing, but the benefit would be smaller than avoiding (most of) the identifier parsing altogether.

@nnethercote
Copy link
Contributor Author

This patch as written unfortunately requires Rust 1.20+ and quote currently supports 1.15. Maybe there is a different way to do it? I am not planning to bump rustc version requirements until after 2018 edition ships.

@dtolnay: is the patch still relevant? Rust 2018 has shipped now.

@dtolnay
Copy link
Owner

dtolnay commented Jan 9, 2019

Opened a PR: dtolnay/quote#91.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants