proc_macro/bridge: stop using a remote object handle for proc_macro Ident and Literal #98189
(rust-highfive has picked a reviewer for you, use r? to override) |
r? @eddyb |
3f1d648
to
171bb3d
Compare
171bb3d
to
84de2a8
Compare
☔ The latest upstream changes (presumably #98186) made this pull request unmergeable. Please resolve the merge conflicts. |
a4d0244
to
266d51b
Compare
proc_macro/bridge: cache static spans in proc_macro's client thread-local state

This is the second part of rust-lang#86822, split off as requested in rust-lang#86822 (review). This patch removes the RPC calls required for the very common operations of `Span::call_site()`, `Span::def_site()` and `Span::mixed_site()`.

Some notes:

This part is one of the ones I don't love as a final solution from a design standpoint, because I don't like how the spans are serialized immediately at macro invocation. I think a more elegant solution might've been to reserve special IDs for `call_site`, `def_site`, and `mixed_site` at compile time (either starting at 1 or from `u32::MAX`) and making reading a Span handle automatically map these IDs to the relevant values, rather than doing extra serialization.

This would also have an advantage for potential future work to allow `proc_macro` to operate more independently from the compiler (e.g. to reduce the necessity of `proc-macro2`), as methods like `Span::call_site()` could be made to function without access to the compiler backend.

That was unfortunately tricky to do at the time, as this was the first part I wrote of the patches. After the later part (rust-lang#98188, rust-lang#98189), the other uses of `InternedStore` are removed, meaning that a custom serialization strategy for `Span` is easier to implement.

If we want to go that path, we'll still need the majority of the work to split the bridge object and introduce the `Context` trait for free methods, and it will be easier to do after `Span` is the only user of `InternedStore` (after rust-lang#98189).
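A minimal sketch of the caching idea in that commit message, assuming the server hands the three static spans to the client once per macro invocation. The names `ExpnGlobals`, `run_macro`, and the integer-backed `Span` are illustrative stand-ins here, not the bridge's actual definitions.

```rust
use std::cell::Cell;

/// Illustrative stand-in for a span handle (a plain integer ID in this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span(pub u32);

/// The three "static" spans, assumed to be sent once when the macro is invoked.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ExpnGlobals {
    pub call_site: Span,
    pub def_site: Span,
    pub mixed_site: Span,
}

thread_local! {
    /// Client-side cache; `None` while no macro is running on this thread.
    static GLOBALS: Cell<Option<ExpnGlobals>> = Cell::new(None);
}

/// Runs `f` with the spans cached, then clears the cache again.
/// (A real implementation would also want to clear the cache on panic.)
pub fn run_macro<R>(globals: ExpnGlobals, f: impl FnOnce() -> R) -> R {
    GLOBALS.with(|g| g.set(Some(globals)));
    let result = f();
    GLOBALS.with(|g| g.set(None));
    result
}

fn cached() -> ExpnGlobals {
    GLOBALS
        .with(|g| g.get())
        .expect("static spans requested outside of a macro invocation")
}

impl Span {
    // Each accessor is a thread-local read; no round trip to the server.
    pub fn call_site() -> Span {
        cached().call_site
    }
    pub fn def_site() -> Span {
        cached().def_site
    }
    pub fn mixed_site() -> Span {
        cached().mixed_site
    }
}
```

The trade-off noted in the commit message is visible here: the spans are transferred eagerly at invocation time even if the macro never asks for them.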
266d51b
to
d49d742
Compare
☔ The latest upstream changes (presumably #98188) made this pull request unmergeable. Please resolve the merge conflicts. |
d49d742
to
eed3106
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit eed310647a8cb28a4ddabf5d43033cb1618f61a0 with merge 63a4e3d54a9ba6a95c75507b75a2b28fbfdbfa5c... |
☀️ Try build successful - checks-actions |
LGTM, just a couple nits (sorry for the delay!)
Unfortunately, as it is difficult to depend on crates from within `proc_macro`, this is done by vendoring a copy of the hasher as a module rather than depending on the `rustc_hash` crate. This probably doesn't have a substantial impact up front, but it will become more relevant once symbols are interned within the `proc_macro` client.
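For readers unfamiliar with Fx hashing, here is a simplified sketch of what a small vendored hasher module can look like. It follows the rotate, xor, multiply scheme used by the 1.x releases of `rustc_hash` on 64-bit targets, but it is not the vendored module from this PR: the byte-chunking in `write` and the `hash_bytes` helper are assumptions made for brevity.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Multiplication constant used by 64-bit Fx hashing in rustc_hash 1.x.
const SEED: u64 = 0x51_7c_c1_b7_27_22_0a_95;

#[derive(Default, Clone, Copy)]
pub struct FxHasher {
    hash: u64,
}

impl FxHasher {
    #[inline]
    fn add_to_hash(&mut self, word: u64) {
        self.hash = (self.hash.rotate_left(5) ^ word).wrapping_mul(SEED);
    }
}

impl Hasher for FxHasher {
    fn write(&mut self, bytes: &[u8]) {
        // Simplification: consume full 8-byte words, then the tail byte by byte.
        let mut chunks = bytes.chunks_exact(8);
        for chunk in &mut chunks {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(chunk);
            self.add_to_hash(u64::from_le_bytes(buf));
        }
        for &b in chunks.remainder() {
            self.add_to_hash(u64::from(b));
        }
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// A `HashMap` that uses the hasher above instead of the default SipHash.
pub type FxHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FxHasher>>;

/// Hypothetical helper: hash a byte string in one call.
pub fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut hasher = FxHasher::default();
    hasher.write(bytes);
    hasher.finish()
}
```

Fx hashing is fast but not resistant to collision attacks, which is acceptable for an interner keyed by identifiers the macro itself produces.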
This was removed in a previous part; however, it should be specialized for `to_string` performance and consistency.
Doing this for all Unicode identifiers would require a dependency on `unicode-normalization` and `rustc_lexer`, which is currently not possible for `proc_macro` due to it being built concurrently with `std` and `core`. Instead, ASCII identifiers are validated locally, and an RPC message is used to validate Unicode identifiers when needed.

String values are interned on both the server and client when deserializing, to avoid unnecessary copies and keep `Ident` cheap to copy and move. This appears to be important for performance. The client-side interner is based roughly on the one from `rustc_span`, and uses an arena inspired by `rustc_arena`.

RPC messages passing symbols always include the full value. This could potentially be optimized in the future if it is revealed to be a performance bottleneck.

Despite now having a relevant implementation of `Display` for `Ident`, `ToString` is still specialized, as it is a hot path for this object.

The symbol infrastructure will also be used for literals in the next part.
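The local-versus-RPC split for identifier validation can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `IdentCheck`, `is_ascii_ident` and `classify_ident` are hypothetical names, and details such as raw identifiers and keyword checks are left out.

```rust
/// Outcome of the cheap client-side check (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum IdentCheck {
    /// Plain ASCII identifier: accepted without contacting the server.
    ValidLocally,
    /// Contains non-ASCII characters: the server must validate and normalize it.
    NeedsRpc,
    /// Pure ASCII, but not shaped like an identifier.
    Invalid,
}

/// True if `s` matches `[A-Za-z_][A-Za-z0-9_]*`.
pub fn is_ascii_ident(s: &str) -> bool {
    let mut bytes = s.bytes();
    match bytes.next() {
        Some(b) if b == b'_' || b.is_ascii_alphabetic() => {}
        _ => return false,
    }
    bytes.all(|b| b == b'_' || b.is_ascii_alphanumeric())
}

/// Decides whether an identifier can be accepted on the client alone.
pub fn classify_ident(s: &str) -> IdentCheck {
    if is_ascii_ident(s) {
        IdentCheck::ValidLocally
    } else if s.is_ascii() {
        // No Unicode involved, so the server could not make this valid either.
        IdentCheck::Invalid
    } else {
        IdentCheck::NeedsRpc
    }
}
```

Since most identifiers in real macros are ASCII, the fast path covers the common case and the RPC is only paid for the rare Unicode identifier.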
This builds on the symbol infrastructure built for `Ident` to replicate the `LitKind` and `Lit` structures in rustc within the `proc_macro` client, allowing literals to be fully created and interacted with from the client thread. Only parsing and subspan operations still require sync RPC.
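As a rough illustration of what holding literals on the client side involves, the sketch below stores a kind, the literal's contents and an optional suffix, and rebuilds the source text locally. It is an assumption-laden simplification: the variant list is only modelled on the description above, plain `String`s stand in for interned symbols, spans are omitted, and `escape_default` stands in for whatever escaping the real `Literal::string` performs.

```rust
use std::fmt;

/// Simplified literal kinds; raw variants carry the number of `#` characters.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LitKind {
    Byte,
    Char,
    Integer,
    Float,
    Str,
    StrRaw(u8),
    ByteStr,
    ByteStrRaw(u8),
}

/// Client-side literal: everything needed to print it without an RPC.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Literal {
    pub kind: LitKind,
    pub symbol: String,
    pub suffix: Option<String>,
}

impl Literal {
    pub fn u32_suffixed(n: u32) -> Literal {
        Literal { kind: LitKind::Integer, symbol: n.to_string(), suffix: Some("u32".to_owned()) }
    }

    pub fn string(s: &str) -> Literal {
        // Assumption: `escape_default` is close enough for illustration.
        Literal { kind: LitKind::Str, symbol: s.escape_default().to_string(), suffix: None }
    }
}

impl fmt::Display for Literal {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let sym = &self.symbol;
        match self.kind {
            LitKind::Byte => write!(f, "b'{}'", sym)?,
            LitKind::Char => write!(f, "'{}'", sym)?,
            LitKind::Str => write!(f, "\"{}\"", sym)?,
            LitKind::ByteStr => write!(f, "b\"{}\"", sym)?,
            LitKind::StrRaw(n) => {
                let hashes = "#".repeat(n as usize);
                write!(f, "r{0}\"{1}\"{0}", hashes, sym)?
            }
            LitKind::ByteStrRaw(n) => {
                let hashes = "#".repeat(n as usize);
                write!(f, "br{0}\"{1}\"{0}", hashes, sym)?
            }
            LitKind::Integer | LitKind::Float => write!(f, "{}", sym)?,
        }
        if let Some(suffix) = &self.suffix {
            write!(f, "{}", suffix)?;
        }
        Ok(())
    }
}
```

With this shape, constructing and printing a literal are purely local operations, which matches the note that only parsing and subspan operations still need a synchronous RPC.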
This method is still only used for Literal::subspan, however the implementation only depends on the Span component, so it is simpler and more efficient for now to pass down only the information that is needed. In the future, if more information about the Literal is required in the implementation (e.g. to validate that spans line up as expected with source text), that extra information can be added back with extra arguments.
7eaf669
to
c4acac6
Compare
Some changes occurred in library/proc_macro/src/bridge cc @rust-lang/wg-rls-2 |
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (c3f3550): comparison url.
(Result tables for instruction count, max RSS (memory usage) and cycles were not preserved.)
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression |
Odd, I wonder what changed since #98189 (comment) to make opt builds of Given that we basically will always build macros as debug builds (IIRC) I think that this regression likely isn't super relevant, but perhaps the |
It seems like there aren't any actionable items to address here, and the improvements outweigh the regressions. I'll mark this as triaged. @rustbot label: +perf-regression-triaged |
proc_macro: use crossbeam channels for the proc_macro cross-thread bridge

This is done by having the crossbeam dependency inserted into the `proc_macro` server code from the server side, to avoid adding a dependency to `proc_macro`.

In addition, this introduces a -Z command-line option which will switch rustc to run proc-macros using this cross-thread executor. With the changes to the bridge in rust-lang#98186, rust-lang#98187, rust-lang#98188 and rust-lang#98189, the performance of the executor should be much closer to same-thread execution.

In local testing, the crossbeam executor was substantially more performant than either of the two existing `CrossThread` strategies, so they have been removed to keep things simple.

r? `@eddyb`
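The general shape of a channel-based cross-thread executor can be sketched as below. To stay dependency-free this sketch uses `std::sync::mpsc` in place of crossbeam channels, and `String` payloads in place of the bridge's serialized buffers; `Message` and `run_cross_thread` are hypothetical names, so treat it as an outline of the message flow rather than the implementation from that PR.

```rust
use std::sync::mpsc;
use std::thread;

/// Messages sent from the macro (client) thread to the server thread.
pub enum Message {
    /// An RPC request; the client blocks until the reply arrives.
    Request(String),
    /// The macro finished and produced its output.
    Done(String),
}

/// Runs `client` on its own thread while `dispatch` answers its requests here.
pub fn run_cross_thread<C, D>(input: String, client: C, mut dispatch: D) -> String
where
    C: FnOnce(String, &mut dyn FnMut(String) -> String) -> String + Send + 'static,
    D: FnMut(String) -> String,
{
    let (to_server, from_client) = mpsc::channel::<Message>();
    let (to_client, from_server) = mpsc::channel::<String>();

    let handle = thread::spawn(move || {
        // Each call sends one request and waits for exactly one reply.
        let mut call = |req: String| -> String {
            to_server.send(Message::Request(req)).unwrap();
            from_server.recv().unwrap()
        };
        let output = client(input, &mut call);
        to_server.send(Message::Done(output)).unwrap();
    });

    // Server loop: answer requests until the client reports completion.
    let mut result = None;
    for msg in from_client.iter() {
        match msg {
            Message::Request(req) => to_client.send(dispatch(req)).unwrap(),
            Message::Done(output) => {
                result = Some(output);
                break;
            }
        }
    }
    handle.join().expect("client thread panicked");
    result.expect("client exited without producing a result")
}
```

Every request costs a pair of channel operations and a thread wake-up, which is why reducing the number of RPCs in the earlier parts matters for the performance of this execution strategy.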
This is the fourth part of #86822, split off as requested in #86822 (review). This patch transforms the `Ident` and `Group` types into structs serialized over IPC rather than handles.

Symbol values are interned on both the client and server when deserializing, to avoid unnecessary string copies and keep the size of `TokenTree` down. To do the interning efficiently on the client, the proc-macro crate is given a vendored version of the fxhash hasher, as `SipHash` appeared to cause performance issues. This was done rather than depending on `rustc_hash` as it is unfortunately difficult to depend on crates from within `proc_macro` due to it being built at the same time as `std`.

In addition, a custom arena allocator and symbol store were added, inspired by those in `rustc_arena` and `rustc_span`. To prevent symbol re-use across multiple invocations of a macro on the same thread, a new range of `Symbol` names is used for each invocation of the macro, and symbols from previous invocations are cleaned up.

In order to keep `Ident` creation efficient, a special ASCII-only case was added to perform ident validation without using RPC for simple identifiers. Full identifier validation couldn't be easily added, as it would require depending on the `rustc_lexer` and `unicode-normalization` crates from within `proc_macro`. Unicode identifiers are validated and normalized using RPC.

See the individual commit messages for more details on trade-offs and design decisions behind these patches.
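The per-invocation symbol ranges described above can be sketched as follows. This is a simplified model under assumptions: owned `String`s replace the arena, a standard `HashMap` replaces the Fx-hashed map, and the `Interner` API shown is hypothetical.

```rust
use std::collections::HashMap;

/// A small copyable handle standing in for an interned string.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Symbol(pub u32);

pub struct Interner {
    /// First symbol ID of the current macro invocation.
    sym_base: u32,
    names: HashMap<String, Symbol>,
    strings: Vec<String>,
}

impl Interner {
    pub fn new() -> Self {
        Interner { sym_base: 1, names: HashMap::new(), strings: Vec::new() }
    }

    /// Returns the existing symbol for `s`, or assigns the next ID in the range.
    pub fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.names.get(s) {
            return sym;
        }
        let sym = Symbol(self.sym_base + self.strings.len() as u32);
        self.strings.push(s.to_owned());
        self.names.insert(s.to_owned(), sym);
        sym
    }

    /// Looks a symbol up; symbols from earlier invocations yield `None`.
    pub fn get(&self, sym: Symbol) -> Option<&str> {
        let idx = sym.0.checked_sub(self.sym_base)? as usize;
        self.strings.get(idx).map(String::as_str)
    }

    /// Called between invocations: frees the strings and moves to a fresh
    /// ID range so that stale symbols can never alias new ones.
    pub fn clear(&mut self) {
        self.sym_base = self.sym_base.saturating_add(self.strings.len() as u32);
        self.names.clear();
        self.strings.clear();
    }
}
```

Advancing the base instead of restarting at the same ID means a symbol leaked from a previous invocation is detected on lookup rather than silently resolving to an unrelated string.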