Compress most of spans to 32 bits #44646
src/librustc/util/common.rs
Outdated
@@ -61,7 +60,7 @@ pub enum ProfileQueriesMsg {
     /// end a task
     TaskEnd,
     /// begin a new query
-    QueryBegin(Span, QueryMsg),
+    QueryBegin(QueryMsg),
`Span` is not `Send`/`Sync` if it uses a thread-local interner, and profile queries are sent to other threads, so I had to temporarily remove spans from them.
I'll restore this later; queries will have to use `SpanData` instead of `Span` (I hoped to keep it private, but it looks like there's no better choice :( ).
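To illustrate the point above, here is a minimal sketch of why an interner-backed span cannot be handed to another thread as-is, while decoded data can. All names and the layout are assumptions made for illustration (`Span(u32)`, `INTERNER`, `send_decoded`), not the PR's actual code. Note that in this toy version the compiler would still treat `Span` as `Send`, since it is only a `u32`; the index would simply be meaningless in another thread's table.

```rust
use std::cell::RefCell;
use std::sync::mpsc;
use std::thread;

/// Fully decoded span data: plain integers, so it is `Send`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanData {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: u32,
}

/// Hypothetical compact span: an index into a per-thread table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span(u32);

thread_local! {
    // Each thread gets its own table, so an index is only valid
    // on the thread that created it.
    static INTERNER: RefCell<Vec<SpanData>> = RefCell::new(Vec::new());
}

impl Span {
    /// Store the data in this thread's table and return its index.
    pub fn new(data: SpanData) -> Span {
        INTERNER.with(|i| {
            let mut table = i.borrow_mut();
            table.push(data);
            Span((table.len() - 1) as u32)
        })
    }

    /// Look the data up in this thread's table.
    pub fn data(self) -> SpanData {
        INTERNER.with(|i| i.borrow()[self.0 as usize])
    }
}

/// Decode on the sending thread, then ship plain data to the receiver.
pub fn send_decoded(span: Span) -> SpanData {
    let (tx, rx) = mpsc::channel();
    tx.send(span.data()).unwrap();
    thread::spawn(move || rx.recv().unwrap()).join().unwrap()
}
```

Calling `span.data()` inside the spawned thread instead would index into that thread's empty table and panic, which is the hazard the comment describes.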
src/libsyntax_pos/span_encoding.rs
Outdated
    }
    _ => unreachable!()
};
SpanData { lo: BytePos(base), hi: BytePos(base + len), ctxt: SyntaxContext(ctxt) }
If encoding/decoding with a 2-bit tag looks too error-prone/unmaintainable (or turns out to be too slow), I can redo this with a 1-bit tag instead. That would reduce the percentage of inlined spans from 82.68% to 80.01% (on rustc/libstd data).
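For readers unfamiliar with the scheme, the 1-bit-tag variant could look roughly like the sketch below. The bit widths (24-bit base, 7-bit length, no inline context) and the `Interner` type are assumptions chosen for illustration; the thread does not state the exact layout.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanData {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: u32,
}

// Assumed layout: bit 0 is the tag.
// Tag 0 (inline):   [31..8] base, [7..1] len, ctxt implied to be 0.
// Tag 1 (interned): [31..1] index into the interner table.
const TAG_INLINE: u32 = 0;
const TAG_INTERNED: u32 = 1;
const BASE_BITS: u32 = 24;
const LEN_BITS: u32 = 7;

/// Hypothetical side table for spans that do not fit inline.
#[derive(Default)]
pub struct Interner {
    spans: Vec<SpanData>,
}

impl Interner {
    fn intern(&mut self, data: SpanData) -> u32 {
        // Linear search keeps the sketch short; a real interner would hash.
        if let Some(i) = self.spans.iter().position(|s| *s == data) {
            return i as u32;
        }
        self.spans.push(data);
        (self.spans.len() - 1) as u32
    }
}

pub fn encode(data: SpanData, interner: &mut Interner) -> u32 {
    debug_assert!(data.lo <= data.hi);
    let len = data.hi - data.lo;
    if data.ctxt == 0 && data.lo < (1 << BASE_BITS) && len < (1 << LEN_BITS) {
        // Common case: everything fits in 32 bits.
        (data.lo << (LEN_BITS + 1)) | (len << 1) | TAG_INLINE
    } else {
        // Rare case: store the data out of line and keep only an index.
        (interner.intern(data) << 1) | TAG_INTERNED
    }
}

pub fn decode(span: u32, interner: &Interner) -> SpanData {
    if span & 1 == TAG_INLINE {
        let len = (span >> 1) & ((1 << LEN_BITS) - 1);
        let base = span >> (LEN_BITS + 1);
        SpanData { lo: base, hi: base + len, ctxt: 0 }
    } else {
        interner.spans[(span >> 1) as usize]
    }
}
```

A 2-bit tag would allow more inline formats (for example, trading base bits for context bits), at the cost of more branches in `decode`; that is the trade-off the comment weighs.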
src/libsyntax_pos/lib.rs
Outdated
}

#[allow(deprecated)]
pub const DUMMY_SP: Span = Span { lo: BytePos(0), hi: BytePos(0), ctxt: NO_EXPANSION };
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
@michaelwoerister
I'm not sure whether `Span` can derive `Hash` (for incremental, etc.).
Is hashing the 32-bit "index" enough for interned spans, or must the actual data be hashed?
Incremental uses `HashStable` anyway.
i.e. this impl:
Line 234 in ef227f5
impl<'a, 'gcx, 'tcx> HashStable<StableHashingContext<'a, 'gcx, 'tcx>> for Span { |
Yes, as @arielb1 says. We keep hashing for incr. comp. separate because it is often a lot more expensive than what we need for a regular hash table, and it often requires additional contextual information. So you don't have to worry about it.
You might want to provide a specialized implementation that hashes just one 32-bit value instead of four bytes, though (unless we do that anyway). But it's probably not worth the trouble.
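A specialized implementation along those lines might look like the following sketch. It assumes the span is a newtype over `u32` (the `Span(u32)` shape and the `hash_of` helper are illustrative, not taken from the PR) and feeds the hasher a single `write_u32` call.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical 32-bit span representation.
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct Span(u32);

impl Hash for Span {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // One 32-bit write instead of feeding the bytes individually.
        state.write_u32(self.0);
    }
}

/// Helper for trying the impl out with the standard library's hasher.
pub fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

This only covers ordinary hash tables; as noted above, incremental compilation goes through `HashStable` and is unaffected.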
Why are you using a |
Yes.
Are reads from |
We can run this through perf.rlo's benchmarks when ready; just run a try build and ping me on completion.
That's right - otherwise packed structs would be quite useless. And #27060 (which I'm working on fixing right now) only means that some code that should be |
☔ The latest upstream changes (presumably #44654) made this pull request unmergeable. Please resolve the merge conflicts.
This looks awesome, @petrochenkov!
a45b07a to f069c88
Updated.
Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister
☀️ Test successful - status-travis
@arielb1, I tried typing the merge commit hashes into perf.rlo compare but that didn't seem to work for me. Did you have a different method for performance comparison in mind?
ping @Mark-Simulacrum
I notice that the interner uses a default |
I looked through the performance data a bit. What we try to optimize is memory, i.e. max-rss. On the other side there are more regressions in speed. I don't know what exactly the tests measure (e.g. how |
Many of the tests showing regressions are ones with incremental compilation activated. Incremental compilation has to expand all spans to Great find, @llogiq! This should definitely use |
@bors try
🔒 Merge conflict
e723bf6 to cb1158f
@bors try
Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister
@michaelwoerister
"32-bit span, 1-bit tag" still seems to be better
Alright, thanks for checking! 32 bits with 1 bit tag seems to be a good choice indeed. Maybe If you make the interner use an |
src/libsyntax_pos/span_encoding.rs
Outdated
// option. This file may not be copied, modified, or distributed
// except according to those terms.

// Spans are encoded using 2-bit tag and 4 different encoding formats for each tag.
this comment has rotted.
74f4271 to 52251cd
@michaelwoerister |
@bors r+ Thanks, @petrochenkov! Looking forward to seeing the results
📌 Commit 52251cd has been approved by |
Compress most of spans to 32 bits As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28 Closes #15594 r? @michaelwoerister
☀️ Test successful - status-appveyor, status-travis
Looks like this may have improved the memory of the tuple-stress benchmark by 5%!
Optimize some span operations Do not decode span data twice/thrice/etc unnecessarily. Applied to stable hashing and all methods in `impl Span`. Follow up to rust-lang#44646 r? @michaelwoerister
Due to the limitation that #[derive(...)] on #[repr(packed)] structs does not guarantee proper alignment of the compiler-generated impls (rust-lang#39696), the change in rust-lang#44646 to compress Spans results in the compiler generating code with unaligned access. Until rust-lang#39696 has been fixed, the issue can be worked around by not using the packed attribute on the Span struct on sparc64 and sparcv9. Fixes: rust-lang#45509
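The workaround described in that commit message can be sketched with `cfg_attr`, as below. This is an illustration under assumptions, not the actual patch: the `Span(u32)` shape, the `WithFlag` struct, and the use of `target_arch = "sparc64"` alone to cover the affected targets are all hypothetical.

```rust
use std::mem;

/// Hypothetical 32-bit span. It is packed (alignment 1) everywhere
/// except on sparc64, where it keeps the natural alignment of `u32`
/// so that generated code never performs an unaligned access.
#[derive(Clone, Copy)]
#[cfg_attr(not(target_arch = "sparc64"), repr(packed))]
pub struct Span(u32);

impl Span {
    /// Copy the field out by value; taking a reference to a packed
    /// field is what risks unaligned access.
    pub fn raw(self) -> u32 {
        self.0
    }
}

/// Example of what packing buys: with alignment 1 this struct needs
/// no padding after `flag`.
pub struct WithFlag {
    pub flag: u8,
    pub span: Span,
}

pub fn span_align() -> usize {
    mem::align_of::<Span>()
}
```

The cost on the excluded targets is only the lost packing: `Span` stays 4 bytes either way, but structs that embed it may gain padding.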
As described in https://internals.rust-lang.org/t/rfc-compiler-refactoring-spans/1357/28
Closes #15594
r? @michaelwoerister