-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustc: Improve compile time of platform intrinsics #32236
Conversation
r? @arielb1 (rust_highfive has picked a reviewer for you, use r? to override) |
cc @huonw |
6e2956b
to
0c8b1ba
Compare
Aggregate(bool, Vec<Type>), | ||
Pointer(&'static Type, Option<&'static Type>, /* const */ bool), | ||
Vector(&'static Type, Option<&'static Type>, u8), | ||
Aggregate(bool, &'static &'static [&'static Type]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a &&[] so that it is just pointer-sized, not slice sized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does seem a little odd, actually. Since it's not going to change the overall size of the type, the alignment means that Pointer
and Vector
are 4 words big anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this was actually because the generated static
values had the type &'static [T]
and that couldn't be used by value (only by reference). I realize now though that all of the statics can have type [T; N]
and then we can use references and coercions to get the right type, so no there's no longer double indirection.
0c8b1ba
to
d92b717
Compare
cc me |
Why not store everything in rodata and just do a binary search (instead of a |
That's it, generate code of the form pub static INTRINSICS: &'static [Intrinsic] = &[
Intrinsic {
name: "movemask_ps",
inputs: &[&::F32x4],
output: &::I32,
definition: Named("llvm.x86.sse.movmsk.ps")
},
Intrinsic {
name: "storeu_ps",
inputs: &[&Type::Pointer(&::F32, Some(&::I8), false), &::F32x4],
output: &::VOID,
definition: Named("llvm.x86.sse.storeu.ps")
},
]; |
@arielb1 yeah that would probably work, I don't think it would improve compile times that much, more, however. |
I think it would be cleaner |
@alexcrichton last time (quite a while ago) I was dealing with matches of |
Yeah that would be nice to implement. OTOH, I think that being data-driven is the right thing to do here. |
We could also use something phf-like for static tables. I’m pretty sure it wouldn’t be too hard to do something smart about encoding tries into rodata either. Regardless, this PR stands very well on its own and we could see if there’s anything more to gain here independently. |
I'll go with what we have. @bors r+ |
📌 Commit d92b717 has been approved by |
Yeah I was actually thinking this'd be a neat case for phf, which gave me more inspiration to keep pushing on rustbuild :) |
⌛ Testing commit d92b717 with merge 4f8ab7d... |
rustc: Improve compile time of platform intrinsics This commit improves the compile time of `rustc_platform_intrinsics` from 23s to 3.6s if compiling with `-O` and from 77s to 17s if compiling with `-O -g`. The compiled rlib size also drops from 3.1M to 1.2M. The wins here were gained by removing the destructors associated with `Type` by removing the internal `Box` and `Vec` indirections. These destructors meant that a lot of landing pads and extra code were generated to manage the runtime representations. Instead everything can basically be statically computed and shoved into rodata, so all we need is a giant string compare to lookup what's what. Closes #28273
💔 Test failed - auto-linux-64-opt |
This commit improves the compile time of `rustc_platform_intrinsics` from 23s to 3.6s if compiling with `-O` and from 77s to 17s if compiling with `-O -g`. The compiled rlib size also drops from 3.1M to 1.2M. The wins here were gained by removing the destructors associated with `Type` by removing the internal `Box` and `Vec` indirections. These destructors meant that a lot of landing pads and extra code were generated to manage the runtime representations. Instead everything can basically be statically computed and shoved into rodata, so all we need is a giant string compare to lookup what's what. Closes rust-lang#28273
d92b717
to
87ede2d
Compare
rustc: Improve compile time of platform intrinsics This commit improves the compile time of `rustc_platform_intrinsics` from 23s to 3.6s if compiling with `-O` and from 77s to 17s if compiling with `-O -g`. The compiled rlib size also drops from 3.1M to 1.2M. The wins here were gained by removing the destructors associated with `Type` by removing the internal `Box` and `Vec` indirections. These destructors meant that a lot of landing pads and extra code were generated to manage the runtime representations. Instead everything can basically be statically computed and shoved into rodata, so all we need is a giant string compare to lookup what's what. Closes #28273
This commit improves the compile time of
rustc_platform_intrinsics
from 23s to3.6s if compiling with
-O
and from 77s to 17s if compiling with-O -g
. Thecompiled rlib size also drops from 3.1M to 1.2M.
The wins here were gained by removing the destructors associated with
Type
byremoving the internal
Box
andVec
indirections. These destructors meant thata lot of landing pads and extra code were generated to manage the runtime
representations. Instead everything can basically be statically computed and
shoved into rodata, so all we need is a giant string compare to lookup what's
what.
Closes #28273