Avoid layout calculations in assert_bits to speed up match checking #57546
assert_bits ensures that the given type matches the type of the constant value, and additionally performs a query for the layout of the given type to get its size. This size is then used to assert that it matches the size of the constant. But since the types are already known to be the same, this second check is unnecessary, and skipping it also avoids the expensive layout query. For the unicode_normalization crate, the match checking time drops from about 3.8s to about 0.8s for me.
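To make the cost difference concrete, here is a minimal sketch of the two variants. It is a toy model, not the rustc code: `Ty`, `Const`, `LayoutCx`, and both function names are hypothetical stand-ins, and the layout "query" is reduced to a counter so the number of lookups can be observed.

```rust
use std::cell::Cell;

/// Toy stand-in for a compiler type (hypothetical, not rustc's `Ty`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    U8,
    U32,
    Char,
}

/// Toy constant: its type, its raw bits, and the byte size recorded at creation.
#[derive(Clone, Copy, Debug)]
struct Const {
    ty: Ty,
    bits: u128,
    size: u8,
}

/// Stand-in for the layout query; it only counts how often it is invoked.
struct LayoutCx {
    queries: Cell<u64>,
}

impl LayoutCx {
    fn new() -> Self {
        LayoutCx { queries: Cell::new(0) }
    }

    fn size_of(&self, ty: Ty) -> u8 {
        self.queries.set(self.queries.get() + 1);
        match ty {
            Ty::U8 => 1,
            Ty::U32 | Ty::Char => 4,
        }
    }
}

/// Before: type check plus one layout lookup on every read.
fn assert_bits_with_layout(cx: &LayoutCx, c: &Const, ty: Ty) -> Option<u128> {
    assert_eq!(c.ty, ty);
    let size = cx.size_of(ty);
    if c.size == size { Some(c.bits) } else { None }
}

/// After: rely on the type check alone and skip the layout lookup.
fn assert_bits_type_only(c: &Const, ty: Ty) -> Option<u128> {
    assert_eq!(c.ty, ty);
    Some(c.bits)
}
```

In this model the second variant never touches `LayoutCx`, which is the saving the description refers to; whether dropping the size check is acceptable is a separate question.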
@bors try
(rust_highfive has picked a reviewer for you, use r? to override)
☀️ Test successful - checks-travis
@rust-timer build f81ba2b
Success: Queued f81ba2b with parent d6525ef, comparison URL.
Finished benchmarking try commit f81ba2b
I did not expect much of a change for crates that aren't match-heavy, but the change is simple, and for the really match-heavy unicode_normalization crate, it cuts away ~50% of the total compilation time. Is there already something that uses match heavily on perf.rlo? If not, could we maybe add the unicode_normalization crate?
I think we should add that crate to perf.rlo, otherwise I'm certain it will regress again. I'm amazed that such a change has this much effect, as I would have assumed this to be a minor point in the matching code. For further perf boons it's fine to additionally change the leftover type assertions to debug assertions.
It's "just" two cache lookups each time AFAICT, but for that crate, it means >50M extra queries. It adds up ;-)
In that case, we should try to come up with a design that avoids creating the ParamEnvAnd instance, because ParamEnv::and() now actually shows up high in the profile. That seems kind of sad, because I suspect it mostly hits the case where the caller_bounds are empty, which could be optimized better, but I have had no time to investigate that yet.
In the case of this PR, you can replace the […]
The type assertion is just a pointer equality check in most situations.
self.val.try_to_bits(size)
match self.val.try_to_scalar()? {
    Scalar::Bits { bits, .. } => {
        Some(bits)
This is fishy, we shouldn't usually return bits without checking that the size is correct.
The idea is that they should already be correct because the type of the constant is the same. Sanity checks would have bailed long before if the number of bits didn't match the type.
Are constants expected to have sizes that don't match their type's size? If so, the premise to this PR is broken. If not, it would seem more sensible to me to assert that the sizes match when the constant is created, rather than on each time we read from it.
> Are constants expected to have sizes that don't match their type's size?
No, that's why there are assertions in place that check the size. If we happened to get the size wrong, that's a bug. These assertions just make sure that we don't screw up (as we have done before, this is not just hypothetical, it's easy to get wrong in some places, but less so nowadays).
> If not, it would seem more sensible to me to assert that the sizes match when the constant is created, rather than on each time we read from it.
I believe that we are doing this now, not directly when creating the ty::Const, but during const_eval.
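The "assert when the constant is created" idea from this exchange can be sketched as follows. This is a simplified model under stated assumptions: `Ty`, `Const`, `Const::new`, and `Const::bits` are hypothetical names, and the real compiler performs its validation during const_eval rather than in a constructor like this one.

```rust
/// Toy stand-in for a compiler type (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    U8,
    U32,
}

impl Ty {
    fn size_bytes(self) -> u32 {
        match self {
            Ty::U8 => 1,
            Ty::U32 => 4,
        }
    }
}

#[derive(Debug)]
struct Const {
    ty: Ty,
    bits: u128,
}

impl Const {
    /// Validate once, at creation, that `bits` fits into `ty`'s size.
    fn new(ty: Ty, bits: u128) -> Option<Const> {
        let width = ty.size_bytes() * 8;
        // Every `Ty` in this sketch is narrower than 128 bits,
        // so the shift amount is always in range.
        if bits >> width == 0 {
            Some(Const { ty, bits })
        } else {
            None
        }
    }

    /// Reads only compare types; the size invariant was established in `new`.
    fn bits(&self, ty: Ty) -> u128 {
        assert_eq!(self.ty, ty, "type mismatch when reading constant");
        self.bits
    }
}
```

The sketch only holds if every way of building a `Const` goes through the validating constructor, which is exactly the point questioned later in the thread.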
TBH I am not very happy about removing a sanity check that costs no measurable performance and caught real bugs. So I'd r-.
But if @oli-obk vetoes me on this, that's fine for me. Just please add a comment stating very explicitly that we are deliberately omitting a sanity check here.
Oh, right, the assertion would have triggered then anyway. I've just been confused, because @RalfJung's comment sounded to me as if the type equality assertion is not enough, and the sizes could be mismatched regardless.
> the type equality assertion is not enough, and the sizes could be mismatched regardless.
It is not enough and there could be a mismatch, pointing to a bug elsewhere. I am not sure if all Const constructors have this check.
Alternatively we can completely move the sanity check to debug assertions. Some #[inline] annotations in appropriate places should make release mode optimize away the leftover arguments.
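A minimal sketch of that alternative, assuming hypothetical `Ty`, `Const`, `size_of_ty`, and `read_bits` names rather than the actual rustc items. `debug_assert_eq!` only evaluates its arguments when debug assertions are enabled, so a release build without them does not run the size lookup at all.

```rust
/// Toy stand-in for a compiler type (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    U8,
    U32,
}

/// Stand-in for the layout lookup.
#[inline]
fn size_of_ty(ty: Ty) -> u8 {
    match ty {
        Ty::U8 => 1,
        Ty::U32 => 4,
    }
}

struct Const {
    ty: Ty,
    bits: u128,
    size: u8,
}

/// Both sanity checks run only when debug assertions are enabled.
#[inline]
fn read_bits(c: &Const, ty: Ty) -> u128 {
    debug_assert_eq!(c.ty, ty, "constant read at the wrong type");
    debug_assert_eq!(c.size, size_of_ty(ty), "constant size does not match its type");
    c.bits
}
```

The trade-off is that a mismatch is only caught in builds with debug assertions enabled.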
Oh, I missed this, I just saw the other comment saying that perf did not change. If this does affect perf of real crates we should consider it, and moving the check to […]
Any chance you could add a benchmark to rustc-perf that would capture this?
I wonder why this PR hasn't brought it over the finish-line?
A mix of me not having the best mood back then (sorry if anyone felt
offended) and lacking the time to either show that this approach is fine (I
felt that the assertion lost its original rationale when the code changed
over time, but I can't, well, prove that) or come up with a solution that
at least caches the sizes locally in the match checking code, which would
just bypass the assertion except for the first lookup.
If anyone wants to pick this up, feel free to.
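The local caching idea mentioned above could look roughly like this. It is only a sketch under assumptions: `SizeCache`, `layout_size`, and `Ty` are hypothetical, and the expensive query is modelled by a function that bumps a counter.

```rust
use std::collections::HashMap;

/// Toy stand-in for a compiler type (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Ty {
    U8,
    U32,
    Char,
}

/// Stand-in for the expensive layout query; counts its invocations.
fn layout_size(ty: Ty, query_count: &mut u64) -> u8 {
    *query_count += 1;
    match ty {
        Ty::U8 => 1,
        Ty::U32 | Ty::Char => 4,
    }
}

/// Per-match-check cache: each type's size is queried at most once.
struct SizeCache {
    sizes: HashMap<Ty, u8>,
    query_count: u64,
}

impl SizeCache {
    fn new() -> Self {
        SizeCache { sizes: HashMap::new(), query_count: 0 }
    }

    fn size_of(&mut self, ty: Ty) -> u8 {
        // Borrow the counter separately so the closure does not capture `self`.
        let count = &mut self.query_count;
        *self.sizes.entry(ty).or_insert_with(|| layout_size(ty, count))
    }
}
```

With such a cache the size assertion could stay in place, while only the first lookup per type pays for the query.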
Jens Hausdorf <notifications@github.com> wrote on Fri, 22 March 2019, 20:19:
> I wonder why this PR hasn't brought it over the finish-line?
I'm doing some related cleanups in #59369
assert_bits ensures that the given type matches the type of the constant
value, and additionally performs a query for the layout of the given
type to get its size. This is then used to assert that it matches the
size of the constant. But since the types are already known to be the
same, this second check is unnecessary, and skipping it also avoids
the expensive layout query.
For the unicode_normalization crate, the match checking time drops from
about 3.8s to about 0.8s for me.
Ref #55528
cc unicode-rs/unicode-normalization#29