Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid layout calculations in assert_bits to speed up match checking #57546

Closed
wants to merge 1 commit into from

Conversation

dotdash
Copy link
Contributor

@dotdash dotdash commented Jan 12, 2019

assert_bits ensures that the given type matches the type of the constant
value, and additionally performs a query for the layout of the given
type to get its size. This is then used to assert that it matches the
size of the constant. But since the types are already known to be the
same, this second check is unnecessary, and skipping it also allows to
skip the expensive layout query.

For the unicode_normalization crate, the match checking time drops from
about 3.8s to about 0.8s for me.

Ref #55528

cc unicode-rs/unicode-normalization#29

assert_bits ensures that the given type matches the type of the constant
value, and additionally performs a query for the layout of the given
type to get its size. This is then used to assert that it matches the
size of the constant. But since the types are already known to be the
same, this second check is unnecessary, and skipping it also allows to
skip the expensive layout query.

For the unicode_normalization crate, the match checking time drops from
about 3.8s to about 0.8s for me.
@dotdash
Copy link
Contributor Author

dotdash commented Jan 12, 2019

@bors try

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 12, 2019
@bors
Copy link
Contributor

bors commented Jan 12, 2019

⌛ Trying commit 42e996e with merge f81ba2b...

bors added a commit that referenced this pull request Jan 12, 2019
Avoid layout calculations in assert_bits to speed up match checking

assert_bits ensures that the given type matches the type of the constant
value, and additionally performs a query for the layout of the given
type to get its size. This is then used to assert that it matches the
size of the constant. But since the types are already known to be the
same, this second check is unnecessary, and skipping it also allows to
skip the expensive layout query.

For the unicode_normalization crate, the match checking time drops from
about 3.8s to about 0.8s for me.

Ref #55528

cc unicode-rs/unicode-normalization#29
@bors
Copy link
Contributor

bors commented Jan 12, 2019

☀️ Test successful - checks-travis
State: approved= try=True

@nagisa
Copy link
Member

nagisa commented Jan 12, 2019

@rust-timer build f81ba2b

@rust-timer
Copy link
Collaborator

Success: Queued f81ba2b with parent d6525ef, comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit f81ba2b

@Mark-Simulacrum
Copy link
Member

Performance looks neutral -- no changes up or down on perf.r-l.o. However, this does seem simpler so perhaps we should go ahead with it? I think @oli-obk or perhaps @RalfJung probably knows this code the best?

@dotdash
Copy link
Contributor Author

dotdash commented Jan 14, 2019

I did not expect much of a change for crates that aren't match-heavy, but it is simple, and for the really match-heavy unicode_normalization crate, it cuts away ~50% of the total compilation time.

Is there already something that uses match heavily on prlo? If not, could we maybe add the unicode_normalization crate?

@oli-obk
Copy link
Contributor

oli-obk commented Jan 14, 2019

for the really match-heavy unicode_normalization crate, it cuts away ~50% of the total compilation time.

I think we should add that crate to perfrlo, otherwise I'm certain it will regress again.

I'm amazed that there is much effect from such a change, as I would have assumed this to be a minor point in the matching code. For futher perf boons it's fine to additionally change the leftover type assertions to debug assertions.

@dotdash
Copy link
Contributor Author

dotdash commented Jan 14, 2019

I'm amazed that there is much effect from such a change, as I would have assumed this to be a minor point in the matching code.

It's "just" a two cache lookups each time AFAICT, but for that crate, it means >50M extra queries. It adds up ;-)

For futher perf boons it's fine to additionally change the leftover type assertions to debug assertions.

In that case, we should try to come up with a design that avoids creating the ParamEnvAnd instance then, because ParamEnv::and() now actually shows up high in the profile. Which actually seems kind of sad, because I suspect it's mostly hitting the case where the caller_bounds are empty, which could be optimized better, but I had no time to investigate that yet.

@oli-obk
Copy link
Contributor

oli-obk commented Jan 14, 2019

In the case of this PR, you can replace the ParamEnvAnd<Ty<'tcx>> with Ty<'tcx> as you're just checking the value part anyway ;)

The type assertion is just a pointer equality check in most situations

self.val.try_to_bits(size)
match self.val.try_to_scalar()? {
Scalar::Bits { bits, .. } => {
Some(bits)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fishy, we shouldn't usually return bits without checking that the size is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that they should already be correct because the type of the constant is the same. Sanity checks would have bailed long before if the number of bits didn't match the type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are constants expected to have sizes that don't match their type's size? If so, the premise to this PR is broken. If not, it would seem more sensible to me to assert that the sizes match when the constant is created, rather than on each time we read from it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are constants expected to have sizes that don't match their type's size?

No, that's why there are assertions in place that check the size. If we happened to get the size wrong, that's a bug. These assertions just make sure that we don't screw up (as we have done before, this is not just hypothetical, it's easy to get wrong in some places, but less so nowadays).

If not, it would seem more sensible to me to assert that the sizes match when the constant is created, rather than on each time we read from it.

I believe that we are doing this now, not directly when creating the ty::Const, but during const_eval.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I am not very happy about removing a sanity check that costs no measurable performance and caught real bugs. So I'd r-.

But if @oli-obk vetoes me on this, that's fine for me. Just please add a comment stating very explicitly that we are deliberately omitting a sanity check here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, the assertion would have triggered then anyway. I've just been confused, because @RalfJung's comment sounded to me as if the type equality assertion is not enough, and the sizes could be mismatched regardless.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the type equality assertion is not enough, and the sizes could be mismatched regardless.

It is not enough and there could be a mismatch, pointing to a bug elsewhere. I am not sure if all Const constructors have this check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively we can completely move the sanity check to debug assertions. Some #[inline] annotations in appropriate places should make release mode optimize away the leftover arguments

@dotdash dotdash closed this Jan 14, 2019
@RalfJung
Copy link
Member

RalfJung commented Jan 14, 2019

For the unicode_normalization crate, the match checking time drops from
about 3.8s to about 0.8s for me.

Oh, I missed this, I just saw the other comment saying that perf did not change. If this does affect perf of real crates we should consider it, and moving the check to Const-construction time should help.

Any chance you could add a benchmark to rustc-perf that would capture this?

@jens1o
Copy link
Contributor

jens1o commented Mar 22, 2019

I wonder why this PR hasn't brought it over the finish-line?

@dotdash
Copy link
Contributor Author

dotdash commented Mar 23, 2019 via email

@oli-obk
Copy link
Contributor

oli-obk commented Mar 23, 2019

I'm doing some related cleanups in #59369

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants