[EXPERIMENT] Use wider types in Layout multiplication #100866

scottmcm · 2022-08-22T05:09:09Z

This lets us phrase it as just one check, rather than two, and might make it easier on LLVM to optimize. It still passes the codegen test from #99174 without needing the manual optimization anymore, so let's see whether perf likes it.

We've picked up llvm/llvm-project#56563, so LLVM is now smarter about optimizing mul nuw with constants, which is what this is frequently emitting (because the type size often comes from a generic type parameter).
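To make the "one check rather than two" idea concrete, here is a minimal sketch of the two shapes being compared. It is not the diff from this PR: the function names (`max_size_for_align`, `array_size_two_checks`, `array_size_wide`) are hypothetical, `u128` is assumed as the wider type, and `align` is assumed to be a nonzero power of two.

```rust
/// Largest size permitted for a given alignment: rounding the size up to
/// `align` must still fit in `isize`. Assumes `align` is a nonzero power of two.
/// (Hypothetical helper for this sketch.)
fn max_size_for_align(align: usize) -> usize {
    isize::MAX as usize - (align - 1)
}

/// Two checks: an overflow-detecting multiply, then a comparison against the limit.
fn array_size_two_checks(elem_size: usize, align: usize, n: usize) -> Option<usize> {
    let size = elem_size.checked_mul(n)?; // check 1: did the product wrap around `usize`?
    if size > max_size_for_align(align) {
        None // check 2: product is too large for a valid layout
    } else {
        Some(size)
    }
}

/// One check: multiply in a wider type, where the product of two `usize`
/// values (at most 64 bits each) cannot overflow, then compare once.
fn array_size_wide(elem_size: usize, align: usize, n: usize) -> Option<usize> {
    let wide = elem_size as u128 * n as u128;
    if wide > max_size_for_align(align) as u128 {
        None
    } else {
        Some(wide as usize) // lossless: `wide` is at most `isize::MAX`
    }
}
```

Both functions should agree on every input; for instance, `array_size_wide(8, 8, 4)` and `array_size_two_checks(8, 8, 4)` both yield `Some(32)`, and both reject `usize::MAX * 2`. When `elem_size` is a compile-time constant (as it is when it comes from `size_of::<T>()`), the optimizer has a chance to fold the wide multiply and comparison into a single bound check on `n`, which is the effect this experiment was hoping for.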

The IR/ASM looks pretty good for this approach too. For comparisons, see

https://rust.godbolt.org/z/ssjr1Yeas (64-bit)
https://rust.godbolt.org/z/9f8c45b7W (32-bit)

cc @CAD97 & #99117

rust-highfive · 2022-08-22T05:09:12Z

r? @thomcc

(rust-highfive has picked a reviewer for you, use r? to override)

scottmcm · 2022-08-22T07:29:37Z

@bors try @rust-timer queue

rust-timer · 2022-08-22T07:29:38Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-08-22T07:29:46Z

⌛ Trying commit d8ed3d2 with merge 1bd20b5b3b52926ba90895b739c9b4b5472f4adf...

bors · 2022-08-22T08:50:17Z

☀️ Try build successful - checks-actions
Build commit: 1bd20b5b3b52926ba90895b739c9b4b5472f4adf

rust-timer · 2022-08-22T08:50:19Z

Queued 1bd20b5b3b52926ba90895b739c9b4b5472f4adf with parent d0ea1d7, future comparison URL.

rust-timer · 2022-08-22T11:13:22Z

Finished benchmarking commit (1bd20b5b3b52926ba90895b739c9b4b5472f4adf): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | 2.1% | 33 |
| Regressions ❌ (secondary) | 0.9% | 1.7% | 6 |
| Improvements ✅ (primary) | -0.5% | -0.6% | 5 |
| Improvements ✅ (secondary) | -0.5% | -0.7% | 20 |
| All ❌✅ (primary) | 0.8% | 2.1% | 38 |

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions ❌ (primary) | 2.1% | 2.2% | 3 |
| Regressions ❌ (secondary) | 2.2% | 2.2% | 1 |
| Improvements ✅ (primary) | -7.1% | -14.8% | 3 |
| Improvements ✅ (secondary) | -2.1% | -3.3% | 4 |
| All ❌✅ (primary) | -2.5% | -14.8% | 6 |

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean¹ | max | count² |
|---|---|---|---|
| Regressions ❌ (primary) | 2.4% | 2.9% | 5 |
| Regressions ❌ (secondary) | 3.5% | 5.1% | 10 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.4% | 2.9% | 5 |

¹ the arithmetic mean of the percent change
² number of relevant changes

scottmcm · 2022-08-22T21:19:45Z

Some surprising wins in doc builds, somehow, but clearly not a good change overall.

Use wider types in Layout multiplication, rather than overflow intrinsics (commit d8ed3d2)

This lets us phrase it as just one check, rather than two, and might make it easier on LLVM to optimize. It still passes the codegen test without needing the manual optimization anymore, so let's see.

rust-highfive assigned thomcc Aug 22, 2022

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Aug 22, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 22, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 22, 2022

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Aug 22, 2022

scottmcm closed this Aug 22, 2022

scottmcm deleted the wide-layout-experiment branch August 22, 2022 21:20

scottmcm mentioned this pull request Nov 24, 2023

Indicate that multiplication in Layout::array cannot overflow #118228

Merged

scottmcm mentioned this pull request Apr 19, 2024

LLVM failed to use the knowledge from a never-overflow assumption rust-lang/hashbrown#509

Open
