Add optimized lock methods for `Sharded` and refactor `Lock` #115388

Zoxc · 2023-08-30T16:42:59Z

This adds methods to `Sharded` which pick a shard and also lock it. These methods branch on parallelism just once instead of twice, improving performance.
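As a rough illustration of the idea (not the code in this PR), the sketch below uses a hypothetical `ShardedSketch` type built on `RefCell` and `Mutex` from the standard library. The type name, the shard count, and the closure-based `with_shard_by_hash` method are assumptions made for the example; rustc's real `Sharded` and `Lock` types differ. The point is only that a single `match` both selects the shard and takes the lock.

```rust
use std::cell::RefCell;
use std::sync::Mutex;

/// Shard count for the multi-threaded case (an arbitrary value for this sketch).
const SHARDS: usize = 4;

/// Hypothetical stand-in for a sharded container; not rustc's actual type.
pub enum ShardedSketch<T> {
    /// Single-threaded mode: one value, no atomic operations.
    Single(RefCell<T>),
    /// Multi-threaded mode: several independently locked shards.
    Shards(Vec<Mutex<T>>),
}

impl<T> ShardedSketch<T> {
    /// Builds the container once, deciding the mode up front.
    pub fn new(parallel: bool, mut make: impl FnMut() -> T) -> Self {
        if parallel {
            ShardedSketch::Shards((0..SHARDS).map(|_| Mutex::new(make())).collect())
        } else {
            ShardedSketch::Single(RefCell::new(make()))
        }
    }

    /// Picks the shard for `hash` and locks it behind one `match`,
    /// rather than one branch to pick the shard and a second one to lock it.
    pub fn with_shard_by_hash<R>(&self, hash: u64, f: impl FnOnce(&mut T) -> R) -> R {
        match self {
            ShardedSketch::Single(cell) => {
                let mut value = cell.borrow_mut();
                f(&mut *value)
            }
            ShardedSketch::Shards(shards) => {
                let index = (hash as usize) % shards.len();
                let mut guard = shards[index].lock().unwrap();
                f(&mut *guard)
            }
        }
    }
}
```

Because it contains a `RefCell`, this sketch is not `Sync` and only shows the control flow; sharing one lock type across threads while skipping synchronization in single-threaded mode needs unsafe code, which is left out here.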

Benchmark for `cfg(parallel_compiler)` and 1 thread:

Benchmark	Before (Time)	After (Time)	After (%)
🟣 clap:check	1.6461s	1.6345s	-0.70%
🟣 hyper:check	0.2414s	0.2394s	-0.83%
🟣 regex:check	0.9205s	0.9143s	-0.67%
🟣 syn:check	1.4981s	1.4869s	-0.75%
🟣 syntex_syntax:check	5.7629s	5.7256s	-0.65%
Total	10.0690s	10.0008s	-0.68%
Summary	1.0000s	0.9928s	-0.72%
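For readers checking the last two rows: the figures are consistent with "Total" being the relative change of the summed times and "Summary" being the mean of the per-benchmark `after / before` ratios. The sketch below reproduces both under that assumption. The function names are made up for the example, and the table alone cannot tell an arithmetic mean from a geometric one, since both round to the same value here.

```rust
/// Relative change from `before` to `after`, in percent.
pub fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Arithmetic mean of the per-benchmark `after / before` ratios.
/// Assumption: this is how the "Summary" row is derived.
pub fn mean_ratio(pairs: &[(f64, f64)]) -> f64 {
    let sum: f64 = pairs.iter().map(|(before, after)| after / before).sum();
    sum / pairs.len() as f64
}

/// (before, after) times in seconds, copied from the table above.
pub const TIMES: [(f64, f64); 5] = [
    (1.6461, 1.6345), // clap:check
    (0.2414, 0.2394), // hyper:check
    (0.9205, 0.9143), // regex:check
    (1.4981, 1.4869), // syn:check
    (5.7629, 5.7256), // syntex_syntax:check
];

/// Returns (total change in percent, mean ratio) for the table above.
pub fn summarize() -> (f64, f64) {
    // The "Total" row values are taken directly from the table.
    (percent_change(10.0690, 10.0008), mean_ratio(&TIMES))
}
```

The total is dominated by `syntex_syntax:check`, the slowest benchmark, while the mean ratio weights all five benchmarks equally, which is why the two rows differ slightly.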

cc @SparrowLii

rustbot · 2023-08-30T16:50:59Z

r? @compiler-errors

(rustbot has picked a reviewer for you, use r? to override)

SparrowLii · 2023-08-31T01:07:02Z

@bors try @rust-timer queue

bors · 2023-08-31T01:07:11Z

⌛ Trying commit 73917dd4206ccbacccdc201d529561ce5bd9055f with merge 24259321f2e7a82959b47b86ded3d1073f281746...

compiler/rustc_data_structures/src/sharded.rs

compiler/rustc_data_structures/src/sync/lock.rs

SparrowLii · 2023-08-31T01:28:17Z

compiler/rustc_data_structures/src/sync/lock.rs

+    /// Safety
+    /// This method must only be called if `might_be_dyn_thread_safe` was true on lock creation.
+    #[inline(always)]
+    unsafe fn lock_assume_sync(&self) {


Do we have to add several unsafe functions? We could just do this inside Lock's methods.

Zoxc · 2023-08-31

This keeps the code non-generic and it also makes LockRaw more fully featured.

SparrowLii · 2023-09-01

I'm not sure it's worth adding extra unsafe functions; after all, both approaches are meant to reduce maintenance costs.
cc @compiler-errors

bors · 2023-08-31T02:17:47Z

☀️ Try build successful - checks-actions
Build commit: 24259321f2e7a82959b47b86ded3d1073f281746 (24259321f2e7a82959b47b86ded3d1073f281746)

klensy · 2023-08-31T15:12:10Z

Well, this regressed heavily for big crates, as of now: https://perf.rust-lang.org/status.html (or CI itself is misbehaving)

Step	Took	Expected
await-call-tree	0m28s	0m28s
bitmaps-3.1.0	1m06s	0m59s
cargo-0.60.0	8m24s	5m23s
clap-3.1.6	1m43s	1m32s
coercions	0m56s	0m54s
cranelift-codegen-0.82.1	9m50s	3m02s

rust-timer · 2023-08-31T16:21:58Z

Finished benchmarking commit (24259321f2e7a82959b47b86ded3d1073f281746): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.4%, -0.4%]	1
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.2%	[2.2%, 2.2%]	1
Improvements ✅ (primary)	-0.8%	[-1.1%, -0.6%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.8%	[-1.1%, -0.6%]	3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.5%	[2.0%, 3.5%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 631.655s -> 631.389s (-0.04%)
Artifact size: 316.64 MiB -> 316.65 MiB (0.00%)

klensy · 2023-08-31T16:24:42Z

Well, this regressed heavily for big crates, as of now: https://perf.rust-lang.org/status.html (or CI itself is misbehaving)

Step	Took	Expected
await-call-tree	0m28s	0m28s
bitmaps-3.1.0	1m06s	0m59s
cargo-0.60.0	8m24s	5m23s
clap-3.1.6	1m43s	1m32s
coercions	0m56s	0m54s
cranelift-codegen-0.82.1	9m50s	3m02s

Perf looks neutral. In that case, why does the time taken differ so much for some benchmarks? cranelift took about 3x longer, for example.

klensy · 2023-08-31T17:36:48Z

And in the next perf run the times went back to normal, which is suspicious:

Currently benchmarking: 6ff94474e1d11.
Time left: 9m56s

Step	Took	Expected
await-call-tree	0m30s	0m28s
bitmaps-3.1.0	1m02s	1m06s
cargo-0.60.0	5m26s	8m24s
clap-3.1.6	1m32s	1m43s
coercions	0m55s	0m56s
cranelift-codegen-0.82.1	3m04s	9m50s

Zoxc · 2023-09-03T01:34:49Z

This now includes a refactored Lock implementation that removes RawLock, uses enums and works with track_caller.
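To make "uses enums and works with track_caller" concrete, here is a minimal sketch of an enum-based lock, assuming a hypothetical `LockSketch` type that wraps `RefCell` and `Mutex` from the standard library. It is not the implementation in this PR; the names and the `new(parallel, value)` constructor are invented for illustration. The `#[track_caller]` attribute makes a panic from `lock` report the caller's location instead of a line inside the lock code.

```rust
use std::cell::{RefCell, RefMut};
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, MutexGuard};

/// Hypothetical lock whose mode is fixed at creation; not rustc's `Lock`.
pub enum LockSketch<T> {
    NoSync(RefCell<T>),
    Sync(Mutex<T>),
}

/// Guard returned by `LockSketch::lock`, mirroring the lock's mode.
pub enum GuardSketch<'a, T> {
    NoSync(RefMut<'a, T>),
    Sync(MutexGuard<'a, T>),
}

impl<T> LockSketch<T> {
    pub fn new(parallel: bool, value: T) -> Self {
        if parallel {
            LockSketch::Sync(Mutex::new(value))
        } else {
            LockSketch::NoSync(RefCell::new(value))
        }
    }

    /// Panics if the single-threaded lock is already held or the mutex is
    /// poisoned. With `#[track_caller]`, the panic points at the caller.
    #[track_caller]
    pub fn lock(&self) -> GuardSketch<'_, T> {
        match self {
            LockSketch::NoSync(cell) => {
                GuardSketch::NoSync(cell.try_borrow_mut().expect("lock was already held"))
            }
            LockSketch::Sync(mutex) => {
                // A held mutex blocks here instead of panicking.
                GuardSketch::Sync(mutex.lock().expect("lock was poisoned"))
            }
        }
    }
}

impl<T> Deref for GuardSketch<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        match self {
            GuardSketch::NoSync(guard) => &**guard,
            GuardSketch::Sync(guard) => &**guard,
        }
    }
}

impl<T> DerefMut for GuardSketch<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        match self {
            GuardSketch::NoSync(guard) => &mut **guard,
            GuardSketch::Sync(guard) => &mut **guard,
        }
    }
}
```

Like any `RefCell`-based type, this sketch is not `Sync`, so it shows the shape of the API rather than something usable across threads.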

Zoxc · 2023-09-03T02:08:15Z

Up to date benchmark for cfg(parallel_compiler) and 1 thread:

Benchmark	Before	After
Benchmark	Time	Time	%
🟣 clap:check	1.6611s	1.6510s	-0.61%
🟣 hyper:check	0.2533s	0.2516s	-0.65%
🟣 regex:check	0.9303s	0.9228s	-0.81%
🟣 syn:check	1.5010s	1.4892s	-0.78%
🟣 syntex_syntax:check	5.7691s	5.7315s	-0.65%
Total	10.1147s	10.0461s	-0.68%
Summary	1.0000s	0.9930s	-0.70%

SparrowLii · 2023-09-05T07:26:11Z

I think the new commit follows the discussion about splitting the impl of Lock into two modules.

This looks good to me. @nnethercote Can you have a look?

SparrowLii · 2023-09-08T09:15:44Z

Thanks! Let's run perf again to confirm.
@bors try @rust-timer queue

bors · 2023-09-08T09:15:56Z

⌛ Trying commit 9690142 with merge 1f36988...

bors · 2023-09-08T10:25:55Z

☀️ Try build successful - checks-actions
Build commit: 1f36988 (1f36988828d2c6b2475df97d8de0e86ff7a4d9b5)

rust-timer · 2023-09-08T12:20:01Z

Finished benchmarking commit (1f36988): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.7%, 0.7%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.3%	[-3.3%, -3.3%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.3%	[-3.3%, 0.7%]	2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.0%	[1.0%, 1.0%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 628.475s -> 628.149s (-0.05%)
Artifact size: 318.12 MiB -> 318.16 MiB (0.01%)

SparrowLii · 2023-09-11T00:51:46Z

@bors r+

bors · 2023-09-11T00:51:47Z

📌 Commit 9690142 has been approved by SparrowLii

It is now in the queue for this repository.

bors · 2023-09-11T01:43:32Z

⌛ Testing commit 9690142 with merge 9b72cc9...

bors · 2023-09-11T03:28:44Z

☀️ Test successful - checks-actions
Approved by: SparrowLii
Pushing 9b72cc9 to master...

rust-timer · 2023-09-11T06:36:48Z

Finished benchmarking commit (9b72cc9): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.7%	[2.7%, 2.7%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 631.455s -> 631.227s (-0.04%)
Artifact size: 317.62 MiB -> 317.64 MiB (0.01%)

rustbot assigned compiler-errors Aug 30, 2023

Zoxc force-pushed the sharded-lock branch from 629820a to 4d664de Compare August 30, 2023 17:48

Zoxc force-pushed the sharded-lock branch from 4d664de to 73917dd Compare August 30, 2023 18:09

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 31, 2023

SparrowLii reviewed Aug 31, 2023

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 31, 2023

klensy mentioned this pull request Aug 31, 2023

Suspicious time diff at https://perf.rust-lang.org/status.html in took/expected rust-lang/rustc-perf#1713

Open

Zoxc force-pushed the sharded-lock branch 2 times, most recently from bfcd7a1 to d500310 Compare September 3, 2023 01:30

Zoxc changed the title from "Add optimized lock methods for Sharded" to "Add optimized lock methods for Sharded and refactor Lock" Sep 3, 2023

SparrowLii approved these changes Sep 5, 2023

rustbot unassigned compiler-errors Sep 7, 2023

Zoxc added 2 commits September 8, 2023 08:48

Add optimized lock methods for Sharded

8fc160b

Refactor Lock implementation

61cc00d

Zoxc force-pushed the sharded-lock branch from d500310 to 2138a97 Compare September 8, 2023 07:29

Remove the LockMode enum and dispatch

9690142

Zoxc force-pushed the sharded-lock branch from 2138a97 to 9690142 Compare September 8, 2023 08:15

SparrowLii approved these changes Sep 8, 2023

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 8, 2023

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 8, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 11, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 11, 2023

bors merged commit 9b72cc9 into rust-lang:master Sep 11, 2023
11 checks passed

rustbot added this to the 1.74.0 milestone Sep 11, 2023

SparrowLii mentioned this pull request Sep 11, 2023

Tracking Issue for Parallel Rustc Front-end #113349

Open

28 tasks

Zoxc deleted the sharded-lock branch September 11, 2023 04:06
