Reverse Timsort scan direction #107191

Voultapher · 2023-01-22T11:31:21Z

Another PR in the series of stable sort improvements. Best reviewed by looking at the individual commits.

The main perf gain here is for fully ascending (sorted) or reversed inputs for cheap to compare types such as u64, these see a ~1.5x speedup.

Types such as string with indirect pre-fetching see only minor changes. Further speedups are planned in future PRs so, I wouldn't spend too much time for benchmarks here.

This is more clear about the intent of the pointer and avoids problems if the allocation returns a null pointer.

Avoid duplicate insertion sort implementations. Optimize implementations.

Memory pre-fetching prefers forward scanning vs backwards scanning, and the code-gen is usually better. For the most sensitive types such as integers, these are planned to be merged bidirectionally at once. So there is no benefit in scanning backwards. The largest perf gains are seen for full ascending and descending inputs, which see 1.5x speedups. Random inputs benefit too, and some patterns can loose out, but these losses are minimal.

rustbot · 2023-01-22T11:31:29Z

r? @m-ou-se

(rustbot has picked a reviewer for you, use r? to override)

rustbot · 2023-01-22T11:31:31Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

Stabilizing library features
Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
Changing public documentation in ways that create new stability guarantees
Changing observable runtime behavior of library APIs

Voultapher · 2023-01-22T19:26:42Z

r? thomcc

thomcc · 2023-01-22T20:28:13Z

@bors try @rust-timer queue

bors · 2023-01-22T20:28:22Z

⌛ Trying commit f297afa with merge ff841929044c3390745330612525de1a62492383...

bors · 2023-01-22T22:35:03Z

☀️ Try build successful - checks-actions
Build commit: ff841929044c3390745330612525de1a62492383 (ff841929044c3390745330612525de1a62492383)

rust-timer · 2023-01-23T03:43:48Z

Finished benchmarking commit (ff841929044c3390745330612525de1a62492383): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.4%	[0.2%, 0.6%]	4
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.4%, -0.4%]	1
All ❌✅ (primary)	0.4%	[0.2%, 0.6%]	4

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.3%, 0.3%]	1
Regressions ❌ (secondary)	7.5%	[7.5%, 7.5%]	1
Improvements ✅ (primary)	-3.1%	[-3.1%, -3.1%]	1
Improvements ✅ (secondary)	-1.1%	[-1.5%, -0.6%]	2
All ❌✅ (primary)	-1.4%	[-3.1%, 0.3%]	2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.0%	[1.0%, 1.0%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.0%	[1.0%, 1.0%]	1

Voultapher · 2023-01-25T19:10:42Z

@thomcc looking at the regressions and improvements specific to this PR, I get the impression there is no clear win or loss here. Also it seems the magnitude of change is rather small. But I'm not familiar with these benchmarks and their significance, I'd like to hear your impression. It should also be said that these changes are mostly of setup nature, and the next PR plans to introduce the first chunk of larger speedups.

thomcc · 2023-01-31T03:29:17Z

@Voultapher Not sure. It could be noise, but it looks like the regressions are more significant than the improvements. Note that there are 4 primary benchmarks regressed vs 1 secondary (e.g. synthetic) benchmark which improved. I'll try more runs in case it's noise, but it's worth investigating.

@bors try @rust-timer queue runs=5

bors · 2023-01-31T03:29:28Z

⌛ Trying commit 5eff264 with merge 82bc25cb7e57ff1e01bdc1a76269d8c5ff08d2f3...

rust-timer · 2023-01-31T07:45:43Z

Finished benchmarking commit (82bc25cb7e57ff1e01bdc1a76269d8c5ff08d2f3): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.4%, 0.5%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.5%, -0.5%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.1%	[-0.5%, 0.5%]	3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.9%	[0.3%, 3.5%]	3
Regressions ❌ (secondary)	2.8%	[0.9%, 4.7%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.7%	[-3.5%, -2.1%]	8
All ❌✅ (primary)	1.9%	[0.3%, 3.5%]	3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.3%	[3.1%, 3.6%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.0%	[-2.0%, -2.0%]	1
All ❌✅ (primary)	-	-	0

thomcc · 2023-01-31T07:47:21Z

@rustbot author

Voultapher · 2023-01-31T11:17:19Z

@thomcc looking at the two runs, the second one has one primary improvement of 0.5% and one primary regression of 0.5% in the same crate, as well as one further regression of 0.3% in the same crate. The bootstrap timings look all over the place. proc_macro regressed by 2.3% in the second run, but improved by 1.4% in the first run. rustc_builtin_macros regressed by 0.7% or 3.3%. It notes the noise is 1-3%, and pretty much everything falls within this range. I'd argue this change has no significant detrimental effect.

Voultapher · 2023-01-31T11:17:40Z

@rustbot ready

thomcc · 2023-02-01T01:00:54Z

Hm, fair enough (to be clear: my pickiness here is just to ensure we don't land optimizations that are actually pessimizations, I think the change is good in general).

(I'll do my review this weekend)

Voultapher · 2023-02-01T07:45:06Z

One open question is, how much sort performance even influences compiler performance. As IIUC this benchmark suite is focused on compiler performance only.

thomcc · 2023-02-01T15:47:32Z

It is. The compiler definitely performs sorts though, and it wouldn't surprise me if some were in sensitive positions.

thomcc · 2023-02-13T02:19:51Z

Okay, it took a jillion years but I am convinced of this code's correctness. Thanks.

@bors r+

bors · 2023-02-13T02:19:53Z

📌 Commit 5eff264 has been approved by thomcc

It is now in the queue for this repository.

bors · 2023-02-13T04:06:07Z

⌛ Testing commit 5eff264 with merge 96834f0...

bors · 2023-02-13T07:07:27Z

☀️ Test successful - checks-actions
Approved by: thomcc
Pushing 96834f0 to master...

rust-timer · 2023-02-13T08:52:11Z

Finished benchmarking commit (96834f0): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.6%, 0.6%]	1
Regressions ❌ (secondary)	0.2%	[0.2%, 0.2%]	2
Improvements ✅ (primary)	-0.4%	[-0.4%, -0.4%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.1%	[-0.4%, 0.6%]	3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.4%	[0.1%, 2.8%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.7%	[-2.7%, -2.7%]	1
Improvements ✅ (secondary)	-1.7%	[-2.1%, -1.2%]	3
All ❌✅ (primary)	0.1%	[-2.7%, 2.8%]	3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.0%	[-3.0%, -2.9%]	2
All ❌✅ (primary)	-	-	0

rylev · 2023-02-14T16:28:39Z

Regressions are small enough that I think we don't need to investigate this closely.

@rustbot label: perf-regression-triaged

… r=Mark-Simulacrum Fix no_global_oom_handling build `provide_sorted_batch` in core is incorrectly marked with `#[cfg(not(no_global_oom_handling))]` which prevents core from building with the cfg enabled. Nothing in `core` allocates memory (including this function). The `cfg` gate is incorrect. cc `@dpaoliello` r? `@wesleywiser` The cfg was added by rust-lang#107191

… r=Mark-Simulacrum Fix no_global_oom_handling build `provide_sorted_batch` in core is incorrectly marked with `#[cfg(not(no_global_oom_handling))]` which prevents core from building with the cfg enabled. Nothing in `core` allocates memory (including this function). The `cfg` gate is incorrect. cc ``@dpaoliello`` r? ``@wesleywiser`` The cfg was added by rust-lang#107191

Voultapher added 3 commits January 21, 2023 10:17

Use NonNull in merge_sort

703ff60

This is more clear about the intent of the pointer and avoids problems if the allocation returns a null pointer.

Unify insertion sort implementations

a3065a1

Avoid duplicate insertion sort implementations. Optimize implementations.

rustbot assigned m-ou-se Jan 22, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 22, 2023

Voultapher changed the title ~~Reverse timsort scan direction~~ Reverse Timsort scan direction Jan 22, 2023

This comment has been minimized.

Sign in to view

rustbot assigned thomcc and unassigned m-ou-se Jan 22, 2023

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 22, 2023

This comment has been minimized.

Sign in to view

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jan 23, 2023

Document missing unsafe blocks

5eff264

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 31, 2023

This comment has been minimized.

Sign in to view

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 31, 2023

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 31, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 31, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 13, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 13, 2023

bors merged commit 96834f0 into rust-lang:master Feb 13, 2023

rustbot added this to the 1.69.0 milestone Feb 13, 2023

This was referenced Feb 13, 2023

[WIP] Constified slice::sort_unstable, sort_internals #102279

Closed

Add Median of Medians fallback to introselect #107522

Merged

rustbot added the perf-regression-triaged The performance regression has been triaged. label Feb 14, 2023

arlosi mentioned this pull request Apr 21, 2023

Fix no_global_oom_handling build #110649

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reverse Timsort scan direction #107191

Reverse Timsort scan direction #107191

Voultapher commented Jan 22, 2023

rustbot commented Jan 22, 2023

rustbot commented Jan 22, 2023

This comment has been minimized.

Voultapher commented Jan 22, 2023

thomcc commented Jan 22, 2023

This comment has been minimized.

bors commented Jan 22, 2023

bors commented Jan 22, 2023

This comment has been minimized.

rust-timer commented Jan 23, 2023

Voultapher commented Jan 25, 2023

thomcc commented Jan 31, 2023

This comment has been minimized.

bors commented Jan 31, 2023

This comment has been minimized.

rust-timer commented Jan 31, 2023

thomcc commented Jan 31, 2023

Voultapher commented Jan 31, 2023 •

edited

Loading

Voultapher commented Jan 31, 2023

thomcc commented Feb 1, 2023 •

edited

Loading

Voultapher commented Feb 1, 2023

thomcc commented Feb 1, 2023

thomcc commented Feb 13, 2023

bors commented Feb 13, 2023

bors commented Feb 13, 2023

bors commented Feb 13, 2023

rust-timer commented Feb 13, 2023

rylev commented Feb 14, 2023

Reverse Timsort scan direction #107191

Reverse Timsort scan direction #107191

Conversation

Voultapher commented Jan 22, 2023

rustbot commented Jan 22, 2023

rustbot commented Jan 22, 2023

This comment has been minimized.

Voultapher commented Jan 22, 2023

thomcc commented Jan 22, 2023

This comment has been minimized.

bors commented Jan 22, 2023

bors commented Jan 22, 2023

This comment has been minimized.

rust-timer commented Jan 23, 2023

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Voultapher commented Jan 25, 2023

thomcc commented Jan 31, 2023

This comment has been minimized.

bors commented Jan 31, 2023

This comment has been minimized.

rust-timer commented Jan 31, 2023

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

thomcc commented Jan 31, 2023

Voultapher commented Jan 31, 2023 • edited Loading

Voultapher commented Jan 31, 2023

thomcc commented Feb 1, 2023 • edited Loading

Voultapher commented Feb 1, 2023

thomcc commented Feb 1, 2023

thomcc commented Feb 13, 2023

bors commented Feb 13, 2023

bors commented Feb 13, 2023

bors commented Feb 13, 2023

rust-timer commented Feb 13, 2023

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

rylev commented Feb 14, 2023

Voultapher commented Jan 31, 2023 •

edited

Loading

thomcc commented Feb 1, 2023 •

edited

Loading