Remove manual unrolling from slice::Iter(Mut)::try_fold #64600

scottmcm · 2019-09-19T04:37:31Z

While this definitely helps sometimes (particularly for trivial closures), it's also a pessimization sometimes, so it's better to leave this to (hypothetical) future LLVM improvements instead of forcing this on everyone.

I think it's better for the advice to be that you sometimes need to unroll manually than that you sometimes need to not-unroll manually (like #64545).
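For readers unfamiliar with the pattern, here is a minimal sketch of the difference between a plain short-circuiting fold and a hand-unrolled one. It is not the libcore code touched by this PR: the function names `try_fold_simple` and `try_fold_unrolled` are hypothetical, `Option` stands in for the unstable `Try` trait, and the unroll factor of 4 is an assumption made for illustration.

```rust
/// Plain short-circuiting fold: one closure call (and one possible
/// early exit) per loop iteration.
fn try_fold_simple<T, B>(
    slice: &[T],
    init: B,
    mut f: impl FnMut(B, &T) -> Option<B>,
) -> Option<B> {
    let mut acc = init;
    for x in slice {
        acc = f(acc, x)?;
    }
    Some(acc)
}

/// Hand-unrolled variant (hypothetical): four closure calls per iteration
/// of the main loop, each with its own early exit, then a remainder loop.
/// The closure body is duplicated at every call site if it gets inlined.
fn try_fold_unrolled<T, B>(
    slice: &[T],
    init: B,
    mut f: impl FnMut(B, &T) -> Option<B>,
) -> Option<B> {
    let mut acc = init;
    let mut chunks = slice.chunks_exact(4);
    for c in &mut chunks {
        acc = f(acc, &c[0])?;
        acc = f(acc, &c[1])?;
        acc = f(acc, &c[2])?;
        acc = f(acc, &c[3])?;
    }
    // Fewer than four elements are left over at this point.
    for x in chunks.remainder() {
        acc = f(acc, x)?;
    }
    Some(acc)
}
```

Both functions visit elements in the same order and stop at the same element; they differ only in how many copies of the closure call the optimizer has to deal with, which is where the code-size and compile-time trade-off comes from.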

Final perf comparison: #64600 (comment)

For context see #64572 (comment)

scottmcm · 2019-09-19T05:00:24Z

@bors try @rust-timer queue

rust-timer · 2019-09-19T05:00:26Z

Awaiting bors try build completion

bors · 2019-09-19T05:00:36Z

⌛ Trying commit 38d8c8d with merge dd115ba...

@scottmcm

[DO NOT MERGE] Experiment with removing unrolling from slice::Iter::try_fold For context see #64572 (comment) r? @scottmcm

bors · 2019-09-19T07:55:53Z

☀️ Try build successful - checks-azure
Build commit: dd115ba

rust-timer · 2019-09-19T07:55:54Z

Queued dd115ba with parent eceec57, future comparison URL.

rust-timer · 2019-09-19T10:20:00Z

Finished benchmarking try commit dd115ba, comparison URL.

nnethercote · 2019-09-19T11:58:00Z

This change gets roughly half the improvements that the commit in #64572 gets.

bluss · 2019-09-20T06:23:22Z

I think that unrolling would eventually have to go and be removed from libcore, I was just hoping the compiler would catch up and be able to unroll loops with multiple exits itself. Unrolling should ideally belong to the compiler, so it can make the decision about when to duplicate code. I haven't revisited that, so for all I know llvm could have learned this by now. [Edit: checked -- rustc nightly does not unroll such things by itself right now either. I wonder if this multiple exit improvement means that things are on the way..? No clue]

This seems like a situation where it's easy to find both good and bad cases. Things like scans through bytes for parsing with a simple predicate benefit a lot from unrolling in all/find etc.
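As an illustration of that kind of byte scan, the sketch below compares a straightforward `all` with a caller-side unrolled version. The function names and the choice of `is_ascii_digit` as the predicate are assumptions made for the example, not code from nom or libcore, and whether the unrolled form is actually faster has to be measured for each case.

```rust
/// Straightforward scan: relies on `Iterator::all`, which short-circuits
/// on the first byte that fails the predicate.
fn all_digits(bytes: &[u8]) -> bool {
    bytes.iter().all(|b| b.is_ascii_digit())
}

/// Caller-side unrolling (hypothetical): test four bytes per iteration,
/// then handle the tail with the ordinary iterator method.
fn all_digits_unrolled(bytes: &[u8]) -> bool {
    let mut chunks = bytes.chunks_exact(4);
    for c in &mut chunks {
        // The non-short-circuiting `&` evaluates all four checks, so the
        // `if` below is the only early exit per chunk.
        let ok = c[0].is_ascii_digit()
            & c[1].is_ascii_digit()
            & c[2].is_ascii_digit()
            & c[3].is_ascii_digit();
        if !ok {
            return false;
        }
    }
    chunks.remainder().iter().all(|b| b.is_ascii_digit())
}
```

Keeping the unrolling in the caller means only code that benefits from it pays for the extra instructions.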

scottmcm · 2019-09-22T04:23:02Z

Ok, it seems like the inclination is that we should do this so I've turned this into a "real" PR.

I do think it's better for the advice to be that sometimes you need to unroll manually than you sometimes need to not-unroll manually, though I certainly wish LLVM was better at these cases.

I'm not sure who should approve this -- does it need libs sign-off?

timvermeulen · 2019-09-22T19:51:33Z

Couldn't you just remove the try_fold and fold overrides entirely? They seem to now mirror the default implementations. And I think there's manual unrolling going on in the DoubleEndedIterator impl, too.
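For reference, a default short-circuiting fold has roughly the shape sketched below: it only needs `next()`. This is a simplified sketch, not the libcore source. `try_fold_via_next` is a hypothetical free function, and `Option` is used in place of the unstable `Try` trait.

```rust
/// Roughly the shape of a default short-circuiting fold: pull items with
/// `next()` until the iterator is exhausted or the closure asks to stop.
fn try_fold_via_next<I, B>(
    iter: &mut I,
    init: B,
    mut f: impl FnMut(B, I::Item) -> Option<B>,
) -> Option<B>
where
    I: Iterator,
{
    let mut acc = init;
    while let Some(x) = iter.next() {
        acc = f(acc, x)?;
    }
    Some(acc)
}
```

Because the fold takes the iterator by mutable reference, an early exit leaves it positioned just after the element that stopped the fold, so the caller can resume from there.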

scottmcm · 2019-09-24T09:54:58Z

@bors try @rust-timer queue

(I'm curious to see the new self-profile results, and want to make sure that removing the overrides still keeps the gain here -- it might mean more work to eliminate !s.)

rust-timer · 2019-09-24T09:54:59Z

Awaiting bors try build completion

bors · 2019-09-24T09:55:11Z

⌛ Trying commit 6ac64ab with merge 8be3622...

Remove manual unrolling from slice::Iter(Mut)::try_fold While this definitely helps sometimes (particularly for trivial closures), it's also a pessimization sometimes, so it's better to leave this to (hypothetical) future LLVM improvements instead of forcing this on everyone. I think it's better for the advice to be that sometimes you need to unroll manually than you sometimes need to not-unroll manually (like #64545). --- For context see #64572 (comment)

bors · 2019-09-24T12:54:28Z

☀️ Try build successful - checks-azure
Build commit: 8be3622 (8be3622ad74755484fd9b9e401d0ee96837be244)

rust-timer · 2019-09-24T12:54:30Z

Queued 8be3622 with parent 66bf391, future comparison URL.

rust-timer · 2019-09-24T18:53:29Z

Finished benchmarking try commit 8be3622, comparison URL.

bjorn3 · 2019-09-24T19:11:12Z

script-servo-opt regressed, the rest is positive or neutral.

scottmcm · 2019-09-24T19:33:42Z

Oh, interesting. script-servo-opt is new, right? I can't find it in the previous perf comparison...

scottmcm · 2019-09-24T20:12:56Z

New perf link (thank you, Mark-Simulacrum!) with self-profile results for both sides:
https://perf.rust-lang.org/compare.html?start=b4ba2a3953ea9ec28f01c314be315d46673bd782&end=8be3622ad74755484fd9b9e401d0ee96837be244

It looks like nearly all of the speedup for clap-rs-debug run clean is ~2.5s in LLVM_emit_obj (and ~0.3s in LLVM_make_bitcode). Any idea why this would be so much better there? I'd hypothesized that the closure might be getting inlined 5x even in debug, but based on a quick experiment (https://rust.godbolt.org/z/HmYVvd) that doesn't seem to happen. Since this PR still ends up using try_fold (unlike #64572), it doesn't feel like it should have had a major enough impact on the amount of code that's getting sent to LLVM to account for the size of the win.

And for script-servo-opt run baseline incremental it's LLVM_emit_obj that increased by ~13.5s with this change. Curious.

bluss · 2019-09-27T16:57:51Z

This change looks good to me, but I guess we are waiting for some discussion. I'll try to ask @Geal about nom performance and unrolling.

You know how much I would like to say we can just reimplement important stuff, like an unrolling slice iterator, outside libcore, but the libcore version is still tied up with unstable features like assume — unknown what impact they have as of current rustc version.

Geal · 2019-09-27T17:31:01Z

@bluss no issue for me, nom does not use try_fold nor try_rfold internally

bluss · 2019-09-27T20:44:59Z

It looks like it's not impossible for rustc to unroll an "Iterator::all" like loop. It just can't do it in the simplest forms that those loops take, for example not in for elt in v { .. }

I have some old alternative slice iterator code, and it can be automatically unrolled by the compiler. The code is here (github link to iter.rs) and there are benchmarks that show the unrolling on that specific branch. I haven't managed to reduce the loop that will unroll, though — maybe it's specific to the code in the benchmark? The compiler's unroll disappears if the special case .all() method is removed from that iterator, even though it only adds indirections (similar to using try_fold).

scottmcm · 2019-09-29T23:37:55Z

@bluss do you still have r+ here, or do I need to find a different reviewer for this?

bluss · 2019-09-30T05:37:18Z

@scottmcm I guess I do, but with the nominated tag I thought we were waiting for the libs team

bluss · 2019-09-30T05:39:22Z

@bors r+ rollup=never

bors · 2019-09-30T05:39:23Z

📌 Commit 6ac64ab has been approved by bluss

bors · 2019-09-30T09:52:35Z

⌛ Testing commit 6ac64ab with merge e0436d9...

Remove manual unrolling from slice::Iter(Mut)::try_fold While this definitely helps sometimes (particularly for trivial closures), it's also a pessimization sometimes, so it's better to leave this to (hypothetical) future LLVM improvements instead of forcing this on everyone. I think it's better for the advice to be that sometimes you need to unroll manually than you sometimes need to not-unroll manually (like #64545). --- For context see #64572 (comment)

bors · 2019-09-30T13:33:09Z

☀️ Test successful - checks-azure
Approved by: bluss
Pushing e0436d9 to master...

bluss · 2019-10-01T19:24:38Z

@Geal this change is now in the rustc 1.40.0-nightly (22bc9e1 2019-09-30) nightly. I'd be surprised if it didn't impact benches in some way.

rust-highfive assigned scottmcm Sep 19, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 19, 2019

bors added a commit that referenced this pull request Sep 19, 2019

Auto merge of #64600 - scottmcm:no-slice-tryfold-unroll, r=<try>

dd115ba

[DO NOT MERGE] Experiment with removing unrolling from slice::Iter::try_fold For context see #64572 (comment) r? @scottmcm

RalfJung mentioned this pull request Sep 19, 2019

try builds: include a copyable version of the full commit SHA in comment rust-lang/homu#53

Merged

nnethercote mentioned this pull request Sep 19, 2019

Simplify some Iterator methods. #64572

Closed

scottmcm force-pushed the no-slice-tryfold-unroll branch from 38d8c8d to 92e91f7 Compare September 22, 2019 04:16

scottmcm changed the title ~~[DO NOT MERGE] Experiment with removing unrolling from slice::Iter::try_fold~~ Remove manual unrolling from slice::Iter(Mut)::try_fold Sep 22, 2019

scottmcm assigned bluss and unassigned scottmcm Sep 22, 2019

scottmcm added I-nominated T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Sep 22, 2019

scottmcm force-pushed the no-slice-tryfold-unroll branch from 572de05 to 2f7b32a Compare September 23, 2019 04:17

Just delete the overrides now that they match the default implementations

6ac64ab

scottmcm force-pushed the no-slice-tryfold-unroll branch from 2f7b32a to 6ac64ab Compare September 24, 2019 06:13

andjo403 mentioned this pull request Sep 28, 2019

use try_fold instead of try_for_each to reduce compile time #64885

Merged

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 30, 2019

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 30, 2019

bors merged commit 6ac64ab into rust-lang:master Sep 30, 2019

scottmcm deleted the no-slice-tryfold-unroll branch October 1, 2019 19:52

nnethercote mentioned this pull request Oct 3, 2019

Remove most #[inline] annotations rust-lang/hashbrown#119

Merged

the8472 mentioned this pull request Jan 24, 2020

perf: Use for_each in Vec::extend #68046

Closed
