Specialize StepBy<Range(Inclusive)> #51601

Emerentius · 2018-06-16T21:13:01Z

Part of #51557, related to #43064, #31155

As discussed in the above issues, step_by optimizes very badly on ranges which is related to

the special casing of the first StepBy::next() call
the need to do 2 additions of n - 1 and 1 inside the range's next()

This PR eliminates both by overriding next() to always produce the current element and also step ahead by n elements in one go. The generated code is much better, even identical in the case of a Range with constant start and end where start+step can't overflow. Without constant bounds it's a bit longer than the manual loop. RangeInclusive doesn't optimize as nicely but is still much better than the original asm.
Unsigned integers optimize better than signed ones for some reason.

See the following two links for a comparison.

godbolt: specialization for ..
godbolt: specialization for ..=

RangeFrom, the only other range with an Iterator implementation can't be specialized like this without changing behaviour due to overflow. There is no way to save "finished-ness".

The approach can not be used in general, because it would produce side effects of the underlying iterator too early.

May obsolete #51435, haven't checked.

rust-highfive · 2018-06-16T22:21:03Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

[01:05:39] travis_fold:start:test_stage1-core
travis_time:start:test_stage1-core
Testing core stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:05:39]    Compiling core v0.0.0 (file:///checkout/src/libcore)
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == isize`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1602 |     assert_eq!((0..20).step_by(5).collect::<Vec<isize>>(), [0, 5, 10, 15]);
[01:05:46]      |                                   ^^^^^^^ expected i32, found isize
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `isize`
[01:05:46] 
[01:05:46] 
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == u8`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1605 |     assert_eq!((200..255).step_by(50).collect::<Vec<u8>>(), [200, 250]);
[01:05:46]      |                                       ^^^^^^^ expected i32, found u8
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `u8`
[01:05:46] 
[01:05:46] 
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == isize`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1606 |     assert_eq!((200..-5).step_by(1).collect::<Vec<isize>>(), []);
[01:05:46]      |                                     ^^^^^^^ expected i32, found isize
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `isize`
[01:05:46] 
[01:05:46] 
[01:05:46] error[E0271]: type mismatch resolving `<std::iter::StepBy<std::ops::Range<i32>> as std::iter::Iterator>::Item == isize`
[01:05:46]      |
[01:05:46]      |
[01:05:46] 1607 |     assert_eq!((200..200).step_by(1).collect::<Vec<isize>>(), []);
[01:05:46]      |                                      ^^^^^^^ expected i32, found isize
[01:05:46]      = note: expected type `i32`
[01:05:46]                 found type `isize`
[01:05:46] 
[01:05:51] error: aborting due to 4 previous errors
[01:05:51] error: aborting due to 4 previous errors
[01:05:51] 
[01:05:51] For more information about this error, try `rustc --explain E0271`.
[01:05:51] error: Could not compile `core`.
[01:05:51] 
[01:05:51] To learn more, run the command again with --verbose.
[01:05:51] 
[01:05:51] 
[01:05:51] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "core" "--" "--quiet"
[01:05:51] 
[01:05:51] 
[01:05:51] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:05:51] Build completed unsuccessfully in 0:25:25
[01:05:51] Build completed unsuccessfully in 0:25:25
[01:05:51] make: *** [check] Error 1
[01:05:51] Makefile:58: recipe for target 'check' failed

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:13d40a90
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Emerentius · 2018-06-16T22:46:55Z

#36262 strikes again. And here I thought I'd be safe 'cause it's always clear which range type we have and all integers implement Step.

leonardo-m · 2018-06-17T07:50:10Z

Making Specialize StepBy<Range(Inclusive)> efficient has a high priority in system language design.

rust-highfive · 2018-06-17T20:13:20Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

[01:17:42] 
[01:17:42] running 845 tests
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ..........................................................F.........................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:42] ....................................................................................................
[01:17:43] .............................................
[01:17:43] failures:
[01:17:43] 
[01:17:43] ---- iter::test_range_step stdout ----
[01:17:43] thread 'iter::test_range_step' panicked at 'assertion failed: `(left == right)`
[01:17:43]   left: `[200]`,
[01:17:43]  right: `[200, 250]`', libcore/../libcore/tests/iter.rs:1605:5
[01:17:43] 
[01:17:43] failures:
[01:17:43]     iter::test_range_step
[01:17:43] 
[01:17:43] 
[01:17:43] test result: FAILED. 842 passed; 1 failed; 2 ignored; 0 measured; 0 filtered out
[01:17:43] 
[01:17:43] error: test failed, to rerun pass '--test coretests'
[01:17:43] 
[01:17:43] 
[01:17:43] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "core" "--" "--quiet"
[01:17:43] 
[01:17:43] 
[01:17:43] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:17:43] Build completed unsuccessfully in 0:29:51
[01:17:43] Build completed unsuccessfully in 0:29:51
[01:17:43] Makefile:58: recipe for target 'check' failed
[01:17:43] make: *** [check] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:2eb90abf
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Emerentius · 2018-06-17T21:03:40Z

Pro-tip: if your local tests don't run against your own code, they are much more likely to succeed.

scottmcm · 2018-06-18T04:27:00Z

src/libcore/iter/mod.rs

+    #[inline]
+    fn next(&mut self) -> Option<Self::Item> {
+        self.first_take = false;
+        if self.iter.start >= self.iter.end {


Step only requires PartialOrd, so using >= is technically a behaviour change. It wouldn't change anything that's stable today, but I think the original way is better, with an NAN-like endpoint making the range empty. (See similar discussion for Range::contains in #49130 (comment))

pietroalbini · 2018-06-18T08:47:31Z

Picking someone from the libs team to review this. r? @sfackler

Emerentius · 2018-06-18T12:49:34Z

I flipped the comparisons back around to keep the original behaviour for NaN-like values (for when they are possible) and squashed the commits together.
Updated and cleaned up the the godbolt code and exchanged the links in the OP. Signed integers don't optimize as well as unsigned ones, most likely because of the additional checks in Step::add_usize but it's still an improvement.

sfackler · 2018-06-19T13:15:04Z

src/libcore/iter/mod.rs

@@ -737,6 +733,76 @@ impl<I> Iterator for StepBy<I> where I: Iterator {
    }
 }

+// hidden trait for specializing iterator methods
+// current use is limited to next() on StepBy
+trait SpecIter {


Could this trait mention StepBy in its name just to make it a bit more clear that it's limited to those impls?

sfackler · 2018-06-19T13:16:06Z

src/libcore/iter/mod.rs

+    #[inline]
+    fn next(&mut self) -> Option<Self::Item> {
+        self.first_take = false;
+        if !(self.iter.start < self.iter.end) {


This comparison is structured weirdly to handle NaN properly?

yes, it's if start < end { return stuff } else { None } flipped around so the positive case has no indent. Should I change it?

sfackler · 2018-06-19T13:17:02Z

Seems reasonable to me!

the originally generated code was highly suboptimal this brings it close to the same code or even exactly the same as a manual while-loop by eliminating a branch and the double stepping of n-1 + 1 steps The intermediate trait lets us circumvent the specialization type inference bugs

Emerentius · 2018-06-19T17:35:31Z

@sfackler Renamed SpecIter to StepBySpecIterator and its next() method to spec_next() following SpecExtend.

sfackler · 2018-06-21T01:00:48Z

@bors r+

bors · 2018-06-21T01:00:49Z

📌 Commit 000aff6 has been approved by sfackler

bors · 2018-06-21T08:55:20Z

⌛ Testing commit 000aff6 with merge 95979dc...

Specialize StepBy<Range(Inclusive)> Part of #51557, related to #43064, #31155 As discussed in the above issues, `step_by` optimizes very badly on ranges which is related to 1. the special casing of the first `StepBy::next()` call 2. the need to do 2 additions of `n - 1` and `1` inside the range's `next()` This PR eliminates both by overriding `next()` to always produce the current element and also step ahead by `n` elements in one go. The generated code is much better, even identical in the case of a `Range` with constant `start` and `end` where `start+step` can't overflow. Without constant bounds it's a bit longer than the manual loop. `RangeInclusive` doesn't optimize as nicely but is still much better than the original asm. Unsigned integers optimize better than signed ones for some reason. See the following two links for a comparison. [godbolt: specialization for ..](https://godbolt.org/g/haHLJr) [godbolt: specialization for ..=](https://godbolt.org/g/ewyMu6) `RangeFrom`, the only other range with an `Iterator` implementation can't be specialized like this without changing behaviour due to overflow. There is no way to save "finished-ness". The approach can not be used in general, because it would produce side effects of the underlying iterator too early. May obsolete #51435, haven't checked.

bors · 2018-06-21T11:09:20Z

☀️ Test successful - status-appveyor, status-travis
Approved by: sfackler
Pushing 95979dc to master...

jethrogb · 2018-11-15T16:29:03Z

This is buggy: #55985

newpavlov · 2018-11-18T17:38:40Z

So should we temporary revert this PR? Or should we leave it as-is and wait for the proper fix? I personally prefer the first option.

sfackler · 2018-11-18T17:49:36Z

Let's revert until it's fixed, yeah.

leonardo-m · 2018-11-18T18:56:16Z

The performance of this patch was not good enough :-(

Revert #51601 Closes: #55985 Specialization of `StepBy<Range(Inclusive)>` results in an incorrectly behaving code when `step_by` is combined with `skip` or `nth`. If this will get merged we probably should reopen issues previously closed by #51601 (if there was any).

@nrc

beta backport rollup Backports of some beta-approved PRs - [x] #55385: NLL: cast causes failure to promote to static - [x] #56043: remove "approx env bounds" if we already know from trait - [x] #56003: do not propagate inferred bounds on trait objects if they involve `Self` - [x] #55852: Rewrite `...` as `..=` as a `MachineApplicable` 2018 idiom lint - [x] #55804: rustdoc: don't inline `pub use some_crate` unless directly asked to - [x] #56059: Increase `Duration` approximate equal threshold to 1us - [x] Keep resolved defs in path prefixes and emit them in save-analysis #54145 - [x] Adjust Ids of path segments in visibility modifiers #55487 - [x] save-analysis: bug fix and optimisation. #55521 - [x] save-analysis: be even more aggressive about ignorning macro-generated defs #55936 - [x] save-analysis: fallback to using path id #56060 - [x] save-analysis: Don't panic for macro-generated use globs #55879 - [x] Add temporary renames to manifests for rustfmt/clippy #56081 - [x] Revert #51601 #56049 - [x] Fix stability hole with `static _` #55983 - [x] #56077 - [x] Fix Rustdoc ICE when checking blanket impls #55258 - [x] Updated RELEASES.md for 1.31.0 #55678 - [x] ~~#56061~~ #56111 - [x] Stabilize `extern_crate_item_prelude` #56032 Still running tests locally, and I plan to backport @nrc's other PRs too (cc @petrochenkov -- thanks for the advice)

… r=scottmcm Override `StepBy::{try_fold, try_rfold}` Previous PR: rust-lang#51435 The previous PR was closed in favor of rust-lang#51601, which was later reverted. I don't think these implementations will make it harder to specialize `StepBy<Range<_>>` later, so we should be able to land this without any consequences. This should fix rust-lang#57517 – in my benchmarks `iter` and `iter.step_by(1)` now perform equally well, provided internal iteration is used.

SimonSapin mentioned this pull request Jun 18, 2018

Override try_fold for StepBy #51435

Closed

scottmcm reviewed Jun 18, 2018

View reviewed changes

pietroalbini added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 18, 2018

rust-highfive assigned sfackler Jun 18, 2018

Emerentius force-pushed the step_by_range_diet branch from 6d6e976 to 2879174 Compare June 18, 2018 12:13

kennytm mentioned this pull request Jun 19, 2018

Change RangeInclusive to a three-field struct. #51622

Merged

sfackler reviewed Jun 19, 2018

View reviewed changes

pietroalbini added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 19, 2018

Emerentius force-pushed the step_by_range_diet branch from 2879174 to 000aff6 Compare June 19, 2018 17:34

pietroalbini added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 19, 2018

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 21, 2018

bors merged commit 000aff6 into rust-lang:master Jun 21, 2018

jethrogb mentioned this pull request Nov 15, 2018

step_by + skip produce incorrect results #55985

Closed

newpavlov mentioned this pull request Nov 18, 2018

Revert #51601 #56049

Merged

nikomatsakis mentioned this pull request Nov 20, 2018

beta backport rollup #56102

Merged

20 tasks

timvermeulen mentioned this pull request Sep 3, 2019

Override StepBy::{try_fold, try_rfold} #64121

Merged

the8472 mentioned this pull request May 20, 2023

Ironing out StepBy<Range>'s performance issues #51557

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Specialize StepBy<Range(Inclusive)> #51601

Specialize StepBy<Range(Inclusive)> #51601

Emerentius commented Jun 16, 2018 •

edited

Loading

rust-highfive commented Jun 16, 2018

Emerentius commented Jun 16, 2018

leonardo-m commented Jun 17, 2018

rust-highfive commented Jun 17, 2018

Emerentius commented Jun 17, 2018

scottmcm Jun 18, 2018

pietroalbini commented Jun 18, 2018

Emerentius commented Jun 18, 2018 •

edited

Loading

sfackler Jun 19, 2018

sfackler Jun 19, 2018

Emerentius Jun 19, 2018 •

edited

Loading

sfackler commented Jun 19, 2018

Emerentius commented Jun 19, 2018

sfackler commented Jun 21, 2018

bors commented Jun 21, 2018

bors commented Jun 21, 2018

bors commented Jun 21, 2018

jethrogb commented Nov 15, 2018

newpavlov commented Nov 18, 2018

sfackler commented Nov 18, 2018

leonardo-m commented Nov 18, 2018

Specialize StepBy<Range(Inclusive)> #51601

Specialize StepBy<Range(Inclusive)> #51601

Conversation

Emerentius commented Jun 16, 2018 • edited Loading

rust-highfive commented Jun 16, 2018

Emerentius commented Jun 16, 2018

leonardo-m commented Jun 17, 2018

rust-highfive commented Jun 17, 2018

Emerentius commented Jun 17, 2018

scottmcm Jun 18, 2018

Choose a reason for hiding this comment

pietroalbini commented Jun 18, 2018

Emerentius commented Jun 18, 2018 • edited Loading

sfackler Jun 19, 2018

Choose a reason for hiding this comment

sfackler Jun 19, 2018

Choose a reason for hiding this comment

Emerentius Jun 19, 2018 • edited Loading

Choose a reason for hiding this comment

sfackler commented Jun 19, 2018

Emerentius commented Jun 19, 2018

sfackler commented Jun 21, 2018

bors commented Jun 21, 2018

bors commented Jun 21, 2018

bors commented Jun 21, 2018

jethrogb commented Nov 15, 2018

newpavlov commented Nov 18, 2018

sfackler commented Nov 18, 2018

leonardo-m commented Nov 18, 2018

Emerentius commented Jun 16, 2018 •

edited

Loading

Emerentius commented Jun 18, 2018 •

edited

Loading

Emerentius Jun 19, 2018 •

edited

Loading