Short-circuiting internal iteration with Iterator::try_fold & try_rfold #45595

scottmcm · 2017-10-28T15:25:12Z

These are the core methods in terms of which the other methods (fold, all, any, find, position, nth, ...) can be implemented, allowing Iterator implementors to get the full goodness of internal iteration by only overriding one method (per direction).

Based off the Try trait, so works with both Result and Option (:tada: #42526). The try_fold rustdoc examples use Option and the try_rfold ones use Result.
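As a rough illustration of the two flavours mentioned above, here is a minimal sketch that uses try_fold with Option and try_rfold with Result. The helper names checked_sum and parse_sum_from_back are made up for this example; only the two Iterator methods come from the PR.

```rust
use std::num::ParseIntError;

/// Sums the values, stopping at the first overflow (hypothetical helper).
fn checked_sum(values: &[i8]) -> Option<i8> {
    // `checked_add` returns `None` on overflow, which ends the fold early.
    values.iter().try_fold(0i8, |acc, &x| acc.checked_add(x))
}

/// Parses and sums the strings, walking from the back (hypothetical helper).
fn parse_sum_from_back(items: &[&str]) -> Result<i32, ParseIntError> {
    // The first string that fails to parse becomes the returned `Err`.
    items
        .iter()
        .try_rfold(0i32, |acc, s| s.parse::<i32>().map(|n| acc + n))
}
```

With these definitions, checked_sum(&[1, 2, 3]) should give Some(6), while checked_sum(&[100, 100]) should give None because the sum does not fit in an i8.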

AKA continuing in the vein of PRs #44682 & #44856 for more of Iterator.

New bench following the pattern from the latter of those:

test iter::bench_take_while_chain_ref_sum          ... bench:   1,130,843 ns/iter (+/- 25,110)
test iter::bench_take_while_chain_sum              ... bench:     362,530 ns/iter (+/- 391)

I also ran the benches without the fold & rfold overrides to test their new default impls, with basically no change. I left them there, though, to take advantage of existing overrides and because AlwaysOk has some sub-optimality due to #43278 (which #45225 should fix).

If you're wondering why there are three type parameters, see issue #45462.

Thanks to @bluss for the original IRLO thread (https://internals.rust-lang.org/t/pre-rfc-fold-ok-is-composable-internal-iteration/4434) and the rfold PR, and to @cuviper for adding so many folds, encouraging me to make this PR, and finding a catastrophic bug in a pre-review.

bluss · 2017-10-28T15:53:29Z

src/libcore/slice/mod.rs

+                        accum = f(accum, $mkref!(self.ptr.post_inc()));
+                    }
+                }
+                accum


This fold impl can just be a for loop instead I think.

bluss · 2017-10-28T15:54:49Z

src/libcore/slice/mod.rs

+                    if mem::size_of::<T>() != 0 {
+                        assume(!self.ptr.is_null());
+                        assume(self.ptr <= self.end);
+                    }


What's the reason these assumes were added here?

next and next_back have assumptions, so I put them here too. Given their status as arcana (from the rposition PR), I could certainly remove them...

It's a shame if they should be needed, but I know they are needed for some optimizations in next. We should probably have some evidence for each of them being added. The non-null one is at least used when we create an Option<&T>, and that doesn't happen in this method, so it doesn't seem necessary?

Hmm, search_while didn't have them, so hopefully they're not needed. I'll remove.

bluss · 2017-10-28T15:55:41Z

src/libcore/iter/iterator.rs

+    /// An iterator method that applies a function as long as it returns
+    /// successfully, producing a single, final value.
+    ///
+    /// `fold()` takes two arguments: an initial value, and a closure with two


fold → try_fold here.

bluss · 2017-10-28T15:59:13Z

src/libcore/iter/iterator.rs

+    ///
+    /// If possible, override this method with an implementation using
+    /// internal iteration.  Most of the other methods have their default
+    /// implementation in terms of this one.


Hm, isn't any implementation by definition using internal iteration? Maybe we can say implementations should in turn use try_fold on the parts the iterator is composed of, if possible. And add some kind of gentle reminder that the implementation must keep all internal state up to date, whether it finishes with Ok or Err.

I took another stab at this section; hopefully it's on the right track.
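To make the "keep internal state up to date" point concrete, here is a small sketch along the lines of the rustdoc example in this PR. The helper name sum_until_overflow is an assumption for illustration; it takes the iterator by mutable reference so the caller can keep using it after an early exit.

```rust
/// Adds items until the `i8` accumulator would overflow (hypothetical helper).
fn sum_until_overflow<'a, I>(iter: &mut I) -> Option<i8>
where
    I: Iterator<Item = &'a i8>,
{
    // Returning `None` from the closure stops the fold immediately.
    iter.try_fold(0i8, |acc, &x| acc.checked_add(x))
}
```

Called on an iterator over [10, 20, 30, 100, 40, 50], this should return None once 100 is added, and the same iterator then continues from 40, because the element that triggered the early exit has already been consumed.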

bluss · 2017-10-28T16:00:44Z

src/libcore/iter/mod.rs

@@ -336,6 +337,84 @@ mod range;
 mod sources;
 mod traits;

+/// ZST used to implement foo methods in terms of try_foo
+struct AlwaysOk<T>(pub T);


ZST is a misnomer (it's just "zero additional size" here), maybe just say "newtype".

bluss · 2017-10-28T16:03:22Z

src/libcore/iter/iterator.rs

+        self.try_fold((), move |(), x| {
+            if f(x) { SearchResult::Found(()) }
+            else { SearchResult::NotFound(()) }
+        }).is_found()


Just curious why all and any are different (one uses SearchResult and the other doesn't).

I think it's because my brain thinks in the true case, so all is continue on "success" (Result) but any is break on "success" (SearchResult). That is, of course, a super-fuzzy statement, since the opposites also work just as well.
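The "continue on success" versus "break on success" distinction might look roughly like the sketch below. It is not the PR's code: SearchResult is internal to libcore, so this uses std::ops::ControlFlow (a later standard-library type) as a stand-in, and the function names all_via_try_fold and any_via_try_fold are hypothetical.

```rust
use std::ops::ControlFlow;

/// `all`: keep going while the predicate holds; the first failure is the "error".
fn all_via_try_fold<I, P>(mut iter: I, mut pred: P) -> bool
where
    I: Iterator,
    P: FnMut(I::Item) -> bool,
{
    iter.try_fold((), |(), x| {
        if pred(x) { Ok(()) } else { Err(()) }
    })
    .is_ok()
}

/// `any`: break out as soon as the predicate holds for some element.
fn any_via_try_fold<I, P>(mut iter: I, mut pred: P) -> bool
where
    I: Iterator,
    P: FnMut(I::Item) -> bool,
{
    iter.try_fold((), |(), x| {
        if pred(x) {
            ControlFlow::Break(())
        } else {
            ControlFlow::Continue(())
        }
    })
    .is_break()
}
```

As the reply says, either method could be written in the other style; the choice only affects which outcome reads as the short-circuit case.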

bluss · 2017-10-28T16:04:52Z

src/libcore/iter/iterator.rs

@@ -1922,10 +1973,10 @@ pub trait Iterator {
        let mut ts: FromA = Default::default();
        let mut us: FromB = Default::default();

-        for (t, u) in self {
+        self.for_each(|(t, u)|{


style nit: space before { please.

bluss · 2017-10-28T16:05:08Z

src/libcore/iter/iterator.rs

+        // The addition might panic on overflow
+        self.try_fold(0, move |i, x| {
+            if predicate(x) { SearchResult::Found(i) }
+            else { SearchResult::NotFound(i+1) }


style nit: space around + please
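For readers following along, the position pattern in the diff above can be approximated outside libcore as follows. This is only a sketch: std::ops::ControlFlow stands in for the internal SearchResult, and position_via_try_fold is a hypothetical name.

```rust
use std::ops::ControlFlow;

/// Returns the index of the first element matching `pred` (hypothetical helper).
fn position_via_try_fold<I, P>(mut iter: I, mut pred: P) -> Option<usize>
where
    I: Iterator,
    P: FnMut(I::Item) -> bool,
{
    // The accumulator is the index of the element currently being examined.
    let outcome = iter.try_fold(0usize, |i, x| {
        if pred(x) {
            ControlFlow::Break(i)
        } else {
            // As in the diff, this addition might panic on overflow.
            ControlFlow::Continue(i + 1)
        }
    });
    match outcome {
        ControlFlow::Break(i) => Some(i),
        ControlFlow::Continue(_) => None,
    }
}
```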

bluss · 2017-10-28T16:06:04Z

src/libcore/iter/iterator.rs

+   }
+}
+
+impl<I:Iterator+Sized> SpecIterator for I {


style nit: space after : space around +

bluss · 2017-10-28T16:06:16Z

src/libcore/iter/iterator.rs

+    fn spec_nth(&mut self, n: usize) -> Option<Self::Item> {
+        self.try_fold(n, move |i, x| {
+            if i == 0 { SearchResult::Found(x) }
+            else { SearchResult::NotFound(i-1) }


style nit: space around -.
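The countdown used by spec_nth above can be sketched the same way. Again, std::ops::ControlFlow is a stand-in for SearchResult and nth_via_try_fold is a hypothetical name; the iterator is borrowed mutably so that it stays usable afterwards.

```rust
use std::ops::ControlFlow;

/// Skips `n` elements and returns the next one (hypothetical helper).
fn nth_via_try_fold<I: Iterator>(iter: &mut I, n: usize) -> Option<I::Item> {
    // The accumulator counts how many elements still have to be skipped.
    let outcome = iter.try_fold(n, |i, x| {
        if i == 0 {
            ControlFlow::Break(x)
        } else {
            ControlFlow::Continue(i - 1)
        }
    });
    match outcome {
        ControlFlow::Break(x) => Some(x),
        ControlFlow::Continue(_) => None,
    }
}
```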

bluss · 2017-10-28T16:08:37Z

src/libcore/slice/mod.rs

                    }
                }
-                default
+                accum


Same as fold: why not just a for or while let loop here?

kennytm · 2017-10-28T16:30:51Z

src/libcore/iter/iterator.rs

+    /// assert_eq!(it.next(), Some(&40));
+    /// ```
+    #[inline]
+    #[unstable(feature = "iterator_try_fold", issue = "88888888")]


Now that the tracking issue is filed, please change these to 45594 before we forget 😃.

bluss · 2017-10-28T16:33:01Z

src/libcore/iter/mod.rs

+                if n == 0 { SearchResult::Found(acc) }
+                else { SearchResult::NotFound(acc) }
+            }).into_inner()
+        }


Can't this fold just use .try_fold()?

I was worried about SearchResult::from_try(r) potentially pessimizing things, since it does a bunch of destruct-rebuild.
There may be no basis behind that, though, since it's not the more important case of "fold should use fold to pick up any existing overrides". I'll just delete it, since less code is better :)

bluss · 2017-10-28T17:57:18Z

src/libcore/iter/mod.rs

+                if *n == 0 { SearchResult::Found(r) }
+                else { SearchResult::from_try(r) }
+            }).into_try()
+        }


It wasn't obvious to me if this was correct, but I think it is? The inversion of using SearchResult is a bit confusing even if the intention is probably the opposite.

Hmm, it got named that way because it was first written for find, but then I ended up using it in far more places than just that. I wonder if LoopResult with Break and Continue would be better -- would turn SearchResult::from_try into LoopResult::break_if_error (or continue_if_ok)...

I renamed SearchResult, and I think I like the new way better. Agree? Other suggestions?
scottmcm/rust@iter-try-fold...scottmcm:iter-try-fold-experiment

bluss · 2017-10-28T18:00:19Z

src/libcore/iter/mod.rs

+    fn try_fold<Acc, F, R>(&mut self, init: Acc, mut f: F) -> R where
+        Self: Sized, F: FnMut(Acc, Self::Item) -> R, R: Try<Ok=Acc>
+    {
+        match self.state {


In this method I'm wondering about the benefits of avoiding duplicating expressions that are big loops (self.a.try_fold(init, f) and similar for b). I imagine if these are small enough to inline, they are all inlined and duplicated.

Oh, binary size. Makes sense; I can share the calls for the states.

Code size also matters for performance across the CPU, cache, and memory as a whole, just to give a broader sense of what binary size means. Thanks for the fix.
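A chain-like adapter that mentions each inner try_fold call only once could be shaped roughly like this. It is a simplified sketch, not the libcore code: MyChain, State, and try_fold_result are hypothetical names, and the method is specialised to Result because the Try trait cannot be named on stable Rust.

```rust
/// Which halves may still have elements (hypothetical stand-in).
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq)]
enum State {
    Both,  // both halves may still yield items
    Front, // only the first half is left
    Back,  // only the second half is left
}

struct MyChain<A, B> {
    a: A,
    b: B,
    state: State,
}

impl<A, B> MyChain<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    fn try_fold_result<Acc, E, F>(&mut self, init: Acc, mut f: F) -> Result<Acc, E>
    where
        F: FnMut(Acc, A::Item) -> Result<Acc, E>,
    {
        let mut acc = init;
        // The loop over `a` appears once, shared by the Both and Front states.
        if matches!(self.state, State::Both | State::Front) {
            acc = self.a.try_fold(acc, &mut f)?;
            if self.state == State::Both {
                // `a` is exhausted; record that before touching `b`.
                self.state = State::Back;
            }
        }
        // Likewise, the loop over `b` appears once.
        if self.state == State::Back {
            acc = self.b.try_fold(acc, &mut f)?;
        }
        Ok(acc)
    }
}
```

The state check happens once per half rather than once per element, which is the "lifting conditionals out" benefit discussed elsewhere in this thread, and an early Err leaves state and the inner iterators positioned so that iteration can resume.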

JordiPolo · 2017-10-29T00:01:14Z

src/libcore/iter/iterator.rs

-            accum = f(accum, x);
-        }
-        accum
+        self.try_fold(init, move |acc, x| AlwaysOk(f(acc, x))).0


I'm fairly new to Rust, and I know other languages implement their iterator methods based on fold. I imagine the original implementors of these methods knew that too, and that if they used simple for loops it is because those are compiler friendly.
Creating the closure here, plus the if let and so on in try_fold: unless the compiler is really, really good, it will create slower code, no?
Have you tried iterating on more complex data than just integers?

I think it's more that a for loop was the obvious way to do it, and the emphasis on internal iteration is a more recent idea. For simple iterators, it should be a wash, but iterators like Chain can lift their conditionals out in a fold or try_fold, better than repeated next calls.

@JordiPolo Here's a link to explore how this gets translated: https://godbolt.org/g/3ehFBV

There are a few things interacting here to make it relatively straight-forward for the compiler to turn this into good code. Note the definition of the AlwaysOk type:

struct AlwaysOk<T>(pub T);

That means that wrapping something in AlwaysOk is actually not doing anything -- the memory layout doesn't change at all. (Asterisk for potential ABI implications and that repr(rust) layout is subject to change, but that shouldn't be relevant in this case.) Similarly, the .0 at the end is also a type-level-only thing, as it doesn't need to change the representation at all.

The other thing that the compiler needs to be able to do is to know that the ? operators in the try_fold materialization will never return early. But see the Try impl:

impl<T> Try for AlwaysOk<T> { type Ok = T; type Error = !; ... }

The "never type" ! there is the canonical uninhabited type. (There are others, like if you define enum NoVariants {}.) Because uninhabited types have no valid values, it knows that an error can never happen, so it can completely remove the early-return paths, making it equivalent to normal fold.
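The same idea can be tried on stable Rust without AlwaysOk. The sketch below swaps in Result with std::convert::Infallible (an uninhabited enum) as the error type; fold_via_try_fold is a hypothetical name, and this is not the PR's implementation.

```rust
use std::convert::Infallible;

/// A plain `fold` written on top of `try_fold` (hypothetical helper).
fn fold_via_try_fold<I, B, F>(mut iter: I, init: B, mut f: F) -> B
where
    I: Iterator,
    F: FnMut(B, I::Item) -> B,
{
    // The error type has no values, so the closure can only ever return `Ok`.
    let result: Result<B, Infallible> = iter.try_fold(init, |acc, x| Ok(f(acc, x)));
    match result {
        Ok(value) => value,
        // An empty match is allowed because `Infallible` is uninhabited.
        Err(never) => match never {},
    }
}
```

For example, fold_via_try_fold(1..=4, 0, |a, x| a + x) should produce 10, the same as calling fold directly.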

@JordiPolo You're right, it's asking the compiler to inline a lot, but it's not a miraculous effort, since closures like other things in Rust default to being unboxed.

Thanks so much for the details. I think ! is the magic I needed to understand; now I see the logic.


scottmcm · 2017-10-30T01:20:18Z

r? @aturon

This is ready, but @rust-highfive didn't give me a reviewer. Please redirect if someone else should review.

shepmaster · 2017-11-03T14:05:06Z

Moving to another libs team member...

r? @dtolnay

dtolnay · 2017-11-07T20:41:46Z

@bors r+

bors · 2017-11-07T20:41:47Z

📌 Commit b5dba91 has been approved by dtolnay

bors · 2017-11-08T05:04:20Z

⌛ Testing commit b5dba91 with merge 157c9d0fd39ccfc058f7cefc8de7bc98b69741ea...

bors · 2017-11-08T06:06:09Z

💔 Test failed - status-travis

kennytm · 2017-11-08T06:25:33Z

cargotest of ripgrep failed.

[00:56:32] failures:
[00:56:32] 
[00:56:32] ---- feature_45_relative_cwd stdout ----
[00:56:32] 	thread 'feature_45_relative_cwd' panicked at 'assertion failed: `(left == right)`
[00:56:32]   left: `["bar/test", "baz/bar/test", "baz/foo", "baz/test", "foo", "test"]`,
[00:56:32]  right: `["bar/test", "baz/bar/test", "baz/baz/bar/test", "baz/foo", "baz/test", "foo", "test"]`', tests/tests.rs:1033:4
[00:56:32] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[00:56:32] 
[00:56:32] 
[00:56:32] failures:
[00:56:32]     feature_45_relative_cwd
[00:56:32] 
[00:56:32] test result: FAILED. 101 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
[00:56:32] 
[00:56:32] error: test failed, to rerun pass '--test integration'
[00:56:32] thread 'main' panicked at 'tests failed for https://github.com/BurntSushi/ripgrep', /checkout/src/tools/cargotest/main.rs:100:8
[00:56:32] note: Run with `RUST_BACKTRACE=1` for a backtrace.

Line 1033 of the test:

https://github.com/BurntSushi/ripgrep/blob/b65bb37b14655e1a89c7cd19c8b011ef3e312791/tests/tests.rs#L1033

The two involved methods:

https://github.com/BurntSushi/ripgrep/blob/b65bb37b14655e1a89c7cd19c8b011ef3e312791/tests/tests.rs#L57-L69

kennytm · 2017-11-15T19:22:11Z

@scottmcm Hi, it has been a week since last activity. Are you still working on the cause of the ripgrep failure?

scottmcm · 2017-11-16T09:01:37Z

Thanks for the ping, @kennytm; I'd been spending my time on #45676.

Any tips for repro'ing the failures? After cloning ripgrep, git checkout b65bb37b14655e1a8 && cargo +stage2 test is completing successfully for me. (I'm on windows; could that matter?)

kennytm · 2017-11-16T09:28:24Z

@scottmcm It could matter since the failure happens on Linux. If it is still not reproducible on Linux, we could ask bors to retry assuming ripgrep's test is flaky.

cuviper · 2017-11-16T22:28:35Z

It works for me on Fedora.

kennytm · 2017-11-17T05:58:33Z

@bors retry

bors · 2017-11-17T07:43:14Z

⌛ Testing commit b5dba91 with merge b32267f...


scottmcm · 2017-11-17T08:48:06Z

Thanks, @kennytm & @cuviper! Looks like the cargotest job passed this time: https://travis-ci.org/rust-lang/rust/jobs/303410000#L7454 ¯\_(ツ)_/¯

bors · 2017-11-17T10:12:16Z

☀️ Test successful - status-appveyor, status-travis
Approved by: dtolnay
Pushing b32267f to master...

kennytm · 2017-11-17T14:41:38Z

cc @BurntSushi

Undo the Sized specialization from Iterator::nth

I just added this as part of #45595, but I'm now afraid there's a specialization issue with it, since I tried to add another similar specialization (https://github.com/rust-lang/rust/compare/master...scottmcm:faster-iter-by-ref?expand=1#diff-1398f322bc563592215b583e9b0ba936R2390), and ended up getting really disturbing test failures like

thread 'iter::test_by_ref_folds' panicked at 'assertion failed: `(left == right)`
  left: `15`,
 right: `15`', src\libcore\../libcore/tests\iter.rs:1720:4

So since this wasn't the most critical part of the change and a new beta is branching within a week, I think putting this part back to what it was before is the best option.

scottmcm mentioned this pull request Oct 28, 2017

Tracking issue for Iterator::try_fold and try_rfold (feature iterator_try_fold) #45594

Closed

4 tasks

bluss reviewed Oct 28, 2017


kennytm added the S-waiting-on-review and T-libs-api labels Oct 28, 2017

kennytm reviewed Oct 28, 2017


bluss reviewed Oct 28, 2017


kennytm added the S-waiting-on-author label and removed the S-waiting-on-review label Oct 28, 2017

scottmcm force-pushed the iter-try-fold branch from 5fa3b53 to 04269c9 Compare October 28, 2017 17:42

bluss reviewed Oct 28, 2017


JordiPolo reviewed Oct 29, 2017


scottmcm force-pushed the iter-try-fold branch from 692ecd9 to 4db8332 Compare October 29, 2017 02:55

kennytm added the S-waiting-on-review label and removed the S-waiting-on-author label Oct 29, 2017

scottmcm mentioned this pull request Oct 29, 2017

impl FromIterator<()> for () #45379

Merged

scottmcm force-pushed the iter-try-fold branch from 4db8332 to eef4d42 Compare October 29, 2017 22:46

rust-highfive assigned aturon Oct 30, 2017

arthurprs mentioned this pull request Nov 1, 2017

Optimize slice.{r}position result bounds check #45501

Closed

rust-highfive assigned dtolnay and unassigned aturon Nov 3, 2017

CR feedback

b5dba91

kennytm added the S-waiting-on-author label and removed the S-waiting-on-review label Nov 8, 2017

bors merged commit b5dba91 into rust-lang:master Nov 17, 2017

scottmcm deleted the iter-try-fold branch November 17, 2017 10:12

scottmcm mentioned this pull request Nov 18, 2017

Undo the Sized specialization from Iterator::nth #46074

Merged

jonasbb mentioned this pull request Jan 15, 2018

Inconsistent inlineing of Iterator Adaptors - Missed Optimizations #47461

Open

scottmcm mentioned this pull request Jan 12, 2019

Iterating with step_by(1) is much slower than without #57517

Open

scottmcm mentioned this pull request Jan 21, 2019

RangeInclusive iteration performance improvement. #57378

Closed

scottmcm mentioned this pull request Aug 7, 2019

chain() make collect very slow #63340

Open

hanna-kruppe mentioned this pull request Sep 18, 2019

Simplify some Iterator methods. #64572

Closed

NoraCodes mentioned this pull request Sep 3, 2020

Tracking issue for ControlFlow enum, for use with try_fold and in Try #75744

Closed

9 tasks

scottmcm mentioned this pull request Feb 18, 2022

Tracking Issue for Iterator::try_collect #94047

Open

6 tasks
