[WIP] Calculate capacity when collecting into Option and Result #52910

Closed
wants to merge 1 commit into from

ljedrz
Contributor

@ljedrz ljedrz commented Jul 31, 2018

I was browsing the perf page (http://perf.rust-lang.org) to see the impact of my recent changes (e.g. #52697) and I was surprised that some of the results were not as awesome as I expected. I dug some more and found an issue that is the probable culprit: "Collecting into a Result<Vec<_>> doesn't reserve the capacity in advance" (#48994).

Collecting into Option or Result might result in an empty collection, but there is no reason why we shouldn't provide a non-zero lower bound when we know the Iterator we are collecting from doesn't contain any None or Err.

We know this, because the Adapter iterator used in the FromIterator implementations for Option and Result registers if any None or Err are present in the Iterator in question; we can use this information and return a more accurate lower bound in case we know it won't be equal to zero.
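The idea can be sketched roughly as follows. This is a minimal illustration, not the actual code in `core`: the names `OptionAdapter`, `found_none`, `optimistic`, and `collect_options` are hypothetical, and the flag exists only so both behaviours can be compared side by side.

```rust
/// Hypothetical stand-in for the private adapter; not the real `core` type.
struct OptionAdapter<I> {
    iter: I,
    found_none: bool,
    /// `true` selects the lower bound proposed here, `false` the conservative one.
    optimistic: bool,
}

impl<T, I: Iterator<Item = Option<T>>> Iterator for OptionAdapter<I> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        if self.found_none {
            return None;
        }
        match self.iter.next() {
            Some(Some(value)) => Some(value),
            Some(None) => {
                // Record the failure and end the stream early.
                self.found_none = true;
                None
            }
            None => None,
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let (lower, upper) = self.iter.size_hint();
        if self.found_none {
            (0, Some(0))
        } else if self.optimistic {
            // Proposed: assume no `None` is coming and pass the lower bound on.
            (lower, upper)
        } else {
            // Conservative: a `None` could still end the stream at any point.
            (0, upper)
        }
    }
}

/// Collects into `Option<Vec<T>>` through the adapter above.
fn collect_options<T, I>(iter: I, optimistic: bool) -> Option<Vec<T>>
where
    I: IntoIterator<Item = Option<T>>,
{
    let mut adapter = OptionAdapter {
        iter: iter.into_iter(),
        found_none: false,
        optimistic,
    };
    // `by_ref` keeps the adapter alive so the flag can be inspected afterwards.
    let values: Vec<T> = adapter.by_ref().collect();
    if adapter.found_none { None } else { Some(values) }
}
```

With `optimistic` set, `Vec` sees a non-zero lower bound and can reserve space up front; without it, the lower bound is always zero.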

I have benchmarked (https://gist.github.com/ljedrz/c2fcc19f6260976ae7a46ae47aa71fb5) collecting into Option and Result using the current implementation and one with the proposed changes; I have also benchmarked a push loop with a known capacity as a reference that should be slower than using FromIterator (i.e. collect()). The results are quite promising:

test bench_collect_to_option_new ... bench:         246 ns/iter (+/- 23)
test bench_collect_to_option_old ... bench:         954 ns/iter (+/- 54)
test bench_collect_to_result_new ... bench:         250 ns/iter (+/- 25)
test bench_collect_to_result_old ... bench:         939 ns/iter (+/- 104)
test bench_push_loop_to_option   ... bench:         294 ns/iter (+/- 21)
test bench_push_loop_to_result   ... bench:         303 ns/iter (+/- 29)

Fixes #48994.

@rust-highfive
Collaborator

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 31, 2018
@sfackler sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jul 31, 2018
@sfackler
Member

> but there is no reason why we shouldn't provide a non-zero lower bound when we know the Iterator we are collecting from doesn't contain any None or Err.

We don't know that the iterator doesn't contain a None or Err, just that we haven't seen one yet. That being said, it does seem pretty reasonable to optimize for the case where there is no None or Err since that's probably the common case.

@rfcbot fcp merge

@rfcbot

rfcbot commented Jul 31, 2018

Team member @sfackler has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 31, 2018
@nagisa
Member

nagisa commented Jul 31, 2018

Even if there are any Nones or Errs within the iterator, the method will still end up collecting all the elements before the first None or Err into some collection, and that collection then gets discarded, right? This change is a no-brainer.

@rfcbot

rfcbot commented Aug 1, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Aug 1, 2018
@sfackler
Member

sfackler commented Aug 1, 2018

@bors r+

@bors
Contributor

bors commented Aug 1, 2018

📌 Commit 77aa031 has been approved by sfackler

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 1, 2018
@ollie27
Member

ollie27 commented Aug 1, 2018

Isn't this creating an iterator that might return fewer elements than the lower bound of its size_hint? That's a violation of the trait's protocol: https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#implementation-notes
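The concern can be illustrated with a small check. This is only a sketch: `StopAtNone` and `honours_lower_bound` are hypothetical names, and `StopAtNone` is a simplified stand-in for an adapter that forwards the inner `size_hint` unchanged while stopping at the first `None`.

```rust
/// Returns `true` if the iterator yields at least as many items as the
/// lower bound it advertised before iteration started.
fn honours_lower_bound<I: Iterator>(iter: I) -> bool {
    let (lower, _) = iter.size_hint();
    iter.count() >= lower
}

/// Hypothetical adapter: ends at the first `None` item, but reports the
/// inner iterator's size hint as if every item were `Some`.
struct StopAtNone<I>(I);

impl<T, I: Iterator<Item = Option<T>>> Iterator for StopAtNone<I> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // `Some(None)` from the inner iterator collapses to `None` here.
        self.0.next().flatten()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.0.size_hint()
    }
}
```

For an input of only `Some` values the check passes; as soon as the input contains a `None`, the adapter yields fewer items than its advertised lower bound.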

@sfackler
Member

sfackler commented Aug 1, 2018

Yep

@SimonSapin
Contributor

Yes it is a protocol violation, but since the iterator is private there is no "victim".

@ollie27
Member

ollie27 commented Aug 1, 2018

The iterator is private but it can be passed to anything implementing FromIterator including things outside of std which we can't guarantee will behave sensibly with an invalid iterator.

@sfackler
Member

sfackler commented Aug 2, 2018

That's true but it seems like the benefits outweigh the costs here, right? I don't really see there being FromIterator implementations that would explode if the lower bound estimate is wrong.

@ollie27
Member

ollie27 commented Aug 2, 2018

How about SmallVec: https://github.com/servo/rust-smallvec/blob/fcb1b7e99a2a98d928983af78a494ef566cbde8b/lib.rs#L1141-L1174. If the lower bound is wrong then it will try to iterate over an already consumed iterator (playground).

@ljedrz
Contributor Author

ljedrz commented Aug 2, 2018

@ollie27 I'm not sure about your example, but SmallVec doesn't seem to regress with this change.

@ollie27
Member

ollie27 commented Aug 2, 2018

SmallVec was just an example of real world code that relies on a correct implementation of size_hint to not explode. It does regress though if you consider that "no further elements are taken" once the iterator has returned None (playground).

Even Vec's FromIterator implementation can't handle nonsense lower bounds in size_hint (playground).
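To make the failure mode concrete, here is a rough sketch of a collection whose `FromIterator` trusts the lower bound in a fast path and then falls through to a general loop, in the spirit of the SmallVec code linked above. It is not SmallVec's actual implementation; `TrustingBuf`, `StopAtNone`, and the `honest` flag are hypothetical names used only for illustration.

```rust
use std::iter::FromIterator;

/// Hypothetical collection with a size-hint-driven fast path.
struct TrustingBuf<T>(Vec<T>);

impl<T> FromIterator<T> for TrustingBuf<T> {
    fn from_iter<I: IntoIterator<Item = T>>(iterable: I) -> Self {
        let mut iter = iterable.into_iter();
        let (lower, _) = iter.size_hint();
        let mut items = Vec::with_capacity(lower);

        // Fast path: fill the slots promised by the lower bound.
        let mut count = 0;
        while count < lower {
            match iter.next() {
                Some(item) => {
                    items.push(item);
                    count += 1;
                }
                None => break, // the hint promised more than was delivered
            }
        }

        // Slow path: if the fast path broke out early, this polls an
        // iterator that has already returned `None` once.
        for item in iter {
            items.push(item);
        }
        TrustingBuf(items)
    }
}

/// Hypothetical non-fused adapter that ends at each `None` item.
struct StopAtNone<I> {
    inner: I,
    /// `true` reports a lower bound of zero; `false` forwards the inner one.
    honest: bool,
}

impl<T, I: Iterator<Item = Option<T>>> Iterator for StopAtNone<I> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        self.inner.next().flatten()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let (lower, upper) = self.inner.size_hint();
        if self.honest { (0, upper) } else { (lower, upper) }
    }
}
```

Under these assumptions, collecting `[Some(1), None, Some(3)]` with the forwarded lower bound picks up the element after the `None`, whereas the honest hint stops at the `None` as intended.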

@ljedrz
Contributor Author

ljedrz commented Aug 2, 2018

Is there any way around this? We wouldn't want to be forced to use push loops, which are both less idiomatic and less performant (though the penalty is far smaller than with the existing implementation of Option/Result::FromIterator).

@ljedrz
Contributor Author

ljedrz commented Aug 2, 2018

@ollie27 What about the following alternative implementation?

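use std::iter::FromIterator;

fn option_from_iter_new<A, V, I>(iter: I) -> Option<V>
where
    V: FromIterator<A>,
    I: IntoIterator<Item = Option<A>>,
{
    let iter = iter.into_iter();
    // Trust the lower bound only when the iterator also reports an upper bound.
    let (lower, upper) = iter.size_hint();
    let mut v = Vec::with_capacity(if upper.is_some() { lower } else { 0 });

    for elem in iter {
        match elem {
            Some(e) => v.push(e),
            // Bail out on the first `None`, discarding what was collected so far.
            None => return None,
        }
    }

    // Move the buffered elements into the requested collection.
    Some(v.into_iter().collect())
}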

It makes both your playground examples compile and is about as fast as a manual push loop:

test bench_collect_to_option_new ... bench:         220 ns/iter (+/- 12)
test bench_collect_to_option_old ... bench:         600 ns/iter (+/- 32)
test bench_collect_to_result_new ... bench:         222 ns/iter (+/- 16)
test bench_collect_to_result_old ... bench:         596 ns/iter (+/- 40)
test bench_push_loop_to_option   ... bench:         211 ns/iter (+/- 13)
test bench_push_loop_to_result   ... bench:         221 ns/iter (+/- 8)

I'm not too happy about that intermediate Vec (any other ideas?), but it still looks like a performance win.

cramertj added a commit to cramertj/rust that referenced this pull request Aug 2, 2018
Calculate capacity when collecting into Option and Result

@ollie27
Member

ollie27 commented Aug 2, 2018

Using an intermediate Vec like that may work fine when collecting into a Vec because of this specialisation, but it likely won't be good for other collections. It will also still hit OOM if the size_hint is something like (usize::MAX, Some(usize::MAX)).

I'm not sure what can really be done here. Unfortunately, this may be a case where you just have to manually use with_capacity to get the best performance. That's already what you have to do when collecting iterators that don't have precise size_hints.
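A minimal sketch of that manual approach, assuming the caller knows a sensible capacity (for example the length of a source slice). The helper name `collect_with_capacity` is hypothetical and not part of the standard library.

```rust
/// Hypothetical helper: collects `Option` items into `Option<Vec<T>>`,
/// reserving a caller-chosen capacity instead of relying on `size_hint`.
fn collect_with_capacity<T, I>(iter: I, capacity: usize) -> Option<Vec<T>>
where
    I: IntoIterator<Item = Option<T>>,
{
    let mut values = Vec::with_capacity(capacity);
    for item in iter {
        // `?` returns `None` from the function at the first `None` item.
        values.push(item?);
    }
    Some(values)
}
```

Because the capacity comes from the caller rather than from the iterator, no iterator ever has to report a lower bound it cannot guarantee.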

I'll take this off the queue, hopefully I've made my point.

@bors r-

@bors bors removed the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Aug 2, 2018
@scottmcm
Member

@TimNN
Contributor

TimNN commented Jan 29, 2019

Ping from triage! Could someone summarize the status of this PR?

@scottmcm
Member

The PR looks plausible to me, FWIW.

I think it needs @ljedrz to rebase it to resolve the conflicts, then probably a try+perf to confirm it's effective?

@ljedrz
Contributor Author

ljedrz commented Jan 30, 2019

@scottmcm Rebased; please feel free to do a perf run if you feel this can work.

@bors
Contributor

bors commented Jan 30, 2019

☔ The latest upstream changes (presumably #57974) made this pull request unmergeable. Please resolve the merge conflicts.

@scottmcm
Member

scottmcm commented Jan 30, 2019

@ljedrz Sorry, can you rebase again? Seems like something else stepped on you again 🙁

@ljedrz
Contributor Author

ljedrz commented Feb 1, 2019

@scottmcm no prob; rebased.

@scottmcm
Member

scottmcm commented Feb 1, 2019

Let's see if I have permissions to

@bors try

@bors
Contributor

bors commented Feb 1, 2019

⌛ Trying commit 5762e67 with merge 8b32f79...

bors added a commit that referenced this pull request Feb 1, 2019
[WIP] Calculate capacity when collecting into Option and Result

@bors
Contributor

bors commented Feb 2, 2019

☀️ Test successful - checks-travis
State: approved= try=True

@Mark-Simulacrum
Member

@rust-timer build 8b32f79

@rust-timer
Collaborator

Success: Queued 8b32f79 with parent 23d8d0c, comparison URL.

@rust-timer
Collaborator

Finished benchmarking try commit 8b32f79

@ljedrz
Contributor Author

ljedrz commented Feb 2, 2019

Hmm, perf doesn't seem to have detected any difference.

@Mark-Simulacrum
Member

@sfackler Are you still the correct reviewer for this PR? Perhaps we should re-assign to @scottmcm? I'm not sure what the current status here is.

@Dylan-DPC-zz

ping from triage @ljedrz you have conflicts to resolve.

@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 18, 2019
@Dylan-DPC-zz

ping from triage @ljedrz any updates?

@ljedrz
Contributor Author

ljedrz commented Mar 25, 2019

@Dylan-DPC Actually I'm not sure how to proceed at this point - the specialization doesn't seem to be kicking in. I think we can close this for the time being and perhaps get back to it some other time.

@Dylan-DPC-zz

@ljedrz no issues. Closing this as per your comment. Thanks for taking the time to contribute :)

@Dylan-DPC-zz Dylan-DPC-zz added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 25, 2019
@ljedrz ljedrz deleted the fix_48994 branch June 24, 2019 19:05
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.