Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add try_fold, try_for_each, try_reduce #563

Merged
merged 11 commits into from
Jun 27, 2018
Merged

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Apr 7, 2018

There are six variations here, matching the existing non-try suite:

  • try_fold and try_fold_with
  • try_for_each and try_for_each_with
  • try_reduce and try_reduce_with

All of them operate on Try::Ok values similar to the exiting non-try
methods, and short-circuit early to return any Try::Error value seen.
This Try is a pub-in-private clone of the unstable std::ops::Try,
implemented for Option<T> and Result<T, E>.

TODO and open questions:

  • Needs documentation, examples, and tests.
  • Should we wait for Iterator::try_fold and try_for_each to
    reach rust stable? They were stabilized in Stabilize iterator methods in 1.27 rust-lang/rust#49607,
    but there's always a chance this could get backed out.
    • Resolved: they're stable in 1.27
  • Should we wait for stable std::ops::Try? We could just keep
    ours private for now, and change to publicly use the standard
    trait later (on sufficiently new rustc).
    • Resolved: keep our pub-in-private Try for now.
  • Should try_fold and try_fold_with continue to short-circuit
    globally, or change to only a local check?
    • When I first implemented try_reduce_with, I use a try_fold +
      try_reduce combination, like reduce_with's implementation, but
      I didn't like the idea of having double full: AtomicBool flags
      in use.
    • If try_fold only errors locally, then other threads can continue
      folding normally, and you can decide what to do with the error
      when you further reduce/collect/etc. e.g. A following try_reduce
      will still short-circuit globally.
    • Resolved: changed to just a local check.

Closes #495.

@nikomatsakis nikomatsakis self-requested a review May 30, 2018 17:48
@nikomatsakis
Copy link
Member

nikomatsakis commented Jun 6, 2018

I think the proper behavior of try_fold is for it to end early, just as you have implemented it. Let me summarize to make sure I understand:

  • Today, try_fold yields up an iterator over Option<T> values (or whatever);
  • But once it encounters a None anywhere, it makes a best effort to stop producing further values anywhere;
  • So typically you would have at most one None in the sequence (though it's probably possible to have to more than one, if the timing is poor).

The alternative being proposed is that, once fold has settled on a sequential list of things to process, it will always finish, and hence you can very easily get multiple None in the sequence, and fold avoids global communication via an AtomicBool.

Given these two options, I suspect that we want the current behavior. It seems to me that the are only reason to want the second behavior is if you somehow wanted to ensure that some elements were processed even though a None would occur at some other point; but given that you don't control where the splits occur in a fold operation, you are not actually given that assurance.

The only property that you do get, afaict, is that if a None occurs later in the sequence, then the side-effects of earlier operations will occur (the side-effects of later operations may occur). That seems pretty thin and unlikely to be what you wanted.

In contrast, what we give up is that we potentially do a bunch of extra work, producing Some(_) values that will later be discarded.

Copy link
Member

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from needing tests/docs/etc, I feel pretty good about mirroring the std API. However, we definitely want to keep the Try trait private (as indeed we have done here).

@cuviper
Copy link
Member Author

cuviper commented Jun 6, 2018

The alternative being proposed is that, once fold has settled on a sequential list of things to process, it will always finish, and hence you can very easily get multiple None in the sequence, and fold avoids global communication via an AtomicBool.

I'm proposing that try_fold would still short-circuit its local work, and just lose the global check. My thought is that you'd most often combine with try_reduce, which has its own global check. But that won't know there's any Err/None until the preceding try_fold passes it through, so we still need that to happen ASAP.

I don't know if there's an actual useful semantic here, beyond avoiding redundant AtomicBools. This is a small optimization for the try_fold().try_reduce() case, but the change would be noticeable if anybody did try_fold() followed by something that doesn't short circuit. I don't know if that's realistic. You're right that folks don't have control over splitting in general, but with_min/max_len do offer some control.

@cuviper
Copy link
Member Author

cuviper commented Jun 6, 2018

I guess what you thought I was proposing was that all sequential runs would be processed up to an Err/None, then resume that local sequence with a new accumulator? So a local fold might actually produce multiple accumulations for the next consumer... that's interesting too. I don't know if that's useful either.

}

fn full(&self) -> bool {
self.result.is_err() || self.full.load(Ordering::Relaxed)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't self.result.is_err() impliy self.full is true? (Or will be shortly?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does. In theory the local check could be faster than an atomic load, even relaxed, but I suppose that's micro-optimizing for the uncommon error case. I'll just check the latter.

Even though we already know `Self::Item: Send`, this doesn't apply
constraints to its `Try::Ok` and `Error`.  Therefore, if we map to
`Result<Option<Ok>, Error>` at a high level, we have to require extra
`Send` bounds to make sure that map is `Send`, like:

     <Self::Item as Try>::Ok: Send,
     <Self::Item as Try>::Error: Send

We can avoid this by implementing a custom `Consumer`, where we only
hold it as a `Result` in the low level `Folder` regardless of `Send`.
@nikomatsakis nikomatsakis self-assigned this Jun 27, 2018
@nikomatsakis
Copy link
Member

bors r+

bors bot added a commit that referenced this pull request Jun 27, 2018
563: [WIP] Add try_fold, try_for_each, try_reduce r=nikomatsakis a=cuviper

There are six variations here, matching the existing non-try suite:

- `try_fold` and `try_fold_with`
- `try_for_each` and `try_for_each_with`
- `try_reduce` and `try_reduce_with`

All of them operate on `Try::Ok` values similar to the exiting non-try
methods, and short-circuit early to return any `Try::Error` value seen.
This `Try` is a pub-in-private clone of the unstable `std::ops::Try`,
implemented for `Option<T>` and `Result<T, E>`.

TODO and open questions:

- [ ] Needs documentation, examples, and tests.
- [x] Should we wait for `Iterator::try_fold` and `try_for_each` to
      reach rust stable?  They were stabilized in rust-lang/rust#49607,
      but there's always a chance this could get backed out.
    - **Resolved**: they're stable in 1.27
- [x] Should we wait for stable `std::ops::Try`?  We could just keep
      ours private for now, and change to publicly use the standard
      trait later (on sufficiently new rustc).
    - **Resolved**: keep our pub-in-private `Try` for now.
- [x] Should `try_fold` and `try_fold_with` continue to short-circuit
      globally, or change to only a local check?
  - When I first implemented `try_reduce_with`, I use a `try_fold` +
    `try_reduce` combination, like `reduce_with`'s implementation, but
    I didn't like the idea of having double `full: AtomicBool` flags
    in use.
  - If `try_fold` only errors locally, then other threads can continue
    folding normally, and you can decide what to do with the error
    when you further reduce/collect/etc.  e.g. A following `try_reduce`
    will still short-circuit globally.
  - **Resolved**: changed to just a local check.

Closes #495.

Co-authored-by: Josh Stone <cuviper@gmail.com>
@bors
Copy link
Contributor

bors bot commented Jun 27, 2018

@bors bors bot merged commit 800c736 into rayon-rs:master Jun 27, 2018
@cuviper cuviper deleted the try_fold branch October 18, 2018 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants