Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add itertools::note_position. #169

Merged
merged 2 commits into from
Nov 24, 2016

Conversation

luser
Copy link
Contributor

@luser luser commented Nov 15, 2016

note_position wraps each returned iterator element in a Position<T>
element, which indicates the position of the element in the iterator
results. This is useful for special-case handling of the first or
last elements of an iterator.

I wrote an implementation of this on a whim last week and someone suggested that itertools would be a good home for it. I've written lots of code in other languages where I've needed to handle the first or last
element of an iteration specially, and I was curious as to whether Rust would make writing a general-purpose implementation of that easy. Of course it did. :)

`note_position` wraps each returned iterator element in a `Position<T>`
element, which indicates the position of the element in the iterator
results. This is useful for special-case handling of the first or
last elements of an iterator.
/// An iterator adaptor that wraps each element in an [`Position`](../enum.Position.html).
///
/// Iterator element type is `Position<I::Item>`.
/// This iterator is *fused*.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator is not fused. It doesn't need to be, just needs correct docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, although it doesn't use Fuse directly. Peekable is fused.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will remove the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fused iterator would pass the test called exact_position in this diff on top of your PR (as it was), and it doesn't. Peekable calls .next() on the underlying iterator, even if it has seen None already.

diff --git a/src/note_position.rs b/src/note_position.rs
index 58f8dc8..8bac7b1 100644
--- a/src/note_position.rs
+++ b/src/note_position.rs
@@ -89,4 +89,8 @@ impl<I: Iterator> Iterator for NotePosition<I> {
             (_, None) => None,
         }
     }
+
+    fn size_hint(&self) -> (usize, Option<usize>) {
+        self.peekable.size_hint()
+    }
 }
diff --git a/tests/quick.rs b/tests/quick.rs
index a0176c8..9aa8681 100644
--- a/tests/quick.rs
+++ b/tests/quick.rs
@@ -346,6 +346,12 @@ quickcheck! {
     fn exact_interleave(a: Iter<i16>, b: Iter<i16>) -> bool {
         exact_size_for_this(a.interleave(b))
     }
+    fn exact_position(a: Iter<i16>) -> bool {
+        exact_size_for_this(a.note_position())
+    }
+    fn exact_position_2(a: Vec<i16>) -> bool {
+        exact_size_for_this(a.iter().note_position())
+    }
     fn size_interleave_shortest(a: Iter<i16>, b: Iter<i16>) -> bool {
         correct_size_hint(a.interleave_shortest(b))
     }

(Iter is a non-fused iterator that the quickcheck test suite uses).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm ok, after some thinking about peekable, it's actually pretty bad that it's not fused.

Calling .peek() and observing None is the only signal the user gets that the iterator is exhausted -- calling .next() after that leads to iterator-specific behaviour.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yuck. FWIW, I think I misread the std docs, which say impl<I> FusedIterator for Peekable<I> where I: FusedIterator, which makes sense.

Should I change this to Peekable<Fuse<I>> to work around that problem for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Only(T),
}

impl<T: PartialEq> PartialEq for Position<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is equivalent to derive(PartialEq), prefer using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not know you could derive(PartialEq) or Debug on a generic without having a trait bound for it. TIL!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you know what the deriving recipes do: literally what you typed :)

}
}

impl<T: fmt::Debug> fmt::Debug for Position<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is equivalent to derive(Debug) except for pretty printing, why not use deriving.

// Iterator is finished.
(_, None) => None,
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should forward size_hint from the underlying iterator.

/// use itertools::{Itertools, Position};
///
/// let it = (0..4).note_position();
/// itertools::assert_equal(it, vec![Position::First(0), Position::Middle(1), Position::Middle(2), Position::Last(3)]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is too long, keep it inside 100 columns please.

type Item = Position<I::Item>;

fn next(&mut self) -> Option<Self::Item> {
match (self.handled_first, self.peekable.next()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if this method was written in a style where there's a branch if self.handled_first inside the match/conditional on .next()

@bluss
Copy link
Member

bluss commented Nov 15, 2016

This looks interesting. Why do you need to separate first from only? Just wondering.

Also, is the name .with_position() better?

/// Indicates the position of this element in the iterator results.
///
/// See [`.note_position()`](trait.Itertools.html#method.note_position) for more information.
pub enum Position<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

derive(Copy, Clone) are appropriate.

@@ -70,6 +70,7 @@ pub mod structs {
pub use intersperse::Intersperse;
pub use kmerge_impl::KMerge;
pub use pad_tail::PadUsing;
pub use note_position::NotePosition;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line is off by one from being in alpha order.

@luser
Copy link
Contributor Author

luser commented Nov 15, 2016

I split Only out because if you are expecting to special-case the last element of an iterator and there's only one element, you'd miss it. If there's only one element it's the first and the last, but I don't think I'd want it to be returned twice, that seems more confusing.

I am open to whatever naming you want. I don't know of prior art in this area so I just sort of plucked that name out of thin air.

Thanks for the review! I tried to copy things from the surrounding context but I guess I missed a few things. :)

@bluss
Copy link
Member

bluss commented Nov 15, 2016

The rest that I didnt' comment on was very good. That's always what the reviewer forgets to say.

@bluss
Copy link
Member

bluss commented Nov 15, 2016

You know what, if you give the decision to me I'd prefer .with_position()

But now I have also contemplated Position<T> versus (Position, T) and I actually think the latter is better. The former is good if we want to force the user to handle each case exhaustively, but that's not necessarily the case. It seems harmless to allow (Position, T) instead?

@bluss
Copy link
Member

bluss commented Nov 15, 2016

Maybe embedding T in position has a point, that it makes people not forget about the Only case.

@bluss
Copy link
Member

bluss commented Nov 15, 2016

I've reported rust-lang/rust/issues/37784 since Peekable's current behaviour makes it very hard to use, indeed this PR shows why (the implementation needs to use .fuse()).

@luser
Copy link
Contributor Author

luser commented Nov 15, 2016

But now I have also contemplated Position versus (Position, T) and I actually think the latter is better. The former is good if we want to force the user to handle each case exhaustively, but that's not necessarily the case. It seems harmless to allow (Position, T) instead?

This argument is pretty compelling. If you only want to handle the first item specially, right now you'd have to write:

match item {
  Position::First(i) | Position::Only(i) => {}
  Position::Middle(i) | Position::Last(i) => {}
}

If the element type was a tuple you could instead write:

match item {
  (Position::First, i) | (Position::Only, i) => {}
  (_, i) => {}
}

with the downside that yeah, you could easily forget Position::Only. I guess an alternative here would be to instead do something like:

struct Position {
  pub is_first: bool,
  pub is_last: bool,
}

and then have the iterator produce (Position, T) tuples. Then you'd instead use it like:

for (pos, i) in iter.with_position() {
  if pos.is_first {
    //...
  }
  if pos.is_last {
    // ...
  }
}

What do you think?

@bluss
Copy link
Member

bluss commented Nov 15, 2016

Enum is better. Rust enforces exhaustive matching or explicit opt out, so that's good enough?

* Rename note_position -> with_position
* Use Peekable<Fuse<I>> in WithPosition
* Forward size_hint from WithPosition to inner iterator
* Use derive(Copy, Clone, Debug, PartialEq) on Position
* Change next() impl conditionals
* Fix import ordering
* Fix line length
@luser
Copy link
Contributor Author

luser commented Nov 24, 2016

This just needs to be merged, then.

@bluss
Copy link
Member

bluss commented Nov 24, 2016

I'm sorry for keeping this waiting. This is going to be useful I think.

@bluss bluss merged commit 0576057 into rust-itertools:master Nov 24, 2016
@bluss
Copy link
Member

bluss commented Nov 24, 2016

I still disagree about documenting it as fused, but I can make small edits myself.

I'm adding .into_inner() -> T, but further methods could be useful too, just don't know what to name them.

For example is_first() -> bool (First, Only) and is_strictly_first() -> bool (First), maybe?

@luser
Copy link
Contributor Author

luser commented Dec 7, 2016

I think is_first(), is_last() make sense, because they each cover two enum variants. I don't know about is_strictly_first(), it seems like at that point you could just match on the enum.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants