Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add take_while_inclusive method #616

Merged
merged 7 commits into from
Jun 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
103 changes: 86 additions & 17 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,7 @@ pub mod structs {
pub use crate::repeatn::RepeatN;
#[allow(deprecated)]
pub use crate::sources::{RepeatCall, Unfold, Iterate};
pub use crate::take_while_inclusive::TakeWhileInclusive;
#[cfg(feature = "use_alloc")]
pub use crate::tee::Tee;
pub use crate::tuple_impl::{TupleBuffer, TupleWindows, CircularTupleWindows, Tuples};
Expand Down Expand Up @@ -233,6 +234,7 @@ mod rciter_impl;
mod repeatn;
mod size_hint;
mod sources;
mod take_while_inclusive;
#[cfg(feature = "use_alloc")]
mod tee;
mod tuple_impl;
Expand Down Expand Up @@ -904,7 +906,7 @@ pub trait Itertools : Iterator {

/// Return an iterator adaptor that flattens every `Result::Ok` value into
/// a series of `Result::Ok` values. `Result::Err` values are unchanged.
///
///
/// This is useful when you have some common error type for your crate and
/// need to propagate it upwards, but the `Result::Ok` case needs to be flattened.
///
Expand All @@ -914,7 +916,7 @@ pub trait Itertools : Iterator {
/// let input = vec![Ok(0..2), Err(false), Ok(2..4)];
/// let it = input.iter().cloned().flatten_ok();
/// itertools::assert_equal(it.clone(), vec![Ok(0), Ok(1), Err(false), Ok(2), Ok(3)]);
///
///
/// // This can also be used to propagate errors when collecting.
/// let output_result: Result<Vec<i32>, bool> = it.collect();
/// assert_eq!(output_result, Err(false));
Expand Down Expand Up @@ -1389,6 +1391,74 @@ pub trait Itertools : Iterator {
adaptors::take_while_ref(self, accept)
}

/// Returns an iterator adaptor that consumes elements while the given
/// predicate is `true`, *including* the element for which the predicate
/// first returned `false`.
///
/// The [`.take_while()`][std::iter::Iterator::take_while] adaptor is useful
/// when you want items satisfying a predicate, but to know when to stop
/// taking elements, we have to consume that first element that doesn't
/// satisfy the predicate. This adaptor includes that element where
/// [`.take_while()`][std::iter::Iterator::take_while] would drop it.
///
/// The [`.take_while_ref()`][crate::Itertools::take_while_ref] adaptor
/// serves a similar purpose, but this adaptor doesn't require [`Clone`]ing
/// the underlying elements.
///
/// ```rust
/// # use itertools::Itertools;
/// let items = vec![1, 2, 3, 4, 5];
/// let filtered: Vec<_> = items
/// .into_iter()
/// .take_while_inclusive(|&n| n % 3 != 0)
/// .collect();
///
/// assert_eq!(filtered, vec![1, 2, 3]);
/// ```
///
/// ```rust
/// # use itertools::Itertools;
/// let items = vec![1, 2, 3, 4, 5];
///
/// let take_while_inclusive_result: Vec<_> = items
/// .iter()
/// .copied()
/// .take_while_inclusive(|&n| n % 3 != 0)
/// .collect();
/// let take_while_result: Vec<_> = items
/// .into_iter()
/// .take_while(|&n| n % 3 != 0)
/// .collect();
///
/// assert_eq!(take_while_inclusive_result, vec![1, 2, 3]);
/// assert_eq!(take_while_result, vec![1, 2]);
/// // both iterators have the same items remaining at this point---the 3
/// // is lost from the `take_while` vec
/// ```
///
/// ```rust
/// # use itertools::Itertools;
/// #[derive(Debug, PartialEq)]
/// struct NoCloneImpl(i32);
///
/// let non_clonable_items: Vec<_> = vec![1, 2, 3, 4, 5]
/// .into_iter()
/// .map(NoCloneImpl)
/// .collect();
/// let filtered: Vec<_> = non_clonable_items
/// .into_iter()
/// .take_while_inclusive(|n| n.0 % 3 != 0)
/// .collect();
/// let expected: Vec<_> = vec![1, 2, 3].into_iter().map(NoCloneImpl).collect();
/// assert_eq!(filtered, expected);
fn take_while_inclusive<F>(&mut self, accept: F) -> TakeWhileInclusive<Self, F>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested in the original pull github issue in rust, wouldn't something like take_until be easier to find?

Copy link
Author

@junbl junbl Aug 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. The motivation for this method is "take_while but keeping the last element". take_until doesn't imply anything about the behavior with the final element—the only thing it implies is that the semantics of the predicate are reversed, e.g.:

// take elements while they're less than 4
iter.take_while(|x| x < 4)
// take elements until one is greater than or equal to 4
iter.take_until(|x| x >= 4)

If I just read this code without reading any docs, I would probably assume these did the same thing, and certainly wouldn't guess the motivation of having both methods. The negation becomes the most noticeable difference and distracts from the actual purpose.

Plus, if I see take_while in std and I'm looking for a method that's almost that but lets me keep the last element, I'm much more likely to find take_while_inclusive than take_until. And I'd be annoyed to have to negate the predicate if I was converting one to the other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely used to until COND being shorthand for while !COND, like with https://www.perltutorial.org/perl-until/, not implying another iteration or similar.

where
Self: Sized,
F: FnMut(&Self::Item) -> bool,
{
take_while_inclusive::TakeWhileInclusive::new(self, accept)
}

/// Return an iterator adaptor that filters `Option<A>` iterator elements
/// and produces `A`. Stops on the first `None` encountered.
///
Expand Down Expand Up @@ -1812,14 +1882,14 @@ pub trait Itertools : Iterator {
///
/// #[derive(PartialEq, Debug)]
/// enum Enum { A, B, C, D, E, }
///
///
/// let mut iter = vec![Enum::A, Enum::B, Enum::C, Enum::D].into_iter();
///
///
/// // search `iter` for `B`
/// assert_eq!(iter.contains(&Enum::B), true);
/// // `B` was found, so the iterator now rests at the item after `B` (i.e, `C`).
/// assert_eq!(iter.next(), Some(Enum::C));
///
///
/// // search `iter` for `E`
/// assert_eq!(iter.contains(&Enum::E), false);
/// // `E` wasn't found, so `iter` is now exhausted
Expand Down Expand Up @@ -2694,7 +2764,6 @@ pub trait Itertools : Iterator {
/// itertools::assert_equal(oldest_people_first,
/// vec!["Jill", "Jack", "Jane", "John"]);
/// ```
/// ```
#[cfg(feature = "use_alloc")]
fn sorted_by_cached_key<K, F>(self, f: F) -> VecIntoIter<Self::Item>
where
Expand Down Expand Up @@ -2870,13 +2939,13 @@ pub trait Itertools : Iterator {
group_map::into_group_map_by(self, f)
}

/// Constructs a `GroupingMap` to be used later with one of the efficient
/// Constructs a `GroupingMap` to be used later with one of the efficient
/// group-and-fold operations it allows to perform.
///
///
/// The input iterator must yield item in the form of `(K, V)` where the
/// value of type `K` will be used as key to identify the groups and the
/// value of type `V` as value for the folding operation.
///
///
/// See [`GroupingMap`] for more informations
/// on what operations are available.
#[cfg(feature = "use_std")]
Expand All @@ -2887,12 +2956,12 @@ pub trait Itertools : Iterator {
grouping_map::new(self)
}

/// Constructs a `GroupingMap` to be used later with one of the efficient
/// Constructs a `GroupingMap` to be used later with one of the efficient
/// group-and-fold operations it allows to perform.
///
///
/// The values from this iterator will be used as values for the folding operation
/// while the keys will be obtained from the values by calling `key_mapper`.
///
///
/// See [`GroupingMap`] for more informations
/// on what operations are available.
#[cfg(feature = "use_std")]
Expand Down Expand Up @@ -3603,7 +3672,7 @@ pub trait Itertools : Iterator {
/// first_name: &'static str,
/// last_name: &'static str,
/// }
///
///
/// let characters =
/// vec![
/// Character { first_name: "Amy", last_name: "Pond" },
Expand All @@ -3614,12 +3683,12 @@ pub trait Itertools : Iterator {
/// Character { first_name: "James", last_name: "Norington" },
/// Character { first_name: "James", last_name: "Kirk" },
/// ];
///
/// let first_name_frequency =
///
/// let first_name_frequency =
/// characters
/// .into_iter()
/// .counts_by(|c| c.first_name);
///
///
/// assert_eq!(first_name_frequency["Amy"], 3);
/// assert_eq!(first_name_frequency["James"], 4);
/// assert_eq!(first_name_frequency.contains_key("Asha"), false);
Expand All @@ -3640,7 +3709,7 @@ pub trait Itertools : Iterator {
/// column.
///
/// This function is, in some sense, the opposite of [`multizip`].
///
///
/// ```
/// use itertools::Itertools;
///
Expand Down
68 changes: 68 additions & 0 deletions src/take_while_inclusive.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
use core::iter::FusedIterator;
use std::fmt;

/// An iterator adaptor that consumes elements while the given predicate is
/// `true`, including the element for which the predicate first returned
/// `false`.
///
/// See [`.take_while_inclusive()`](crate::Itertools::take_while_inclusive)
/// for more information.
#[must_use = "iterator adaptors are lazy and do nothing unless consumed"]
pub struct TakeWhileInclusive<'a, I: 'a, F> {
iter: &'a mut I,
predicate: F,
done: bool,
}

impl<'a, I, F> TakeWhileInclusive<'a, I, F>
where
I: Iterator,
F: FnMut(&I::Item) -> bool,
{
/// Create a new [`TakeWhileInclusive`] from an iterator and a predicate.
pub fn new(iter: &'a mut I, predicate: F) -> Self {
Self { iter, predicate, done: false}
}
}

impl<'a, I, F> fmt::Debug for TakeWhileInclusive<'a, I, F>
where I: Iterator + fmt::Debug,
{
debug_fmt_fields!(TakeWhileInclusive, iter);
}

impl<'a, I, F> Iterator for TakeWhileInclusive<'a, I, F>
where
I: Iterator,
F: FnMut(&I::Item) -> bool
{
type Item = I::Item;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pondering: Would it be worth returning the result of the predicate in some way in the item type, since that information is known free anyway?

I was reminded of Position::Last, though it shouldn't be that type (since this doesn't know the other variants inherently), and I suppose one could always .take_while_inclusive(…).with_position() if they wanted.

My brain then thought about things like Result<I::Item, I::Item> so the last item is an Err. But that makes me go one step further and think of a take_while_map_inclusive (hopefully with a better name!) where the predicate returns a Result (or a ControlFlow) instead of a bool, and stops after returning the Err (or the Break).

Dunno if any of that is useful or worth doing, but I figured I'd toss it out for discussion.

Copy link

@smessmer smessmer Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a custom type that Deref's to the actual item type but has additional information about the predicate could be a good idea.

Otherwise, the call site might be able to infer it:
In the returned iterator, all entries except the last one had the predicate return true, and the last had the predicate return false, with the following exceptions:

  • If the returned iterator has the same length as the original iterator, then all elements returned true and there is no last element that returned false.
  • (special case of the previous rule) If the returned iterator is empty, then the input iterator was empty and there is no last element that returned false.

Given those rules, it might also be an idea to make that information accessible from the TakeWhileInclusive iterator object in something like an "last_element_failed_predicate" boolean flag instead of having a per-item boolean flag. Wouldn't be as convenient for iterator chaining, but if the calling code needs it, they could get it if they store the iterator in an intermediate variable.

edit: nvm the proposal. The rule is correct but when instantiating theTakeWhileInclusive, we don't have that information yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I should have read the Iterator docs better. It looks like what I propose would be named map_while_inclusive to go with take_while_inclusive.

I'll add that to the ACP, if one gets opened.


fn next(&mut self) -> Option<Self::Item> {
if self.done {
None
} else {
self.iter.next().map(|item| {
if !(self.predicate)(&item) {
self.done = true;
}
item
})
}
}

fn size_hint(&self) -> (usize, Option<usize>) {
if self.done {
(0, Some(0))
} else {
(0, self.iter.size_hint().1)
}
}
}

impl<I, F> FusedIterator for TakeWhileInclusive<'_, I, F>
where
I: Iterator,
F: FnMut(&I::Item) -> bool
{
}