Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACP: Uplift iter::repeat_n from itertools #120

Closed
scottmcm opened this issue Oct 16, 2022 · 5 comments
Closed

ACP: Uplift iter::repeat_n from itertools #120

scottmcm opened this issue Oct 16, 2022 · 5 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@scottmcm
Copy link
Member

scottmcm commented Oct 16, 2022

Proposal

Problem statement

Add the repeat_n<T>(value: T, n: usize) -> RepeatN<T> function and iterator type to iter.

This exists as itertools::repeat_n, but can easily be uplifted to core+std, like how iter::repeat_with has obviated itertools::repeat_call.

(This is a function, not a trait method, so doesn't have the resolution issues that make this hard for things like intersperse.)

Motivation, use-cases

The obvious alternative here is iter::repeat(value).take(n). But there's two gotchas with that approach:

It has to always clone value, never move it

If it's a string, for example, that buffer can never be reused (other than in specializations) since it needs to always keep the value around for the potentially-next iteration. (And even if it somehow knew it was the last element needed, it still can't move it in non-consuming calls as it'd make a follow-up .next() call UB, and the Drop would double-free.)

Vec has the secret from_elem for vec! to use to overcome this problem, as well as a bunch of extra internal stuff like ExtendElement for other places that want this functionality.

But other containers don't have this. For example,

iter::repeat(my_string).take(10).collect::<VecDeque<String>>()

will do 10 clones (instead of the 9 that are really needed) and there's no nice way to avoid that.

Even VecDeque::resize didn't bother doing better here, as it just calls .resize_with(n, || value.clone()), which will also do the extra clone.

Thus one would need to do some long dance like

let mut v = VecDeque::with_capacity(10);
v.extend(iter::repeat(&my_string).cloned().take(9));
v.push_back(my_string);

and that's enough of a pain that it's almost certainly not happening.

It would be nice to be able to just do iter::repeat_n(my_string, n).collect() and have that work great in containers, rather than needing them all to add from_elem-like methods or recreate extend specialization infrastructure for all of them like Vec is doing.

It's not ExactSizeIterator.

While that does have a perfect size_hint, iter::Repeat<_>: !ExactSizeIterator (and must not be) and thus the Take<…> isn't either:

iter::repeat(1).take(10).len();

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=74ea40842f56f87224a4bb2411967ec6

  |     iter::repeat(1).take(10).len();
  |                              ^^^ method cannot be called on `std::iter::Take<std::iter::Repeat<{integer}>>` due to unsatisfied trait bounds
  |
  = note: the following trait bounds were not satisfied:
          `std::iter::Repeat<{integer}>: ExactSizeIterator`
          which is required by `std::iter::Take<std::iter::Repeat<{integer}>>: ExactSizeIterator`

That could be made to work, but the "obvious" way would need both an InfiniteIterator trait and some #[marker] traits to express the bound, which combined make a pretty big hammer.

Whereas RepeatN<_> is non-infinite -- limited by the usize -- so is naturally ExactSizeIterator. Thus even for things like repeat(0).take(n) where buffer reuse is irrelevant, this is still useful. (Such as in -> impl ExactSizeIterator<Item = …> APIs.)

Solution sketches

So long as cloneing in this isn't considered a side effect, it can be made to work nicely. That should be uncontroversial, as the existing repeat doesn't either -- for example iter::repeat(NoisyClone).nth(10) only clones once: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b8455030a83615d6b4a1b92c25d65571

A straight-forward and natural implementation even allows LLVM to unify the "clone" and "move the last time" branches for trivial-to-clone types even if they're not Copy -- with no special needs_drop checks or specialization: https://rust.godbolt.org/z/drjzsdfoe

Links and related work

As mentioned, https://docs.rs/itertools/latest/itertools/fn.repeat_n.html

The suffix _n for "counted" versions of methods is used in Elements of Programming and common in C++ functions like copy_n.

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.

@scottmcm scottmcm added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 16, 2022
@the8472
Copy link
Member

the8472 commented Oct 29, 2022

That could be made to work, but the "obvious" way would need both an InfiniteIterator trait and some #[marker] traits to express the bound, which combined make a pretty big hammer.

Smaller hammer:

impl<T> ExactSizeIterator for Take<Repeat<T>> where T: Clone {}

It has to always clone value, never move it

This could be solved with specialization on Take and a MaybeUninit. But yeah, now the hammer sizes are going up again.

@thomcc
Copy link
Member

thomcc commented Nov 18, 2022

This could be solved with specialization on Take and a MaybeUninit. But yeah, now the hammer sizes are going up again.

It's also a specialization observable by users, which is unfortunate.

@mina86
Copy link

mina86 commented Jan 28, 2023

By the way, I was unaware of repeat_n and ended up implementing Take<Repeat<T>> in rust-lang/rust#106943

It's also a specialization observable by users, which is unfortunate.

I think this is unnecessarily strict view of observable behaviour especially since documentation doesn’t state whether repeat clones every value or not.

I think the bigger issue is to consider timeline of specialisation becoming.

@joshtriplett
Copy link
Member

We discussed this in today's libs-api meeting. Seems good; it's clearer and a slight optimization.

Feel free to open a tracking issue and open a PR to rust-lang/rust to add it as an unstable feature.

@scottmcm
Copy link
Member Author

scottmcm commented Jan 16, 2024

Tracking issue: rust-lang/rust#104434

I'll make a PR to remove the doc(hidden) 🙂

EDIT: rust-lang/rust#120045

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 21, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 21, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jan 22, 2024
Rollup merge of rust-lang#120045 - scottmcm:unhide-repeat-n, r=Mark-Simulacrum

Un-hide `iter::repeat_n`

ACP accepted in rust-lang/libs-team#120 (comment)
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Apr 7, 2024
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 16, 2024
…lnay

Implement DoubleEnded and ExactSize for Take<Repeat> and Take<RepeatWith>

Repeat iterator always returns the same element and behaves the same way
backwards and forwards.  Take iterator can trivially implement backwards
iteration over Repeat inner iterator by simply doing forwards iteration.

DoubleEndedIterator is not currently implemented for Take<Repeat<T>>
because Repeat doesn’t implement ExactSizeIterator which is a required
bound on DEI implementation for Take.

Similarly, since Repeat is an infinite iterator which never stops, Take
can trivially know how many elements it’s going to return.  This allows
implementing ExactSizeIterator on Take<Repeat<T>>.

While at it, observe that ExactSizeIterator can also be implemented for
Take<RepeatWhile<F>> so add that implementation too.  Since in contrast
to Repeat, RepeatWhile doesn’t guarante to always return the same value,
DoubleEndedIterator isn’t implemented.

Those changes render core::iter::repeat_n somewhat redundant.

Issue: rust-lang#104434
Issue: rust-lang#104729

- [ ] ACP: rust-lang/libs-team#120 (this is actually ACP for repeat_n but this is nearly the same functionality so hijacking it so both approaches can be discussed in one place)
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 17, 2024
…lnay

Implement DoubleEnded and ExactSize for Take<Repeat> and Take<RepeatWith>

Repeat iterator always returns the same element and behaves the same way
backwards and forwards.  Take iterator can trivially implement backwards
iteration over Repeat inner iterator by simply doing forwards iteration.

DoubleEndedIterator is not currently implemented for Take<Repeat<T>>
because Repeat doesn’t implement ExactSizeIterator which is a required
bound on DEI implementation for Take.

Similarly, since Repeat is an infinite iterator which never stops, Take
can trivially know how many elements it’s going to return.  This allows
implementing ExactSizeIterator on Take<Repeat<T>>.

While at it, observe that ExactSizeIterator can also be implemented for
Take<RepeatWhile<F>> so add that implementation too.  Since in contrast
to Repeat, RepeatWhile doesn’t guarante to always return the same value,
DoubleEndedIterator isn’t implemented.

Those changes render core::iter::repeat_n somewhat redundant.

Issue: rust-lang#104434
Issue: rust-lang#104729

- [ ] ACP: rust-lang/libs-team#120 (this is actually ACP for repeat_n but this is nearly the same functionality so hijacking it so both approaches can be discussed in one place)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

5 participants