Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impl IntoIterator for arrays #49000

Closed
wants to merge 1 commit into from
Closed

Conversation

cuviper
Copy link
Member

@cuviper cuviper commented Mar 13, 2018

This allows an array to move its values out through iteration.

This was attempted once before in #32871, but closed because the IntoIter<T, [T; $N]> type is not something we would want to stabilize. However, RFC 2000's const generics (#44580) are now on the horizon, so we can plan on changing this to IntoIter<T, const N: usize> before stabilization.

Adding the impl IntoIterator now will allows folks to go ahead and iterate arrays in stable code. They just won't be able to name the array::IntoIter type or use its inherent as_slice/as_mut_slice methods until they've stabilized.

Quite a few iterator examples were already using .into_iter() on arrays, getting auto-deref'ed to the slice iterator. These were easily fixed by calling .iter() instead, but it shows that this might cause a lot of breaking changes in the wild, and we'll need a crater run to evaluate this. Outside of examples, there was only one instance of in-tree code that had a problem.

Fixes #25725.

r? @alexcrichton

This allows an array to move its values out through iteration.  I find
this especially handy for `flat_map`, to expand one item into several
without having to allocate a `Vec`, like one of the new tests:

    fn test_iterator_flat_map() {
        assert!((0..5).flat_map(|i| [2 * i, 2 * i + 1]).eq(0..10));
    }

Note the array must be moved for the iterator to own it, so you probably
don't want this for large `T` or very many items.  But for small arrays,
it should be faster than bothering with a vector and the heap.
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 13, 2018
@cuviper
Copy link
Member Author

cuviper commented Mar 13, 2018

It comes up fairly often that users are surprised they can't directly iterate array values. For example, this reddit user didn't understand why the compiler was creating an iterator with reference items:
https://www.reddit.com/r/rust/comments/83z5zt/help_with_iterating_over_values_instead_of/

@kennytm
Copy link
Member

kennytm commented Mar 14, 2018

@bors try

@bors
Copy link
Contributor

bors commented Mar 14, 2018

⌛ Trying commit 032f93b with merge be75eb3...

bors added a commit that referenced this pull request Mar 14, 2018
impl IntoIterator for arrays

This allows an array to move its values out through iteration.

This was attempted once before in #32871, but closed because the `IntoIter<T, [T; $N]>` type is not something we would want to stabilize.  However, RFC 2000's const generics (#44580) are now on the horizon, so we can plan on changing this to `IntoIter<T, const N: usize>` before stabilization.

Adding the `impl IntoIterator` now will allows folks to go ahead and iterate arrays in stable code.  They just won't be able to name the `array::IntoIter` type or use its inherent `as_slice`/`as_mut_slice` methods until they've stabilized.

Quite a few iterator examples were already using `.into_iter()` on arrays, getting auto-deref'ed to the slice iterator.  These were easily fixed by calling `.iter()` instead, but it shows that this might cause a lot of breaking changes in the wild, and we'll need a crater run to evaluate this.  Outside of examples, there was only one instance of in-tree code that had a problem.

Fixes #25725.

r? @alexcrichton
@@ -210,6 +214,21 @@ macro_rules! array_impls {
}
}

#[unstable(feature = "array_into_iter", issue = "0")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trait implementations are still insta-stable, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah. I'll need to update the truly unstable items with a tracking issue (if this PR is accepted), so I'll change the impls to stable then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Event if the IntoIter type is left as unstable, it will still be reachable through <<[u8; 0]> as IntoIter>::Iter>, right?

Copy link
Member Author

@cuviper cuviper Mar 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, you can write something like:

fn my_iter() -> <[u8; 0] as IntoIterator>::IntoIter {
    [].into_iter()
}

I think that's OK. We could change the associated type entirely without breaking such code.

@bors
Copy link
Contributor

bors commented Mar 14, 2018

☀️ Test successful - status-travis
State: approved= try=True

@ishitatsuyuki
Copy link
Contributor

ishitatsuyuki commented Mar 14, 2018

I'm curious whether the discussion in postponed rust-lang/rfcs#2185 applies to this as well.

@cuviper
Copy link
Member Author

cuviper commented Mar 14, 2018

@ishitatsuyuki

I'm curious whether the discussion in postponed rust-lang/rfcs#2185 applies to this as well.

I was just following up on my older PR. I wasn't aware of that RFC, but it would apply, yes. If nothing else, I guess we'll have this PR as a proof of concept, that the implementation is feasible.

I don't agree with the conclusion though. I think this lack is a usability wart that needs fixing, and it is easily fixed. The IntoIter signature is the only piece lacking, and I think it's fine to leave that unstable now, having a clear path to stabilization with const generics.

@novacrazy
Copy link

novacrazy commented Mar 15, 2018

Do you mind if I incorporate some of this into the generic-array crate? I like how you've handled the Clone implementation and as_slice methods in particular.

@cuviper
Copy link
Member Author

cuviper commented Mar 15, 2018

@novacrazy I don't mind at all - go for it!

#[inline]
fn next(&mut self) -> Option<T> {
if self.index < self.index_back {
let p = &self.array.as_slice()[self.index];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these indexes use ptr::offset so as not to acquire &[T] to a partially uninitialized [T]?

The same applies down in next_back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an uninitialized slice really a problem if we never read from those parts? I would think it's fine, otherwise sequences like mem::uninitialized() + ptr::write() would be a problem too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's certainly problematic under the types-as-contracts view. AFAICT the unsafe code guidelines aren't solid enough to really give a hard answer yet, but it would certainly be better to not create invalid slices. Incidentially, yes, mem::uninitialized is problematic :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, how can I even get the pointer base to offset from? Deref for ManuallyDrop will create a momentary reference here too. Or even with a local union type, getting the pointer to the field will require forming a reference first.

Copy link
Member

@cramertj cramertj Mar 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add an unstable ManuallyDrop::as_ptr and mark ManuallyDrop as repr(transparent). This will tell you that *const ManuallyDrop<T> points to the same place as*const T would.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, repr(transparent) is only allowed on a struct. The RFC left unions as future work: rust-lang/rfcs#1758 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need repr(transparent) for this, since this is just a question of memory layout, not calling convention. repr(C) would suffice, or inside libstd we could even simply rely on what rustc currently does (which should place the union contents at the same address as the union).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkruppe I've been thinking more about this, and if my conclusion in #49056 (comment) is correct, then this PR would be fine with the current "invalid slice" usage, since the values left in the slice would still be valid values of type T, just ones in the "dropped" typestate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, here's a comparison using pointers to avoid possibly dropped/uninit refs:
cuviper/rust@array-intoiter...cuviper:array-intoiter-ptr

I can submit the ManuallyDrop additions in a separate PR, if you think it's worth it.

since the values left in the slice would still be valid values of type T, just ones in the "dropped" typestate.

The Clone implementation will create truly uninitialized entries too. IMO the unsafe rules should not declare such references invalid if they're not read, but we can deal with it in pointers if necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that clone implementation is certainly become UB due to the uninitialized, but that's true of basically every use of mem::uninitialized, so we'll have to go clean them all up later no matter what once the uninit RFC lands

#[inline]
#[unstable(feature = "array_into_iter", issue = "0")]
pub fn as_mut_slice(&mut self) -> &mut [T] {
&mut self.array.as_mut_slice()[self.index..self.index_back]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here-- I think this implementation should use ptr::offset and from_raw_parts so as not to acquire a reference to a partially uninitialized slice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above, how can I get a base pointer at all without forming some reference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above-- if you make ManuallyDrop repr(transparent), then you know that pointers to it will point directly to its element.

novacrazy added a commit to novacrazy/generic-array that referenced this pull request Mar 15, 2018
@clarfonthey
Copy link
Contributor

Genuine question: considering how index and index_back are usize anyway, wouldn't it be reasonable to just replace them with pointers like the Iter and IterMut implementations? That way there's no extra offsets done in the iterator itself.

@ishitatsuyuki
Copy link
Contributor

@clarcharr The Iterator may be moved, which changes the address.

@pietroalbini
Copy link
Member

Ping from triage @alexcrichton! This PR needs your review.

// Only values in array[index..index_back] are alive at any given time.
// Values from array[..index] and array[index_back..] are already moved/dropped.
array: ManuallyDrop<A>,
index: usize,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I've personally enjoyed doing historically is storing these two indices as Range<usize> as it helps make implementations for the iterator methods easier

@@ -1242,7 +1242,7 @@ pub trait Iterator {
/// // let's try that again
/// let a = [1, 2, 3];
///
/// let mut iter = a.into_iter();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear, these changes mean that this is technically a breaking change, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly what's going on, this "breaking change" is no different from the way adding a new method to Iterator which (if e.g. itertools already implements that method) would break all users of itertools who used the method syntax instead of the UFC syntax to call said method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct yeah. This doesn't mean it's impossible to land, just that it needs more scrutiny/review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does feel like one of those changes that, while (allowed) breaking, people have wanted for a very long time and we definitely want to do at some point.

@alexcrichton
Copy link
Member

cc @rust-lang/libs, it looks like this may cause compiling code to stop compiling but fall under the umbrella of "accepted type inference breakage". Along those lines I wanted to cc others on this (especially as everything added is unstable) to make see if anyone's got concerns.

I'd personally prefer to hold off on stabilizing these for now until we can get the true IntoIter<T, N> rather than IntoIter<T, [T; N]> if we can, but I think otherwise we may want to run crater on this PR to assess the possible breakage

@oli-obk
Copy link
Contributor

oli-obk commented Mar 21, 2018

If we hold off we should at least create a future compat lint that suggests iter over into_iter, so the breakage will be less severe

@withoutboats
Copy link
Contributor

I think we should merge this, but not stabilize the types until they have the correct signature. However I think you'll be able to use them on stable (without naming the type) because the IntoIterator impl will be stable. I'm fine with this, the only part I'm not fine with stabilizing is the signature, which you can't get yet.

I'd also like to see a crater run, but unless its really bad I'm not that perturbed because it feels like you're walking into breakage by calling into_iter to get the slice iterator on an array.

@ishitatsuyuki
Copy link
Contributor

It's terrifying enough that we already had code subject to breakage in our code base. If traits are insta-stable (or they can't act if they doesn't exist if the feature is disabled), we cannot really just merge this.

@cuviper
Copy link
Member Author

cuviper commented Mar 21, 2018

We could use inherent methods to have the iterator without the IntoIterator collision, like IntoIter::new(A) or [T; $N]::iter_values(self). But these would be less ergonomic, and only usable on nightly until we stabilize the type signature.

Could we make the impl IntoIterator only visible to 2018 edition? It couldn't be a simple #[cfg(2108)], since core's edition is independent of the user's. Maybe some other attribute could mask its visibility?

I'm game for adding a compat/deprecation warning, but I'm not sure how to do this, at least not at the library level. I guess it would have to be checked as a special case within the compiler?

@SimonSapin
Copy link
Contributor

Shadowing existing […].into_iter() calls is business as usual when adding methods or impls to standard library types. (Though it looks like we have yet to do that crater run.)

I’m much more uncomfortable with merging this if the feature is effectively insta-stable while we have definite plans to change the "shape" of the return types later, even if these types can’t be named directly yet.

@alexcrichton
Copy link
Member

cc @rust-lang/infra, can this get a crater run?

@bors: try

@cramertj
Copy link
Member

It isn't possible today. I've opened rust-lang/crater#388 to track this. cc @pietroalbini.

@ishitatsuyuki
Copy link
Contributor

I think this lint has record of triggering on std (vague memory), and I think it's not possible to apply this patch without some isolation (however feature-gate does not apply to traits, and edition-gate cannot be used for this purpose as well), given that this lint has triggered on at least one of my project I contribute to.

@cuviper
Copy link
Member Author

cuviper commented Jan 14, 2019

I think this will technically qualify as a minor breaking change -- the new trait implementation could break some existing code, but is easily worked around. The same is true of many library additions. But crater can help judge how pervasive this is, so we can decide if it's too much headache.

A fallback position is that we forgo IntoIterator for arrays, but still provide some other explicit method to create that iterator. It would be a little sad to lose the magic associated with IntoIterator though.

@clarfonthey
Copy link
Contributor

Perhaps we could still add an into_iter method for arrays that isn't IntoIterator::into_iter?

@cuviper
Copy link
Member Author

cuviper commented Jan 14, 2019

No, it's precisely the method call syntax that gets us into trouble, where auto-deref currently resolves array.into_iter() to the slice method. An inherent array into_iter() would break that just as much as the trait method does. We would have to call it something else, and document why it's not idiomatic.

Direct uses of IntoIterator, like for loops and the zip argument, wouldn't be a problem because you can't use array values there at all yet.

@cuviper
Copy link
Member Author

cuviper commented Jan 14, 2019

Hmm, we could add an inherent array into_iter() that forwards to slice, and then have our array IntoIterator by value. So auto-deref method resolution would still find the "legacy" behavior, while arrays would be properly iterable elsewhere.

@Mark-Simulacrum
Copy link
Member

To expand on that a little, why not even mark the inherent into_iter as a deprecated method, perhaps in some future release? That would give us the warning to current users more firmly, and presumably, at some later point, we can change into_iter on arrays to return the by-value iterator. This would perhaps be done at the next edition if we figure out some way to cfg methods on edition boundaries, but I think figuring that out concretely isn't super important to do right now.

In hindsight, it might be good to recommend that any collection-like types implementing Deref to another collection (e.g., slices, vec) should always "reserve" their into_iter ahead of time (i.e., by implementing it with a panic or similar) if they don't want to implement it or can't implement it yet. Might be interesting for the libs team to consider adding to the API guidelines.

@scottmcm
Copy link
Member

This feels like one of those things we just need to do, even if it's a bit painful. There are so many places just calling out for this, like being able to do .flat_map(|x| [x, x]).

@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Jan 14, 2019

What about doing something like this?

  • Add a insta-deprecated method called into_iter method for arrays that returns slice::Iter<T>.
  • Implement IntoIterator for arrays.

Explicitly calling into_iter for arrays in such a scenario will call the deprecated into_iter method, as it's an inherent method, not a trait method. The depreciation warning will allow developers to update their code to not call into_iter method.

@cuviper
Copy link
Member Author

cuviper commented Jan 14, 2019

@xfix Yes, that's what @Mark-Simulacrum and I are talking about in our most recent comments.

I guess we would need a new #[lang = "array"] for this, but there might be more to it:

error[E0118]: no base type found for inherent implementation
   --> src/libcore/array.rs:194:21
    |
194 |               impl<T> [T; $N] {
    |                       ^^^^^^^ impl requires a base type

@pietroalbini
Copy link
Member

It is now possible to start Crater runs with mode=clippy. Thanks @ecstatic-morse!

@cramertj
Copy link
Member

I don't think I have permissions for this, but I can give it a shot and maybe someone else will pick it up for me when it fails ;)

@craterbot run name=clippy-test-run mode=clippy start=stable end=stable+rustflags=-Dclippy::into_iter_on_array

@craterbot
Copy link
Collaborator

👌 Experiment clippy-test-run created and queued.
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jan 28, 2019
@craterbot
Copy link
Collaborator

🚧 Experiment clippy-test-run is now running on agent aws-1.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added the S-waiting-on-crater Status: Waiting on a crater run to be completed. label Jan 28, 2019
@craterbot
Copy link
Collaborator

🎉 Experiment clippy-test-run is completed!
📊 11872 regressed and 1 fixed (50551 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Jan 29, 2019
@pietroalbini
Copy link
Member

Tons of failures with:

[INFO] [stderr] error[E0602]: unknown lint: `clippy::into_iter_on_array`
[INFO] [stderr]   |
[INFO] [stderr]   = note: requested on the command line with `-D clippy::into_iter_on_array`

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Jan 29, 2019

It seems passing clippy lints via RUSTFLAGS fails for projects which have a build script.

@cuviper
Copy link
Member Author

cuviper commented Feb 11, 2019

We discussed the inherent-method trick at all-hands, but discovered that it doesn't actually work on bare values. See in this playground that the final bare call does not hit the inherent method. Method resolution will choose our new <[T; N] as IntoIterator>::into_iter(self) before auto-ref'ing to call an inherent into_iter(&self), and the latter must be a reference to borrow for slice::Iter. So it seems we can't actually cover and deprecate the call we need, unless we involve some further compiler hackery.

@cuviper cuviper deleted the array-intoiter branch August 9, 2020 23:37
suptalentdev added a commit to suptalentdev/generic-array that referenced this pull request Jan 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IntoIterator should be implemented for [T; N] not just references