Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add array::try_from_iter to improve discoverability and ergonomics #107979

Closed
wants to merge 3 commits into from

Conversation

emilHof
Copy link

@emilHof emilHof commented Feb 13, 2023

Hey everyone!

This is my first attempt at contributing to this project so please excuse all the faux pas I may be committing! It also means I will totally understand if this gets shot down as I am super happy to just learn! :)

In essence, this change takes advantage of the relatively new iter::next_chunk method to provide a more direct way of creating a fixed-size array [T; N] from any IntoIterator<Item = T>. More precisely it array::try_from_iter takes an IntoIterator<Item = T> which will result in a [T; N], if and only if the IntoIterator holds >= N items. Otherwise, it returns an array::IntoIter<T> with the original IntoIterator's items. The proposed method signature follows:

pub fn try_from_iter<I, const N: usize>(iter: I) -> Result<[I::Item; N], IntoIter<I::Item, N>>
where
    I: IntoIterator,

It could be used as follows:

let vec = vec![0, 1, 2, 3];

let Ok(array) = core::array::try_from_iter(vec) 
    /*
        handle insufficient items... 
    */
};

assert_eq!(array, [0, 1, 2, 3]);

This is essentially just a wrapper around iter::next_chunk and its main purpose is to increase ergonomics and discoverability in an effort to reduce the use of unsafe in crates and code bases.

Here is a more detailed RFC.

@rustbot
Copy link
Collaborator

rustbot commented Feb 13, 2023

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon.

Please see the contribution instructions for more information.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 13, 2023
@rustbot
Copy link
Collaborator

rustbot commented Feb 13, 2023

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@emilHof
Copy link
Author

emilHof commented Feb 13, 2023

@rustbot label +T-libs-api -T-libs

@rustbot rustbot added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 13, 2023
@emilHof
Copy link
Author

emilHof commented Feb 13, 2023

@rustbot label -T-libs-api +T-libs

@rustbot rustbot added T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Feb 13, 2023
/// ```
#[unstable(feature = "try_from_iter", issue = "none")]
#[inline]
pub fn try_from_iter<I, const N: usize>(iter: I) -> Result<[I::Item; N], IntoIter<I::Item, N>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This consumes iter, while next_chunk takes a &mut iter. Those are somewhat different semantics because it raises the question what happens when the remainder of the iterator could yield additional elements.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yess that is true! i was wondering about this as well when choosing the API layout, but ended up going with taking ownership of iter, as it seems to be the standard behavior of from functions and is probably what users would expect. Your point is still very valid, as most other collections that can be created through some form of from, From, try_from, or TryFrom don't have this size constraint that [T; N] has.

I'll add this to the RFC under Alterntives; yet, if this &mut self ends up making more sense then changing it to that should not be to difficult!

Copy link
Author

@emilHof emilHof Feb 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could also maybe imagine returning something like Result<([I::Item; N], I::Iterator), IntoIter<I::Item, N>> so that any remaining elements can be extracted from the original collection, but that might be a bit to contrived? It would mean you'd always have to expression match out of it, and would appear to severely reduce ergonomics.

@bors
Copy link
Contributor

bors commented Feb 13, 2023

☔ The latest upstream changes (presumably #107634) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@scottmcm
Copy link
Member

scottmcm commented Feb 14, 2023

It was great to see the Drawbacks section of your RFC. Definitely my biggest question here would be how it interacts with Iterator::next_chunk, and whether having both is useful.

Probably a bunch of this is related to optimization issues -- IIRC next_chunk was needed because the default implementation performed horribly for many existing iterators, even when they seemed like they should have been easy (like vec::IntoIter or Copied<slice::Iter>). I suppose one way to have both be worth it would be to tweak the Iterator method to be less convenient but more perf-optimal -- maybe it needs a passed-in buffer instead of returning a Result, for example? -- and then the module-level function could adapt that into the convenient unwrappable thing.

P.S. I added libs-api to the tags because a new pub fn exported from core is T-libs-api, even if still unstable.

@scottmcm scottmcm added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Feb 14, 2023
@emilHof
Copy link
Author

emilHof commented Feb 14, 2023

@scottmcm that is a valid question!

how it interacts with Iterator::next_chunk

In terms implementation interaction, it really is just a wrapper, so any change to next_chunk, for better or for worse, would have an impact in its performance and soundness. I am guessing this is not what you are hinting at though...

whether having both is useful

As for its usefulness alongside Iterator::next_chunk, the main idea is discoverability. While it is completely possible and not really much more verbose to use next_chunk directly, it seems like a level of indirection that would require additional research on the users part. I think a dedicated function would be what users would initially look for and the extra step of creating an iterator and using next_chunk may not be intuitive to some. So even if try_from_iter only prevents the use of unnecessary unsafe in a few public creates it would appear worth it while; to me at least.

IIRC next_chunk was needed because the default implementation performed horribly for many existing iterators

By default implementation, do you mean the default implementation of next_chunk? I am slightly confused hehe!

I suppose one way to have both be worth it would be to tweak the Iterator method to be less convenient but more perf-optimal -- maybe it needs a passed-in buffer instead of returning a Result, for example? -- and then the module-level function could adapt that into the convenient unwrappable thing.

By Iterator method, would you be referring to next_chunk or try_form_iter?

@emilHof
Copy link
Author

emilHof commented Feb 14, 2023

PS. I added libs-api to the tags because a new pub fn exported from core is T-libs-api, even if still unstable.

Ohh, I see, thank you haha! I was not quite sure which one applied here! That makes sense though!

@scottmcm
Copy link
Member

So, it's pretty rare that we have the same thing exposed in two ways. I can't actually think of a single example where one side is a normal function -- everything I can come up with it's traits on both sides. Let's look at a few and see if we can divine any patterns:

FromIterator::from_iter & Iterator::collect

Until recently, from_iter actually said not to call it: https://doc.rust-lang.org/1.60.0/std/iter/trait.FromIterator.html

FromIterator::from_iter() is rarely called explicitly, and is instead used through Iterator::collect() method.

The split was that FromIterator was for implementing, not for calling -- people liked the chaining .collect() better for calling.

However, it was added to the prelude for 2021 because it can be handy as a way to get the type you wanted: VecDeque::from_iter([1, 2, 4]) instead of [1, 2, 4].into_iter().collect::<VecDeque<_>>().

So how does that one map to try_from_iter? Well, .next_chunk() is already the chaining one, and the one that's already in scope (no use std::array::try_from_fn), and actually the one that's easier to type-annotate (.next_chunk::<4>() vs try_from_fn::<_, 4>), and the one that autorefs.

Display::fmt vs ToString::to_string

This one definitely started as being all about ergonomics, since setting up a mutable string and writing into it is a bunch of ceremony that's really not desirable for the simple cases. Being able to use .to_string() is very nice. (It then turned out to be helpful for performance too, since String: ToString is specialized internally.)

But .next_chunk() is already convenient, so try_from_iter doesn't save any boilerplate.

Extend::extend and Iterator::collect_into

collect_into isn't actually stable yet, but we can use it to look at why people are interested in it, though its value is precedent is lower since libs-api hasn't signed off on it.

This one's all about chaining. For long iterator chains, people would like to do

iter.blah()
    .some_adapter(complicated)
    .and_more_stuff()
    .collect_into(&mut v);

instead of

v.extend(
    iter.blah()
        .some_adapter(complicated)
        .and_more_stuff());

But that's the opposite of try_from_iter -- the next_chunk version is already the one that allows chaining.


Feel free to add anything I missed!

But looking at those, it suggests to me that try_from_iter shouldn't be added. That the method on Iterator is already the more convenient one, as it's in-scope and chains and needs less turbofish. It's tolerably discoverable, at least as much as all the other Iterator methods, and if stabilized could be made more discoverable by adding it to the error message for trying to collect into an array, for example. With it always being an array on one side, there isn't the same need for a double-trait dance like there is for FromIterator.

What might change that? Well, the one that jumps to mind is if next_chunk ended up needing more ceremony. (This is what I was "hinting" at above.) Suppose that instead of next_chunk, the iterator method ended up being

fn write_prefix_to_maybeuninit_buffer<'a>(
    &mut self, 
    buf: &'a mut [MaybeUninit<Self::Item>],
) -> PartialInitDropGuard<'a, Self::Item>;

Based on my explorations that resulted in 52df055, that would likely be optimized far better than the current -> Result<[T; N], IntoIter<T, N>> signature.

But it would also be really annoying if all you wanted was to get an array or panic. So at that point, various wrappers would absolutely make sense to avoid a bunch of boilerplate for the easy cases.

On the other hand, maybe the easy way will always just be to collect into an ArrayVec (rust-lang/rfcs#3316), rather than going directly to an array at all. So it's hard to say we should add more things right now, IMHO. (I'm not libs-api, though, so it's not my call 🙃)

@emilHof
Copy link
Author

emilHof commented Feb 14, 2023

So, it's pretty rare that we have the same thing exposed in two ways.

That is a detail I not considered, or really known about at all, but a good point. It may be to some extent due to me not having much experience using/resorting to next_chunk, as in the past (not so past as I am pretty new) I have usually just turned to unsafe. In a way that of course makes sense, as next_chunk is not yet stable and relatively new to the party. That the cases of the same functionality being exposed in more than one way are rare is quite sensible, since discoverability may not be much of an issue with already somewhat niche use cases.

But that's the opposite of try_from_iter -- the next_chunk version is already the one that allows chaining.

That pattern, of having an associated function first and then deriving (and maybe even specializing) a chain-able method is interesting. That was one of the things I enjoyed when first discovering Rust, the ability it gives you to do this functional-esque chaining. And them from_iter being slightly different but really just, as you said, because it can make typing easier/cleaner! Another really good point overall!

as it's in-scope

That is also very valid and something I had not thought of!

Based on my explorations that resulted in 52df055, that would likely be optimized far better than the current -> Result<[T; N], IntoIter<T, N>> signature.

Yess, I remember you mentioning that I your write up here #107634. Are you suggesting that the existing next_chunk method, while unstable, could be changed to adapt this:

fn write_prefix_to_maybeuninit_buffer<'a>(
    &mut self, 
    buf: &'a mut [MaybeUninit<Self::Item>],
) -> PartialInitDropGuard<'a, Self::Item>;

sort of interface? That would be a very interesting change to make, and implement, though would it not mean that next_chunk or what it name may be then, would be downgraded to be unsafe just by the very nature of this signature?

Just so I understand, Result<[T; N], ...> tends to optimize to poorly, so having an efficient, yet precarious unsafe method exposed would be useful. And then wrapping something and making it safe would be the next step? Would this really be enough, in your opinion, to warrant a change to the existing next_chunk API, if this is what you are implying? Is the speed up this significant?

Also, I really want to thank you for your super thoughtful response and patience with me haha!

@bors
Copy link
Contributor

bors commented Apr 22, 2023

☔ The latest upstream changes (presumably #106934) made this pull request unmergeable. Please resolve the merge conflicts.

@scottmcm scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 14, 2023
@scottmcm
Copy link
Member

As described above, I don't think I'm confident enough in this API to add it on my own initiative (not being a libs-api member), so I'm going to ask that you file an ACP https://std-dev-guide.rust-lang.org/feature-lifecycle/api-change-proposals.html to get it seconded if you're still interested in adding it.

@emilHof
Copy link
Author

emilHof commented May 15, 2023

hey @scottmcm, thank you for looking and talking over this PR with me! the points you raised were all extremely valid! it makes complete sense that you would not want to make a decisions such as this! thank you for giving me the option of the ACP, i'll look into this !!

i really really appreciate your feedback and views on this! :))

@Dylan-DPC Dylan-DPC added S-waiting-on-ACP Status: PR has an ACP and is waiting for the ACP to complete. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 19, 2023
@emilHof
Copy link
Author

emilHof commented May 25, 2023

Okie @scottmcm, so after following the guide (hopefully correctly), this is the ACP. hope this all looks correct, but definitely let me know if i made some errors with the general form of the document or in the details of the ACP! :)

@scottmcm scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-ACP Status: PR has an ACP and is waiting for the ACP to complete. labels Aug 16, 2023
@scottmcm
Copy link
Member

Hmm, so the ACP ended up with

Marking this ACP as accepted in the sense that we do want conversion APIs. But we also think further discussion (e.g. on the internals forum) is needed to find the right set of APIs, so the linked PR shouldn't necessarily land as proposed.

So I'm not really sure what should happen here. I guess it's waiting on more exploration of alternatives before it can land?

@Dylan-DPC
Copy link
Member

so since the ACP wasn't accepted, and no new ACP is created, it is better to close this till a new ACP is created, else it will remain stagnant.

@Dylan-DPC Dylan-DPC closed this Oct 18, 2023
@Dylan-DPC Dylan-DPC removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants