RFC: Add `is_sorted` to the standard library
#2351
text/0000-is-sorted.md
Outdated
overhead while writing *and* reading the code.

In [the corresponding issue on the main repository](https://github.com/rust-lang/rust/issues/44370)
(from which I will reference a few comments) everyone seems to agree on the
s/from which I will reference a few comments/from which a few comments are referenced/
text/0000-is-sorted.md
Outdated
[unresolved]: #unresolved-questions


### Is `Iterator::is_sorted_by_key` useless?
Yes, I think so, precisely for the reason you mention.
We already have quite a few methods in `Iterator`, and I think that additions should only be made if they increase readability and help you express things that are quite tedious and unidiomatic to express otherwise. In this case I think `iter.map(extract).is_sorted()` is better than `iter.is_sorted_by_key(extract)`.
If there is popular demand for this later we can always add it then, but for now I think it is premature.
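To make the comparison concrete, here is a minimal sketch of how a by-key check reduces to `map` followed by a plain sortedness check. The free functions `is_sorted` and `is_sorted_by_key` below are hypothetical stand-ins written for illustration (with an `Ord` bound, as in the RFC draft under review), not the methods proposed by the RFC.

```rust
/// Hypothetical stand-in for the proposed `Iterator::is_sorted`:
/// returns true if no element is greater than its successor.
fn is_sorted<I>(iter: I) -> bool
where
    I: IntoIterator,
    I::Item: Ord,
{
    let mut iter = iter.into_iter();
    let mut prev = match iter.next() {
        Some(first) => first,
        None => return true, // an empty sequence is trivially sorted
    };
    for next in iter {
        if prev > next {
            return false;
        }
        prev = next;
    }
    true
}

/// Hypothetical by-key variant, expressed purely as `map` + `is_sorted`.
fn is_sorted_by_key<I, K, F>(iter: I, extract: F) -> bool
where
    I: IntoIterator,
    K: Ord,
    F: FnMut(I::Item) -> K,
{
    is_sorted(iter.into_iter().map(extract))
}
```

Under these assumptions the by-key function adds no capability of its own; it only saves the caller from writing the `map` call.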
text/0000-is-sorted.md
Outdated
C::Item: Ord,
```

This can be seen as a better design as it avoids the question about which data
Perhaps if Rust had a built-in operator for function composition and made point-free notation easy, but that is not the case, so let's keep the RFC as is.
text/0000-is-sorted.md
Outdated
fn is_sorted_by<F>(mut self, compare: F) -> bool
where
    F: FnMut(&Self::Item, &Self::Item) -> Ordering,
What is the use case for `FnMut` here as opposed to `Fn`?
`FnMut` is less restrictive than `Fn` in where-clauses
Yes, I am aware. But just because you can do something doesn't mean you ought to do something. So I'm wondering if there are any good reasons why you should be able to pass in a stateful closure.
Ah, okay. But this would apply to almost all libstd functions that take closures, then. I don't think any method uses an `Fn` bound when `FnMut` suffices. This also applies to all sort methods we already have, so consistency is another argument for `FnMut`.
Yep, that's exactly the reason I chose `FnMut`: it's everywhere already. In particular: `[T]::sort_by_key`.
Do you think I should mention this in the RFC? Maybe that would be a good idea to prevent future readers from asking the same question about `FnMut`...
Oh, and just for the record: I wondered about the usage of `FnMut`s, too. Exactly for the "should be stateless" reason you, @Centril, mentioned.
So I knew about the consistency reason beforehand, but I think that we should decide on this with a rationale more than "this is what we've done in other places", unless there's some other place where rationale has been given(?).
@Centril why should a function deliberately choose a more restrictive `Fn` than it needs, unless it wants to open itself to implementation changes that may require those restrictions?
The rules for picking Fn traits are simple:
- If you only need to call it once, ask for `FnOnce`.
- If you call it multiple times but do not require reentrancy, ask for `FnMut`.
- If you need to call it concurrently, ask for `Fn`.
...and I rather doubt the stdlib needs to keep itself open to the possibility of changing this function to run in multiple threads.
By requiring `Fn` you make it difficult for the user to do things like caching and memoization, forcing people to use RefCell and thereby deferring compile-time borrow checks to runtime. There is no upside.
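As an illustration of the point about stateful closures, the sketch below passes a comparator that mutates captured state to a function with an `FnMut` bound. The slice-based `is_sorted_by` and the helper `count_comparisons` are hypothetical names invented for this example, not the RFC's implementation.

```rust
use std::cmp::Ordering;

/// Hypothetical slice-based `is_sorted_by` using the `FnMut` bound
/// discussed in this thread.
fn is_sorted_by<T, F>(slice: &[T], mut compare: F) -> bool
where
    F: FnMut(&T, &T) -> Ordering,
{
    slice
        .windows(2)
        .all(|pair| compare(&pair[0], &pair[1]) != Ordering::Greater)
}

/// Counts how many comparisons were made by mutating a captured counter.
fn count_comparisons(values: &[i32]) -> (bool, usize) {
    let mut calls = 0;
    let sorted = is_sorted_by(values, |a, b| {
        calls += 1; // mutating captured state is only possible with `FnMut`
        a.cmp(b)
    });
    (sorted, calls)
}
```

With an `Fn` bound, the closure in `count_comparisons` would be rejected by the compiler, and the counter would have to move into something like a `Cell` or `RefCell`.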
@ExpHP great arguments, so FnMut it is.
text/0000-is-sorted.md
Outdated
> ```rust
> fn is_sorted(self) -> bool
> where
>     Self::Item: Ord,
Why `Ord` instead of `PartialOrd`? Testing if `a <= b <= c <= ... <= z` is a perfectly well-defined operation on a partially-ordered set, and to give floating point numbers the shaft yet again even for something as innocent as this just seems spiteful.
(note: I'm anthropomorphizing the std lib here; not calling you spiteful)
Good point, sorry for not properly thinking about that beforehand. So the comparator for the `_by` versions may also return an `Option<Ordering>`, right? And how do we want to handle a `None` then? I might be too sleepy to think... but it's not immediately obvious whether `[1.0, NaN, 2.0]` should be considered ordered or not, right?
Hm. This is a predicament.
If this existed in a vacuum, then I would say the comparator should produce `bool`, because an `<=` operation really is the most natural definition for a partial ordering. `Option<Ordering>` seems to me to be a sort of an allowance to make deriving Ord from PartialOrd possible.
But obviously, this does not exist in a vacuum, and so a bool comparator could create unnecessary confusion for users. So yes, I suppose we should try `Option<Ordering>`.
Re: `[1.0, NaN, 2.0]`, I am about to post a much larger comment on this, but in summary, I am of the stance that there is only one sensible implementation of `is_sorted` on a partially ordered set, and this implementation returns `false` for a sequence like `[1.0, NaN, 2.0]`.
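A minimal sketch of that stance, assuming a `PartialOrd` bound and treating an incomparable pair (a `None` from `partial_cmp`) as "not sorted". The function name `is_sorted_partial` is made up for this example.

```rust
use std::cmp::Ordering;

/// Hypothetical `PartialOrd`-based check: every adjacent pair must compare
/// as `Less` or `Equal`; `Greater` or incomparable (`None`) yields false.
fn is_sorted_partial<T: PartialOrd>(slice: &[T]) -> bool {
    slice.windows(2).all(|pair| {
        matches!(
            pair[0].partial_cmp(&pair[1]),
            Some(Ordering::Less) | Some(Ordering::Equal)
        )
    })
}
```

Under this definition `is_sorted_partial(&[1.0, f64::NAN, 2.0])` is `false`, because `1.0` and `NaN` are incomparable, while `is_sorted_partial(&[1.0, 2.0, 2.0])` is `true`.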
In response to my request for Needless to say, there is more than one way that somebody could define The possible implementations of
|
@ExpHP So to summarize, you propose to use the stricter version I'm not quite sure about the "what should the closure return?" issue. I'd love to read more comments on these two questions. I'll change the RFC later :) |
Great comment, @ExpHP! I'd add that I'd expect the following, which also suggests definition 1: is_sorted(a) → ∀i,j (i ≤ j → a_i ≤ a_j ∧ a_i < a_j → i < j) (You do obliquely mention this, but I think it's more interesting as a property-but-not-implementation, since transitivity means a linear implementation provides a quadratic guarantee.) |
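The "linear implementation, quadratic guarantee" remark can be sketched as follows. This assumes the property is read as (i ≤ j → a_i ≤ a_j) ∧ (a_i < a_j → i < j), and the brute-force check deliberately skips the reflexive case i = j, since `NaN <= NaN` is false for floats. Both function names are hypothetical.

```rust
/// Linear check: only adjacent pairs are compared.
fn sorted_linear<T: PartialOrd>(a: &[T]) -> bool {
    a.windows(2).all(|w| w[0] <= w[1])
}

/// Brute-force check of the quoted property over all index pairs.
/// Assumption: the case i == j is skipped for the first conjunct.
fn sorted_all_pairs<T: PartialOrd>(a: &[T]) -> bool {
    let n = a.len();
    (0..n).all(|i| {
        (0..n).all(|j| {
            let first = i >= j || a[i] <= a[j]; // i < j implies a_i <= a_j
            let second = !(a[i] < a[j]) || i < j; // a_i < a_j implies i < j
            first && second
        })
    })
}
```

On inputs such as `[1.0, 2.0, 2.0]`, `[2.0, 1.0]` and `[1.0, NaN, 2.0]` the two functions agree, which is what transitivity of `<=` would lead one to expect.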
Yes, this is my recommendation. |
@ExpHP that pairwise function is essentially itertools' |
@clarcharr Maybe not |
@ExpHP What about |
@eddyb I'm not sure, what makes Comparing the two, I feel

```rust
fn is_sorted<I>(iter: I) -> bool
where I: IntoIterator, I::Item: PartialOrd + Clone,
{
    use itertools::Itertools;
    iter.into_iter().tuple_windows().all(|(a, b)| a < b)
}
```

N.B. my original intent was not to propose |
@ExpHP Do you think |
I am currently leaning towards considering them as an alternative. (i.e. for the Alternatives section). These are my thoughts:
It is perhaps less discoverable than |
is_sorted() is sufficiently common and sufficiently clear to deserve a function. If a programmer writes ".all(|(a, b)| a < b)" instead of ".all(|(a, b)| a <= b)" this could lead to troubles. |
I agree with @leonardo-m: writing Apart from that, I proposed to add |
Fully generic |
I don't think that |
Here are all of the "by key" methods in Iterator and Itertools: (i.e. any method that takes a
None of them could be accomplished with a |
Hmm, I suppose something like

```rust
fn all_tuples<F>(&mut self, mut f: F) -> bool
where Self: Sized, F: FnMut<Self::Item, Output=bool>
{
    self.all(|x| f.call_mut(x))
}
```

|
(and while I will admit that I have needed With the specific way you have written it, the function takes multiple args instead of a tuple, which is neat, but also relies on unstable details about the |
@scottmcm Shouldn't it be possible to get the behaviors you mentioned by using
|
I just read an article about implementing I added a paragraph about this in the RFC. And since the discussion died down in the last month, a quick summary (although there aren't too many comments):
So to me it seems like all issues (that were mentioned) were addressed and there are not really major concerns left. So... maybe FCP? ^_^ |
So I just read the RFC. I wrote a crate that implements
For consistency with The only other comment I have is that the motivation of this RFC is pretty weak:
The most important case is probably implementing a sorting algorithm, since the first thing you want to do there is check whether the elements are sorted, and if the answer is yes, you are done. The check is The vectorized implementations in my library require a long list of unstable features: |
Seems a bit important to me to be left implicit 😕 Well if we assume that the This notational axiom also implies irreflexivity for So maybe it is more fitting to think about
Additionally, two relations are derived:
Why?
I still think it should be fairly obvious what |
I've filed rust-lang/rust#50230 to ask for at least clarification. I agree with you that, at least the way things are currently documented, the requirements (and guarantees) of Please feel free to chime in there with other things that might need to be clarified, like the notational axiom: |
I'd propose to not let the trait bound decision block merging this RFC.
I also think that we should initially implement |
@LukasKalbertodt I think we can leave that as an unresolved question to resolve before stabilization, but whether it requires The most conservative thing to do is to require The ideal thing would be if we could just require |
text/0000-is-sorted.md
Outdated
The lack of an `is_sorted()` function in Rust's standard library has led to
[countless programmers implementing their own](https://github.com/search?l=Rust&q=%22fn+is_sorted%22&type=Code&utf8=%E2%9C%93).
While it is possible to write a one-liner using iterators (e.g.
`(0..arr.len() - 1).all(|i| arr[i] < arr[i + 1])`), it is still unnecessary
This should be a `<=` (thanks @orlp)
That was my secret plan: hide a subtle bug in the "motivation" section that no one notices for months. Now that it has been discovered, it's the perfect motivation that `is_sorted` is certainly needed! Muhaha!
In all seriousness, it happens all the time.
Will fix this bug in the RFC later. It's blocked right now anyway.
Ping, given FCP :)
Done.
Heh, I'd almost love to see a section in the motivation showcasing just how easy it is to make this mistake!
@ExpHP Great idea, done!
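For readers following this thread, the mistake can be sketched like this. The three function names are hypothetical; the first body is the one-liner quoted from the motivation section, the second applies the `<=` fix, and the third is one possible alternative based on `windows`.

```rust
/// The one-liner as originally quoted: `<` rejects equal neighbours.
fn is_sorted_strict(arr: &[i32]) -> bool {
    (0..arr.len() - 1).all(|i| arr[i] < arr[i + 1])
}

/// The same one-liner with the operator corrected to `<=`.
fn is_sorted_fixed(arr: &[i32]) -> bool {
    (0..arr.len() - 1).all(|i| arr[i] <= arr[i + 1])
}

/// Alternative using `windows(2)`, which yields nothing for slices
/// shorter than two elements and so needs no length arithmetic.
fn is_sorted_windows(arr: &[i32]) -> bool {
    arr.windows(2).all(|w| w[0] <= w[1])
}
```

With `[1, 2, 2]`, the strict version reports `false` while the other two report `true`. Note also that both index-based versions compute `arr.len() - 1`, which underflows for an empty slice, so only the `windows` variant is safe to call on `&[]`.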
The libs team discussed this today and we’d like to accept the RFC as-is: three methods for each of slices and iterators, with @rfcbot fcp merge This algorithm is just tricky enough to implement that having it in the standard library seems helpful enough to be worth the added API surface. Some comments have said that adding |
Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
@LukasKalbertodt If we merge LukasKalbertodt/partial-cmp-docs#1 with the proposed tweaks pretty much as is, the blocker here is resolved anyways. So with those semantics merged, what would be the appropriate bound for IIRC:
EDIT: Floats Under a strict partial ordering relation, all of the following would be sorted (that is,
Even in the absence of
because I don't know, my intuition tells me that if we use In
Since we can't implement both ordering using Another example that's similar to floats is a It cannot implement That is, for all permutations of |
I don't believe that |
I mixed |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
@gnzlbg Unfortunately, I currently don't have the time to work on the I think then it would be the best to reboot the |
I would be very surprised when |
The final comment period, with a disposition to merge, as per the review above, is now complete. |
Huzzah! This RFC has been merged! Tracking issue: rust-lang/rust#53485 |
Add the methods `is_sorted`, `is_sorted_by` and `is_sorted_by_key` to `[T]` and `Iterator`.
Rendered
Tracking issue
CC rust-lang/rust#44370
EDIT: I posted a comment summarizing the discussion so far.