Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add partition_point #73577

Merged
merged 10 commits into from
Jun 28, 2020
Merged

Add partition_point #73577

merged 10 commits into from
Jun 28, 2020

Conversation

VillSnow
Copy link
Contributor

Add partition_point in C++.
Although existing binary_search in rust does not suitable when the slice has multiple hits,
this function returns exact point of partition.
The definition of this function is very clear and able to accept general matter, therefore you can easily get index which you want like lower/upper_bound.

rust-lang/rfcs#2184

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @LukasKalbertodt (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 21, 2020
@matklad matklad added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jun 21, 2020
src/libcore/slice/mod.rs Outdated Show resolved Hide resolved
@arthurprs
Copy link
Contributor

Does it make sense for this to return 2 slices instead?

@VillSnow
Copy link
Contributor Author

Does it make sense for this to return 2 slices instead?

I'm not sure. It is easy to convert index to two slices, or two slices to index. There is almost no difference.

The two slices is consistent with Iterator::partition but it is because iterator should not be just consumed.

binary_search returns index and also Iterator::partition_in_place substantially returns index.

I think the index simple and is better.

@Amanieu
Copy link
Member

Amanieu commented Jun 23, 2020

LGTM, can you create a tracking issue and update the stability attribute to point to it?

@VillSnow
Copy link
Contributor Author

I'll be inactive a few days so I make tracking issue after that.

Copy link
Member

@LukasKalbertodt LukasKalbertodt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a potentially useful method. I'm fine with adding it unstably. Just a few inline comments.


while left != right {
let mid = left + (right - left) / 2;
let value = unsafe { self.get_unchecked(mid) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment about why this unsafe is fine. You can probably take the text from binary_search. Also see #66219 (let the comment start with // SAFETY:).

src/libcore/slice/mod.rs Outdated Show resolved Hide resolved
Comment on lines 2667 to 2673
/// Returns index of partition point according to the given predicate,
/// such that all those that return true precede the index and
/// such that all those that return false succeed the index.
///
/// The slice must be partitioned
/// so that all elements where the predicate returns true
/// precede the elements where the predicate returns false.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this text could be improved. In particular, I had to check the example to understand whether the returned index is the last index of the first half or the first index of the second half. One suggestion:

Returns the index of the partition point according to the given predicate (the index of the first element of the second partition).

The slice is assumed to be partitioned according to the given predicate. This means that all elements for which the predicate returns true are at the start of the slice and all elements for which the predicate returns false are at the end. For example, [7, 15, 3, 5, 4, 12, 6] is a partitioned under the predicate x % 2 != 0 (all odd numbers are at the start, all even at the end).

If this slice is not partitioned, the returned result is unspecified and meaningless, as this method performs a kind of binary search.

@LukasKalbertodt LukasKalbertodt added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 24, 2020
@EFanZh
Copy link
Contributor

EFanZh commented Jun 27, 2020

There is a faster binary search algorithm implemented in #45333 used by binary_search_by. Also, if this gets merged, there will be two pieces of binary search codes. Maybe implement the binary search algorithm in another function and have both binary_search_by and partition_point calling it?

@VillSnow
Copy link
Contributor Author

Though I'm not sure which is better Amanieu's idea or LukasKalbertodt 's idea, I adopted LukasKalbertodt's idea for consistence

@VillSnow
Copy link
Contributor Author

@EFanZh I don't think the old binary_search code is same as my code and rather similar to new code.
I haven't tested which is faster my code or binary_search code.

Yes binary_search and partition_point can use common base function and should do in the furure. My first objective was to include pertiton_point in std.

src/libcore/slice/mod.rs Outdated Show resolved Hide resolved
Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>
@Amanieu
Copy link
Member

Amanieu commented Jun 28, 2020

@bors r+

@bors
Copy link
Contributor

bors commented Jun 28, 2020

📌 Commit 6f8ad3b has been approved by Amanieu

@bors bors removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 28, 2020
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jun 28, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 28, 2020
…arth

Rollup of 9 pull requests

Successful merges:

 - rust-lang#73577 (Add partition_point)
 - rust-lang#73757 (Const prop: erase all block-only locals at the end of every block)
 - rust-lang#73774 (Make liveness more precise for assignments to fields)
 - rust-lang#73795 (Add some `const_compare_raw_pointers`-related regression tests)
 - rust-lang#73800 (Forward Hash::write_iN to Hash::write_uN)
 - rust-lang#73813 (Rename two `Resolver` traits)
 - rust-lang#73817 (Rename clashing_extern_decl to clashing_extern_declarations.)
 - rust-lang#73826 (Fix docstring typo)
 - rust-lang#73833 (Remove GlobalCtxt::enter_local)

Failed merges:

r? @ghost
@bors bors merged commit ec48989 into rust-lang:master Jun 28, 2020
// SAFETY:
// When left < right, left <= mid < right.
// Therefore left always increases and right always decreases,
// and eigher of them is selected.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think eigher is a typo? It should be either :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops thank you.

/// For example, [7, 15, 3, 5, 4, 12, 6] is a partitioned under the predicate x % 2 != 0
/// (all odd numbers are at the start, all even at the end).
///
/// If this slice is not partitioned, the returned result is unspecified and meaningless,
Copy link
Contributor

@Folyd Folyd Jul 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// If this slice is not partitioned, the returned result is unspecified and meaningless,
/// as this method performs a kind of binary search.

I think we'd better change to "If this slice is not sorted ..." as we actually want to tell the end-user an ordered slice is required for this method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is right -- the slice has to be partitioned, and not ordered. Raison d'être of partition_point is that it works with non-Ord things. To make this more precise, we might link to https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.is_partitioned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed in 95% cases partition_point is used to get lower/upper_bound. However in 5% cases it looks more partition point than lower bound.
For example, we can get all valid values from [3, 1, 4, 1, 5, -1, -1, -1, -1, -1].
Actually this array is sorted by |&x| x != -1 in descending order (true is larger than false), but many user would think that the array is partitioned at index 5 by whether -1 or not.

What do you think about adding like this?

Sorted slice in ascending order is always partitioned by whether less than any value, and in descending order is always partitioned by whether greater than any value. For example, [1, 2, 4, 8, 16] is partitioned by |&x| x < -10, |&x| x < 2, |&x| x < 5, and |&x| x < -100 and other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the kind explanation and sorry for my misunderstanding. Yes, the slice requires partitioned instead of sorted.

However, I still confused by the comment If this slice is not partitioned, the returned result is unspecified and meaningless. If we did call this method on a not partitioned slice, the result should be an Option<size> rather than an unspecified or meaningless value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Binary search runs in O(log n) by assuming ordered. If it checks whether is sorted, it must slower than O(n).

@Folyd
Copy link
Contributor

Folyd commented Jul 3, 2020

Hi @VillSnow. Thanks for this great PR. However, I found two things that deserved to improve. Please check it out. Thanks! 😺

@VillSnow VillSnow mentioned this pull request Jul 4, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Feb 12, 2021
… r=matklad

Stabilize the partition_point feature

Stabilize the partition_point feature.
Tracking Issue: rust-lang#73831
First PR: rust-lang#73577
@cuviper cuviper added this to the 1.46 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.