Add non-allocating sort function to libcore #790
Comments
I have no experience with Rust, but I currently maintain a C++ sorting library, and I couldn't help but notice this issue about sorting. My library includes adapted versions of the GrailSort and WikiSort mentioned in the previous issue, which are the only O(n log n) in-place stable algorithms I know of that can work without allocating memory (even though allocating a fixed buffer is better for both of them). You know how it is: benchmarks are full of surprises, and I couldn't reproduce the patterns for which GrailSort is better. In all my test cases, WikiSort beats GrailSort (this might come from my C++ port of GrailSort, though), even when GrailSort is given bigger buffers for the merge operations.

Now, as mentioned in the previous issue, WikiSort is complex. However, it can be made a bit simpler. I managed to replace some of the operations with algorithms from the C++ standard library (I don't know whether Rust has such algorithms). The embedded sort for 3 values can be replaced by a call to a vanilla insertion sort. IME, sorting networks are only worth it with integers and when carefully implemented, and those in WikiSort are even more complex in order to maintain stability. I haven't tried it yet, but I believe that the whole sorting-network part can be replaced by an insertion sort too. Moreover, if the plan is to use no additional memory at all, not even a fixed-size buffer, the algorithm can be simplified a bit further. That said, even with those modifications, I think it's still around 600 LOC, and it does lose some performance when it doesn't have even a fixed-size buffer.

If you ever decide that you don't care about stability, or choose to provide separate stable and unstable sorting algorithms, then pattern-defeating quicksort is the best in-place sorting algorithm I know of that does not use any additional memory, and it's around 300 LOC. It's generally faster than introsort and may make its way into libc++ and libstdc++.
Update: a new version of pattern-defeating quicksort avoids branch mispredictions and can consequently be about twice as fast as introsort, but the algorithm is more complex (it borrows ideas from "BlockQuicksort: How Branch Mispredictions don't affect Quicksort"). Hope that will help you to choose the appropriate sorting algorithm :)
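For reference, the "vanilla insertion sort" suggested above as a replacement for WikiSort's small-input code might look roughly like this in Rust. This is only a minimal sketch: the name `insertion_sort_by` and the `is_less` parameter are made up for illustration and are not taken from WikiSort or from any Rust library. It is stable, works in place and never allocates, at the price of O(n^2) comparisons in the worst case, so it only makes sense for short runs.

```rust
/// Stable, in-place insertion sort that performs no heap allocation.
/// `is_less(a, b)` must return true when `a` has to come strictly before `b`.
/// (Hypothetical helper written for this sketch, not an existing API.)
pub fn insertion_sort_by<T, F>(v: &mut [T], mut is_less: F)
where
    F: FnMut(&T, &T) -> bool,
{
    for i in 1..v.len() {
        let mut j = i;
        // Move v[i] left only past strictly greater elements, so that
        // elements comparing equal keep their original relative order.
        while j > 0 && is_less(&v[j], &v[j - 1]) {
            v.swap(j, j - 1);
            j -= 1;
        }
    }
}
```

Because the only mutation is `swap`, every element is present in the slice exactly once at all times, whatever the comparator does.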
The algorithm should make sure destructors run once if the comparator panics. The current algorithm in libcollections accounts for this.
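One way to check that requirement is to count destructor calls around a sort whose comparator panics. The sketch below is an illustration, not the test used by the standard library: `Tracked` and `drops_after_panicking_sort` are hypothetical names, and it exercises today's `slice::sort_by` from `std` rather than the libcollections code of the time.

```rust
use std::cell::Cell;
use std::panic::{self, AssertUnwindSafe};
use std::rc::Rc;

/// Element that records every run of its destructor in a shared counter.
struct Tracked {
    key: u32,
    drops: Rc<Cell<usize>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Sorts `n` elements (initially in descending order) with a comparator
/// that panics on its `panic_at`-th call. Returns whether the sort
/// panicked and how many destructors ran once the vector was dropped.
pub fn drops_after_panicking_sort(n: u32, panic_at: usize) -> (bool, usize) {
    let drops = Rc::new(Cell::new(0usize));
    let mut v: Vec<Tracked> = (0..n)
        .rev()
        .map(|key| Tracked { key, drops: Rc::clone(&drops) })
        .collect();

    let mut calls = 0usize;
    // Catch the panic so the vector can be inspected afterwards.
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        v.sort_by(|a, b| {
            calls += 1;
            if calls == panic_at {
                panic!("comparator failed on purpose");
            }
            a.key.cmp(&b.key)
        });
    }));

    let panicked = result.is_err();
    drop(v); // Each element must be dropped exactly once here.
    (panicked, drops.get())
}
```

If the sort is panic-safe, the returned drop count equals `n`: no element is duplicated or lost, although the order after a panic is unspecified.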
I independently came to the conclusion that it would be nice to be able to do this. Is this something that is still open for discussion, and does anyone know why the current algorithm was chosen?
IIRC we wanted something stable and without an n^2 worst case.
The grailsort and wikisort algorithms talked about in the previous issue seem to match those criteria.
The pattern-defeating quicksort |
It isn't stable though. |
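To make the stability point concrete, here is a small sketch using the two sort families that current Rust exposes on slices, `sort_by_key` (stable) and `sort_unstable_by_key` (unstable, documented as in-place and non-allocating). These methods postdate this thread, so their use here is an assumption about a modern toolchain; the wrapper names `by_key_stable` and `by_key_unstable` are made up for the example.

```rust
/// Sorts records by their numeric key with the stable sort:
/// records with equal keys keep their input order.
pub fn by_key_stable(mut records: Vec<(u8, &'static str)>) -> Vec<(u8, &'static str)> {
    records.sort_by_key(|r| r.0);
    records
}

/// Sorts records by their numeric key with the unstable sort:
/// keys end up ordered, but records with equal keys may be reordered.
pub fn by_key_unstable(mut records: Vec<(u8, &'static str)>) -> Vec<(u8, &'static str)> {
    records.sort_unstable_by_key(|r| r.0);
    records
}
```

With the input `[(2, "a"), (1, "b"), (2, "c"), (1, "d")]`, the stable version is guaranteed to return `"b"` before `"d"` and `"a"` before `"c"`; the unstable version only guarantees that the keys come out as 1, 1, 2, 2.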
CS.SE Q&A here with good background. |
If we can't do it as Stability is obviously important, in both the sorting and backwards-compatibility senses. But there are lots of things for which you don't care about stability, and our current implementations will grab a lot of RAM for large arrays (at least, per the documentation). You could not feasibly sort a 1 GB vector with stdlib unless you're operating on an atypically powerful machine. If you do and it fails, it panics; you can't even catch the error. If we can't add an allocating one, we should at least add one which returns Perhaps allocating is not actually as big a deal as I think, but adding these methods to stdlib seems to have very little long-term cost. Obviously, the best approach is
To put some numbers into perspective, I did my best at implementing both stable and unstable sorts in Rust. The stable sort is a variant of timsort, which recently got merged into libstd. The differences in performance are pretty interesting. |
Non-allocating |
Issue by mahkoh
Saturday Nov 22, 2014 at 22:45 GMT
For earlier discussion, see rust-lang/rust#19221
This issue was labelled with: A-collections, A-libs, I-enhancement in the Rust repository