Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PagedMergeSort, a merge sort using O(√n) space #71

Merged
merged 49 commits into from
Aug 18, 2023
Merged
Show file tree
Hide file tree
Changes from 48 commits
Commits
Show all changes
49 commits
Select commit Hold shift + click to select a range
8a63108
Add PagedMergeSort
LSchwerdt Feb 5, 2023
e0ebe94
Update references
LSchwerdt Feb 7, 2023
0d4aa65
Return sorted vectors in sort!
LSchwerdt Feb 7, 2023
4d6e518
Make midpoint() backwards compatible
LSchwerdt Feb 7, 2023
58b06fb
Use PagedMergeSort when calling ThreadedPagedMergeSort in version < 1.3
LSchwerdt Feb 7, 2023
ce9ad62
Improve style
LSchwerdt Feb 10, 2023
66315f7
Add PagedMergeSort
LSchwerdt Feb 5, 2023
df33439
Update references
LSchwerdt Feb 7, 2023
b33b4e8
Return sorted vectors in sort!
LSchwerdt Feb 7, 2023
a5d5cea
Make midpoint() backwards compatible
LSchwerdt Feb 7, 2023
c2ff440
Use PagedMergeSort when calling ThreadedPagedMergeSort in version < 1.3
LSchwerdt Feb 7, 2023
2da1f83
Improve style
LSchwerdt Feb 10, 2023
d6f40db
Rebase and apply initial optimizations
LSchwerdt Feb 12, 2023
a25003e
Remove unused variable in function signatures
LSchwerdt Feb 12, 2023
88e5b22
Add algorithms to readme.md
LSchwerdt Feb 12, 2023
436eb79
Rename _midpoint to midpoint
LSchwerdt Feb 12, 2023
dad2a2c
Clarify comments in twoended_merge!
LSchwerdt Feb 12, 2023
cd72df0
Remove unnecessary @inbounds in twoended_merge!
LSchwerdt Feb 12, 2023
90c0ca6
Explain why twoended_merge! is used in comment
LSchwerdt Feb 12, 2023
7841b6a
Use midpoint instead of _midpoint
LSchwerdt Feb 12, 2023
2465073
Merge remote-tracking branch 'LSchwerdt/pagedmergesort' into pagedmer…
LSchwerdt Feb 12, 2023
dab157f
Fix stability of twoended_merge!
LSchwerdt Feb 12, 2023
ea4eb9a
Change function argument locations to match base.Sort
LSchwerdt Feb 12, 2023
d9fcaef
Fix midpoint calculation to always round down
LSchwerdt Feb 12, 2023
7b2a653
Make variable names and function signatures more consistent
LSchwerdt Feb 13, 2023
297348f
Add compat for StaticArrays
LSchwerdt Feb 14, 2023
b51c7ce
Improve documentation
LSchwerdt Feb 14, 2023
72a9550
Bump version number
LSchwerdt Feb 14, 2023
47722cc
Add extra tests for stable sorting algorithms
LSchwerdt Feb 14, 2023
256d1ff
Change extra tests to not use all presorted inputs
LSchwerdt Feb 14, 2023
a45a054
Simplify twoended_merge!
LSchwerdt Feb 16, 2023
f66e799
Use function for core merging loop
LSchwerdt Mar 19, 2023
bb4e706
Replace getNextBlock! macro by function
LSchwerdt Mar 22, 2023
0d9e23f
Reduce number of arguments of next_page!()
LSchwerdt Mar 22, 2023
cb89f11
Refactor page permutation
LSchwerdt Mar 23, 2023
5bcdbb8
Lilith's refactor
Apr 10, 2023
5dab2a1
Apply @LSchwerdt's suggestions
May 28, 2023
bfbe0cf
Merge pull request #1 from LilithHafner/pagedmergesort-lilith-refactor
LSchwerdt May 28, 2023
71fc310
Simplify check for input lengths in paged_merge!
LSchwerdt Aug 11, 2023
2c9ef24
Simplifcations
LSchwerdt Aug 11, 2023
4bbc185
Fix indentation
LSchwerdt Aug 11, 2023
3a8a53b
Remove superfluous check for small inputs in threaded_pagedmergesort!
LSchwerdt Aug 11, 2023
a783092
Improve formatting
LSchwerdt Aug 11, 2023
d3ea719
Remove ThreadedPagedMergesort
LSchwerdt Aug 12, 2023
3f337f0
Use the name "scratch" instead of "buf" or "t"
LSchwerdt Aug 12, 2023
56a609c
Merge branch 'master' into pagedmergesort
LilithHafner Aug 13, 2023
d2ea0a7
Use custom _unsafe_copyto! instead of copyto!
LSchwerdt Aug 14, 2023
717f9ae
Fix: pageLocations is initialized with a negative number of elements
LSchwerdt Aug 14, 2023
d367ee8
Add test for small input without initial optimizations
LSchwerdt Aug 17, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,12 @@ The `SortingAlgorithms` package provides three sorting algorithms that can be us
- [HeapSort] – an unstable, general purpose, in-place, O(n log n) comparison sort that works by heapifying an array and repeatedly taking the maximal element from the heap.
- [TimSort] – a stable, general purpose, hybrid, O(n log n) comparison sort that adapts to different common patterns of partially ordered input data.
- [CombSort] – an unstable, general purpose, in-place, O(n log n) comparison sort with O(n^2) pathological cases that can attain good efficiency through SIMD instructions and instruction level parallelism on modern hardware.
- [PagedMergeSort] – a stable, general purpose, O(n log n) time and O(sqrt n) space comparison sort.

[HeapSort]: https://en.wikipedia.org/wiki/Heapsort
[TimSort]: https://en.wikipedia.org/wiki/Timsort
[CombSort]: https://en.wikipedia.org/wiki/Comb_sort
[PagedMergeSort]: https://link.springer.com/chapter/10.1007/BFb0016253

## Usage

Expand Down
Binary file added docs/pagedMerge_130_130.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
279 changes: 278 additions & 1 deletion src/SortingAlgorithms.jl
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,13 @@ using Base.Order
import Base.Sort: sort!
import DataStructures: heapify!, percolate_down!

export HeapSort, TimSort, RadixSort, CombSort
export HeapSort, TimSort, RadixSort, CombSort, PagedMergeSort

struct HeapSortAlg <: Algorithm end
struct TimSortAlg <: Algorithm end
struct RadixSortAlg <: Algorithm end
struct CombSortAlg <: Algorithm end
struct PagedMergeSortAlg <: Algorithm end

function maybe_optimize(x::Algorithm)
isdefined(Base.Sort, :InitialOptimizations) ? Base.Sort.InitialOptimizations(x) : x
Expand Down Expand Up @@ -51,6 +52,34 @@ Characteristics:
"""
const CombSort = maybe_optimize(CombSortAlg())

"""
PagedMergeSort

Indicates that a sorting function should use the paged merge sort
algorithm. Paged merge sort uses is a merge sort, that uses different
merge routines to achieve stable sorting with a scratch space of size O(√n).
The merge routine for merging large subarrays merges
pages of size O(√n) almost in place, before reordering them using a page table.
At deeper recursion levels, where the scratch space is big enough,
normal merging is used, where one input is copied into the scratch space.
When the scratch space is large enough to hold the complete subarray,
the input is merged interleaved from both sides, which increases performance
for random data.

Characteristics:
- *stable*: does preserve the ordering of elements which
compare equal (e.g. "a" and "A" in a sort of letters which
ignores case).
- *`O(√n)`* auxilary memory usage.
- *`O(n log n)`* garuanteed runtime.

## References
- Dvořák, S., Ďurian, B. (1986). Towards an efficient merging. In: Gruska, J., Rovan, B., Wiedermann,
J. (eds) Mathematical Foundations of Computer Science 1986. MFCS 1986. Lecture Notes in Computer Science, vol 233.
Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0016253
- https://max-arbuzov.blogspot.com/2021/10/merge-sort-with-osqrtn-auxiliary-memory.html
"""
const PagedMergeSort = maybe_optimize(PagedMergeSortAlg())

## Heap sort

Expand Down Expand Up @@ -652,4 +681,252 @@ else
end
end

###
# PagedMergeSort
###

# unsafe version of copyto!
# as workaround for https://github.com/JuliaLang/julia/issues/50900
function _unsafe_copyto!(dest, doffs, src, soffs, n)
@inbounds for i in 0:n-1
dest[doffs + i] = src[soffs + i]
end
dest
end

function _unsafe_copyto!(dest::Array, doffs, src::Array, soffs, n)
unsafe_copyto!(dest, doffs, src, soffs, n)
end

# merge v[lo:m] and v[m+1:hi] ([A;B]) using scratch[1:1+hi-lo]
# This is faster than merge! but requires twice as much auxiliary memory.
function twoended_merge!(v::AbstractVector{T}, lo::Integer, m::Integer, hi::Integer, o::Ordering, scratch::AbstractVector{T}) where T
@assert lo ≤ m ≤ hi
@assert abs((m-lo) - (hi-(m+1))) ≤ 1 "twoended_merge! only supports balanced merges"
len = 1 + hi - lo
# input array indices
a_lo = lo
a_hi = m
b_lo = m + 1
b_hi = hi
# output array indices
k_lo = 1
k_hi = len
@inbounds begin
# two ended merge
while k_lo <= len ÷ 2
if lt(o, v[b_lo], v[a_lo])
scratch[k_lo] = v[b_lo]
b_lo += 1
else
scratch[k_lo] = v[a_lo]
a_lo += 1
end
k_lo +=1
if !lt(o, v[b_hi], v[a_hi])
scratch[k_hi] = v[b_hi]
b_hi -= 1
else
scratch[k_hi] = v[a_hi]
a_hi -= 1
end
k_hi -=1
end
# if the input length is odd,
# one item remains
if a_lo <= a_hi
scratch[k_lo] = v[a_lo]
elseif b_lo <= b_hi
scratch[k_lo] = v[b_lo]
end
# copy back from t to v
offset = lo-1
for i = 1:len
v[offset+i] = scratch[i]
end
end
end

# core merging loop used throughout PagedMergeSort
Base.@propagate_inbounds function merge!(f::Function,
target::AbstractVector{T}, source_a::AbstractVector{T}, source_b::AbstractVector{T},
o::Ordering, a::Integer, b::Integer, k::Integer) where T
@inbounds while f(a,b,k)
if lt(o, source_b[b], source_a[a])
target[k] = source_b[b]
b += 1
else
target[k] = source_a[a]
a += 1
end
k += 1
end
a,b,k
end

# merge v[lo:m] and v[m+1:hi] using scratch[1:1+m-lo]
# based on Base.Sort MergeSort
function merge!(v::AbstractVector{T}, lo::Integer, m::Integer, hi::Integer, o::Ordering, scratch::AbstractVector{T}) where {T}
_unsafe_copyto!(scratch, 1, v, lo, m - lo + 1)
f(_, b, k) = k < b <= hi
a, b, k = merge!(f, v, scratch, v, o, 1, m + 1, lo)
_unsafe_copyto!(v, k, scratch, a, b - k)
end

struct Pages
current::Int # current page being merged into
currentNumber::Int # number of current page (=index in pageLocations)
nextA::Int # next possible page in A
nextB::Int # next possible page in B
end

next_page_A(pages::Pages) = Pages(pages.nextA, pages.currentNumber + 1, pages.nextA + 1, pages.nextB)
next_page_B(pages::Pages) = Pages(pages.nextB, pages.currentNumber + 1, pages.nextA, pages.nextB + 1)

Base.@propagate_inbounds function next_page!(pageLocations, pages, pagesize, lo, a)
if a > pages.nextA * pagesize + lo
pages = next_page_A(pages)
else
pages = next_page_B(pages)
end
pageLocations[pages.currentNumber] = pages.current
pages
end

Base.@propagate_inbounds function permute_pages!(f, v, pageLocations, page_offset, pagesize, page)
while f(page)
plc = pageLocations[page-3] # plc has data belonging to page
pageLocations[page-3] = page
_unsafe_copyto!(v, page_offset(page) + 1, v, page_offset(plc) + 1, pagesize)
page = plc
end
page
end

# merge v[lo:m] (A) and v[m+1:hi] (B) using scratch[] in O(sqrt(n)) space
function paged_merge!(v::AbstractVector{T}, lo::Integer, m::Integer, hi::Integer, o::Ordering, scratch::AbstractVector{T}, pageLocations::AbstractVector{<:Integer}) where {T}
@assert lo < m < hi
lenA = 1 + m - lo
lenB = hi - m

# this function only supports merges with length(A) <= length(B),
# which is guaranteed by pagedmergesort!
@assert lenA <= lenB

# regular merge if scratch is big enough
lenA <= length(scratch) && return merge!(v, lo, m, hi, o, scratch)

len = lenA + lenB
pagesize = isqrt(len)
nPages = len ÷ pagesize # a partial page at the end does not count
@assert length(scratch) >= 3pagesize
@assert length(pageLocations) >= nPages - 3

@inline page_offset(page) = (page - 1) * pagesize + lo - 1

@inbounds begin
##################
# merge
##################
# merge the first 3 pages into scratch
a, b, _ = merge!((_, _, k) -> k <= 3pagesize, scratch, v, v, o, lo, m + 1, 1)
# initialize variables for merging into pages
pages = Pages(-17, 0, 1, (m - lo) ÷ pagesize + 2) # first argument is unused
# more efficient loop while more than pagesize elements of A and B are remaining
while_condition1(offset) = (_, _, k) -> k <= offset + pagesize
while a < m - pagesize && b < hi - pagesize
pages = next_page!(pageLocations, pages, pagesize, lo, a)
offset = page_offset(pages.current)
a, b, _ = merge!(while_condition1(offset), v, v, v, o, a, b, offset + 1)
end
# merge until either A or B is empty or the last page is reached
k, offset = nothing, nothing
while_condition2(offset) = (a, b, k) -> k <= offset + pagesize && a <= m && b <= hi
while a <= m && b <= hi && pages.currentNumber + 3 < nPages
pages = next_page!(pageLocations, pages, pagesize, lo, a)
offset = page_offset(pages.current)
a, b, k = merge!(while_condition2(offset), v, v, v, o, a, b, offset + 1)
end
# if the last page is reached, merge the remaining elements into the final partial page
if pages.currentNumber + 3 == nPages && a <= m && b <= hi
a, b, k = merge!((a, b, _) -> a <= m && b <= hi, v, v, v, o, a, b, nPages * pagesize + lo)
_unsafe_copyto!(v, k, v, a <= m ? a : b, hi - k + 1)
else
use_a = a <= m
# copy the incomplete page
partial_page_size = offset + pagesize - k + 1
_unsafe_copyto!(v, k, v, use_a ? a : b, partial_page_size)
use_a && (a += partial_page_size)
use_a || (b += partial_page_size)
# copy the remaining full pages
while use_a ? a <= m - pagesize + 1 : b <= hi - pagesize + 1
pages = next_page!(pageLocations, pages, pagesize, lo, a)
offset = page_offset(pages.current)
_unsafe_copyto!(v, offset + 1, v, use_a ? a : b, pagesize)
use_a && (a += pagesize)
use_a || (b += pagesize)
LSchwerdt marked this conversation as resolved.
Show resolved Hide resolved
end
# copy the final partial page only if sourcing from A.
# If sourcing from B, it is already in place.
use_a && _unsafe_copyto!(v, hi - m + a, v, a, m - a + 1)
end

##################
# rearrange pages
##################
# copy pages belonging to the 3 permutation chains ending with a page in the scratch space
nextA, nextB = pages.nextA, pages.nextB

for _ in 1:3
page = (nextB > nPages ? (nextA += 1) : (nextB += 1)) - 1
page = permute_pages!(>(3), v, pageLocations, page_offset, pagesize, page)
_unsafe_copyto!(v, page_offset(page) + 1, scratch, (page - 1) * pagesize + 1, pagesize)
end

# copy remaining permutation cycles
for donePageIndex = 5:nPages
# linear scan through pageLocations to make sure no cycle is missed
page = pageLocations[donePageIndex-3]
page == donePageIndex && continue

# copy the data belonging to donePageIndex into scratch
_unsafe_copyto!(scratch, 1, v, page_offset(page) + 1, pagesize)

# follow the cycle starting with the newly freed page
permute_pages!(!=(donePageIndex), v, pageLocations, page_offset, pagesize, page)
_unsafe_copyto!(v, page_offset(donePageIndex) + 1, scratch, 1, pagesize)
end
end
end

# midpoint was added to Base.sort in version 1.4 and later moved to Base
# -> redefine for compatibility with earlier versions
midpoint(lo::Integer, hi::Integer) = lo + ((hi - lo) >>> 0x01)

function pagedmergesort!(v::AbstractVector{T}, lo::Integer, hi::Integer, o::Ordering, scratch::AbstractVector{T}, pageLocations) where {T}
len = hi + 1 - lo
if len <= Base.SMALL_THRESHOLD
return Base.Sort.sort!(v, lo, hi, Base.Sort.InsertionSortAlg(), o)
end
m = midpoint(lo, hi - 1) # hi-1: ensure midpoint is rounded down. OK, because lo < hi is satisfied here
pagedmergesort!(v, lo, m, o, scratch, pageLocations)
pagedmergesort!(v, m + 1, hi, o, scratch, pageLocations)
if len <= length(scratch)
twoended_merge!(v, lo, m, hi, o, scratch)
else
paged_merge!(v, lo, m, hi, o, scratch, pageLocations)
end
return v
end

function sort!(v::AbstractVector, lo::Integer, hi::Integer, ::PagedMergeSortAlg, o::Ordering)
lo >= hi && return v
n = hi + 1 - lo
pagesize = isqrt(n)
scratch = Vector{eltype(v)}(undef, 3pagesize)
nPages = n ÷ pagesize
pageLocations = Vector{Int}(undef, max(0, nPages - 3))
pagedmergesort!(v, lo, hi, o, scratch, pageLocations)
return v
end
end # module
25 changes: 19 additions & 6 deletions test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,13 @@ using Test
using StatsBase
using Random

stable_algorithms = [TimSort, RadixSort, PagedMergeSort]
unstable_algorithms = [HeapSort, CombSort]

a = rand(1:10000, 1000)
am = [rand() < .9 ? i : missing for i in a]

for alg in [TimSort, HeapSort, RadixSort, CombSort, SortingAlgorithms.TimSortAlg()]
for alg in [stable_algorithms; unstable_algorithms; SortingAlgorithms.TimSortAlg()]
b = sort(a, alg=alg)
@test issorted(b)
ix = sortperm(a, alg=alg)
Expand Down Expand Up @@ -94,8 +97,7 @@ for n in [0:10..., 100, 101, 1000, 1001]
invpermute!(c, pi)
@test c == v

# stable algorithms
for alg in [TimSort, RadixSort]
for alg in stable_algorithms
p = sortperm(v, alg=alg, order=ord)
@test p == pi
s = copy(v)
Expand All @@ -105,8 +107,7 @@ for n in [0:10..., 100, 101, 1000, 1001]
@test s == v
end

# unstable algorithms
for alg in [HeapSort, CombSort]
for alg in unstable_algorithms
p = sortperm(v, alg=alg, order=ord)
@test isperm(p)
@test v[p] == si
Expand All @@ -120,7 +121,7 @@ for n in [0:10..., 100, 101, 1000, 1001]

v = randn_with_nans(n,0.1)
for ord in [Base.Order.Forward, Base.Order.Reverse],
alg in [TimSort, HeapSort, RadixSort, CombSort]
alg in [stable_algorithms; unstable_algorithms]
# test float sorting with NaNs
s = sort(v, alg=alg, order=ord)
@test issorted(s, order=ord)
Expand All @@ -138,3 +139,15 @@ for n in [0:10..., 100, 101, 1000, 1001]
@test reinterpret(UInt64,vp) == reinterpret(UInt64,s)
end
end

for T in (Float64, Int, UInt8)
for alg in stable_algorithms
for ord in [Base.Order.By(identity), Base.Order.By(_ -> 0), Base.Order.By(Base.Fix2(÷, 100))]
for n in vcat(0:31, 40:11:100, 110:51:1000)
v = rand(T, n)
# use MergeSort to guarantee stable sorting in Julia 1.0
@test sort(v, alg=alg, order=ord) == sort(v, alg=MergeSort, order=ord)
end
end
end
end
Loading