Skip to content

Commit

Permalink
stdlib: Remove use of mergesort on qsort (BZ 21719)
Browse files Browse the repository at this point in the history
This patch removes the mergesort optimization on qsort implementation
and uses the introsort instead.  The mergesort implementation has some
issues:

  - It is as-safe only for certain types sizes (if total size is less
    than 1 KB with large element sizes also forcing memory allocation)
    which contradicts the function documentation.  Although not required
    by the C standard, it is preferable and doable to have an O(1) space
    implementation.

  - The malloc for certain element size and element number adds
    arbitrary latency (might even be worse if malloc is interposed).

  - To avoid trigger swap from memory allocation the implementation
    relies on system information that might be virtualized (for instance
    VMs with overcommit memory) which might lead to potentially use of
    swap even if system advertise more memory than actually has.  The
    check also have the downside of issuing syscalls where none is
    expected (although only once per execution).

  - The mergesort is suboptimal on an already sorted array (BZ#21719).

The introsort implementation is already optimized to use constant extra
space (due to the limit of total number of elements from maximum VM
size) and thus can be used to avoid the malloc usage issues.

Resulting performance is slower due the usage of qsort, specially in the
worst-case scenario (partialy or sorted arrays) and due the fact
mergesort uses a slight improved swap operations.

This change also renders the BZ#21719 fix unrequired (since it is meant
to fix the sorted input performance degradation for mergesort).  The
manual is also updated to indicate the function is now async-cancel
safe.

Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
  • Loading branch information
zatrazz committed Oct 31, 2023
1 parent 274a46c commit 03bf835
Show file tree
Hide file tree
Showing 7 changed files with 16 additions and 323 deletions.
2 changes: 0 additions & 2 deletions include/stdlib.h
Original file line number Diff line number Diff line change
Expand Up @@ -149,8 +149,6 @@ extern int __posix_openpt (int __oflag) attribute_hidden;
extern int __add_to_environ (const char *name, const char *value,
const char *combines, int replace)
attribute_hidden;
extern void _quicksort (void *const pbase, size_t total_elems,
size_t size, __compar_d_fn_t cmp, void *arg);

extern int __on_exit (void (*__func) (int __status, void *__arg), void *__arg);

Expand Down
2 changes: 1 addition & 1 deletion manual/argp.texi
Original file line number Diff line number Diff line change
Expand Up @@ -735,7 +735,7 @@ for options, bad phase of the moon, etc.
@c hol_set_group ok
@c hol_find_entry ok
@c hol_sort @mtslocale @acucorrupt
@c qsort dup @acucorrupt
@c qsort dup
@c hol_entry_qcmp @mtslocale
@c hol_entry_cmp @mtslocale
@c group_cmp ok
Expand Down
3 changes: 1 addition & 2 deletions manual/locale.texi
Original file line number Diff line number Diff line change
Expand Up @@ -253,7 +253,7 @@ The symbols in this section are defined in the header file @file{locale.h}.
@c calculate_head_size ok
@c __munmap ok
@c compute_hashval ok
@c qsort dup @acucorrupt
@c qsort dup
@c rangecmp ok
@c malloc @ascuheap @acsmem
@c strdup @ascuheap @acsmem
Expand All @@ -275,7 +275,6 @@ The symbols in this section are defined in the header file @file{locale.h}.
@c realloc @ascuheap @acsmem
@c realloc @ascuheap @acsmem
@c fclose @ascuheap @asulock @acsmem @acsfd @aculock
@c qsort @ascuheap @acsmem
@c alias_compare dup
@c libc_lock_unlock @aculock
@c _nl_explode_name @ascuheap @acsmem
Expand Down
7 changes: 3 additions & 4 deletions manual/search.texi
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,7 @@ To sort an array using an arbitrary comparison function, use the

@deftypefun void qsort (void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare})
@standards{ISO, stdlib.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acunsafe{@acucorrupt{}}}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{qsort} function sorts the array @var{array}. The array
contains @var{count} elements, each of which is of size @var{size}.

Expand Down Expand Up @@ -199,9 +199,8 @@ Functions}):
The @code{qsort} function derives its name from the fact that it was
originally implemented using the ``quick sort'' algorithm.

The implementation of @code{qsort} in this library might not be an
in-place sort and might thereby use an extra amount of memory to store
the array.
The implementation of @code{qsort} in this library is an in-place sort
and uses a constant extra space (allocated on the stack).
@end deftypefun

@node Search/Sort Example
Expand Down
2 changes: 0 additions & 2 deletions stdlib/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,6 @@ routines := \
mbtowc \
mrand48 \
mrand48_r \
msort \
nrand48 \
nrand48_r \
old_atexit \
Expand Down Expand Up @@ -380,7 +379,6 @@ generated += \
# generated

CFLAGS-bsearch.c += $(uses-callbacks)
CFLAGS-msort.c += $(uses-callbacks)
CFLAGS-qsort.c += $(uses-callbacks)
CFLAGS-system.c += -fexceptions
CFLAGS-system.os = -fomit-frame-pointer
Expand Down
309 changes: 0 additions & 309 deletions stdlib/msort.c

This file was deleted.

Loading

0 comments on commit 03bf835

Please sign in to comment.