v5.0
Release 5.0 adds new APIs for sorting arrays of custom-defined objects and for sorting key-value pairs stored as a pair of arrays. It also adds AVX2 acceleration for the `argsort` and `argselect` methods. Here is a gist of the new features:
- x86-simd-sort now supports an API to sort custom-defined C++ objects via `object_qsort`. Please refer to the README file on how to use it. Its performance can vary depending on the definition of the custom class. For the sake of illustration, sorting a simple `struct` based on one of its members can be up to 4-5x faster than using `std::sort` on machines with AVX-512. Refer to the perf section for more details.
- New API `keyvalue_qsort` to sort a pair of arrays representing key-value pairs (32-bit and 64-bit data types). It is an order of magnitude faster than scalar ways of sorting them.
- AVX2 support for the `argsort` and `argselect` methods. These have been merged into NumPy and will be available with NumPy v2.0.
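
To make the semantics of these features concrete, here is a minimal scalar sketch of what an argsort, a key-value sort, and an object sort by key compute. It is a reference built on `std::sort`, not the library's SIMD implementation. The names `scalar_argsort`, `scalar_keyvalue_sort`, `scalar_object_sort`, and `Point` are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Scalar reference for an argsort: returns the indices that would put
// `keys` in ascending order. Hypothetical helper, not a library function.
template <typename K>
std::vector<std::size_t> scalar_argsort(const std::vector<K>& keys) {
    std::vector<std::size_t> idx(keys.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::sort(idx.begin(), idx.end(),
              [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    return idx;
}

// Scalar reference for a key-value sort: reorders both arrays so that the
// keys ascend and every value stays paired with its original key.
template <typename K, typename V>
void scalar_keyvalue_sort(std::vector<K>& keys, std::vector<V>& vals) {
    assert(keys.size() == vals.size());
    const std::vector<std::size_t> order = scalar_argsort(keys);
    std::vector<K> sorted_keys(keys.size());
    std::vector<V> sorted_vals(vals.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        sorted_keys[i] = keys[order[i]];
        sorted_vals[i] = vals[order[i]];
    }
    keys.swap(sorted_keys);
    vals.swap(sorted_vals);
}

// Scalar reference for sorting objects by a key extracted from each object.
// `key_func` maps an object to a comparable key (for example, one member).
template <typename T, typename KeyFunc>
void scalar_object_sort(std::vector<T>& objs, KeyFunc key_func) {
    std::sort(objs.begin(), objs.end(),
              [&](const T& a, const T& b) { return key_func(a) < key_func(b); });
}

// Hypothetical custom type used to illustrate sorting by one member.
struct Point {
    double x;
    double y;
};
```

For example, `scalar_object_sort(points, [](const Point& p) { return p.x; })` orders a `std::vector<Point>` by its `x` member. The library's `object_qsort` and `keyvalue_qsort` are the accelerated counterparts of these operations; their exact signatures are documented in the README.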
What's Changed
- README.md: fix broken link by @rouault in #98
- Improve emulation of AVX2 min/max 64-bit by @r-devulap in #99
- fix numpy CI failures by @r-devulap in #100
- Add key-value sort to runtime dispatch by @r-devulap in #105
- Avoid masks when possible in AVX2 logic by @sterrettm2 in #104
- Add more benchmarks by @r-devulap in #106
- Support key-value sort for 32-bit dtypes by @r-devulap in #108
- Add API to sort array of custom objects by @r-devulap in #103
- Mark explicit template specializations inline by @r-devulap in #112
- BUG: bug fix in avx512_qsort_fp16 by @r-devulap in #113
- CI: build and test on 32-bit linux by @r-devulap in #114
- Support for AVX2 argsort/argselect/key-value sort by @sterrettm2 in #110
- Changes argsort/argselect to use generic networks by @sterrettm2 in #102
- Improve key-value sort performance by @r-devulap in #120
- Add IPP sort to benchmarks by @r-devulap in #121
- Build fix on macOS 64-bit by @r-devulap in #122
- Fix more build issues on macOS by @r-devulap in #123
- Add avx2_vector defintion for size_t on macOS by @r-devulap in #124
- [fix] update link in README.md by @icfaust in #125
- Use uint32_t instead of size_t for object sort by @r-devulap in #126
- Add simple test for objsort by @r-devulap in #128
- Update README with object_qsort by @r-devulap in #130
Full Changelog: v4.0...v5.0