Releases: ARM-software/optimized-routines
v24.05 release
- Math routine changes
- Fixed AdvSIMD vector powf and log for the big-endian target.
- Fixed an undefined signed shift in the exp10 code, unlikely to cause problems in practice.
- AdvSIMD pow got minor optimizations.
- Added a build option to disable the SIMD and exp10 tests, so that libcs without those symbols can be tested.
- pl/ directory
- Several big-endian fixes and code cleanups.
- This continues to host many math routines with mixed quality.
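For readers unfamiliar with the class of bug behind the exp10 fix in v24.05: in C, left-shifting a negative signed integer is undefined behavior, even though most compilers produce the expected bit pattern. The release note does not say which expression was affected, so the following is only a generic sketch of the usual remedy (convert to an unsigned type before shifting). The helper name `exponent_bits` is hypothetical and not taken from the library.

```c
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration, not library code):
   moves a signed exponent adjustment k into the exponent field position
   of an IEEE-754 binary64 bit pattern. */
static inline uint64_t exponent_bits(int64_t k)
{
    /* Writing (k << 52) would be undefined when k is negative.
       Converting to uint64_t first is well defined (reduction modulo 2^64),
       and shifting an unsigned value simply discards the high bits. */
    return (uint64_t)k << 52;
}
```

With this form, `exponent_bits(-1)` yields the bit pattern `0xFFF0000000000000` without relying on undefined behavior.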
v24.01 release
- String routine changes
- Added memcpy, memmove, memset for MOPS extension.
- Optimized memcpy by improving code alignment.
- Fixed GNU property note on ILP32.
- Math routine changes
- Vector math code now uses ACLE intrinsics and is AArch64 only.
- Vector math code no longer builds scalar and base PCS variants.
- Optimized vector sin and cos.
- Added tgamma128, a binary128 tgammal implementation.
- pl/ directory
- This continues to host many math routines with mixed quality.
v23.01 release
- Project changes
- All files are under a new dual license now (MIT OR Apache-2.0 WITH LLVM-exception at the election of the user).
- Added MAINTAINERS file describing who maintains the subdirectories.
- Added README.contributors files documenting contribution requirements.
- Added new pl/ subdirectory for Arm's Performance Library related routines.
- String routine changes
- Added memset benchmark.
- Improved strlen and memcpy benchmarks.
- Added SVE memcpy.
- Updated arm string functions to support M-profile PACBTI.
- Merged the MTE and generic versions of strcmp, strncmp, strcpy and stpcpy into one implementation.
- Optimized memcmp, memchr-mte, memrchr, strchr-mte, strchrnul-mte, strrchr-mte, strlen, strlen-mte, strnlen, strcpy.
- Math routine changes
- Fixed constants in sinf, cosf and sincosf to be compile time computed even with gcc-12 -frounding-math.
- Fixed an invalid shift in logf.
- Support floating-point exceptions in vector math routines when WANT_SIMD_EXCEPT is set.
v21.02 release
- String routine changes
- Added AArch64 ILP32 ABI support.
- Fixed SVE strnlen return value.
- Added MTE related __mtag_tag_region.
- Added MTE related __mtag_tag_zero_region.
- Minor code cleanups.
v20.11 release
- New math routines
- Scalar erff and erf using fma.
v20.08 release
- Bug fixes
- strcmp-mte nul check
- strncmp-mte with large size
- arm memcpy with large size (CVE-2020-6096)
- String routines performance improvements
- strlen
- memmove with backward copy
- Benchmarking code for strings and memory routines
- strlen
v20.05 release
- New functionality (64-bit Arm)
- string: Optimized MTE variants of strlen, strnlen, strchr, strchrnul, strrchr, memchr, memrchr, strcpy, stpcpy, strcmp, strncmp
- string: Changes to support BTI
- string: New optimized memrchr, strnlen
- Performance improvements (Neoverse N1)
- strchr/strchrnul: 21% improvement on long strings
- strrchr: 11% improvement
- strnlen: 130% improvement on long strings, 50% on short strings
- Benchmark and tests
- string: New memcpy benchmark
- string: Cleanup testsuite and improve test coverage
v20.02 release
New functionality
- string: New strrchr and stpcpy routines
- string: New Memory Tagging Extension (MTE) variants of strlen and strchr
- math: New vector version of pow(double)
- networking: Optimized ones' complement checksum for 32-bit and 64-bit Arm
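As background for the networking entry above, a ones' complement checksum (as used by IP, TCP and UDP) adds the data as 16-bit words and folds any carry back into the low 16 bits. The sketch below is a plain, unoptimized reference in portable C, not the library's implementation; the function name `ones_complement_sum` and the choice to read words in big-endian order are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (not the optimized library code):
   returns the 16-bit ones' complement sum of a byte buffer, reading
   the data as big-endian 16-bit words. */
static uint16_t ones_complement_sum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    size_t i = 0;

    /* Accumulate full 16-bit words into a wide accumulator. */
    for (; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is treated as if padded with a zero byte. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold carries back in until the value fits in 16 bits
       (the end-around carry of ones' complement addition). */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)sum;
}
```

The checksum placed in a packet header is normally the bitwise complement of this sum. Optimized versions typically gain speed by summing wider words or SIMD lanes at once and folding only at the end, which works because ones' complement addition is associative and commutative.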
Performance improvements
- string: Improved memcpy and memmove (SIMD and non-SIMD) for 64-bit Arm
- string: Improved memset for 64-bit Arm