Non-musl `f32` functions? #298

jdh8 · 2024-08-11T21:45:44Z

Are you interested in non-musl implementations, especially f32 functions faster on modern architectures? I am writing math functions in Rust from scratch, based on my own research since 2017:
https://github.com/jdh8/metallic-rs

My functions are tested to be faithfully rounded (error < 1 ulp). Unlike most C libraries, I assume that f64 instructions are available even for f32 functions and also try to use FMA if possible. I also keep LUTs to a minimum with WebAssembly in mind.

Benchmarks show that my implementations are faster than musl (libm) and glibc (std on Linux) except for libm::logf
https://github.com/jdh8/metallic-rs/actions/runs/10342894582/job/28626374065

The text was updated successfully, but these errors were encountered:

tgross35 · 2024-08-11T22:47:43Z

@jdh8 do you have any specific examples of which specific functions you are talking about? Supplying math symbols that are not exposed in std is more or less a non-goal of our libm.

jdh8 · 2024-08-12T10:48:11Z

I have published metallic, so its docs are publicly available.
https://docs.rs/metallic/latest/metallic/index.html

For now I have implemented cbrt, exp, exp2, exp10, exp_m1, frexp, hypot, ldexp, ln, ln_1p, log2, log10, round for f32.

Functions not yet in `std`

`exp10`

f32::exp10/metallic::f32::exp10
                        time:   [1.4450 ns 1.4468 ns 1.4491 ns]
f32::exp10/libm::exp10f time:   [8.7490 ns 8.7607 ns 8.7763 ns]

`frexp`

f32::frexp/metallic::f32::frexp
                        time:   [522.51 ps 523.83 ps 525.21 ps]
f32::frexp/libm::frexpf time:   [2.1766 ns 2.1782 ns 2.1802 ns]

`ldexp`

f32::ldexp/metallic::f32::ldexp
                        time:   [731.73 ps 732.56 ps 733.40 ps]
f32::ldexp/libm::ldexpf time:   [8.5147 ns 8.5303 ns 8.5475 ns]

Examples of functions already in `std`

`cbrt`

f32::cbrt/metallic::f32::cbrt
                        time:   [1.2175 ns 1.2204 ns 1.2240 ns]
f32::cbrt/libm::cbrtf   time:   [1.5684 ns 1.5699 ns 1.5717 ns]
f32::cbrt/f32::cbrt     time:   [15.171 ns 15.177 ns 15.184 ns]

`exp`

f32::exp/metallic::f32::exp
                        time:   [1.4475 ns 1.4538 ns 1.4635 ns]
f32::exp/libm::expf     time:   [9.4960 ns 9.5048 ns 9.5178 ns]
f32::exp/f32::exp       time:   [4.2191 ns 4.2225 ns 4.2269 ns]

`hypot`

f32::hypot/metallic::f32::hypot
                        time:   [770.30 ps 771.21 ps 772.27 ps]
f32::hypot/libm::hypotf time:   [3.4183 ns 3.4193 ns 3.4204 ns]
f32::hypot/f32::hypot   time:   [3.7229 ns 3.7271 ns 3.7328 ns]

`log2`

f32::log2/metallic::f32::log2
                        time:   [1.6325 ns 1.6335 ns 1.6347 ns]
f32::log2/libm::log2f   time:   [5.7100 ns 5.7170 ns 5.7267 ns]
f32::log2/f32::log2     time:   [3.2606 ns 3.2647 ns 3.2721 ns]

burrbull · 2024-08-12T12:13:09Z

Is your crate no_std compatible?

jdh8 · 2024-08-12T13:20:27Z

I tried but not yet. I use functions from std to access FMA and rounding instructions.

f32::mul_add:
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/src/f32/mod.rs#L64

f32::round_ties_even:
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/src/f32/mod.rs#L204

f64::mul_add:
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/src/f64/mod.rs#L16

f64::round_ties_even:
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/src/f32/mod.rs#L182
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/src/f32/mod.rs#L239
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/src/f32/mod.rs#L275

beetrees · 2024-08-12T13:38:10Z

Which platform are you doing the performance comparison on? Looking at the performance numbers, you've math builtins don't appear to be coming from Rust libm (and are presumably coming from the system libc); it would be good to know which libc your system is using for comparison purposes.

You've stated that the accuracy of all the functions is error < 1 ULP; is this just an upper bound or can this amount of error actually occur for each functions? Using this paper it's possible to compare to the accuracy of the MUSL versions of cbrtf, expf, hypotf, log2f and exp10f (note that Rust libm's expf and log2f implementations are out-of-date in comparison to MUSL upstream); notably all MUSL's functions except for exp10f have a maximum error smaller than 1 ULP (with the caveat that it's impossible to exhaustively test the bivariate hypotf; see the paper for details), with cbrtf being correctly rounded (error <= 0.5 ULP) in all cases.

I ask because both the CORE-MATH project and LLVM libc both provide correctly rounded (error <= 0.5 ULP) implementations of the functions that have competitive performance (a performance comparison with glibc is displayed on the CORE-MATH project website). All else being equal, it would be good to move towards Rust libm having correctly-rounded implementations where possible.

jdh8 · 2024-08-12T15:24:19Z

I'm using ubuntu-latest, the GitHub Actions runner image.
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/.github/workflows/rust.yml#L23

I just tested if the errors are < 1 ulp. I have not yet computed the maximum error for each function.
https://github.com/jdh8/metallic-rs/blob/f8d05271b187acfe3c486be9e495c261c63739e6/tests/f32/main.rs#L36

Wow, correctly rounded transcendental functions exist! I dare not dream about this because of the Table Maker's Dilemma.

beetrees · 2024-08-13T15:19:13Z

Re. the correctly rounded transcendental functions and the table maker's dilemma: this paper goes in to how the dilemma is dealt with. (For single-argument f32 functions in particular, it's possible to do an exhaustive computation using every possible input.)

jdh8 · 2024-08-13T21:20:20Z

True, exhaustive computation for univariate f32 functions is pretty feasible as I've already tested them exhaustively with their f64 counterparts.

If correct rounding is a goal, I don't think I can make anything better than CORE-MATH. Maybe we should port CORE-MATH or LLVM libc instead? I've read several math functions from CORE-MATH. These functions are already using my assumptions such as calling f64 instructions in f32 functions. (C has floating-point contraction so it does not need special care for FMA.)

beetrees · 2024-08-14T00:19:31Z

I've also been thinking of porting from CORE-MATH: the implementations seem to be self contained (making it easier to port) and well tested against the GNU MPFR library. FYI per rust-lang/rust#128386 (comment) I believe @alexpyattaev is currently looking into porting a new implementation of f64 log.

burrbull · 2024-08-14T02:50:58Z

For note, Rust has good wrappers for MPFR. See rug

jdh8 · 2024-08-14T11:54:54Z

I gain my confidence back! There is still room for optimization because we can ignore the floating-point environment. I'm upgrading my library to correct rounding.

JSorngard · 2024-08-27T15:25:25Z

I also have a project that's semi-related to this issue: I have developed an implementation of the Lambert W function: https://crates.io/crates/lambert_w. Would integrating this capability be interesting? It is no_std and has 50 bits of accuracy according to the paper it is based on. It currently only has support for f64, hence "semi-related".

alexpyattaev · 2024-08-28T06:59:08Z

Thanks, but as far as I know Lambert's W is not in standard library, and this issue is about functions in std. Whether it belongs in std or not is a separate question entirely.

jdh8 mentioned this issue Aug 14, 2024

Upgrade to correct rounding jdh8/metallic-rs#1

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-musl `f32` functions? #298

Non-musl `f32` functions? #298

jdh8 commented Aug 11, 2024

tgross35 commented Aug 11, 2024

jdh8 commented Aug 12, 2024

burrbull commented Aug 12, 2024

jdh8 commented Aug 12, 2024

beetrees commented Aug 12, 2024

jdh8 commented Aug 12, 2024

beetrees commented Aug 13, 2024 •

edited

Loading

jdh8 commented Aug 13, 2024 •

edited

Loading

beetrees commented Aug 14, 2024 •

edited

Loading

burrbull commented Aug 14, 2024

jdh8 commented Aug 14, 2024

JSorngard commented Aug 27, 2024

alexpyattaev commented Aug 28, 2024

Non-musl f32 functions? #298

Non-musl f32 functions? #298

Comments

jdh8 commented Aug 11, 2024

tgross35 commented Aug 11, 2024

jdh8 commented Aug 12, 2024

Functions not yet in std

exp10

frexp

ldexp

Examples of functions already in std

cbrt

exp

hypot

log2

burrbull commented Aug 12, 2024

jdh8 commented Aug 12, 2024

beetrees commented Aug 12, 2024

jdh8 commented Aug 12, 2024

beetrees commented Aug 13, 2024 • edited Loading

jdh8 commented Aug 13, 2024 • edited Loading

beetrees commented Aug 14, 2024 • edited Loading

burrbull commented Aug 14, 2024

jdh8 commented Aug 14, 2024

JSorngard commented Aug 27, 2024

alexpyattaev commented Aug 28, 2024

Non-musl `f32` functions? #298

Non-musl `f32` functions? #298

Functions not yet in `std`

`exp10`

`frexp`

`ldexp`

Examples of functions already in `std`

`cbrt`

`exp`

`hypot`

`log2`

beetrees commented Aug 13, 2024 •

edited

Loading

jdh8 commented Aug 13, 2024 •

edited

Loading

beetrees commented Aug 14, 2024 •

edited

Loading