Introduce generators that respect function domains #348

tgross35 · 2024-10-29T10:38:49Z

Introduce a Domain trait that allows us to define what the interesting inputs are on a per-function basis. This trait is used in gen::domain to create sequences of values that are either (1) around interesting points of this domain, or (2) logarithmically spaced within the domain.
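To illustrate what "logarithmically spaced" can mean here, the sketch below takes equal steps through the bit representation of positive `f32` values. Because the exponent occupies the high bits, equal steps in bits are roughly equal steps in log2(x). The helper name `logspace_f32` and its signature are assumptions made for this example, not necessarily what gen::domain does.

```rust
/// Hypothetical helper: `steps` values from `start` to `end` (both positive and
/// finite), evenly spaced in bit representation, which approximates log spacing.
fn logspace_f32(start: f32, end: f32, steps: u32) -> impl Iterator<Item = f32> {
    assert!(start.is_finite() && end.is_finite());
    assert!(start > 0.0 && start < end && steps >= 2);
    // For positive floats, ordering by bits matches ordering by value.
    let lo = start.to_bits();
    let hi = end.to_bits();
    let span = (hi - lo) as u64;
    (0..steps).map(move |i| {
        // Widen to u64 so the multiplication cannot overflow.
        let offset = span * i as u64 / (steps - 1) as u64;
        f32::from_bits(lo + offset as u32)
    })
}
```

With `logspace_f32(1.0, 1024.0, 11)` the steps land exactly on the powers of two between the endpoints, since each step crosses one full exponent.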

Compared to the random generators, this means that we don't waste time checking large quantities of different NaNs or out-of-bounds inputs (e.g. negative numbers and NaNs take up more than half of the float space, all of which would be wasted when checking sqrt, which is only defined for x >= 0). It also means we know that coverage is uniform across the entire domain.
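As a quick check of the "more than half the float space" figure, the bit patterns can be counted directly for `f32`. The function names below are made up for this illustration.

```rust
/// Number of distinct `f32` bit patterns.
const TOTAL_F32_PATTERNS: u64 = 1 << 32;

/// Count the `f32` bit patterns that are wasted on `sqrt`: negative nonzero
/// values (including -inf) plus every NaN of either sign.
fn sqrt_out_of_domain_patterns() -> u64 {
    // Sign bit set: 2^31 patterns, which already covers negative NaNs and -inf.
    // Subtract one for -0.0, which is a valid input.
    let negative = (1u64 << 31) - 1;
    // Sign bit clear, exponent all ones, significand non-zero: 2^23 - 1 patterns.
    let positive_nans = (1u64 << 23) - 1;
    negative + positive_nans
}

/// The domain check itself: NaN and negative values fail, both zeros pass.
fn in_sqrt_domain(x: f32) -> bool {
    x >= 0.0
}
```

That comes to 2,155,872,254 of 4,294,967,296 patterns, a little over 50%.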

Currently only unary operations are supported.

This also includes an f8 type that is helpful for testing ULP operations, since it is easy to list all of its values. I was going to remove this, but it turned out to be useful enough that I think I'll keep it around for future development.

tgross35 · 2024-11-01T01:39:29Z

Rebased to build on #349

tgross35 · 2024-11-02T14:36:19Z

I think the current logic is pretty good. Still needs cleanup but these are the check points in debug mode:

With the release-mode test counts the plots take forever to generate, but they just show more even coverage, as expected.

tgross35 · 2024-11-02T22:41:59Z

@beetrees @quaternic you both suggested this at different points, would you mind reviewing this? I still have to wire up tests but I think the generator itself is good, see the above plots.

The interesting parts are crates/libm-test/src/domain.rs (domain definitions), crates/libm-test/src/gen/domain.rs (making use of the domain to generate test cases), and crates/libm-test/src/num.rs (operations for skipping up/down a fixed number of ULP).
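For readers who have not seen ULP stepping before, here is a minimal sketch of the idea for `f32`: map each float to an integer that sorts the same way, add the step count, and map back. The names `to_ordered`, `from_ordered` and `step_ulps` are hypothetical and are not taken from num.rs. Note that in this scheme +0.0 and -0.0 count as two separate steps.

```rust
/// Map a non-NaN `f32` to an integer whose ordering matches the float ordering.
/// +0.0 maps to 0 and -0.0 maps to -1.
fn to_ordered(x: f32) -> i64 {
    let bits = x.to_bits();
    if bits >> 31 == 0 {
        bits as i64
    } else {
        -((bits & 0x7fff_ffff) as i64) - 1
    }
}

/// Inverse of `to_ordered`.
fn from_ordered(n: i64) -> f32 {
    if n >= 0 {
        f32::from_bits(n as u32)
    } else {
        f32::from_bits((-(n + 1)) as u32 | 0x8000_0000)
    }
}

/// Move `x` up (positive `n`) or down (negative `n`) by `n` representable
/// values, saturating at the infinities.
fn step_ulps(x: f32, n: i64) -> f32 {
    assert!(!x.is_nan());
    let min = to_ordered(f32::NEG_INFINITY);
    let max = to_ordered(f32::INFINITY);
    from_ordered(to_ordered(x).saturating_add(n).clamp(min, max))
}
```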

quaternic · 2024-11-03T05:04:04Z

crates/libm-test/src/domain.rs

+    const DEFINED: (Bound<T>, Bound<T>);
+
+    /// The region, if any, for which the function repeats. Used to test within.
+    const PERIODIC: Option<(Bound<T>, Bound<T>)> = None;


I'm not convinced periodicity will be that useful here, since not that many of the functions actually are periodic. Mainly sin, cos, tan, and fract I suppose. Out of those, only fract has a period (1.0) which is actually representable as a float. The functions periodic in π are problematic exactly because of that; the worst-case inputs are very large ones where the function just happens to land very close to zero. Unlike fract which is just exactly zero for numbers large enough.
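A small sketch of the two facts used above, for `f32`: from 2^23 upward every value is an integer, so `fract` is exactly zero, while the nearest `f32` to π is off by a visible amount. The constant and function names are made up for this illustration.

```rust
/// Smallest magnitude from which every `f32` is an integer (2^23).
const F32_ALL_INTEGERS_FROM: f32 = 8_388_608.0;

/// Error of the `f32` constant closest to π, measured against the `f64`
/// constant (which is itself only an approximation of π).
fn f32_pi_error() -> f64 {
    std::f32::consts::PI as f64 - std::f64::consts::PI
}
```

Just below 2^23 the spacing between floats is still 0.5, so `8_388_607.5f32.fract()` is 0.5; at 2^23 and beyond there is no fractional part left to return.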

I got this from another math library (I thought it was core-math, but maybe not since I can't find it now...) that suggested doing the heaviest testing within one period to get better coverage with the larger float types. But in retrospect this doesn't make a lot of sense to me, given that -π..π already includes more than 50% of the float's representable values.

Do you think it is better to just drop any references to periodicity and instead test with Unbounded? Or is there still some advantage to heavier testing within this range, or to keeping the information around for some future use?

I'd just drop it, but it's not something I've thought about all that much. It could be useful to have the information somewhere (e.g. [0,2π] or [-π,π] might be very reasonable input ranges to have on some benchmarks) but it's mostly only relevant for sin/cos/tan.

Yeah, it applies to a lot less than I expected it to when starting this. Okay, I'll remove it (but not immediately in case other refactoring comes up in review). Keeping check_points with multiples of π/2 seems useful to generate extra tests around there, considering I think the approximations tend to blow up at poles & zeros of $f^{(n)}(x)$.

Github thinks you're going to comment in the future? 🤔 edit: oh, must be related to DST in the US

I was wondering about that as well, good to see it's just a visual glitch. Yours is too 🤔 Apparently the US is turning clocks back at this time (daylight savings)

👍 for removing PERIODIC.

tgross35 · 2024-11-03T09:09:08Z

crates/libm-test/src/gen/domain.rs

+/// Number of values near an interesting point to check.
+const AROUND: usize = 100;
+
+/// Number of tests to run.
+const NTESTS: usize = {
+    if cfg!(optimizations_enabled) {
+        if crate::emulated()
+            || !cfg!(target_pointer_width = "64")
+            || cfg!(all(target_arch = "x86_64", target_vendor = "apple"))
+        {
+            // Tests are pretty slow on non-64-bit targets, x86 MacOS, and targets that run
+            // in QEMU.
+            100_000
+        } else {
+            5_000_000
+        }
+    } else {
+        // Without optimizations just run a quick check
+        800
+    }
+};
+
+/// Some functions have infinite asymptotes, limit how many we check.
+const MAX_ASYMPTOTES: usize = 10;


I need to refactor this all a bit considering I just copied and pasted NTESTS from the random tests. Still brainstorming how to do this (input appreciated) but I'm thinking maybe something like:

- Have N number of tests by default, which is 5M in the existing code (but probably needs to be reduced based on the below)
- f32 unary functions run N random tests
- f64 and f128 unary functions run N * 4 random tests
- If the function takes two inputs, multiply the number of tests by 4. If it takes three inputs, multiply by 8.
- If a domain test exists for the function, run N domain-based tests and reduce the number of random tests by a factor of 100
- If an exhaustive test for f32 (not yet implemented) or high-iteration test (f64/f128 or multi-input functions, also not implemented) should be run, still run the "interesting points" portion of domain-based tests but replace the logspace tests with whatever fits. I should probably split this into two generators rather than chaining...

Basically in all cases, either the exhaustive tests or the logspace are going to consume the bulk of time. But we still want to run "interesting points" and the random tests in hopes that they will find errors earlier than waiting for the whole exhaustive check to run. Also random tests should cover some signaling NaNs.
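The counting rules above could be written down roughly as follows. This is only a sketch: `FloatTy`, `Budget` and `budget` are hypothetical names, and it assumes the type multiplier and the input-count multiplier compose, which the plan does not spell out. The exhaustive and high-iteration cases are left out.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum FloatTy {
    F32,
    F64,
    F128,
}

/// How many cases each generator gets for one function.
#[derive(Debug, PartialEq)]
struct Budget {
    random: u64,
    domain: u64,
}

/// `base` is N from the plan, `inputs` is the number of float arguments.
fn budget(base: u64, ty: FloatTy, inputs: u32, has_domain_test: bool) -> Budget {
    assert!((1..=3).contains(&inputs), "only 1 to 3 inputs are covered");
    // Wider types get more random tests.
    let by_type = match ty {
        FloatTy::F32 => base,
        FloatTy::F64 | FloatTy::F128 => base * 4,
    };
    // More inputs means a larger input space to sample.
    let by_inputs = match inputs {
        1 => 1,
        2 => 4,
        _ => 8,
    };
    let random = by_type * by_inputs;
    if has_domain_test {
        // Domain tests take over the bulk; random tests are cut by 100x.
        Budget { random: random / 100, domain: base }
    } else {
        Budget { random, domain: 0 }
    }
}
```

For example, with N = 5M an f64 unary function with a domain test would get 5M domain cases and 200k random cases under these assumptions.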

Sounds like a reasonable plan to start with, especially checking the "interesting" cases first will make development easier. Ultimately the total number of test cases that get run is mainly a function of how much CI time we're willing to spend running them; it might be worth deciding on a rough estimate for that and then tuning the total test count to fit within it. Also, all the "magic" somewhat-arbitrary constants (AROUND, NTESTS, MAX_ASYMPTOTES etc.) for how many of each type of test to run should probably be centralised in a single file somewhere to make it easier to keep track of what tests are being run.

beetrees · 2024-11-21T09:47:10Z

src/math/support/float_traits.rs

+        let is_nan = |x: Self| -> bool {
+            // }
+            // fn is_nan(x: Self) -> bool {
+            // When using mangled-names, the "real" compiler-builtins might not have the
+            // necessary builtin (__unordtf2) to test whether `f128` is NaN.
+            // FIXME(f16_f128): Remove once the nightly toolchain has the __unordtf2 builtin
+            // x is NaN if all the bits of the exponent are set and the significand is non-0
+            x.to_bits() & Self::EXP_MASK == Self::EXP_MASK
+                && x.to_bits() & Self::SIG_MASK != Self::Int::ZERO
+        };
+        if is_nan(self) && is_nan(rhs) { true } else { self.to_bits() == rhs.to_bits() }


Suggested change

-        let is_nan = |x: Self| -> bool {
-            // }
-            // fn is_nan(x: Self) -> bool {
-            // When using mangled-names, the "real" compiler-builtins might not have the
-            // necessary builtin (__unordtf2) to test whether `f128` is NaN.
-            // FIXME(f16_f128): Remove once the nightly toolchain has the __unordtf2 builtin
-            // x is NaN if all the bits of the exponent are set and the significand is non-0
-            x.to_bits() & Self::EXP_MASK == Self::EXP_MASK
-                && x.to_bits() & Self::SIG_MASK != Self::Int::ZERO
-        };
-        if is_nan(self) && is_nan(rhs) { true } else { self.to_bits() == rhs.to_bits() }
+        if self.is_nan() && rhs.is_nan() { true } else { self.to_bits() == rhs.to_bits() }

__unordtf2 is in nightly now, so may as well remove the workaround.

Oh right, we could do this in builtins as well.
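For reference, the bitwise NaN test in the workaround and the equality it feeds can be tried out on a stable type. The sketch below uses `f64` with its masks written out by hand; `is_nan_bits` and `biteq` are names chosen for this example.

```rust
/// NaN test on the raw bits: exponent all ones and significand non-zero.
fn is_nan_bits(x: f64) -> bool {
    const EXP_MASK: u64 = 0x7ff0_0000_0000_0000;
    const SIG_MASK: u64 = 0x000f_ffff_ffff_ffff;
    let bits = x.to_bits();
    // In Rust `&` binds tighter than `==`, so no extra parentheses are needed.
    bits & EXP_MASK == EXP_MASK && bits & SIG_MASK != 0
}

/// Equality as in the suggestion: any two NaNs are equal, everything else is
/// compared by bits (so 0.0 and -0.0 are different).
fn biteq(a: f64, b: f64) -> bool {
    if a.is_nan() && b.is_nan() { true } else { a.to_bits() == b.to_bits() }
}
```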

beetrees · 2024-11-21T10:07:00Z

crates/libm-test/src/f8_impl.rs

+    pub const ALL_LEN: usize = 240;
+
+    /// All non-infinite non-NaN values of `f8` excluding `-0`.
+    pub const ALL: [Self; Self::ALL_LEN] = [


Could a compile-time loop be used to generate this array instead of listing all the values out manually? Something like:

    pub const ALL: [Self; Self::ALL_LEN] = {
        let mut all = [Self(0); Self::ALL_LEN];
        let mut i = 0;
        let mut next = 0b1_1110_111;
        while next >= 0b1_0000_000 {
            all[i] = Self(next);
            i += 1;
            next -= 1;
        }
        let mut next = 0b0_0000_000;
        while next <= 0b0_1110_111 {
            all[i] = Self(next);
            i += 1;
            next += 1;
        }
        assert!(i == Self::ALL_LEN);
        all
    };

beetrees · 2024-11-21T10:07:24Z

crates/libm-test/src/f8_impl.rs

+impl f8 {
+    pub const ALL_LEN: usize = 240;
+
+    /// All non-infinite non-NaN values of `f8` excluding `-0`.


This comment says the array excludes -0, but the array includes -0.

Thanks, I did change that at some point.

beetrees · 2024-11-21T10:09:27Z

crates/libm-test/src/f8_impl.rs

+            // comparison to get the correct result.  (This assumes a twos- or ones-
+            // complement integer representation; if integers are represented in a
+            // sign-magnitude representation, then this flip is incorrect).


Suggested change

-            // comparison to get the correct result.  (This assumes a twos- or ones-
-            // complement integer representation; if integers are represented in a
-            // sign-magnitude representation, then this flip is incorrect).
+            // comparison to get the correct result.

Rust integers are always twos-complement.
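The comment in question appears to describe the usual trick of flipping bits so that floats can be compared as signed integers. A sketch of that trick for `f32` follows; it is the same approach the standard library uses for `f32::total_cmp`, and it relies on `i32` being two's complement, which Rust guarantees. The function name `total_order_key` is made up here, and the f8 code may differ in detail.

```rust
/// Integer key whose signed ordering matches IEEE 754 total ordering of `x`.
fn total_order_key(x: f32) -> i32 {
    let bits = x.to_bits() as i32;
    // `bits >> 31` is an arithmetic shift: all ones for negative floats, zero
    // otherwise. Shifting that right by one as unsigned gives a mask covering
    // everything except the sign bit, so the XOR reverses the order of the
    // negative values and leaves the positive ones untouched.
    bits ^ (((bits >> 31) as u32) >> 1) as i32
}
```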

beetrees · 2024-11-21T10:13:26Z

crates/libm-test/src/domain.rs

+    const DEFINED: (Bound<T>, Bound<T>);
+
+    /// The region, if any, for which the function repeats. Used to test within.
+    const PERIODIC: Option<(Bound<T>, Bound<T>)> = None;


👍 for removing PERIODIC.

beetrees · 2024-11-21T10:29:34Z

crates/libm-test/src/num.rs

+        for i in 0..f8::ALL_LEN {
+            let v = f8::ALL[i];


Suggested change

-        for i in 0..f8::ALL_LEN {
-            let v = f8::ALL[i];
+        for (i, v) in f8::ALL.into_iter().enumerate() {

beetrees · 2024-11-21T10:31:26Z

crates/libm-test/src/num.rs

+        for i in 0..f8::ALL_LEN {
+            let v = f8::ALL[i];


Suggested change

-        for i in 0..f8::ALL_LEN {
-            let v = f8::ALL[i];
+        for (i, v) in f8::ALL.into_iter().enumerate() {

beetrees · 2024-11-21T10:33:03Z

crates/libm-test/src/num.rs

+        for i in 0..f8::ALL_LEN {
+            for j in 0..f8::ALL_LEN {
+                let x = f8::ALL[i];
+                let y = f8::ALL[j];


Suggested change

-        for i in 0..f8::ALL_LEN {
-            for j in 0..f8::ALL_LEN {
-                let x = f8::ALL[i];
-                let y = f8::ALL[j];
+        for (i, x) in f8::ALL.into_iter().enumerate() {
+            for (j, y) in f8::ALL.into_iter().enumerate() {

beetrees · 2024-11-21T10:39:41Z

crates/libm-test/src/gen/domain.rs

+    values.extend(count_down(F::MAX).take(AROUND));
+
+    // Check some special values that aren't included in the above ranges
+    values.push(F::NAN);


Maybe check +/- an sNaN and -F::NAN here as well?
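For concreteness, the extra NaN inputs might look like this for `f32`. This is a sketch: the function name is hypothetical, and it assumes the common convention where the top significand bit marks a quiet NaN. Some platforms may also quieten a signaling NaN when it is moved around as a float, so the exact bits are not guaranteed to survive everywhere.

```rust
/// Quiet and signaling NaNs of both signs, built from raw bit patterns.
fn extra_nans_f32() -> [f32; 4] {
    const QNAN: u32 = 0x7fc0_0000; // exponent all ones, quiet bit set
    const SNAN: u32 = 0x7fa0_0000; // exponent all ones, quiet bit clear, payload non-zero
    const SIGN: u32 = 0x8000_0000;
    [
        f32::from_bits(QNAN),
        f32::from_bits(QNAN | SIGN),
        f32::from_bits(SNAN),
        f32::from_bits(SNAN | SIGN),
    ]
}
```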

beetrees · 2024-11-21T10:51:12Z

crates/libm-test/src/gen/domain.rs

Sounds like a reasonable plan to start with, especially checking the "interesting" cases first will make development easier. Ultimately the total number of test cases that get run is mainly a function of how much CI time we're willing to spend running them; it might be worth deciding on a rough estimate for that and then tuning the total test count to fit within it. Also, all the "magic" somewhat-arbitrary constants (AROUND, NTESTS, MAX_ASYMPTOTES etc.) for how many of each type of test to run should probably be centralised in a single file somewhere to make it easier to keep track of what tests are being run.

tgross35 force-pushed the function-domains branch from 6838d3b to 7c236b8 Compare November 1, 2024 01:39

tgross35 force-pushed the function-domains branch 2 times, most recently from 107b4ab to a16e073 Compare November 2, 2024 06:33

tgross35 force-pushed the function-domains branch 7 times, most recently from dc1a1da to 385e850 Compare November 2, 2024 22:40

tgross35 changed the title ~~Function domains~~ Generators that respect function domains Nov 2, 2024

quaternic reviewed Nov 3, 2024

tgross35 added 3 commits November 3, 2024 02:35

Basic domain

bf22040

Fix up traits Wip up down Tests pass! Cleanup Enable full test range More tests Code cleanup More tests, all pass Cleanup, passing Add visualization scripts Add visualization scripts updates to script clippy clippy clippy Docs update docs docs

First pass at wiring things up

d2ab22c

todo -> unimplemented for f8 since we probably won't implement anything

668ea32

tgross35 commented Nov 3, 2024

tgross35 force-pushed the function-domains branch from 86fe390 to 668ea32 Compare November 3, 2024 09:36

tgross35 changed the title ~~Generators that respect function domains~~ Introduce generators that respect function domains Nov 3, 2024

beetrees reviewed Nov 21, 2024
