Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

float::not_nan #507

Closed
dev-ardi opened this issue Dec 16, 2024 · 30 comments
Closed

float::not_nan #507

dev-ardi opened this issue Dec 16, 2024 · 30 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@dev-ardi
Copy link

Proposal

Problem statement

We could make working with floats a bit nicer by leveraging ?

Motivating examples or use cases

rust-lang/rust#84277 (comment)

fn do_thing_checked(rhs: f32, lhs: f32) -> Option<f32> {
  if (rhs.is_nan() || lhs.is_nan()) { return None }
  let res = do_thing(rhs, lhs);
  if res.is_nan() { return None }
  Some(res)
}

Solution sketch

fn not_nan(self) -> Option<Self> {
    (!num.is_nan()).then_some(num)
}

The example would look like the following:

fn do_thing_checked(rhs: f32, lhs: f32) -> Option<f32> {
  rhs.not_nan()?;
  lhs.not_nan()?;
  do_thing(rhs, lhs).not_nan()
  // or continue using it with
  let res = do_thing(rhs, lhs).not_nan()?;
  ...
}

Alternatives

This could also be an extension trait implemented in a third party crate.

@dev-ardi dev-ardi added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Dec 16, 2024
@bjoernager
Copy link

On a related note: We already have the NonZero and NonNull wrappers, so would there be any merit in adding a NonNan type as well?

@dev-ardi
Copy link
Author

On a related note: We already have the NonZero and NonNull wrappers, so would there be any merit in adding a NonNan type as well?

https://internals.rust-lang.org/t/pre-rfc-nonnan-type/8418/25

@oshaboy
Copy link

oshaboy commented Dec 16, 2024

I like not_nan but sadly there's no optionhaha to go from Option to f64.

So perhaps this should be an IsNan trait that requires an is_nan impl. Or you could just implement Try on f64.

@bjoernager
Copy link

bjoernager commented Dec 16, 2024

https://internals.rust-lang.org/t/pre-rfc-nonnan-type/8418/25

Since the linked discussion, NonZero itself has been extended with normal, checking, and saturating arithmetic methods. The checking methods specifically return an Option<Self> object, a pattern which I feel we could use as the basis for a similar NonNan type, e.g.:

impl NonNan<f32> {
    pub fn checked_sqrt(self) -> Option<Self>;
}

Option<NonNan<f32>> would then have the exact same representation as just f32, but at the same time implementing Try (although I am not sure if the compiler can currently work with multiple representations for None like it does with NaN).

On the other hand, most floating-point operations do not yield NaN as long as all of the operands are numerical. In these cases we can simply define them like we already do with the primitive floats.

@pitaj
Copy link

pitaj commented Dec 16, 2024

I think we need a stronger motivating example. In most cases, checking for NaN isn't necessary because most float operations will happily take a NaN and just output a NaN in response. In some cases, checking for NaN will even prevent optimizations such as vectorization.

@bjoernager
Copy link

bjoernager commented Dec 16, 2024

I think the same can somewhat be said for NonZero. The main use of NonNan as I see it is in interfaces that want to enforce numericality at compile-time, with the most prominent advantage of this being avoiding FPEs from signalling NaNs (a little like how NonZero avoids panics from divide-by-zero).

@cuviper
Copy link
Member

cuviper commented Dec 17, 2024

Option<NonNan<f32>> would then have the exact same representation as just f32, but at the same time implementing Try (although I am not sure if the compiler can currently work with multiple representations for None like it does with NaN).

No, currently the compiler will pick exactly one niche value for None. This also extends to multiple levels, like Option<Option<NonNan>> would pick one niche for None, another for Some(None), and all the non-niche values for the Some(Some(non_nan)). If the compiler changed to be more flexible in the first case, it would lose the extended cases.

So, there's no "free" conversion for NonNan::new(f32) -> Option<Self>, whereas NonZero<int> can just copy it.

@programmerjake
Copy link
Member

programmerjake commented Dec 17, 2024

I think the same can somewhat be said for NonZero. The main use of NonNan as I see it is in interfaces that want to enforce numericality at compile-time, with the most prominent advantage of this being avoiding FPEs from signalling NaNs (a little like how NonZero avoids panics from divide-by-zero).

Well, Rust assumes that all FP exceptions are masked and the status bits are ignored, so signalling NaNs are treated as any other NaN and basically all hardware will not cause any traps or go any slower because the NaN happens to be signalling.

This is different than division by zero, where x86's division instructions will cause interrupts, and presumably because of that it is UB in LLVM, so checking for division by zero is required.

@oshaboy
Copy link

oshaboy commented Dec 17, 2024

Another 2 benefits I mentioned in the other thread are that firstly all NonNan will implement Ord and Eq. Allowing them to be used as keys. Secondly a NaN type lets you create NaNs out of payloads and to get the payload from the NaN allowing for NaN boxing.

@tgross35
Copy link

ordered_float provides a NotNan wrapper that adds Ord, Eq, and Hash.

@kennytm
Copy link
Member

kennytm commented Dec 17, 2024

On the other hand, most floating-point operations do not yield NaN as long as all of the operands are numerical. In these cases we can simply define them like we already do with the primitive floats.

It is quite easy to yield infinities from finite numbers (e.g. MAX + MAX = INFINITY) and then yield NaNs from infinities (e.g. INFINITY - INFINITY = NAN). Similar to NonZero most operations do indeed get filtered out especially the useful ones (+−×÷).

Operation NonNaN Finite
Add, Sub, Sum ❌ (∞ - ∞) ❌ (MAX + MAX)
Mul, Product ❌ (∞ × 0) ❌ (MAX × MAX)
mul_add ❌ (∞×0+0) ❌ (MAX×MAX+0)
Div, Rem, div_euclid, rem_euclid ❌ (0 ÷ 0) ❌ (0 ÷ 0)
Neg, abs, signum, copysign
floor, ceil, round, round_ties_even, trunc
fract ❌ (fract(∞))
powi, recip ❌ (0-1)
powf, sqrt ❌ ((-1)0.5) ❌ ((-1)0.5)
cbrt
exp, exp2, exp_m1 ❌ (exp(MAX))
ln, log, log2, log10, ln_1p ❌ (log(-2)) ❌ (log(-2))
hypot ❌ (hypot(MAX, MAX))
sin, cos, tan, sin_cos ❌ (sin(∞)) 1
asin, acos ❌ (asin(2)) ❌ (asin(2))
atan
atan2
sinh, cosh ❌ (sinh(MAX))
tanh
asinh
acosh, atanh ❌ (acosh(-2)) ❌ (acosh(-2))
gamma ❌ (Γ(-1)) ❌ (Γ(-1))
ln_gamma 2 ❌ (ln Γ(MAX))
next_up, next_down ❌ (next_up(MAX))
to_degrees ❌ (to_degrees(MAX))
to_radians
max, min, maximum, minimum, clamp
midpoint ❌ (midpt(∞, -∞))

Footnotes

  1. Mathematically tan(π/2 + nπ) = ±∞, but a binary floating point number can never accurately represent whole non-zero multiples of π/2, so tan(x) will always produce finite output given finite f32/f64 x.

  2. I'd argue that the sign of ln Γ(-n) is undefined, but on Linux x86_64 and macOS arm64 it seems to always pick the +∞ side. Not sure if this is platform-specific. The manpage of lgamma suggested when computing ln Γ(-n) the value of signgam is unspecified. I think this also applied to lgamma_r.

@bjoernager
Copy link

bjoernager commented Dec 17, 2024

Well, Rust assumes that all FP exceptions are masked and the status bits are ignored, so signalling NaNs are treated as any other NaN and basically all hardware will not cause any traps or go any slower because the NaN happens to be signalling.

I was not aware of this. Thank you. This dismisses my point.

No, currently the compiler will pick exactly one niche value for None. This also extends to multiple levels, like Option<Option> would pick one niche for None, another for Some(None), and all the non-niche values for the Some(Some(non_nan)). If the compiler changed to be more flexible in the first case, it would lose the extended cases.

Could it be worthwhile creating an exception in this case? Even if this costs us the optimisation at higher, nested levels... But the memory concerns as pointed out by @oshaboy would at least be mitigated with Option<NonNan<f32>>, where conversions to/from f32 would then be no-ops.

It is quite easy to yield infinities from finite numbers (e.g. MAX + MAX = INFINITY) and then yield NaNs from infinities (e.g. INFINITY - INFINITY = NAN). Similar to NonZero most operations do indeed get filtered out especially the useful ones (+−×÷).

Wouldn't Finite have to be a subset of NonNan as NaNs are (pedantically speaking) non-numbers? Otherwise thank you for the table.

@kennytm
Copy link
Member

kennytm commented Dec 17, 2024

Wouldn't Finite have to be a subset of NonNan as NaNs are (pedantically speaking) non-numbers?

Yes Finite excluded NaN and Infinities. But you could easily produce both from simple operations.

TBH I find it strange that the original request only rejected NaN. For instance, if you are interfacing with standard SQL databases you will need to reject both NaN and Infinity.

@bjoernager
Copy link

I personally would be for a Finite type instead if it has a greater technical rationale.

@oshaboy
Copy link

oshaboy commented Dec 17, 2024

Could it be worthwhile creating an exception in this case? Even if this costs us the optimisation at higher, nested levels... But the memory concerns as pointed out by @oshaboy would at least be mitigated with Option<NonNan<f32>>, where conversions to/from f32 would then be no-ops.

The problem with that will be that you'll have to unwrap it before running any operations on the NonNan. Which like... Why? The hardware already pretty much deals with it as every operation on a NaN results in NaN (besides max for some reason, they fixed it in the 2019 revision though). It already pretty much acts like map.

I think it would be cleaner to just have f32 implement Try as opposed to having a special case in the compiler just for floats.

IEEE-754 floats are already option types in all but name.

@bjoernager
Copy link

bjoernager commented Dec 17, 2024

You will not need to unwrap it if the compiler can simply interpret it and handle it like an ordinary float (provided the same representation is used). You will need it for getting the result of checking methods (if you're not just using Try), but the same can be said with calling is_nan on normal floats. But I do get your point in that they are mostly equivalent under the hood.

Implementing Try for the primitive types might seem obvious at first, but I believe it would open another can of worms. For example, what would <f32 as Try>::Output be? Because simply setting it to f32 would yield an infinitely long chain of non-irrefutable patterns (although this could be disregarded as a non-issue).

@oshaboy
Copy link

oshaboy commented Dec 17, 2024

You will not need to unwrap it if the compiler can simply interpret it and handle it like an ordinary float (provided the same representation is used).

How? Option doesn't implement arithmetic operations so Option<NonNan<f32>> won't either

Implementing Try for the primitive types might seem obvious at first, but I believe it would open another can of worms. For example, what would ::Output be?

NonNan<f32>?

@the8472
Copy link
Member

the8472 commented Dec 17, 2024

This was discussed during today's libs-api meeting and we're going to reject this and suggest to implement this in a crate instead. As the discussion above suggests this would work better with a NonNan type which we have previously rejected too (#238). Perhaps this could even live as a conversion helper in a crate providing NonNan and similar types.

Additionally we didn't find the examples particularly convincing. It would make combining several arithmetic operations very verbose and in many cases it'd likely be sufficient to check for NaN at the end rather than having to ? at every step.

We also discussed that it would be more palatable if we had pattern types but even those would have difficulty representing NonNan without yet more language support such as hex floats. So overall it'd be too far off from stabilization.

@the8472 the8472 closed this as not planned Won't fix, can't repro, duplicate, stale Dec 17, 2024
@bjoernager
Copy link

bjoernager commented Dec 17, 2024

How? Option doesn't implement arithmetic operations so Option<NonNan> won't either

Yes, I agree that I wasn't very clear on what I was saying. What I think I was trying to say was that Try would be used instead of unwrap when calling arithmetic functions, and that the compiler could optimise this to be relatively equivalent to if the regular primitives had been used instead. Sorry. :/

NonNan?

Okay, but it seems I misunderstood your point then. I thought you wanted to completely avoid adding NonNan and directly implement Try on floats.

@oshaboy
Copy link

oshaboy commented Dec 18, 2024

The details should probably be left to someone who is smarter and better at Rust than me. I cobbled together a proof of concept library and what I came up with was to split the floats into "NaNable" "NonNan" and "Nan". I can't implement Try directly on the primitives because Rust doesn't let you implement traits unless either the trait or the type is defined in the current crate.

@oshaboy
Copy link

oshaboy commented Dec 22, 2024

This was discussed during today's libs-api meeting and we're going to reject this and suggest to implement this in a crate instead.

The problem is crates can't implement Try on floats because neither are local. I tried using a wrapper type but that caused a big nest of intos.

@tgross35
Copy link

Would Try on native floats even do much? Most soft float routines check for NaN as the first thing they do, returning another NaN if that is what they get, so checking for a NaN and then returning is more or less redundant .

The example in the top post isn't concrete enough to be convincing, I think we would understand the goals a bit better if somebody could point to some examples from the real world.

@bjoernager
Copy link

bjoernager commented Dec 22, 2024

Implementing Try here would make short-circuiting cleaner by reducing boilerplate. Now, without coming with any concrete examples, I can't say whether doing this would actually result in faster machine code compared to simply passing non-numbers on (which is already quite optimised on a hardware level) – at least for the general user. But it's not hard to imagine a scenario with a sufficiently complicated function where a few well-placed ? operators could prove beneficial (without necessarily using them on every floating-point operation).

@oshaboy
Copy link

oshaboy commented Dec 22, 2024

Well there's this time a NaN caused a self driving car to crash into a wall

If the "adjust steer" function was marked NonNan this wouldn't happen. And if you have a NonNan type implementing Try seems like the obvious next step.

@oshaboy
Copy link

oshaboy commented Dec 22, 2024

Most soft float routines check for NaN as the first thing they do, returning another NaN

Key word "Most". You don't know whether the dev of the library heard of NaNs and also had enough coffee to remember that NaNs need to be checked. Postel's Law kinda requires you to perform two NaN checks, one by the caller and one by the callee. The compiler will optimize it anyway.

Also from my testing this can't just be made into a crate. You either have to wrap the float types in a "Nanable" type or just give up on Try which would require non zero-cost conversions to and from Option. Though it could be a case of "I suck at Rust"

@tgross35
Copy link

tgross35 commented Dec 22, 2024

The two aspects here are optimization (by short circuiting) and NaN-safety. My comment was mostly addressing the optimization case, let bar = (foo.sin()?.sqrt()?.tan()? / f32::consts::PI)?; is unlikely to wind up significantly faster than let bar = foo.sin().sqrt().tan() / f32::consts::PI;. Indeed if you have dozens of operations chained then you may benefit from returning early, but I don't think this is a useful enough optimization in the general case to merit adding a shorthand. The average user shouldn't be wondering if they should ? every float operation.

Regarding NaN-safety, ? on float primitives doesn't gain anything because the return type is still a float; you could accidentally skip a ? and not know about it. A wrapper like NotNaN is the correct thing to do here because it indicates precisely where NaNs need to be handled, with compiletime verification that nothing got missed. Implementing Try on this kind of type does make sense, once that is stable.

It is unfortunately that the Try trait isn't yet stable, but if the NotNan ACP was rejected then there really isn't much to be done in the standard library. Crates providing NotNaN maybe-NaN wrapper types could consider adding a t! macro, like Rust had before ? came around https://doc.rust-lang.org/std/macro.try.html.

(I wish there was a better name like YesANumber rather than NotNotANumber)

@oshaboy
Copy link

oshaboy commented Dec 22, 2024

Regarding NaN-safety, ? on float primitives doesn't gain anything because the return type is still a float; you could accidentally skip a ? and not know about it. A wrapper like NotNaN is the correct thing to do here because it indicates precisely where NaNs need to be handled, with compiletime verification that nothing got missed. Implementing Try on this kind of type does make sense, once that is stable.

Yes that's what I am trying to do. You have Nan and NonNan type templates. Nan is basically just a container for the NaN payload. NonNan is a thin wrapper around the underlying float type.

It is unfortunately that the Try trait isn't yet stable, but if the #238 was rejected then there really isn't much to be done in the standard library. Crates providing NotNaN could consider adding a t! macro, like Rust had before ? came around https://doc.rust-lang.org/std/macro.try.html.

The problem with making it a crate is that impl Try for f64 doesn't work because both Try and f64 are standard library. I tried using a wrapper type and it just makes a mess when you try to interop with any function that takes or returns a primitive float.

I think William Kahan never intended high level programming languages to just send around NaNs between functions. It's an assembly level construct for an assembly level language. Sadly it got shoved up the abstraction layers just like machine integers (which are another can of worms)

@tgross35
Copy link

The problem with making it a crate is that impl Try for f64 doesn't work because both Try and f64 are standard library.

Right. But as I mentioned above, this would not be very useful because it provides no additional type safety, meaning the only thing it adds is a way to short circuit.

I tried using a wrapper type and it just makes a mess when you try to interop with any function that takes or returns a primitive float.

That mess is exactly what you want if you are trying to enforce NaN safety because it indicates places where NaNs could enter your program. If it seems excessive then you may be trying to convert too often, really the type invariants only need to be enforced at function boundaries:

use ordered_float::{FloatIsNan, NotNan};

// Demonstrate converting once after operations. This is probably what you want.
pub fn foo(input: f64) -> Result<NotNan<f64>, FloatIsNan> {
    let mut x = input.sin();
    x = x.sqrt();
    x *= 1234.56;
    NotNan::new(x)
}

// Check for NaN at multiple points, bail early if found. Not much reason to do this.
pub fn bar(input: f64) -> Result<NotNan<f64>, FloatIsNan> {
    let mut x = NotNan::new(input.sin())?;
    x = NotNan::new(x.sqrt())?;
    x *= NotNan::new(1234.56).expect("1234.56 is not a NaN");
    Ok(x)
}

pub fn must_not_be_nan_or_bad_things_happen(x: NotNan<f64>) {
    if x.into_inner().is_nan() {
        // guaranteed to never happen because of the type invariant
        drive_the_car_into_the_wall();
    }
}

playground

@oshaboy
Copy link

oshaboy commented Dec 22, 2024

That mess is exactly what you want if you are trying to enforce NaN safety because it indicates places where NaNs could enter your program.

No the mess mostly came from conversion between the Nanable wrapper type and the primitive type and back again. That conversion is always safe and zero cost but I can't figure out how to make it implicit.

@oshaboy
Copy link

oshaboy commented Dec 23, 2024

But as I mentioned above, this would not be very useful because it provides no additional type safety, meaning the only thing it adds is a way to short circuit.

It also adds a way for it to interface with Try templates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

9 participants