Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for num_midpoint #110840

Open
3 of 7 tasks
Urgau opened this issue Apr 26, 2023 · 9 comments · May be fixed by #134340
Open
3 of 7 tasks

Tracking Issue for num_midpoint #110840

Urgau opened this issue Apr 26, 2023 · 9 comments · May be fixed by #134340
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Urgau
Copy link
Member

Urgau commented Apr 26, 2023

Feature gate: #[feature(num_midpoint)], #[feature(num_midpoint_signed)] and #![feature(const_num_midpoint)]

This is a tracking issue for the midpoint function to {u,i}{8,16,32,64,128,size}, NonZeroU{8,16,32,64,size} and f{32,64}.

The midpoint function calculates the middle point of lhs and rhs.

Public API

impl {u,i}{8,16,32,64,128,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl NonZeroU{8,16,32,64,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl f{32,64} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

Steps / History

Unresolved Questions

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@Urgau Urgau added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 26, 2023
@barryfam
Copy link

If this is still open for comments, I'd like to support rounding towards the first argument.

In binary search, a.midpoint(b) will eventually be called where a and b are adjacent integers. If b is an exclusive range bound, it is helpful to know the function will always return a and not b. [insert Zeno's Paradox meme]

Another reason is for symmetry of step size; see CppCon 2019 talk on C++'s std::midpoint, question at 1:00:00:

Q: [Why is midpoint(a, b) different from midpoint(b, a)?]

A: Partially it's, when you think about it, you're going midway from a to b or midway from b to a. But — I suspect that this comes from the people, who were interested in the beginning — if you're doing simulations and you're stepping around a space where you're measuring things as you step, you start at 1,000 and you step halfway over and over again, you want those steps to be consistent. It doesn't matter which direction you're coming from. If you're coming from +1,000 toward 0 or -1,000 up to 0, you want those steps to be the same size whether you're going counting down or counting up.

In other words, they want:

midpoint(7, 0) = 4    // vs. "round to -∞" would be 3
midpoint(-7, 0) = -4

@scottmcm
Copy link
Member

scottmcm commented Jun 6, 2023

In binary search, a.midpoint(b) will eventually be called where a and b are adjacent integers.

I don't find this part persuasive, as ever binary search I've ever seen has had a ≤ b as an invariant, and thus round-to-first and round-to-negative-infinity are the same.

you want those steps to be consistent

Can you elaborate on the value of "consistent" here? I would have thought that if one needed predictable and consistent steps, they'd use a power-of-two number of quanta, since otherwise each step isn't half the size of the previous one.


I guess this boils down to midpoint(a, b) == midpoint(b, a) and midpoint(-a, -b) == -midpoint(a, b) both being "obvious" expectations, but they're incompatible with each other. (Kind of like how -a / -b = -(a / b) and a % b ε [0, |b|) are also "obvious" but incompatible.)

I'm personally still of the opinion that being consistent with (a + b) / 2 (for unsigned values where that doesn't overflow) is the way to go. After all, even P0811R3 used that as the example of how people would write it if that didn't overflow. And it sure would be nice to let clippy say "replace (a + b) / 2 with midpoint" without needing to warn that it might change the behaviour in non-overflow cases too.


And since it was in the PR, not the tracking issue, cc #92048 (comment) where I argued for rounding to -∞ instead of to first.


Also, FWIW, cranelift has https://docs.rs/cranelift/latest/cranelift/prelude/trait.InstBuilder.html#method.avg_round for (a + b + 1) >> 1 without overflow, which is rounding to +∞. I'm not sure why it picked that one, but if it has it from WASM that might be an interesting proposal to explore for rationale too.


EDIT: I happened to stumble on this post accidentally today https://devblogs.microsoft.com/oldnewthing/20220207-00/?p=106223, which is also about being consistent with (a + b)/2.

@bluebear94
Copy link
Contributor

For x86 targets, wouldn’t it be possible to implement {u32, u64}::num_midpoint using the RCR instruction?

@scottmcm
Copy link
Member

scottmcm commented Jan 13, 2024

@bluebear94 Exactly what assembly instructions is more up to LLVM than to us. For now the SHLD approach it gives us is pretty good -- only one instruction more -- someone could always open an LLVM bug for a different instruction later.

@RustyYato
Copy link
Contributor

RustyYato commented Feb 13, 2024

Why doesn't f32::midpoint upcast to f64 to calculate the midpoint? This should give the same results (excluding NaNs, which may have different bit values).
This should enable a very simple branchless implementation, Godbolt

fn midpoint(a: f32, b: f32) {
    ((a as f64) + (b as f64) / 2.0) as f32
}

I checked this with kani, and for all finite values it is bit-wise identical to the current implementation and they both output NaNs at the same time.

NOTE: that there aren't any calls to kani::assume, so this works for every pair of f32.

#[kani::proof]
fn test_midpoint() {
    let a = kani::any::<f32>();
    let b = kani::any::<f32>();

    // my proposed implementation
    let c = (a as f64 + b as f64) / 2.0;
    let c = c as f32;

    let d = a.midpoint(b); // the current std implementation

    if c.is_nan() {
        assert!(d.is_nan())
    } else {
        assert_eq!(c, d)
    }
}

Run kani with this command
NOTE: this needs --no-overflow-checks because by default kani checks that addition doesn't cause NaNs, but we don't care about that here and there isn't any possibility of overflow anyways.

cargo kani --no-overflow-checks

@Urgau
Copy link
Member Author

Urgau commented Feb 13, 2024

@RustyYato As noted in the implementation PR, f64 and f32 is derivated from the libcxx implementation.

This should enable a very simple branchless implementation, Godbolt

Feel free to send a PR changing the implementation (code source).

RustyYato added a commit to RustyYato/rust that referenced this issue Feb 14, 2024
This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless, and only differs in which NaN
values are produced (if any are produced at all). Which is fine to change.
Aside from NaN handling, this implementation produces bitwise identical
results to the original implementation.
@RustyYato
Copy link
Contributor

Alright, I've put up #121062, 🤞

bors added a commit to rust-lang-ci/rust that referenced this issue May 21, 2024
Change f32::midpoint to upcast to f64

This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless and only differs in which NaN values are produced (if any are produced at all), which is fine to change. Aside from NaN handling, this implementation produces bitwise identical results to the original implementation.

Question: do we need a codegen test for this? I didn't add one, since the original PR rust-lang#92048 didn't have any codegen tests.
bors added a commit to rust-lang-ci/rust that referenced this issue May 24, 2024
Change f32::midpoint to upcast to f64

This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless and only differs in which NaN values are produced (if any are produced at all), which is fine to change. Aside from NaN handling, this implementation produces bitwise identical results to the original implementation.

Question: do we need a codegen test for this? I didn't add one, since the original PR rust-lang#92048 didn't have any codegen tests.
RustyYato added a commit to RustyYato/rust that referenced this issue Jun 2, 2024
This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless, and only differs in which NaN
values are produced (if any are produced at all). Which is fine to change.
Aside from NaN handling, this implementation produces bitwise identical
results to the original implementation.

Reintroduce f32-based midpoint on some ARM targets

Add tests for large differences in magnitude

fix typo

switch to a whitelist based on platform

make test generic over f32/f64

add wasm to whitelisted target features

Co-authored-by: Alphyr <47725341+a1phyr@users.noreply.github.com>

switch from target_family to target_arch
RustyYato added a commit to RustyYato/rust that referenced this issue Jun 2, 2024
This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless, and only differs in which NaN
values are produced (if any are produced at all). Which is fine to change.
Aside from NaN handling, this implementation produces bitwise identical
results to the original implementation.

The new implementation is gated on targets that have a fast 64-bit
floating point implementation in hardware, and on WASM.
jieyouxu added a commit to jieyouxu/rust that referenced this issue Jun 2, 2024
Change f32::midpoint to upcast to f64

This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless and only differs in which NaN values are produced (if any are produced at all), which is fine to change. Aside from NaN handling, this implementation produces bitwise identical results to the original implementation.

Question: do we need a codegen test for this? I didn't add one, since the original PR rust-lang#92048 didn't have any codegen tests.
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Jun 2, 2024
Change f32::midpoint to upcast to f64

This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless and only differs in which NaN values are produced (if any are produced at all), which is fine to change. Aside from NaN handling, this implementation produces bitwise identical results to the original implementation.

Question: do we need a codegen test for this? I didn't add one, since the original PR rust-lang#92048 didn't have any codegen tests.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jun 3, 2024
Rollup merge of rust-lang#121062 - RustyYato:f32-midpoint, r=the8472

Change f32::midpoint to upcast to f64

This has been verified by kani as a correct optimization

see: rust-lang#110840 (comment)

The new implementation is branchless and only differs in which NaN values are produced (if any are produced at all), which is fine to change. Aside from NaN handling, this implementation produces bitwise identical results to the original implementation.

Question: do we need a codegen test for this? I didn't add one, since the original PR rust-lang#92048 didn't have any codegen tests.
@tgross35
Copy link
Contributor

tgross35 commented Jul 18, 2024

const LO: f64 = f64::MIN_POSITIVE * 2.;
const HI: f64 = f64::MAX / 2.;
let (a, b) = (self, other);
let abs_a = a.abs_private();
let abs_b = b.abs_private();
if abs_a <= HI && abs_b <= HI {
// Overflow is impossible
(a + b) / 2.
} else if abs_a < LO {
// Not safe to halve a
a + (b / 2.)
} else if abs_b < LO {
// Not safe to halve b
(a / 2.) + b
} else {
// Not safe to halve a and b
(a / 2.) + (b / 2.)
}

I think the first condition could have an || a.is_sign_positive() !- b.is_sign_positive(), since this is another never overflowing case.

Also, MIN_POSITIVE is the min positive normal number, like in the C++ implementation, but we might be able to adjust this to f32::From_bits(0b10) (2 * min subnormal) since Rust shouldn't flush subnormals (cc rust-lang/rfcs#3514).

The default branch's comment is incorrect too, that operation is the safe one. I have a commit to update it in #127027.

@ronnodas
Copy link

ronnodas commented Sep 8, 2024

I guess this boils down to midpoint(a, b) == midpoint(b, a) and midpoint(-a, -b) == -midpoint(a, b) both being "obvious" expectations, but they're incompatible with each other. (Kind of like how -a / -b = -(a / b) and a % b ε [0, |b|) are also "obvious" but incompatible.)

Round-towards-zero (or round-away-from-zero) satisfies both of those properties so they're not incompatible.

Urgau added a commit to Urgau/rust that referenced this issue Oct 26, 2024
Instead of towards negative infinity as is currently the case.

This done so that the obvious expectations of
`midpoint(a, b) == midpoint(b, a)` and
`midpoint(-a, -b) == -midpoint(a, b)` are true, which makes the even
more obvious implementation `(a + b) / 2` true.

rust-lang#110840 (comment)
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 27, 2024
…r=dtolnay

Round negative signed integer towards zero in `iN::midpoint`

This PR changes the implementation of `iN::midpoint` (the signed variants) to round negative signed integers **towards zero** *instead* of negative infinity as is currently the case.

This is done so that the obvious expectations[^1] of `midpoint(a, b) == midpoint(b, a)` and `midpoint(-a, -b) == -midpoint(a, b)` are true, which makes the even more obvious implementation `(a + b) / 2` always true.

The unsigned variants `uN::midpoint` (which are being [FCP-ed](rust-lang#131784 (comment))) already rounds towards zero, so there is no consistency issue.

cc `@scottmcm`
r? `@dtolnay`

[^1]: rust-lang#110840 (comment)
RalfJung added a commit to RalfJung/rust that referenced this issue Nov 30, 2024
Stabilize unsigned and float variants of `num_midpoint` feature

This PR proposes that we stabilize the unsigned variants of the [`num_midpoint`](rust-lang#110840 (comment)) feature as well as the floats variants, since they are not subject to any unresolved questions, which is equivalent to doing `(a + b) / 2` (and `(a + b) >> 1`) in a sufficiently large number.

The stabilized API surface would be:

```rust
/// Calculates the middle point of `self` and `rhs`.
///
/// `midpoint(a, b)` is `(a + b) / 2` as if it were performed in a sufficiently-large unsigned integral type.
/// This implies that the result is always rounded towards negative infinity and that no overflow will ever occur.

impl u{8,16,32,64,128,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl NonZeroU{8,16,32,64,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl f{32,64} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}
```

The signed variants `u{8,16,32,64,128,size}` would remain gated, until a decision is made about the rounding mode, in other words that the [unresolved questions](rust-lang#110840 (comment)) are resolved.

cc `@rust-lang/libs-api`
cc `@scottmcm`
r? libs-api
jhpratt added a commit to jhpratt/rust that referenced this issue Dec 2, 2024
Stabilize unsigned and float variants of `num_midpoint` feature

This PR proposes that we stabilize the unsigned variants of the [`num_midpoint`](rust-lang#110840 (comment)) feature as well as the floats variants, since they are not subject to any unresolved questions, which is equivalent to doing `(a + b) / 2` (and `(a + b) >> 1`) in a sufficiently large number.

The stabilized API surface would be:

```rust
/// Calculates the middle point of `self` and `rhs`.
///
/// `midpoint(a, b)` is `(a + b) / 2` as if it were performed in a sufficiently-large unsigned integral type.
/// This implies that the result is always rounded towards negative infinity and that no overflow will ever occur.

impl u{8,16,32,64,128,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl NonZeroU{8,16,32,64,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl f{32,64} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}
```

The signed variants `u{8,16,32,64,128,size}` would remain gated, until a decision is made about the rounding mode, in other words that the [unresolved questions](rust-lang#110840 (comment)) are resolved.

cc `@rust-lang/libs-api`
cc `@scottmcm`
r? libs-api
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Dec 2, 2024
Rollup merge of rust-lang#131784 - Urgau:stabilize-midpoint, r=dtolnay

Stabilize unsigned and float variants of `num_midpoint` feature

This PR proposes that we stabilize the unsigned variants of the [`num_midpoint`](rust-lang#110840 (comment)) feature as well as the floats variants, since they are not subject to any unresolved questions, which is equivalent to doing `(a + b) / 2` (and `(a + b) >> 1`) in a sufficiently large number.

The stabilized API surface would be:

```rust
/// Calculates the middle point of `self` and `rhs`.
///
/// `midpoint(a, b)` is `(a + b) / 2` as if it were performed in a sufficiently-large unsigned integral type.
/// This implies that the result is always rounded towards negative infinity and that no overflow will ever occur.

impl u{8,16,32,64,128,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl NonZeroU{8,16,32,64,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl f{32,64} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}
```

The signed variants `u{8,16,32,64,128,size}` would remain gated, until a decision is made about the rounding mode, in other words that the [unresolved questions](rust-lang#110840 (comment)) are resolved.

cc `@rust-lang/libs-api`
cc `@scottmcm`
r? libs-api
@Urgau Urgau linked a pull request Dec 15, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants