Add LLVM `minnum`/`maxnum` intrinsics #46926
LLVM’s intrinsics `minnum` and `maxnum` are now used for `min` and `max` on `f32` and `f64`. This resolves rust-lang#18384.
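For readers unfamiliar with the semantics at stake, here is a minimal sketch of how the standard `min` and `max` methods on floats treat NaN. It only exercises the documented behaviour of the standard library methods and does not show how the intrinsics are wired up in this PR.

```rust
fn main() {
    let nan = f32::NAN;

    // When exactly one argument is NaN, the other argument is returned.
    assert_eq!(nan.min(1.0), 1.0);
    assert_eq!(1.0_f32.max(nan), 1.0);

    // When both arguments are NaN, the result is NaN.
    assert!(nan.min(nan).is_nan());

    // Ordinary values behave as expected, on `f64` as well.
    assert_eq!(2.0_f64.min(3.0), 2.0);
    assert_eq!(2.0_f64.max(3.0), 3.0);
}
```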
(rust_highfive has picked a reviewer for you, use r? to override)
LGTM, as soon as CI passes.
src/libcore/intrinsics.rs
/// Returns the minimum of two `f32` values.
#[cfg(stage0)]
pub unsafe fn minnumf32(x: f32, y: f32) -> f32 {
    (if x < y || y != y { x } else { y }) * 1.0
Shouldn't you copy the deleted comments from libcore at least once here?
Good call. Even though these are only used for bootstrapping, they still ought to have that correctness warning.
Okay, I was wondering whether they were necessary if just used for bootstrapping. I'll add them back!
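To make the behaviour of the bootstrapping fallback under review concrete, here is a minimal sketch that copies its expression into a safe function and checks the NaN cases. The name `minnumf32_fallback` is hypothetical and is used only for illustration. The trailing `* 1.0` is carried over from the quoted snippet unchanged.

```rust
/// Hypothetical safe copy of the stage0 fallback expression quoted in the review.
fn minnumf32_fallback(x: f32, y: f32) -> f32 {
    // `y != y` is true only when `y` is NaN, so a NaN `y` selects `x`.
    // When `x` is NaN, `x < y` is false, so `y` is selected instead.
    (if x < y || y != y { x } else { y }) * 1.0
}

fn main() {
    // Ordinary values: the smaller one is returned regardless of argument order.
    assert_eq!(minnumf32_fallback(1.0, 2.0), 1.0);
    assert_eq!(minnumf32_fallback(2.0, 1.0), 1.0);

    // One NaN argument: the non-NaN argument is returned.
    assert_eq!(minnumf32_fallback(f32::NAN, 2.0), 2.0);
    assert_eq!(minnumf32_fallback(1.0, f32::NAN), 1.0);

    // Two NaN arguments: the result is NaN.
    assert!(minnumf32_fallback(f32::NAN, f32::NAN).is_nan());
}
```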
@bors r=joshtriplett rollup
📌 Commit a060814 has been approved by
Add LLVM `minnum`/`maxnum` intrinsics LLVM’s intrinsics `minnum` and `maxnum` are now used for `min` and `max` on `f32` and `f64`. This resolves rust-lang#18384.
@bors rollup- r- These intrinsics (
It looks like in order to get these working with asm.js, I'll probably need to create a PR on the emscripten repository to implement them (I'm not sure how quick the turnaround is after a PR is merged on the repository; any idea what the process is after that, @kennytm?). I'll try to do this soon. EDIT: I've opened emscripten-core/emscripten#5978 and emscripten-core/emscripten-fastcomp#210 to address this issue. Let's see what happens there.
From #42423 (comment)
Is the x64 assembly better now?
Running the test case from the linked comment through godbolt reveals that the x86 situation is still similar: identical asm for general min/max, but the intrinsic optimizes less well when used for clamping.
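The linked test case is not reproduced in this thread, so the following is only a sketch of what "clamping" with `min` and `max` usually looks like. The helper name `clamp_f32` and the chosen bounds are assumptions made for illustration, not the code that was run through godbolt.

```rust
/// Hypothetical clamping helper: raise `x` to at least `lo`, then cap it at `hi`.
fn clamp_f32(x: f32, lo: f32, hi: f32) -> f32 {
    x.max(lo).min(hi)
}

fn main() {
    assert_eq!(clamp_f32(5.0, 0.0, 1.0), 1.0); // above the range
    assert_eq!(clamp_f32(-1.0, 0.0, 1.0), 0.0); // below the range
    assert_eq!(clamp_f32(0.5, 0.0, 1.0), 0.5); // inside the range

    // A NaN input is replaced by `lo`, because `max` returns the non-NaN argument.
    assert_eq!(clamp_f32(f32::NAN, 0.0, 1.0), 0.0);
}
```

Chaining the two calls like this is the kind of pattern where the optimizer has to reason about both operations together, which is where the comment above reports the intrinsic doing less well.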
This is quite interesting: before submitting the PR, I checked that the intrinsics generated better instructions in isolation than the existing Rust implementation, but in @rkruppe's example, they're identical. Here's a very small change to the example: the only thing that's changed is the order of the conditions, but the generated asm is significantly worse.
if other > self || other.is_nan() { self } else { other } // 16 instructions (min)
if other.is_nan() || other > self { self } else { other } // 11 instructions (min)
I hadn't spotted the issue shown by clamping, but given that the advantage the intrinsics demonstrate seems to be achievable by the Rust code, I imagine it'd be better to hold off from using the intrinsics for What's the best way to proceed? Use Rust for
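The two orderings quoted in this comment differ only in which operand of `||` comes first, so they should always return the same value. A minimal sketch checking that over a few sample inputs follows. The function names `min_cmp_first` and `min_nan_first` are hypothetical, `this` stands in for `self`, and the sketch says nothing about the instruction counts, which depend on the compiler version and target.

```rust
/// Hypothetical stand-in for the first ordering: comparison, then NaN check.
fn min_cmp_first(this: f32, other: f32) -> f32 {
    if other > this || other.is_nan() { this } else { other }
}

/// Hypothetical stand-in for the second ordering: NaN check, then comparison.
fn min_nan_first(this: f32, other: f32) -> f32 {
    if other.is_nan() || other > this { this } else { other }
}

fn main() {
    let samples = [-1.0_f32, 0.0, 1.5, f32::INFINITY, f32::NEG_INFINITY, f32::NAN];
    for &a in &samples {
        for &b in &samples {
            // Compare bit patterns so that NaN results also count as equal.
            assert_eq!(min_cmp_first(a, b).to_bits(), min_nan_first(a, b).to_bits());
        }
    }
}
```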
Optimizing the Rust implementation would be great. I don't think there's any point in having the intrinsics if we're not going to use them. Very unfortunate that you've already spent the time on them, but at least another optimization came out of it :)
Swapping the conditions generates more efficient x86 assembly. See rust-lang#46926 (comment).
Let's close this PR for the time being, then. When LLVM gets a little bit better at optimising
Optimise min/max Swapping the conditions generates more efficient x86 assembly. See #46926 (comment). r? @rkruppe
Use LLVM intrinsics for floating-point min/max Resurrection of #46926, now that the optimisation issues are fixed. I've confirmed locally that #61384 solves the issues. I'm not sure if we're allowed to move the `min`/`max` methods from libcore to libstd: I can't quite tell what the status is from #50145. However, this is necessary to use the intrinsics. Fixes #18384. r? @SimonSapin cc @rkruppe @nikic