Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LLVM minnum/maxnum intrinsics #46926

Closed
wants to merge 2 commits into from
Closed

Conversation

varkor
Copy link
Member

@varkor varkor commented Dec 21, 2017

LLVM’s intrinsics minnum and maxnum are now used for min and
max on f32 and f64. This resolves #18384.

LLVM’s intrinsics `minnum` and `maxnum` are now used for `min` and
`max` on `f32` and `f64`. This resolves rust-lang#18384.
@rust-highfive
Copy link
Collaborator

r? @joshtriplett

(rust_highfive has picked a reviewer for you, use r? to override)

@joshtriplett
Copy link
Member

LGTM, as soon as CI passes.

/// Returns the minimum of two `f32` values.
#[cfg(stage0)]
pub unsafe fn minnumf32(x: f32, y: f32) -> f32 {
(if x < y || y != y { x } else { y }) * 1.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you copy the deleted comments from libcore at least once here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Even though these are only used for bootstrapping, they still ought to have that correctness warning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I was wondering whether they were necessary if just used for bootstrapping. I'll add them back!

/// Returns the minimum of two `f32` values.
#[cfg(stage0)]
pub unsafe fn minnumf32(x: f32, y: f32) -> f32 {
(if x < y || y != y { x } else { y }) * 1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Even though these are only used for bootstrapping, they still ought to have that correctness warning.

@estebank
Copy link
Contributor

@bors r=joshtriplett rollup

@bors
Copy link
Contributor

bors commented Dec 21, 2017

📌 Commit a060814 has been approved by joshtriplett

@kennytm kennytm added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Dec 22, 2017
kennytm added a commit to kennytm/rust that referenced this pull request Dec 23, 2017
Add LLVM `minnum`/`maxnum` intrinsics

LLVM’s intrinsics `minnum` and `maxnum` are now used for `min` and
`max` on `f32` and `f64`. This resolves rust-lang#18384.
@kennytm
Copy link
Member

kennytm commented Dec 23, 2017

@bors rollup- r-

These intrinsics (llvm_minnum_f64) are undefined on asm.js. See https://travis-ci.org/rust-lang/rust/jobs/320632051 for details.

@kennytm kennytm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 23, 2017
@varkor
Copy link
Member Author

varkor commented Dec 23, 2017

It looks like in order to get these working with asm.js, I'll probably need to create a PR on the emscripten repository to implement them (I'm not sure how quick the turnaround is after a PR is merged on the repository — any idea what the process is after that @kennytm?). I'll try to do this soon.

EDIT: I've opened emscripten-core/emscripten#5978 and emscripten-core/emscripten-fastcomp#210 to address this issue. Let's see what happens there.

@kennytm kennytm added S-blocked Status: Blocked on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 24, 2017
@scottmcm
Copy link
Member

From #42423 (comment)

There is an LLVM intrinsic llvm.minnum.fNN. When compared with the pure Rust implementation, the generated assembly is generally worse on x86_64 and i686, and better on ARM and AArch64 (Both won't call fmaxf/fminf unless there is no hard-float support).

Is the x64 assembly better now?

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Dec 29, 2017

Running the test case from the linked comment through godbolt reveals that the x86 situation is still similar: identical asm for general min/max, but the intrinsic optimizes less well when used for clamping.

@varkor
Copy link
Member Author

varkor commented Dec 30, 2017

This is quite interesting — before submitting the PR, I checked that the intrinsics generated better instructions in isolation than the existing Rust implementation, but in @rkruppe's example, they're identical. Here's a very small change to the example — the only thing that's changed is the order of the conditions, but the generated asm is significantly worse.

if other > self || other.is_nan() { self } else { other } // 16 instructions (min)
if other.is_nan() || other > self { self } else { other } // 11 instructions (min)

max is similar. Notice that it's the former, less efficient, version which is currently used in the standard library.

I hadn't spotted the issue shown by clamping, but given that the advantage the intrinsics demonstrate seems to be achievable by the Rust code, I imagine it'd be better to hold off from using the intrinsics for min and max for now.

What's the best way to proceed? Use Rust for min and max, but make the condition optimisation here? Is it worth having the intrinsics if they're not directly used in the standard library yet?

@hanna-kruppe
Copy link
Contributor

Optimizing the Rust implementation would be great. I don't think there's any point in having the intrinsics if we're not going to use them. Very unfortunate that you've already spent the time on them, but at least another optimization came out of it :)

varkor added a commit to varkor/rust that referenced this pull request Dec 30, 2017
Swapping the conditions generates more efficient x86 assembly. See
rust-lang#46926 (comment).
@varkor varkor mentioned this pull request Dec 30, 2017
@varkor
Copy link
Member Author

varkor commented Dec 30, 2017

Let's close this PR for the time being, then. When LLVM gets a little bit better at optimising minnum/maxnum intrinsics we can look into this again.

@varkor varkor closed this Dec 30, 2017
bors added a commit that referenced this pull request Dec 31, 2017
Optimise min/max

Swapping the conditions generates more efficient x86 assembly. See
#46926 (comment).

r? @rkruppe
bors added a commit that referenced this pull request Jun 7, 2019
Use LLVM intrinsics for floating-point min/max

Resurrection of #46926, now that the optimisation issues are fixed. I've confirmed locally that #61384 solves the issues.

I'm not sure if we're allowed to move the `min`/`max` methods from libcore to libstd: I can't quite tell what the status is from #50145. However, this is necessary to use the intrinsics.

Fixes #18384.

r? @SimonSapin
cc @rkruppe @nikic
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-blocked Status: Blocked on something else such as an RFC or other implementation work.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use minnum/maxnum LLVM instrinsics for fmin/fmax
8 participants