Add new instructions: min / max #33

Open
nemequ opened this issue Aug 1, 2021 · 4 comments
Labels
in-overview Instruction has been added to Overview.md instruction-proposal

Comments

@nemequ

nemequ commented Aug 1, 2021

  1. What are the instructions being proposed?
  • relaxed f32x4.min
  • relaxed f32x4.max
  • relaxed f64x2.min
  • relaxed f64x2.max
  1. What are the semantics of these instructions?

Return the lane-wise minimum or maximum of two values. If either input is NaN, or the inputs are -0.0 and +0.0, then the value returned is implementation-defined.

  1. How will these instructions be implemented? Give examples for at least
    x86-64 and ARM64. Also provide reference implementation in terms of 128-bit
    Wasm SIMD.

Pretty much all architectures which support SIMD have min / max instructions. On those platforms, these would be mapped to those instructions.

relaxed f32x4.min:

  • x86 + SSE: minps
  • armv7 + NEON: vmin.f32
  • AArch64: fmin or fminnm
  • POWER6 + AltiVec: vminfp
  • POWER7 + VSX: xvminsp
  • MIPS + MSA: fmin.w

relaxed f32x4.max:

  • x86 + SSE: maxps
  • armv7 + NEON: vmax.f32
  • AArch64: fmax or fmaxnm
  • POWER6 + AltiVec: vmaxfp
  • POWER7 + VSX: xvmaxsp
  • MIPS + MSA: fmax.w

relaxed f64x2.min:

  • x86 + SSE2: minpd
  • AArch64: fmin or fminnm
  • POWER7 + VSX: xvmindp
  • MIPS + MSA: fmin.d

relaxed f64x2.max:

  • x86 + SSE2: maxpd
  • AArch64: fmax or fmaxnm
  • POWER7 + VSX: xvmaxdp
  • MIPS + MSA: fmax.d

On platforms where no hardware support is available, implementations could use the same code they use for the pmin / pmax instructions (or min/max if they prefer, or some other sequence).

  1. How does behavior differ across processors? What new fingerprinting surfaces will be exposed?

Different processors will return different results if one of the inputs is NaN.

  • If using fmin/fmax, Arm will always return NaN if either input is NaN
  • POWER (and I think MIPS) will return the number, as will Arm if using fminnm/fmaxnm.
  • On x86, the result depends on the argument order.
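
To make the differences listed above concrete, the following scalar models sketch one lane of each instruction family. The function names are made up for illustration, and the models only cover quiet NaNs (signaling NaNs and exception flags are ignored), so treat this as an approximation of the hardware behavior rather than a specification.

```c
#include <math.h>

/* x86 minps, one lane: returns the second operand whenever the
   comparison is false, which includes NaN inputs and +0.0 vs -0.0. */
static float x86_minps_lane(float a, float b) {
    return a < b ? a : b;
}

/* Arm fmin, one lane: NaN if either input is NaN; -0.0 is treated
   as smaller than +0.0. */
static float arm_fmin_lane(float a, float b) {
    if (isnan(a) || isnan(b)) return NAN;
    if (a == b) return signbit(a) ? a : b;
    return a < b ? a : b;
}

/* Arm fminnm, one lane: a quiet NaN loses to a number; the result
   is NaN only if both inputs are NaN. */
static float arm_fminnm_lane(float a, float b) {
    if (isnan(a)) return b;
    if (isnan(b)) return a;
    if (a == b) return signbit(a) ? a : b;
    return a < b ? a : b;
}
```

With these models, `x86_minps_lane(1.0f, NAN)` yields NaN while `x86_minps_lane(NAN, 1.0f)` yields 1.0f, which is the argument-order dependence described above.
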
  1. What use cases are there?

Whenever you want the minimum or maximum of two values and don't have NaNs in your data. Currently the programmer must choose between instructions which will perform sub-optimally on Arm (pmin/pmax) or x86 (min/max); with these instructions the fastest implementation will be selected automatically.
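
The use case rests on the observation that the existing variants only disagree on NaN and signed-zero inputs. A small sketch of that claim, using scalar models of Wasm `pmin` and `min` (the helper names here are hypothetical, not part of any API):

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Scalar model of Wasm pmin: b < a ? b : a. */
static float wasm_pmin(float a, float b) {
    return b < a ? b : a;
}

/* Scalar model of Wasm min: propagates NaN, treats -0.0 < +0.0. */
static float wasm_min(float a, float b) {
    if (isnan(a) || isnan(b)) return NAN;
    if (a == b) return signbit(a) ? a : b;
    return a < b ? a : b;
}

/* Returns true if both models give the same result (including the
   sign of zero) for every pair of inputs. */
static bool min_variants_agree(const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        float p = wasm_pmin(a[i], b[i]);
        float m = wasm_min(a[i], b[i]);
        if (!isnan(p) != !isnan(m)) return false;
        if (!isnan(p) && (p != m || !signbit(p) != !signbit(m))) return false;
    }
    return true;
}
```

For data without NaNs or mixed-sign zeros the two models agree, so an engine is free to pick whichever lowering is cheapest on the target.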

@Maratyszcza
Collaborator

There is another case where the underlying instructions differ: when one input is +0.0 and the other is -0.0.

@nemequ
Author

nemequ commented Aug 5, 2021

Thanks for pointing that out, I hadn't considered it ☹. I updated the description to make that implementation-defined as well.

ngzhian added a commit to ngzhian/relaxed-simd that referenced this issue Aug 16, 2021
ngzhian added a commit that referenced this issue Sep 7, 2021
@ngzhian
Member

ngzhian commented Nov 1, 2021

For the RISC-V V extension, vector float min/max follows the scalar semantics. I copied the relevant paragraph:

Floating-point minimum-number and maximum-number instructions FMIN.S and FMAX.S write,
respectively, the smaller or larger of rs1 and rs2 to rd. For the purposes of these instructions only,
the value −0.0 is considered to be less than the value +0.0. If both inputs are NaNs, the result is
the canonical NaN. If only one operand is a NaN, the result is the non-NaN operand. Signaling
NaN inputs set the invalid operation exception flag, even when the result is not NaN.

Note that this is different from x86/arm for relaxed_min(1.0f, NaN):

  • x86 and ARM: NaN
  • RISC-V: 1.0f
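
The quoted RISC-V paragraph can be modeled per lane roughly as follows. This is a sketch: the function names are hypothetical, `NAN` stands in for the canonical NaN, and the invalid-operation flag for signaling NaNs is not modeled.

```c
#include <math.h>

/* Scalar model of RISC-V FMIN.S as described in the quoted text. */
static float riscv_fmin_s(float rs1, float rs2) {
    if (isnan(rs1) && isnan(rs2)) return NAN;  /* both NaN: canonical NaN */
    if (isnan(rs1)) return rs2;                /* one NaN: the non-NaN operand */
    if (isnan(rs2)) return rs1;
    if (rs1 == rs2) return signbit(rs1) ? rs1 : rs2;  /* -0.0 < +0.0 */
    return rs1 < rs2 ? rs1 : rs2;
}

/* Scalar model of RISC-V FMAX.S, mirroring the rules above. */
static float riscv_fmax_s(float rs1, float rs2) {
    if (isnan(rs1) && isnan(rs2)) return NAN;
    if (isnan(rs1)) return rs2;
    if (isnan(rs2)) return rs1;
    if (rs1 == rs2) return signbit(rs1) ? rs2 : rs1;  /* +0.0 > -0.0 */
    return rs1 > rs2 ? rs1 : rs2;
}
```

Under this model `riscv_fmin_s(1.0f, NAN)` returns 1.0f regardless of argument order, matching the RISC-V entry in the list above.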

@ngzhian
Member

ngzhian commented Nov 1, 2021

Looking at Power ISA V2.07B, it has vminfp and vmaxfp, where:

  • max of +0 and -0 is +0
  • min of +0 and -0 is -0.
  • max of x and NaN is QNaN
  • min of x and NaN is QNaN

@ngzhian ngzhian added the in-overview Instruction has been added to Overview.md label Feb 18, 2022
kangwoosukeq pushed a commit to prosyslab/v8 that referenced this issue Apr 28, 2022
Prototype F32x4Relaxed(Min/Max) and F64x2Relaxed(Min/Max)
operations for ARM. F32x4 variants map directly to vmin/vmax
hardware instructions which are also used for F32x4(Min/Max)
operations. The F64x2 variants are mapped in this implementation
to Pmin/Pmax instructions as detailed in the github issue.
WebAssembly/relaxed-simd#33

Bug: v8:12284
Change-Id: I5ea939385fa0ae97bbdf776fc0b763cabb1b293c
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/3501347
Reviewed-by: Adam Klein <adamk@chromium.org>
Commit-Queue: Deepti Gandluri <gdeepti@chromium.org>
Cr-Commit-Position: refs/heads/main@{#79355}