Autovectorization support #84

Open
penzn opened this issue Aug 9, 2022 · 6 comments
Comments

penzn (Contributor) commented Aug 9, 2022

We discussed this in the latest SIMD meeting and wanted to get @tlively's perspective.

Compilers have the -ffast-math flag, which at least partially fits the bill for supporting relaxed SIMD operations.

  • Fast math allows some variability in results, but would admitting both Arm and x86 outputs be too much variability? Maybe we can limit which instructions we support.
  • Or would it be necessary to generate platform detection code to reduce variability?
tlively (Member) commented Aug 10, 2022

I don't think there's any need to restrict the -ffast-math optimizations we provide for users who opt-in to using that flag. In other words, we should apply the same aggressive optimizations available to other targets. If a user is not getting the results they need, then their program was not well-specified enough and the fix should be in the user's code, not in the compiler.

sunfishcode (Member) commented

There is a fundamental difference between -ffast-math on native targets and -ffast-math on wasm. On native targets, one compiles with -ffast-math and can then test the output, and trust that it'll continue to behave as tested, because all the nondeterminism related to floating-point has been resolved. On wasm, developers may test their code on their local machine with a particular wasm engine, and it may work for them, while their users may have machines with different architectures and different wasm engines, where it may not work.

So it is worth considering restricting the -ffast-math optimizations.

penzn (Contributor, author) commented Aug 12, 2022

Sure, fast math would change the output even with MVP, but the change would at least be portable. Here we are moving away from that, and I want to understand the consequences, especially regarding what the producer can and cannot do.

There are broadly two types of instructions in this proposal:

  1. Most introduce non-determinism w.r.t. out-of-range values (what happens with out-of-bounds lane indices and the like)
  2. Instructions that actually affect precision, namely qfma, dot, and bfloat ops if added

fmin/fmax might be counted in the second category because their outputs differ drastically enough, thanks to the opposite approaches to NaN handling in the two major architectures.

Technically the first category doesn't affect precision, and in theory fast-math transformations should have the same effect as in core Wasm, as long as they are not dependent on out-of-range semantics.

For the second group it has been proposed to use platform detection as a mitigation, though I am not sure whether that would get us all the way back to stable results. Thoughts?

tlively (Member) commented Aug 12, 2022

There is a fundamental difference between -ffast-math on native targets and -ffast-math on wasm. On native targets, one compiles with -ffast-math and can then test the output, and trust that it'll continue to behave as tested, because all the nondeterminism related to floating-point has been resolved.

This is fundamentally different from native targets in the same way relaxed-simd itself is fundamentally different from native architectures, though. I would think that by opting into relaxed-simd, the user would be opting into accepting an additional testing burden, since the nondeterminism has explicitly not been resolved. I can see that this would be inconvenient for the user, but they can always choose not to use relaxed-simd or not to use -ffast-math, so I don't see that it's worth doing anything special in the tools here.

@penzn, I don't quite understand the distinction you're trying to draw between those two groups of instructions. Either way, the compiler should be free to perform instruction selection in any way that matches the specified semantics (no matter how loose or strict they might be) of the input program. Baking platform detection into compilers seems complicated and undesirable.

sunfishcode (Member) commented

This is fundamentally different from native targets in the same way relaxed-simd itself is fundamentally different from native architectures, though. I would think that by opting into relaxed-simd, the user would be opting into accepting additional testing burden since the nondeterminism has explicitly not been resolved. I can see that this would be inconvenient for the user, but they can always choose not to use relaxed-simd or not use -ffast-math, so I don't see that's it's worth doing anything special in the tools here.

Do you envision relaxed-simd will be automatically enabled by -ffast-math, or will it always remain a separate opt-in?

tlively (Member) commented Aug 15, 2022

I expect it would be a separate opt-in via -mrelaxed-simd to explicitly enable the target feature.
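For context (a sketch, not from the thread): the two opt-ins compose as independent clang flags when targeting wasm; the filename is hypothetical.

```shell
# -msimd128 / -mrelaxed-simd enable the SIMD target features;
# -ffast-math separately licenses value-changing FP optimizations.
# kernel.c is a placeholder source file for illustration.
clang --target=wasm32 -O2 -msimd128 -mrelaxed-simd -ffast-math \
      -c kernel.c -o kernel.o
```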
