[mlir][llvm] Return failure from type converter for n-D scalable vectors #65450

c-rhodes · 2023-09-06T07:50:38Z

This patch changes vector type conversion to return failure on n-D scalable vector types instead of asserting.

This is an alternative approach to #65261 that aims to enable lowering of Vector ops directly to ArmSME intrinsics where possible, and seems more consistent with other type conversions. It's trivial to hit the assert at the moment and it could be interpreted as n-D scalable vector types being a bug, when they're valid types in the Vector dialect.

By returning failure it will generally fail more gracefully, particularly for release builds or other builds where assertions are disabled.
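To illustrate the difference between asserting and returning failure, here is a minimal standalone sketch. It does not use MLIR: `VecType`, `convertVectorType`, the fixed `f32` element type and the printed type strings are all hypothetical stand-ins, and `std::optional` plays the role of a failure-capable result. The early return for 0-D vectors is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a vector type: a shape plus one "scalable" flag
// per dimension. This is not MLIR's VectorType.
struct VecType {
  std::vector<int64_t> shape;
  std::vector<bool> scalableDims;

  int64_t rank() const { return static_cast<int64_t>(shape.size()); }
  bool isScalable() const {
    for (bool s : scalableDims)
      if (s)
        return true;
    return false;
  }
};

// Returns a printable stand-in for the converted type, or std::nullopt on
// failure. The element type is fixed to f32 to keep the sketch short.
std::optional<std::string> convertVectorType(const VecType &type) {
  // Malformed input (flag count differs from rank) is treated as failure.
  if (type.scalableDims.size() != type.shape.size())
    return std::nullopt;

  // Assumed for this sketch: 0-D vectors are handled first and return early.
  if (type.shape.empty())
    return std::string("<1 x f32>");

  // Old behaviour: assert(!type.isScalable() || type.rank() == 1), which
  // aborts in assert-enabled builds and is compiled out otherwise.
  // New behaviour: report failure and let the caller decide what to do.
  if (type.isScalable() && type.rank() > 1)
    return std::nullopt;

  // Innermost dimension becomes the vector; outer dimensions wrap it in arrays.
  std::string prefix = type.scalableDims.back() ? "<vscale x " : "<";
  std::string result = prefix + std::to_string(type.shape.back()) + " x f32>";
  for (int64_t i = type.rank() - 2; i >= 0; --i)
    result = "[" + std::to_string(type.shape[i]) + " x " + result + "]";
  return result;
}
```

A caller can then test the result and fall back to another lowering when conversion fails, which an assert does not allow in builds where assertions are compiled out.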

banach-space · 2023-09-06T09:25:15Z

Thank you, this is far less intrusive and feels much more canonical than #65261 - this would be my preference :)

This should be sufficient for two types of lowerings from Vector to ArmSME:

1. Vector -> ArmSME Ops -> LLVM SME intrinsics (with custom ops),
2. Vector -> LLVM SME intrinsics (without custom ops).

ATM, we only use 1. I am guessing that you'd like to use 2. for vector.outerproduct to lower to SME's MOPA instructions, e.g. FMOPA? I am still wondering whether mixing 1. and 2. is the right approach.

But that's something we can discuss in a different patch; this change makes sense regardless.

LGTM, but please wait for more reviews before landing this :)

c-rhodes · 2023-09-06T12:49:19Z

> ATM, we only use 1. I am guessing that you'd like to use 2. for vector.outerproduct to lower to SME's MOPA instructions, e.g. FMOPA? I am still wondering whether mixing 1. and 2. is the right approach.
>
> But that's something we can discuss in a different patch; this change makes sense regardless.

That's correct, I want to lower vector.outerproduct directly to intrinsics, at least initially, as this seems to work. This would also open the same route for other ops where possible.

> LGTM, but please wait for more reviews before landing this :)

Thanks for reviewing!

This patch adds support for lowering vector.outerproduct to the ArmSME MOPA intrinsic for the following types:

vector<[8]xf16>, vector<[8]xf16> -> vector<[8]x[8]xf16>
vector<[8]xbf16>, vector<[8]xbf16> -> vector<[8]x[8]xbf16>
vector<[4]xf32>, vector<[4]xf32> -> vector<[4]x[4]xf32>
vector<[2]xf64>, vector<[2]xf64> -> vector<[2]x[2]xf64>

The FP variants are lowered to FMOPA (non-widening) [1] and BFloat to BFMOPA (non-widening) [2].

Note that at the ISA level these variants are implemented by different architecture features, listed below:

FMOPA (non-widening)
* half-precision - +sme2p1,+sme-f16f16
* single-precision - +sme
* double-precision - +sme-f64f64

BFMOPA (non-widening)
* half-precision - +sme2p1,+b16b16

There's currently no way to target different features when lowering to ArmSME.

Integration tests are added for F32 and F64. We use QEMU to run the integration tests, but SME2 support isn't available yet (it's targeted for 9.0), so integration tests for the variants that need it are excluded.

Masking is currently unsupported.

Depends on llvm#65450.

[1] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/FMOPA--non-widening---Floating-point-outer-product-and-accumulate-
[2] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/BFMOPA--non-widening---BFloat16-floating-point-outer-product-and-accumulate-
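The supported types follow a visible pattern: the scalable dimension equals 128 divided by the element width in bits, and the result is a square 2-D scalable vector of that size. The sketch below reproduces the list from that observation. The 128-bit constant and the helper names (`lanesFor`, `operandType`, `resultType`) are assumptions made for illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <string>

// Assumption: the scalable dimension is 128 bits divided by the element width.
constexpr int kGranuleBits = 128;

constexpr int lanesFor(int elementBits) { return kGranuleBits / elementBits; }

// Builds the 1-D operand type, e.g. "vector<[4]xf32>".
std::string operandType(const std::string &elementName, int elementBits) {
  return "vector<[" + std::to_string(lanesFor(elementBits)) + "]x" +
         elementName + ">";
}

// Builds the 2-D result type, e.g. "vector<[4]x[4]xf32>".
std::string resultType(const std::string &elementName, int elementBits) {
  const std::string lanes = "[" + std::to_string(lanesFor(elementBits)) + "]";
  return "vector<" + lanes + "x" + lanes + "x" + elementName + ">";
}
```

For example, `operandType("f64", 64)` and `resultType("f64", 64)` give the last row of the list above.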

dcaballe

LGTM, thanks! Yeah, much better!

dcaballe · 2023-09-11T05:54:23Z

mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp

-  assert(
-      (!type.isScalable() || (type.getRank() == 1)) &&
-      "expected 1-D scalable vector (n-D scalable vectors are not supported)");
+  if (type.isScalable() && (type.getRank() > 1))


Do we handle the 0-D case gracefully here?

For 0-D it won't reach here, as the function will have already returned.

c-rhodes · 2023-09-11T08:32:02Z

Thanks for reviewing!


c-rhodes requested review from banach-space and MacDue September 6, 2023 07:50

c-rhodes requested a review from a team as a code owner September 6, 2023 07:50

github-actions bot added the mlir label Sep 6, 2023

c-rhodes mentioned this pull request Sep 6, 2023

[mlir][ArmSME] Use ArmSMETypeConverter for all VectorToLLVM patterns #65261

Closed

banach-space requested a review from dcaballe September 6, 2023 09:26

banach-space added mlir:llvm mlir:vectorops labels Sep 6, 2023

banach-space approved these changes Sep 6, 2023

c-rhodes mentioned this pull request Sep 7, 2023

[mlir][ArmSME] Lower vector.outerproduct to FMOPA/BFMOPA #65621

Merged

dcaballe approved these changes Sep 11, 2023

c-rhodes merged commit 38eb55a into llvm:main Sep 11, 2023

michaelrj-google mentioned this pull request Sep 12, 2023

[libc] Move long double table option to new config #66151

Merged

vzakhari mentioned this pull request Sep 12, 2023

internap proc trampolines #66156

Closed
