-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revise overload resolution for splats/truncations #7114
base: main
Are you sure you want to change the base?
Conversation
Alternate, but inferior approach can be viewed here #7115. I include it here because it might seem a simpler approach initially, but proved to be functional, but brittle. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Allow truncations when matching arguments for intrinsic overloads. This eliminates the need for explicit scalar extractions from vectors for arguments that are scalar by nature. This encompasses any vectors passed for scalars, allowing the truncation, but emitting a warning the same as is done for other assignments of vectors to scalars. This maintains splats as the preferred transformations and promotes perfect matches to be preferred over that. This has the effect of removing the need to carefully order intrinsics to ensure that the right variant gets matched first before another one incorrectly takes its place with a faulty cast. Allowing truncations causes a problems with a small subset of intrinsics that have explicit overloads for various matrix,vector, scalar combinations. Namely the mul overloads. These could be simplified to accept a new range of template types except the dimensions need to be matched in unconventional ways. For these, the notion of uncastable or "ONLY" variants of the template/layout types are introduced. These are indicated with a trailing "!" after the parameter typename in gen_intrin_main, which directs them to an array that contains a NOCAST enum that, when encountered, will skip the attempts to splat or truncate. Fixes microsoft#7079
b8a502e
to
819545b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review guidance comments.
LITEMPLATE_ARRAY = 6, // Scalar array. | ||
LITEMPLATE_SCALAR_ONLY = 7, // Uncastable scalar types. | ||
LITEMPLATE_VECTOR_ONLY = 8, // Uncastable vector types (eg. float3). | ||
LITEMPLATE_MATRIX_ONLY = 9, // Uncastable matrix types (eg. float3x3). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Castability here refers to the ability to truncate or splat something to change it's "shape" (or template or layout, we use a few terms) and not the castability of the contents.
} | ||
} | ||
// offset 0 | ||
if (opcode == OP::OpCode::TextureLoad) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was some simplification that came naturally with the removal of the extract operator here. The opcode is determined completely by the resource kind so a test for one is the same as a test for the other.
} else { | ||
storeArgs.emplace_back(offset); // offset | ||
} | ||
storeArgs.emplace_back(offset); // offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of several explicit "casts" in the form of a vector to scalar truncation that are now done in the default code paths instead of relying on special cases.
g_NullTT, g_ScalarTT, g_VectorTT, g_MatrixTT, | ||
g_AnyTT, g_ObjectTT, g_ArrayTT, | ||
g_NullTT, g_ScalarTT, g_VectorTT, g_MatrixTT, g_AnyTT, | ||
g_ObjectTT, g_ArrayTT, g_ScalarOnlyTT, g_VectorOnlyTT, g_MatrixOnlyTT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Automatically generated code for intrinsics will assign indexes into this array that will use the Only
variants that have a NOCAST
entry which will stop iterating through possibilities in a way that makes it clear that it failed due to requiring a cast where none is allowed.
@@ -6113,7 +6123,7 @@ bool HLSLExternalSource::MatchArguments( | |||
ArBasicKind | |||
ComponentType[MaxIntrinsicArgs]; // Component type for each argument, | |||
// AR_BASIC_UNKNOWN if unspecified. | |||
UINT uSpecialSize[IA_SPECIAL_SLOTS]; // row/col matching types, UNUSED_INDEX32 | |||
UINT uSpecialSize[IA_SPECIAL_SLOTS]; // row/col matching types, UnusedSize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incidental, just an incorrect and misleading comment.
r += status; | ||
uav1.Load(a, status); // expected-error {{no matching member function for call to 'Load'}} expected-note {{requires single argument 'byteOffset', but 2 arguments were provided}} fxc-error {{X3013: RWByteAddressBuffer<uint>.Load(uint)}} fxc-error {{X3013: RWByteAddressBuffer<uint>.Load(uint, out uint status)}} fxc-error {{X3013: 'Load': no matching 2 parameter intrinsic method}} fxc-error {{X3013: Possible intrinsic methods are:}} | ||
uav1.Load(a, status); // expected-warning {{implicit truncation of vector type}} fxc-error {{X3013: RWByteAddressBuffer<uint>.Load(uint)}} fxc-error {{X3013: RWByteAddressBuffer<uint>.Load(uint, out uint status)}} fxc-error {{X3013: 'Load': no matching 2 parameter intrinsic method}} fxc-error {{X3013: Possible intrinsic methods are:}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, these are no longer failures and they give the same truncation warnings you'd see elsewhere when assingning a vector to a scalar.
numeric<c2> [[rn,unsigned_op=umul]] mul(in $match<1, 0> numeric<c>! a, in col_major $match<2, 0> numeric<c, c2>! b) : mul_vm; | ||
numeric<r, c> [[rn,unsigned_op=umul]] mul(in $match<1, 0> numeric<r, c>! a, in $match<2, 0> numeric! b) : mul_ms; | ||
numeric<r> [[rn,unsigned_op=umul]] mul(in row_major $match<1, 0> numeric<r, c>! a, in $match<2, 0> numeric<c>! b) : mul_mv; | ||
numeric<r, c2> [[rn,unsigned_op=umul]] mul(in row_major $match<1, 0> numeric<r, c>! a, in col_major $match<2, 0> numeric<c, c2>! b) : mul_mm; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the cases where uncastable "only" types were required since they previously relied on the order of the intrinsics to get correct behavior and allowing truncation in addition to splatting made no order possible that would preserve keeping the correct calls for the correct inputs. I chose !
to represent uncastability, but a number of options are possible.
$classT [[ro]] Load(in int<1> x) : buffer_load; | ||
$classT [[]] Load(in int<1> x, out uint_only status) : buffer_load_s; | ||
$classT [[ro]] Load(in int x) : buffer_load; | ||
$classT [[]] Load(in int x, out uint_only status) : buffer_load_s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sole case where the load index was represented as a vec1, which contradicts our docs and is incosistent with its RW counterpart.
type_any_re = re.compile(r"(\S+)<>$") | ||
type_array_re = re.compile(r"(\S+)\[\]$") | ||
type_object_re = re.compile( | ||
r"""( | ||
sampler\w* | string | | ||
sampler\w* | any_sampler\w* | string | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really related. The only change required to leave a large conditional block below as completely dead. The nice thing about changes here is that comparing the output source can verify that the results are identical as I did here.
): | ||
template_list = "LITEMPLATE_OBJECT" | ||
else: | ||
template_list = "LITEMPLATE_SCALAR" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the small change above to add any_sampler
to the initial regex, all this explicit matching is redundant.
Allow truncations when matching arguments for intrinsic overloads. This eliminates the need for explicit scalar extractions from vectors for arguments that are scalar by nature. This encompasses any vectors passed for scalars, allowing the truncation, but emitting a warning the same as is done for other assignments of vectors to scalars.
This maintains splats as the preferred transformations and promotes perfect matches to be preferred over that. This has the effect of removing the need to carefully order intrinsics to ensure that the right variant gets matched first before another one incorrectly takes its place with a faulty cast.
Allowing truncations causes a problems with a small subset of intrinsics that have explicit overloads for various matrix,vector, scalar combinations. Namely the mul overloads. These could be simplified to accept a new range of template types except the dimensions need to be matched in unconventional ways.
For these, the notion of uncastable or "ONLY" variants of the template/layout types are introduced. These are indicated with a trailing "!" after the parameter typename in gen_intrin_main, which directs them to an array that contains a NOCAST enum that, when encountered, will skip the attempts to splat or truncate.
Fixes #7079