Revise overload resolution for splats/truncations #7114

pow2clk · 2025-02-03T18:50:00Z

Allow truncations when matching arguments for intrinsic overloads. This eliminates the need for explicit scalar extractions from vectors for arguments that are scalar by nature. It applies to any vector passed where a scalar is expected: the truncation is allowed, but it emits the same warning as any other assignment of a vector to a scalar.

Splats remain the preferred transformation, and perfect matches are now preferred over splats. This removes the need to carefully order intrinsics so that the right variant is matched before another one incorrectly takes its place with a faulty cast.
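A minimal sketch of the preference order described above, not the actual MatchArguments code. It assumes each type is modeled only by its component count, that an overload's cost is its worst per-argument conversion, and that the names `Conversion`, `classify` and `pick_overload` are hypothetical.

```python
from enum import IntEnum


class Conversion(IntEnum):
    """Hypothetical ranking of conversions; lower is preferred."""
    EXACT = 0     # shapes already agree
    SPLAT = 1     # scalar argument replicated into a vector parameter
    TRUNCATE = 2  # vector argument cut down to a scalar parameter (warns)


def classify(arg_len, param_len):
    """Return the conversion needed for one argument, or None if impossible."""
    if arg_len == param_len:
        return Conversion.EXACT
    if arg_len == 1:
        return Conversion.SPLAT
    if param_len == 1:
        return Conversion.TRUNCATE
    return None


def pick_overload(arg_lens, overloads):
    """Pick the overload with the cheapest worst-case conversion.

    overloads is a list of (name, parameter component counts).
    Returns (name, cost), or None when nothing matches.
    """
    best = None
    for name, param_lens in overloads:
        if len(param_lens) != len(arg_lens):
            continue
        convs = [classify(a, p) for a, p in zip(arg_lens, param_lens)]
        if any(c is None for c in convs):
            continue
        cost = max(convs, default=Conversion.EXACT)
        if best is None or cost < best[1]:
            best = (name, cost)
    return best
```

Because the result depends on the cost rather than on position, reordering the candidate list does not change which overload wins. A `TRUNCATE` cost is where a compiler following this model would emit the truncation warning.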

Allowing truncations causes problems for a small subset of intrinsics that have explicit overloads for various matrix, vector, and scalar combinations, namely the mul overloads. These could be simplified to accept a new range of template types, except that the dimensions need to be matched in unconventional ways.

For these, the notion of uncastable or "ONLY" variants of the template/layout types is introduced. These are indicated with a trailing "!" after the parameter type name in gen_intrin_main, which directs them to an array containing a NOCAST enum value that, when encountered, skips the attempts to splat or truncate.
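To illustrate the "!" notation, here is a rough sketch of how a parameter type name could be mapped to a template kind. This is not the hctdb.py logic: the regex, the `template_for` helper, and the rule of counting the names inside the angle brackets are assumptions made for illustration. Only the `LITEMPLATE_*_ONLY` names come from this PR.

```python
import re

# Hypothetical pattern: a base name, optional <dims>, optional trailing "!".
_TYPE_RE = re.compile(r"^(?P<base>\w+)(?:<(?P<dims>[^>]+)>)?(?P<only>!)?$")

_SHAPES = ("SCALAR", "VECTOR", "MATRIX")


def template_for(type_name):
    """Map e.g. 'numeric<c>!' to 'LITEMPLATE_VECTOR_ONLY' (illustrative only)."""
    m = _TYPE_RE.match(type_name)
    if m is None:
        raise ValueError(f"unsupported type name: {type_name!r}")
    dims = m.group("dims")
    # No brackets means scalar, one dimension a vector, two a matrix.
    rank = 0 if dims is None else len(dims.split(","))
    if rank >= len(_SHAPES):
        raise ValueError(f"too many dimensions in {type_name!r}")
    suffix = "_ONLY" if m.group("only") else ""
    return f"LITEMPLATE_{_SHAPES[rank]}{suffix}"
```

For example, `template_for("numeric<c, c2>!")` yields `LITEMPLATE_MATRIX_ONLY`, while the same name without the "!" yields the ordinary castable `LITEMPLATE_MATRIX`.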

Fixes #7079

pow2clk · 2025-02-03T18:51:28Z

An alternate but inferior approach can be viewed in #7115. I mention it because it might initially seem simpler, but it proved to be functional yet brittle.

github-actions · 2025-02-03T18:51:58Z

✅ With the latest revision this PR passed the C/C++ code formatter.


pow2clk

Review guidance comments.

pow2clk · 2025-02-03T19:05:46Z

include/dxc/dxcapi.internal.h

+  LITEMPLATE_ARRAY = 6,       // Scalar array.
+  LITEMPLATE_SCALAR_ONLY = 7, // Uncastable scalar types.
+  LITEMPLATE_VECTOR_ONLY = 8, // Uncastable vector types (eg. float3).
+  LITEMPLATE_MATRIX_ONLY = 9, // Uncastable matrix types (eg. float3x3).


Castability here refers to the ability to truncate or splat something to change its "shape" (or template or layout, we use a few terms) and not the castability of the contents.

pow2clk · 2025-02-03T19:06:45Z

lib/HLSL/HLOperationLower.cpp

-    }
-  }
-  // offset 0
-  if (opcode == OP::OpCode::TextureLoad) {


This was some simplification that came naturally with the removal of the extract operator here. The opcode is determined completely by the resource kind so a test for one is the same as a test for the other.

pow2clk · 2025-02-03T19:08:00Z

lib/HLSL/HLOperationLower.cpp

-    } else {
-      storeArgs.emplace_back(offset); // offset
-    }
+    storeArgs.emplace_back(offset); // offset


One of several explicit "casts" in the form of a vector to scalar truncation that are now done in the default code paths instead of relying on special cases.

pow2clk · 2025-02-03T19:09:46Z

tools/clang/lib/Sema/SemaHLSL.cpp

-    g_NullTT, g_ScalarTT, g_VectorTT, g_MatrixTT,
-    g_AnyTT,  g_ObjectTT, g_ArrayTT,
+    g_NullTT,   g_ScalarTT, g_VectorTT,     g_MatrixTT,     g_AnyTT,
+    g_ObjectTT, g_ArrayTT,  g_ScalarOnlyTT, g_VectorOnlyTT, g_MatrixOnlyTT,


Automatically generated code for intrinsics assigns indexes into this array. Indexes that select the Only variants lead to a NOCAST entry, which stops the iteration through possibilities in a way that makes it clear the match failed because a cast was required where none is allowed.

pow2clk · 2025-02-03T19:10:03Z

tools/clang/lib/Sema/SemaHLSL.cpp

@@ -6113,7 +6123,7 @@ bool HLSLExternalSource::MatchArguments(
  ArBasicKind
      ComponentType[MaxIntrinsicArgs]; // Component type for each argument,
                                       // AR_BASIC_UNKNOWN if unspecified.
-  UINT uSpecialSize[IA_SPECIAL_SLOTS]; // row/col matching types, UNUSED_INDEX32
+  UINT uSpecialSize[IA_SPECIAL_SLOTS]; // row/col matching types, UnusedSize


Incidental, just an incorrect and misleading comment.

pow2clk · 2025-02-03T19:31:56Z

tools/clang/test/SemaHLSL/intrinsic-examples.hlsl

  r += status;
-  uav1.Load(a, status); // expected-error {{no matching member function for call to 'Load'}} expected-note {{requires single argument 'byteOffset', but 2 arguments were provided}} fxc-error {{X3013:     RWByteAddressBuffer<uint>.Load(uint)}} fxc-error {{X3013:     RWByteAddressBuffer<uint>.Load(uint, out uint status)}} fxc-error {{X3013: 'Load': no matching 2 parameter intrinsic method}} fxc-error {{X3013: Possible intrinsic methods are:}}
+  uav1.Load(a, status); // expected-warning {{implicit truncation of vector type}} fxc-error {{X3013:     RWByteAddressBuffer<uint>.Load(uint)}} fxc-error {{X3013:     RWByteAddressBuffer<uint>.Load(uint, out uint status)}} fxc-error {{X3013: 'Load': no matching 2 parameter intrinsic method}} fxc-error {{X3013: Possible intrinsic methods are:}}


As above, these are no longer failures; they give the same truncation warnings you'd see elsewhere when assigning a vector to a scalar.

pow2clk · 2025-02-03T19:33:21Z

utils/hct/gen_intrin_main.txt

+numeric<c2> [[rn,unsigned_op=umul]] mul(in $match<1, 0> numeric<c>! a, in col_major $match<2, 0> numeric<c, c2>! b) : mul_vm;
+numeric<r, c> [[rn,unsigned_op=umul]] mul(in $match<1, 0> numeric<r, c>! a, in $match<2, 0> numeric! b) : mul_ms;
+numeric<r> [[rn,unsigned_op=umul]] mul(in row_major $match<1, 0> numeric<r, c>! a, in $match<2, 0> numeric<c>! b) : mul_mv;
+numeric<r, c2> [[rn,unsigned_op=umul]] mul(in row_major $match<1, 0> numeric<r, c>! a, in col_major $match<2, 0> numeric<c, c2>! b) : mul_mm;


These are the cases where uncastable "only" types were required. They previously relied on the order of the intrinsics to get correct behavior, and allowing truncation in addition to splatting left no possible order that would keep the correct calls for the correct inputs. I chose ! to represent uncastability, but a number of options are possible.
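The ordering problem can be reproduced with a small model. The sketch below assumes a simplified first-match resolver that tries overloads in declaration order, models each type only as "scalar", "vector" or "matrix", and ignores the dimension matching done by `$match`. The helper names are hypothetical. Only the four mul variants shown in the hunk above are modeled.

```python
# The four mul variants from the hunk above; "!" marks an uncastable parameter.
MUL_OVERLOADS = [
    ("mul_vm", ("vector!", "matrix!")),
    ("mul_ms", ("matrix!", "scalar!")),
    ("mul_mv", ("matrix!", "vector!")),
    ("mul_mm", ("matrix!", "matrix!")),
]


def param_accepts(param, arg_shape):
    """True if an argument of arg_shape may bind to param."""
    shape = param.rstrip("!")
    if shape == arg_shape:
        return True   # perfect match
    if param.endswith("!"):
        return False  # uncastable: no splat, no truncation
    # Castable: a scalar argument splats, a scalar parameter truncates.
    return arg_shape == "scalar" or shape == "scalar"


def first_match(overloads, arg_shapes):
    """Return the first overload whose parameters all accept the arguments."""
    for name, params in overloads:
        if len(params) == len(arg_shapes) and all(
            param_accepts(p, a) for p, a in zip(params, arg_shapes)
        ):
            return name
    return None


def without_only(overloads):
    """Same overloads with the '!' markers removed, i.e. fully castable."""
    return [(n, tuple(p.rstrip("!") for p in ps)) for n, ps in overloads]
```

In this model, with the "!" markers removed, `mul(matrix, vector)` is captured by `mul_ms` through truncation when `mul_ms` comes first, and `mul(matrix, scalar)` is captured by `mul_mv` through a splat when `mul_mv` comes first, so neither order is correct. With the markers in place each call resolves only to its exact variant.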

pow2clk · 2025-02-03T19:34:01Z

utils/hct/gen_intrin_main.txt

-$classT [[ro]] Load(in int<1> x) : buffer_load;
-$classT [[]] Load(in int<1> x, out uint_only status) : buffer_load_s;
+$classT [[ro]] Load(in int x) : buffer_load;
+$classT [[]] Load(in int x, out uint_only status) : buffer_load_s;


This was the sole case where the load index was represented as a vec1, which contradicts our docs and is inconsistent with its RW counterpart.

pow2clk · 2025-02-03T19:35:22Z

utils/hct/hctdb.py

        type_any_re = re.compile(r"(\S+)<>$")
        type_array_re = re.compile(r"(\S+)\[\]$")
        type_object_re = re.compile(
            r"""(
-            sampler\w* | string |
+            sampler\w* | any_sampler\w* | string |


Not really related. This is the only change required to leave a large conditional block below completely dead. The nice thing about changes here is that comparing the generated output source can verify that the results are identical, as I did here.
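The effect of the added alternative can be checked in isolation. The sketch below uses a reduced pattern containing only the alternatives visible in this hunk; the rest of the real expression is omitted. It assumes the pattern is compiled with `re.VERBOSE` and applied with `match`. The names `OLD`, `NEW` and `is_object_type` are hypothetical.

```python
import re

# Reduced versions of the pattern: only the alternatives visible in the hunk.
OLD = re.compile(
    r"""(
    sampler\w* | string
    )$""",
    flags=re.VERBOSE,
)
NEW = re.compile(
    r"""(
    sampler\w* | any_sampler\w* | string
    )$""",
    flags=re.VERBOSE,
)


def is_object_type(type_name, pattern):
    """True if the name is classified as an object type by the given pattern."""
    # match() anchors at the start, so 'any_sampler' is not found via 'sampler'.
    return pattern.match(type_name) is not None
```

With `match`, the old pattern rejects `any_sampler` because the name does not start with `sampler`, while the new pattern accepts it directly. Names such as `sampler` and `string` are classified the same way by both.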

pow2clk · 2025-02-03T19:35:51Z

utils/hct/hctdb.py

-                        ):
-                            template_list = "LITEMPLATE_OBJECT"
-                        else:
-                            template_list = "LITEMPLATE_SCALAR"


With the small change above to add any_sampler to the initial regex, all this explicit matching is redundant.

add load/str tests

c65a179

pow2clk requested a review from a team as a code owner February 3, 2025 18:50

pow2clk force-pushed the trunc_resolution branch from b8a502e to 819545b Compare February 3, 2025 18:56

pow2clk mentioned this pull request Feb 3, 2025

Handle coords in load helper #7115

Draft
