Adding support to mono for v128 constants #81902

tannergooding · 2023-02-09T17:10:53Z

This resolves #81482

Notably, it adds general support for the feature but doesn't yet make use of it "everywhere" that it could. It doesn't add verbose constant folding support, nor does it recognize the more general cases of _Create, _CreateScalar, and _CreateScalarUnsafe yet. Those should be handled in a separate PR.

dotnet-issue-labeler · 2023-02-09T17:10:59Z

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

tannergooding · 2023-02-10T15:50:15Z

src/mono/mono/mini/aot-compiler.c

+	case MONO_PATCH_INFO_X128:
+	case MONO_PATCH_INFO_X128_GOT:
+		encode_value (((guint32 *)patch_info->data.target) [(MINI_LS_WORD_IDX * 2) + MINI_LS_WORD_IDX], p, &p);
+		encode_value (((guint32 *)patch_info->data.target) [(MINI_LS_WORD_IDX * 2) + MINI_MS_WORD_IDX], p, &p);
+		encode_value (((guint32 *)patch_info->data.target) [(MINI_MS_WORD_IDX * 2) + MINI_LS_WORD_IDX], p, &p);
+		encode_value (((guint32 *)patch_info->data.target) [(MINI_MS_WORD_IDX * 2) + MINI_MS_WORD_IDX], p, &p);


AFAIK, there isn't any SIMD support for BE architectures today.

It's possible this needs to track the underlying element type and splat each T in memory correctly instead.
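To make the index arithmetic in the hunk above easier to follow, here is a minimal sketch of the order in which the four 32-bit words get encoded. It assumes MINI_LS_WORD_IDX/MINI_MS_WORD_IDX are 0/1 on little-endian and 1/0 on big-endian; the `word_idx` struct and `x128_word_order` helper are hypothetical and not part of the PR.

```c
#include <assert.h>

/* Hypothetical stand-in for the MINI_LS_WORD_IDX / MINI_MS_WORD_IDX pair. */
typedef struct {
    int ls; /* index of the least significant word in a pair */
    int ms; /* index of the most significant word in a pair */
} word_idx;

/* Fill out[] with the guint32 indices in the order the hunk encodes them. */
static void x128_word_order(word_idx w, int out[4])
{
    /* The two indices must be 0 and 1 in some order. */
    assert((w.ls == 0 && w.ms == 1) || (w.ls == 1 && w.ms == 0));

    out[0] = (w.ls * 2) + w.ls; /* low 64-bit half, low word   */
    out[1] = (w.ls * 2) + w.ms; /* low 64-bit half, high word  */
    out[2] = (w.ms * 2) + w.ls; /* high 64-bit half, low word  */
    out[3] = (w.ms * 2) + w.ms; /* high 64-bit half, high word */
}
```

Under those assumptions the words are visited as 0, 1, 2, 3 on little-endian and 3, 2, 1, 0 on big-endian, i.e. the value is treated as one 128-bit integer rather than as a sequence of T elements, which is why the element type may matter on BE.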

tannergooding · 2023-02-10T15:52:22Z

src/mono/mono/mini/mini-amd64.c

+		else if (patch_info->type == MONO_PATCH_INFO_X128)
+			code_size += 16 + 15; /* sizeof (Vector128<T>) + alignment */
+		else if (patch_info->type == MONO_PATCH_INFO_R8)
 			code_size += 8 + 15; /* sizeof (double) + alignment */
-		if (patch_info->type == MONO_PATCH_INFO_R4)
+		else if (patch_info->type == MONO_PATCH_INFO_R4)
 			code_size += 4 + 15; /* sizeof (float) + alignment */
-		if (patch_info->type == MONO_PATCH_INFO_GC_CARD_TABLE_ADDR)
+		else if (patch_info->type == MONO_PATCH_INFO_GC_CARD_TABLE_ADDR)


As per the FIXME comment below, the alignment here only needs to be 8 for double and 4 for float, but some other logic was using R4/R8 constants for packed SIMD instructions.

That should ideally be fixed, with those uses updated to properly use X128 constants to save on space and help ensure correctness.
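For reference, the `size + 15` reservations come from worst-case padding when rounding up to a 16-byte boundary. A minimal sketch of that arithmetic follows; `align_up` and `bytes_needed` are hypothetical helpers, not Mono functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Bytes consumed when a constant of `size` bytes is emitted at the
 * next `align`-aligned address at or after `addr`. */
static uintptr_t bytes_needed(uintptr_t addr, uintptr_t size, uintptr_t align)
{
    return (align_up(addr, align) - addr) + size;
}
```

With 16-byte alignment the padding is at most 15 bytes, so a Vector128<T> needs at most 16 + 15 bytes. If doubles were only aligned to 8, the worst case would drop from 8 + 15 to 8 + 7.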

tannergooding · 2023-02-10T15:55:09Z

src/mono/mono/mini/mini-llvm.c

+			if (!strcmp(class_name, "Vector64`1") || !strcmp (class_name, "Vector128`1") || !strcmp (class_name, "Vector256`1") || !strcmp (class_name, "Vector512`1")) {
+				MonoType *element_type = mono_class_get_context (ins->klass)->class_inst->type_argv [0];
+				etype = element_type->type;
+				ecount = mono_class_value_size (ins->klass, NULL) / mono_class_value_size (mono_class_from_mono_type_internal(element_type), NULL);
+			} else if (!strcmp(class_name, "Vector4") || !strcmp(class_name, "Plane") || !strcmp(class_name, "Quaternion")) {
+				etype = MONO_TYPE_R4;
+				ecount = 4;
+			} else {
+				g_assert_not_reached ();
+			}


Just wanted to leave a more general comment that Mono currently has to do these class name and type checks to determine things.

It would be beneficial if there was a way to track the simdType and simdBaseType. In RyuJIT, we have TYP_SIMD8/12/16/32/64 and we then track the primitive base type as a separate field.

This allows better codegen, better specialization, and all-over cheaper checks. Even if this was just some helper method that cached a klass to simd info lookup, I think it would improve/simplify things.

There is a simd_class_to_llvm_type () function which can be used here.

Thanks for the reference @vargaz!

I'm not sure that really addresses the underlying issue: string comparisons are still needed to do the checks in the first place, and it doesn't help things on the MonoJIT side.

My comment was more about the general cost of going from a klass to the respective SimdType and SimdBaseType and how if Mono had a different way to track this it could be done cheaply once at import.

Subsequent handling, including simd_class_to_llvm_type could then be made cheaper.
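As a rough illustration of the "cached klass to SIMD info lookup" idea, here is a minimal sketch. Everything in it is hypothetical: `fake_klass` stands in for MonoClass, and the `elem_size`/`value_size` fields stand in for the `mono_class_value_size` calls in the hunk. It is not how Mono stores this today.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a MonoClass with room for cached SIMD info. */
typedef struct {
    const char *name; /* e.g. "Vector128`1" */
    int value_size;   /* size of the whole vector in bytes */
    int elem_size;    /* size of T for generic vectors, 0 otherwise */
    int simd_cached;  /* nonzero once ecount has been computed */
    int ecount;       /* element count, 0 if not a SIMD type */
} fake_klass;

/* Counts string comparisons so the effect of caching is observable. */
static int string_compares = 0;

static int name_is(const fake_klass *k, const char *s)
{
    string_compares++;
    return strcmp(k->name, s) == 0;
}

/* Do the name checks once, then answer later queries from the cache. */
static int simd_ecount(fake_klass *k)
{
    if (!k->simd_cached) {
        if (name_is(k, "Vector64`1") || name_is(k, "Vector128`1") ||
            name_is(k, "Vector256`1") || name_is(k, "Vector512`1")) {
            assert(k->elem_size > 0);
            k->ecount = k->value_size / k->elem_size;
        } else if (name_is(k, "Vector4") || name_is(k, "Plane") ||
                   name_is(k, "Quaternion")) {
            k->ecount = 4;
        } else {
            k->ecount = 0; /* not a recognized SIMD type */
        }
        k->simd_cached = 1;
    }
    return k->ecount;
}
```

The string comparisons only run on the first query per class; later users (the LLVM backend, MonoJIT, or a helper like simd_class_to_llvm_type) could then read the cached fields directly.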

tannergooding · 2023-02-10T15:56:57Z

src/mono/mono/mini/mini-runtime.c

+	case MONO_PATCH_INFO_X128_GOT:
+		return hash | (guint32)*(double*)ji->data.target;
 	case MONO_PATCH_INFO_R8_GOT:
 		return hash | (guint32)*(double*)ji->data.target;


This is how the hashes are currently being built for R4/R8, but they don't necessarily make "sense" to me.

This isn't a "good" hash since it just or's bits together, it doesn't take into account the "upper bits", and it relies on undefined behavior in the case the double or float is NaN/Infinity/out of range.

It's not the best, but it's just used during JITting/AOTing, so it's not a problem in practice.
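For comparison only, a hash that avoids the float-to-integer conversion and covers the upper bits could mix the raw bytes of the constant. The sketch below uses 32-bit FNV-1a; this is a swapped-in technique for illustration, not what the PR or Mono does, and `hash_bytes` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over the raw bytes of a constant (4, 8, or 16 bytes). */
static uint32_t hash_bytes(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u; /* FNV offset basis */

    assert(data != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u; /* FNV prime */
    }
    return h;
}
```

Because it reads bytes rather than converting a double to guint32, NaN, Infinity, and out-of-range values are handled like any other bit pattern, and two X128 constants that differ only in their upper lanes still hash differently.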

tannergooding · 2023-02-13T20:36:39Z

src/mono/mono/mini/mini-amd64.c

+		case MONO_PATCH_INFO_X128: {
+			guint8 *pos, *patch_pos;
+			guint32 target_pos;
+
+			/* The SSE opcodes require a 16 byte alignment */
+			code = (guint8*)ALIGN_TO (code, 16);
+
+			pos = cfg->native_code + patch_info->ip.i;
+			if (IS_REX (pos [0])) {
+				patch_pos = pos + 4;
+				target_pos = GPTRDIFF_TO_UINT32 (code - pos - 8);
+			}
+			else {
+				patch_pos = pos + 3;
+				target_pos = GPTRDIFF_TO_UINT32 (code - pos - 7);
+			}
+
+			memcpy (code, patch_info->data.target, 16);
+			code += 16;
+
+			*(guint32*)(patch_pos) = target_pos;
+
+			remove = TRUE;
+			break;
+		}


The original handling I added was the cause of the failures.

MonoJIT doesn't support the VEX encoding today (only the legacy encoding) and the R4/R8 patch point logic is effectively hardcoded to movss/movsd.

For that scenario, we have:

1-byte prefix of 0xF3 (float) or 0xF2 (double)

Optional REX prefix of 0x44 if using extended registers (XMM8-XMM15)

2-byte opcode of 0x0F 0x10

1-byte modR/M byte

4-byte RIP relative offset

So the R4/R8 logic was computing the position of the RIP relative offset so it could be patched.

For X128 we currently hardcode this to movups and so we instead have:

Optional REX prefix of 0x44 if using extended registers (XMM8-XMM15)

2-byte opcode of 0x0F 0x10

1-byte modR/M byte

4-byte RIP relative offset

Thus it's 1 byte smaller. We could emit movupd or movdqu instead, but don't want to, as each needs an extra prefix byte and is therefore 1 byte larger for no benefit.

We could emit movaps instead, but there isn't any benefit to that on modern machines (anything since ~2011) and it strictly requires that the address be aligned, which is overall less flexible.

When the VEX support is added this all becomes a constant 4-bytes before the RIP relative offset is encoded (3-byte VEX encoded opcode + modR/M byte).
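The patching described above can be sketched as follows. This is a simplified illustration that works on offsets into a buffer rather than raw pointers. It assumes the buffer base is 16-byte aligned, a little-endian host, and a legacy-encoded `movups xmm, [rip+disp32]`. `is_rex` and `patch_x128` are hypothetical helpers, not the Mono code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In 64-bit mode a REX prefix is any byte in the range 0x40..0x4F. */
static int is_rex(uint8_t b)
{
    return (b & 0xF0) == 0x40;
}

/* Copy a 16-byte constant to data_off and patch the disp32 of the movups
 * at ins_off so it points at it. Returns the displacement written. */
static uint32_t patch_x128(uint8_t *buf, size_t ins_off, size_t data_off,
                           const uint8_t value[16])
{
    size_t prefix = is_rex(buf[ins_off]) ? 1 : 0;
    size_t disp_off = ins_off + prefix + 3; /* skip 0x0F 0x10 + modR/M */
    size_t ins_end = disp_off + 4;          /* RIP points past the disp32 */
    uint32_t disp;

    assert(data_off % 16 == 0 && data_off >= ins_end);
    disp = (uint32_t)(data_off - ins_end);

    memcpy(buf + data_off, value, 16);
    memcpy(buf + disp_off, &disp, sizeof disp);
    return disp;
}
```

Without a REX prefix the displacement starts 3 bytes in and the instruction is 7 bytes long; with one it starts 4 bytes in and the instruction is 8 bytes long, matching the `pos + 3` / `code - pos - 7` and `pos + 4` / `code - pos - 8` pairs in the hunk.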

fanyang-mono · 2023-02-14T18:43:26Z

/azp run runtime-extra-platforms

azure-pipelines · 2023-02-14T18:43:48Z

Azure Pipelines successfully started running 1 pipeline(s).

tannergooding · 2023-02-15T13:57:34Z

extra-platforms failures are all networking or other flaky tests that are also seeing failures in the mainline.

ghost assigned tannergooding Feb 9, 2023

tannergooding force-pushed the numerics-rewrite branch 6 times, most recently from fe252c3 to cbee606 Compare February 9, 2023 19:55

Adding support to mono for v128 constants

f3f0340

tannergooding force-pushed the numerics-rewrite branch from cbee606 to f3f0340 Compare February 9, 2023 21:46

tannergooding marked this pull request as ready for review February 9, 2023 22:29

tannergooding requested review from vargaz, lambdageek and SamMonoRT as code owners February 9, 2023 22:30

lewing requested review from radekdoulik and fanyang-mono February 10, 2023 00:12

Ensure that mini-amd64 can handle Vector128<T> constants

046ddc9

tannergooding commented Feb 10, 2023

Have xconst "maximum instruction length" match r8const

ed988af

vargaz reviewed Feb 11, 2023

src/mono/mono/mini/aot-compiler.c (resolved)

vargaz reviewed Feb 11, 2023

src/mono/mono/mini/aot-runtime.c (outdated, resolved)

Ensure the right data.target indices are written

e21a876

vargaz reviewed Feb 13, 2023

src/mono/mono/mini/cpu-x86.mdesc (outdated, resolved)

vargaz approved these changes Feb 13, 2023

fanyang-mono approved these changes Feb 13, 2023

Correctly handle the X128 patch point

1fa29d7

tannergooding commented Feb 13, 2023

Ensure MONO_TYPE_I and MONO_TYPE_U are handled

cef739e

tannergooding force-pushed the numerics-rewrite branch from 1afa16f to cef739e Compare February 14, 2023 00:14

build-analysis bot mentioned this pull request Feb 14, 2023

Roslyn source generator crash on mono/linux/arm64 #81123

Closed

Ensure Vector`1 is handled

c94cb5c

build-analysis bot mentioned this pull request Feb 14, 2023

Tracking issue for CI build timeouts #76454

Closed

tannergooding merged commit db5dfad into dotnet:main Feb 15, 2023

tannergooding deleted the numerics-rewrite branch February 15, 2023 13:57

This was referenced Feb 22, 2023

[Perf] Linux/x64: 32 Improvements on 2/15/2023 2:23:18 PM dotnet/perf-autofiling-issues#13307

Open

[Perf] Linux/x64: 43 Improvements on 2/15/2023 2:23:18 PM dotnet/perf-autofiling-issues#13312

Open

ghost locked as resolved and limited conversation to collaborators Mar 17, 2023
