[Arm64] ASIMD Implement widening, narrowing, saturating instructions #35379

echesakov · 2020-04-23T23:45:25Z

Adds new instructions for #32512 #35143

Also fixes #35303

Attached are jitDisasm and WinDbg u command outputs.
jitDisasm
WinDbg(u)


src/coreclr/src/jit/codegenarm64.cpp

echesakov · 2020-04-23T23:58:44Z

src/coreclr/src/jit/emitarm64.cpp

-        case IF_BI_0B:
-        case IF_BI_0C:
-        case IF_BI_1A:
-        case IF_BI_1B:


@BruceForstall @briansull Do you know why this giant sequence of case-s is needed? It clearly does not check whether the fmt argument passed from outside (i.e. from one of the emitIns_X_Y_Z functions) matches the instruction format specified by INST1() in the instrsarm64.h table. If someone makes a mistake in that table, it will go unnoticed (it happened to me yesterday). I believe it is better to replace this with a simple (fmt == insFmt) check in the default: case. Thoughts?

This seems reasonable to me. I guess the idea here is that for INST1, there is exactly one fmt. For INST2+, there is a "pseudo-format" in the "fmt" part of the macro that then maps to the actual format.

The ones named here were all of the formats that only had one possible format. This switch ensures that you properly update this method when adding new instruction formats. With your change, the assumption is made that a newly added instrFormat is an INST1 case by default, whereas before your change it would assert on the newly added format.

But after looking into your change I believe that what you have added here will also work fine.

> The ones named here were all of the formats that only had one possible format. This switch ensures that you properly update this method when adding new instruction formats. With your change, the assumption is made that a newly added instrFormat is an INST1 case by default, whereas before your change it would assert on the newly added format.

That would assert only if you haven't added the new format to the switch and have set fmt to the same value in one of the emitIns_X_Y_Z functions. But if the latter is false (i.e. the function sets the wrong format), then it wouldn't assert.

After this change it is no longer required to update this switch, and it will automatically check that the fmt set in emitIns is consistent with the one declared in the instrsarm64.h table.
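To make the trade-off discussed above concrete, here is a minimal sketch of a table-driven consistency check. It is not the code from this PR: the enums, the `declaredFormat` table, the `IF_PSEUDO` placeholder, and the helper names are all hypothetical stand-ins, and the instruction-to-format pairings are illustrative only.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the JIT's enums.
// IF_PSEUDO represents any "pseudo-format" used by multi-encoding instructions.
enum insFormat { IF_DV_2T, IF_DV_3H, IF_PSEUDO };
enum instruction { INS_sadalp, INS_saddl, INS_multi, INS_count };

// One entry per instruction: the format declared in the instruction table.
// The pairings below are illustrative, not copied from instrsarm64.h.
static const insFormat declaredFormat[INS_count] = {
    IF_DV_2T,  // INS_sadalp: single encoding
    IF_DV_3H,  // INS_saddl:  single encoding
    IF_PSEUDO, // INS_multi:  several encodings behind a pseudo-format
};

// Returns true when the format chosen by an emit function agrees with the table.
// Pseudo-formats are resolved by other logic, so this sketch accepts them as-is.
bool formatMatchesTable(instruction ins, insFormat fmt)
{
    const insFormat insFmt = declaredFormat[ins];
    if (insFmt == IF_PSEUDO)
    {
        return true;
    }
    return fmt == insFmt;
}

// Mirrors the idea of asserting in the default: case instead of listing every
// single-encoding format in a switch.
void assertFormat(instruction ins, insFormat fmt)
{
    assert(formatMatchesTable(ins, fmt));
}
```

With a check of this shape, a new single-encoding format needs no switch update, and a mismatch between the emit function and the table is caught by the assert.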

BruceForstall · 2020-04-24T01:44:51Z

src/coreclr/src/jit/codegenarm64.cpp

@@ -7864,6 +7858,40 @@ void CodeGen::genArm64EmitterUnitTests()
    theEmitter->emitIns_R_R(INS_fcvtn2, EA_8BYTE, REG_V0, REG_V1);
 #endif

+#ifdef ALL_ARM64_EMITTER_UNIT_TESTS
+    // sadalp vector
+    theEmitter->emitIns_R_R(INS_sadalp, EA_8BYTE, REG_V0, REG_V1, INS_OPTS_8B);


These are a little odd because Vd and Vn take different arrangement specifiers, even though one implies the other. So which register does this insOpts specifier refer to? There must be other SIMD instructions like this as well. Should we have a function that takes two, and asserts they are as expected (e.g., if we have Vd.4H then we have Vn.8B)?

Are there instructions where the two specifiers can both be arbitrary (and unrelated)?

I would say most of the added "long" instructions (except the high-narrow ones, addhn{2} and subhn{2}) specify the arrangement of their source register(s).

I don't think we want to specify two arrangements when emitting at the moment. This would complicate the logic in the intrinsic backend: you would not only need to infer the insOpts from an incoming argument or return value, but also compute the second arrangement somehow.

> Are there instructions where the two specifiers can both be arbitrary (and unrelated)?

I have not seen such instructions, and I hope there won't be any.
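As a rough illustration of why one arrangement implies the other, the sketch below derives the destination arrangement of a long pairwise instruction (sadalp, saddlp, uadalp, uaddlp) from its source arrangement: the element width doubles and the element count halves. The `Arrangement` enum and both function names are hypothetical, and this is not the implementation of `emitter::optWidenDstArrangement` from this PR.

```cpp
#include <cassert>

// Hypothetical arrangement enum; the JIT uses insOpts values instead.
enum Arrangement
{
    ARR_8B, ARR_16B, ARR_4H, ARR_8H, ARR_2S, ARR_4S, ARR_1D, ARR_2D, ARR_NONE
};

// Destination arrangement implied by the source arrangement of a long pairwise
// instruction. Returns ARR_NONE when the source has no valid widened form.
Arrangement pairwiseLongDst(Arrangement src)
{
    switch (src)
    {
        case ARR_8B:  return ARR_4H;
        case ARR_16B: return ARR_8H;
        case ARR_4H:  return ARR_2S;
        case ARR_8H:  return ARR_4S;
        case ARR_2S:  return ARR_1D;
        case ARR_4S:  return ARR_2D;
        default:      return ARR_NONE;
    }
}

// The two-specifier check suggested in the review: accept a (dst, src) pair
// only when dst is exactly the arrangement implied by src.
bool isValidPairwiseLongPair(Arrangement dst, Arrangement src)
{
    const Arrangement expected = pairwiseLongDst(src);
    return (expected != ARR_NONE) && (expected == dst);
}
```

Because the mapping is a pure function of the source arrangement, passing a single specifier to the emitter loses no information for these instructions.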

briansull

LGTM

briansull · 2020-04-27T19:56:43Z

src/coreclr/src/jit/emitarm64.cpp

-        case IF_BI_0B:
-        case IF_BI_0C:
-        case IF_BI_1A:
-        case IF_BI_1B:


The ones named here were all of the formats that only had one possible format. This switch ensures that you properly update this method when adding new instruction formats. With your change, the assumption is made that a newly added instrFormat is an INST1 case by default, whereas before your change it would assert on the newly added format.

briansull · 2020-04-27T20:03:10Z

src/coreclr/src/jit/emitarm64.cpp

-        case IF_BI_0B:
-        case IF_BI_0C:
-        case IF_BI_1A:
-        case IF_BI_1B:


But after looking into your change I believe that what you have added here will also work fine.


echesakov · 2020-04-29T01:10:55Z

Last commits are:

Fix disassembly of saddw{2}, uaddw{2}, ssubw{2}, usubw{2}
Address Bruce's feedback regarding diagnostic message
Merge latest master

echesakov added 30 commits April 23, 2020 12:56

Add sadalp, uadalp in instrsarm64.h

b347d25

Add sabal{2}, uabal{2} in instrsarm64.h

253815e

Add sabdl{2}, uabdl{2} in instrsarm64.h

6e60f13

Add addhn{2}, raddhn{2}, rsubhn{2}, subhn{2} in instrsarm64.h

be774c6

Add shadd, shsub, uhadd, uhsub in instrsarm64.h

5022a16

Add srhadd, urhadd in instrsarm64.h

0c0a7a5

Add saddl{2}, uaddl{2} in instrsarm64.h

24124b6

Add ssubl{2}, usubl{2} in instrsarm64.h

8a512cd

Add saddw{2}, uaddw{2} in instrsarm64.h

b0c15ff

Add ssubw{2}, usubw{2} in instrsarm64.h

f583c85

Add saddlp, uaddlp in instrsarm64.h

7a93934

Add pmull{2} in instrsarm64.h

f680968

Add sqadd, sqsub, uqadd, uqsub in instrsarm64.h

be96f22

Add smlal{2}, umlal{2} in instrsarm64.h

32a3b40

Add smlsl{2}, umlsl{2} in instrsarm64.h

8709fff

Add smull2, umull2 in instrsarm64.h

c21a7fa

Update smull,umull in instrsarm64.h

ebdbebb

Extend scalar encoding IF_DV_3E to support size != 11 in emitarm64.cpp emitfmtsarm64.h

6e3896d

Remove giant sequence of case-s and replace them with simple check (fmt == insFmt) in emitarm64.cpp

1ba35c6

Formatting in emitfmtsarm64.h

1007ed6

Add IF_DV_3H and IF_DV_3HI in emitfmtsarm64.h

3f68e8f

Add EN3K and EN2R in emitarm64.cpp emitfmtsarm64.h

c0a929a

Add emitter::optWidenDstArrangement in emitarm64.cpp emitarm64.h

368abb4

Extend encoding IF_DV_2T to support Long Pairwise in emitarm64.cpp

601ec9d

Implement emitter::emitInsSanityCheck for IF_DV_3H and IF_DV_3HI in emitarm64.cpp

f4ce96e

Implement emitter::emitOutputInstr for IF_DV_3H and IF_DV_3HI in emitarm64.cpp

528292b

Implement emitter::emitDispIns for IF_DV_3H and IF_DV_3HI in emitarm64.cpp

c816d1a

Mark IF_DV_3F, IF_DV_3H and IF_DV_3HI as not-writing to GC registers in emitarm64.cpp

159adbf

Implement sqadd, sqsub, uqadd, uqsub in emitter::emitIns_R_R_R in emitarm64.cpp

fe96fcd

Implement sadalp, saddlp, uadalp and uaddlp in emitter::emitIns_R_R in emitarm64.cpp

55ee8f4

echesakov added 12 commits April 23, 2020 16:34

Implement sabal{2}, sabdl{2}, saddl{2}, saddw{2}, smlal{2}, smlsl{2}, ssubl{2}, ssubw{2}, uabal{2}, uabdl{2}, uaddl{2}, uaddw{2}, umlal{2}, umlsl{2}, usubl{2}, usubw{2} in emitarm64.cpp

c3aa7e7

Implement pmull{2} in emitter::emitIns_R_R_R in emitarm64.cpp

38e965e

Implement smlal{2}, smlsl{2}, smull{2}, umlal{2}, umlsl{2}, umull{2} in emitter::emitIns_R_R_R_I in emitarm64.cpp

f5bd031

Update mul, pmul, smull, umull in emitter::emitIns_R_R_R in emitarm64.cpp

49ceba5

Refactor emitter::emitIns_R_R_R in emitarm64.cpp

05367d9

Implement shadd, shsub, srhadd, uhadd, uhsub, urhadd in emitter::emitIns_R_R_R in emitarm64.cpp

bbcb3eb

Add Arm64 emitter unit tests in codegenarm64.cpp

27d37cb

Add PerfScore for the new instructions in emitarm64.cpp

c5277ab

Add PerfScore for ssra,srsra,usra,ursra in emitarm64.cpp

77e6f8c

Why "No point doing this in a "real" JIT."? in codegenarm64.cpp

5b24c05

Fix disassembly for sxtb, sxth, sxtb, uxtb and uxth in emitarm64.cpp

671183b

Support "Crypto polynomial (64x64) multiply long" pmull{2} in codegenarm64.cpp emitarm64.cpp emitfmtsarm64.h

afe5dd0

echesakov added the arch-arm64 and area-CodeGen-coreclr labels Apr 23, 2020

echesakov commented Apr 24, 2020


echesakov requested review from BruceForstall and briansull April 24, 2020 00:02

echesakov marked this pull request as ready for review April 24, 2020 00:03

BruceForstall reviewed Apr 24, 2020


BruceForstall approved these changes Apr 27, 2020


briansull approved these changes Apr 27, 2020


echesakov added 3 commits April 28, 2020 18:00

Support saddw{2}, uaddw{2}, ssubw{2}, usubw{2} in emitter::emitDispIns in emitarm64.cpp

535e6c5

Put #ifdef-#endif-s around printf-s in CodeGen::genArm64EmitterUnitTests() in codegenarm64.cpp

1561d41

Merge branch 'master' into Arm64-ASIMD-Widening-Narrowing-Saturating-Instructions

74a1fa8

echesakov merged commit f03d585 into dotnet:master Apr 29, 2020

echesakov deleted the Arm64-ASIMD-Widening-Narrowing-Saturating-Instructions branch April 29, 2020 03:54

echesakov mentioned this pull request Apr 29, 2020

Add VectorTableList and TableVectorExtension intrinsics #35600

Merged

ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
