Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fold "cns"[cns] for ROS<char> #78593

Merged
merged 15 commits into from
Nov 21, 2022
Merged

Fold "cns"[cns] for ROS<char> #78593

merged 15 commits into from
Nov 21, 2022

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Nov 19, 2022

Closes #78359
e.g.:

int Test() => DoWork("D");

[MethodImpl(MethodImplOptions.AggressiveInlining)]
int DoWork(ReadOnlySpan<char> format)
{
    if (format.Length == 1)
    {
        switch (format[0] | 0x20)
        {
            case 'a': return 1;
            case 'b': return 2;
            case 'c': return 4;
            case 'd': return 8;
            case 'e': return 16;
            case 'f': return 32;
        }
    }
    throw new Exception();
}

Current codegen for Test:

; Method ConsoleApp8.C:Test():int:this
G_M000_IG01:                ;; offset=0000H
       56                   push     rsi
       4883EC20             sub      rsp, 32
G_M000_IG02:                ;; offset=0005H
       48B84023807D78020000 mov      rax, 0x2787D802340
       488B00               mov      rax, gword ptr [rax]
       4883C00C             add      rax, 12
       0FB700               movzx    rax, word  ptr [rax]
       83C820               or       eax, 32
       83C09F               add      eax, -97
       83F805               cmp      eax, 5
       7746                 ja       SHORT G_M000_IG10
       8BC0                 mov      eax, eax
       488D0D63000000       lea      rcx, [reloc @RWD00]
       8B0C81               mov      ecx, dword ptr [rcx+4*rax]
       488D15CEFFFFFF       lea      rdx, G_M000_IG02
       4803CA               add      rcx, rdx
       FFE1                 jmp      rcx
G_M000_IG03:                ;; offset=003CH
       B801000000           mov      eax, 1
       EB21                 jmp      SHORT G_M000_IG09
G_M000_IG04:                ;; offset=0043H
       B802000000           mov      eax, 2
       EB1A                 jmp      SHORT G_M000_IG09
G_M000_IG05:                ;; offset=004AH
       B804000000           mov      eax, 4
       EB13                 jmp      SHORT G_M000_IG09
G_M000_IG06:                ;; offset=0051H
       B808000000           mov      eax, 8
       EB0C                 jmp      SHORT G_M000_IG09
G_M000_IG07:                ;; offset=0058H
       B810000000           mov      eax, 16
       EB05                 jmp      SHORT G_M000_IG09
G_M000_IG08:                ;; offset=005FH
       B820000000           mov      eax, 32
G_M000_IG09:                ;; offset=0064H
       4883C420             add      rsp, 32
       5E                   pop      rsi
       C3                   ret      
G_M000_IG10:                ;; offset=006AH
       48B9582E82E0FA7F0000 mov      rcx, 0x7FFAE0822E58
       E84744AC5F           call     CORINFO_HELP_NEWSFAST
       488BF0               mov      rsi, rax
       488BCE               mov      rcx, rsi
       FF153BB10800         call     [System.Exception:.ctor():this]
       488BCE               mov      rcx, rsi
       E843509F5F           call     CORINFO_HELP_THROW
       CC                   int3     
RWD00  	dd	00000037h ; case G_M000_IG03
       	dd	0000003Eh ; case G_M000_IG04
       	dd	00000045h ; case G_M000_IG05
       	dd	0000004Ch ; case G_M000_IG06
       	dd	00000053h ; case G_M000_IG07
       	dd	0000005Ah ; case G_M000_IG08
; Total bytes of code: 142

New codegen for Test:

; Method ConsoleApp8.C:Test():int:this
       B808000000           mov      eax, 8
       C3                   ret      
; Total bytes of code: 6

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 19, 2022
@ghost
Copy link

ghost commented Nov 19, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Closes #78359
e.g.:

int Test() => DoWork("D");

[MethodImpl(MethodImplOptions.AggressiveInlining)]
int DoWork(ReadOnlySpan<char> format)
{
    if (format.Length == 1)
    {
        switch (format[0] | 0x20)
        {
            case 'a': return 1;
            case 'b': return 2;
            case 'c': return 4;
            case 'd': return 8;
            case 'e': return 16;
            case 'f': return 32;
        }
    }
    throw new Exception();
}

Current codegen for Test:

; Method ConsoleApp8.C:Test():int:this
G_M000_IG01:                ;; offset=0000H
       56                   push     rsi
       4883EC20             sub      rsp, 32
G_M000_IG02:                ;; offset=0005H
       48B84023807D78020000 mov      rax, 0x2787D802340
       488B00               mov      rax, gword ptr [rax]
       4883C00C             add      rax, 12
       0FB700               movzx    rax, word  ptr [rax]
       83C820               or       eax, 32
       83C09F               add      eax, -97
       83F805               cmp      eax, 5
       7746                 ja       SHORT G_M000_IG10
       8BC0                 mov      eax, eax
       488D0D63000000       lea      rcx, [reloc @RWD00]
       8B0C81               mov      ecx, dword ptr [rcx+4*rax]
       488D15CEFFFFFF       lea      rdx, G_M000_IG02
       4803CA               add      rcx, rdx
       FFE1                 jmp      rcx
G_M000_IG03:                ;; offset=003CH
       B801000000           mov      eax, 1
       EB21                 jmp      SHORT G_M000_IG09
G_M000_IG04:                ;; offset=0043H
       B802000000           mov      eax, 2
       EB1A                 jmp      SHORT G_M000_IG09
G_M000_IG05:                ;; offset=004AH
       B804000000           mov      eax, 4
       EB13                 jmp      SHORT G_M000_IG09
G_M000_IG06:                ;; offset=0051H
       B808000000           mov      eax, 8
       EB0C                 jmp      SHORT G_M000_IG09
G_M000_IG07:                ;; offset=0058H
       B810000000           mov      eax, 16
       EB05                 jmp      SHORT G_M000_IG09
G_M000_IG08:                ;; offset=005FH
       B820000000           mov      eax, 32
G_M000_IG09:                ;; offset=0064H
       4883C420             add      rsp, 32
       5E                   pop      rsi
       C3                   ret      
G_M000_IG10:                ;; offset=006AH
       48B9582E82E0FA7F0000 mov      rcx, 0x7FFAE0822E58
       E84744AC5F           call     CORINFO_HELP_NEWSFAST
       488BF0               mov      rsi, rax
       488BCE               mov      rcx, rsi
       FF153BB10800         call     [System.Exception:.ctor():this]
       488BCE               mov      rcx, rsi
       E843509F5F           call     CORINFO_HELP_THROW
       CC                   int3     
RWD00  	dd	00000037h ; case G_M000_IG03
       	dd	0000003Eh ; case G_M000_IG04
       	dd	00000045h ; case G_M000_IG05
       	dd	0000004Ch ; case G_M000_IG06
       	dd	00000053h ; case G_M000_IG07
       	dd	0000005Ah ; case G_M000_IG08
; Total bytes of code: 142

New codegen for Test:

; Method ConsoleApp8.C:Test():int:this
       B808000000           mov      eax, 8
       C3                   ret      
; Total bytes of code: 6
Author: EgorBo
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo EgorBo marked this pull request as ready for review November 19, 2022 19:56
@EgorBo
Copy link
Member Author

EgorBo commented Nov 19, 2022

@SingleAccretion can you please take a look at the VN side when you have time, we discussed this one yesterday. For the issue with the side-effects I decided to just call gtUpdateStmtSideEffects for current statement and it worked.

{
ValueNum addrVN = tree->gtGetOp1()->gtVNPair.GetLiberal();
VNFuncApp funcApp;
if (!varTypeIsShort(tree) || !vnStore->GetVNFunc(addrVN, &funcApp))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we interested in TYP_SHORT (signed) trees, since I assume this is a TP heuristic?

This shouldn't have to recompute addrVN + GetVNFunc, since the caller already did that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SingleAccretion can you point me where exactly caller already did it? it seems like other paths also compute it

else if (vnStore->GetVNFunc(addrNvnp.GetLiberal(), &funcApp) && (funcApp.m_func == VNF_PtrToStatic))
{
fldSeq = vnStore->FieldSeqVNToFieldSeq(funcApp.m_args[1]);
offset = vnStore->ConstantValue<ssize_t>(funcApp.m_args[2]);
// Note VNF_PtrToStatic statics are currently always "simple".
fgValueNumberFieldLoad(tree, /* baseAddr */ nullptr, fldSeq, offset);
}
else if (tree->OperIs(GT_IND) && fgValueNumberConstLoad(tree->AsIndir()))
{
// VN is assigned inside fgValueNumberConstLoad
}
else if (vnStore->GetVNFunc(addrNvnp.GetLiberal(), &funcApp) && (funcApp.m_func == VNF_PtrToArrElem))
{
fgValueNumberArrayElemLoad(tree, &funcApp);
}
.

Changed to TYP_SHORT (yes, it was a TP fast-out path)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you point me where exactly caller already did it? it seems like other paths also compute it

else if (vnStore->GetVNFunc(addrNvnp.GetLiberal(), &funcApp) computes it. It would be good to change this to compute the thing once like ASG numbering does.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mind if I leave it as is, I've just pushed a change to extract vnfunc only after TP-oriented checks. I tried to re-organize code to save these two existing lookups and didn't like the outcome, feel free to file a PR to clean it up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/valuenum.cpp Outdated Show resolved Hide resolved
EgorBo and others added 2 commits November 19, 2022 23:41
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
@am11
Copy link
Member

am11 commented Nov 19, 2022

        switch (format[0] | 0x20)

Looks like with char.ToLowerInvariant() it emits an extra mov and a jmp.

+            switch (char.ToLowerInvariant(value[0]))
-            switch (value[0] | 0x20)
C.M2()
+    L0000: mov rcx, 0x207d38020d8
+    L000a: mov rcx, [rcx]
+    L000d: jmp 0x00007ff863530090
-    L0000: mov eax, 8
-    L0005: ret

and gets worse with value.ToLowerInvariant()[0]. Maybe importer can normalize them to | 0x20 equivalent for ASCII / A-Za-z values (since ToLowerInvariant() is more common than | 0x20).

@EgorBo
Copy link
Member Author

EgorBo commented Nov 20, 2022

@am11 good point! That jump is a tail call to TextInfo:ToLowerInvariant which is not inlined, I've pushed a change that fixes that - 369179d, e.g.:

char foo() => char.ToLowerInvariant('B');

Was:

; Method Prog:foo():ushort:this
       B942000000           mov      ecx, 66 ; 'B'
       FF25556B1500         tail.jmp [System.Globalization.TextInfo:ToLowerInvariant(ushort):ushort]
; Total bytes of code: 11

Now:

; Method Prog:foo():ushort:this
       B862000000           mov      eax, 98 ; 'b'
       C3                   ret      
; Total bytes of code: 6

I think it's fine to mark those as AggressiveInlining: Invariant is folded always (inliner has no chance to know it in advance) and IsAsciiCodePoint and ToUpperAsciiInvariant are single instructions basically.

@EgorBo
Copy link
Member Author

EgorBo commented Nov 20, 2022

@jkotas does VM side look good to you?

@jkotas
Copy link
Member

jkotas commented Nov 20, 2022

The VM side looks good.

@EgorBo
Copy link
Member Author

EgorBo commented Nov 20, 2022

@SingleAccretion can you take a look again? 🙂

As I said in some of the comments, I decided to avoid assertprop changes in this PR - so it still folds everything, but might leave nullchecks in some cases. I'll work separately (once this lands and collections are updated) to see if it's worth the effort and JIT-TP to clean them up (local runs of jit-diff state that there are not so many hits)

Copy link
Contributor

@SingleAccretion SingleAccretion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jit side changes LGTM.

@EgorBo
Copy link
Member Author

EgorBo commented Nov 20, 2022

@jakobbotsch Can you please sign off the jit side (if it looks good to you)

@EgorBo EgorBo merged commit 0ff0f7d into dotnet:main Nov 21, 2022
@EgorBo
Copy link
Member Author

EgorBo commented Nov 21, 2022

Merging to have a new collection by Monday.
Two CI failures are unrelated because they're Mono only, e.g. iOS seems to be struggling connecting to a test device in that test (infra issue)

@jkotas
Copy link
Member

jkotas commented Nov 21, 2022

Two CI failures are unrelated because they're Mono only

We still want to keep track of the unrelated intermittent failures to ensure that they are getting fixed: https://github.com/dotnet/runtime/blob/main/docs/pr-builds.md#what-to-do-if-you-determine-the-failure-is-unrelated

Both of the failures are known - I have added comments to the tracking issues for you.

@EgorBo EgorBo mentioned this pull request Nov 22, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Dec 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inlined ReadOnlySpan<char> parameter lacks string-like optimizations
5 participants