Contain memory operands under casts #72719
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch

Issue Details

Fold the sign/zero-extension into an appropriate load. The change implements full support on XARCH, partial support on ARM/64, and punts on LA. A follow-up issue will be filed to track the remaining work. The ARM/64 commit contains some notes on what is required for said support on our load/store architectures.

The diffs are nice and simple:

- ldr  lr, [sp+0x34] // [V07 arg7]
- uxtb lr, lr
+ ldrb lr, [sp+0x34] // [V07 arg7]
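For context, here is a minimal C# sketch (my own illustration, not taken from the PR) of the kind of source that produces this shape: a small-typed argument passed on the stack, whose normalize-on-load read previously required a separate extension instruction.

// With enough preceding integer arguments, 'b' is passed on the stack on
// both ARM and ARM64. Reading it as an int used to emit a plain load plus
// "uxtb"; with this change the JIT can emit a single zero-extending "ldrb".
static int UseStackByte(int a0, int a1, int a2, int a3,
                        int a4, int a5, int a6, int a7, byte b)
{
    return b + 1;
}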
Force-pushed from d3d8af4 to 6ffddae.
@dotnet/jit-contrib
The diffs look really good.
I tried doing something similar here: #70756 - but I didn't get to work on it for long.
I think it would be good to run the fuzzers and libraries stress on this change.
/azp run Fuzzlyn, Antigen, runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-coreclr gcstress0x3-gcstress0xc, runtime-coreclr jitstressregs
Azure Pipelines successfully started running 6 pipeline(s).
TODO: consider using a dedicated IND_EXT oper for ARM/ARM64 instead of containment. This would allow us to cleanly handle all indirections. It would not mean we'd give up on the casts containment, as we'd still need to handle the "reg optional" case.

IND_EXT would be much like an ordinary IND, but have a "source" and a "target" type. The "target" type would always be int/long, while the "source" could be of any integral type. This design would be a bit more natural, and nicely separable from casts.

However, the main problem with the current state of things, apart from the fact that codegen of indirections is tied strongly to "GenTreeIndir", is that changing the type of the load can invalidate LEA containment. One would think this is solvable with some tricks, like re-running containment analysis on an indirection after processing the cast, but ARM64 codegen doesn't support uncontained LEAs in some cases. A possible solution to that problem is uncontaining the whole address tree. That would be messy, but it ought to work. An additional complication is that these trees can contain a lot of contained operands as part of ADDEX and BFIZ, so those would first have to be made into proper EXOPs. In any case, this is all future work.
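To make the proposed shape concrete, here is a purely hypothetical sketch of the data an IND_EXT node would carry. All names are invented for this illustration, and C# is used only for brevity; the real node would be a C++ GenTree subclass.

// Hypothetical IND_EXT: an ordinary indirection plus a separate "source"
// (memory) type and "target" (register) type, as described above.
enum ExtendKind { ZeroExtend, SignExtend }

sealed class IndExtNode
{
    public object     Address = null!;   // the address tree, as in IND
    public int        SourceSizeBytes;   // 1, 2, or 4: the size of the load
    public int        TargetSizeBytes;   // 4 or 8: always int/long
    public ExtendKind Extension;         // zero- or sign-extending load
}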
In CAST<short>(IND<byte>(...)), "m_extendSrcSize" must be "1": the extension is performed from the one-byte load, not from the cast's two-byte target type.

Modulo the above, stress runs looked clean.
Force-pushed from 0da2d79 to 7ec2a8a.
It's curious to me that there are so many x86 improvements. Do we not already remove most of these unnecessary casts there by retyping loads in morph?
I think most of these come from normalize-on-load stack parameters, which we do not retype.
Doesn't morph introduce this IR shape itself in [...]?
It of course does, and it is necessary for in-register parameters (and for on-stack ones as well, though the latter should be relatively easy to fix by using extending loads). Morph's logic there is questionable in more than one way: if we know the local will be DNER at that point, we could retype it to its small type instead of introducing a cast. However, this turns out to sometimes not be a win, because we don't CSE locals, but do CSE casts.
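To illustrate the CSE point with a sketch of my own (not from the PR): in the method below, 's' is a normalize-on-load parameter with multiple uses.

// Each use of 's' conceptually widens the small-typed value. Identical
// widening casts are CSE candidates, so one extended value can be reused;
// a local retyped to its small type would not benefit the same way, since
// locals themselves are not CSE'd.
static int SquarePlus(short s) => s * s + s;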
Makes sense about CSE. The way [...]

Anyway, thanks for the contribution as usual.
Improvements on [...]
Possible regressions: [...]
Recognize legal patterns in lowering and then fold the sign/zero-extension into an appropriate load at codegen time.
Regressions are due to register allocation and code alignment differences.