LSRA: Honor the register aversion decision during allocation #92709

kunalspathak · 2023-09-27T14:11:57Z

Taken from idea by @jakobbotsch in #92586 (comment) and slightly tweaked it.

ghost · 2023-09-27T14:12:12Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Taken from idea by @jakobbotsch in #92586 (comment) and slightly tweaked it.

Author:	kunalspathak
Assignees:	kunalspathak
Labels:	`area-CodeGen-coreclr`
Milestone:	-

jakobbotsch · 2023-09-27T14:24:36Z

src/coreclr/jit/lsra.cpp

+    regMaskTP updatedPreferences = (preferences & ~currentInterval->registerAversion);
+    if (updatedPreferences != RBM_NONE)
+    {
+        // registerAversion is the best-effort opportunity. Do not update the preference
+        // if registerAversion contains all the registers present in preferences.
+        preferences = updatedPreferences;
+    }


This defeats a lot of the purpose of having this -- I would expect the anti preferences to be more important than the preferences as allocating an anti-preferenced register is (always?) going to result in a spill, while allocating an unpreferred register is just going to result in a reg-reg move. Does this PR with this change fix my example in #92586?

The reason we can get to (preferences & ~currentInterval->antiPreferences) == RBM_NONE that your patch has is because during mergeRegisterPreferences() we do not stop the update from happening if the registerPreferences contains registers we want to avoid. The right way to do this is to in addition of tracking registerAversion make sure registerPreferences is up to date with that information throughout the allocation, not just do it during allocation of that interval/RefPosition. Reason being, we check if the interval's assigned physical register (physReg) is already part of registerPreferences and if yes, we skip the allocation and just assign the same register at the RefPosition. This happens regardless of if that register is present in registerAversion mask. By making sure we never put register in registerPreferences that are already in registerAversion we avoid cases like following:

For V01, if we do not keep registerPreferences up to date, we end up with:

Interval 1: (V01) ref RefPositions {#0@0 #3@9 #18@35 #40@53} physReg:rdx Preferences=[rdx r8] Aversions=[rcx rdx]

When we allocate for RefPosition #0, we see rdx is in registerPreferences and assign it.

By keeping things up to date, we get nice diff for such scenarios:

Now, I still have a check of RBM_NONE which I ideally won't to get rid of and just add an assert(updatedPreferences != RBM_NONE), but I need to see if there are any corner JitStressRegs cases where this is not correct.

Does this PR with this change fix my example in #92586?

It does:

mov rax, rcx cmp qword ptr [r8], 0 mov rcx, rdx cmovl rcx, rax mov qword ptr [r8], rcx mov rcx, rdx mov rdx, rax call [helloworld._92586:Bar(long,long)] nop

during mergeRegisterPreferences() we do not stop the update from happening if the registerPreferences contains registers we want to avoid

IMO that behavior is fine and better than trying to keep registerPreferences as a mix of preferences and anti preferences during build, where both preferences and anti preferences may be found in any order. It seems simpler to me to only combine preferences and antiPreferences at the last step right before we need it (in effect making both sets "add-only" during the build step). But maybe it complicates things during allocation stage given that we also want to update the preferences there.

Now, I still have a check of RBM_NONE which I ideally won't to get rid of and just add an assert(updatedPreferences != RBM_NONE), but I need to see if there are any corner JitStressRegs cases where this is not correct.

I suppose it depends on what the model is... for example, I would think that preferences & ~antiPreferences == RBM_NONE just means that at some point the interval would prefer to be in registers that at other points the interval would prefer not to be in. That seems like a perfectly reasonable situation that can occur.

It does:

With the change from 985c96a I also think that this case shouldn't occur anymore, but the comment in reset is confusing to me since both preferences and anti preferences are best-effort and I think that anti preferences are more important.

We should check memory impact of this change too since adding a field to Interval is pretty much equivalent to adding a field to GenTree.

We should check memory impact of this change too since adding a field to Interval is pretty much equivalent to adding a field to GenTree.

Can you remind me how to do it? I know that we spit that information as part of JITDump, but how do we get the aggregate data?

Flip the define here to enable memory stats in release builds:

runtime/src/coreclr/jit/jit.h

Line 511 in dd8e9aa

#define MEASURE_MEM_ALLOC 0 // You can set this to 1 to get memory stats in retail, as well

Then set DOTNET_JitMemStats=1 and run superpmi.exe with release JIT for some collection (without parallelization). It should output the aggregate results at the end.

For benchmarks.run collection. Let's see how much asmdiff improvements we see and should justify the cost.

- LSRA_Interval | 123169280 | 2.67% + LSRA_Interval | 135486208 | 2.92%

I'm personally all for taking the memory hit to improve LSRA...

Alright, so I have updated the logic to what you have proposed, but we still need to make sure that we honor the aversion even while updating the preferences as done in here. Doing that gives us some more improvements as seen below for real-world win/x64 collection:

[14:12:51] 126 contexts with diffs (69 improvements, 16 regressions, 41 same size) [14:12:51] -272/+36 bytes

Nice. There seems to be a somewhat significant TP hit, especially in MinOpts... is it just from the change in reset?

It would also be good to check the memory impact for MinOpts.

There seems to be a somewhat significant TP hit, especially in MinOpts

It is impacted more because of changes in mergePreferences which is called both by associateRefPositionWithInterval and updateRegisterPreferences. I could avoid calling updateRegisterPreferences at 2 places where we update registerAversion, but not sure if it is worth spreading the logic at other places.

?associateRefPosWithInterval@LinearScan@@AEAAXPEAVRefPosition@@@Z : 13402194 : +5.14% : 58.86% : +0.0829% ??$select@$0A@@RegisterSelection@LinearScan@@QEAA_KPEAVInterval@@PEAVRefPosition@@@Z : 6813596 : +0.76% : 29.93% : +0.0421% ?newInterval@LinearScan@@AEAAPEAVInterval@@W4var_types@@@Z : 2273723 : +2.56% : 9.99% : +0.0141% ?updateRegisterPreferences@Interval@@QEAAX_K@Z : 228420 : +35.73% : 1.00% : +0.0014% ?mergeRegisterPreferences@Interval@@QEAAX_K@Z : 38714 : NA : 0.17% : +0.0002%

asp.net collection tier0 MinOpts collection of windows/x64:

- LSRA_Interval | 181897840 | 7.39% + LSRA_Interval | 200087624 | 8.07%

ghost · 2023-10-28T05:01:18Z

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

…es with registerPreference

kunalspathak · 2023-11-08T20:15:14Z

@jakobbotsch - let me know if you have any other feedback.

jakobbotsch

LGTM. The TP cost / memory impact is perhaps a bit high for the diffs we're getting, but given that linux-x64 host sees TP improvements that might not reflect much. I'll leave it up to you.

I think there's some follow-up simplifications we can make, e.g. mergeRegisterPreferences tries to guess when the mask represents aversions/anti-preferences. I don't think we need that kind of guessing anymore, so likely we can simplify a lot of stuff there.

kunalspathak · 2023-11-08T21:01:44Z

I think there's some follow-up simplifications we can make, e.g. mergeRegisterPreferences tries to guess when the mask represents aversions/anti-preferences. I don't think we need that kind of guessing anymore, so likely we can simplify a lot of stuff there.

Given that this PR is a bit old, I will go ahead and merge this one and do a follow-up PR. I need to see how other minor changes that I called out in #92709 (comment) could change things.

kunalspathak added 2 commits September 26, 2023 14:39

Add Interval->registerAversion

774f5e0

Include killMask in register aversion

b1b8d2d

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 27, 2023

ghost assigned kunalspathak Sep 27, 2023

kunalspathak mentioned this pull request Sep 27, 2023

JIT: LSRA loses track of anti-preferences #92586

Closed

jakobbotsch reviewed Sep 27, 2023

View reviewed changes

kunalspathak added 2 commits September 27, 2023 12:21

Do not mergePreferences if they are present in registerAversion

985c96a

Add assert for updatedPreferences != RBM_NONE

060b077

kunalspathak changed the title ~~Honor the register aversion decision during allocation~~ LSRA: Honor the register aversion decision during allocation Sep 27, 2023

ghost closed this Oct 28, 2023

kunalspathak reopened this Oct 29, 2023

This was referenced Oct 29, 2023

Timeout in System.Net.Quic.Functional.Tests #86019

Closed

CI error: System.Net.Quic.QuicException: The connection timed out from inactivity #91757

Closed

kunalspathak added 2 commits November 7, 2023 11:54

Merge remote-tracking branch 'origin/main' into lsra-preferences

f364f70

Reset the preference based on allRegs if the registeraversion coincid…

354fb84

…es with registerPreference

kunalspathak marked this pull request as ready for review November 7, 2023 22:14

jakobbotsch approved these changes Nov 8, 2023

View reviewed changes

kunalspathak merged commit c422fca into dotnet:main Nov 9, 2023

kunalspathak deleted the lsra-preferences branch November 9, 2023 01:06

AndyAyersMS mentioned this pull request Nov 14, 2023

[Perf] Linux/x64: 9 Regressions on 11/9/2023 1:06:12 AM #94717

Closed

This was referenced Nov 14, 2023

[Perf] Linux/x64: 7 Improvements on 11/9/2023 1:06:12 AM dotnet/perf-autofiling-issues#24442

Closed

[Perf] Linux/x64: 2 Improvements on 11/10/2023 12:50:59 PM dotnet/perf-autofiling-issues#24444

Open

github-actions bot locked and limited conversation to collaborators Dec 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LSRA: Honor the register aversion decision during allocation #92709

LSRA: Honor the register aversion decision during allocation #92709

kunalspathak commented Sep 27, 2023

ghost commented Sep 27, 2023

jakobbotsch Sep 27, 2023

kunalspathak Sep 27, 2023

jakobbotsch Sep 27, 2023

kunalspathak Sep 27, 2023

jakobbotsch Sep 27, 2023

kunalspathak Sep 27, 2023

jakobbotsch Sep 27, 2023

kunalspathak Nov 7, 2023

jakobbotsch Nov 8, 2023

kunalspathak Nov 8, 2023

ghost commented Oct 28, 2023

kunalspathak commented Nov 8, 2023

jakobbotsch left a comment

kunalspathak commented Nov 8, 2023 •

edited

Loading

LSRA: Honor the register aversion decision during allocation #92709

LSRA: Honor the register aversion decision during allocation #92709

Conversation

kunalspathak commented Sep 27, 2023

ghost commented Sep 27, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost commented Oct 28, 2023

kunalspathak commented Nov 8, 2023

jakobbotsch left a comment

Choose a reason for hiding this comment

kunalspathak commented Nov 8, 2023 • edited Loading

kunalspathak commented Nov 8, 2023 •

edited

Loading