emit branchless form of (i >= 0 && j >= 0) / (i != 0 && j != 0) for signed integers #62689
Conversation
Tagging subscribers to this area: @JulieLeeMSFT

Issue Details: Fixes #61940
src/coreclr/jit/optimizer.cpp
Outdated
```cpp
assert(m_testInfo1.compTree->gtOper == GT_EQ || m_testInfo1.compTree->gtOper == GT_NE ||
       m_testInfo1.compTree->gtOper == GT_LT || m_testInfo1.compTree->gtOper == GT_GE);
```
Suggested change:

```diff
-assert(m_testInfo1.compTree->gtOper == GT_EQ || m_testInfo1.compTree->gtOper == GT_NE ||
-       m_testInfo1.compTree->gtOper == GT_LT || m_testInfo1.compTree->gtOper == GT_GE);
+assert(m_testInfo1.compTree->OperIs(GT_EQ, GT_NE, GT_LT, GT_GE));
```
src/coreclr/jit/optimizer.cpp
Outdated
```diff
-if ((cond->gtOper != GT_EQ) && (cond->gtOper != GT_NE))
+if ((cond->gtOper != GT_EQ) && (cond->gtOper != GT_NE) && (cond->gtOper != GT_LT) && (cond->gtOper != GT_GE))
```
Suggested change:

```diff
-if ((cond->gtOper != GT_EQ) && (cond->gtOper != GT_NE) && (cond->gtOper != GT_LT) && (cond->gtOper != GT_GE))
+if (!cond->OperIs(GT_EQ, GT_NE, GT_LT, GT_GE))
```
```xml
$(CLRTestBatchPreCommands)
set DOTNET_JitNoStructPromotion=1
set DOTNET_JitNoCSE=1
]]></CLRTestBatchPreCommands>
<BashCLRTestPreCommands><![CDATA[
$(BashCLRTestPreCommands)
export DOTNET_JitNoStructPromotion=1
export DOTNET_JitNoCSE=1
]]></BashCLRTestPreCommands>
```
Does this really need `JitNoStructPromotion` and `JitNoCSE`? Seems like a copy & paste error. Also, I don't think this should be in the `Jit/Regression` folder; that is meant for tests for bug fixes. `JIT\opt\OptimizeBools` would be a better place.
Looking at https://dev.azure.com/dnceng/public/_build/results?buildId=1527034&view=ms.vss-build-web.run-extensions-tab there are more regressions than I'd expect to see from something like this. Can you take a look at some of these?
Thanks @AndyAyersMS for pointing that out, I will work on it. This is my first PR on coreclr so I didn't know about superpmi (which is kind of cool). Just to be sure: so that my PR doesn't introduce regressions, do I need to ensure that 'Total bytes of delta' is always negative on the different OS/arch combinations, since my code is supposed to produce less assembly code?
Ideally, yes, a change like this would only produce smaller code. But sometimes this isn't possible. What we typically do is look at the worst cases and see why the code size has increased. Often this uncovers things that can be improved in the optimization, or reveals weaknesses in a related optimization that we should try and address. You should be able to run SPMI on just the specific method instances that have regressed, and from there use jit dumps to analyze what the jit is doing. Let me know if you need help getting this running. Sometimes the root cause of the regressions is complicated, and the regressions are hard to address. We then need to weigh the pluses and minuses to decide if we want to move forward with a change.
@AndyAyersMS I found out my changes introduce calls to hackishModuleName:hackishMethodName(). They don't appear in jit dumps, but they are present in the assembly code. Do you have any idea what they're for and what might cause them to appear?
In the SPMI data collection we often don't collect the actual names (to save some space). During SPMI replay, if the jit then asks for a name, we'll return one of these hackishModuleName:hackishMethodName strings. See runtime/src/coreclr/tools/superpmi/superpmi-shared/methodcontext.cpp, lines 1097 to 1123 in cede333.
It would surprise me if you start seeing calls in the emitted assembly after your change that weren't there before. Probably best to look at a specific example, can you share one?
Here are some examples:
I'll take a closer look at your regressions and let you know what I see.
src/coreclr/jit/optimizer.cpp
Outdated
```diff
-if ((cond->gtOper != GT_EQ) && (cond->gtOper != GT_NE))
+// we don't optimize these statements because we might delete them (e.g. array range checks)
+if (cond->OperIs(GT_LT, GT_GE) && (m_b1->bbFlags != BBF_IMPORTED || m_b2->bbFlags != BBF_IMPORTED))
```
This check looks a little odd to me. We usually don't care about `BBF_IMPORTED` later on in the jit. I wonder if we should be checking something else here instead to convey the meaning better. Do you recall what led you to add this?
Array range check instructions (indexer >= 0, indexer < array.Length) are added to the user code, but later they get deleted during the range check elimination phase. When those instructions pass through the boolean optimization step they get merged with user code, so they're no longer deleted. That's why I added the `BBF_IMPORTED` filter, so I optimize only user code. I know the filter is not constraining enough, but I didn't find a flag/property that could filter only array range check instructions.
This is to fix a regression and not to fix some correctness issue? If so then it's likely a similar effect to what I mention below with jump threading: an upstream opt inhibits a downstream one.

If my experiment to move `optOptimizeBools` to the end of the optimization pass pans out, then this won't be necessary. Should have some data on it soon.
Yes, it's to fix a regression (not to fix a correctness issue).
Unfortunately, this blocks jump threading later on, as it can't reason about compound conditionals. In the baseline jit we see both the BB02 and BB03 conditions, allowing us to fold other branches.

It's probably not possible to anticipate when this effect might happen. Another idea is to teach branch threading to look through boolean operators on compares: if we're on the true side of a branch and we see an AND, we know both conditions must be true; if we're on the false side of an OR, we know both must be false; etc. This might prove to be a bit challenging, but should find more cases (even without this change). Will add notes to #48115.
This indeed fixes it.
Best guess is that by keeping more branches around during optimization we end up creating more assertions, which causes us to lose other assertions and hence lose some optimizations. One of the annoyances of assertion prop is that at some point in large enough methods it just silently stops creating assertions, so spotting this is harder than it should be... it would be good if at the end of assertion gen we reported how many assertions we wanted to create but couldn't, or something. Overall moving the phase is a net improvement; there are just a few annoying bad cases, and I think (but haven't verified) it fixes most of the regressions you were seeing. If you want to play around with this yourself, cherry-pick the head commit from my fork: main...AndyAyersMS:MoveOptOptimizeBoolsLater. Let me think about this for a bit and get back to you with a plan...
@AndyAyersMS I tried reproducing this transformation locally but could not figure it out; can you share what commands you execute to get it? Here is what I'm doing to reproduce the regression:
I usually extract the instance from SPMI, using a script like the following. To make this work you need to fill in the actual paths to the SPMI collection and the baseline jit, as well as your locally built jit. Then you just invoke it with the method context number (reported in the SPMI diff report). This leaves two jit dump files, nnn.d and nnn.d1, for the baseline and your locally built jit, respectively. I then usually diff those to see how/when the two jits behave differently. If you uncomment the `goto :DEBUG` line, the replay runs under the Visual Studio debugger instead.

```bat
@echo off
setlocal
set DEVENV="C:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\IDE\devenv.exe"
set CDIR=%CD%
set COLLECTION=C:\repos\runtime2\artifacts\spmi\mch\63009f0c-662a-485b-bac1-ff67be6c7f9d.windows.x64\aspnet.run.windows.x64.checked.mch
set JIT=C:\repos\runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll
set BASEJIT=C:\repos\runtime2\artifacts\spmi\basejit\2210b7490fbd28b7ce52211339d4ee8af82f1dac.windows.x64.Checked\clrjit.dll
set SPMI=c:\repos\runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\superpmi.exe
set complus_jitdump=*
set complus_ngendump=*
@rem goto :DEBUG
del %CDIR%\%1.d*
set complus_jitstdoutfile=%CDIR%\%1.d
%SPMI% -c %1 %BASEJIT% %COLLECTION%
set complus_jitstdoutfile=%CDIR%\%1.d1
%SPMI% -c %1 %JIT% %COLLECTION%
goto :EOF
:DEBUG
%DEVENV% -debugexe %SPMI% -c %1 %JIT% %COLLECTION%
```

But what you are doing should work too. Just make sure you are running with optimization enabled; typically we do this by disabling tiered compilation by setting an environment variable before launching.
Sorry for the delay on this; I have been preoccupied with other work, and unfortunately that work is going to take another week or so. I still think the right idea is to move this optimization later, but as noted above this causes some complications.
Sorry to say I am still a bit tied up -- hopefully for just another week or two.
```diff
@@ -9176,6 +9226,30 @@ GenTree* OptBoolsDsc::optIsBoolComp(OptTestInfo* pOptTest)
 //    |  \--* LCL_VAR int V03 arg3
 //    \--* CNS_INT int 0
 //
+//    Case 5: (x != 0 && y != 0) => (x | y) != 0
```
Should be (x != 0 || y != 0), I'm assuming?
Yes, it should be (x != 0 || y != 0).
No worries. I've un-added everyone above. Also I might have confused things by pushing a merge to your branch. I should have explained what I was doing in more detail.
Thanks @AndyAyersMS :) I have two questions though:
GitHub has gotten pretty good about seeing through merges, so we generally don't need to rebase/force-push to update PRs these days, unless there are conflicts or something. I'm not sure how much to explain, as it depends on how well you know git. You should have a copy of the runtime repo on your local machine somewhere. The changes in your local repo are on your PR branch.
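The typical update-a-PR-branch flow can be sketched end to end in a throwaway repo (branch and file names below are made up for the demo). After merging main into the PR branch, the merge base advances to main's tip, so a three-dot diff shows only the PR's own change:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email demo@example.com && git config user.name demo

echo base > f.txt && git add f.txt && git commit -qm "base"
git branch -M main
git checkout -qb fix-bools                   # the "PR branch"
echo fix >> f.txt && git commit -qam "my fix"

git checkout -q main                         # "upstream" moves on
echo upstream > g.txt && git add g.txt && git commit -qm "upstream work"

git checkout -q fix-bools
git merge -q --no-edit main                  # merge main into the PR branch
git diff main...fix-bools --name-only        # three-dot: only f.txt shows
```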
I am using VS 2022 (64-bit) Version 17.4.0 Preview 4.0 and have seen similar problems with debugger property evaluation. Not sure if something has changed in our main branch that makes debugging new builds difficult. One solution is to switch over to using windbg, which doesn't try to be quite so helpful. But if you've never used windbg it can be a bit of a steep learning curve.
Doesn't this add merge commits at the head of my branch? The PR history will be crowded with others' commits. I did notice that your merge was quite clean; did you squash the merged commits before pushing them to origin?

Thanks for the hint, I'll try it.
Yes. But GitHub uses the "three dot" git diff, so if you merge main into your PR branch, it updates the merge base, and those other commits don't appear as diffs in your PR. See e.g. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-comparing-branches-in-pull-requests. Likewise, if you run the three-dot diff locally you should see the same thing.
Based on CI testing, there is a problem introduced into the following repro:

```csharp
using System;
using System.Globalization;
using System.Numerics;
using System.Runtime.CompilerServices;

class Assert
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Equal(uint x, BigInteger y) { if (x != y) { Console.WriteLine($"{x} != {y}"); } }
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void True(bool b) { if (!b) { Console.WriteLine($"fail"); } }
}

class X
{
    public static void Main()
    {
        BigInteger result;
        Assert.True(BigInteger.TryParse("080000001", NumberStyles.HexNumber, null, out result));
        Assert.Equal(0x80000001u, result);
    }
}
```

The disasm for the method (vs unmodified jit) shows this new transformation kicking in:

```asm
;; base
G_M53841_IG51:              ;; offset=0468H
       83FB01               cmp      ebx, 1
       7550                 jne      SHORT G_M53841_IG56
       458B6500             mov      r12d, dword ptr [r13]
       33F6                 xor      rsi, rsi
       4585FF               test     r15d, r15d
       7505                 jne      SHORT G_M53841_IG52
       4585E4               test     r12d, r12d
       7C0D                 jl       SHORT G_M53841_IG53
                            ;; size=21 bbWeight=0.50 PerfScore 3.00
G_M53841_IG52:              ;; offset=047DH
       4181FC00000080       cmp      r12d, 0xFFFFFFFF80000000
       0F85F7FEFFFF         jne      G_M53841_IG39

;; diff
G_M53841_IG51:              ;; offset=0468H
       83FB01               cmp      ebx, 1
       754E                 jne      SHORT G_M53841_IG55
       458B6500             mov      r12d, dword ptr [r13]
       33F6                 xor      rsi, rsi
       418BCF               mov      ecx, r15d
       410BCC               or       ecx, r12d
       740D                 je       SHORT G_M53841_IG52  ;; <<<-- I think this should be jne
       4181FC00000080       cmp      r12d, 0xFFFFFFFF80000000
       0F85F9FEFFFF         jne      G_M53841_IG39
```
SPMI failure was a timeout on x86. I think this is finally ready to merge.
/azp run fuzzlyn |
Azure Pipelines successfully started running 1 pipeline(s).
@pedrobsaila finally merged. Thank you for all your hard work here, and for hanging in there over what turned out to be an unexpectedly long process.
Thank you @AndyAyersMS too for your guidance, kindness and patience with me through my first PR on JIT. |
Fixes #61940