Arm64: Readjust the heuristics to materialize double constants using fmov #64794
Comments
Tagging subscribers to this area: @JulieLeeMSFT
Issue Details
Currently, if a double constant is something that can be encoded using fmov, we always load it using fmov without trying to CSE it. The following amplified example shows the code we generate today:
public static double issue1(int n, double y)
{
    double result = 0.0;
    for (int i = 0; i < n; i++)
    {
        result += y;
        if (result == 4.5)
        {
            result -= 4.5;
        }
        else if (result < 4.5)
        {
            result += 4.5;
        }
        else if (result > 4.5)
        {
            result += (4.5 + y);
        }
    }
    return result;
}
G_M23855_IG01: ;; offset=0000H
A9BF7BFD stp fp, lr, [sp,#-16]!
910003FD mov fp, sp
;; bbWeight=1 PerfScore 1.50
G_M23855_IG02: ;; offset=0008H
4F00E410 movi v16.16b, #0x00
2A1F03E1 mov w1, wzr
7100001F cmp w0, #0
540002CD ble G_M23855_IG08
align [0 bytes for IG03]
align [0 bytes]
align [0 bytes]
align [0 bytes]
;; bbWeight=1 PerfScore 2.50
G_M23855_IG03: ;; offset=0018H
1E602A10 fadd d16, d16, d0
1E625011 fmov d17, #4.5000
1E712200 fcmp d16, d17
54000061 bne G_M23855_IG05
;; bbWeight=4 PerfScore 22.00
G_M23855_IG04: ;; offset=0028H
4F00E410 movi v16.16b, #0x00
1400000D b G_M23855_IG07
;; bbWeight=2 PerfScore 3.00
G_M23855_IG05: ;; offset=0030H
1E625011 fmov d17, #4.5000
1E712200 fcmp d16, d17
54000082 bhs G_M23855_IG06
1E625011 fmov d17, #4.5000
1E712A10 fadd d16, d16, d17
14000007 b G_M23855_IG07
;; bbWeight=2 PerfScore 14.00
G_M23855_IG06: ;; offset=0048H
1E625011 fmov d17, #4.5000
1E712200 fcmp d16, d17
5400008D ble G_M23855_IG07
1E625011 fmov d17, #4.5000
1E712811 fadd d17, d0, d17
1E712A10 fadd d16, d16, d17
;; bbWeight=2 PerfScore 18.00
G_M23855_IG07: ;; offset=0060H
11000421 add w1, w1, #1
6B00003F cmp w1, w0
54FFFD8B blt G_M23855_IG03
;; bbWeight=4 PerfScore 8.00
G_M23855_IG08: ;; offset=006CH
0EB01E00 mov v0.8b, v16.8b
;; bbWeight=1 PerfScore 0.50
G_M23855_IG09: ;; offset=0070H
A8C17BFD ldp fp, lr, [sp],#16
D65F03C0 ret lr
;; bbWeight=1 PerfScore 2.00
On x64, we do the right thing - https://godbolt.org/z/67h7rzWaK
We need to adjust the heuristic below to determine if it is profitable to use fmov or not.
runtime/src/coreclr/jit/gentree.cpp, lines 3489 to 3494 in 7065279
Perhaps, this can adversely affect the code because of increased register pressure, but we should validate if that is the case.
@dotnet/jit-contrib
Ah, GH was down when I submitted the issue, and it seems I hit submit multiple times, so the issue was created multiple times.
I forgot that I filed a similar issue at #35257. Closing this one.
Currently, if a double constant can be encoded using fmov, we always load it using fmov without trying to CSE it. When the constant occurs inside a loop, we load it in every iteration. The amplified issue1 example illustrates the code we generate today.
On x64, we do the right thing - https://godbolt.org/z/67h7rzWaK
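For readers unfamiliar with what "can be encoded using fmov" means, here is a minimal sketch of the encodability test. It assumes the standard Arm64 8-bit floating-point immediate form, which covers values of the form ±(n/16) × 2^r with n in 16..31 and r in -3..4. The function name CanEncodeAsFmovImm8 is hypothetical and is not the JIT's own helper.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the JIT's actual function): returns true when
// `value` fits the 8-bit floating-point immediate of the Arm64 fmov
// instruction, i.e. +/-(n/16) * 2^r with 16 <= n <= 31 and -3 <= r <= 4.
bool CanEncodeAsFmovImm8(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits)); // reinterpret the IEEE 754 bits

    // Only the top 4 fraction bits may be set, so the low 48 bits must be zero.
    if ((bits & 0x0000FFFFFFFFFFFFULL) != 0)
    {
        return false;
    }

    // Exponent bits 62..54 must be 1_0000_0000 or 0_1111_1111, which limits
    // the unbiased exponent to the range [-3, 4]. The sign bit is free.
    const std::uint64_t expHigh = (bits >> 54) & 0x1FF;
    return expHigh == 0x100 || expHigh == 0x0FF;
}
```

Under this check, 4.5 (which is 18/16 × 2^2) is encodable, which is why the loop above uses fmov for it, while 0.0 is not, which matches the listing zeroing the register with movi instead.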
We need to adjust the heuristic below to determine whether it is profitable to use fmov. One criterion would be: if the usage is inside a hot block (compCurBB->bbWeight > BB_UNITY_WEIGHT), do not load it using fmov.
runtime/src/coreclr/jit/gentree.cpp
Lines 3489 to 3494 in 7065279
Perhaps, this can adversely affect the code because of increased register pressure, but we should validate if that is the case.
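The proposed criterion can be sketched as a standalone predicate. This is only an illustration of the rule stated above, not the code in gentree.cpp: the function name, the parameters, and the kUnityWeight constant (standing in for BB_UNITY_WEIGHT, assumed here to be 1.0) are all hypothetical.

```cpp
// Stand-in for the JIT's BB_UNITY_WEIGHT; the value 1.0 is an assumption
// made for this sketch.
constexpr double kUnityWeight = 1.0;

// Hypothetical predicate: decide whether a double constant should be
// materialized in place with fmov.
//   encodableAsFmov - the constant fits the fmov 8-bit immediate form
//   blockWeight     - weight of the block containing the use
bool ShouldMaterializeWithFmov(bool encodableAsFmov, double blockWeight)
{
    if (!encodableAsFmov)
    {
        return false; // fmov cannot be used for this constant at all
    }

    // In a hot block, prefer not to rematerialize at every use, so that the
    // constant can be considered for CSE and kept in a register instead.
    const bool isHotBlock = blockWeight > kUnityWeight;
    return !isHotBlock;
}
```

With this rule, the blocks in the listing that use 4.5 (bbWeight 4 and 2) would all stop using fmov at each use, while a use in a block of weight 1 would keep it. Whether the extra live register costs more than the saved fmov instructions is exactly the register-pressure question raised above and would need to be measured.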