-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: extend redundant relop to handle side effects #61275
JIT: extend redundant relop to handle side effects #61275
Conversation
Handle the case where the side-effecting redundant relop appears just before the jump tree relop. Also revamp how we search for related VNs so it can be done as a search loop.
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsHandle the case where the side-effecting redundant relop appears Also revamp how we search for related VNs so it can be done as a
|
cc @dotnet/jit-contrib SPMI diffs (as before, somewhat inflated because of the checked SPC) aspnet.run.windows.x64.checked.mch:
Detail diffs
benchmarks.run.windows.x64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.x64.checked.mch:
Detail diffs
libraries.crossgen2.windows.x64.checked.mch:
Detail diffs
libraries.pmi.windows.x64.checked.mch:
Detail diffs
libraries_tests.pmi.windows.x64.checked.mch:
Detail diffs
|
Sample diff. Note we could not get this before because ;; before
; Assembly listing for method System.IO.Strategies.OSFileStreamStrategy:Flush(bool):this
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; fully interruptible
; No matching PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 1 inlinees without PGO data
; Final local variable assignments
;
; V00 this [V00,T00] ( 5, 3.50) ref -> rcx this class-hnd single-def
; V01 arg1 [V01,T01] ( 3, 3 ) bool -> rdx single-def
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [rsp+00H] "OutgoingArgSpace"
; V03 tmp1 [V03,T03] ( 2, 1 ) bool -> rax "Inline return value spill temp"
; V04 tmp2 [V04,T02] ( 2, 2 ) ref -> rax class-hnd single-def "Inlining Arg"
;
; Lcl frame size = 0
G_M9306_IG01: ; gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref, nogc <-- Prolog IG
;; bbWeight=1 PerfScore 0.00
G_M9306_IG02: ; gcrefRegs=00000002 {rcx}, byrefRegs=00000000 {}, byref, isz
; gcrRegs +[rcx]
test dl, dl
je SHORT G_M9306_IG05
;; bbWeight=1 PerfScore 1.25
G_M9306_IG03: ; gcrefRegs=00000002 {rcx}, byrefRegs=00000000 {}, byref, isz
mov rax, gword ptr [rcx+16]
; gcrRegs +[rax]
test byte ptr [rax+16], 1
jne SHORT G_M9306_IG05
test byte ptr [rcx+48], 2
setne al
; gcrRegs -[rax]
movzx rax, al
test eax, eax
je SHORT G_M9306_IG05
mov rcx, gword ptr [rcx+16]
;; bbWeight=0.50 PerfScore 6.75
G_M9306_IG04: ; , epilog, nogc, extend
jmp System.IO.Strategies.FileStreamHelpers:FlushToDisk(Microsoft.Win32.SafeHandles.SafeFileHandle)
; gcr arg pop 0
;; bbWeight=0.50 PerfScore 1.00
G_M9306_IG05: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, epilog, nogc
; gcrRegs -[rcx]
ret
; Total bytes of code 38, prolog size 0, PerfScore 13.30, instruction count 13, allocated bytes for code 38 (MethodHash=e8aedba5) for method System.IO.Strategies.OSFileStreamStrategy:Flush(bool):this
;; after
; Assembly listing for method System.IO.Strategies.OSFileStreamStrategy:Flush(bool):this
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; fully interruptible
; No matching PGO data
; 0 inlinees with PGO data; 1 single block inlinees; 1 inlinees without PGO data
; Final local variable assignments
;
; V00 this [V00,T00] ( 5, 3.50) ref -> rcx this class-hnd single-def
; V01 arg1 [V01,T01] ( 3, 3 ) bool -> rdx single-def
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [rsp+00H] "OutgoingArgSpace"
;* V03 tmp1 [V03,T03] ( 0, 0 ) bool -> zero-ref "Inline return value spill temp"
; V04 tmp2 [V04,T02] ( 2, 2 ) ref -> rax class-hnd single-def "Inlining Arg"
;
; Lcl frame size = 0
G_M9306_IG01: ; gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref, nogc <-- Prolog IG
;; bbWeight=1 PerfScore 0.00
G_M9306_IG02: ; gcrefRegs=00000002 {rcx}, byrefRegs=00000000 {}, byref, isz
; gcrRegs +[rcx]
test dl, dl
je SHORT G_M9306_IG05
;; bbWeight=1 PerfScore 1.25
G_M9306_IG03: ; gcrefRegs=00000002 {rcx}, byrefRegs=00000000 {}, byref, isz
mov rax, gword ptr [rcx+16]
; gcrRegs +[rax]
test byte ptr [rax+16], 1
jne SHORT G_M9306_IG05
test byte ptr [rcx+48], 2
je SHORT G_M9306_IG05
mov rcx, gword ptr [rcx+16]
;; bbWeight=0.50 PerfScore 6.00
G_M9306_IG04: ; , epilog, nogc, extend
jmp System.IO.Strategies.FileStreamHelpers:FlushToDisk(Microsoft.Win32.SafeHandles.SafeFileHandle)
; gcr arg pop 0
;; bbWeight=0.50 PerfScore 1.00
G_M9306_IG05: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, epilog, nogc
; gcrRegs -[rax rcx]
ret
;; bbWeight=0.50 PerfScore 0.50
; Total bytes of code 30, prolog size 0, PerfScore 11.75, instruction count 10, allocated bytes for code 30 (MethodHash=e8aedba5) for method System.IO.Strategies.OSFileStreamStrategy:Flush(bool):this @dotnet/jit-contrib PTAL |
ValueNumPair origVNP = substituteTree->gtVNPair; | ||
|
||
// Update the tree. Note we don't actually swap operands...? | ||
// | ||
substituteTree->SetOper(GenTree::ReverseRelop(substituteTree->OperGet())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment incomplete?
Also, we can use Nevermind, I misread the code.substitudeTree->SetOper(..., ValueNumberUpdate::PRESERVE_VN);
here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one question: I notice we allow exception side effects in the JTRUE tree, is there a risk that we swap the order of side effects in the RHS of the assignment and the JTRUE?
Good point. I need to verify that the RHS exec set "covers" the JTRUE set so any exception we would have thrown we'll still throw. |
Not sure exc set overing is sufficient to prove that we'll throw the same exception for the same reason, but seems like it may be close enough? Will think about this some more. The extra check didn't lead to any new diffs. [EDIT] -- seems like CSE would run into this problem too -- we verify two trees are value-congruent and that all uses have exception sets that are (roughly speaking) the intersection of all the def sets. |
Formatter failure seems like some random CI glitch.
Ditto with the arm build failure:
|
Formatting is good locally, so will merge and hope for the best. |
Windows-x64 improvements: dotnet/perf-autofiling-issues#2373 |
Handle the case where the side-effecting redundant relop appears
just before the jump tree relop.
Also revamp how we search for related VNs so it can be done as a
search loop.