Inline properties during importation #96325
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
When we see a call to something in the shape of a property, we check if it does the same thing as an auto property and then pretend that we saw the LDFLD or STFLD instead of the CALL in the caller. This bypasses most of the inliner complexity and managed to improve the startup time of AvaloniaILSpy in my tests from 1483ms to 1373ms (7.5% improvement).
@EgorBo as discussed in discord
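To make the "shape of a property" idea concrete, here is a minimal standalone sketch of matching an accessor's IL bytes against the patterns a release-mode auto property is assumed to compile to. The opcode byte values are the standard ECMA-335 encodings; everything else (`AccessorKind`, `AccessorMatch`, `matchAutoPropertyIL`) is a hypothetical name for illustration and is not the PR's actual code, which works through the JIT-EE interface rather than on raw bytes alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Single-byte IL opcode encodings (ECMA-335).
constexpr uint8_t IL_LDARG_0 = 0x02;
constexpr uint8_t IL_LDARG_1 = 0x03;
constexpr uint8_t IL_RET     = 0x2A;
constexpr uint8_t IL_LDFLD   = 0x7B;
constexpr uint8_t IL_STFLD   = 0x7D;
constexpr uint8_t IL_LDSFLD  = 0x7E;
constexpr uint8_t IL_STSFLD  = 0x80;

// Hypothetical result: which field access could stand in for the CALL.
enum class AccessorKind { None, InstanceGetter, InstanceSetter, StaticGetter, StaticSetter };

struct AccessorMatch
{
    AccessorKind kind       = AccessorKind::None;
    uint32_t     fieldToken = 0; // metadata token operand of the field opcode
};

// Reads the little-endian 4-byte token that follows a field opcode.
static uint32_t readToken(const uint8_t* p)
{
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Matches the whole method body exactly; any extra instruction means "not a trivial accessor".
AccessorMatch matchAutoPropertyIL(const uint8_t* il, size_t size)
{
    assert(il != nullptr || size == 0);
    AccessorMatch m;

    if (size == 7 && il[0] == IL_LDARG_0 && il[1] == IL_LDFLD && il[6] == IL_RET)
    {
        // ldarg.0; ldfld <token>; ret
        m.kind       = AccessorKind::InstanceGetter;
        m.fieldToken = readToken(il + 2);
    }
    else if (size == 8 && il[0] == IL_LDARG_0 && il[1] == IL_LDARG_1 && il[2] == IL_STFLD && il[7] == IL_RET)
    {
        // ldarg.0; ldarg.1; stfld <token>; ret
        m.kind       = AccessorKind::InstanceSetter;
        m.fieldToken = readToken(il + 3);
    }
    else if (size == 6 && il[0] == IL_LDSFLD && il[5] == IL_RET)
    {
        // ldsfld <token>; ret
        m.kind       = AccessorKind::StaticGetter;
        m.fieldToken = readToken(il + 1);
    }
    else if (size == 7 && il[0] == IL_LDARG_0 && il[1] == IL_STSFLD && il[6] == IL_RET)
    {
        // ldarg.0; stsfld <token>; ret
        m.kind       = AccessorKind::StaticSetter;
        m.fieldToken = readToken(il + 2);
    }
    return m;
}
```

A caller would pass the callee's IL buffer and, on a match, treat the call site as if it contained the corresponding field opcode with `fieldToken` resolved in the callee's scope. Requiring an exact match on the whole body keeps the check cheap, which matters because it runs in tier 0 code where startup time is the concern.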
afbe3ab to 0023bf1
src/coreclr/jit/importer.cpp
Outdated
@@ -8685,6 +8687,24 @@ void Compiler::impImportBlockCode(BasicBlock* block)
combine(combine(CORINFO_CALLINFO_ALLOWINSTPARAM, CORINFO_CALLINFO_SECURITYCHECKS),
(opcode == CEE_CALLVIRT) ? CORINFO_CALLINFO_CALLVIRT : CORINFO_CALLINFO_NONE),
&callInfo);

if (callInfo.kind == CORINFO_CALL &&
Does this need to be skipped for debug code?
impTryFindField calls canInline, which does this check:
runtime/src/coreclr/vm/jitinterface.cpp
Lines 7879 to 7887 in 9fa5128
// If the callee wants debuggable code, don't allow it to be inlined
{
    // Combining the next two lines, and eliminating jitDebuggerFlags, leads to bad codegen in x86 Release builds using Visual C++ 19.00.24215.1.
    CORJIT_FLAGS jitDebuggerFlags = GetDebuggerCompileFlags(pCallee->GetModule(), CORJIT_FLAGS());
    if (jitDebuggerFlags.IsSet(CORJIT_FLAGS::CORJIT_FLAG_DEBUG_CODE))
    {
        result = INLINE_NEVER;
        szFailReason = "Inlinee is debuggable";
This check is for the callee. Inlining needs to be disabled for a debuggable caller too. It is typically done by checking opts.compDbgCode in the JIT.
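The distinction between the two checks can be sketched in isolation. The snippet below is only an illustration: `CallerOptions`, `CalleeInfo`, and `mayInlineProperty` are hypothetical stand-ins, and the only name taken from the discussion is `compDbgCode`.

```cpp
#include <cassert>

// Hypothetical stand-in for the JIT's per-compilation options of the caller.
struct CallerOptions
{
    bool compDbgCode = false; // true when the caller is compiled as debuggable code
};

// Hypothetical stand-in for what the VM-side canInline check knows about the callee.
struct CalleeInfo
{
    bool wantsDebuggableCode = false;
};

// Both sides must allow it before a property call is replaced by a field access.
bool mayInlineProperty(const CallerOptions& opts, const CalleeInfo& callee)
{
    if (opts.compDbgCode)
    {
        return false; // caller-side check, done inside the JIT
    }
    if (callee.wantsDebuggableCode)
    {
        return false; // callee-side check ("Inlinee is debuggable")
    }
    return true;
}
```

The point of the sketch is that the callee-side check alone would still let a debuggable caller lose its call, which is the case the review comment asks to exclude.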
src/coreclr/jit/importercalls.cpp
Outdated
@@ -32,6 +32,107 @@
#pragma warning(disable : 21000) // Suppress PREFast warning about overly large function
#endif

bool Compiler::impTryFindField(CORINFO_METHOD_HANDLE methHnd, CORINFO_RESOLVED_TOKEN* pResolvedToken, OPCODE* opcode)
{
This should issue beginInlining / reportInliningDecision callbacks.
Like this?
This seems like we can end up issuing duplicate callbacks now (e.g. this one failing, then regular inlining succeeding). In general it seems unfortunate to end up with two separate inliners if we can avoid it, so some way of integrating this behavior into the existing inliner would be much preferred. For example, this new inlining should most likely affect the inline tree that gets reported by reportRichMappings too, and proper debug information should be created to refer to it.
ab9cfbb to 36c6db4
d067dc1 to 3560ea4
b99b37d to b46461b
Tested this PR on a large 1P service: [Before and After images not preserved]
@AndyAyersMS the PR is fully functional and correct enough to pass CI, and it improves startup speed as described in the initial post. My hope was to get more numbers on this version (which will have the lowest possible overhead), then investigate Jakob's suggestion to integrate it more with the existing inliner (although I don't really see how that would work), and then get new numbers. Then we could compare this version, which might be a little bit ugly, with the architecturally nicer version to decide whether the ugliness is worth it. Egor did try inlining property-sized methods in tier 0 code before, but in his experiment it regressed startup time (which motivated this PR in the first place), so I'm not hopeful that integrating this into the existing inliner will yield the same benefit. Egor's experiment was based on a size threshold, though, instead of precisely analyzing properties by IL as I do, which might move the bar, but I would need to try.
I've just checked: it has no visible impact on TE, mainly because TE benchmarks are too simplistic; there are only like 5k methods to jit. At the same time, this PR has a nice impact on the 1P service I posted in #96325 (comment)
Yes, mine was too generic, I expect yours to be more efficient in that scenario. The main issue with startup is that inlining may trigger unnecessary type load events etc, but, presumably, your case should not do that. So I'd just call the existing inline routine if an inlinee is precisely a getter/setter (so that we can avoid duplicating the logic and solve the debug info concerns Jakob raised). I'll collect new numbers if you do that.
@Suchiman, could you please take care of this?
b46461b to c319a5b
@EgorBo, PTAL at this community PR.
@Suchiman, we see that you pushed some commits. Is it ready for code review or are you still working on it?
I'm still working on the feedback, but am currently on vacation |
When we see a call to something in the shape of a property, we check if it does the same thing as an auto property and then pretend that we saw the LD(S)FLD or ST(S)FLD instead of the CALL in the caller
- resolve tokens in the scope and with the generic context of the callee instead of the caller
- call getFieldInfo with the scope of the callee to make visibility checks happy
- make sure the callee is non virtual
- respect basic no inlining instructions
- getMethodInfo returns a success bool but might also throw
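The last point, a query that reports failure both through its return value and by throwing, can be folded into a single boolean result. The sketch below shows only the general pattern with plain C++ exceptions; `tryQuery` is a hypothetical helper, and the real JIT goes through its own error-trapping mechanism for calls into the runtime rather than a bare try/catch.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical helper: treats "returned false" and "threw" as the same failure,
// so the caller only has one outcome to test before giving up on the fast path.
bool tryQuery(const std::function<bool()>& query)
{
    try
    {
        return query();
    }
    catch (...)
    {
        // Any failure simply means "do not treat this call as a field access".
        return false;
    }
}
```

A call site would wrap the method-info lookup in `tryQuery` and fall back to importing an ordinary CALL whenever it returns false.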
c319a5b to 9afe6f6
I've started work on the alternative approach here Suchiman@ad2c505, but I still need to play whack-a-mole with all the asserts that popped up from not having "MINOPTS" flags set in tier 0 compilation (OSR is specifically unhappy about that). Is that going in the right direction, @EgorBo?
@EgorBo, PTAL.
Taking a look
Sorry for the delayed response. We plan to eventually move the inliner into a utility we can invoke from different phases, but we're not there yet. I agree with Jakob that the way it's written in this PR looks a bit fragile: the copy-paste from the general inliner makes it more complicated to maintain, and it should be unified somehow.
I've been looking at unifying
I'll try to give that a shot; otherwise, I'm running thin on options and motivation.