Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: update jit config defaults for PGO #49267

Merged
merged 1 commit into from
Mar 9, 2021

Conversation

AndyAyersMS
Copy link
Member

Enable edge profiling by default, when jitting, or when prejitting for R2R.
Keep block profiling there for classic ngen (so there is an entry probe).

Enable minimal probing by default, when jitting. Keep full probing there for
prejitting so presence of counts in a method implies method was executed.

Enable class profiling by default when jitting. Still TBD if we will also
do this when prejitting.

Enable guarded devirtualization by default (will only kick in for PGO).

Adjust experimental CI legs to reflect the above.

Net effect is that when jitting, COMPlus_TieredPGO=1 turns on dynamic PGO
with the expected set of behaviors, without any other options.

We'll still need to set COMPlus_TC_QuickJitForLoops=1 to have methods with
loops pass through Tier0.

Enable edge profiling by default, when jitting, or when prejitting for R2R.
Keep block profiling there for classic ngen (so there is an entry probe).

Enable minimal probing by default, when jitting. Keep full probing there for
prejitting so presence of counts in a method implies method was executed.

Enable class profiling by default when jitting. Still TBD if we will also
do this when prejitting.

Enable guarded devirtualization by default (will only kick in for PGO).

Adjust experimental CI legs to reflect the above.

Net effect is that when jitting, COMPlus_TieredPGO=1 turns on dynamic PGO
with the expected set of behaviors, without any other options.

We'll still need to set COMPlus_TC_QuickJitForLoops=1 to have methods with
loops pass through Tier0.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 6, 2021
@AndyAyersMS
Copy link
Member Author

cc @dotnet/jit-contrib

No diffs in Tier1 codegen for a number of tests I tried manually (versus non-minimal probing).

@jkotas
Copy link
Member

jkotas commented Mar 7, 2021

Keep block profiling there for classic ngen

Is block profiling generally useful to keep around, or is it just a classic ngen legacy?

If it is the latter, I think it would be ok to just make it no-op for classic ngen. It is what we have been doing in other places.

@AndyAyersMS
Copy link
Member Author

Is block profiling generally useful to keep around,

It is still used if OSR is enabled, since OSR methods can't consume edge profiles just yet.

@AndyAyersMS
Copy link
Member Author

/azp run runtime-jit-experimental

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@AndyAyersMS
Copy link
Member Author

There are some expected failures in the jit-experimental runs. However there is a novel failure in bytemark, with the assert:

Assertion failed 'weight >= BB_LOOP_WEIGHT_SCALE' in 'EMFloat:DivideInternalFPF(byref,byref,byref)' 
    during 'Update flow graph opt pass' (IL size 631)
File: D:\workspace\_work\1\s\src\coreclr\jit\optimizer.cpp Line: 385

This is the code in the optimizer that scales non-pgo loop counts.

Since this failure comes from the inline-pgo leg and we're not enabling the pgo inlining policy yet, I'm going to ignore these here.

@AndyAyersMS
Copy link
Member Author

@dotnet/jit-contrib anyone able to review?

@AndyAyersMS AndyAyersMS merged commit a4c765d into dotnet:main Mar 9, 2021
@AndyAyersMS AndyAyersMS deleted the JitPgoConfigUpdates branch March 9, 2021 16:35
@clamp03
Copy link
Member

clamp03 commented Mar 10, 2021

@AndyAyersMS
I tested PGO on crossgen2. It doesn't work. I found this patch that enables edge profiling by default. And I found edge profiler cannot allocate instrumentation data.

HRESULT res = info.compCompHnd->allocPgoInstrumentationBySchema(info.compMethodHnd, schema.data(),
(UINT32)schema.size(), &profileMemory);

because of the below source codes.
in crossge2

// Validate that each schema item is only used for a basic block count
for (uint iSchema = 0; iSchema < countSchemaItems; iSchema++)
{
if (pSchema[iSchema].InstrumentationKind != PgoInstrumentationKind.BasicBlockIntCount)
return HRESULT.E_NOTIMPL;
if (pSchema[iSchema].Count != 1)
return HRESULT.E_NOTIMPL;
}

in crossgen1 (R2R)
// Validate that each schema item is only used for a basic block count
for (UINT32 iSchema = 0; iSchema < countSchemaItems; iSchema++)
{
if (pSchema[iSchema].InstrumentationKind != PgoInstrumentationKind::BasicBlockIntCount)
return E_NOTIMPL;
if (pSchema[iSchema].Count != 1)
return E_NOTIMPL;
}

The default profiling method is changed, but other codes only support BasicBlockIntCount.
Could you please update?

@AndyAyersMS
Copy link
Member Author

@clamp03 thanks for pointing this out.

I'll change the defaults for prejitting back to block profiling.

Our plan going forward is to do instrumentation at runtime instead of when prejitting. See the dotnet-pgo spec.

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this pull request Mar 10, 2021
Both crossgen1 and crossgen2 currently reject schemas that contain edge profile
counters. Change the jit defaults back to block profiling when prejitting.

See notes in dotnet#49267.
AndyAyersMS added a commit that referenced this pull request Mar 11, 2021
Both crossgen1 and crossgen2 currently reject schemas that contain edge profile
counters. Change the jit defaults back to block profiling when prejitting.

See notes in #49267.
@AndyAyersMS AndyAyersMS mentioned this pull request Mar 19, 2021
54 tasks
@AndyAyersMS
Copy link
Member Author

@davidwrighton now that I think about it, it seems like the static PGO path might want to disable the minimal probing so that static tooling can reason about what methods were executed and how often they were called.

I had though the jit/prejit distinction was the right one here, but both static and dynamic PGO leverage dynamic instrumentation, so the interesting distinction is really whether we're going to force methods to remain in Tier0, or let them get re-optimized.

I wonder if we should have a more direct signal for this instead of inferring it from say an unusually high setting for COMPLUS_TC_CallCountThreshold -- knowing what we're trying to measure might also influence the sampling scheme used for class profiling.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 18, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants