Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: profile checking through loop opts #99367

Merged
merged 1 commit into from
Mar 7, 2024

Conversation

AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS commented Mar 6, 2024

Keep profile checks enabled until after we have finished running the loop optmizations (recall this is currently just checking for edge likelihood consistency).

Fix various maintenance issues to make this possible. Most are straightforward, but a few are not:

  • Whenever we create a new BBJ_COND we have to figure out the right likelihoods. If we're copying an existing one (say loop inversion) we currently duplicate the likelihoods. This is a choice, and it may not accurately represent what happends, but we have no better information.
  • If we invent new branching structures we need to put in reasonable likelihoods. For cloning we assume flowing to the cold loop is unlikely but can happen.

Block weights and edge likelihoods are not yet consistent. The plan is to get all the edge likelihoods "correct" and self-consistent, and then start rectifying edge likelihoods and block weights.

Contributes to #93020.

Diffs

Keep profile checks enabled until after we have finished running the
loop optmizations (recall this is currently just checking for edge
likelihood consistency).

Fix various maintenance issues to make this possible. Most are straightforward,
but a few are not:
* Whenever we create a new BBJ_COND we have to figure out the right
likelihoods. If we're copying an existing one (say loop inversion) we currently
duplicate the likelihoods. This is a choice, and it may not accurately
represent what happends, but we have no better information.
* If we invent new branching structures we need to put in reasonable
likelihoods. For cloning we assume flowing to the cold loop is unlikely
but can happen.

Block weights and edge likelihoods are not yet consistent. The plan is
to get all the edge likelihoods "correct" and self-consistent, and then
start rectifying edge likelihoods and block weights.

Contributes to dotnet#93020.
@ghost ghost assigned AndyAyersMS Mar 6, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 6, 2024
@ghost
Copy link

ghost commented Mar 6, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Keep profile checks enabled until after we have finished running the loop optmizations (recall this is currently just checking for edge likelihood consistency).

Fix various maintenance issues to make this possible. Most are straightforward, but a few are not:

  • Whenever we create a new BBJ_COND we have to figure out the right likelihoods. If we're copying an existing one (say loop inversion) we currently duplicate the likelihoods. This is a choice, and it may not accurately represent what happends, but we have no better information.
  • If we invent new branching structures we need to put in reasonable likelihoods. For cloning we assume flowing to the cold loop is unlikely but can happen.

Block weights and edge likelihoods are not yet consistent. The plan is to get all the edge likelihoods "correct" and self-consistent, and then start rectifying edge likelihoods and block weights.

Contributes to #93020.

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@amanasifkhalid PTAL
cc @dotnet/jit-contrib

It's possible getting past loop opts was the hard part. We'll see.

@jakobbotsch
Copy link
Member

Would be good to run some extended testing.

@AndyAyersMS
Copy link
Member Author

Would be good to run some extended testing.

Also seeing some diffs, which is a bit unexpected; didn't think anything relied on likelihoods yet

// TODO: this is a bit of out sync with what we do for block weights.
// Reconcile.
//
const weight_t fastLikelihood = 0.999;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, where does this number of significant figures come from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just an arbitrary factor. For most cloned loops we actually never expect the cold loop to run, since we think it's execution may cause exceptions. But setting likelihood to zero seemed to drastic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optCloneLoop scales the fast loop blocks by 0.99. There is a comment about it:

// We assume that the fast path will run 99% of the time, and thus should get 99% of the block weights.
// The slow path will, correspondingly, get only 1% of the block weights. It could be argued that we should
// mark the slow path as "run rarely", since it really shouldn't execute (given the currently optimized loop
// conditions) except under exceptional circumstances.
const weight_t fastPathWeightScaleFactor = 0.99;
const weight_t slowPathWeightScaleFactor = 1.0 - fastPathWeightScaleFactor;

It seems like we should use the same likelihood in both places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. I will tweak this in the next round of changes.

Copy link
Member

@amanasifkhalid amanasifkhalid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Mar 6, 2024

Would be good to run some extended testing.

Also seeing some diffs, which is a bit unexpected; didn't think anything relied on likelihoods yet

CI diffs are a lot more modest than my local diffs. I will dig into a few and see what's up.

@AndyAyersMS
Copy link
Member Author

          55 (1.11 % of base) : 110047.dasm - OrchardCore.ContentManagement.Display.ContentDisplay.ContentPartDisplayDriverResolver:GetDisplayModeDrivers(System.String,System.String):System.Collections.Generic.IList`1[OrchardCore.ContentManagement.Display.ContentDisplay.IContentPartDisplayDriver]:this (Tier1)
        -165 (-12.87 % of base) : 71288.dasm - System.Collections.Concurrent.ConcurrentDictionary`2[Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.ModelMetadataIdentity,Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadataProvider+ModelMetadataCacheEntry]:TryGetValueInternal(System.Collections.Concurrent.ConcurrentDictionary`2+Tables[Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.ModelMetadataIdentity,Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadataProvider+ModelMetadataCacheEntry],Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.ModelMetadataIdentity,int,byref):ubyte (Tier1)

We mark blocks rare that were not rare before, and this modifies layout and alters allocation.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants