Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Run 3-opt once across all regions #111989

Merged
merged 1 commit into from
Jan 30, 2025

Conversation

amanasifkhalid
Copy link
Member

@amanasifkhalid amanasifkhalid commented Jan 30, 2025

Part of #107749. 3-opt layout already refuses to align branches that cross EH regions, so there doesn't seem to be much utility in reordering each EH region independently. Thus, remove 3-opt's per-region ordering constraints, and run 3-opt once.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 30, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid
Copy link
Member Author

/azp run runtime-coreclr libraries-jitstress, runtime-coreclr libraries-pgo

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@amanasifkhalid amanasifkhalid marked this pull request as ready for review January 30, 2025 16:32
@amanasifkhalid
Copy link
Member Author

libraries-pgo failures are from profile inconsistencies during loop inversion. #111684 has a fix for this.

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs show churn from 3-opt considering edges in a different order now.

This change isn't all that important on its own, but for a while now, I've wanted to consolidate all of our separate layout phases (RPO order, fgMoveHotJumps, cold code motion, and 3-opt) into something simpler. Instead of all these intermediate reorderings, I was thinking we'd feed a loop-aware RPO traversal that skipped all the cold blocks into 3-opt directly, and then reorder the block list once, while respecting EH invariants such that we don't need to completely rebuild EH regions. To get an idea of what this would look like, I have a prototype in #112004.

While working on this prototype, I found 3-opt's per-region ordering semantics to be problematic, since they implicitly depend on the initial layout keeping EH regions contiguous, or else the ordering of a nested region could change the end block of the parent region, breaking the bounds checking of the next 3-opt iteration. This seems like a fragile invariant, and getting 3-opt to only consider branches within a region is pretty simple, so I found it easiest to run 3-opt once for the entire hot region, hence this PR.

If my plan doesn't seem ill-advised to you, I'd succeed this PR with the following:

  • Remove fgMoveHotJumps -- I don't think this has been all that important ever since we turned 3-opt on. Removing it churns the initial layout into 3-opt, which can lead to subtly different layouts that still have locally-minimal scores, so I don't think the churn should dissuade us too much. Regressions from this ought to serve as motivation for improving 3-opt's ability to find globally-minimal layout scores.
  • Get rid of RPO layout and cold code motion. In their place, compute a loop-aware RPO that skips the cold blocks (such that they "sink" to the end of the method), and feed this into 3-opt as its initial layout. This has the nice side effect of never scrambling the block list in between phases, plus we can simplify our post-layout EH region repair to only fix the exception table, since we never split up the actual regions.
  • Clean up any remaining invariants that can be relaxed.

Thanks!

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the plan you're proposing makes sense, we should have just one primary algorithm for relocating blocks.

@amanasifkhalid amanasifkhalid merged commit 84f4c2a into dotnet:main Jan 30, 2025
122 of 129 checks passed
@amanasifkhalid amanasifkhalid deleted the run-3-opt-once branch January 30, 2025 18:15
grendello added a commit to grendello/runtime that referenced this pull request Jan 30, 2025
* main: (31 commits)
  More native AOT Pri-1 test tree bring up (dotnet#111994)
  Fix BigInteger outerloop test (dotnet#111841)
  JIT: Run 3-opt once across all regions (dotnet#111989)
  JIT: Check for profile consistency throughout JIT backend (dotnet#111684)
  [JIT] Add legacy extended EVEX encoding and EVEX.ND/NF feature to x64 emitter backend (dotnet#108796)
  [iOS][globalization] Fix IndexOf on empty strings on iOS to return -1 (dotnet#111898)
  System.Speech: Use intellisense xml from dotnet-api-docs (dotnet#111983)
  [mono][mini] Disable inlining if we encounter class initialization failure (dotnet#111754)
  [main] Update dependencies from dotnet/roslyn (dotnet#111946)
  Update dependencies from https://github.com/dotnet/arcade build 20250129.2 (dotnet#111996)
  Try changing the ICustomQueryInterface implementation to always return NotHandled instead of Failed to defer back to the ComWrappers impl. (dotnet#111978)
  Combined dependency update (dotnet#111852)
  Replace OPTIMIZE_FOR_SIZE with feature switch (dotnet#111743)
  Fix failed assertion 'FPbased == FPbased2' (dotnet#111787)
  Add remark to `ConditionalSelect` (dotnet#111945)
  JIT: fix try region cloning when try is nested in a handler (dotnet#111975)
  Use IRootFunctions in Tensor.StdDev (dotnet#110641)
  Remove zlib dependencies from Docker containers (dotnet#111939)
  Avoid `Unsafe.As` for `Memory<T>` and `ReadOnlyMemory<T>` conversion (dotnet#111023)
  Cleanup membarrier portability (dotnet#111943)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants