
Create superpmi-asmdiffs pipeline #61194

Merged
merged 21 commits, Nov 10, 2021

Conversation

Member

@BruceForstall BruceForstall commented Nov 4, 2021

Create a new runtime-coreclr superpmi-asmdiffs pipeline that runs SuperPMI asmdiffs for every change in the JIT directory.

The diffs are run on two platforms: Windows x64 and Windows x86. Linux, Arm64, and Arm32 asm diffs are generated using cross-compilers, as follows:

| Platform    | Asm diffs                                  |
| ----------- | ------------------------------------------ |
| Windows x64 | win-x64, win-arm64, linux-x64, linux-arm64 |
| Windows x86 | win-x86, linux-arm                         |

The resulting summary .md files are uploaded into the pipeline artifacts, one .md file per platform (so, one for the Windows x64 runs and one for the Windows x86 runs). The results are also displayed in the "Extensions" page of the AzDO pipeline.

It looks like the runs take about 50 minutes to complete (assuming not much waiting for machines).

The asm diffs pipeline is similar to the "superpmi-replay" pipeline, except:

  1. It determines an appropriate baseline JIT based on the PR commit and how it merges with the main branch. Given that, it downloads the matching baseline JITs from the JIT rolling build artifacts in Azure Storage (a minimal sketch of this follows the list).
  2. It clones the jitutils repo and builds the jit-analyze tool, needed to generate the summary .md file.
  3. It downloads and adds to the Helix machine payload a "portable" git installation, as git diff is used by jit-analyze for analyzing the generated .dasm files of the diff.
  4. It collects all the various summary.md files into one per platform on which the runs are done, and publishes that to the artifacts and the Extensions page.
  5. It only does one replay (asmdiffs) run, not one for each of a set of multiple stress modes.
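
For context on step 1, the baseline selection is essentially "find the merge point of the PR with main, then pick a nearby commit that has a JIT rolling build." A minimal sketch of that idea, assuming standard git commands; the function and helper names below are illustrative, not the actual pipeline code:

```python
# Illustrative sketch only (not the pipeline's actual implementation): choose a
# baseline commit from the PR head and the main branch.
import subprocess

def rolling_build_exists(commit_hash):
    # Placeholder: the real pipeline checks the JIT rolling build storage in
    # Azure for artifacts built from this commit.
    return True

def choose_baseline_hash(repo_root, main_branch="origin/main", max_candidates=10):
    def git(*args):
        return subprocess.check_output(["git", "-C", repo_root, *args], text=True).strip()

    # The merge point of the PR commit with main approximates "main without this PR".
    merge_base = git("merge-base", "HEAD", main_branch)

    # Walk back from the merge base and take the first commit that has a
    # matching rolling build JIT.
    for commit in git("rev-list", f"--max-count={max_candidates}", merge_base).splitlines():
        if rolling_build_exists(commit):
            return commit
    return None
```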

As part of this implementation:

  a. azdo_pipelines_util.py was renamed to jitutil.py, and a lot of utility functions from superpmi.py were moved over to it. This was mostly to share the code for downloading and uncompressing .zip files (see the sketch after this list). (There might be some slight changes to the output from the superpmi.py download commands that I'll have to look into.) However, I also moved a bunch of simple, more general helpers, for possible future sharing.
  b. jitrollingbuild.py download can now take no arguments and download a baseline JIT (from the JIT rolling build Azure Storage location) for the current enlistment to the default location. Previously, it required a specific git_hash and target directory. There is similar logic in superpmi.py, but not quite the same.
  c. The superpmi.py --no_progress option was made global, and applied in a few more places.
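
As a rough illustration of the kind of helper consolidated into jitutil.py in (a), here is a minimal download-and-extract sketch; the function name and signature are assumptions, not the actual jitutil.py API:

```python
# Minimal sketch: download a .zip and extract it to a target directory.
# Progress reporting (see the --no_progress discussion) is omitted.
import os
import tempfile
import urllib.request
import zipfile

def download_and_extract_zip(url, target_dir):
    os.makedirs(target_dir, exist_ok=True)
    with tempfile.TemporaryDirectory() as temp_dir:
        zip_path = os.path.join(temp_dir, "download.zip")
        urllib.request.urlretrieve(url, zip_path)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target_dir)
    return target_dir
```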

Fixes #59445

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 4, 2021
@ghost

ghost commented Nov 4, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.


`asmdiffs` can download coredistools, for example, so don't
output download progress if the user wants to suppress it.

Also, move the `download` log file to the "uploads" area so it gets uploaded
to the AzDO files.

Temporarily, only do diffs with benchmarks, to speed things up
Add downloading of portable git for use on Helix for jit-analyze.
It is added to the correlation payload that is copied to Helix machines
(currently, we only use Windows Helix machines).
Overall summary.md file should be uploaded as well.
@BruceForstall
Member Author

Sample AzDO run results with forced diffs (loop cloning disabled for the diff compiler), with the summary in the "Extensions" tab:

https://dev.azure.com/dnceng/public/_build/results?buildId=1457493&view=ms.vss-build-web.run-extensions-tab

and the published artifacts contain the "overall summary" Markdown files in the SuperPMI_Logs_<arch>_checked folders:

https://dev.azure.com/dnceng/public/_build/results?buildId=1457493&view=artifacts&pathAsName=false&type=publishedArtifacts

Add a few more comments
@BruceForstall BruceForstall marked this pull request as ready for review November 6, 2021 06:58
@BruceForstall
Member Author

@dotnet/jit-contrib This is ready for review. PTAL.

@BruceForstall
Member Author

Some ideas for improvements:

  • If there are no diffs, make that explicit; don't just show empty "overall_....md" files in the Extensions page.
  • Create summary.md files for other metrics, especially PerfScore. Maybe these should be summarized as one-liners in the Extensions page, but not fully expanded, to keep that page from getting too cluttered. Make them available for download.
  • Improve the headers in the overall summary output, e.g., add per-platform headers.
  • Q: can there be multiple levels of "disclosure" / tree in the output, so we can "expand" to view just one platform at a time, to prevent clutter?
  • Make sure that JIT assertions, if any, are shown or at least reported very prominently in the Extensions output.
  • Figure out why the recent runs aren't getting good data when looking for a JIT baseline (e.g., https://dev.azure.com/dnceng/public/_build/results?buildId=1457968&view=logs&j=011e1ec8-6569-5e69-4f06-baf193d1351e&t=bf6cf4cf-6432-59cf-d384-6b3bcf32ede2)

@AndyAyersMS
Member

The extensions thing is pretty cool; I had no idea that existed.

I find the current format requires a lot of scrolling to fully comprehend -- there is a lot of data and the interesting bits can be buried. It would be nice to have everything in one table and perhaps sorted by largest impact or some such. Doesn't work as well for detail/expand collapse but perhaps we can get to this via internal links or something?

Also (assuming interesting/unexpected diffs appear), is it obvious how to repro exactly what the CI did?

Maybe echo out the repro commands somewhere (including exact base jit version used / exact collections used if we ever start versioning beyond the jit guid)

@BruceForstall
Member Author

I find the current format requires a lot of scrolling to fully comprehend

The example I gave (here) has a lot of diffs, because I disabled loop cloning to force diffs. However, we are generating diffs for 6 platforms. And we propose to also generate PerfScore diff results.

Do you have any suggestion on how to summarize this better? Kunal suggested something like "Impact" table here.

Printing out repro commands makes sense. We can probably specify the precise baseline git hash used, assume the diff JIT is the current tree, and then just specify the correct -arch, -target_arch, -target_os options.
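
A hypothetical sketch of echoing such a repro hint at the end of a CI job; only -arch, -target_arch, and -target_os are options named here, so everything else (script path, formatting) is illustrative:

```python
# Hypothetical: print a repro hint so a developer can re-run the same asm diffs
# locally against the same baseline JIT.
def print_repro_hint(baseline_git_hash, arch, target_arch, target_os):
    print(f"To reproduce, use a baseline JIT built from {baseline_git_hash} and run:")
    print(f"  python superpmi.py asmdiffs -arch {arch} -target_arch {target_arch} -target_os {target_os}")

print_repro_hint("<baseline git hash>", "x64", "arm64", "linux")
```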

Note, however, I don't want to gate this change on creating the "perfect" output.

@AndyAyersMS
Member

Do you have any suggestion on how to summarize this better? Kunal suggested something like "Impact" table here.

Yes, something very much like that would be great.

Note, however, I don't want to gate this change on creating the "perfect" output.

Agreed -- no need to hold this up based on my feedback; it is quite useful even as-is (eg trigger on community PRs that we think should be no diff). As we get mileage on it we will figure out enhancements to make it more useful.

@BruceForstall
Member Author

eg trigger on community PRs that we think should be no diff

Currently it's set to trigger on every PR that touches the JIT directory. I think we should keep that level of checking.

@kunalspathak
Member

One more optimization we spoke about is to skip creating .dasm files altogether and just return the metrics numbers, but it is currently blocked by #52877.

This affects `superpmi.py download` output
@BruceForstall
Member Author

@kunalspathak, others: I updated this change for the feedback.

@BruceForstall
Member Author

Here's the Extensions page result we now generate if there are no diffs:

https://dev.azure.com/dnceng/public/_build/results?buildId=1462505&view=ms.vss-build-web.run-extensions-tab

This run completed successfully, except that the aspnet collection replay has MISSING data, which is reported as an error code.

…script

There will be more changes required if we ever run on non-Windows platforms,
so don't keep these partial measures.
@kunalspathak kunalspathak left a comment
Member

LGTM. Thanks for doing this, can't wait to trigger it on one of my PRs.

@kunalspathak
Member

I don't see the logic to upload other metric summary.md, e.g. PerfScore. Are you planning to do that in a follow-up PR?

@AndyAyersMS
Member

Thanks for doing this, can't wait to trigger it on one of my PRs.

Agree -- thanks for working on this.

👍

@BruceForstall
Member Author

I don't see the logic to upload other metric summary.md, e.g. PerfScore. Are you planning to do that in a follow-up PR?

True, I haven't done that.

I can pass -metrics CodeSize -metrics PerfScore to superpmi.py to add PerfScore to the summary.md. Then, PerfScore and CodeSize will be mixed together in the MD files. IMO, that's ok, even if it potentially creates additional "clutter". However, it disables the recent work to use the "actual" code size instead of just the code size for functions with diffs, which is unfortunate. See #61254 (comment).

Generating the PerfScore metrics separately would require iterating over all the respective MCH directories, invoking jit-analyze specifically for each base/diff pair, and then collecting a summary_PerfScore.md file, which would be separately summarized and uploaded. This is more work than I want to do.

So I think for now I'm not going to add additional metrics to the summary.

@kunalspathak
Member

However, it disables the recent work to use the "actual" code size instead of just the code size for functions with diffs, which is unfortunate.

What do you mean? It will print the PerfScore diff at least, right? It won't be accurate with actual PerfScore of everything combined?

@BruceForstall
Member Author

What do you mean? It will print the PerfScore diff at least, right? It won't be accurate with actual PerfScore of everything combined?

The implementation of the accurate CodeSize doesn't kick in when any metrics are specifically requested:

```python
if self.coreclr_args.metrics:
    command += [ "--metrics", ",".join(self.coreclr_args.metrics) ]
elif base_bytes is not None and diff_bytes is not None:
    command += [ "--override-total-base-metric", str(base_bytes), "--override-total-diff-metric", str(diff_bytes) ]
```

There is no "accurate" total PerfScore number currently, just CodeSize.

@kunalspathak
Member

Can we do something like this in superpmi.py itself? I am mostly interested in PerfScore and am trying to see an easy way to get it in along with this PR.

```python
command = [ jit_analyze_path, "--md", md_summary_file, "-r", "--base", base_asm_location, "--diff", diff_asm_location ]

# Run once for code size, using the accurate totals when available.
m_command = command
if base_bytes is not None and diff_bytes is not None:
    m_command = command + [ "--override-total-base-metric", str(base_bytes), "--override-total-diff-metric", str(diff_bytes) ]
run_and_log(m_command, logging.INFO)

# Run again for the non-CodeSize metrics.
non_code_size_metrics = [m for m in self.coreclr_args.metrics if m != 'CodeSize']
if len(non_code_size_metrics) > 0:
    m_command = command + [ "--metrics", ",".join(non_code_size_metrics) ]
    run_and_log(m_command, logging.INFO)
```

@kunalspathak
Member

By the way, I didn't realize until now that we can pass -metrics to the superpmi.py script.

@BruceForstall
Member Author

Something like that would work, but we'd then have two .md files, or have to force the second jit-analyze to append to the first one.

I think a better solution is to pass "override" data on a per-metric basis. That is, teach jit-analyze to understand --override-total-base-metric <metric>,<value> instead of just --override-total-base-metric <value> (and support multiple overrides, one per metric). Then, superpmi.py could invoke jit-analyze exactly once. And for now, just pass the CodeSize override arguments.

A not-mutually-exclusive alternative is to teach superpmi.py to (optionally?) create one summary.md file for each metric, e.g., summary_<metric>.md, by either invoking jit-analyze multiple times, or teaching jit-analyze to split the results (the latter would presumably be much faster).
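
A sketch of how superpmi.py might build a single jit-analyze invocation under that proposed (not yet existing) --override-total-base-metric <metric>,<value> syntax; all names below are illustrative:

```python
# Sketch of the proposed generalization: per-metric override pairs, one
# jit-analyze invocation covering all requested metrics.
def build_jit_analyze_command(jit_analyze_path, md_summary_file,
                              base_asm_location, diff_asm_location,
                              metrics, base_overrides, diff_overrides):
    command = [ jit_analyze_path, "--md", md_summary_file, "-r",
                "--base", base_asm_location, "--diff", diff_asm_location,
                "--metrics", ",".join(metrics) ]
    # base_overrides/diff_overrides: dict of metric -> accurate total, e.g.
    # { "CodeSize": 123456 }. PerfScore has no accurate total today, so it
    # would simply be absent.
    for metric, value in base_overrides.items():
        command += [ "--override-total-base-metric", f"{metric},{value}" ]
    for metric, value in diff_overrides.items():
        command += [ "--override-total-diff-metric", f"{metric},{value}" ]
    return command
```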

@kunalspathak
Member

or teaching jit-analyze to split the results (the latter would presumably be much faster).

I agree. We can just ask jit-analyze to dump the metrics to <md_filename_from_args>.metrics_name.md.

But then, we need to still change the logic around passing --override-total-base-metric to jit-analyze, right?

@BruceForstall
Member Author

But then, we need to still change the logic around passing --override-total-base-metric to jit-analyze, right?

Yes, or generalize it as I suggest.

@kunalspathak
Member

Yes, or generalize it as I suggest.

Ok, whatever is easy and clean is fine then.
