Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option of retainOnlyTopFiles to jit-analyze #347

Merged
merged 3 commits into from
Nov 17, 2021

Conversation

kunalspathak
Copy link
Member

RetainBigDiffFilesOnly will make jit-analyze to delete the .dasm files and retain just the top count files that contains largest diffs. This will help us in reducing the .dasm files size when we upload them as part of CI. By default, this option is off and hence we will never delete those files.

Thoughts?

@kunalspathak
Copy link
Member Author

kunalspathak commented Nov 16, 2021

@dotnet/jit-contrib , @BruceForstall - any thoughts?

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea seems good. Some comments/suggestions:

  1. We already have a list of the base/diff files (baseList / diffList in Main) in a FileInfo list, that includes the full path. Those paths are copied into the FileDelta structure. So it seems like we shouldn't do a directory search for files to delete, we should use the file paths we already have.
  2. What if you initialize a member of FileInfo (or FileDelta), retain = false, only used if the retain option is used, set it to true in the DisplayFileMetric code, then simply iterate over the lists towards the end of Main deleting files where this is false?
  3. Nit: name arg -retainOnlyTopFiles instead? Why? "Diff" implies it wouldn't delete "Base" files (which it would/should). "Top" implies the highest metric value, which might not be the "biggest". Maybe even more general: "-retainReportedFiles", meaning to retain any files on which we report a regression/improvement. We might choose to report both the largest, weighted, code size diffs. But we might also want to report the smallest, which are often the easiest to analyze. In fact, for the purpose of saving things in the CI, it would be quite useful to include both those sets.

@kunalspathak
Copy link
Member Author

Done.

@kunalspathak kunalspathak changed the title Add option of RetainBigDiffFilesOnly to jit-analyze Add option of retainOnlyTopFiles to jit-analyze Nov 17, 2021
@kunalspathak
Copy link
Member Author

Related PR where this option will be consumed - dotnet/runtime#61700

@kunalspathak
Copy link
Member Author

But we might also want to report the smallest, which are often the easiest to analyze. In fact, for the purpose of saving things in the CI, it would be quite useful to include both those sets.

I will keep that as a future improvement. Let us monitor and see how big the size of top (largest) subset is and if there is a need to see at smaller subset. I usually just pick the smallest of the top files that are reported.

src/jit-analyze/jit-analyze.cs Outdated Show resolved Hide resolved
Co-authored-by: Bruce Forstall <brucefo@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants