Commit
* cse-tuning branch

  1. Changed csdLiveAcrossCall to a bool (zero-diff)

* 2. Added the remaining zero-diff changes from my old coreclr branch (zero-diff)

* 3. Incoming stack arguments don't use any local stack frame slots

    x64 5 improvements 0 regressions, Total PerfScore diff: -10.72
    x86 16 improvements 5 regressions, Total PerfScore diff: -72.95

* 4. Locals with no references aren't enregistered (zero-diffs)

* 5. Fix handling of long integer types, they only use one register not two.

    x64 250 improvements 51 regressions, Total PerfScore diff: -459.09
    arm64 162 improvements 16 regressions, Total PerfScore diff: -1712.52

* 6. Adjust computation of moderateRefCnt and aggressiveRefCnt values

    x64 280 improvements 81 regressions, Total PerfScore diff: -274.78
    arm64 264 improvements 61 regressions, Total PerfScore diff: -911.00
    x86 87 improvements 42 regressions, Total PerfScore diff: -123.46
    arm32 195 improvements 81 regressions, Total PerfScore diff: -239.10

* 7. slotCount refactor (zero-diffs)

* 8. Enable the use of the live across call information

    x64 125 improvements 136 regressions, Total PerfScore diff: +427.43
    arm64 83 improvements 153 regressions, Total PerfScore diff: +260.68
    x86 218 improvements 193 regressions, Total PerfScore diff: +199.81
    arm32 145 improvements 181 regressions, Total PerfScore diff: -33283.10

    arm32 method with improvement:
    -33864.40 (-2.87% of base) : System.Private.CoreLib.dasm - TypeBuilder:CreateTypeNoLock():TypeInfo:this (2 methods)

* 9. Adjust the cse_use_costs for the LiveAcrossCall case

    x64 61 improvements 61 regressions, Total PerfScore diff: -189.03
    arm64 90 improvements 49 regressions, Total PerfScore diff: -463.42
    x86 88 improvements 80 regressions, Total PerfScore diff: -238.61
    arm32 101 improvements 63 regressions, Total PerfScore diff: -259.50

* 10. If this CSE is live across a call then we may need to spill an additional caller save register

    x64 73 improvements 45 regressions, Total PerfScore diff: -279.88
    arm64 45 improvements 76 regressions, Total PerfScore diff: -90.94
    x86 13 improvements 14 regressions, Total PerfScore diff: -21.55
    arm32 45 improvements 33 regressions, Total PerfScore diff: -78.60

* 11. (x64 only) floating point loads/stores encode larger, so adjust the cse def/use cost for SMALL_CODE

    No diffs in System.Private.CoreLib

* 12. Remove extra cse def/use costs for methods that have a largeFrame or a hugeFrame

    x64 199 improvements 50 regressions, Total PerfScore diff: -2061.36
    arm64 11 improvements 3 regressions, Total PerfScore diff: -46.84
    x86 136 improvements 80 regressions, Total PerfScore diff: -1795.00
    arm32 50 improvements 35 regressions, Total PerfScore diff: -132.30

* clang-format

* Code review feedback

  Removed increment of enregCount on _TARGET_X86_ when we have compLongUsed:

    Framework diffs
    Total PerfScoreUnits of diff: -654.75 (-0.00% of base)
    diff is an improvement.
    79 total methods with Perf Score differences (55 improved, 24 regressed), 146432 unchanged.

  Fixed setting of largeFrame/hugeFrame for ARM64

    Zero framework diffs.

* run jit-format

* correct some wording in comments

* reword a comment
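
Several of the steps above (8, 9 and 10) change how a CSE candidate is costed when its temp is live across a call. The sketch below is only an illustration of that idea, not the JIT's code: the types `CseCandidate` and `CseCosts`, the functions `estimateCosts` and `isProfitable`, and every numeric cost are hypothetical names and values chosen for the example. It assumes that a temp which survives a call pays one extra unit at its def and at each use (a spill/reload or a callee-saved register save), and then compares the cost of re-evaluating the expression everywhere against the cost of evaluating it once per def and reading the temp at each use.

```cpp
// Hypothetical, simplified model of a CSE candidate. Field names are
// illustrative and are not taken from the JIT sources.
struct CseCandidate
{
    unsigned defCount;       // places where the expression is first computed
    unsigned useCount;       // places that would read the CSE temp instead
    unsigned exprCost;       // assumed cost of evaluating the expression once
    bool     liveAcrossCall; // true if the CSE temp must survive a call
};

// Hypothetical per-def and per-use costs of keeping the value in a temp.
struct CseCosts
{
    unsigned defCost;
    unsigned useCost;
};

// Assumed cost model: a temp costs 1 unit to define and 1 unit to use.
// If it is live across a call, charge 1 extra unit for each, standing in
// for a spill/reload or for saving a callee-saved register.
CseCosts estimateCosts(const CseCandidate& c)
{
    CseCosts costs{1, 1};
    if (c.liveAcrossCall)
    {
        costs.defCost += 1;
        costs.useCost += 1;
    }
    return costs;
}

// A candidate is considered profitable when evaluating once per def and
// reading the temp at each use is strictly cheaper than re-evaluating the
// expression at every def and use.
bool isProfitable(const CseCandidate& c)
{
    const CseCosts costs = estimateCosts(c);

    const unsigned long long defs = c.defCount;
    const unsigned long long uses = c.useCount;
    const unsigned long long expr = c.exprCost;

    const unsigned long long withoutCse = expr * (defs + uses);
    const unsigned long long withCse    = defs * (expr + costs.defCost) + uses * costs.useCost;

    return withCse < withoutCse;
}
```

Under these assumed numbers, a candidate with one def, one use and an expression cost of 3 is worth promoting when it is not live across a call, stops being worth it when it is, and becomes worthwhile again once it has three uses. That is the kind of shift in decisions that the per-target improvement and regression counts above reflect.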