Regressions in System.Numerics.Tests.Constructor #68410

Closed
performanceautofiler bot opened this issue Apr 5, 2022 · 4 comments
Assignees
Labels
arch-x86 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI runtime-coreclr specific to the CoreCLR runtime
Milestone

Comments

@performanceautofiler

Run Information

Architecture x86
OS Windows 10.0.18362
Baseline 631389fb941eb618fef883975ca5f4f5b03a93cf
Compare 9b2e2a830a4e2e67c920aa200329533baba5c363
Diff Diff

Regressions in System.Numerics.Tests.Constructor

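| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
|-----------|----------|------|-----------|--------------|---------------|-------------|------------|----------|--------------|-------------|
| SpanCastBenchmark_Double - Duration of single invocation | 5.92 ns | 7.14 ns | 1.21 | 0.01 | False | | | | | |
| SpanCastBenchmark_Int16 - Duration of single invocation | 6.08 ns | 7.14 ns | 1.17 | 0.07 | False | | | | | |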

Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Numerics.Tests.Constructor*'

Payloads

Baseline
Compare

Histogram

System.Numerics.Tests.Constructor.SpanCastBenchmark_Double


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 7.139542309121308 > 6.213530397532811.
IsChangePoint: Marked as a change because one of 3/7/2022 10:45:01 PM, 3/11/2022 8:42:43 PM, 3/31/2022 5:34:47 PM, 4/5/2022 2:02:34 AM falls between 3/26/2022 9:07:40 PM and 4/5/2022 2:02:34 AM.
IsRegressionStdDev: Marked as regression because -166.00530257814987 (T) = (0 -7.1661131362762305) / Math.Sqrt((0.0013573709364404836 / (35)) + (0.0003398131217755381 / (20))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (35) + (20) - 2, .025) and -0.20919156977653053 = (5.926367099632189 - 7.1661131362762305) / 5.926367099632189 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
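
The IsRegressionStdDev check above can be reproduced by hand. The sketch below is not the autofiler's code; it only redoes the arithmetic with the numbers copied from the log line. Two things are assumptions: the two values divided by 35 and 20 are treated as sample variances, and the numerator uses the baseline mean (5.926...) where the log prints `0`, because that is what yields the reported T value. The critical value is taken from the log as a constant rather than recomputed.

```python
import math

# Numbers copied from the IsRegressionStdDev line for SpanCastBenchmark_Double.
BASE_MEAN, BASE_VAR, BASE_N = 5.926367099632189, 0.0013573709364404836, 35
CMP_MEAN, CMP_VAR, CMP_N = 7.1661131362762305, 0.0003398131217755381, 20

T_CRITICAL = -2.005745995316835  # logged StudentT.InvCDF(0, 1, 35 + 20 - 2, .025)
REL_THRESHOLD = -0.05            # logged 5% threshold


def t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n):
    """Difference of the means divided by the combined standard error.

    Assumption: the numerator is (baseline mean - compare mean); the log
    prints 0 in place of the baseline mean.
    """
    standard_error = math.sqrt(base_var / base_n + cmp_var / cmp_n)
    return (base_mean - cmp_mean) / standard_error


def relative_change(base_mean, cmp_mean):
    """Signed change relative to the baseline; negative means slower."""
    return (base_mean - cmp_mean) / base_mean


t = t_statistic(BASE_MEAN, BASE_VAR, BASE_N, CMP_MEAN, CMP_VAR, CMP_N)
rel = relative_change(BASE_MEAN, CMP_MEAN)

# Both conditions from the log line must hold for the check to flag a regression.
is_regression = t < T_CRITICAL and rel < REL_THRESHOLD

print(f"T = {t:.4f}, relative change = {rel:.4f}, regression = {is_regression}")
```

The same two functions can be applied to the SpanCastBenchmark_Int16 numbers. Note that this check measures statistical significance between two sets of runs; it does not by itself distinguish a real codegen change from an alignment effect.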

System.Numerics.Tests.Constructor.SpanCastBenchmark_Int16

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 7.138994615986093 > 6.382707076214422.
IsChangePoint: Marked as a change because one of 2/23/2022 10:45:30 PM, 3/7/2022 10:45:01 PM, 3/31/2022 5:34:47 PM, 4/5/2022 2:02:34 AM falls between 3/26/2022 9:07:40 PM and 4/5/2022 2:02:34 AM.
IsRegressionStdDev: Marked as regression because -17.371584344969687 (T) = (0 -7.160887237990245) / Math.Sqrt((0.11903633236762591 / (35)) + (0.0014548617096536063 / (20))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (35) + (20) - 2, .025) and -0.16683325346415603 = (6.137027048835492 - 7.160887237990245) / 6.137027048835492 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler performanceautofiler bot added CoreClr untriaged New issue has not been triaged by the area owner labels Apr 5, 2022
@kunalspathak
Member

Introduced in #65803 @SingleAccretion

@kunalspathak kunalspathak changed the title from "[Perf] Changes at 3/31/2022 9:50:44 PM" to "Regressions in System.Numerics.Tests.Constructor" Apr 22, 2022
@kunalspathak kunalspathak transferred this issue from dotnet/perf-autofiling-issues Apr 22, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 22, 2022
@ghost

ghost commented Apr 22, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged, refs/heads/main, RunKind=micro, Windows 10.0.18362, Regression, CoreClr, x86

Milestone: -

@SingleAccretion
Contributor

So the assembly diff here is very simple:

        mov       eax,2
        mul       edx
        mov       [ebp+0FFF0],eax
        mov       [ebp+0FFF4],edx
-       mov       eax,[ebp+0FFF0]
-       mov       edx,[ebp+0FFF4]
-       push      edx
-       push      eax
+       push      [ebp+0FFF4]
+       push      [ebp+0FFF0]
        push      0
        push      10
        call      CORINFO_HELP_ULDIV

And I can reproduce the time difference. To test things out, I've created a JitPadStkArg knob that inserts nops before each FIELD_LIST.

Here are the benchmarking results:

| JitPadStkArg | Base/Diff     |
|--------------|---------------|
| 0 (none)     | 1.24 +/- 0.4  |
| 1            | 1.05 +/- 0.02 |
| 2            | 1.26 +/- 0.08 |
| 3            | 1.34 +/- 0.07 |
| 4            | 1.13 +/- 0.04 |

So it looks to me like an alignment-induced regression. I don't think we can do anything specifically related to #65803 here. I suppose the benchmark's performance will be improved nicely once we implement long multi-regs on x86 (which should not be too involved for this specific MUL_LONG case).

The regressions in #4380 look related to #67335, will investigate those next.

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label May 5, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone May 5, 2022
@kunalspathak
Member

This seems to be noise, looking at the latest trend:

[two trend graph images]

@ghost ghost locked as resolved and limited conversation to collaborators Jun 17, 2022
@jeffhandley jeffhandley added runtime-coreclr specific to the CoreCLR runtime arch-x86 and removed CoreClr labels Dec 28, 2022
@jeffhandley jeffhandley removed the x86 label Dec 28, 2022