[Perf] Changes at 4/3/2022 6:13:34 PM #68433

Closed
performanceautofiler bot opened this issue Apr 5, 2022 · 6 comments
Labels: arch-x86, area-CodeGen-coreclr, runtime-coreclr

Comments

@performanceautofiler

Run Information

Architecture x86
OS Windows 10.0.18362
Baseline a0f7927c0ce4cfa8d1c832e70461b0145389a8be
Compare 0b4af007f758b7f265a54565251c633b632cc999
Diff Diff

Regressions in PerfLabTests.CastingPerf2.CastingPerf

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IFooObjIsIFoo - Duration of single invocation | 533.95 μs | 585.14 μs | 1.10 | 0.04 | False | | | | | |
| FooObjIsDescendant - Duration of single invocation | 408.03 μs | 487.88 μs | 1.20 | 0.18 | False | | | | | |

Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'PerfLabTests.CastingPerf2.CastingPerf*'

Payloads

Baseline
Compare

Histogram

PerfLabTests.CastingPerf2.CastingPerf.IFooObjIsIFoo


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 585.142199074074 > 560.6315528445513.
IsChangePoint: Marked as a change because one of 3/10/2022 1:12:51 AM, 3/22/2022 2:38:09 PM, 4/3/2022 1:52:36 PM, 4/5/2022 2:02:34 AM falls between 3/26/2022 9:07:40 PM and 4/5/2022 2:02:34 AM.
IsRegressionStdDev: Marked as regression because -26.073826926551938 (T) = (0 -585079.1995517344) / Math.Sqrt((122660132.96965332 / (49)) + (2379427.143725326 / (6))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (49) + (6) - 2, .025) and -0.08212072884065917 = (540678.3032227511 - 585079.1995517344) / 540678.3032227511 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
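
The IsRegressionStdDev line above can be re-derived by hand. What follows is a minimal Python sketch, not the autofiler's actual code. It assumes that the `0` printed in the numerator stands in for the baseline mean (540678.30, taken from the relative-change term on the same line), since the logged T value is reproduced under that reading. The function names are hypothetical, the values appear to be in nanoseconds, and the critical value is copied from the log rather than recomputed.

```python
import math


def welch_t(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp):
    """Two-sample t statistic with unpooled variances, as the log line suggests."""
    standard_error = math.sqrt(var_base / n_base + var_cmp / n_cmp)
    return (mean_base - mean_cmp) / standard_error


def relative_change(mean_base, mean_cmp):
    """(baseline - compare) / baseline; negative when compare is slower."""
    return (mean_base - mean_cmp) / mean_base


# Values copied from the IFooObjIsIFoo IsRegressionStdDev line.
t_stat = welch_t(540678.3032227511, 122660132.96965332, 49,
                 585079.1995517344, 2379427.143725326, 6)
change = relative_change(540678.3032227511, 585079.1995517344)

# Critical value as logged: StudentT.InvCDF(0, 1, (49) + (6) - 2, .025).
T_CRITICAL = -2.005745995316835

# Both conditions from the log must hold for the point to count as a regression.
is_regression = t_stat < T_CRITICAL and change < -0.05
print(f"t = {t_stat:.4f}, change = {change:.4f}, regression = {is_regression}")
```

With these inputs the statistic comes out at roughly -26.07 and the relative change at roughly -0.082, matching the figures in the log.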

PerfLabTests.CastingPerf2.CastingPerf.FooObjIsDescendant

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 487.88003246753243 > 428.6117251602564.
IsChangePoint: Marked as a change because one of 2/25/2022 1:44:54 AM, 3/4/2022 7:00:20 AM, 3/7/2022 10:45:01 PM, 3/16/2022 5:02:20 PM, 3/21/2022 1:28:14 PM, 3/31/2022 5:34:47 PM, 4/5/2022 2:02:34 AM falls between 3/26/2022 9:07:40 PM and 4/5/2022 2:02:34 AM.
IsRegressionStdDev: Marked as regression because -17.992324298006597 (T) = (0 -477791.5116915254) / Math.Sqrt((57977535.77841977 / (35)) + (250451561.61398742 / (20))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (35) + (20) - 2, .025) and -0.16522805280935102 = (410041.2022690114 - 477791.5116915254) / 410041.2022690114 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Regressions in System.Collections.IndexerSet<Int32>

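| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ConcurrentDictionary - Duration of single invocation | 20.22 μs | 24.85 μs | 1.23 | 0.16 | False | | | | | |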

Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Collections.IndexerSet<Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.IndexerSet<Int32>.ConcurrentDictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 24.84945149172474 > 21.34707510538262.
IsChangePoint: Marked as a change because one of 3/7/2022 10:45:01 PM, 3/9/2022 10:22:25 AM, 4/3/2022 1:52:36 PM, 4/5/2022 2:02:34 AM falls between 3/26/2022 9:07:40 PM and 4/5/2022 2:02:34 AM.
IsRegressionStdDev: Marked as regression because -10.91617552412988 (T) = (0 -24360.34486246879) / Math.Sqrt((268967.87390529225 / (49)) + (712236.7726556616 / (6))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (49) + (6) - 2, .025) and -0.18753689218135378 = (20513.33733112236 - 24360.34486246879) / 20513.33733112236 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
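
The IsChangePoint line above amounts to a window membership test on timestamps. The sketch below is an illustration only: the helper names are hypothetical, and treating the window as inclusive at both ends is an assumption, since the log does not say how the boundaries are handled.

```python
from datetime import datetime

# Timestamp layout used in the log lines, e.g. "4/3/2022 1:52:36 PM".
FMT = "%m/%d/%Y %I:%M:%S %p"


def parse(stamp):
    return datetime.strptime(stamp, FMT)


def change_points_in_window(change_points, window_start, window_end):
    """Return the change points inside [window_start, window_end].

    Inclusive bounds are an assumption, not something the log states.
    """
    start, end = parse(window_start), parse(window_end)
    return [cp for cp in change_points if start <= parse(cp) <= end]


# Values copied from the ConcurrentDictionary IsChangePoint line.
points = ["3/7/2022 10:45:01 PM", "3/9/2022 10:22:25 AM",
          "4/3/2022 1:52:36 PM", "4/5/2022 2:02:34 AM"]
hits = change_points_in_window(points, "3/26/2022 9:07:40 PM", "4/5/2022 2:02:34 AM")
print(hits)
```

The 4/3/2022 point lies strictly inside the window, so the benchmark is marked as a change whichever way the boundaries are handled.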

Regressions in System.Collections.ContainsKeyFalse<String, String>

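| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IDictionary - Duration of single invocation | 17.66 μs | 20.17 μs | 1.14 | 0.01 | False | | | | | |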

Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Collections.ContainsKeyFalse<String, String>*'

Payloads

Baseline
Compare

Histogram

System.Collections.ContainsKeyFalse<String, String>.IDictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 20.166918817204298 > 18.506320741572186.
IsChangePoint: Marked as a change because one of 3/17/2022 11:24:40 PM, 3/31/2022 5:34:47 PM, 4/3/2022 1:52:36 PM, 4/5/2022 2:02:34 AM falls between 3/26/2022 9:07:40 PM and 4/5/2022 2:02:34 AM.
IsRegressionStdDev: Marked as regression because -36.0126424900174 (T) = (0 -20143.384864486816) / Math.Sqrt((164629.9030833987 / (49)) + (5577.151301824765 / (6))) is less than -2.005745995316835 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (49) + (6) - 2, .025) and -0.13261720640768843 = (17784.812689165658 - 20143.384864486816) / 17784.812689165658 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
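
The IsRegressionBase and IsImprovementBase checks describe a plain 5% threshold on the Test/Base ratio. The sketch below recomputes that ratio from the rounded Baseline and Test values reported for the four benchmarks in this issue. It is an approximation under stated assumptions: `classify` and `min_value` are hypothetical names, and the log does not say what counts as "too small", so that guard is left at zero.

```python
def classify(baseline, compare, threshold=0.05, min_value=0.0):
    """Label a pair of timings using a symmetric relative threshold.

    min_value is a hypothetical stand-in for the "value was too small" guard.
    """
    if baseline <= min_value:
        return "too small"
    ratio = compare / baseline
    if ratio > 1 + threshold:
        return "regression"
    if ratio < 1 - threshold:
        return "improvement"
    return "unchanged"


# (baseline, compare) in microseconds, as reported for each benchmark.
timings = {
    "IFooObjIsIFoo": (533.95, 585.14),
    "FooObjIsDescendant": (408.03, 487.88),
    "ConcurrentDictionary": (20.22, 24.85),
    "IDictionary": (17.66, 20.17),
}

ratios = {name: cmp_ / base for name, (base, cmp_) in timings.items()}
labels = {name: classify(base, cmp_) for name, (base, cmp_) in timings.items()}

for name in timings:
    print(f"{name}: Test/Base = {ratios[name]:.2f} -> {labels[name]}")
```

All four ratios exceed 1.05, which is consistent with every benchmark here being marked by IsRegressionBase and not by IsImprovementBase.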


@performanceautofiler performanceautofiler bot added the CoreClr and untriaged labels Apr 5, 2022
@kunalspathak
Member

Dup of #68410

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@kunalspathak kunalspathak transferred this issue from dotnet/perf-autofiling-issues Apr 23, 2022
@kunalspathak kunalspathak reopened this Apr 23, 2022
@kunalspathak
Member

As per @SingleAccretion in #68410 (comment)

The regressions could be related to #67335.

@SingleAccretion SingleAccretion self-assigned this Apr 23, 2022
@SingleAccretion
Contributor

1) PerfLabTests.CastingPerf2.CastingPerf.IFooObjIsIFoo: the diff is 4af53b5...966a4a7, implicating #67335:

image

That same bump if we look at the broader history:

image

This looks like modal behavior to me (notably, the lower mode stayed the same at about 533k).

I was also not able to reproduce the regression locally, even though there was an assembly diff: `lea edx, [static_addr]` replaced by a one-byte-smaller `mov eax, static_addr`.

So I think it is safe to say that this regression is not caused by #67335.
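
To illustrate what "modal behavior" means for a detector that compares window means, here is a small Python sketch. The timings are invented for illustration and are not measurements from this benchmark, and the largest-gap split is a simple heuristic chosen for the example, not a method anyone in this thread used. It shows how the mean can shift by more than 5% while the lower mode stays put, simply because more runs land in the upper mode.

```python
from statistics import mean


def split_modes(samples):
    """Split samples into (low, high) clusters at the largest gap between
    sorted values. Needs at least two samples."""
    ordered = sorted(samples)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


# Hypothetical timings in microseconds, invented for illustration only.
baseline = [533, 534, 532, 585, 533, 534, 533, 532]
compare = [585, 533, 586, 584, 585, 586]

for name, samples in (("baseline", baseline), ("compare", compare)):
    low, high = split_modes(samples)
    print(f"{name}: mean={mean(samples):.1f} low-mode={mean(low):.1f} "
          f"high-mode share={len(high) / len(samples):.0%}")
```

In this made-up data the lower mode sits at 533 in both windows; only the share of runs in the upper mode changes, yet the compare mean ends up more than 5% above the baseline mean.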

2) PerfLabTests.CastingPerf2.CastingPerf.FooObjIsDescendant: the diff is fe0f600...9b2e2a8, implicating #65803:

image

It also looks like modal behavior:

image

The benchmark is once again calling ChkCastAny, which does not have any longs in it.

Finally, I was not able to reproduce the regression locally, so I do not think #65803 is to blame.

3) System.Collections.IndexerSet<Int32>.ConcurrentDictionary: the diff is 4af53b5...966a4a7, implicating #67335:

image

Looking at the history, we once again see modal behavior:

image

And I was not able to reproduce the regression locally, so I presume #67335 is not the cause of this one either.

4) System.Collections.ContainsKeyFalse<String, String>.IDictionary: here we actually have two bumps, implicating both #67335 and #65803:

image

However, since then, the benchmark has returned to its pre-regression state:

image

As with the other benchmarks, I was not able to reproduce the regression locally, and so do not think #67335 and #65803 are to blame.

@SingleAccretion SingleAccretion removed their assignment Apr 23, 2022
@jeffschwMSFT jeffschwMSFT added the area-CodeGen-coreclr label Apr 24, 2022
@ghost

ghost commented Apr 24, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged, refs/heads/main, RunKind=micro, Windows 10.0.18362, Regression, CoreClr, x86

Milestone: -

@JulieLeeMSFT JulieLeeMSFT removed the untriaged label Apr 27, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Apr 27, 2022
@AndyAyersMS
Member

These are all "noisy" tests; e.g., the regression on 4/3 is just normal fluctuation.

newplot - 2022-04-28T111501 542

as is

newplot - 2022-04-28T111619 372

Going to close this.

@ghost ghost locked as resolved and limited conversation to collaborators May 28, 2022
@jeffhandley jeffhandley added runtime-coreclr and removed CoreClr labels Dec 28, 2022
@jeffhandley jeffhandley added arch-x86 and removed x86 labels Dec 28, 2022