Regressions in System.MathBenchmarks.Double #85985

Closed
performanceautofiler bot opened this issue May 9, 2023 · 7 comments

@performanceautofiler

Run Information

Name Value
Architecture x64
OS Windows 10.0.18362
Queue TigerWindows
Baseline 3695b6ddd869e53eb663f7674e0f130eaf895b03
Compare 3e8f17a65a068fca3d19fa5cd43a7e1cd414a5ae
Diff Diff
Configs CompilationMode:tiered, RunKind:micro

Regressions in System.MathBenchmarks.Double

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sqrt - Duration of single invocation | 9.35 μs | 28.22 μs | 3.02 | 0.00 | True | 30934.800080369703 | 32743.496672716275 | 1.0584680226685648 | Trace | Trace |

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline
Compare

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.MathBenchmarks.Double*'

Histogram

System.MathBenchmarks.Double.Sqrt


Description of detection logic

IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionWindowed: Marked as regression because 28.217955081921694 > 9.81972192923399.
IsChangePoint: Marked as a change because one of 5/1/2023 6:56:14 PM, 5/9/2023 7:24:34 AM falls between 4/30/2023 6:17:41 PM and 5/9/2023 7:24:34 AM.
IsRegressionStdDev: Marked as regression because -2456.13357226457 (T) = (0 -28234.404781782963) / Math.Sqrt((0.02128795533441095 / (14)) + (1359.3234780046578 / (23))) is less than -2.0301079282477414 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (14) + (23) - 2, .025) and -2.0190451457621967 = (9352.097573438183 - 28234.404781782963) / 9352.097573438183 is less than -0.05.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
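
The numbers quoted in the detection logic above can be re-derived by hand. The following is a minimal sketch, not the autofiler's actual code: it assumes the two values divided by 14 and 23 are sample variances and sample counts, and it assumes the statistic is a Welch-style t value. The printed numerator starts with `0`, but the reported T value is only reproduced when the baseline mean is used instead, so that substitution is an inference from the numbers. All variable names are hypothetical.

```python
import math

# Values copied from the IsRegressionStdDev line (nanoseconds).
baseline_mean_ns = 9352.097573438183
compare_mean_ns = 28234.404781782963
baseline_var, baseline_n = 0.02128795533441095, 14   # assumed: variance, sample count
compare_var, compare_n = 1359.3234780046578, 23      # assumed: variance, sample count
critical_t = -2.0301079282477414  # reported StudentT.InvCDF(0, 1, 35, .025)


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Difference of means divided by the combined standard error."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


t_stat = welch_t(baseline_mean_ns, baseline_var, baseline_n,
                 compare_mean_ns, compare_var, compare_n)
relative_change = (baseline_mean_ns - compare_mean_ns) / baseline_mean_ns
ratio = compare_mean_ns / baseline_mean_ns  # the Test/Base column

# Both conditions from the report: significant t value and more than 5% slower.
is_regression = t_stat < critical_t and relative_change < -0.05

print(f"T = {t_stat:.5f}")
print(f"relative change = {relative_change:.6f}")
print(f"Test/Base = {ratio:.2f}")
print(f"regression: {is_regression}")
```

Under these assumptions the t value, the relative change, and the 3.02 Test/Base ratio all line up with the figures in the report.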

JIT Disasms

Baseline
Compare
Diff

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler bot added the arch-x64, os-windows, runtime-coreclr, and untriaged labels May 9, 2023
@cincuranet
Contributor

The commit range is 6f19e37...da0aa0c.

@cincuranet changed the title from "[Perf] Windows/x64: 1 Regression on 5/2/2023 12:34:35 AM" to "Regressions in System.MathBenchmarks.Double" May 9, 2023
@cincuranet removed the untriaged label May 9, 2023
@cincuranet transferred this issue from dotnet/perf-autofiling-issues May 9, 2023
@dotnet-issue-labeler bot added the needs-area-label label May 9, 2023
@ghost added the untriaged label May 9, 2023
@cincuranet
Contributor

@ghost

ghost commented May 10, 2023

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: -
Labels:

area-System.Runtime, os-windows, arch-x64, untriaged, runtime-coreclr, needs-area-label

Milestone: -

@vcsjones removed the needs-area-label label May 11, 2023
@tannergooding removed the untriaged label May 26, 2023
@ericstj added this to the 8.0.0 milestone Jul 24, 2023
@tannergooding
Member

No codegen differences. The only difference is that BDN changed how many operations it's executing.

CC. @adamsitnik

@adamsitnik
Member

> Only difference is BDN changed how many operations its executing

I am not sure I follow. BDN estimates how many invocations should be performed per iteration (250 ms). That number almost always changes between runs, but this should not have a big impact on the reported time.
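
To illustrate the point about invocation counts, here is a simplified model, not BenchmarkDotNet's actual pilot-stage code. It assumes a deterministic workload with a fixed per-operation cost and scales the invocation count until one iteration takes roughly 250 ms. The function names, the starting count, and the stopping rule are all hypothetical.

```python
def pick_invocation_count(measure_ns, target_ns=250_000_000, start=16, max_rounds=32):
    """Scale the invocation count until one iteration takes roughly target_ns."""
    count = start
    for _ in range(max_rounds):
        elapsed = measure_ns(count)
        scaled = max(1, round(count * target_ns / elapsed))
        if abs(scaled - count) <= 1:  # close enough to the target, stop adjusting
            return scaled
        count = scaled
    return count


def make_workload(per_op_ns):
    """Simulated benchmark: total time grows linearly with the invocation count."""
    return lambda n: n * per_op_ns


fast = make_workload(9_350.0)    # roughly the baseline per-invocation time
slow = make_workload(28_220.0)   # roughly the compare per-invocation time

n_fast = pick_invocation_count(fast)
n_slow = pick_invocation_count(slow)

# The reported time is elapsed / count, so it does not depend on the count itself.
per_op_fast = fast(n_fast) / n_fast
per_op_slow = slow(n_slow) / n_slow

print(n_fast, per_op_fast)
print(n_slow, per_op_slow)
```

In this model the invocation count differs a lot between the two workloads, but the per-operation time that gets reported is unchanged by the count, because it is the iteration time divided by the count.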

@tannergooding
Member

There were no codegen changes for the actual benchmark itself between base and diff. The only actual change is that some more methods on the general startup path now start in T0 rather than T1, because they are no longer marked AggressiveOptimization.

We semi-regularly see cases like this in the weekly triage. We also see cases where BDN doesn't work as expected with functions that have very small execution times (dotnet/BenchmarkDotNet#1802), which in turn can impact the overhead measurement of an empty call.

You can see the asm diff here: https://perfsupport.azurewebsites.net/diff?old=https%3A%2F%2Fpvscmdupload.blob.core.windows.net%2Fautofilereport%2Fjitdasms%2F05_09_2023%2FSystem_MathBenchmarks_Double_Sqrt_baseline_a81af296-a311-4f09-861a-365111575051.log&new=https%3A%2F%2Fpvscmdupload.blob.core.windows.net%2Fautofilereport%2Fjitdasms%2F05_09_2023%2FSystem_MathBenchmarks_Double_Sqrt_compare_a81af296-a311-4f09-861a-365111575051.log

The methods that are actually being measured can be seen by searching for sqrts.
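
For anyone who saves the disasm logs locally, the search can be scripted. This is a rough sketch under assumptions: it expects each method to begin with a `; Assembly listing for method ...` header line, and the sample log below is made up for illustration rather than taken from the linked files.

```python
import re

# Assumed header format for each method in a JIT disasm log.
HEADER = re.compile(r"^; Assembly listing for method (.+)$")


def methods_containing(log_text, needle="sqrts"):
    """Return {method name: number of lines containing needle}."""
    hits = {}
    current = None
    for line in log_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            current = match.group(1)
            continue
        if current is not None and needle in line:
            hits[current] = hits.get(current, 0) + 1
    return hits


# Hypothetical log excerpt, not real output from the linked diff.
SAMPLE_LOG = """\
; Assembly listing for method Example:Startup()
       mov      rax, rcx
       ret
; Assembly listing for method Example:SqrtLoop():double
       vsqrtsd  xmm0, xmm0, xmm1
       vsqrtsd  xmm0, xmm0, xmm2
       ret
"""

print(methods_containing(SAMPLE_LOG))
```

Running the same function over the baseline and compare logs and comparing the two dictionaries gives a quick check that the measured methods are the same on both sides.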

@ghost ghost locked as resolved and limited conversation to collaborators Aug 24, 2023