[Perf] Regressions in Dictionary and Hashset on 9/24/2022 1:05:07 PM #76256

Closed
performanceautofiler bot opened this issue Sep 27, 2022 · 35 comments
Assignees
Labels
arch-x64, area-System.Collections, os-linux, tenet-performance, tenet-performance-benchmarks
Milestone

Comments

@performanceautofiler

performanceautofiler bot commented Sep 27, 2022

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 05aa1fdf72daa876e482f3a762d72d40e82c50b8
Compare 3922b81fc9c408639c0a090eebcd65f3c0db96ad
Diff Diff

Regressions in System.Collections.TryGetValueFalse<Int32, Int32>

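| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IDictionary - Duration of single invocation | 3.94 μs | 6.19 μs | 1.57 | 0.21 | False | | | | | |
| Dictionary - Duration of single invocation | 3.91 μs | 5.31 μs | 1.36 | 0.10 | False | | | | | |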
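
As a quick sanity check on the Test/Base column, the ratio can be recomputed from the rounded Baseline and Test durations shown above. This is only a sketch: the function names are hypothetical, the 5% threshold is taken from the IsRegressionBase wording in this report, and the "value was not too small" guard is left out because its cutoff is not stated here.

```python
def ratio(baseline_us: float, test_us: float) -> float:
    """Test/Base ratio from two durations in microseconds."""
    return test_us / baseline_us


def exceeds_threshold(baseline_us: float, test_us: float, threshold: float = 0.05) -> bool:
    """True when the test duration is more than `threshold` above the baseline.

    The 5% default mirrors the IsRegressionBase description; the separate
    "value not too small" condition is intentionally not modeled.
    """
    return ratio(baseline_us, test_us) > 1.0 + threshold


# Rounded durations copied from the table: (baseline, test) in microseconds.
ROWS = {
    "IDictionary": (3.94, 6.19),
    "Dictionary": (3.91, 5.31),
}

for name, (base, test) in ROWS.items():
    print(f"{name}: Test/Base = {ratio(base, test):.2f}, "
          f"over 5%: {exceeds_threshold(base, test)}")
```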

Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'System.Collections.TryGetValueFalse<Int32, Int32>*'

Payloads

Baseline
Compare

Histogram

Edge Detector Info

Collection Data

System.Collections.TryGetValueFalse<Int32, Int32>.ConcurrentDictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 2.5190243849206353 > 2.4036435081910237.
IsChangePoint: Marked as a change because one of 8/6/2022 4:25:34 AM, 8/15/2022 9:13:40 PM, 9/13/2022 12:23:36 PM, 9/24/2022 7:43:48 AM, 9/27/2022 3:32:36 AM falls between 9/18/2022 2:31:46 PM and 9/27/2022 3:32:36 AM.
IsRegressionStdDev: Marked as regression because -18.96761416807978 (T) = (0 -2531.2749920005836) / Math.Sqrt((4762.9302636739385 / (45)) + (102.93410430935235 / (13))) is less than -2.0032407188469383 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (13) - 2, .025) and -0.0868651462502898 = (2328.968778448312 - 2531.2749920005836) / 2328.968778448312 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
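
The IsChangePoint line above amounts to a window test: does any detected change point fall inside the comparison window? A minimal sketch of that check follows, using the timestamps from this ConcurrentDictionary block. The helper names are hypothetical, and treating both window bounds as inclusive is an assumption; the log does not say how the endpoints are handled.

```python
from datetime import datetime

# Timestamp layout used in the log, e.g. "9/24/2022 7:43:48 AM".
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_stamp(stamp: str) -> datetime:
    return datetime.strptime(stamp, STAMP_FORMAT)


def change_points_in_window(change_points, window_start, window_end):
    """Return the change points that fall inside the window.

    Assumption: both bounds are inclusive.
    """
    start, end = parse_stamp(window_start), parse_stamp(window_end)
    return [s for s in change_points if start <= parse_stamp(s) <= end]


# Values copied from the IsChangePoint line above.
CHANGE_POINTS = [
    "8/6/2022 4:25:34 AM",
    "8/15/2022 9:13:40 PM",
    "9/13/2022 12:23:36 PM",
    "9/24/2022 7:43:48 AM",
    "9/27/2022 3:32:36 AM",
]

hits = change_points_in_window(
    CHANGE_POINTS, "9/18/2022 2:31:46 PM", "9/27/2022 3:32:36 AM"
)
print(hits)
```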

System.Collections.TryGetValueFalse<Int32, Int32>.IDictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 6.189232560069029 > 4.154752209147254.
IsChangePoint: Marked as a change because one of 9/23/2022 1:16:35 AM, 9/27/2022 3:32:36 AM falls between 9/18/2022 2:31:46 PM and 9/27/2022 3:32:36 AM.
IsRegressionStdDev: Marked as regression because -14.445665294315434 (T) = (0 -5399.457485776981) / Math.Sqrt((7800.283818254304 / (38)) + (188851.75236255076 / (20))) is less than -2.0032407188469383 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (38) + (20) - 2, .025) and -0.3564589621258156 = (3980.553512149795 - 5399.457485776981) / 3980.553512149795 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

System.Collections.TryGetValueFalse<Int32, Int32>.Dictionary(Size: 512)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 5.312845970588235 > 4.10624376084124.
IsChangePoint: Marked as a change because one of 8/23/2022 11:46:46 PM, 9/1/2022 8:04:48 PM, 9/13/2022 12:23:36 PM, 9/24/2022 1:05:07 PM, 9/27/2022 3:32:36 AM falls between 9/18/2022 2:31:46 PM and 9/27/2022 3:32:36 AM.
IsRegressionStdDev: Marked as regression because -21.186422586815983 (T) = (0 -5116.308816871219) / Math.Sqrt((36197.789434224804 / (46)) + (24833.31452200058 / (12))) is less than -2.0032407188469383 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (46) + (12) - 2, .025) and -0.28421284578723527 = (3984.003768265432 - 5116.308816871219) / 3984.003768265432 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
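
The IsRegressionStdDev lines can be re-derived by hand. The sketch below recomputes the two-sample t statistic (with an unpooled standard error) and the relative change for the three benchmarks, using only numbers printed in the log. Several things here are assumptions rather than facts about the detection tool: the function names are hypothetical, the two values under the square root are taken to be sample variances with the first pair belonging to the baseline, and the critical value is copied from the log rather than recomputed. Note also that the log prints the numerator as `(0 - compare mean)`, but the reported T values are only reproduced when the numerator is the baseline mean minus the compare mean, so that form is used here.

```python
import math

# Critical value copied from the log (StudentT.InvCDF(0, 1, 56, .025)).
T_CRITICAL = -2.0032407188469383
# Relative-change limit copied from the log ("is less than -0.05").
RELATIVE_CHANGE_LIMIT = -0.05


def t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n):
    """Difference of means divided by the unpooled standard error."""
    standard_error = math.sqrt(base_var / base_n + cmp_var / cmp_n)
    return (base_mean - cmp_mean) / standard_error


def relative_change(base_mean, cmp_mean):
    """Negative when the compare mean is larger (slower) than the baseline."""
    return (base_mean - cmp_mean) / base_mean


def is_regression(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n):
    t = t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n)
    return t < T_CRITICAL and relative_change(base_mean, cmp_mean) < RELATIVE_CHANGE_LIMIT


# (baseline mean, baseline variance, baseline n, compare mean, compare variance, compare n)
SAMPLES = {
    "ConcurrentDictionary": (2328.968778448312, 4762.9302636739385, 45,
                             2531.2749920005836, 102.93410430935235, 13),
    "IDictionary": (3980.553512149795, 7800.283818254304, 38,
                    5399.457485776981, 188851.75236255076, 20),
    "Dictionary": (3984.003768265432, 36197.789434224804, 46,
                   5116.308816871219, 24833.31452200058, 12),
}

for name, sample in SAMPLES.items():
    t = t_statistic(*sample)
    change = relative_change(sample[0], sample[3])
    print(f"{name}: T = {t:.4f}, relative change = {change:.4f}, "
          f"regression = {is_regression(*sample)}")
```

Both conditions have to hold for the check to fire: the t statistic must fall below the critical value and the relative change must be below -0.05.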

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler performanceautofiler bot added CoreClr, untriaged labels Sep 27, 2022
@dotnet dotnet deleted a comment from performanceautofiler bot Sep 27, 2022
@DrewScoggins DrewScoggins transferred this issue from dotnet/perf-autofiling-issues Sep 27, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@DrewScoggins DrewScoggins added os-linux, tenet-performance, tenet-performance-benchmarks, arch-x64 and removed refs/heads/main labels Sep 27, 2022
@DrewScoggins
Member

DrewScoggins commented Sep 27, 2022

Seems related to #75663, @stephentoub. We are seeing regressions in a large number of Dictionary and Hashset performance tests across all configurations as well.

@DrewScoggins DrewScoggins changed the title from "[Perf] Alpine/x64: 1 Regression on 9/24/2022 1:05:07 PM" to "[Perf] Regressions in Dictionary and Hashset on 9/24/2022 1:05:07 PM" Sep 27, 2022
@stephentoub
Member

Do current builds include PGO updates based on my change?

@DrewScoggins
Member

They do not. We can look at this again once we have updated PGO data, if you think that is a big factor here.

@stephentoub
Member

I suspect it is. We'll see. If it ends up persisting after a PGO update, I'll revert and take another whack at it.

@ghost

ghost commented Sep 28, 2022

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: kunalspathak
Labels:

area-System.Collections, os-linux, tenet-performance, tenet-performance-benchmarks, arch-x64, untriaged

Milestone: -

@eiriktsarpalis eiriktsarpalis removed the untriaged label Sep 28, 2022
@eiriktsarpalis eiriktsarpalis added this to the 8.0.0 milestone Sep 28, 2022
@sblom
Contributor

sblom commented Mar 30, 2023

I'm still working on this. I'm getting very close to having it running on .NET 8.0 preview 2. There's an issue that makes preview 3 and newer fail on ARM. I'm also working on a fix for that, but the current priority is to get us up to date with at least preview 2.

@sblom
Contributor

sblom commented Apr 6, 2023

The .NET 8.0 preview 2 update is complete, and PGO/IBC packages have been submitted to darc for the next round of dependency updates.

@stephentoub
Member

@EgorBo, I'm not sure what to make of this issue. There are two tests here, one for Dictionary and one for IDictionary. The Dictionary one returned to normal, but the IDictionary one didn't, yet the IDictionary test is exactly the same as the one for Dictionary just going through its IDictionary interface. Ideas?

@EgorBo
Member

EgorBo commented Apr 10, 2023

> @EgorBo, I'm not sure what to make of this issue. There are two tests here, one for Dictionary and one for IDictionary. The Dictionary one returned to normal, but the IDictionary one didn't, yet the IDictionary test is exactly the same as the one for Dictionary just going through its IDictionary interface. Ideas?

I still blame PGO here; I guess the updated profile data still hasn't landed.

Win-x64 default:

image

Same machine but with DOTNET_DynamicPGO=1:

image

@stephentoub
Member

stephentoub commented Apr 10, 2023

That latter picture is for Dictionary, not IDictionary. Also, dynamic pgo would enable devirtualization via IDictionary in the test... static wouldn't, such that they're not comparable, right?

@EgorBo
Member

EgorBo commented Apr 10, 2023

> That latter picture is for Dictionary, not IDictionary. Also, dynamic pgo would enable devirtualization via IDictionary... would static?

The point is that I don't see this regression on the runs with PGO enabled, for both IDictionary and Dictionary; see https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows%2010.0.18362_PGOType%3Dfullpgo/AllTestindex.html (there is a spike on that date, but it then got optimized).

@stephentoub
Member

stephentoub commented Apr 10, 2023

> The point that I don't see this regression on the runs with PGO enabled

But dynamic pgo would make the Dictionary and IDictionary tests effectively equivalent. Static pgo wouldn't, so I'm not understanding.

@EgorBo
Member

EgorBo commented Apr 10, 2023

> > The point that I don't see this regression on the runs with PGO enabled
>
> But dynamic pgo would make the tests effectively equivalent. Static pgo wouldn't, so I'm not understanding.

It's not necessarily devirtualization that kicks in here; any change to the C# implementation also makes the overall static profile stale until it's regenerated. A stale profile can certainly trigger regressions.

From what I see, currently our repo still uses static PGO that was trained on November 2022's version of dotnet/runtime code, so it's definitely stale.

@stephentoub
Member

> The .NET 8.0 preview 2 update is complete, and PGO/IBC packages have been submitted to darc for the next round of dependency updates.

> currently our repo still uses static PGO that was trained on November 2022's version of dotnet/runtime code

Ok, I guess we continue to wait for everything to propagate. We need to find a way to close this gap. @sblom, what is the expected time between when a change is checked into dotnet/runtime main and that change's impact is reflected in pgo data also merged into runtime main? From your most recent comment about "Preview 2", am I to infer that we plan to only do so once a month?

@EgorBo
Member

EgorBo commented Apr 10, 2023

PR #83624 updates PGO data, but it does not yet include the commit we need (it brings 103c1eaca9ad80cdd1746abfb97c7f3c9d0b0f3b while we need 7093fbd7705c978e6811b408d7bb33ea1c4b9148).

@sblom
Contributor

sblom commented Apr 10, 2023

> what is the expected time between when a change is checked into dotnet/runtime main and that change's impact is reflected in pgo data also merged into runtime main?

I think under normal circumstances, the biggest delay between a code change and its contribution to profile data will be mediated by Maestro. We can take runtime update PRs from dotnet-bot pretty much continuously, and from watching the last cycle, it looks like once we've published profiles, Maestro will PR them into runtime main within about a week. (I'm admittedly less clear on that part.)

> From your most recent comment about "Preview 2", am I to infer that we plan to only do so once a month?

I think it'll be more frequent than that. The reason we're on Preview 2 isn't policy--it's that dotnet --info for Preview 3 and newer builds is failing on our ARM64 machines. I tried going all the way to latest main, but stopped at Preview 2 along the way because it worked and it seemed like 6 months+ of progress was a good idea.

I'll work on getting latest main to work.

@stephentoub
Member

Got it. Thanks.

@sblom
Contributor

sblom commented Apr 12, 2023

I got a preview 4 build through the pipeline this afternoon. Profiles should get picked up by Maestro's next PR into runtime. Let me know if you see anything that looks wrong.

@stephentoub
Member

Excellent. Thank you.

@stephentoub
Member

Remaining regression went away. Looks like there might be a new one, but that'll be covered by a different issue.

@stephentoub stephentoub closed this as not planned Apr 27, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 27, 2023
@ericstj
Member

ericstj commented Oct 10, 2023

@stephentoub can you take another look and see if there is anything we still need to do here to address a potential regression?

@dotnet dotnet unlocked this conversation Nov 24, 2023
@stephentoub
Member

I don't think there's anything further actionable here. Thanks.

@github-actions github-actions bot locked and limited conversation to collaborators Jan 11, 2024