
[Perf -14%] Microsoft.Extensions.Logging.LoggingOverhead (3) #3249

Open
performanceautofiler bot opened this issue Nov 10, 2020 · 16 comments

@performanceautofiler

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline b6f791d1984f998384267b8b6683bd02f64747f2
Compare 7a2e0ececef83f6b5626a6410bd266fbbf1cb071

Regressions in Microsoft.Extensions.Logging.LoggingOverhead

| Benchmark | Baseline | Test | Test/Base | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
|---|---|---|---|---|---|---|---|---|
| NoArguments | 31.18 ns | 36.00 ns | 1.15 | 321.089102434843 | 373.8511616032678 | 1.1643221734039744 | Trace | Trace |
| NoArguments_DefineMessage | 33.48 ns | 39.50 ns | 1.18 | 363.90220302121344 | 410.7404024333668 | 1.1287109531717316 | Trace | Trace |
| TwoArguments_DefineMessage | 47.76 ns | 51.93 ns | 1.09 | 510.5725446164502 | 563.1364961266828 | 1.1029509950436513 | Trace | Trace |

Related Issue on x64 ubuntu

[Perf 21%] Microsoft.Extensions.Logging.LoggingOverhead (2)

Historical Data in Reporting System

Repro

```
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'Microsoft.Extensions.Logging.LoggingOverhead*'
```
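
For context, the benchmarks under test exercise the ILogger call path in roughly the shape sketched below. This is a minimal illustration, not the actual MicroBenchmarks source: the message text, event ids, and the use of NullLogger here are assumptions (the real suite has its own NoopLogger, which is visible in the profiles further down).

```csharp
using System;
using BenchmarkDotNet.Attributes;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;

public class LoggingOverheadSketch
{
    // Stand-in for the suite's NoopLogger: a logger that does nothing.
    private readonly ILogger _logger = NullLogger.Instance;

    // Cached, strongly typed log delegates created once via LoggerMessage.Define.
    private static readonly Action<ILogger, Exception> NoArgsMessage =
        LoggerMessage.Define(LogLevel.Information, new EventId(1, "NoArgs"), "No arguments");

    private static readonly Action<ILogger, int, string, Exception> TwoArgsMessage =
        LoggerMessage.Define<int, string>(LogLevel.Information, new EventId(2, "TwoArgs"),
            "Two arguments: {First} and {Second}");

    [Benchmark]
    public void NoArguments() =>
        // Plain extension-method call: goes through LoggerExtensions.Log and
        // constructs FormattedLogValues on every invocation.
        _logger.LogInformation("No arguments");

    [Benchmark]
    public void NoArguments_DefineMessage() =>
        // Pre-built delegate path; skips the per-call message formatting setup.
        NoArgsMessage(_logger, null);

    [Benchmark]
    public void TwoArguments_DefineMessage() =>
        TwoArgsMessage(_logger, 42, "hello", null);
}
```

The NoArguments variant goes through LoggerExtensions.Log and constructs FormattedLogValues on every call, while the *_DefineMessage variants use a cached delegate from LoggerMessage.Define; both paths show up in the ETL breakdowns below.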

Histogram

Microsoft.Extensions.Logging.LoggingOverhead.NoArguments

[30.664 ; 31.779) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[31.779 ; 33.024) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[33.024 ; 34.256) | @@@@@@@@@@@@@@@@@@@@@@@@@@
[34.256 ; 35.409) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@
[35.409 ; 36.583) | @@@@@@@@@@@@@@@@
[36.583 ; 38.067) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[38.067 ; 38.868) | @
[38.868 ; 40.350) | @
[40.350 ; 41.443) | @

Microsoft.Extensions.Logging.LoggingOverhead.NoArguments_DefineMessage

[32.379 ; 33.671) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[33.671 ; 34.636) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[34.636 ; 35.572) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[35.572 ; 37.027) | @@@@@@@@@@@@@@@@@@@@@@@@@
[37.027 ; 37.836) | @@@@@@@@@@@@@@
[37.836 ; 38.659) | @@@@@
[38.659 ; 39.623) | @@@@
[39.623 ; 40.446) | @@@@@@@@@@@@@@@@@@@@
[40.446 ; 42.220) | @@@@@@@@@@@@@@@@

Microsoft.Extensions.Logging.LoggingOverhead.TwoArguments_DefineMessage

[45.767 ; 47.102) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[47.102 ; 48.982) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[48.982 ; 50.707) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[50.707 ; 51.838) | @@@@@@@@@@@@@@@@@@@@@@@@@
[51.838 ; 53.213) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@
[53.213 ; 54.924) | @@@@@@@@@@@@@@@@@@@@@@@@@@@
[54.924 ; 57.230) | @@@@@@@@@@@@
[57.230 ; 60.030) | @

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@DrewScoggins (Owner)

@AndyAyersMS

@AndyAyersMS (Collaborator)

Benchmark-interval-only analysis for NoArguments (from the ETL) suggests this is something in the core runtime itself.

Baseline

Samples for corerun: 7193 events for Benchmark Intervals
Jitting           : 00.24% 9E+06    samples 1573 methods
  JitInterface    : 00.08% 3E+06    samples
Jit-generated code: 59.63% 2.27E+09 samples
  Jitted code     : 59.63% 2.27E+09 samples
  MinOpts code    : 00.00% 0        samples
  FullOpts code   : 17.42% 6.62E+08 samples
  Tier-0 code     : 00.00% 0        samples
  Tier-1 code     : 42.21% 1.6E+09  samples
  R2R code        : 00.00% 0        samples

02.92%   1.11E+08    ?        Unknown
37.13%   1.411E+09   native   coreclr.dll
16.55%   6.29E+08    FullOpt  [Microsoft.Extensions.Logging]Logger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
12.00%   4.56E+08    Tier-1   [Microsoft.Extensions.Logging.Abstractions]LoggerExtensions.Log(class Microsoft.Extensions.Logging.ILogger,value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class System.Exception,class System.String,class System.Object[])
11.08%   4.21E+08    Tier-1   [Microsoft.Extensions.Logging]Logger.<Log>g__LoggerLog|12_0(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class Microsoft.Extensions.Logging.ILogger,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>,class System.Collections.Generic.List`1<class System.Exception>&,!!0&)
07.53%   2.86E+08    Tier-1   [Microsoft.Extensions.Logging.Abstractions]FormattedLogValues..ctor(class System.String,class System.Object[])
06.05%   2.3E+08     Tier-1   [MicroBenchmarks]LoggingOverhead.NoArguments()
04.68%   1.78E+08    Tier-1   [Microsoft.Extensions.Logging]MessageLogger.IsEnabled(value class Microsoft.Extensions.Logging.LogLevel)
00.87%   3.3E+07     FullOpt  [c5630227-85b0-454f-88ac-8126d9409a39]Runnable_0.WorkloadActionUnroll(int64)
00.84%   3.2E+07     Tier-1   [MicroBenchmarks]LoggingBenchmarkBase+NoopLogger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
00.16%   6E+06       native   clrjit.dll
00.08%   3E+06       native   ntoskrnl.exe
00.05%   2E+06       native   intelppm.sys

Change

Samples for corerun: 7797 events for Benchmark Intervals
Jitting           : 00.26% 1E+07    samples 1573 methods
  JitInterface    : 00.03% 1E+06    samples
Jit-generated code: 54.03% 2.08E+09 samples
  Jitted code     : 54.03% 2.08E+09 samples
  MinOpts code    : 00.00% 0        samples
  FullOpts code   : 16.07% 6.18E+08 samples
  Tier-0 code     : 00.00% 0        samples
  Tier-1 code     : 37.96% 1.46E+09 samples
  R2R code        : 00.00% 0        samples

01.38%   5.3E+07     ?        Unknown
44.25%   1.702E+09   native   coreclr.dll
15.21%   5.85E+08    FullOpt  [Microsoft.Extensions.Logging]Logger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
10.95%   4.21E+08    Tier-1   [Microsoft.Extensions.Logging]Logger.<Log>g__LoggerLog|12_0(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class Microsoft.Extensions.Logging.ILogger,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>,class System.Collections.Generic.List`1<class System.Exception>&,!!0&)
10.69%   4.11E+08    Tier-1   [Microsoft.Extensions.Logging.Abstractions]LoggerExtensions.Log(class Microsoft.Extensions.Logging.ILogger,value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,class System.Exception,class System.String,class System.Object[])
06.73%   2.59E+08    Tier-1   [Microsoft.Extensions.Logging.Abstractions]FormattedLogValues..ctor(class System.String,class System.Object[])
04.50%   1.73E+08    Tier-1   [Microsoft.Extensions.Logging]MessageLogger.IsEnabled(value class Microsoft.Extensions.Logging.LogLevel)
04.39%   1.69E+08    Tier-1   [MicroBenchmarks]LoggingOverhead.NoArguments()
00.86%   3.3E+07     FullOpt  [86029241-0098-4ac1-a544-cffed1af7a77]Runnable_0.WorkloadActionUnroll(int64)
00.70%   2.7E+07     Tier-1   [MicroBenchmarks]LoggingBenchmarkBase+NoopLogger.Log(value class Microsoft.Extensions.Logging.LogLevel,value class Microsoft.Extensions.Logging.EventId,!!0,class System.Exception,class System.Func`3<!!0,class System.Exception,class System.String>)
00.23%   9E+06       native   clrjit.dll
00.08%   3E+06       native   ntoskrnl.exe

Benchmark: found 15 intervals; mean interval 255.670ms

@AndyAyersMS (Collaborator)

PerfView can't find the symbols for coreclr in those two traces, so we can't drill in further.

@AndyAyersMS (Collaborator)

With symbols, the diff seems to be all in JIT_VirtualFunctionPointer. The source for this helper doesn't appear to have changed during this time span.

(screenshot: PerfView diff showing the regression concentrated in JIT_VirtualFunctionPointer)
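
For background: JIT_VirtualFunctionPointer is (roughly) the runtime helper the JIT calls to resolve the target of a generic virtual/interface method, and ILogger.Log&lt;TState&gt; is such a method, so these benchmarks hit that path on every log call. A minimal sketch of the call shape involved — the interface and types below are illustrative, not the actual logging code:

```csharp
using System;

// Illustrative only. A generic method on an interface, like ILogger.Log<TState>,
// is a "generic virtual method": the JIT cannot dispatch it through a normal
// vtable/interface stub, so the call site asks the runtime for the target via
// a helper (JIT_VirtualFunctionPointer), which consults the generic handle cache.
interface ISink
{
    void Write<TState>(TState state);
}

class ConsoleSink : ISink
{
    public void Write<TState>(TState state) => Console.WriteLine(state);
}

static class Demo
{
    static void LogThrough(ISink sink) =>
        // callvirt on a generic interface method: this is the call shape that
        // lands in JIT_VirtualFunctionPointer on each invocation.
        sink.Write("hello");

    static void Main() => LogThrough(new ConsoleSink());
}
```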

@AndyAyersMS (Collaborator)

It could be that this helper is just being called more often; the callers are logging methods, and the source for those callers hasn't changed either.

This suggests we need to look for recent jit codegen diffs in the key managed methods.

@AndyAyersMS (Collaborator)

I don't see anything in the commit history that jumps out as a likely culprit.

Am building before/after runtimes so I can drill in further.

@AndyAyersMS (Collaborator)

I don't see any jit codegen changes, but still need to make sure I'm looking at the right version of the logging dll.

@AndyAyersMS (Collaborator)

I can repro this locally. Still investigating.

| Method | Job | Toolchain | Mean |
|---|---|---|---|
| NoArguments | Job-AILDUX | \runtime1\artifacts\tests\coreclr\Windows_NT.x64.Release\Tests\Core_Root\corerun.exe | 25.08 ns |
| NoArguments | Job-MJIGDO | \runtime2\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\corerun.exe | 28.48 ns |

@AndyAyersMS (Collaborator)

Seems like perhaps we're just taking longer to look things up in the generic handle cache? Need to check how much of the lookup code gets inlined into JIT_VirtualFunctionPointer.

@VSadov can you think of any recent changes (Oct 29-Nov 4) that might have caused something like this? I don't see anything in jit or runtime.

@VSadov commented Nov 11, 2020

Nothing really stands out from the diffs.

There was a change to the JIT's sorting, switching from qsort to a stable sort, but that should either affect nothing or affect a lot of scenarios; I can't see how it would impact just this.

JIT_VirtualFunctionPointer is just a hash lookup, and I don't see anything changed there in an obvious way. A regression there would probably affect more cases.

@AndyAyersMS (Collaborator)

New hypothesis: perhaps the windows_nt -> windows change broke how PGO is applied to the runtime, and these tests are particularly sensitive to native PGO?

Trying to verify now.

@AndyAyersMS (Collaborator) commented Nov 11, 2020

Seems plausible that this was the cause. From the build logs I have on hand:

;; hash 0199252802537aa4d8b393c3923b4d4cfe371872 (Oct 29)
3561 of 4076 (87.4%) original invalid call sites were matched.
4655 new call sites were added.
2324 of 33506 (  6.94%) profiled functions will be compiled for speed, and the rest of the functions will be compiled for size
150853 of 336462 inline instances were from dead/cold paths 
33226 of 33887 functions (98.0%) were optimized using profile data, and the rest of the functions were optimized without using profile data
49189162072 of 51438814196 instructions (95.6%) were optimized using profile data, and the rest of the instructions were optimized without using profile data

;; hash  03701ba6d2656c1fb52848744980c7b799aabccf (Nov 6)
3593 of 4111 (87.4%) original invalid call sites were matched.
4678 new call sites were added.
194 of 33461 (  0.58%) profiled functions will be compiled for speed, and the rest of the functions will be compiled for size
140438 of 315786 inline instances were from dead/cold paths 
33173 of 33887 functions (97.9%) were optimized using profile data, and the rest of the functions were optimized without using profile data
49184524871 of 51438814196 instructions (95.6%) were optimized using profile data, and the rest of the instructions were optimized without using profile data

Note how the number of methods compiled for speed dropped dramatically.

Am going to try building at dotnet/runtime@e691753 (before) and dotnet/runtime@b37f10a (after) to verify.

@AndyAyersMS (Collaborator)

Both those hashes show the PGO data is not applied effectively, so the impactful change was sometime before. Am going to keep looking.

Next to scrutinize:

@AndyAyersMS (Collaborator)

Looks like it was dotnet/runtime@8c6a049 (dotnet/runtime#41897).

I suspect some of the build flags changed, and this disqualified some methods from leveraging the existing PGO data; I'm going to try to verify by extracting the flags for jithelpers.obj.
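
As an aside, here is a hypothetical little helper (not part of either repo) for that kind of comparison: diff two flag lists, e.g. the compile command lines for jithelpers.obj dumped from the before/after build logs into text files, one flag per line.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: given two text files with one compiler flag per line
// (e.g. the jithelpers.obj command lines extracted from the before/after
// build logs), print which flags were removed or added.
class FlagDiff
{
    static void Main(string[] args)
    {
        var before = File.ReadAllLines(args[0]).Select(f => f.Trim()).Where(f => f.Length > 0).ToHashSet();
        var after  = File.ReadAllLines(args[1]).Select(f => f.Trim()).Where(f => f.Length > 0).ToHashSet();

        foreach (var flag in before.Except(after).OrderBy(f => f))
            Console.WriteLine($"- {flag}");   // present in the old build only
        foreach (var flag in after.Except(before).OrderBy(f => f))
            Console.WriteLine($"+ {flag}");   // present in the new build only
    }
}
```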

@AndyAyersMS (Collaborator)

The only real flag diffs I see are for PCH; that might be enough, as the code we inline into JIT_VirtualFunctionPointer likely comes from the PCH...

So it seems likely that if we regenerate the native PGO data we'll get this perf back.

@brianrob do you know when we next expect to see a PGO update for the runtime repo?
@jkoritzinsky FYI

@brianrob

Ownership of PGO in the runtime repo is currently under discussion, so I don't have a timeline yet. Once this gets resolved, I'd expect updates to start coming.
