Reject method NGEN image if there's a rejit request for that method. #6184

Merged
merged 11 commits into from
Oct 28, 2024

@tonyredondo tonyredondo commented Oct 23, 2024

Summary of changes

This PR changes the CLR profiler to reject a method NGEN image if there's a rejit request for that method.

Reason for change

There are cases where a rejit request is not handled because we end up accepting the NGEN image for that method, so the rewriting is never triggered.

Implementation details

  • We now keep a list of all (ModuleId, MethodTokenDef) pairs for which we are requesting a rejit.
  • We change the JITCachedFunctionSearchStarted method to accept the NGEN image only if we are not requesting a rejit for that method.
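
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the tracer's actual code: `RejitRequestTracker` and `ShouldUseCachedFunction` are hypothetical names, and `ModuleID` / `mdMethodDef` are declared as stand-in typedefs so the snippet compiles without the CLR profiling headers. In a real profiler, the result would presumably be written to the `pbUseCachedFunction` out-parameter of `ICorProfilerCallback::JITCachedFunctionSearchStarted`.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <set>
#include <utility>

// Stand-in typedefs (assumption): the real ones come from the CLR profiling headers.
using ModuleID = std::uintptr_t;
using mdMethodDef = std::uint32_t;

// Hypothetical helper: remembers every (module, method token) pair
// for which a rejit has been requested.
class RejitRequestTracker {
public:
    void Add(ModuleID moduleId, mdMethodDef methodDef) {
        std::lock_guard<std::mutex> guard(mutex_);
        requested_.insert(std::make_pair(moduleId, methodDef));
    }

    bool Contains(ModuleID moduleId, mdMethodDef methodDef) const {
        std::lock_guard<std::mutex> guard(mutex_);
        return requested_.count(std::make_pair(moduleId, methodDef)) != 0;
    }

private:
    mutable std::mutex mutex_;  // profiler callbacks can arrive on several threads
    std::set<std::pair<ModuleID, mdMethodDef>> requested_;
};

// Returns true when the precompiled (NGEN) image may be accepted,
// i.e. when no rejit has been requested for this method.
inline bool ShouldUseCachedFunction(const RejitRequestTracker& tracker,
                                    ModuleID moduleId, mdMethodDef methodDef) {
    return !tracker.Contains(moduleId, methodDef);
}
```

Rejecting the image only for tracked methods keeps the NGEN benefit for everything else; a method with a pending rejit goes through the JIT, where the rewriting can happen.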

Test coverage

Fixes issue #6124 on my local machine. I expect @andrewlock to run more extensive tests. 😜

Other details

Confirmed that the .NET NGEN test passes without this fix and the r2r tests fail without this fix in this build

@tonyredondo tonyredondo marked this pull request as ready for review October 23, 2024 09:36
@tonyredondo tonyredondo requested a review from a team as a code owner October 23, 2024 09:36
@DataDog DataDog deleted a comment from datadog-ddstaging bot Oct 24, 2024
@DataDog DataDog deleted a comment from andrewlock Oct 24, 2024
@DataDog DataDog deleted a comment from andrewlock Oct 24, 2024
@DataDog DataDog deleted a comment from andrewlock Oct 24, 2024
andrewlock commented Oct 24, 2024

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6184) - mean (71ms)  : 67, 74
     .   : milestone, 71,
    master - mean (71ms)  : 68, 75
     .   : milestone, 71,

    section CallTarget+Inlining+NGEN
    This PR (6184) - mean (1,120ms)  : 1101, 1139
     .   : milestone, 1120,
    master - mean (1,118ms)  : 1096, 1140
     .   : milestone, 1118,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6184) - mean (110ms)  : 107, 113
     .   : milestone, 110,
    master - mean (110ms)  : 107, 114
     .   : milestone, 110,

    section CallTarget+Inlining+NGEN
    This PR (6184) - mean (773ms)  : 758, 788
     .   : milestone, 773,
    master - mean (778ms)  : 762, 793
     .   : milestone, 778,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6184) - mean (94ms)  : 91, 97
     .   : milestone, 94,
    master - mean (94ms)  : 90, 97
     .   : milestone, 94,

    section CallTarget+Inlining+NGEN
    This PR (6184) - mean (732ms)  : 716, 748
     .   : milestone, 732,
    master - mean (733ms)  : 718, 749
     .   : milestone, 733,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6184) - mean (190ms)  : 186, 194
     .   : milestone, 190,
    master - mean (190ms)  : 188, 193
     .   : milestone, 190,

    section CallTarget+Inlining+NGEN
    This PR (6184) - mean (1,200ms)  : 1178, 1222
     .   : milestone, 1200,
    master - mean (1,201ms)  : 1175, 1228
     .   : milestone, 1201,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6184) - mean (278ms)  : 271, 285
     .   : milestone, 278,
    master - mean (276ms)  : 271, 280
     .   : milestone, 276,

    section CallTarget+Inlining+NGEN
    This PR (6184) - mean (945ms)  : 928, 962
     .   : milestone, 945,
    master - mean (949ms)  : 927, 971
     .   : milestone, 949,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6184) - mean (265ms)  : 260, 270
     .   : milestone, 265,
    master - mean (265ms)  : 261, 268
     .   : milestone, 265,

    section CallTarget+Inlining+NGEN
    This PR (6184) - mean (931ms)  : 914, 948
     .   : milestone, 931,
    master - mean (924ms)  : 905, 943
     .   : milestone, 924,


andrewlock commented Oct 24, 2024

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where throughput results for the PR are worse than latest master (5% drop or greater) are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6184) (11.242M)   : 0, 11242276
    master (11.292M)   : 0, 11291882
    benchmarks/2.9.0 (11.081M)   : 0, 11080577

    section Automatic
    This PR (6184) (7.409M)   : 0, 7409015
    master (7.471M)   : 0, 7470618
    benchmarks/2.9.0 (7.732M)   : 0, 7732233

    section Trace stats
    master (7.854M)   : 0, 7853541

    section Manual
    master (11.178M)   : 0, 11177916

    section Manual + Automatic
    This PR (6184) (6.882M)   : 0, 6882049
    master (6.983M)   : 0, 6983283

    section DD_TRACE_ENABLED=0
    master (10.353M)   : 0, 10352817

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6184) (9.396M)   : 0, 9395562
    master (9.759M)   : 0, 9758573
    benchmarks/2.9.0 (9.798M)   : 0, 9798067

    section Automatic
    This PR (6184) (6.597M)   : 0, 6596705
    master (6.447M)   : 0, 6446582

    section Trace stats
    master (6.536M)   : 0, 6536083

    section Manual
    master (9.551M)   : 0, 9551044

    section Manual + Automatic
    This PR (6184) (6.145M)   : 0, 6145325
    master (6.282M)   : 0, 6282382

    section DD_TRACE_ENABLED=0
    master (8.955M)   : 0, 8955266

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6184) (9.944M)   : 0, 9943878
    master (9.819M)   : 0, 9818915
    benchmarks/2.9.0 (10.067M)   : 0, 10067315

    section Automatic
    This PR (6184) (6.413M)   : 0, 6412693
    master (6.429M)   : 0, 6429392
    benchmarks/2.9.0 (7.552M)   : 0, 7552193

    section Trace stats
    master (7.215M)   : 0, 7214864

    section Manual
    master (9.879M)   : 0, 9879197

    section Manual + Automatic
    This PR (6184) (6.181M)   : 0, 6180914
    master (5.890M)   : 0, 5889975

    section DD_TRACE_ENABLED=0
    master (9.240M)   : 0, 9240298


andrewlock commented Oct 24, 2024

Benchmarks Report for tracer 🐌

Benchmarks for #6184 compared to master:

  • 3 benchmarks are faster, with geometric mean 1.178
  • 1 benchmark is slower, with geometric mean 1.136
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with a significance level of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.
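
As a rough illustration of how the figures in this report relate to each other, the sketch below computes a base/diff ratio from two medians, applies the 10% and 0.3 ns thresholds, and takes the geometric mean of a set of ratios. It is a minimal sketch under assumptions: the function names are hypothetical, the exact way the benchmark tooling measures "difference" is not stated in this report, and the Mann–Whitney U test itself is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Ratio of medians; values above 1.0 mean the base run took longer than the diff run.
inline double SpeedRatio(double baseMedianNs, double diffMedianNs) {
    return baseMedianNs / diffMedianNs;
}

// Assumed reading of the thresholds: the slower median must exceed the faster one
// by more than `minRelative` (10%) and by more than `minAbsoluteNs` (0.3 ns).
inline bool ExceedsThresholds(double baseMedianNs, double diffMedianNs,
                              double minRelative = 0.10, double minAbsoluteNs = 0.3) {
    const double slower = std::max(baseMedianNs, diffMedianNs);
    const double faster = std::min(baseMedianNs, diffMedianNs);
    return (slower / faster - 1.0) > minRelative && (slower - faster) > minAbsoluteNs;
}

// Geometric mean of a non-empty list of positive ratios.
inline double GeometricMean(const std::vector<double>& ratios) {
    assert(!ratios.empty());
    double logSum = 0.0;
    for (double r : ratios) {
        logSum += std::log(r);
    }
    return std::exp(logSum / static_cast<double>(ratios.size()));
}
```

With the medians reported for `DbCommandBenchmark.ExecuteNonQuery` on net6.0 (1,413.89 ns and 1,245.02 ns), `SpeedRatio` gives about 1.136, and the geometric mean of the three "faster" ratios (1.136, 1.164, 1.237) comes out at about 1.178, matching the summary.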

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | StartStopWithChild | net6.0 | 7.66μs | 38.2ns | 175ns | 0.0119 | 0.00396 | 0 | 5.42 KB |
| master | StartStopWithChild | netcoreapp3.1 | 9.84μs | 53.7ns | 294ns | 0.0147 | 0.00489 | 0 | 5.62 KB |
| master | StartStopWithChild | net472 | 16.3μs | 49.8ns | 186ns | 1.03 | 0.318 | 0.0896 | 6.06 KB |
| #6184 | StartStopWithChild | net6.0 | 7.69μs | 43.3ns | 303ns | 0.0191 | 0.00764 | 0 | 5.43 KB |
| #6184 | StartStopWithChild | netcoreapp3.1 | 10.1μs | 51.8ns | 231ns | 0.0248 | 0.00991 | 0 | 5.61 KB |
| #6184 | StartStopWithChild | net472 | 16.2μs | 54.1ns | 210ns | 1.02 | 0.321 | 0.0964 | 6.06 KB |

Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | WriteAndFlushEnrichedTraces | net6.0 | 471μs | 231ns | 865ns | 0 | 0 | 0 | 2.7 KB |
| master | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 621μs | 295ns | 1.14μs | 0 | 0 | 0 | 2.7 KB |
| master | WriteAndFlushEnrichedTraces | net472 | 893μs | 330ns | 1.19μs | 0.437 | 0 | 0 | 3.3 KB |
| #6184 | WriteAndFlushEnrichedTraces | net6.0 | 484μs | 351ns | 1.36μs | 0 | 0 | 0 | 2.7 KB |
| #6184 | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 622μs | 118ns | 410ns | 0 | 0 | 0 | 2.7 KB |
| #6184 | WriteAndFlushEnrichedTraces | net472 | 831μs | 369ns | 1.28μs | 0.414 | 0 | 0 | 3.3 KB |

Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | SendRequest | net6.0 | 196μs | 1.1μs | 8.59μs | 0.212 | 0 | 0 | 18.45 KB |
| master | SendRequest | netcoreapp3.1 | 225μs | 1.32μs | 12.4μs | 0.218 | 0 | 0 | 20.61 KB |
| master | SendRequest | net472 | 9.94E‑06ns | 9.94E‑06ns | 3.85E‑05ns | 0 | 0 | 0 | 0 b |
| #6184 | SendRequest | net6.0 | 201μs | 1.25μs | 12.4μs | 0.216 | 0 | 0 | 18.45 KB |
| #6184 | SendRequest | netcoreapp3.1 | 221μs | 1.28μs | 10.1μs | 0.206 | 0 | 0 | 20.61 KB |
| #6184 | SendRequest | net472 | 0ns | 0ns | 0ns | 0 | 0 | 0 | 0 b |

Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | WriteAndFlushEnrichedTraces | net6.0 | 572μs | 2.04μs | 7.37μs | 0.584 | 0 | 0 | 41.52 KB |
| master | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 678μs | 3.36μs | 13.9μs | 0.327 | 0 | 0 | 41.83 KB |
| master | WriteAndFlushEnrichedTraces | net472 | 885μs | 3.04μs | 11.4μs | 8.74 | 2.62 | 0.437 | 53.37 KB |
| #6184 | WriteAndFlushEnrichedTraces | net6.0 | 575μs | 2.35μs | 8.8μs | 0.573 | 0 | 0 | 41.53 KB |
| #6184 | WriteAndFlushEnrichedTraces | netcoreapp3.1 | 687μs | 3.75μs | 20.9μs | 0.334 | 0 | 0 | 41.75 KB |
| #6184 | WriteAndFlushEnrichedTraces | net472 | 898μs | 2.54μs | 9.86μs | 8.42 | 2.66 | 0.443 | 53.36 KB |

Benchmarks.Trace.DbCommandBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #6184

| Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| Benchmarks.Trace.DbCommandBenchmark.ExecuteNonQuery‑net6.0 | 1.136 | 1,413.89 | 1,245.02 | |

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | ExecuteNonQuery | net6.0 | 1.41μs | 1.37ns | 5.29ns | 0.0142 | 0 | 0 | 1.02 KB |
| master | ExecuteNonQuery | netcoreapp3.1 | 1.76μs | 1.52ns | 5.9ns | 0.0141 | 0 | 0 | 1.02 KB |
| master | ExecuteNonQuery | net472 | 2.06μs | 2.19ns | 8.48ns | 0.156 | 0 | 0 | 987 B |
| #6184 | ExecuteNonQuery | net6.0 | 1.24μs | 0.751ns | 2.71ns | 0.0145 | 0 | 0 | 1.02 KB |
| #6184 | ExecuteNonQuery | netcoreapp3.1 | 1.79μs | 1.64ns | 6.34ns | 0.0132 | 0 | 0 | 1.02 KB |
| #6184 | ExecuteNonQuery | net472 | 2.09μs | 1.53ns | 5.93ns | 0.156 | 0 | 0 | 987 B |

Benchmarks.Trace.ElasticsearchBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #6184

| Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch‑net6.0 | 1.164 | 1,251.65 | 1,075.44 | |

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | CallElasticsearch | net6.0 | 1.25μs | 0.447ns | 1.67ns | 0.0138 | 0 | 0 | 976 B |
| master | CallElasticsearch | netcoreapp3.1 | 1.5μs | 0.832ns | 2.88ns | 0.0132 | 0 | 0 | 976 B |
| master | CallElasticsearch | net472 | 2.51μs | 1.42ns | 5.3ns | 0.158 | 0 | 0 | 995 B |
| master | CallElasticsearchAsync | net6.0 | 1.36μs | 0.515ns | 1.93ns | 0.0131 | 0 | 0 | 952 B |
| master | CallElasticsearchAsync | netcoreapp3.1 | 1.58μs | 0.55ns | 2.06ns | 0.0135 | 0 | 0 | 1.02 KB |
| master | CallElasticsearchAsync | net472 | 2.54μs | 1.6ns | 5.99ns | 0.166 | 0 | 0 | 1.05 KB |
| #6184 | CallElasticsearch | net6.0 | 1.08μs | 0.482ns | 1.87ns | 0.0135 | 0 | 0 | 976 B |
| #6184 | CallElasticsearch | netcoreapp3.1 | 1.64μs | 0.685ns | 2.47ns | 0.0131 | 0 | 0 | 976 B |
| #6184 | CallElasticsearch | net472 | 2.61μs | 2.2ns | 8.52ns | 0.157 | 0 | 0 | 995 B |
| #6184 | CallElasticsearchAsync | net6.0 | 1.33μs | 0.468ns | 1.75ns | 0.0134 | 0 | 0 | 952 B |
| #6184 | CallElasticsearchAsync | netcoreapp3.1 | 1.56μs | 0.601ns | 2.17ns | 0.0141 | 0 | 0 | 1.02 KB |
| #6184 | CallElasticsearchAsync | net472 | 2.61μs | 1.71ns | 6.61ns | 0.166 | 0 | 0 | 1.05 KB |

Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | ExecuteAsync | net6.0 | 1.2μs | 0.484ns | 1.68ns | 0.0132 | 0 | 0 | 952 B |
| master | ExecuteAsync | netcoreapp3.1 | 1.61μs | 0.434ns | 1.62ns | 0.0129 | 0 | 0 | 952 B |
| master | ExecuteAsync | net472 | 1.8μs | 0.733ns | 2.64ns | 0.145 | 0 | 0 | 915 B |
| #6184 | ExecuteAsync | net6.0 | 1.17μs | 0.424ns | 1.59ns | 0.0136 | 0 | 0 | 952 B |
| #6184 | ExecuteAsync | netcoreapp3.1 | 1.58μs | 0.602ns | 2.25ns | 0.0126 | 0 | 0 | 952 B |
| #6184 | ExecuteAsync | net472 | 1.78μs | 2.08ns | 8.04ns | 0.145 | 0 | 0 | 915 B |

Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | SendAsync | net6.0 | 4.15μs | 2.37ns | 8.88ns | 0.031 | 0 | 0 | 2.22 KB |
| master | SendAsync | netcoreapp3.1 | 5.09μs | 2.33ns | 8.71ns | 0.0381 | 0 | 0 | 2.76 KB |
| master | SendAsync | net472 | 7.76μs | 1.9ns | 7.35ns | 0.498 | 0 | 0 | 3.15 KB |
| #6184 | SendAsync | net6.0 | 4.26μs | 1.31ns | 5.09ns | 0.0299 | 0 | 0 | 2.22 KB |
| #6184 | SendAsync | netcoreapp3.1 | 4.96μs | 2.7ns | 10.1ns | 0.0373 | 0 | 0 | 2.76 KB |
| #6184 | SendAsync | net472 | 7.89μs | 2.06ns | 7.97ns | 0.496 | 0 | 0 | 3.15 KB |

Benchmarks.Trace.ILoggerBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6184

| Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| Benchmarks.Trace.ILoggerBenchmark.EnrichedLog‑net472 | 1.136 | 2,490.29 | 2,829.06 | |

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | EnrichedLog | net6.0 | 1.55μs | 0.725ns | 2.62ns | 0.0231 | 0 | 0 | 1.64 KB |
| master | EnrichedLog | netcoreapp3.1 | 2.08μs | 0.824ns | 3.08ns | 0.022 | 0 | 0 | 1.64 KB |
| master | EnrichedLog | net472 | 2.49μs | 1.36ns | 5.07ns | 0.249 | 0 | 0 | 1.57 KB |
| #6184 | EnrichedLog | net6.0 | 1.44μs | 0.824ns | 3.19ns | 0.023 | 0 | 0 | 1.64 KB |
| #6184 | EnrichedLog | netcoreapp3.1 | 2.11μs | 0.659ns | 2.38ns | 0.0222 | 0 | 0 | 1.64 KB |
| #6184 | EnrichedLog | net472 | 2.83μs | 1.15ns | 4.46ns | 0.249 | 0 | 0 | 1.57 KB |

Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | EnrichedLog | net6.0 | 116μs | 153ns | 573ns | 0.0583 | 0 | 0 | 4.28 KB |
| master | EnrichedLog | netcoreapp3.1 | 122μs | 86.3ns | 311ns | 0.0606 | 0 | 0 | 4.28 KB |
| master | EnrichedLog | net472 | 153μs | 148ns | 572ns | 0.69 | 0.23 | 0 | 4.46 KB |
| #6184 | EnrichedLog | net6.0 | 116μs | 166ns | 644ns | 0.0576 | 0 | 0 | 4.28 KB |
| #6184 | EnrichedLog | netcoreapp3.1 | 120μs | 279ns | 1.08μs | 0.0595 | 0 | 0 | 4.28 KB |
| #6184 | EnrichedLog | net472 | 152μs | 111ns | 428ns | 0.685 | 0.228 | 0 | 4.46 KB |

Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | EnrichedLog | net6.0 | 3.06μs | 0.767ns | 2.87ns | 0.0307 | 0 | 0 | 2.2 KB |
| master | EnrichedLog | netcoreapp3.1 | 4.22μs | 1.49ns | 5.77ns | 0.0294 | 0 | 0 | 2.2 KB |
| master | EnrichedLog | net472 | 4.89μs | 2.67ns | 10.4ns | 0.32 | 0 | 0 | 2.02 KB |
| #6184 | EnrichedLog | net6.0 | 3.11μs | 2.39ns | 9.26ns | 0.0301 | 0 | 0 | 2.2 KB |
| #6184 | EnrichedLog | netcoreapp3.1 | 4.13μs | 2.08ns | 8.06ns | 0.0289 | 0 | 0 | 2.2 KB |
| #6184 | EnrichedLog | net472 | 5.01μs | 1.65ns | 6.19ns | 0.32 | 0 | 0 | 2.02 KB |

Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | SendReceive | net6.0 | 1.39μs | 0.816ns | 3.16ns | 0.016 | 0 | 0 | 1.14 KB |
| master | SendReceive | netcoreapp3.1 | 1.79μs | 0.844ns | 2.92ns | 0.0156 | 0 | 0 | 1.14 KB |
| master | SendReceive | net472 | 2.18μs | 1.04ns | 4.05ns | 0.183 | 0.00109 | 0 | 1.16 KB |
| #6184 | SendReceive | net6.0 | 1.38μs | 1.15ns | 4.45ns | 0.0159 | 0 | 0 | 1.14 KB |
| #6184 | SendReceive | netcoreapp3.1 | 1.77μs | 0.904ns | 3.5ns | 0.015 | 0 | 0 | 1.14 KB |
| #6184 | SendReceive | net472 | 2.09μs | 1.26ns | 4.87ns | 0.183 | 0 | 0 | 1.16 KB |

Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | EnrichedLog | net6.0 | 2.76μs | 0.913ns | 3.42ns | 0.022 | 0 | 0 | 1.6 KB |
| master | EnrichedLog | netcoreapp3.1 | 3.89μs | 2.25ns | 8.44ns | 0.0215 | 0 | 0 | 1.65 KB |
| master | EnrichedLog | net472 | 4.46μs | 4.67ns | 18.1ns | 0.322 | 0 | 0 | 2.04 KB |
| #6184 | EnrichedLog | net6.0 | 2.64μs | 0.97ns | 3.76ns | 0.0224 | 0 | 0 | 1.6 KB |
| #6184 | EnrichedLog | netcoreapp3.1 | 3.81μs | 1.45ns | 5.23ns | 0.0209 | 0 | 0 | 1.65 KB |
| #6184 | EnrichedLog | net472 | 4.29μs | 3.22ns | 12ns | 0.323 | 0 | 0 | 2.04 KB |

Benchmarks.Trace.SpanBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #6184

| Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 | 1.237 | 490.44 | 396.59 | |

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | StartFinishSpan | net6.0 | 490ns | 0.31ns | 1.2ns | 0.00809 | 0 | 0 | 576 B |
| master | StartFinishSpan | netcoreapp3.1 | 603ns | 0.916ns | 3.3ns | 0.00775 | 0 | 0 | 576 B |
| master | StartFinishSpan | net472 | 683ns | 0.282ns | 1.09ns | 0.0916 | 0 | 0 | 578 B |
| master | StartFinishScope | net6.0 | 493ns | 0.255ns | 0.987ns | 0.00968 | 0 | 0 | 696 B |
| master | StartFinishScope | netcoreapp3.1 | 754ns | 4.12ns | 22.5ns | 0.0093 | 0 | 0 | 696 B |
| master | StartFinishScope | net472 | 902ns | 0.969ns | 3.75ns | 0.104 | 0 | 0 | 658 B |
| #6184 | StartFinishSpan | net6.0 | 397ns | 0.188ns | 0.702ns | 0.00801 | 0 | 0 | 576 B |
| #6184 | StartFinishSpan | netcoreapp3.1 | 545ns | 0.259ns | 0.934ns | 0.00778 | 0 | 0 | 576 B |
| #6184 | StartFinishSpan | net472 | 761ns | 1ns | 3.88ns | 0.0918 | 0 | 0 | 578 B |
| #6184 | StartFinishScope | net6.0 | 541ns | 0.25ns | 0.967ns | 0.00976 | 0 | 0 | 696 B |
| #6184 | StartFinishScope | netcoreapp3.1 | 784ns | 0.538ns | 2.01ns | 0.0092 | 0 | 0 | 696 B |
| #6184 | StartFinishScope | net472 | 904ns | 0.536ns | 2.07ns | 0.104 | 0 | 0 | 658 B |

Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | RunOnMethodBegin | net6.0 | 647ns | 0.516ns | 2ns | 0.00971 | 0 | 0 | 696 B |
| master | RunOnMethodBegin | netcoreapp3.1 | 960ns | 0.623ns | 2.41ns | 0.00958 | 0 | 0 | 696 B |
| master | RunOnMethodBegin | net472 | 1.09μs | 0.976ns | 3.78ns | 0.104 | 0 | 0 | 658 B |
| #6184 | RunOnMethodBegin | net6.0 | 716ns | 0.424ns | 1.53ns | 0.00971 | 0 | 0 | 696 B |
| #6184 | RunOnMethodBegin | netcoreapp3.1 | 950ns | 0.7ns | 2.62ns | 0.00937 | 0 | 0 | 696 B |
| #6184 | RunOnMethodBegin | net472 | 1.17μs | 0.428ns | 1.66ns | 0.104 | 0 | 0 | 658 B |

@andrewlock andrewlock requested review from a team as code owners October 25, 2024 08:34
Snapshots difference summary

The following differences have been observed in committed snapshots. It is meant to help the reviewer.
The diff is simplistic, so please check some files anyway while we improve it.

1 occurrences of :

+    Name: initial,
+    Resource: initial,
+    Service: Samples.ManualInstrumentation,
+    Tags: {
+      env: integration_tests,
+      language: dotnet,
+      runtime-id: Guid_1
+    },
+    Metrics: {
+      process_id: 0,
+      _dd.top_level: 1.0,
+      _dd.tracer_kr: 1.0,
+      _sampling_priority_v1: 1.0
+    }
+  },
+  {
+    TraceId: Id_3,
+    SpanId: Id_4,

1 occurrences of :

-    TraceId: Id_1,
-    SpanId: Id_3,
+    TraceId: Id_3,
+    SpanId: Id_5,
[...]
-    ParentId: Id_2,
+    ParentId: Id_4,
[...]
-    TraceId: Id_1,
-    SpanId: Id_4,
+    TraceId: Id_3,
+    SpanId: Id_6,
[...]
-    ParentId: Id_3,
+    ParentId: Id_5,

1 occurrences of :

-    TraceId: Id_5,
-    SpanId: Id_6,
+    TraceId: Id_7,
+    SpanId: Id_8,

1 occurrences of :

-    TraceId: Id_7,
-    SpanId: Id_8,
+    TraceId: Id_9,
+    SpanId: Id_10,

1 occurrences of :

-    TraceId: Id_7,
-    SpanId: Id_9,
+    TraceId: Id_9,
+    SpanId: Id_11,
[...]
-    ParentId: Id_8,
+    ParentId: Id_10,

1 occurrences of :

-    TraceId: Id_7,
-    SpanId: Id_10,
+    TraceId: Id_9,
+    SpanId: Id_12,
[...]
-    ParentId: Id_9,
+    ParentId: Id_11,

1 occurrences of :

-    TraceId: Id_11,
-    SpanId: Id_12,
+    TraceId: Id_13,
+    SpanId: Id_14,

1 occurrences of :

-    TraceId: Id_13,
-    SpanId: Id_14,
+    TraceId: Id_15,
+    SpanId: Id_16,

1 occurrences of :

-    TraceId: Id_13,
-    SpanId: Id_15,
+    TraceId: Id_15,
+    SpanId: Id_17,
[...]
-    ParentId: Id_14,
+    ParentId: Id_16,
[...]
-    TraceId: Id_16,
-    SpanId: Id_17,
+    TraceId: Id_18,
+    SpanId: Id_19,

1 occurrences of :

-    TraceId: Id_18,
-    SpanId: Id_19,
+    TraceId: Id_20,
+    SpanId: Id_21,

1 occurrences of :

-    TraceId: Id_18,
-    SpanId: Id_20,
+    TraceId: Id_20,
+    SpanId: Id_22,
[...]
-    ParentId: Id_19,
+    ParentId: Id_21,
[...]
-    TraceId: Id_18,
-    SpanId: Id_21,
+    TraceId: Id_20,
+    SpanId: Id_23,
[...]
-    ParentId: Id_20,
+    ParentId: Id_22,

1 occurrences of :

-    TraceId: Id_22,
-    SpanId: Id_23,
+    TraceId: Id_24,
+    SpanId: Id_25,

1 occurrences of :

-    TraceId: Id_24,
-    SpanId: Id_25,
+    TraceId: Id_26,
+    SpanId: Id_27,

1 occurrences of :

-    TraceId: Id_26,
-    SpanId: Id_27,
+    TraceId: Id_28,
+    SpanId: Id_29,

1 occurrences of :

-    TraceId: Id_26,
-    SpanId: Id_28,
+    TraceId: Id_28,
+    SpanId: Id_30,
[...]
-    ParentId: Id_27,
+    ParentId: Id_29,

1 occurrences of :

-    TraceId: Id_26,
-    SpanId: Id_29,
+    TraceId: Id_28,
+    SpanId: Id_31,
[...]
-    ParentId: Id_28,
+    ParentId: Id_30,
[...]
-    TraceId: Id_26,
-    SpanId: Id_30,
+    TraceId: Id_28,
+    SpanId: Id_32,
[...]
-    ParentId: Id_29,
+    ParentId: Id_31,

1 occurrences of :

-    TraceId: Id_31,
-    SpanId: Id_32,
+    TraceId: Id_33,
+    SpanId: Id_34,

1 occurrences of :

-    TraceId: Id_31,
-    SpanId: Id_33,
+    TraceId: Id_33,
+    SpanId: Id_35,
[...]
-    ParentId: Id_32,
+    ParentId: Id_34,
[...]
-    TraceId: Id_31,
-    SpanId: Id_34,
+    TraceId: Id_33,
+    SpanId: Id_36,
[...]
-    ParentId: Id_33,
+    ParentId: Id_35,
[...]
-    TraceId: Id_31,
-    SpanId: Id_35,
+    TraceId: Id_33,
+    SpanId: Id_37,
[...]
-    ParentId: Id_34,
+    ParentId: Id_36,

1 occurrences of :

-    TraceId: Id_36,
-    SpanId: Id_37,
+    TraceId: Id_38,
+    SpanId: Id_39,

1 occurrences of :

-    TraceId: Id_38,
-    SpanId: Id_39,
+    TraceId: Id_40,
+    SpanId: Id_41,

1 occurrences of :

-    TraceId: Id_38,
-    SpanId: Id_40,
+    TraceId: Id_40,
+    SpanId: Id_42,
[...]
-    ParentId: Id_39,
+    ParentId: Id_41,
[...]
-    TraceId: Id_38,
-    SpanId: Id_41,
+    TraceId: Id_40,
+    SpanId: Id_43,
[...]
-    ParentId: Id_40,
+    ParentId: Id_42,
[...]
-    TraceId: Id_38,
-    SpanId: Id_42,
+    TraceId: Id_40,
+    SpanId: Id_44,
[...]
-    ParentId: Id_41,
+    ParentId: Id_43,

1 occurrences of :

-    TraceId: Id_43,
-    SpanId: Id_44,
+    TraceId: Id_45,
+    SpanId: Id_46,

1 occurrences of :

-    TraceId: Id_45,
-    SpanId: Id_46,
+    TraceId: Id_47,
+    SpanId: Id_48,

1 occurrences of :

-    TraceId: Id_45,
-    SpanId: Id_47,
+    TraceId: Id_47,
+    SpanId: Id_49,
[...]
-    ParentId: Id_46,
+    ParentId: Id_48,
[...]
-    TraceId: Id_45,
-    SpanId: Id_48,
+    TraceId: Id_47,
+    SpanId: Id_50,
[...]
-    ParentId: Id_47,
+    ParentId: Id_49,
[...]
-    TraceId: Id_45,
-    SpanId: Id_49,
+    TraceId: Id_47,
+    SpanId: Id_51,
[...]
-    ParentId: Id_48,
+    ParentId: Id_50,

1 occurrences of :

-    TraceId: Id_50,
-    SpanId: Id_51,
+    TraceId: Id_52,
+    SpanId: Id_53,

1 occurrences of :

-    TraceId: Id_52,
-    SpanId: Id_53,
+    TraceId: Id_54,
+    SpanId: Id_55,

@andrewlock andrewlock force-pushed the tony/reject-ngen-images-due-rejitting branch 2 times, most recently from ea180ac to 24e0ee4 Compare October 25, 2024 16:03
@andrewlock andrewlock force-pushed the tony/reject-ngen-images-due-rejitting branch from 24e0ee4 to 1420a2d Compare October 28, 2024 11:46
@andrewlock andrewlock left a comment

LGTM, thanks for this!

@tonyredondo tonyredondo merged commit c76545c into master Oct 28, 2024
80 checks passed
@tonyredondo tonyredondo deleted the tony/reject-ngen-images-due-rejitting branch October 28, 2024 16:45
@github-actions github-actions bot added this to the vNext-v3 milestone Oct 28, 2024
andrewlock added a commit that referenced this pull request Oct 29, 2024
## Summary of changes

Try to make sure the new Ngen tests don't flake

## Reason for change

#6184 added some new tests which use NGEN. Those tests seem to be flaky
(on x86 particularly). AFAICT it's a timeout issue with the `ngen.exe`
execution, not an issue with the underlying implementation.

## Implementation details

- Increase the ngen.exe timeout
- Remove the call to `ngen display` - it was only added for debugging,
and seems to take a _long_ time
- Add the tests to a collection (in .NET FX) so that they don't run in
parallel with other tests

## Test coverage

This is the test
andrewlock added a commit that referenced this pull request Oct 29, 2024
…able (#6154)

## Summary of changes

Before injecting the startup hook, check if the type has an explicit
static constructor. If so, skip injecting.

## Reason for change

This fixes an issue with manual instrumentation in v3 when the
entrypoint contains an explicit static constructor. A simple
reproduction is the following:

```csharp
public static class Program
{
    private static readonly Datadog.Trace.Tracer _tracer = Datadog.Trace.Tracer.Instance;
    static Program()
    {
        _ = _tracer;
    }

    public static void Main(string[] args)
    {
        var builder = WebApplication.CreateBuilder(args);
        var app = builder.Build();

        app.MapGet("/", () => {
            using var scope = _tracer.StartActive("custom-operation-name");
            return "Hello World!";
        });

        app.Run();
    }
}
```

In this example, we inject into the entrypoint `Program.Main()`. We add
the startup hook which initializes the instrumentation. However, the JIT
sees that there is a static constructor and manually inserts a call to a
static field to trigger its execution. This is inserted at the start of
`Program.Main` i.e. _before_ our startup hook.

> Note that this _only_ happens in .NET Core - in .NET Framework the
`.cctor` is re-jitted before the entrypoint, so the problem doesn't
exist. So we limit this implementation to .NET Core to reduce the blast
radius.

In this situation, we don't see the initialization of the tracer, so we
miss the ReJit calls, and the manual instrumentation library ends up in
a half-instrumented state.

This is particularly noticeable for the manual instrumentation case,
because we have a lot of instrumentation, but technically it applies to
anything that is called from a static constructor in the entrypoint.


## Implementation details

When choosing whether to inject the startup hook, check whether the
entrypoint type contains a static constructor. If it does, _and the
constructor is not an implicit constructor_, then skip instrumenting, on
the basis that the JIT will subsequently inject a call to the static
constructor, which we will then instrument, ensuring we _are_ the first
call in the app.

In some cases (i.e. IIS) we're specifically choosing where we want to
instrument
(`System.Web.Compilation.BuildManager.InvokePreStartInitMethods()`) so
we don't try to change anything in that case.

The implicit static constructor is annoying. If there's a static field,
but no explicit static constructor, then the profiling API will return a
static constructor. However, we _shouldn't_ skip on the basis of this
implicit constructor, because `JITCompilationStarted` is never called
for it, so we would end up initializing too late. To detect this, [we
check whether `beforefieldinit` is
set](https://csharpindepth.com/Articles/BeforeFieldInit) to try to
decide whether to rely on the static constructor actually being called
or not.
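
The decision described above could be sketched along these lines. This is only an illustration under stated assumptions: `EntrypointTypeInfo` and `ShouldSkipStartupHookInjection` are hypothetical names, the inputs are assumed to have been read from the metadata already, and the IIS special case is left out. The constant mirrors `tdBeforeFieldInit` from the `CorTypeAttr` enumeration.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors CorTypeAttr::tdBeforeFieldInit; redefined here so the sketch
// compiles without the CLR metadata headers.
constexpr std::uint32_t kTdBeforeFieldInit = 0x00100000;

// Hypothetical summary of what the profiler knows about the entrypoint type.
struct EntrypointTypeInfo {
    bool isNetCore;                // the change is limited to .NET Core
    bool hasStaticConstructor;     // metadata reports a .cctor on the type
    std::uint32_t typeAttributes;  // TypeDef flags of the entrypoint type
};

// Returns true when startup hook injection into the entrypoint should be skipped,
// relying on the explicit static constructor being instrumented instead.
inline bool ShouldSkipStartupHookInjection(const EntrypointTypeInfo& info) {
    if (!info.isNetCore || !info.hasStaticConstructor) {
        return false;
    }
    // beforefieldinit set => the .cctor is treated as implicit (field initializers
    // only), so it is not relied upon and the entrypoint is still injected.
    const bool isImplicit = (info.typeAttributes & kTdBeforeFieldInit) != 0;
    return !isImplicit;
}
```

The C# compiler marks a type `beforefieldinit` when it has static field initializers but no explicit static constructor, which is why the flag serves as the signal here.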

There's obviously a risk with all this; causing things to initialize in
the wrong order is a perennial problem. If for some (strange) reason the
static constructor is _not_ invoked next, then we may end up injecting
somewhere else, which would be... strange... but also Bad™😅 At least
we're not doing this behavior in .NET FX which is often the source of
weirdness!

## Test coverage

Added a reproduction to `Samples.ManualInstrumentation`. Without the
change to startup hook injection the tests fail at
`ThrowIf(string.IsNullOrEmpty(Tracer.Instance.DefaultServiceName))`
(because it _is_ empty) and also at `Tracer.Instance.StartActive()`

## Other details

Discovered this behaviour while trying to solve #6124
#6184 actually resolved the issue, so this is just an edge case