Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable QJFL and OSR by default for x64 and arm64 #63642

Closed

Conversation

AndyAyersMS
Copy link
Member

Change these default values when the jit targets x64 or arm64:

  • COMPlus_TC_QuickJitForLoops=1
  • COMPlus_TC_OnStackReplacement=1

The upshot is that on x64/arm64 more methods will be jitted at Tier0,
and we will rely on OSR to get out of long-running Tier0 methods.

Other architectures continue to use the old behavior for now, as
OSR is not yet supported for x86 or arm.

Change these default values when the jit targets x64 or arm64:

* COMPlus_TC_QuickJitForLoops=1
* COMPlus_TC_OnStackReplacement=1

The upshot is that on x64/arm64 more methods will be jitted at Tier0,
and we will rely on OSR to get out of long-running Tier0 methods.

Other architectures continue to use the old behavior for now, as
OSR is not yet supported for x86 or arm.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 11, 2022
@ghost ghost assigned AndyAyersMS Jan 11, 2022
@ghost
Copy link

ghost commented Jan 11, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Change these default values when the jit targets x64 or arm64:

  • COMPlus_TC_QuickJitForLoops=1
  • COMPlus_TC_OnStackReplacement=1

The upshot is that on x64/arm64 more methods will be jitted at Tier0,
and we will rely on OSR to get out of long-running Tier0 methods.

Other architectures continue to use the old behavior for now, as
OSR is not yet supported for x86 or arm.

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@dotnet/jit-contrib FYI -- not yet ready to merge this, but need a PR to kick off a private perf run.

@AndyAyersMS
Copy link
Member Author

Merged to pick up stress fixes from #62980 and hopefully better CI health.

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr pgo

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr libraries-pgo

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@AndyAyersMS
Copy link
Member Author

Looks like the various extra test legs are not running cleanly, so expect some failures there.

@AndyAyersMS
Copy link
Member Author

Lots of arm64 failures, but suspect it's a bad machine (DDARM64-131)

baseservices/threading/regressions/13662 may be a real jitstress failure (w/osr...)

Assert failure(PID 8956 [0x000022fc], Thread: 12108 [0x2f4c]): !CREATE_CHECK_STRING(!"Detected use of a corrupted OBJECTREF. Possible GC hole.")

CORECLR! Object::ValidateInner'::1'::catch$12 + 0x12C (0x00007ff8b41944cc) CORECLR! CallSettingFrame + 0x68 (0x00007ff8b3bc2ec0)
CORECLR! _FrameHandler3::CxxCallCatchBlock + 0x130 (0x00007ff8b40f7110) NTDLL! RtlRestoreContext + 0xEC (0x00007ff8ea4e368c)
CORECLR! Object::ValidateInner + 0x168 (0x00007ff8b3cbf1d0) CORECLR! Object::Validate + 0xC0 (0x00007ff8b3cbf040)
CORECLR! GcInfoDecoder::ReportStackSlotToGC + 0x138 (0x00007ff8b3eed8c0) CORECLR! GcInfoDecoder::ReportSlotToGC + 0x1F8 (0x00007ff8b3eed768)
CORECLR! GcInfoDecoder::EnumerateLiveSlots + 0x61C (0x00007ff8b3eebd2c) CORECLR! EECodeManager::EnumGcRefs + 0x2CC (0x00007ff8b3c300ec)
File: D:\a_work\1\s\src\coreclr\vm\object.cpp Line: 591
Image: D:\h\w\AE9609A4\p\corerun.exe

Return code:      1
Raw output file:      D:\h\w\AE9609A4\w\B15D099F\uploads\Reports\baseservices.threading\regressions\13662\13662-a\13662-a.output.txt
Raw output:
BEGIN EXECUTION
"D:\h\w\AE9609A4\p\corerun.exe" -p "System.Reflection.Metadata.MetadataUpdater.IsSupported=false"  13662-a.dll
Loop 1
Loop 2
Loop 3
Loop 4
Loop 5
Loop 6
Loop 7
Loop 8
Loop 9
Loop 10
Loop -1692357495
Expected: 100

System.Text.Json.Serialization.Tests.StreamTests_Sync.HandleCollectionsAsync may be a real failure in libraries PGO

  Discovering: System.Text.Json.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Text.Json.Tests (found 2808 of 2866 test cases)
  Starting:    System.Text.Json.Tests (parallel test collections = on, max threads = 4)
    System.Text.Json.Serialization.Tests.StreamTests_Sync.HandleCollectionsAsync [FAIL]
      Assert.Equal() Failure
                                       ↓ (pos 12838)
      Expected: ···eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee···
      Actual:   ···eeeeeeeeeeeeeeeeeeeettttttttttttttttttttttttttttttttttttttttt···
                                       ↑ (pos 12838)
      Stack Trace:
        /_/src/libraries/System.Text.Json/tests/Common/JsonTestHelper.cs(61,0): at System.Text.Json.JsonTestHelper.AssertJsonEqual(JsonElement expected, JsonElement actual)
        /_/src/libraries/System.Text.Json/tests/Common/JsonTestHelper.cs(45,0): at System.Text.Json.JsonTestHelper.AssertJsonEqual(JsonElement expected, JsonElement actual)
        /_/src/libraries/System.Text.Json/tests/Common/JsonTestHelper.cs(55,0): at System.Text.Json.JsonTestHelper.AssertJsonEqual(JsonElement expected, JsonElement actual)
        /_/src/libraries/System.Text.Json/tests/Common/JsonTestHelper.cs(45,0): at System.Text.Json.JsonTestHelper.AssertJsonEqual(JsonElement expected, JsonElement actual)
        /_/src/libraries/System.Text.Json/tests/Common/JsonTestHelper.cs(20,0): at System.Text.Json.JsonTestHelper.AssertJsonEqual(String expected, String actual)
        /_/src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Stream.Collections.cs(65,0): at System.Text.Json.Serialization.Tests.StreamTests.PerformSerialization[TElement](Object obj, Type type, JsonSerializerOptions options)
        /_/src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Stream.Collections.cs(54,0): at System.Text.Json.Serialization.Tests.StreamTests.RunTestAsync[TElement]()
        /_/src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Stream.Collections.cs(25,0): at System.Text.Json.Serialization.Tests.StreamTests.HandleCollectionsAsync()
        --- End of stack trace from previous location ---
  Finished:    System.Text.Json.Tests

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Jan 20, 2022

I can repro the System.Text.Json.Serialization.Tests.StreamTests_Sync.HandleCollectionsAsync failure but it's intermittent; have not yet pinned down where things go wrong, or how it's related to QJFL/OSR.

On my RPI4, I can repro this about 10% of the time; each run takes ~400s currently, so no insights yet. There is a second failure symptom that crops up too:

    System.Text.Json.Serialization.Tests.StreamTests_Sync.HandleCollectionsAsync [FAIL]
      System.Text.Json.JsonReaderException : '0x00' is invalid within a JSON string. The string should be correctly escaped. LineNumber: 9 | BytePositionInLine: 12573.
      Stack Trace:
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/ThrowHelper.cs(279,0): at System.Text.Json.ThrowHelper.ThrowJsonReaderException(Utf8JsonReader& json, ExceptionResource resource, Byte nextByte, ReadOnlySpan`1 bytes)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs(1365,0): at System.Text.Json.Utf8JsonReader.ConsumeStringAndValidate(ReadOnlySpan`1 data, Int32 idx)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs(1293,0): at System.Text.Json.Utf8JsonReader.ConsumeString()
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs(1036,0): at System.Text.Json.Utf8JsonReader.ConsumeValue(Byte marker)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs(1801,0): at System.Text.Json.Utf8JsonReader.ConsumeNextToken(Byte marker)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs(1689,0): at System.Text.Json.Utf8JsonReader.ConsumeNextTokenOrRollback(Byte marker)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs(884,0): at System.Text.Json.Utf8JsonReader.ReadSingleSegment()
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs(272,0): at System.Text.Json.Utf8JsonReader.Read()
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.cs(961,0): at System.Text.Json.JsonDocument.Parse(ReadOnlySpan`1 utf8JsonSpan, JsonReaderOptions readerOptions, MetadataDb& database, StackRowStack& stack)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.Parse.cs(692,0): at System.Text.Json.JsonDocument.Parse(ReadOnlyMemory`1 utf8Json, JsonReaderOptions readerOptions, Byte[] extraRentedArrayPoolBytes, PooledByteBufferWriter extraPooledByteBufferWriter)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.Parse.cs(261,0): at System.Text.Json.JsonDocument.Parse(ReadOnlyMemory`1 json, JsonDocumentOptions options)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonDocument.Parse.cs(321,0): at System.Text.Json.JsonDocument.Parse(String json, JsonDocumentOptions options)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/tests/Common/JsonTestHelper.cs(19,0): at System.Text.Json.JsonTestHelper.AssertJsonEqual(String expected, String actual)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Stream.Collections.cs(65,0): at System.Text.Json.Serialization.Tests.StreamTests.PerformSerialization[TElement](Object obj, Type type, JsonSerializerOptions options)
        /home/andy/repos/runtime/src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Stream.Collections.cs(53,0): at System.Text.Json.Serialization.Tests.StreamTests.RunTestAsync[TElement]()
        /home/andy/repos/runtime/src/libraries/System.Text.Json/tests/System.Text.Json.Tests/Serialization/Stream.Collections.cs(23,0): at System.Text.Json.Serialization.Tests.StreamTests.HandleCollectionsAsync()

Via DumpJittedMethods there are maybe 100 OSR instances created, so it could very well be OSR related.

Various attempts at getting this to repro faster and more consistently haven't panned out yet:

  • Tried reproing with release builds: no luck.
  • Tried narrowing things down to the failing test and passing the test assembly 32x on command line: no luck

Will see if enabling OSR stress gets us anywhere.

@AndyAyersMS
Copy link
Member Author

From comparing methods jitted in the passing and the failing runs, if it is OSR, it is likely one of these 4 methods:

  • System.Collections.Concurrent.ConcurrentDictionary`2[__Canon,__Canon][System.__Canon,System.__Canon]::GrowTable, IL size = 515, hash=0xaecb51f2 Tier1-OSR 0x16a
  • System.Text.Json.JsonWriterHelper::ToUtf8, IL size = 1025, hash=0xc84b6466 Tier1-OSR 0x67
  • System.Text.Json.Serialization.Converters.IEnumerableDefaultConverter`2[__Canon,__Canon][System.__Canon,System.__Canon]::OnWriteResume, IL size = 148, hash=0xa505af5a Tier1-OSR 0x46
  • System.Text.Json.Serialization.Converters.ListOfTConverter`2[__Canon,Int32][System.__Canon,System.Int32]::OnWriteResume, IL size = 169, hash=0xe34a3d49 Tier1-OSR 0x39

@AndyAyersMS
Copy link
Member Author

Via OSR stress, this method seems to cause problems:

  • System.Text.Json.Serialization.Converters.ObjectWithParameterizedConstructorConverter`1[__Canon][System.__Canon]::ReadConstructorArgumentsWithContinuation, IL size = 192, hash=0x065be1b8 Tier1-OSR 0x0

@AndyAyersMS
Copy link
Member Author

Think I may have spotted one issue. If the OSR method needs to report generic context, it can sometimes use the context reported by the Tier0 frame, but if that frame didn't report the context, the OSR method needs to report it on the OSR frame.

But in CodeGen::genReportGenericContextArg the OSR method was always assuming the Tier0 method had taken care of this; this diverged from the logic in lvaAssignVirtualFrameOffsetsToLocals. So we had an OSR method that setup a slot on its frame and reported it to GC but never initialized the slot.

On x64 perhaps we get lucky, and that slot tends to be zeroed which is why we didn't see failures? At any rate these two bits of logic need to agree. Testing a fix now.

@AndyAyersMS
Copy link
Member Author

Still a bit more work to do on OSR to fix x64 epilog unwind, so am going to close this.

@AndyAyersMS AndyAyersMS closed this Feb 2, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Feb 23, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants