Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Insert a breakpoint instruction after throwing calls #101037

Closed
wants to merge 3 commits into from

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Apr 15, 2024

A follow up for #100900

See discussion at: #100900 (comment)

Instead of inserting a breakpoint at the end of BBJ_THROW blocks that end on a throwing call, we insert a breakpoint after throwing calls.

The change should end up with roughly the same number of added breakpoint instructions, since a BBJ_THROW nearly always ends with a throwing call anyways. However, this approach would be more robust in rare situations when a throwing call is followed by some extra code, which may potentially be removed at later phases.

@VSadov VSadov added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 15, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@VSadov VSadov marked this pull request as ready for review April 16, 2024 20:01
@VSadov
Copy link
Member Author

VSadov commented Apr 16, 2024

A fairly trivial change - mostly moving insertion of BRK from one place to another which should result in covering a few places that older approach missed.

Somehow it causes crossgen2 to assert and then hang when building clr subset:

  Opening /home/vsadov/dotnet002/runtime/artifacts/mibc/linux/arm/data/DotNet_Adhoc.mibc
  Opening /home/vsadov/dotnet002/runtime/artifacts/mibc/linux/arm/data/DotNet_FirstTimeXP.mibc
  Opening /home/vsadov/dotnet002/runtime/artifacts/mibc/linux/arm/data/DotNet_FSharp.mibc
  Opening /home/vsadov/dotnet002/runtime/artifacts/mibc/linux/arm/data/DotNet_HelloWorld.mibc
  Opening /home/vsadov/dotnet002/runtime/artifacts/mibc/linux/arm/data/DotNet_OrchardCore.mibc
  Opening /home/vsadov/dotnet002/runtime/artifacts/mibc/linux/arm/data/DotNet_TechEmpower.mibc
  Generated /home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/StandardOptimizationData.mibc
  Validated /home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/StandardOptimizationData.mibc
  Generating native image of System.Private.CoreLib for linux.arm.Checked. Logging to
  /home/vsadov/dotnet002/runtime/dotnet.sh /home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/arm64/crossgen2/tools/crossgen2.dll -o:/home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/System.Private.CoreLib.dll -r:/home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/IL/*.dll --targetarch:arm --targetos:linux -m:/home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/StandardOptimizationData.mibc --embed-pgo-data -O --verify-type-and-field-layout /home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/IL/System.Private.CoreLib.dll --perfmap-format-version:1 --perfmap --perfmap-path:/home/vsadov/dotnet002/runtime/artifacts/bin/coreclr/linux.arm.Checked/
  /home/vsadov/dotnet002/runtime/src/coreclr/jit/emit.h:2406
  Assertion failed 'offset < (emitTotalHotCodeSize + emitTotalColdCodeSize)' in 'System.Threading.EventWaitHandle:TryOpenExisting(System.String,byref):ubyte' during 'Emit code' (IL size 11; hash 0xae6c977d; FullOpts)

  /home/vsadov/dotnet002/runtime/src/coreclr/jit/emit.h:2406
  Assertion failed 'offset < (emitTotalHotCodeSize + emitTotalColdCodeSize)' in 'System.Threading.Semaphore:TryOpenExisting(System.String,byref):ubyte' during 'Emit code' (IL size 11; hash 0x32a77fca; FullOpts)

If crossgen is disabled, product builds and tests run.
I suspect, perhaps it is something with applying .mibs

@VSadov
Copy link
Member Author

VSadov commented Apr 16, 2024

The failures are indeed around places where we could pick a few cases not covered before.

EventWaitHandle.TryOpenExisting

        [SupportedOSPlatform("windows")]
        public static bool TryOpenExisting(string name, [NotNullWhen(true)] out EventWaitHandle? result) =>
            OpenExistingWorker(name, out result!) == OpenExistingResult.Success;

and OpenExistingWorker throws PNSE on Unix instead of returning a value

        private static OpenExistingResult OpenExistingWorker(string name, out EventWaitHandle? result)
        {
            throw new PlatformNotSupportedException(SR.PlatformNotSupported_NamedSynchronizationPrimitives);
        }

Since the throwing call is an argument to a comparison, it is very possible that it had some dead code after the call and was not seen by the previous approach, since the call is not at the end of the block.

But weird thing is that this does not impact normal execution of tests and seems to only impact crossgen.
Crossgen has extra inputs that are not specific to the code being built, so I think maybe this is some disconnect between expected/actual.

@@ -394,7 +394,7 @@
Test="true" Category="tools"/>
</ItemGroup>

<ItemGroup Condition="$(_subset.Contains('+clr.nativecorelib+'))">
<ItemGroup Condition="$(_subset.Contains('+clr.nativecorelib+')) and '$(TargetArchitecture)' != 'arm'">
Copy link
Member Author

@VSadov VSadov Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A temporary workaround for the crossgen assert/hang.
Maybe there is a better solution...

@AndyAyersMS
Copy link
Member

Did you look into breaking trees with no return calls and discarding anything that comes later? Would be nice if there weren't special cases like this.

@VSadov
Copy link
Member Author

VSadov commented Apr 16, 2024

Did you look into breaking trees with no return calls and discarding anything that comes later? Would be nice if there weren't special cases like this.

From the discussion #100900 (comment) it seemed that there are cases where this situation cannot be easily fixed and adding a breakpoint after calls is a more robust strategy.

@VSadov
Copy link
Member Author

VSadov commented Apr 16, 2024

I agree that removing dead code is a good thing on its own though. Dead code could be a nuisance, since it may not conform with general invariants and in the end is not needed, so removing it is often better than figuring how to deal with situations that arise.

I have seen code that tries to remove everything in a block after a throwing call, but it looks like it does not always work. Perhaps some cases materialize too late.

@AndyAyersMS
Copy link
Member

I have seen code that tries to remove everything in a block after a throwing call, but it looks like it does not always work. Perhaps some cases materialize too late.

Morph does this, but at statement level granularity, so if the throwing call is a subtree it may leave stuff up above sometimes. It may be painful to clean up in tree form. Once we get to linear form it should be simpler to get rid of whatever comes after the call.

@VSadov
Copy link
Member Author

VSadov commented Apr 16, 2024

Once we get to linear form it should be simpler to get rid of whatever comes after the call.

That sounds like just stop emitting the block (and perhaps wiping the rest) after seeing a no-return call. Is that the idea?

@AndyAyersMS
Copy link
Member

Once we get to linear form it should be simpler to get rid of whatever comes after the call.

That sounds like just stop emitting the block after seeing a no-return call. Is that the idea?

More or less, yes.

There may be a bit more to it than that, if the throwing call's use has other uses you may need to mark them as unused or something.

@VSadov
Copy link
Member Author

VSadov commented Apr 16, 2024

Would be an interesting thing to try.

But we will still insert BRK in CodeGen::genCall, right?

One reason for this could be that CodeGen::genCall may try to emit more instructions after the call - seems like it may try to swap return registers between ABI and expected, there is something SIMD related, etc... Not sure how much of that could be possibly applicable to a no-return call, but seems like some of that might.

Even if we insert BRK in codegen, CodeGen::genCall would need to know to ensure the call instruction is the last one if the call is no-return. Then why not just insert break point there and be sure that nothing can sneak in between.

@VSadov
Copy link
Member Author

VSadov commented Apr 16, 2024

Then why not just insert break point there and be sure that nothing can sneak in between.

Actually that is not very important. The extra instructions could be redundant, but if the breakpoint comes after, it is still ok.

@VSadov
Copy link
Member Author

VSadov commented Apr 17, 2024

Interesting failure. We morph binary operators that are statically overflowing into comma(throw(OVF), 0) .

tree = gtNewOperNode(GT_COMMA, tree->TypeGet(), op1, op2);

It looks like 0 literals can consume temps, so if I just drop the rest of the block, I get

Assertion failed 'tmpGetCount == 0' in 'ILGEN_0xc2f3d0c8:Method_0x5cc2(double):int'

Is there a way to manufacture a value that would not use any kind of temp?
Assuming that we know that we will not emit it.

Alternatively - maybe we can tell that this instance of CORINFO_HELP_OVERFLOW actually returns nonvoid type and use it as-is, not wrap it in a comma.
(I tried, but got some asserts about unmorphed nodes, does it have to be comma?)

Also the problem with temps could be more general than just this case. I am not sure yet.

@jakobbotsch
Copy link
Member

Can you share a jitdump? It sounds to me like the call node has a user if it is ending up with a temp, and that the 0 there is unrelated. How are you removing the nodes after the call? It should be simple to do that at the end of LowerCall for the no-return calls, with code similar to this:
https://github.com/dotnet/runtimelab/blob/e69dda51c7d796b8122d0f55b560bc44094a4bec/src/coreclr/jit/lower.cpp#L4720-L4723

@VSadov
Copy link
Member Author

VSadov commented Jun 25, 2024

I think this is obsolete now. We do the same as a result of interruptibility change which inserts breakpoints in these locations.

@VSadov VSadov closed this Jun 25, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants