Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the MacOS remote unwinder for VS4Mac #63405

Merged
merged 2 commits into from
Jan 13, 2022

Conversation

mikem8361
Copy link
Member

@mikem8361 mikem8361 commented Jan 5, 2022

Customer Impact

The VS4Mac team reported that native frames missing crashreport.json causing problems with their Watson analysis.

The wrong module was being passed to the remote unwinder because the load bias for shared modules
was being calculated incorrectly.

The arm64 remote unwinding that createdump uses was very broken. Fixed various issues (see the below commit for details).

Issue: #63309

Testing

Local testing. Ran diagnostics tests against this change. Will get VS4Mac team to verify after this PR's build is finished.

Risk

Low. The changes are restricted to MacOS arm64 with a couple of low risk exceptions. Part of this change undo's a change/regression from PR #61063.

@mikem8361 mikem8361 self-assigned this Jan 5, 2022
@mikem8361 mikem8361 added this to the 6.0.x milestone Jan 5, 2022
@ghost
Copy link

ghost commented Jan 5, 2022

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

Customer Impact

The VS4Mac team reported that native frames missing crashreport.json causing problems with their Watson analysis.

The wrong module was being passed to the remote unwinder because the load bias for shared modules
was being calculated incorrectly.

Issue: #63309

Testing

Local testing. Ran diagnostics tests against this change. Will get VS4Mac team to verify after this PR's build is finished.

Risk

Low. It undo's a change/regression from PR #61063.

Author: mikem8361
Assignees: mikem8361
Labels:

area-Diagnostics-coreclr

Milestone: -

Copy link
Member

@jeffschwMSFT jeffschwMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. We should take for consideration in 6.0.x

@mikem8361
Copy link
Member Author

/cc: @kdubau

Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@leecow leecow added the Servicing-consider Issue for next servicing release review label Jan 5, 2022
@kdubau
Copy link
Member

kdubau commented Jan 6, 2022

Talked to Mike offline, it looks fixed for x64 but not for Arm64.

@leecow leecow added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Jan 6, 2022
@leecow leecow modified the milestones: 6.0.x, 6.0.2 Jan 6, 2022
@jeffschwMSFT
Copy link
Member

@mikem8361 is this ready to merge?

@mikem8361
Copy link
Member Author

There is still this issue on arm64 and have a fix that I'm testing. Can we wait until this afternoon?

@tommcdon tommcdon added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jan 6, 2022
@mikem8361
Copy link
Member Author

mikem8361 commented Jan 6, 2022

We should go ahead and merge. The arm64 changes are taking longer than I thought and will require more testing. I'll create another PR.

@mikem8361 mikem8361 removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jan 6, 2022
@jeffschwMSFT
Copy link
Member

Checking with @leecow / @mmitche if merging this or waiting for the additional commit is better. I think we are going to cherry pick the fix, so it might be ok to just hold and reassess tomorrow.

@leecow
Copy link
Member

leecow commented Jan 6, 2022

Let's hold until all of the fixes are in hand.

@mikem8361 mikem8361 added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jan 6, 2022
@safern
Copy link
Member

safern commented Jan 11, 2022

@mikem8361 @mmitche are we taking this in to release/6.0 still or should we retarget to the vs4mac branch?

@jeffschwMSFT
Copy link
Member

We have another set of fixes that we plan on brining along with this one. It is pending validation and porting to this branch.

The wrong module was being passed to the remote unwinder because the load bias for shared modules
was being calculated incorrectly.

Issue: dotnet#63309
From PR in main: dotnet#63598

Add arm64 version of StepWithCompactNoEncoding for syscall leaf node wrappers that have compact encoding of 0.

Fix ReadCompactEncodingRegister so it actually decrements the addr.

Change StepWithCompactEncodingArm64 to match what MacOS libunwind does for framed and frameless stepping.

arm64 can have frames with the same SP (but different IPs). Increment SP for this condition so createdump's unwind
loop doesn't break out on the "SP not increasing" check and the frames are added to the thread frame list in the
correct order.

Add getting the unwind info for tail called functions like this:

__ZL14PROCEndProcessPvji:
   36630:       f6 57 bd a9     stp     x22, x21, [sp, #-48]!
   36634:       f4 4f 01 a9     stp     x20, x19, [sp, dotnet#16]
   36638:       fd 7b 02 a9     stp     x29, x30, [sp, dotnet#32]
   3663c:       fd 83 00 91     add     x29, sp, dotnet#32
...
   367ac:       e9 01 80 52     mov     w9, dotnet#15
   367b0:       7f 3e 02 71     cmp     w19, dotnet#143
   367b4:       20 01 88 1a     csel    w0, w9, w8, eq
   367b8:       2e 00 00 94     bl      _PROCAbort
_TerminateProcess:
-> 367bc:       22 00 80 52     mov     w2, dotnet#1
   367c0:       9c ff ff 17     b       __ZL14PROCEndProcessPvji

The IP (367bc) returns the (incorrect) frameless encoding with nothing on the stack (uses an incorrect LR to unwind). To fix this
get the unwind info for PC -1 which points to PROCEndProcess with the correct unwind info. This matches how lldb unwinds this frame.

Always address module segment to IP lookup list instead of checking the module regions.

Strip pointer authentication bits on PC/LR.
// are absolute address.
if (segment->fileoff == 0 && segment->filesize > 0)
// segment to get the actual address.
if (strcmp(segment->segname, SEG_TEXT) == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any guarantee that the first segment is the text segment, and that there's only one ever (the spec seems to imply that there is always at least one, but nothing about there being multiple or the ordering)? Otherwise we may get here more than once or have a wrong offset.

Also, what happens with the bias of data segments? Granted, this was already never considered if the first segment was always text.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is only __TEXT segment and it always seems to be the first one.

Copy link
Member

@hoyosjs hoyosjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is not a concern, LGTM

@kdubau
Copy link
Member

kdubau commented Jan 13, 2022

I verified the latest from Mike on both x64 and Arm64 and it looks like the issue is resolved. 👍

Copy link
Member

@jeffschwMSFT jeffschwMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. We should consider for 6.0.x

@jeffschwMSFT jeffschwMSFT removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jan 13, 2022
@jeffschwMSFT
Copy link
Member

@leecow and @mmitche this change is now ready. Is the January release still open for this change?

@leecow
Copy link
Member

leecow commented Jan 13, 2022

Let's discuss in Tactics this morning.

@jeffschwMSFT jeffschwMSFT merged commit 3c2d2a1 into dotnet:release/6.0 Jan 13, 2022
@safern
Copy link
Member

safern commented Jan 13, 2022

/backport to release/6.0-vs4mac

@github-actions
Copy link
Contributor

Started backporting to release/6.0-vs4mac: https://github.com/dotnet/runtime/actions/runs/1694232125

@mikem8361 mikem8361 deleted the fixunwind60 branch January 20, 2022 19:13
@ghost ghost locked as resolved and limited conversation to collaborators Feb 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants