Native frames missing crashreport.json #63309

Closed
kdubau opened this issue Jan 3, 2022 · 4 comments

@kdubau
Member

kdubau commented Jan 3, 2022

Description

The crashreport.json file seems to be missing all native frames up until the first managed frame, except for the very first one, which is missing the unmanaged_name field.

[Image: MicrosoftTeams-image]

Reproduction Steps

Enable the core dump and the crash report, then induce a crash. Compare the JSON with what LLDB shows for clrstack -f.
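
For readers reproducing this, a minimal sketch of the environment setup follows. It assumes the documented .NET runtime settings DOTNET_DbgEnableMiniDump, DOTNET_EnableCrashReport and DOTNET_DbgMiniDumpName; the helper name crash_report_env and the app name in the comment are hypothetical and are not taken from this issue.

```python
import os


def crash_report_env(base_env=None, dump_path=None):
    """Return a copy of the environment with dump and crash report collection enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["DOTNET_DbgEnableMiniDump"] = "1"   # write a core dump when the process crashes
    env["DOTNET_EnableCrashReport"] = "1"   # also write the JSON crash report
    if dump_path is not None:
        env["DOTNET_DbgMiniDumpName"] = dump_path  # optional: where the dump is written
    return env


# Hypothetical usage (MyCrashingApp.dll is a placeholder):
#   import subprocess
#   subprocess.run(["dotnet", "MyCrashingApp.dll"], env=crash_report_env())
```

The crash report can then be compared frame by frame against the clrstack -f output in LLDB.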

Expected behavior

All stack frames should be in the json report.

Actual behavior

Missing frames.

Regression?

Yes, introduced with this PR #60995

Known Workarounds

None

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jan 3, 2022
@mikem8361 mikem8361 self-assigned this Jan 3, 2022
@mikem8361 mikem8361 added this to the 6.0.x milestone Jan 3, 2022
@mikem8361 mikem8361 added area-Diagnostics-coreclr and removed untriaged New issue has not been triaged by the area owner labels Jan 3, 2022
@ghost

ghost commented Jan 3, 2022

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.


Author: kdubau
Assignees: mikem8361
Labels: area-Diagnostics-coreclr

Milestone: 6.0.x

@kdubau
Member Author

kdubau commented Jan 5, 2022

@mikem8361 I also just noticed, if you look at the first frame in the json, it's tagged "is_managed": "true" but it's a native frame.
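
A rough way to spot both symptoms in a report is sketched below. It assumes only the two fields named in this thread (unmanaged_name and the string-valued is_managed); the function name, the flat list of frame dictionaries, and the known_native argument (indexes confirmed as native by hand, for example in LLDB) are hypothetical.

```python
def suspicious_frames(frames, known_native=()):
    """Return (index, reason) pairs for frames that look wrong.

    frames: list of dicts, one per stack frame, as read from the report.
    known_native: indexes the reader has confirmed to be native frames.
    """
    problems = []
    for i, frame in enumerate(frames):
        # The report stores the flag as a string ("true"), not a JSON boolean.
        managed = str(frame.get("is_managed", "")).lower() == "true"
        if managed and i in known_native:
            problems.append((i, "native frame tagged is_managed"))
        elif not managed and not frame.get("unmanaged_name"):
            problems.append((i, "native frame without unmanaged_name"))
    return problems
```

Missing frames cannot be detected from the report alone; that still needs the comparison against clrstack -f.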

mikem8361 added a commit to mikem8361/runtime that referenced this issue Jan 13, 2022
The wrong module was being passed to the remote unwinder because the load bias for shared modules
was being calculated incorrectly.

Issue: dotnet#63309
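
As background on why the load bias matters, here is a minimal sketch of the general idea, not the actual createdump fix. It assumes the load bias is the difference between where a module's text segment is mapped and the address recorded for it in the image file; the Module class, its fields, and the sample addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    load_address: int       # where the text segment is mapped in the process
    preferred_address: int  # address recorded for that segment in the image file
    size: int

    @property
    def load_bias(self):
        # Offset to subtract from a runtime address to get a file-relative one.
        return self.load_address - self.preferred_address

    def contains(self, ip):
        return self.load_address <= ip < self.load_address + self.size

    def to_file_address(self, ip):
        return ip - self.load_bias


def find_module(modules, ip):
    """Return the module whose mapped range contains ip, or None."""
    for module in modules:
        if module.contains(ip):
            return module
    return None
```

If the bias is computed wrongly, either the lookup picks the wrong module or the unwinder searches that module's unwind tables at the wrong offset.
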
jeffschwMSFT pushed a commit that referenced this issue Jan 13, 2022
* Fix the MacOS remote unwinder for VS4Mac

The wrong module was being passed to the remote unwinder because the load bias for shared modules
was being calculated incorrectly.

Issue: #63309

* Fix native frame unwind in syscall on arm64 for VS4Mac crash report

From PR in main: #63598

Add arm64 version of StepWithCompactNoEncoding for syscall leaf node wrappers that have compact encoding of 0.

Fix ReadCompactEncodingRegister so it actually decrements the addr.

Change StepWithCompactEncodingArm64 to match what MacOS libunwind does for framed and frameless stepping.
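
For orientation, framed and frameless stepping on arm64 can be sketched roughly as follows. The mode and stack-size masks are the arm64 values from the Mach-O compact unwind encoding header; everything else (the register dictionary, the read64 callback, and treating an encoding of 0 as a frameless step with no stack adjustment) is an assumption made for illustration, and restoring callee-saved register pairs is left out.

```python
MODE_MASK = 0x0F000000        # UNWIND_ARM64_MODE_MASK
MODE_FRAMELESS = 0x02000000   # UNWIND_ARM64_MODE_FRAMELESS
MODE_FRAME = 0x04000000       # UNWIND_ARM64_MODE_FRAME
STACK_SIZE_MASK = 0x00FFF000  # UNWIND_ARM64_FRAMELESS_STACK_SIZE_MASK


def step(regs, encoding, read64):
    """Return the caller's pc/sp/fp given the callee's registers.

    regs: dict with "pc", "sp", "fp", "lr"; read64: reads 8 bytes of target memory.
    """
    caller = dict(regs)
    mode = encoding & MODE_MASK
    if mode == MODE_FRAME:
        fp = regs["fp"]
        caller["fp"] = read64(fp)      # saved frame pointer
        caller["pc"] = read64(fp + 8)  # saved return address
        caller["sp"] = fp + 16         # stack pointer before the fp/lr pair was pushed
    elif mode == MODE_FRAMELESS or encoding == 0:
        # Stack size is stored in 16-byte units; it is 0 for an encoding of 0.
        stack_size = 16 * ((encoding & STACK_SIZE_MASK) >> 12)
        caller["sp"] = regs["sp"] + stack_size
        caller["pc"] = regs["lr"]      # leaf function: return address is still in LR
    else:
        raise ValueError("unsupported compact unwind mode")
    return caller
```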

arm64 can have frames with the same SP (but different IPs). Increment SP for this condition so createdump's unwind
loop doesn't break out on the "SP not increasing" check and the frames are added to the thread frame list in the
correct order.
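
The effect of the "SP not increasing" check can be illustrated with a small sketch. It assumes the loop stops as soon as a frame's SP is not greater than the previous one, and it nudges equal SPs upward by one; the function names, the (ip, sp) tuples, and the increment of 1 are hypothetical and are not taken from createdump.

```python
def collect_frames(unwound):
    """Keep frames until the stack pointer stops increasing."""
    frames = []
    last_sp = None
    for ip, sp in unwound:
        if last_sp is not None and sp <= last_sp:
            break  # treated as a corrupt or finished stack
        frames.append((ip, sp))
        last_sp = sp
    return frames


def bump_equal_sp(unwound):
    """Yield frames, nudging SP upward when it repeats the previous frame's SP."""
    prev_raw = None
    prev_adjusted = None
    for ip, sp in unwound:
        adjusted = sp
        if prev_raw is not None and sp == prev_raw:
            adjusted = prev_adjusted + 1  # same SP, different IP: keep it ordered after
        yield ip, adjusted
        prev_raw, prev_adjusted = sp, adjusted
```

Without the adjustment the second frame at the same SP ends the walk; with it, every frame is kept and the frames stay in caller order.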

Get the unwind info for tail-called functions like this:

__ZL14PROCEndProcessPvji:
   36630:       f6 57 bd a9     stp     x22, x21, [sp, #-48]!
   36634:       f4 4f 01 a9     stp     x20, x19, [sp, #16]
   36638:       fd 7b 02 a9     stp     x29, x30, [sp, #32]
   3663c:       fd 83 00 91     add     x29, sp, #32
...
   367ac:       e9 01 80 52     mov     w9, #15
   367b0:       7f 3e 02 71     cmp     w19, #143
   367b4:       20 01 88 1a     csel    w0, w9, w8, eq
   367b8:       2e 00 00 94     bl      _PROCAbort
_TerminateProcess:
-> 367bc:       22 00 80 52     mov     w2, #1
   367c0:       9c ff ff 17     b       __ZL14PROCEndProcessPvji

The IP (367bc) returns the (incorrect) frameless encoding with nothing on the stack, so an incorrect LR is used to unwind. To fix this,
get the unwind info for PC - 1, which points into PROCEndProcess and has the correct unwind info. This matches how lldb unwinds this frame.
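
The lookup difference can be shown with the two function start addresses from the listing above. This is only a sketch of the idea: the table, the function_for helper, and the use of a sorted-start lookup are assumptions for illustration, not the createdump code.

```python
import bisect

# Function start addresses taken from the disassembly in this commit message.
FUNCTIONS = [
    (0x36630, "__ZL14PROCEndProcessPvji"),
    (0x367BC, "_TerminateProcess"),
]
STARTS = [start for start, _ in FUNCTIONS]


def function_for(address):
    """Return the name of the function whose range contains address, or None."""
    i = bisect.bisect_right(STARTS, address) - 1
    if i < 0:
        return None
    return FUNCTIONS[i][1]


return_address = 0x367BC  # the instruction after "bl _PROCAbort"
```

Looking up return_address lands in _TerminateProcess, because the call is the last instruction of PROCEndProcess; looking up return_address - 1 lands back inside the function that made the call.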

Always add the module segment to the IP lookup list instead of checking the module regions.

Strip pointer authentication bits on PC/LR.
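
Stripping the authentication bits amounts to masking off the upper bits of the address. The sketch below assumes a 47-bit user address space; the mask value and the helper name are assumptions for illustration, and the correct width depends on the platform's virtual address size.

```python
# Assumption: user-space addresses fit in the low 47 bits.
POINTER_AUTH_MASK = 0x0000_7FFF_FFFF_FFFF


def strip_pointer_auth(address):
    """Clear the upper bits that may hold a pointer authentication code."""
    return address & POINTER_AUTH_MASK
```

A signed PC or LR that is not stripped will not fall inside any module's address range, so the frame cannot be resolved.
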
github-actions bot pushed a commit that referenced this issue Jan 13, 2022
safern pushed a commit that referenced this issue Jan 13, 2022
Co-authored-by: Mike McLaughlin <mikem@microsoft.com>
@mikem8361
Member

This has been fixed in main/6.0 now.

@ghost ghost locked as resolved and limited conversation to collaborators Feb 24, 2022