
False positive "call stack was corrupted" warning on 0.3 #277

Closed
jonas-schievink opened this issue Nov 15, 2021 · 4 comments · Fixed by rust-embedded/cortex-m-rt#337
Labels
status: needs PR · topic: unwinding · type: bug

Comments

@jonas-schievink
Contributor

Describe the bug
I'm seeing a `(HOST) WARN call stack was corrupted; unwinding could not be completed` warning, even though unwinding seems to have worked fine and nothing happened that could have corrupted the stack.

To Reproduce
Steps to reproduce the behavior:

Expected and observed behavior

Actual output:

    Finished test [unoptimized + debuginfo] target(s) in 0.01s
     Running tests/integration.rs (/home/jonas/dev/bxcan/target/thumbv7m-none-eabi/debug/deps/integration-7c7caa16bcb9d813)
(HOST) INFO  flashing program (16.20 KiB)
(HOST) INFO  success!
────────────────────────────────────────────────────────────────────────────────
(1/11) running `split_filters`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:18
(2/11) running `basic_roundtrip`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:43
(3/11) running `no_filters_no_frames`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:58
(4/11) running `filter_mask32_std`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:68
(5/11) running `filter_mask32_ext`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:112
(6/11) running `filter_mask16`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:153
(7/11) running `filter_list32_std`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:193
(8/11) running `filter_list32_ext`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:226
(9/11) running `dequeue_lower_priority_frame`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:264
(10/11) running `enable_non_blocking`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:317
(11/11) running `ext_roundtrip`...
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:330
all tests passed!
└─ integration::tests::__defmt_test_entry @ tests/integration.rs:4
────────────────────────────────────────────────────────────────────────────────
stack backtrace:
   0: _ZN3lib6inline6__bkpt17h78c2a3232bb722b9E
        at ./asm/inline.rs:13:5
   1: __bkpt
        at ./asm/lib.rs:49:17
   2: _ZN10defmt_test6export4exit17h45e5ec6423f16273E
        at /home/jonas/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/defmt-test-0.3.0/src/export.rs:8:9
   3: main
        at tests/integration.rs:4:1
   4: Reset
(HOST) WARN  call stack was corrupted; unwinding could not be completed
(HOST) INFO  device halted without error
     Running tests/interrupts.rs (/home/jonas/dev/bxcan/target/thumbv7m-none-eabi/debug/deps/interrupts-fc17b8c8f4ece88b)
(HOST) INFO  flashing program (19.87 KiB)
(HOST) INFO  success!
────────────────────────────────────────────────────────────────────────────────
(1/6) running `tx_interrupt`...
└─ interrupts::tests::__defmt_test_entry @ tests/interrupts.rs:37
(2/6) running `rx_interrupt_message_pending`...
└─ interrupts::tests::__defmt_test_entry @ tests/interrupts.rs:71
(3/6) running `rx_interrupt_fifo_full`...
└─ interrupts::tests::__defmt_test_entry @ tests/interrupts.rs:105
(4/6) running `rx_interrupt_fifo_overrun`...
└─ interrupts::tests::__defmt_test_entry @ tests/interrupts.rs:145
(5/6) running `sce_interrupt_sleep`...
└─ interrupts::tests::__defmt_test_entry @ tests/interrupts.rs:195
(6/6) running `sce_interrupt_wakeup`...
└─ interrupts::tests::__defmt_test_entry @ tests/interrupts.rs:219
all tests passed!
└─ interrupts::tests::__defmt_test_entry @ tests/interrupts.rs:4
────────────────────────────────────────────────────────────────────────────────
stack backtrace:
   0: _ZN3lib6inline6__bkpt17h78c2a3232bb722b9E
        at ./asm/inline.rs:13:5
   1: __bkpt
        at ./asm/lib.rs:49:17
   2: _ZN10defmt_test6export4exit17h45e5ec6423f16273E
        at /home/jonas/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/defmt-test-0.3.0/src/export.rs:8:9
   3: main
        at tests/interrupts.rs:4:1
   4: Reset
(HOST) WARN  call stack was corrupted; unwinding could not be completed
(HOST) INFO  device halted without error

Expected output: the same, but without the warning.

jonas-schievink added the type: bug, status: needs PR and topic: unwinding labels on Nov 15, 2021
@newAM
Member

newAM commented Nov 15, 2021

I am also seeing this in the stm32wlxx-hal testsuite: https://github.com/stm32-rs/stm32wlxx-hal/runs/4208369937?check_suite_focus=true#step:6:33

I saw this infrequently with defmt 0.2 (difficult to reproduce, so I didn't file a bug report), but now it occurs every time with 0.3.

@jamesmunns

I'm also hitting this on every run, along with #201.

james@archx1c6g-ovar ➜  emb-playground git:(main) ✗ DEFMT_LOG=info cargo rrb forth
   Compiling emb-playground v0.1.0 (/home/james/anachro/forth/emb-playground)
    Finished release [optimized + debuginfo] target(s) in 1.45s
     Running `probe-run --chip nRF52840_xxAA target/thumbv7em-none-eabihf/release/forth`
(HOST) INFO  flashing program (17.36 KiB)
(HOST) INFO  success!
────────────────────────────────────────────────────────────────────────────────
INFO  Hello!
└─ forth::__cortex_m_rt_main @ src/bin/forth.rs:11
INFO  lol: 10781266
└─ forth::__cortex_m_rt_main @ src/bin/forth.rs:30
────────────────────────────────────────────────────────────────────────────────
stack backtrace:
   0: _ZN3lib6inline6__bkpt17h78c2a3232bb722b9E
        at ./asm/inline.rs:13:5
   1: __bkpt
        at ./asm/lib.rs:49:17
   2: _ZN14emb_playground4exit17h99c2b316dcb70511E
        at src/lib.rs:18:5
   3: _ZN5forth18__cortex_m_rt_main17h8a5fd1dcc9faec7bE
        at src/bin/forth.rs:32:5
   4: main
        at src/bin/forth.rs:9:1
   5: Reset
(HOST) WARN  call stack was corrupted; unwinding could not be completed
(HOST) INFO  device halted without error

Running this application with `cargo rrb forth`.

bors bot added a commit to rust-embedded/cortex-m-rt that referenced this issue Nov 23, 2021
337: Fix unwinding through `Reset` r=thejpster a=jonas-schievink

Unwinders may detect the end of the program by seeing `0xFFFFFFFF` in `lr`, which is why code to ensure that it has that value was added in #293. However, the `bl main` overwrites that value with the current program counter. This PR saves the old `lr` value on the stack, and adds debuginfo entries to allow an external unwinder to restore the value.

This fixes knurling-rs/probe-run#277

Co-authored-by: Jonas Schievink <jonasschievink@gmail.com>
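
For illustration, here is a minimal, hypothetical sketch of the idea behind that fix (not the actual cortex-m-rt code): the reset handler pushes `lr` before `bl main` clobbers it, and DWARF CFI directives record where the saved value lives so an external unwinder can recover the `0xFFFFFFFF` end-of-stack marker. It assumes a Cortex-M (`thumbv7m`-class) target and a `main` symbol provided elsewhere, e.g. by `#[entry]`.

```rust
// Hypothetical sketch only; the real fix is in rust-embedded/cortex-m-rt#337.
// Assumes an ARM Cortex-M target and an external `main` symbol.
core::arch::global_asm!(
    ".cfi_sections .debug_frame",   // emit unwind info into .debug_frame
    ".section .text.ResetSketch, \"ax\"",
    ".global ResetSketch",
    ".type ResetSketch, %function",
    ".thumb_func",
    "ResetSketch:",
    ".cfi_startproc",
    // `lr` is expected to hold 0xFFFFFFFF here (the end-of-stack marker).
    // Save it before `bl main` overwrites `lr` with a return address.
    "push {{r4, lr}}",
    ".cfi_def_cfa_offset 8",
    ".cfi_offset 14, -4",           // saved `lr` (r14) lives at CFA-4
    "bl main",
    "udf #0",                       // `main` must not return
    ".cfi_endproc",
);
```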
@jonathanpallant
Contributor

A fix for this issue has gone into cortex-m-rt, but you won't see it until either a) there's a new release (likely 0.7.1), or b) you add a patch to Cargo.toml to use the git version.
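
For option b), a minimal `Cargo.toml` patch could look like the snippet below; the repository URL is an assumption and should point at wherever the fix landed. Remove the patch again once the fixed release is published.

```toml
# Hypothetical [patch] entry to try the unreleased cortex-m-rt fix.
[patch.crates-io]
cortex-m-rt = { git = "https://github.com/rust-embedded/cortex-m-rt" }
```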

@jonathanpallant
Contributor

https://crates.io/crates/cortex-m-rt/0.7.1 is now out.

newAM added a commit to stm32-rs/stm32wlxx-hal that referenced this issue Nov 25, 2021
This resolves the corrupted stack backtrace in the testsuite.

See knurling-rs/probe-run#277
newAM added a commit to stm32-rs/stm32wlxx-hal that referenced this issue Nov 25, 2021
This resolves the corrupted stack backtrace in the testsuite.

See knurling-rs/probe-run#277
bors bot added a commit that referenced this issue Nov 25, 2021
287: unwind: skip FDEs with initial address of 0 r=jonathanpallant a=jonas-schievink

The [`panic` example in the app-template](https://github.com/knurling-rs/app-template/blob/09cb051ef0e6863e805a3fc65d509147b031f411/src/bin/panic.rs) can cause a broken backtrace like this:

```
stack backtrace:
   0: 0x00001450 @ HardFaultTrampoline
      <exception entry>
   1: 0x000008d4 @ lib::inline::__udf
        at ./asm/inline.rs:172:5
   2: 0x000008d4 @ __udf
        at ./asm/lib.rs:49:17
   3: 0x000001ba @ cortex_m::asm::udf
        at [...]/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/cortex-m-0.7.3/src/asm.rs:43:5
   4: 0x000001c4 @ _defmt_panic
        at [...]/src/lib.rs:13:5
   5: 0x000001b0 @ defmt::export::panic
        at [...]/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/defmt-0.3.0/src/export/mod.rs:125:14
   6: 0x000001b0 @ panic::__cortex_m_rt_main
        at [...]/src/bin/panic.rs:10:5
   7: 0x00000168 @ main
        at [...]/src/bin/panic.rs:6:1
   8: 0x00000144 @ Reset
   9: 0x00000100 @ Reset
  10: 0x52205244 @ __sheap
```

This appeared to have a different cause from #277, since the fix for that didn't fix the backtrace here.

After a lot of investigation, it turned out that the debuginfo contains multiple frame description entries (FDEs) that claim to apply to some addresses, and we happened to pick the wrong one while unwinding from `Reset` in frame 8, causing us to restore registers incorrectly and then attempt to jump to a nonsense address.

This seems to be limited to FDEs describing presumably dead code, whose base address gets reset to 0 when they're included in the final ELF, so this PR fixes it by ignoring any FDEs that claim to apply to address 0.

Co-authored-by: Jonas Schievink <jonasschievink@gmail.com>
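
As a rough illustration of the approach described above (not probe-run's actual implementation), an unwinder built on the `gimli` DWARF library could filter out such FDEs while searching `.debug_frame` for the entry covering a program counter; the function below is a sketch under that assumption.

```rust
use gimli::{BaseAddresses, CieOrFde, DebugFrame, EndianSlice, LittleEndian, UnwindSection};

type Reader<'a> = EndianSlice<'a, LittleEndian>;

/// Find the FDE covering `pc`, ignoring entries whose initial address is 0;
/// those typically describe code discarded by the linker and could otherwise
/// shadow the FDE we actually want.
fn fde_for_pc<'a>(
    debug_frame: &DebugFrame<Reader<'a>>,
    bases: &BaseAddresses,
    pc: u64,
) -> Option<gimli::FrameDescriptionEntry<Reader<'a>>> {
    let mut entries = debug_frame.entries(bases);
    while let Ok(Some(entry)) = entries.next() {
        let CieOrFde::Fde(partial) = entry else { continue };
        let Ok(fde) = partial.parse(DebugFrame::cie_from_offset) else { continue };
        if fde.initial_address() == 0 {
            continue; // skip FDEs claiming to start at address 0
        }
        if fde.contains(pc) {
            return Some(fde);
        }
    }
    None
}
```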
adamgreig pushed a commit to rust-embedded/cortex-m that referenced this issue Jan 12, 2022
337: Fix unwinding through `Reset` r=thejpster a=jonas-schievink
