Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ChibiOS: Invalid MSP from bootloader causes main stack to overlap process stack #1875

Open
tracernz opened this issue Jul 4, 2017 · 7 comments

Comments

@tracernz
Copy link
Member

tracernz commented Jul 4, 2017

Follow-up from #1874

Some bootloaders, due to unintended code generation, end up incrementing (positive) the main stack pointer by a number of bytes (8 for LibrePilot v6, 16 for dRonin recent releases). This causes the first few entries in the main stack to step on whatever happened to follow it in memory. In the bootloader updater this happened to be the flash partition table which caused the updater to fail. In firmware, this is the process stack (thread stacks): https://github.com/d-ronin/dRonin/blob/next/flight/PiOS/STM32F4xx/sections_chibios.ld#L127. Fortunately this is at the end of the stack of whatever thread happens to be first, so unless it completely fills it's stack it's probably only a theoretical issue (and filling stacks with the pattern used to detect stack usage happens in ChibiOS's ResetHandler, before any stack is ever used).

Suggested fix is either:

  • Set MSP to correct location in __early_init hook.. this is a bit fragile if upstream ChibiOS ResetHandler changes, and happens to use the main stack before calling __early_init, so not the preferred option.
  • Change ENTRY function in linker script to our own function which just sets MSP, then jumps to ChibiOS ResetHandler. This is the safest and most preferred option I think. -e- Duh, do still need to edit ChibiOS sources to place our reset handler in the vector table. -_-

@mlyle, thoughts?

@mlyle
Copy link
Member

mlyle commented Jul 4, 2017

Does that initial stack survive to later run, or is it just used to kickstart chibios and get the init task running?

(Does chibios just blindly reuse the initial stack as an irq stack?)

@tracernz
Copy link
Member Author

tracernz commented Jul 9, 2017

@tracernz
Copy link
Member Author

Did a little digging. The main thread (i.e. main), uses the "process stack" (switches to it at https://github.com/d-ronin/dRonin/blob/next/flight/PiOS/Common/Libraries/ChibiOS/os/ports/GCC/ARMCMx/crt0.c#L308), while interrupts use the "main stack". So the OS is running in the process stack... I guess it never gets too close to full.

Incidentally, this doesn't look correct (and should be a function of pios_sys IMO: https://github.com/d-ronin/dRonin/blob/next/flight/Modules/System/systemmod.c#L734). We can use ChibiOS pattern (0x55555555) and symbols (___main_stack_end etc.). And we don't check the process/main thread stack at all. Do we actually even check IRQ stack? :/ https://github.com/d-ronin/dRonin/search?utf8=%E2%9C%93&q=CHECK_IRQ_STACK&type=

@tracernz
Copy link
Member Author

Deferring because there is plenty of unused area at the bottom of the process stack so nothing bad happens.

@tracernz tracernz modified the milestones: Wired, Neat Jul 14, 2017
@tracernz tracernz modified the milestones: Wired, Post-Wired Jan 31, 2018
@tracernz
Copy link
Member Author

This needs to either be fixed in #2025, or after that merges.

@glowtape
Copy link
Member

@tracernz
Copy link
Member Author

Yes, that's perfect. 👍

@mlyle mlyle modified the milestones: Inconceivable, Ludicrous Apr 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants