STM32H750: GDB breakpoints do not halt execution #1059

Closed
1 task done
cmdrf opened this issue Oct 25, 2020 · 26 comments · Fixed by #1071
Comments

@cmdrf
Contributor

cmdrf commented Oct 25, 2020

  • I made serious effort to avoid creating duplicate or nearly similar issue

  • Programmer/board type: Stlink/v2-clone

  • Operating system and version: st-util: Armbian Linux, a Debian 10 derivative. gdb: macOS 10.15.6

  • Stlink tools version and/or git commit hash: 1e20921

  • Stlink command line tool name: st-util

  • Target chip: STM32H750VBT6

Command line output:

$ /Applications/ARM/bin/arm-none-eabi-gdb fmplayground-h7.elf
(gdb) tar ext 172.16.3.114:4242
(gdb) b FmPlaygroundMain.c:46
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: <redacted>/fmplayground-h7.elf

Expected/description:
The execution should stop at the specified breakpoint location, which is inside main(). I also tried various other locations, but it never stops. The same setup works with various other STM32 devices and other st-link versions.

This is a follow-up to #1011.

Thanks for your effort!

@cmdrf
Contributor Author

cmdrf commented Oct 25, 2020

@Ant-ON I replaced the two if (sl->core_id == STM32F7_CORE_ID) in update_code_breakpoint() as you suggested, but it didn't help unfortunately.

@cmdrf
Contributor Author

cmdrf commented Oct 25, 2020

Hey, I was wrong! The patch works, but only sometimes. Still trying to figure out under which circumstances...

@timothytylee
Collaborator

Hi, I am responsible for reviving the STM32H7 patch. What I have found is that the chip can only be reliably detected when connected under reset.

The problem appears to be that stlink_chip_id() in common.c reads an actual value from 0xE0042000, so it never falls through to the next statement, which attempts to read from 0x5C001000 (as is desired with the STM32H7).

Maybe someone can suggest a better approach to detecting the STM32H7?

@Nightwalker-87
Member

General suggestion: Review and refactor the connect-under-reset and reset procedures, which have been a heavy burden on the code for as far back as I can remember... 😉 Almost every step forward (fixes or features) seems to lead to another regression on this topic. 😞 The best approach would probably be to rip apart larger parts of the codebase in order to modularise certain tasks, knowing full well that this is going to raise a fuss...

@cmdrf
Contributor Author

cmdrf commented Oct 26, 2020

I'm pretty sure the chip is detected reliably. I don't think I've seen issues there. Output of st-util, for example:

2020-10-25T21:33:03 INFO common.c: H742/743/753: 128 KiB SRAM, 128 KiB flash in at least 128 KiB pages.

Regarding the breakpoints: There might be a certain probability that breakpoints halt execution. Breakpoints in initialization code almost never halt, while those in loops do.

On a side note: I added a specialized GDB memory map to gdbserver.c for the H7 to make the additional SRAM regions known to GDB, and it seems to work. I can create a pull request later.

@Ant-ON
Collaborator

Ant-ON commented Oct 26, 2020

@timothytylee You may use core_id to identify STM32H7 series. For example something like this:

diff --git a/inc/stm32.h b/inc/stm32.h
index 1773e70..768fabb 100644
--- a/inc/stm32.h
+++ b/inc/stm32.h
@@ -10,6 +10,7 @@
 /* Cortex core ids */
 #define STM32VL_CORE_ID 0x1ba01477
 #define STM32F7_CORE_ID 0x5ba02477
+#define STM32H7_CORE_ID 0x6ba02477 // STM32H7 JTAG ID Code (RM0433 pg3065)
 
 /* Constant STM32 memory map figures */
 #define STM32_FLASH_BASE           ((uint32_t)0x08000000)
diff --git a/src/common.c b/src/common.c
index a08db27..7dc6745 100644
--- a/src/common.c
+++ b/src/common.c
@@ -1252,17 +1252,18 @@ int stlink_core_id(stlink_t *sl) {
 int stlink_chip_id(stlink_t *sl, uint32_t *chip_id) {
     int ret;
 
-    ret = stlink_read_debug32(sl, 0xE0042000, chip_id);
+    if (sl->core_id == STM32H7_CORE_ID) {
+        // STM32H7 chipid in 0x5c001000 (RM0433 pg3189)
+        ret = stlink_read_debug32(sl, 0x5c001000, chip_id);
+    } else {
+        // Default chipid address
+        ret = stlink_read_debug32(sl, 0xE0042000, chip_id);
+    }
 
     if (ret == -1) {
         return(ret);
     }
 
-    if (*chip_id == 0) {
-        // STM32H7 chipid in 0x5c001000 (RM0433 pg3189)
-        ret = stlink_read_debug32(sl, 0x5c001000, chip_id);
-    }
-
     if (*chip_id == 0) {
         // Try Corex M0 DBGMCU_IDCODE register address
         ret = stlink_read_debug32(sl, 0x40015800, chip_id);

@Ant-ON
Collaborator

Ant-ON commented Oct 26, 2020

@cmdrf Also, for the H7 you need to enable cache support, since this MCU series has caches. Change sl->core_id != STM32F7_CORE_ID to sl->core_id != STM32F7_CORE_ID && sl->core_id != 0x6ba02477 in st-util/gdb-server.c
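
The condition suggested above could also be factored into a small predicate, so the H7 core ID is not hard-coded at each call site. This is only a sketch: `core_has_cache` is a hypothetical helper name, not a function from the stlink codebase, and the two core IDs are the values quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Core IDs as quoted in this thread (SW-DP values). */
#define STM32F7_CORE_ID 0x5ba02477u
#define STM32H7_CORE_ID 0x6ba02477u

/* Hypothetical helper: true for cores that need cache handling (Cortex-M7 parts). */
static bool core_has_cache(uint32_t core_id) {
    return core_id == STM32F7_CORE_ID || core_id == STM32H7_CORE_ID;
}
```

A caller would then skip cache handling with `if (!core_has_cache(sl->core_id))` instead of chaining `!=` comparisons.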

@cmdrf
Contributor Author

cmdrf commented Oct 26, 2020

@Ant-ON Now st-util produces additional output (who is Louis? 😄 ):

$ bin/st-util -m
st-util
2020-10-26T19:02:57 INFO common.c: H742/743/753: 128 KiB SRAM, 128 KiB flash in at least 128 KiB pages.
2020-10-26T19:02:57 INFO gdb-server.c: Chip clidr: 09000003, I-Cache: on, D-Cache: on
2020-10-26T19:02:57 INFO gdb-server.c:  cache: LoUU: 1, LoC: 1, LoUIS: 0
2020-10-26T19:02:57 INFO gdb-server.c:  cache: ctr: 8303c003, DminLine: 32 bytes, IminLine: 32 bytes
2020-10-26T19:02:57 INFO gdb-server.c: D-Cache L0: 2020-10-26T19:02:57 INFO gdb-server.c: f00fe019 LineSize: 8, ways: 4, sets: 128 (width: 12)
2020-10-26T19:02:57 INFO gdb-server.c: I-Cache L0: 2020-10-26T19:02:57 INFO gdb-server.c: f01fe009 LineSize: 8, ways: 2, sets: 256 (width: 13)
2020-10-26T19:02:57 INFO gdb-server.c: Listening at *:4242...
2020-10-26T19:05:38 INFO gdb-server.c: Found 8 hw breakpoint registers
2020-10-26T19:05:38 INFO gdb-server.c: GDB connected.

But breakpoints are still flaky.

@Ant-ON
Collaborator

Ant-ON commented Oct 27, 2020

@cmdrf It's from the Cortex-M7 reference: https://developer.arm.com/documentation/dui0646/a/cortex-m7-peripherals/processor-features/cache-level-id-register

I read the Cortex-M7 and ARMv7-M reference manuals and slightly rewrote the gdb-server code: https://github.com/Ant-ON/stlink/tree/try_h7_debug
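
For readers wondering where LoUU, LoC and LoUIS in the st-util log come from, here is a minimal sketch of the decoding, assuming the ARMv7-M CLIDR and CTR field layouts. The helper names are made up for illustration and are not the functions used in gdb-server.c.

```c
#include <stdint.h>

/* CLIDR fields (ARMv7-M layout): LoUIS [23:21], LoC [26:24], LoUU [29:27]. */
static inline unsigned clidr_louis(uint32_t clidr) { return (clidr >> 21) & 0x7u; }
static inline unsigned clidr_loc(uint32_t clidr)   { return (clidr >> 24) & 0x7u; }
static inline unsigned clidr_louu(uint32_t clidr)  { return (clidr >> 27) & 0x7u; }

/* CTR stores log2 of the minimum line length in 32-bit words; convert to bytes. */
static inline unsigned ctr_dminline_bytes(uint32_t ctr) { return 4u << ((ctr >> 16) & 0xFu); }
static inline unsigned ctr_iminline_bytes(uint32_t ctr) { return 4u << (ctr & 0xFu); }
```

Applied to the logged values (clidr 09000003, ctr 8303c003), these give LoUU 1, LoC 1, LoUIS 0 and 32-byte minimum lines for both caches, matching the output posted earlier.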

@timothytylee
Collaborator

@Ant-ON I tried your core_id patch and it did not improve the detection problem on my PC. I still need to connect under reset to reliably detect the H7, otherwise it succeeds only 1 out of 8 times.

A question about the JTAG ID code: I found two listed in RM0433: page 3065 says IDCODE is 0x6ba00477, but page 3066 says DP_DPIDR register contains 0x6ba02477 (which is what my board returns). Is this a typo from ST, or is something else happening here?

@Ant-ON
Collaborator

Ant-ON commented Oct 27, 2020

@timothytylee After the patch, which register is read incorrectly?

0x6ba00477 is the ID code read from the JTAG-DP (via JTAG)
0x6ba02477 is the ID code read from the SW-DP (via SWD)
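
The two values differ only in bits [15:12]. A rough sketch of why, assuming the standard ADIv5 layouts (JTAG IDCODE: part number in [27:12]; DPIDR: DP version in [15:12]; designer in [11:1] for both). The function names are illustrative only.

```c
#include <stdint.h>

/* Designer (JEP106) field, bits [11:1], common to both register layouts. */
static inline unsigned id_designer(uint32_t id)    { return (id >> 1) & 0x7FFu; }

/* JTAG-DP IDCODE: 16-bit part number in bits [27:12]. */
static inline unsigned jtag_partno(uint32_t id)    { return (id >> 12) & 0xFFFFu; }

/* SW-DP DPIDR: DP architecture version in bits [15:12]. */
static inline unsigned dpidr_version(uint32_t id)  { return (id >> 12) & 0xFu; }
```

Under these assumptions both IDs carry the same designer code (0x23B, Arm), and the extra `2` in 0x6ba02477 is the DP version field rather than a typo.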

@timothytylee
Collaborator

@Ant-ON, I was testing on the NUCLEO-H743ZI2. The value in sl->core_id was 0x6ba02477.

@Ant-ON
Collaborator

Ant-ON commented Oct 27, 2020

@timothytylee Everything is correct. But then why is the MCU not detected the first time? Is the read from the chip_id register (0x5c001000) incorrect?

@Ant-ON
Collaborator

Ant-ON commented Oct 28, 2020

@cmdrf Can you try this rewritten code?
https://github.com/Ant-ON/stlink/tree/try_h7_debug

@cmdrf
Contributor Author

cmdrf commented Oct 28, 2020

@Ant-ON I'm on it at this very moment. But no luck so far. It seems it doesn't even trigger breakpoints in heavy-duty loops.

I noticed that cache support isn't enabled in that branch (at least I don't see LoUIS and friends anymore). Maybe that would help. I'll try and add it back in.

@Ant-ON
Collaborator

Ant-ON commented Oct 29, 2020

@cmdrf I changed the code a bit. Now the choice of breakpoint style and the enabling of cache support are automatic. I think this is the best solution: https://github.com/Ant-ON/stlink/tree/try_h7_debug

P.S. Can you try building the project without optimization (-O0)?

@cmdrf
Contributor Author

cmdrf commented Oct 29, 2020

Hey, I found something interesting: If I start the program through the debugger, then hit the reset button on the dev board, from then on it stops at every breakpoint I throw at it! However, when I want to continue execution, it stops at the same breakpoint again without actually executing anything.

Maybe this is important: In my projects I exclusively use the 4-pin SWD connection (SWDIO, SWCLK, 3.3V and GND) and don't connect the reset pin.

@Ant-ON I'm going to test your branch now.

@cmdrf
Contributor Author

cmdrf commented Oct 30, 2020

@Ant-ON I tested your branch now, and as far as I can tell, it behaves the same as develop with the patches in update_code_breakpoint, init_cache and cache_change. Meaning that it stops at breakpoints in loops, but not in one-off breakpoints, e.g. in initialization code.

I investigated this further, and it turns out it stops at the loop breakpoints every time (meaning my earlier theory about there being only a chance of hitting them was wrong). However, continuing execution stops immediately again without executing anything. So a new theory (incorporating the observations with the reset button) would be: there is a certain time after program start during which breakpoints don't work, and continuing from breakpoints doesn't work. But breakpoints work in general.

-O0 didn't change anything. I was using -Os before.

@cmdrf
Contributor Author

cmdrf commented Oct 30, 2020

So, I did this:

int counter = 0;
while (1)
{
    HAL_Delay(10); // Milliseconds
    counter++;
    HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_15);
}

right after HAL and GPIO initialization and put a breakpoint in the HAL_GPIO_TogglePin() line. After doing run, by the time it reaches the breakpoint, counter is between 195 and 202, meaning it takes around 2 seconds before breakpoints work. counter doesn't increase after continue.

(The time is shorter and more inconsistent through my IDE. I guess the IDE introduces its own delays in between gdb commands)

@timothytylee
Collaborator

@Ant-ON, regarding the incorrect detection of STM32H7: yes, the problem is that 0x5c001000 is returning the wrong value (0x05fa0004). Coincidentally, this is the value written to AIRCR (address 0xe000ed0c), just prior to the read.

@Ant-ON
Collaborator

Ant-ON commented Nov 1, 2020

@timothytylee This is very similar to the reset problems... I tried rewriting the reset function. I checked that it works on my existing hardware with F07/F1/F3/G4 chips. https://github.com/Ant-ON/stlink/commits/try_h7_debug

@cmdrf Perhaps correcting the reset will somehow affect the breakpoints. If possible, I think it's worth checking.

@timothytylee
Collaborator

timothytylee commented Nov 1, 2020

@Ant-ON Your try_h7_debug branch fixed the NUCLEO-H743ZI2 detection problem!

@Ant-ON
Collaborator

Ant-ON commented Nov 6, 2020

@cmdrf I've tweaked gdb a bit. Could you try https://github.com/Ant-ON/stlink/commits/try_h7_debug ?

@Nightwalker-87
Member

Related to #1063.

@cmdrf
Contributor Author

cmdrf commented Nov 10, 2020

> @cmdrf I've tweaked gdb a bit. Could you try https://github.com/Ant-ON/stlink/commits/try_h7_debug ?

Hey, that works! There is only a slight build error on 32-bit ARM because size_t has the wrong format string in a printf in st-info/info.c. Here is the output I get:

st-util
2020-11-10T19:56:09 WARN usb.c: NRST is not connected
2020-11-10T19:56:09 INFO common.c: H74x/H75x: 128 KiB SRAM, 128 KiB flash in at least 128 KiB pages.
2020-11-10T19:56:09 WARN usb.c: NRST is not connected
2020-11-10T19:56:09 INFO gdb-server.c: Chip clidr: 09000003, I-Cache: on, D-Cache: on
2020-11-10T19:56:09 INFO gdb-server.c:  cache: LoUU: 1, LoC: 1, LoUIS: 0
2020-11-10T19:56:09 INFO gdb-server.c:  cache: ctr: 8303c003, DminLine: 32 bytes, IminLine: 32 bytes
2020-11-10T19:56:09 INFO gdb-server.c: D-Cache L0: 2020-11-10T19:56:09 INFO gdb-server.c: f00fe019 LineSize: 8, ways: 4, sets: 128 (width: 12)
2020-11-10T19:56:09 INFO gdb-server.c: I-Cache L0: 2020-11-10T19:56:09 INFO gdb-server.c: f01fe009 LineSize: 8, ways: 2, sets: 256 (width: 13)
2020-11-10T19:56:09 INFO gdb-server.c: Listening at *:4242...
2020-11-10T19:56:14 WARN usb.c: NRST is not connected
2020-11-10T19:56:14 INFO gdb-server.c: Found 8 hw breakpoint registers
2020-11-10T19:56:14 INFO gdb-server.c: GDB connected.
2020-11-10T19:56:15 WARN usb.c: NRST is not connected
2020-11-10T19:56:15 INFO gdb-server.c: flash_erase: block 08000000 -> 20000
2020-11-10T19:56:15 INFO gdb-server.c: flash_erase: page 08000000
2020-11-10T19:56:15 INFO common.c: Starting Flash write for H7
2020-11-10T19:56:15 INFO gdb-server.c: flash_do: block 08000000 -> 20000
2020-11-10T19:56:15 INFO gdb-server.c: flash_do: page 08000000

The "NRST is not connected" bit is true btw.

Buut... I also got it working with my branch https://github.com/cmdrf/stlink/tree/stm32h7-breakpoint-almostfix now, where I accumulated all the patches concerning cache support etc. found in this thread. The weird breakpoint behavior seems to stem from the fact that I use TIM1 as the HAL time base. The timer interrupt seems to fire right after continuing, and execution ends up hitting the same breakpoint again when it returns from the interrupt handler. Why it takes two seconds after program start for the first breakpoint to trigger is still a mystery, though. Anyway, when I switch the time base back to the SysTick interrupt, the breakpoints work as expected.

Maybe this is a known issue when debugging and I'm just stupid 😕 ? Are timers supposed to run when the program is halted?
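
On the timer question: on STM32 parts, peripheral timers generally keep counting while the core is halted unless the matching DBGMCU freeze bit is set. Below is a minimal sketch of setting that bit for TIM1. The register offset and bit position are assumptions that should be checked against RM0433; only the DBGMCU base address (0x5c001000) appears in this thread, and `freeze_tim1_on_halt` is a hypothetical helper.

```c
#include <stdint.h>

/* ASSUMPTIONS, verify against RM0433 before use:
 * DBGMCU base 0x5C001000, APB2 freeze register at offset 0x4C,
 * TIM1 stop-on-halt bit at position 0. */
#define DBGMCU_BASE_ASSUMED            0x5C001000u
#define DBGMCU_APB2FZ1_OFFSET_ASSUMED  0x4Cu
#define DBG_TIM1_STOP_BIT_ASSUMED      (1u << 0)

/* Sets the TIM1 freeze bit in the given register, leaving other bits untouched. */
static inline void freeze_tim1_on_halt(volatile uint32_t *apb2fz1) {
    *apb2fz1 |= DBG_TIM1_STOP_BIT_ASSUMED;
}
```

On target, the call would be `freeze_tim1_on_halt((volatile uint32_t *)(DBGMCU_BASE_ASSUMED + DBGMCU_APB2FZ1_OFFSET_ASSUMED));` early in main(). The STM32 HAL also appears to provide a `__HAL_DBGMCU_FREEZE_TIM1()` macro for the same purpose, which is likely the safer choice if it is available in your HAL version.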

@Ant-ON
Collaborator

Ant-ON commented Nov 11, 2020

@cmdrf Thanks for pointing out the size_t mistake! I will fix it.
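
For reference, the portable way to print a size_t in C99 and later is the `%zu` conversion, which avoids width mismatches between 32-bit and 64-bit hosts. The snippet below is a generic sketch; `format_flash_size` is a made-up name, and the actual line in st-info/info.c is not shown in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats a size_t with %zu, which is correct regardless of the host's word size.
 * Returns the number of characters snprintf would have written. */
static int format_flash_size(char *buf, size_t buf_len, size_t flash_size) {
    return snprintf(buf, buf_len, "flash: %zu bytes", flash_size);
}
```

Using `%lu` or `%u` here instead would trigger a format warning (or an error with -Werror) on one architecture or the other.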

try_h7_debug is a bit smarter. It checks whether a reset actually occurred when NRST was toggled, and only after that does it use a software reset.

The TIM1 timer is probably configured incorrectly. It may have a much higher frequency than expected. The Eclipse IDE sets breakpoints after starting the program. Something like this happens:

...
Erase
Flash
FlashDone (current version of gdb starts the core here)
... (exchange of some packets with gdb)
Continue  (try_h7_debug version starts the core here)
... (exchange of some packets with gdb)
Set breakpoints
...
