https://github.com/riscv/riscv-debug-spec contains instructions for generating a C header file that defines macros for every field in every register/abstract command mentioned in this document.
This section details how an external debugger might use the described debug interface to perform some common operations on RISC-V cores using the JTAG DTM described in [sec:jtagdtm]. All these examples assume a 32-bit core but it should be easy to adapt the examples to 64- or 128-bit cores.
To keep the examples readable, they all assume that everything succeeds, and that they complete faster than the debugger can perform the next access. This will be the case in a typical JTAG setup. However, the debugger must always check the sticky error status bits after performing a sequence of actions. If it sees any that are set, then it should attempt the same actions again, possibly while adding in some delay, or explicit checks for status bits.
To read an arbitrary Debug Module register, select {dtm-dmi}, and scan in a value with {dmi-op} set to 1, and {dmi-address} set to the desired register address. In Update-DR the operation will start, and in Capture-DR its results will be captured into {dmi-data}. If the operation didn’t complete in time, {dmi-op} will be 3 and the value in {dmi-data} must be ignored. The busy condition must be cleared by writing {dtmcs-dmireset} in {dtm-dtmcs}, and then the second scan scan must be performed again. This process must be repeated until {dmi-op} returns 0. In later operations the debugger should allow for more time between Update-DR and Capture-DR.
To write an arbitrary Debug Bus register, select {dtm-dmi}, and scan in a value with {dmi-op} set to 2, and {dmi-address} and {dmi-data} set to the desired register address and data respectively. From then on everything happens exactly as with a read, except that a write is performed instead of the read.
It should almost never be necessary to scan IR, avoiding a big part of the inefficiency in typical JTAG use.
A user will want to know as quickly as possible when a hart is halted
(e.g. due to a breakpoint). To efficiently determine which harts are
halted when there are many harts, the debugger uses the haltsum
registers. Assuming the maximum number of harts exist, first it checks {dm-haltsum3} .
For each bit set there, it writes hartsel, and checks {dm-haltsum2}. This process repeats
through {dm-haltsum1} and {dm-haltsum0}. Depending on how many harts exist, the process should
start at one of the lower haltsum
registers.
To halt one or more harts, the debugger selects them, sets {dmcontrol-haltreq}, and then waits for {dmstatus-allhalted} to indicate the harts are halted. Then it can clear {dmcontrol-haltreq} to 0, or leave it high to catch a hart that resets while halted.
First, the debugger should restore any registers that it has overwritten. Then it can let the selected harts run by setting {dmcontrol-resumereq}. Once {dmstatus-allresumeack} is set, the debugger knows the selected harts have resumed. Harts might halt very quickly after resuming (e.g. by hitting a software breakpoint) so the debugger cannot use {dmstatus-allhalted}/{dmstatus-anyhalted} to check whether the hart resumed.
Using the hardware single step feature is almost the same as regular running. The debugger just sets in before letting the hart run. The hart behaves exactly as in the running case, except that interrupts may be disabled (depending on {dcsr-stepie}) and it only fetches and executes a single instruction before re-entering Debug Mode.
Read s0
using abstract command:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-command} |
{accessregister-aarsize}\(=2\), {accessregister-transfer}, {accessregister-regno} = 0x1008 |
Read |
Read |
{dm-data0} |
- |
Returns value that was in |
Write mstatus
using abstract command:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-data0} |
new value |
|
Write |
{dm-command} |
{accessregister-aarsize}\(=2\), {accessregister-transfer}, {accessregister-write}, {accessregister-regno} = 0x300 |
Write |
Abstract commands are only required to support GPR access. To access non-GPR registers, the debugger can use the Program Buffer to move values to/from a GPR, and then access the GPR value using an abstract command.
Write mstatus
using program buffer:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
|
|
Write |
|
|
|
Write |
{dm-data0} |
new value |
|
Write |
{dm-command} |
{accessregister-aarsize}\(=2\), {accessregister-postexec}, {accessregister-transfer}, {accessregister-write}, {accessregister-regno} = 0x1008 |
Write |
Read f1
using program buffer:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
{ |
|
Write |
|
|
|
Write |
{dm-command} |
{accessregister-postexec} |
Execute program buffer |
Write |
{dm-command} |
{accessregister-transfer}, {accessregister-regno} = 0x1008 |
read |
Read |
{dm-data0} |
- |
Returns the value that was in |
With system bus access, addresses are physical system bus addresses.
Read a word from memory using system bus access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-sbcs} |
{sbcs-sbaccess}\(=2\), {sbcs-sbreadonaddr} |
Setup |
Write |
{dm-sbaddress0} |
address |
|
Read |
{dm-sbdata0} |
- |
Value read from memory |
Read block of memory using system bus access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-sbcs} |
{sbcs-sbaccess}\(=2\), {sbcs-sbreadonaddr}, {sbcs-sbreadondata}, {sbcs-sbautoincrement} |
Turn on autoread and autoincrement |
Write |
{dm-sbaddress0} |
address |
Writing address triggers read and increment |
Read |
{dm-sbdata0} |
- |
Value read from memory |
Read |
{dm-sbdata0} |
- |
Next value read from memory |
… |
… |
… |
… |
Write |
{dm-sbcs} |
0 |
Disable autoread |
Read |
{dm-sbdata0} |
- |
Get last value read from memory. |
Memory can be accessed through the Program Buffer by having the hart perform loads/stores. Whether the addresses are physical or virtual depends on the system configuration.
Read a word from memory using program buffer:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
|
|
Write |
|
|
|
Write |
{dm-data0} |
address |
|
Write |
{dm-command} |
{accessregister-transfer}, {accessregister-write}, {accessregister-postexec}, {accessregister-regno} = 0x1008 |
Write |
Write |
{dm-command} |
{accessregister-regno} = 0x1008 |
Read |
Read |
{dm-data0} |
- |
Value read from memory |
Read block of memory using program buffer:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
|
|
Write |
|
|
|
Write |
|
|
|
Write |
{dm-data0} |
address |
|
Write |
{dm-command} |
{accessregister-transfer}, {accessregister-write}, {accessregister-postexec}, {accessregister-regno} = 0x1008 |
Write |
Write |
{dm-command} |
{accessregister-postexec}, {accessregister-regno} = 0x1009 |
Read |
Write |
{dm-abstractauto} |
{abstractauto-autoexecdata}[0] |
Set {abstractauto-autoexecdata}[0] |
Read |
{dm-data0} |
- |
Get value read from memory, then execute program buffer |
Read |
{dm-data0} |
- |
Get next value read from memory, then execute program buffer |
… |
… |
… |
… |
Write |
{dm-abstractauto} |
0 |
Clear {abstractauto-autoexecdata}[0] |
Read |
{dm-data0} |
- |
Get last value read from memory. |
Abstract memory accesses act as if they are performed by the hart, although the actual implementation may differ.
Read a word from memory using abstract memory access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
|
address |
|
Write |
{dm-command} |
cmdtype=2, {accessmemory-aamsize}\(=2\) |
|
Read |
{dm-data0} |
- |
Value read from memory |
Read block of memory using abstract memory access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-abstractauto} |
1 |
Re-execute the command when {dm-data0} is accessed |
Write |
|
address |
|
Write |
{dm-command} |
cmdtype=2, {accessmemory-aamsize}\(=2\), {accessmemory-aampostincrement}\(=1\) |
|
Read |
{dm-data0} |
- |
Read value, and trigger reading of next address |
… |
… |
… |
… |
Write |
{dm-abstractauto} |
0 |
Disable auto-exec |
Read |
{dm-data0} |
- |
Get last value read from memory. |
With system bus access, addresses are physical system bus addresses.
Write a word to memory using system bus access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-sbcs} |
{sbcs-sbaccess}\(=2\) |
Configure access size |
Write |
{dm-sbaddress0} |
address |
|
Write |
{dm-sbdata0} |
value |
Write a block of memory using system bus access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-sbcs} |
{sbcs-sbaccess}\(=2\), {sbcs-sbautoincrement} |
Turn on autoincrement |
Write |
{dm-sbaddress0} |
address |
|
Write |
{dm-sbdata0} |
value0 |
|
Write |
{dm-sbdata0} |
value1 |
|
… |
… |
… |
… |
Write |
{dm-sbdata0} |
valueN |
Through the Program Buffer, the hart performs the memory accesses. Addresses are physical or virtual (depending on and other system configuration).
Write a word to memory using program buffer:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
|
|
Write |
|
|
|
Write |
{dm-data0} |
address |
|
Write |
{dm-command} |
{accessregister-transfer}, {accessregister-write}, {accessregister-regno} = 0x1008 |
Write |
Write |
{dm-data0} |
value |
|
Write |
{dm-command} |
{accessregister-transfer}, {accessregister-write}, {accessregister-postexec}, {accessregister-regno} = 0x1009 |
Write |
Write block of memory using program buffer:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
|
|
Write |
|
|
|
Write |
|
|
|
Write |
{dm-data0} |
address |
|
Write |
{dm-command} |
{accessregister-transfer}, {accessregister-write}, {accessregister-regno} = 0x1008 |
Write |
Write |
{dm-data0} |
value0 |
|
Write |
{dm-command} |
{accessregister-transfer}, {accessregister-write}, {accessregister-postexec}, {accessregister-regno} = 0x1009 |
Write |
Write |
{dm-abstractauto} |
{abstractauto-autoexecdata}[0] |
Set {abstractauto-autoexecdata}[0] |
Write |
{dm-data0} |
value1 |
|
… |
… |
… |
… |
Write |
{dm-data0} |
valueN |
|
Write |
{dm-abstractauto} |
0 |
Clear {abstractauto-autoexecdata}[0] |
Abstract memory accesses act as if they are performed by the hart, although the actual implementation may differ.
Write a word to memory using abstract memory access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
|
address |
|
Write |
{dm-data0} |
value |
|
Write |
{dm-command} |
cmdtype=2, {accessmemory-aamsize}=2, write=1 |
Write a block of memory using abstract memory access:
Op | Address | Value | Comment |
---|---|---|---|
Write |
|
address |
|
Write |
{dm-data0} |
value0 |
|
Write |
{dm-command} |
cmdtype=2, {accessmemory-aamsize}\(=2\), write\(=1\), {accessmemory-aampostincrement}\(=1\) |
|
Write |
{dm-abstractauto} |
1 |
Re-execute the command when {dm-data0} is accessed |
Write |
{dm-data0} |
value1 |
|
Write |
{dm-data0} |
value2 |
|
… |
… |
… |
… |
Write |
{dm-data0} |
valueN |
|
Write |
{dm-abstractauto} |
0 |
Disable auto-exec |
A debugger can use hardware triggers to halt a hart when a certain event occurs. Below are some examples, but as there is no requirement on the number of features of the triggers implemented by a hart, these examples might not be applicable to all implementations. When a debugger wants to set a trigger, it writes the desired configuration, and then reads back to see if that configuration is supported. All examples assume XLEN=32.
Enter Debug Mode when the instruction at 0x80001234 is executed, to be used as an instruction breakpoint in ROM:
{csr-tdata1} |
0x6980105c |
type=6, dmode=1, action=1, select=0, match=0, m=1, s=1, u=1, vs=1, vu=1, execute=1 |
{csr-tdata2} |
0x80001234 |
address |
Enter Debug Mode when performing a load at address 0x80007f80 in M-mode or S-mode or U-mode:
{csr-tdata1} |
0x68001059 |
type=6, dmode=1, action=1, select=0, match=0, m=1, s=1, u=1, load=1 |
{csr-tdata2} |
0x80007f80 |
address |
Enter Debug Mode when storing to an address between 0x80007c80 and 0x80007cef (inclusive) in VS-mode or VU-mode when hgatp.VMID=1:
{csr-tdata1} 0 |
0x69801902 |
type=6, dmode=1, action=1, chain=1, select=0, match=2, vs=1, vu=1, store=1 |
{csr-tdata2} 0 |
0x80007c80 |
start address (inclusive) |
{csr-textra32} 0 |
0x03000000 |
mhselect=6, mhvalue=0 |
{csr-tdata1} 1 |
0x69801182 |
type=6, dmode=1, action=1, select=0, match=3, vs=1, vu=1, store=1 |
{csr-tdata2} 1 |
0x80007cf0 |
end address (exclusive) |
{csr-textra32} 1 |
0x03000000 |
mhselect=6, mhvalue=0 |
Enter Debug Mode when storing to an address between 0x81230000 and 0x8123ffff (inclusive):
{csr-tdata1} |
0x698010da |
type=6, dmode=1, action=1, select=0, match=1, m=1, s=1, u=1, vs=1, vu=1, store=1 |
{csr-tdata2} |
0x81237fff |
16 upper bits to match exactly, then 0, then all ones. |
Enter Debug Mode when loading from an address between 0x86753090 and 0x8675309f or between 0x96753090 and 0x9675309f (inclusive):
{csr-tdata1} 0 |
0x69801a59 |
type=6, dmode=1, action=1, chain=1, match=4, m=1, s=1, u=1, vs=1, vu=1, load=1 |
{csr-tdata2} 0 |
0xfff03090 |
Mask for low half, then match for low half |
{csr-tdata1} 1 |
0x698012d9 |
type=6, dmode=1, action=1, match=5, m=1, s=1, u=1, vs=1, vu=1, load=1 |
{csr-tdata2} 1 |
0xefff8675 |
Mask for high half, then match for high half |
Generally the debugger can avoid exceptions by being careful with the programs it writes. Sometimes they are unavoidable though, e.g. if the user asks to access memory or a CSR that is not implemented. A typical debugger will not know enough about the hardware platform to know what’s going to happen, and must attempt the access to determine the outcome.
When an exception occurs while executing the Program Buffer, {dm-command} becomes set. The debugger can check this field to see whether a program encountered an exception. If there was an exception, it’s left to the debugger to know what must have caused it.
There are a variety of instructions to transfer data between GPRs and
the data
registers. They are either loads/stores or CSR reads/writes.
The specific addresses also vary. This is all specified in {dm-hartinfo}. The
examples here use the pseudo-op transfer dest, src
to represent all
these options.
Halt the hart for a minimum amount of time to perform a single memory write:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
|
Save |
Write |
|
|
Read first argument (address) |
Write |
|
|
Save |
Write |
|
|
Read second argument (data) |
Write |
|
|
|
Write |
|
|
Restore |
Write |
|
|
Restore |
Write |
|
|
|
Write |
{dm-data0} |
address |
|
Write |
|
data |
|
Write |
{dm-command} |
0x10000000 |
Perform quick access |
This shows an example of setting the {mcontrol-m} bit in to enable a hardware breakpoint in M-mode. Similar quick access instructions could have been used previously to configure the trigger that is being enabled here:
Op | Address | Value | Comment |
---|---|---|---|
Write |
{dm-progbuf0} |
|
Save |
Write |
|
|
Form the mask for {mcontrol-m} bit |
Write |
|
|
Apply the mask to {csr-mcontrol} |
Write |
|
|
Restore |
Write |
|
|
|
Write |
{dm-command} |
0x10000000 |
Perform quick access |
The spec contains a few features to aid in writing a native debugger. This section describes how some common tasks might be achieved.
Single step is straightforward if the OS or a debug stub runs in M-Mode while the
program being debugged runs in a less privileged mode. When a step is required,
the OS or debug stub writes {icount-count}=1, {icount-action}=0,
{icount-m}=0 before returning control to the lower user program with an
mret
instruction.
Stepping code running in the same privilege mode as the debugger is more complicated, depending on what other debug features are implemented.
If hardware implements {tcontrol-mpte} and {tcontrol-mte}, then stepping through non-trap code which doesn’t allow for nested interrupts is also straightforward.
If hardware automatically prevents {mcontrol6-action}=0 triggers from matching when entering a trap handler as described in [nativetrigger], then a carefully written trap handler can ensure that interrupts are disabled whenever the icount trigger must not match.
If neither of these features exist, then single step is doable, but tricky to get right. To single step, the debug stub would execute something like:
li t0, count=4, action=0, m=1 csrw tdata1, t0 /* Write the trigger. */ lw t0, 8(sp) /* Restore t0, count decrements to 3 */ lw sp, 0(sp) /* Restore sp, count decrements to 2 */ mret /* Return to program being debugged. count decrements to 1 */
There is an additional problem with using {csr-icount} to single step. An instruction may cause an exception into a more privileged mode where the trigger is not enabled. The exception handler might address the cause of the exception, and then restart the instruction. Examples of this include page faults, FPU instructions when the FPU is not yet enabled, and interrupts. When a user is single stepping through such code, they will have to step twice to get past the restarted instruction. The first time the exception handler runs, and the second time the instruction actually executes. That is confusing and usually undesirable.
To help users out, debuggers should detect when a single step restarted an instruction, and then step again. This way the users see the expected behavior of stepping over the instruction. Ideally the debugger would notify the user that an exception handler executed the first time.
The debugger should perform this extra step when the PC doesn’t change during a regular step.
Note
|
It is safe to perform an extra step when the PC changes, because every RISC-V instruction either changes the PC or has side effects when repeated, but never both. |
To avoid an infinite loop if the exception handler does not address the cause of the exception, the debugger must execute no more than a single extra step.