Skip to content

Commit

Permalink
remove all panicking code from gdbstub!!
Browse files Browse the repository at this point in the history
this is something I've been meaning to do for a _long_ time now, and
after many failed attempts and false-starts, I've finally done it! if
you inspect the asm output of `example_no_std`, you'll find that there
is absolutely _zero_ panic handling machinery in the final binary!

This has also resulted in a _substantial_ drop in binary size, as those
bounds checks and panicking machinery were taking up a lot of space!

removing panicking code ended up requiring 3 different approaches:

1. Rewriting array-indexing operations to use simpler indexing, which
the compiler is able to optimize better. e.g: see the code used to index
into the `target.xml` buffer in `base.rs`
2. Adding a sprinkle of unsafe code to certain array-indexing operations
that the compiler is unsable to prove are safe. This was only done in
two places: `decode_hex_buf` and `PacketBuf`.
3. Manually re-implementing the standard library's `slice::split_mut`
and `slice::splitn_mut` methods to elide a bounds check.
  - Tracked upstream via rust-lang/rust#86313

Introducing unsafe code isn't something I take lightly, and while I've
done my best to audit and validate the unsafe code I've written, I did
end up including an optional `paranoid_unsafe` feature which gives
end-users the option to opt-out of `gdbstub`'s no-panic guarantee in
exchange for some additional peace of mind.
  • Loading branch information
daniel5151 committed Jun 19, 2021
1 parent 62e7646 commit ecbbaf7
Show file tree
Hide file tree
Showing 14 changed files with 307 additions and 53 deletions.
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ goblin = "0.2"
default = ["std"]
alloc = ["managed/alloc"]
std = ["alloc"]
paranoid_unsafe = []

# INTERNAL: enables the `__dead_code_marker!` macro.
# used as part of the `scripts/test_dead_code_elim.sh`
Expand Down
26 changes: 19 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,17 +22,19 @@ Why use `gdbstub`?
- Organizes GDB's countless optional protocol extensions into a coherent, understandable, and type-safe hierarchy of traits.
- Automatically handles client/server protocol feature negotiation, without needing to micro-manage the specific [`qSupported` packet](https://sourceware.org/gdb/onlinedocs/gdb/General-Query-Packets.html#qSupported) response.
- `gdbstub` makes _extensive_ use of Rust's powerful type system + generics to enforce protocol invariants at compile time, minimizing the number of tricky protocol details end users have to worry about.
- Using a novel technique called [**Inlineable Dyn Extension Traits**](#zero-overhead-protocol-extensions) (IDETs), `gdbstub` enables fine-grained control over active protocol extensions _without_ relying on clunky `cargo` features.
- Using a novel technique called [**Inlineable Dyn Extension Traits**](#zero-overhead-protocol-extensions) (IDETs), `gdbstub` enables fine-grained control over active protocol extensions _without_ relying on clunky `cargo` features or the use of `unsafe` code!
- **Easy to Integrate**
- `gdbstub`'s API is designed to be as unobtrusive as possible, and shouldn't require any large refactoring effort to integrate into an existing project. It doesn't require taking direct ownership of any key data structures, and aims to be a "drop in" solution when you need to add debugging to a project.
- **`#![no_std]` Ready & Size Optimized**
- `gdbstub` is a **`no_std` first** library, whereby all protocol features are required to be `no_std` compatible.
- `gdbstub` does not require _any_ dynamic memory allocation, and can be configured to use fixed-size, pre-allocated buffers. This enables `gdbstub` to be used on even the most resource constrained, no-[`alloc`](https://doc.rust-lang.org/alloc/) platforms.
- `gdbstub` is transport-layer agnostic, and uses a basic [`Connection`](https://docs.rs/gdbstub/latest/gdbstub/trait.Connection.html) interface to communicate with the GDB server. As long as target has some method of performing in-order, serial, byte-wise I/O (e.g: putchar/getchar over UART), it's possible to run `gdbstub` on it.
- "You don't pay for what you don't use": All code related to parsing/handling protocol extensions is guaranteed to be dead-code-eliminated from an optimized binary if left unimplmeneted! See the [Zero-overhead Protocol Extensions](#zero-overhead-protocol-extensions) section below for more details.
- `gdbstub` tries to keep the binary and RAM footprint of its minimal configuration to a bare minimum, enabling it to be used on even the most resource-constrained microcontrollers.
- When compiled in release mode, using all the tricks outlined in [`min-sized-rust`](https://github.com/johnthagen/min-sized-rust), a baseline `gdbstub` implementation weighs in at **_roughly 10kb of `.text` and negligible `.rodata`!_** \*
- \* Exact numbers vary by target platform, compiler version, and `gdbstub` revision. Data was collected using the included `example_no_std` project compiled on x86_64.
- `gdbstub` is entirely **panic free** (when compiled in release mode, without the `paranoid_unsafe` cargo feature).
- Validated by inspecting the asm output of the in-tree `example_no_std`.
- `gdbstub` is transport-layer agnostic, and uses a basic [`Connection`](https://docs.rs/gdbstub/latest/gdbstub/trait.Connection.html) interface to communicate with the GDB server. As long as target has some method of performing in-order, serial, byte-wise I/O (e.g: putchar/getchar over UART), it's possible to run `gdbstub` on it!
- "You don't pay for what you don't use": All code related to parsing/handling protocol extensions is guaranteed to be dead-code-eliminated from an optimized binary if left unimplemented! See the [Zero-overhead Protocol Extensions](#zero-overhead-protocol-extensions) section below for more details.
- `gdbstub`'s minimal configuration has an incredibly low binary size + RAM overhead, enabling it to be used on even the most resource-constrained microcontrollers.
- When compiled in release mode, using all the tricks outlined in [`min-sized-rust`](https://github.com/johnthagen/min-sized-rust), a baseline `gdbstub` implementation can weigh in at **_less than 10kb of `.text` + `.rodata`!_** \*
- \*Exact numbers vary by target platform, compiler version, and `gdbstub` revision. Data was collected using the included `example_no_std` project compiled on x86_64.

### Can I Use `gdbstub` in Production?

Expand Down Expand Up @@ -102,6 +104,11 @@ When using `gdbstub` in `#![no_std]` contexts, make sure to set `default-feature
- Implement `Connection` for [`TcpStream`](https://doc.rust-lang.org/std/net/struct.TcpStream.html) and [`UnixStream`](https://doc.rust-lang.org/std/os/unix/net/struct.UnixStream.html).
- Implement [`std::error::Error`](https://doc.rust-lang.org/std/error/trait.Error.html) for `gdbstub::Error`.
- Add a `TargetError::Io` variant to simplify `std::io::Error` handling from Target methods.
- `paranoid_unsafe`
- Enabling the `paranoid_unsafe` feature will swap out a handful of unsafe `get_unchecked_mut` operations with their safe equivalents, at the expense of introducing panicking code into `gdbstub`.
- `rustc` + LLVM do a pretty incredible job at eliding bounds checks... most of the time. Unfortunately, there are a few places in the code where the compiler is not smart enough to "prove" that a bounds check isn't needed, and a bit of unsafe code is required to remove those bounds checks.
- This feature is **disabled** by default, as the unsafe code has been aggressively audited and tested for correctness. That said, if you're particularly paranoid about the use of unsafe code, enabling this feature may offer some piece of mind.
- Please refer to the [`unsafe` in `gdbstub`](#unsafe-in-gdbstub) section below for more details.

## Examples

Expand Down Expand Up @@ -150,8 +157,13 @@ If you happen to stumble across this crate and end up using it to debug some bar

- When no cargo features are enabled:
- A few trivially safe calls to `NonZeroUsize::new_unchecked()` when defining internal constants.

- When the `paranoid_unsafe` feature is enabled, the following `unsafe` code is _removed_:
- `src/protocol/packet.rs`: Swaps a couple slice-index methods in `PacketBuf` to use `get_unchecked_mut`. The public API of struct ensures that the bounds used to index into the array remain in-bounds.
- `src/protocol/common/hex`: Use an alternate implementation of `decode_hex_buf` which uses unsafe slice indexing.

- When the `std` feature is enabled:
- An implementation of `UnixStream::peek` which uses `libc::recv`. This manual implementation will be removed once [rust-lang/rust#76923](https://github.com/rust-lang/rust/issues/76923) is stabilized.
- `src/connection/impls/unixstream.rs`: An implementation of `UnixStream::peek` which uses `libc::recv`. This manual implementation will be removed once [rust-lang/rust#76923](https://github.com/rust-lang/rust/issues/76923) is stabilized.

## Future Plans + Roadmap to `1.0.0`

Expand Down
1 change: 1 addition & 0 deletions example_no_std/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,3 +19,4 @@ panic = "abort"
opt-level = 's' # Optimize for size.
lto = true
codegen-units = 1
debug = true
20 changes: 9 additions & 11 deletions src/gdbstub_impl/ext/base.rs
Original file line number Diff line number Diff line change
Expand Up @@ -130,22 +130,20 @@ impl<T: Target, C: Connection> GdbStubImpl<T, C> {

match xml {
Some(xml) => {
let xml = xml.trim();
if cmd.offset >= xml.len() {
// no more data
res.write_str("l")?;
} else if cmd.offset + cmd.len >= xml.len() {
// last little bit of data
res.write_str("l")?;
res.write_binary(&xml.as_bytes()[cmd.offset..])?
} else {
let xml = xml.trim().as_bytes();
if cmd.offset < xml.len() {
// still more data
res.write_str("m")?;
res.write_binary(&xml.as_bytes()[cmd.offset..(cmd.offset + cmd.len)])?
res.write_binary(
&xml[cmd.offset..][..cmd.len.min(xml.len() - cmd.offset)],
)?
} else {
// no more data
res.write_str("l")?;
}
}
// If the target hasn't provided their own XML, then the initial response to
// "qSupported" wouldn't have included "qXfer:features:read", and gdb wouldn't
// "qSupported" wouldn't have included "qXfer:features:read", and gdb wouldn't
// send this packet unless it was explicitly marked as supported.
None => return Err(Error::PacketUnexpected),
}
Expand Down
1 change: 1 addition & 0 deletions src/protocol/commands.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ pub(self) mod prelude {
};
pub use crate::protocol::common::Bstr;
pub use crate::protocol::packet::PacketBuf;
pub use crate::util::no_panic_iter::SliceExt;
pub use core::convert::{TryFrom, TryInto};
}

Expand Down
13 changes: 8 additions & 5 deletions src/protocol/commands/_m.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,20 +29,23 @@ impl<'a> ParseCommand<'a> for m<'a> {
// +------+------------------+------------------------------------------------+

let (buf, body_range) = buf.into_raw_buf();
let body = &mut buf[body_range.start..];
let body = buf.get_mut(body_range.start..body_range.end)?;

// should return 3 slices: the addr (hex-encoded), len (hex-encoded), and the
// "rest" of the buffer
let mut body = body.split_mut(|b| *b == b',' || *b == b'#');
let mut body = body.split_mut_no_panic(|b| *b == b',');

let addr = decode_hex_buf(body.next()?).ok()?;
let addr_len = addr.len();
let len = decode_hex(body.next()?).ok()?;

drop(body);

// ensures that `split_at_mut` doesn't panic
if buf.len() < body_range.start + addr_len {
return None;
}

let (addr, buf) = buf.split_at_mut(body_range.start + addr_len);
let addr = &addr[b"$m".len()..];
let addr = addr.get(b"$m".len()..)?;

Some(m { addr, len, buf })
}
Expand Down
10 changes: 3 additions & 7 deletions src/protocol/commands/_m_upcase.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,15 +11,11 @@ impl<'a> ParseCommand<'a> for M<'a> {
fn from_packet(buf: PacketBuf<'a>) -> Option<Self> {
let body = buf.into_body();

let mut body = body.split_mut(|&b| b == b',' || b == b':');
let mut body = body.split_mut_no_panic(|&b| b == b',' || b == b':');
let addr = decode_hex_buf(body.next()?).ok()?;
let len = decode_hex(body.next()?).ok()?;
let val = body.next()?;
let val = decode_hex_buf(body.next()?).ok()?;

Some(M {
addr,
len,
val: decode_hex_buf(val).ok()?,
})
Some(M { addr, len, val })
}
}
2 changes: 1 addition & 1 deletion src/protocol/commands/_vRun.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ impl<'a> ParseCommand<'a> for vRun<'a> {
fn from_packet(buf: PacketBuf<'a>) -> Option<Self> {
let body = buf.into_body();

let mut body = body.splitn_mut(3, |b| *b == b';');
let mut body = body.splitn_mut_no_panic(3, |b| *b == b';');

let _first_semi = body.next()?;
let filename = match decode_hex_buf(body.next()?).ok()? {
Expand Down
10 changes: 4 additions & 6 deletions src/protocol/commands/breakpoint.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,5 @@
pub use crate::common::*;
pub use crate::protocol::common::hex::{decode_hex, decode_hex_buf};
pub use crate::protocol::packet::PacketBuf;
pub use core::convert::{TryFrom, TryInto};
use crate::protocol::common::hex::{decode_hex, decode_hex_buf};
use crate::util::no_panic_iter::SliceExt;

// Breakpoint packets are split up like this:
//
Expand All @@ -26,7 +24,7 @@ pub struct BasicBreakpoint<'a> {

impl<'a> BasicBreakpoint<'a> {
pub fn from_slice(body: &'a mut [u8]) -> Option<BasicBreakpoint<'a>> {
let mut body = body.splitn_mut(4, |b| matches!(*b, b',' | b';'));
let mut body = body.splitn_mut_no_panic(4, |b| matches!(*b, b',' | b';'));
let type_ = decode_hex(body.next()?).ok()?;
let addr = decode_hex_buf(body.next()?).ok()?;
let kind = decode_hex_buf(body.next()?).ok()?;
Expand All @@ -44,7 +42,7 @@ pub struct BytecodeBreakpoint<'a> {

impl<'a> BytecodeBreakpoint<'a> {
pub fn from_slice(body: &'a mut [u8]) -> Option<BytecodeBreakpoint<'a>> {
let mut body = body.splitn_mut(2, |b| *b == b';');
let mut body = body.splitn_mut_no_panic(2, |b| *b == b';');

let base = BasicBreakpoint::from_slice(body.next()?)?;

Expand Down
75 changes: 71 additions & 4 deletions src/protocol/common/hex.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ pub enum DecodeHexError {
InvalidOutput,
}

/// Decode a GDB dex string into the specified integer.
/// Decode a GDB hex string into the specified integer.
///
/// GDB hex strings may include "xx", which represent "missing" data. This
/// method simply treats "xx" as 0x00.
Expand All @@ -35,7 +35,7 @@ where
Ok(result)
}

/// Wrapper around a raw hex string. Enabled "late" calls to `decode` from
/// Wrapper around a raw hex string. Enables "late" calls to `decode` from
/// outside the `crate::protocol` module.
#[derive(Debug, Clone, Copy)]
pub struct HexString<'a>(pub &'a [u8]);
Expand Down Expand Up @@ -64,7 +64,7 @@ fn ascii2byte(c: u8) -> Option<u8> {
}
}

/// Check if the byte `c` is a valid GDB hex digit `[0-9][a-f][A-F][xX]`
/// Check if the byte `c` is a valid GDB hex digit `[0-9a-fA-FxX]`
#[allow(clippy::match_like_matches_macro)]
pub fn is_hex(c: u8) -> bool {
match c {
Expand All @@ -81,7 +81,74 @@ pub fn is_hex(c: u8) -> bool {
/// GDB hex strings may include "xx", which represent "missing" data. This
/// method simply treats "xx" as 0x00.
// TODO: maybe don't blindly translate "xx" as 0x00?
// TODO: rewrite this method to elide bound checks
#[cfg(not(feature = "paranoid_unsafe"))]
pub fn decode_hex_buf(base_buf: &mut [u8]) -> Result<&mut [u8], DecodeHexBufError> {
use DecodeHexBufError::*;

if base_buf.is_empty() {
return Ok(&mut []);
}

let odd_adust = base_buf.len() % 2;
if odd_adust != 0 {
base_buf[0] = ascii2byte(base_buf[0]).ok_or(NotAscii)?;
}

let buf = &mut base_buf[odd_adust..];

let decoded_len = buf.len() / 2;
for i in 0..decoded_len {
// SAFETY: rustc isn't smart enough to automatically elide these bound checks.
//
// If buf.len() == 0 or 1: trivially safe, since the for block is never taken
// If buf.len() >= 2: the range of values for `i` is 0..(buf.len() / 2 - 1)
let (hi, lo, b) = unsafe {
(
// (buf.len() / 2 - 1) * 2
// == (buf.len() - 2)
// since buf.len() is >2, this is in-bounds
*buf.get_unchecked(i * 2),
// (buf.len() / 2 - 1) * 2 + 1
// == (buf.len() - 1)
// since buf.len() is >2, this is in-bounds
*buf.get_unchecked(i * 2 + 1),
// since buf.len() is >2, (buf.len() / 2 - 1) is always in-bounds
buf.get_unchecked_mut(i),
)
};

let hi = ascii2byte(hi).ok_or(NotAscii)?;
let lo = ascii2byte(lo).ok_or(NotAscii)?;
*b = hi << 4 | lo;
}

// SAFETY: rustc isn't smart enough to automatically elide this bound check.
//
// Consider the different values (decoded_len + odd_adust) can take:
//
// buf.len() | (decoded_len + odd_adust)
// -----------|---------------------------
// 0 | (0 + 0) == 0
// 1 | (0 + 1) == 1
// 2 | (1 + 0) == 1
// 3 | (1 + 1) == 2
// 4 | (2 + 0) == 2
// 5 | (2 + 1) == 3
//
// Note that the computed index is always in-bounds.
//
// If I were still in undergrad, I could probably have whipped up a proper
// mathematical proof by induction or whatnot, but hopefully this "proof by
// example" ought to suffice.
unsafe { Ok(base_buf.get_unchecked_mut(..decoded_len + odd_adust)) }
}

/// Decode a GDB hex string into a byte slice _in place_.
///
/// GDB hex strings may include "xx", which represent "missing" data. This
/// method simply treats "xx" as 0x00.
// TODO: maybe don't blindly translate "xx" as 0x00?
#[cfg(feature = "paranoid_unsafe")]
pub fn decode_hex_buf(base_buf: &mut [u8]) -> Result<&mut [u8], DecodeHexBufError> {
use DecodeHexBufError::*;

Expand Down
Loading

0 comments on commit ecbbaf7

Please sign in to comment.