Fuzzing goblin (Rust 🦀!) project with Sydr and AFLplusplus
In this article you will learn how to use the hybrid fuzzing tool sydr-fuzz to fuzz Rust 🦀 projects. Sydr-fuzz combines the power of Sydr, a dynamic symbolic execution tool, with AFLplusplus. It also supports another fuzzing engine, libFuzzer. In this guide we will focus on preparing targets for AFLplusplus, Sydr, code coverage, and libFuzzer. We will do hybrid fuzzing using Sydr & AFL++ and then collect code coverage. We will also use our crash triage tool casr and apply Sydr to check security predicates, which helps find interesting bugs using symbolic execution techniques. And of course, we will take the specifics of Rust 🦀 projects into account along the way!
Goblin is a nice library for parsing binaries in various executable formats. Before I start, note that a docker container with the whole fuzzing environment is already prepared for building: targets for AFL++, Sydr, libFuzzer, and code coverage. Later, we will use it with some modifications.
We are lucky: goblin already supports fuzzing with libFuzzer! Fuzzing Rust crates with libFuzzer is often done using cargo fuzz. You can learn how to use cargo fuzz from the Rust Fuzz Book. Goblin has two fuzz targets: parse and parse_elf. Building fuzz targets for libFuzzer is easy with the command cargo fuzz build -O. Now I want to give you my first piece of advice. Build the fuzz targets with these flags: RUSTFLAGS="-C panic=abort" cargo fuzz build -O. Why is it useful? According to this, with panic=abort we do not waste time on stack unwinding for every crash, and we get smaller and more accurate stack traces for later triage with casr.
So far so good, but we want to use AFL++ as the fuzzer. cargo afl is suitable for this purpose. You can also learn how to use cargo afl from the Rust Fuzz Book. Now we need to create fuzz targets for AFL++ based on the libFuzzer ones. It's easy. Here is an example for the parse target.
#[macro_use]
extern crate afl;

fn main() {
    // The closure is called once for every input generated by AFL++.
    fuzz!(|data: &[u8]| {
        let _ = goblin::Object::parse(data);
    });
}
As with libFuzzer, we build the target with -C panic=abort using this command: RUSTFLAGS="-C panic=abort" cargo afl build --release
Now we need to build a target for Sydr. It's also easy. Here is an example for the parse target.
extern crate goblin;

use std::env;
use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    let args: Vec<String> = env::args().collect();
    if args.len() >= 2 {
        // The input file path is the first command-line argument.
        let filename = &args[1];
        let mut f = File::open(filename).expect("no file found");
        let metadata = std::fs::metadata(filename).expect("unable to read metadata");
        let mut data = vec![0; metadata.len() as usize];
        // read_exact fills the whole buffer; a plain read may stop early.
        f.read_exact(&mut data).expect("unable to read file");
        let _ = goblin::Object::parse(&data);
    }
    Ok(())
}
We just need to read the input file into a u8 vector and pass it to the parse function. Build it using RUSTFLAGS="-C panic=abort" cargo build --release. I also have to mention that although we build in release mode, we want overflow-checks to be enabled. This is very useful for symbolic execution, because these checks are conditional branches that symbolic execution can invert, leading the target to an error state. Here is an example Cargo.toml, but you could also do it via RUSTFLAGS.
[profile.release]
debug = true
panic = 'abort'
overflow-checks = true
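To make the point about conditional branches more concrete, here is a minimal sketch that spells out by hand the check that overflow-checks = true adds to a plain `+`. The function names and the offset/size scenario are made up for illustration and are not taken from goblin.

```rust
/// Hypothetical helper: computes where a section ends inside a file.
/// With overflow-checks enabled, a plain `offset + size` behaves roughly
/// like this: a conditional branch that panics when the sum does not fit.
fn end_offset(offset: u32, size: u32) -> u32 {
    match offset.checked_add(size) {
        Some(end) => end,
        // This is the branch a symbolic executor can try to reach
        // by solving for `offset` and `size`.
        None => panic!("attempt to add with overflow"),
    }
}

/// The same arithmetic without the check, which is what a release build
/// does by default: the result silently wraps around.
fn end_offset_wrapping(offset: u32, size: u32) -> u32 {
    offset.wrapping_add(size)
}
```

With the check in place, an input that makes the sum overflow ends in a panic (and, with panic=abort, in an abort), while the wrapping variant just continues with a wrong value.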
And last but not least, the binary for code coverage. For this purpose we take the Sydr target and build it with the following command: RUSTFLAGS="-C instrument-coverage" cargo build
Good, we have learned how to prepare all the needed binaries. Let's build the docker image with the fuzzing environment using these instructions. But before we start building, let's check out some old commit in the Dockerfile (for example, git checkout 59ec2f3c57c53aa828b6a4cb4730d1efe3e43a05). Goblin is a nice library where new crashes found by fuzzing are fixed rapidly, so with an older commit we will definitely find some crashes.
We are going to start hybrid fuzzing using sydr-fuzz with Sydr & AFL++. I've slightly modified the parse-afl++.toml.
exit-on-time = 3600
[sydr]
target = "/sydr_parse @@"
jobs = 2
[aflplusplus]
target = "/afl_parse"
args = "-i /corpus"
jobs = 4
[cov]
target = "/cov_parse @@"
Let's have a brief look at this config file:
exit-on-time is an optional parameter that takes a time in seconds. If the coverage does not increase during this time (1 hour in our case), fuzzing is automatically terminated.
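The stop condition described for exit-on-time can be pictured with the small sketch below. It is only an illustration of the idea and not how sydr-fuzz is implemented; the `StallTimer` type and its methods are hypothetical names, and time is passed in as plain seconds to keep the logic easy to follow.

```rust
/// Remembers the best coverage seen so far and when it last improved.
struct StallTimer {
    limit_secs: u64,
    best_coverage: u64,
    last_increase_secs: u64,
}

impl StallTimer {
    fn new(limit_secs: u64) -> Self {
        StallTimer { limit_secs, best_coverage: 0, last_increase_secs: 0 }
    }

    /// Records a coverage sample taken at `now_secs` (seconds since start)
    /// and returns true once coverage has not grown for `limit_secs`.
    fn should_exit(&mut self, now_secs: u64, coverage: u64) -> bool {
        if coverage > self.best_coverage {
            // New coverage resets the countdown.
            self.best_coverage = coverage;
            self.last_increase_secs = now_secs;
        }
        now_secs.saturating_sub(self.last_increase_secs) >= self.limit_secs
    }
}
```

Every new piece of coverage resets the countdown, so a run can last much longer than the configured hour as long as coverage keeps growing.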
I set two Sydr instances and four AFL++ instances. It's time to start fuzzing:
# sydr-fuzz -c parse-afl++.toml run
After several minutes I noticed that both AFL++ and Sydr itself had found some crashes. Let's wait until fuzzing ends.
After 17 hours fuzzing ended, and we found 2 timeouts and 559 crashes!
I know you want to apply casr as soon as possible to analyze crashes. "Now don't be hasty, Master Meriadoc." (c) Treebeard.
Let's minimize the input corpus first:
# sydr-fuzz -c parse-afl++.toml cmin
afl-cmin narrowed 14994 files down to 1982. It's a nice result. Let's collect code coverage and check security predicates before crash triage.
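For intuition about what corpus minimization does, here is a simplified greedy sketch: keep picking the file that covers the most not-yet-covered edges until nothing is left uncovered. This is an illustration only; afl-cmin uses its own selection strategy, and the `minimize` function and the edge ids are assumptions made for this example.

```rust
use std::collections::BTreeSet;

/// Greedy corpus minimization over pre-computed coverage.
/// `corpus` holds (file name, set of covered edge ids);
/// the result lists the files that together cover every edge.
fn minimize(corpus: &[(&str, BTreeSet<u32>)]) -> Vec<String> {
    // Start with every edge that any file covers.
    let mut uncovered: BTreeSet<u32> = corpus
        .iter()
        .flat_map(|(_, edges)| edges.iter().copied())
        .collect();
    let mut kept = Vec::new();
    while !uncovered.is_empty() {
        // Pick the file that covers the most still-uncovered edges.
        let (name, edges) = corpus
            .iter()
            .max_by_key(|(_, edges)| edges.intersection(&uncovered).count())
            .expect("uncovered edges imply a non-empty corpus");
        for edge in edges {
            uncovered.remove(edge);
        }
        kept.push(name.to_string());
    }
    kept
}
```

Files whose coverage is fully contained in what the kept files already cover are dropped, which is why the corpus shrinks without losing coverage.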
sydr-fuzz provides a convenient way to collect coverage for Rust targets too. Let's try it.
# sydr-fuzz -c parse.toml cov-export -- -format=lcov > parse.lcov
# genhtml --ignore-errors source -o parse_html parse.lcov
Here is some output that we've got.
# export PATH=/root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin/:$PATH
The PATH change above is already done in the Dockerfile. It is needed because sydr-fuzz must use the same LLVM toolchain that was used to build the coverage binary. As a result, we have collected code coverage that can be analyzed.
The idea behind security predicates is briefly described in the xlnt guide. Let's check security predicates. The resulting corpus is still not small, and I use my PC for fuzzing this example: it has 6 cores and 32GB RAM. So I'll check security predicates on a subset of inputs (for example, 256 inputs) using 4 Sydr jobs.
Here I have to say that we need to build Sydr's target with -C panic=abort, because we don't have UBSAN & ASAN. What we do have is overflow-checks = true. Without -C panic=abort the target will silently exit, and we won't know about an overflow or any other panic.
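The difference is visible in how the process terminates. By default a panicking Rust program unwinds and exits with code 101, while with panic=abort it is killed by SIGABRT, which looks like a real crash to tools that watch for signals. The sketch below classifies a finished run from the outside; the `Outcome` enum and the `classify` function are illustrative names, and this is not a description of Sydr's internals.

```rust
/// How a finished target run looks from the outside.
#[derive(Debug, PartialEq)]
enum Outcome {
    Clean,
    /// Default panic = "unwind": exit code 101, no signal.
    PanicUnwind,
    /// panic = "abort": the process dies from SIGABRT (6 on Linux).
    Abort,
    Other,
}

/// Classifies a run by its exit code and terminating signal.
fn classify(exit_code: Option<i32>, signal: Option<i32>) -> Outcome {
    match (exit_code, signal) {
        (Some(0), _) => Outcome::Clean,
        (Some(101), _) => Outcome::PanicUnwind,
        (None, Some(6)) => Outcome::Abort,
        _ => Outcome::Other,
    }
}

/// Convenience wrapper for a real child process on Unix.
#[cfg(unix)]
fn classify_status(status: std::process::ExitStatus) -> Outcome {
    use std::os::unix::process::ExitStatusExt;
    classify(status.code(), status.signal())
}
```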
To check security predicates, I use this command:
# sydr-fuzz -c parse-afl++.toml security -j 4 --runs 256
After some time Sydr found something. Let's wait until the security predicates check is finished. Once it finished, we had found one more crash. Now we have 560 crashes to analyze.
At last, it's time for crash triage! For crash triage I use casr via the sydr-fuzz casr subcommand:
# sydr-fuzz -c parse-afl++.toml casr
You can learn more about casr from its repository or from my other fuzzing tutorial.
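As a rough intuition for why triage shrinks hundreds of crashes to a handful, the sketch below groups crashing inputs by crash line. It is a deliberately simplified stand-in: casr works with whole stack traces rather than a single line, and the `Crash` struct, the `group_by_crash_line` function, and any file names passed to it are hypothetical.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a crash report: which input crashed and where.
struct Crash {
    input: &'static str,
    crash_line: &'static str,
}

/// Groups inputs by crash line so each distinct location is reviewed once.
fn group_by_crash_line(crashes: &[Crash]) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut groups = BTreeMap::new();
    for crash in crashes {
        groups
            .entry(crash.crash_line)
            .or_insert_with(Vec::new)
            .push(crash.input);
    }
    groups
}
```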
Let's look at casr output:
After deduplication we have 11 crashes split into 4 clusters. We can also see that in cl3 there are 5 crashes with the same crash line. From a brief look, all crashes have already been fixed except one from cl4. Let's look at its casr report.
So what do we have here? There is an integer overflow. Honestly, I don't know whether it affects anything. It looks safe for now. Maybe we need to open an issue for it.
The attentive reader will ask me: "What about the two timeouts?" I've checked them. The target is slow on them, about a minute each, but no infinite loops were spotted! I think we need some automation for this, hmm?
In this article I tried to shed some light on interesting aspects of fuzzing Rust 🦀 projects. I've shown how to do hybrid fuzzing with sydr-fuzz, minimize the corpus, collect code coverage, check security predicates, and triage crashes. However, Sydr and sydr-fuzz are commercial products, and if you don't have access to them, you could try to do some parts without them.
- Preparing fuzz targets works the same way for AFL++ without sydr-fuzz, and also for other symbolic executors such as Fuzzolic. You could also build and use the provided docker container for fuzzing.
- Starting hybrid fuzzing is different. You need to start several AFL++ instances manually or with some scripts. The AFL++ docs have a lot of useful information about this (including how to run AFL++ with Fuzzolic). For corpus minimization you could do the following: copy inputs from all nodes into one folder and run afl-cmin.
- To collect coverage you need to run the coverage binary on the minimized corpus, merge the .profraw files, and export the result as lcov (maybe write a script for that). Then use genhtml to create an HTML report.
- Checking security predicates is a Sydr feature (paper), though there are some similar academic papers in the wild.
- casr is open source and has integration with AFL++! You could easily use it:
casr-afl -i /path/to/afl/out/dir -o /path/to/casr/out/dir
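The manual coverage step from the list above could be scripted, for example, with a small Rust driver like the sketch below. It only builds the commands; the helper names, the binary path, and the file names are assumptions for illustration. It relies on the LLVM_PROFILE_FILE environment variable and on llvm-profdata and llvm-cov from the same toolchain being in PATH.

```rust
use std::process::Command;

/// Runs the coverage binary on one input. The profiling runtime writes
/// the raw profile to the path given in LLVM_PROFILE_FILE.
fn run_target(cov_binary: &str, input: &str, profraw: &str) -> Command {
    let mut cmd = Command::new(cov_binary);
    cmd.arg(input).env("LLVM_PROFILE_FILE", profraw);
    cmd
}

/// Merges all raw profiles into a single indexed profile.
fn merge_profiles(profraws: &[String], profdata: &str) -> Command {
    let mut cmd = Command::new("llvm-profdata");
    cmd.arg("merge").arg("-sparse").args(profraws).arg("-o").arg(profdata);
    cmd
}

/// Exports the merged profile as lcov; the report goes to stdout.
fn export_lcov(cov_binary: &str, profdata: &str) -> Command {
    let mut cmd = Command::new("llvm-cov");
    cmd.arg("export")
        .arg("-format=lcov")
        .arg(format!("-instr-profile={}", profdata))
        .arg(cov_binary);
    cmd
}
```

Each helper returns a Command, so a driver can call status() or output() on it, save the output of export_lcov to a file such as parse.lcov, and then pass that file to genhtml.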
Andrey Fedotov