This artifact accompanies the paper "Demystifying the Dependency Challenge in Kernel Fuzzing". Fuzz testing operating system kernels remains a daunting task. One known challenge is that much of the kernel code is locked behind specific kernel states, and current kernel fuzzers are not effective at exploring such an enormous state space. We refer to this problem as the dependency challenge. Although some efforts have tried to address the dependency challenge, the prevalence and categorization of dependencies have never been studied. Most prior work simply attempted to recover dependencies opportunistically, whenever they were relatively easy to recognize. We undertake a substantial measurement study to systematically understand the real challenge behind dependencies. In short, the artifact helps researchers understand the dependency challenge in kernel fuzzing.
- username & password (virtual machine image): icse22ae
- zenodo archive (virtual machine image): https://doi.org/10.5281/zenodo.6029158
- also available in Google Drive: https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- zenodo archive (source code): https://doi.org/10.5281/zenodo.6029520
- github and update: https://github.com/ZHYfeng/Dependency
- zenodo archive (evaluation data): https://doi.org/10.5281/zenodo.5441138
- also available in Google Drive: `data.tar.gz` in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
sudo apt install -y git
git clone https://github.com/ZHYfeng/Dependency.git
cd Dependency
bash build_script/build.bash
- configure the kernel and image based on the requirements of syzkaller, then mv the image to `path-of-Dependency/workdir/image`
  doc of syzkaller: https://github.com/google/syzkaller/blob/master/docs/linux/setup_ubuntu-host_qemu-vm_x86-64-kernel.md
  the image we built: `image.tar.gz` in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- add `-fsanitize-coverage=no-prune` to `CFLAGS_KCOV` in kernel config
- build the kernel using clang and mv it to `path-of-Dependency/workdir/13-linux-clang-np`
  the kernel we built: `linux-clang-np.tar.gz` in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- copy the kernel and generate the bitcode of the kernel using `-fembed-bitcode -save-temps=obj`
  https://github.com/ZHYfeng/Generate_Linux_Kernel_Bitcode/tree/master/Achieve/01-change-makefile
  the bitcode we built: `linux-clang-np-bc-f.tar.gz` in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- preprocess the kernel in order to save time

      cd path-of-Dependency/workdir/13-linux-clang-np
      objdump -d vmlinux > vmlinux.objdump
      a2l -objdump=vmlinux.objdump

  the workdir we prepared: `workdir.tar.gz` in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- make a directory called `dev_xxx` in `path-of-Dependency/workdir`
- copy the bitcode (`.bc`) and assembly code (`.s`) to the directory and rename them to `built-in.bc` and `built-in.s`
- copy the configuration files `path-of-Dependency/04-experiment_script/json/dra.json` and `path-of-Dependency/04-experiment_script/json/syzkaller.json`.
  change the value of `file_bc` in `dra.json` to the relative path of the bitcode of the device driver you test.
  change the value of `path_s` in `dra.json` to the relative path of the device driver you test.
- copy the run script `path-of-Dependency/04-experiment_script/python/run.py`
- generate static analysis results based on the static-taint-analysis-component: https://zenodo.org/record/5348989/files/static-taint-analysis-component.zip
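The two `dra.json` changes described above can also be scripted. The following is a minimal sketch, assuming that `file_bc` and `path_s` are top-level keys of `dra.json` (the actual layout of the file is not described here). The function name `set_driver_paths` is hypothetical and not part of the artifact.

```python
import json
from pathlib import Path


def set_driver_paths(config_path, file_bc, path_s):
    """Rewrite file_bc and path_s in a dra.json-style file and return the new config.

    Assumption: both keys live at the top level of the JSON object.
    All other keys are written back unchanged.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["file_bc"] = file_bc
    config["path_s"] = path_s
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

It could be called as `set_driver_paths("dra.json", "built-in.bc", "<relative path of the driver>")`, where both values are placeholders to be replaced with the paths of the driver under test.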
(the paths below are those in the virtual machine)

1. activate the environment

       source /home/icse22ae/Dependency/environment.sh

2. pick one device driver in `/home/icse22ae/Dependency/workdir/workdir`, for example `cdrom`:

       cd /home/icse22ae/Dependency/workdir/workdir/dev_cdrom

3. configure the run script
   - `time_run`: the fuzzing time in seconds.
   - `number_execute`: the number of fuzzing runs.
   - `number_vm_count`: the number of VMs in each fuzzing run.

   In our paper, `time_run` is at least 48 hours, `number_execute` is 3 and `number_vm_count` is 32.
   For artifact evaluation, `number_execute` and `number_vm_count` could be 1, and `time_run` should be at least 5 minutes (20 minutes for the device driver kvm).
4. run our tool using the script. It will automatically stop after `time_run`.

       python3 run.py

5. read the results, still in the same environment as in step 1 and the same path as in step 2.

       go run /home/icse22ae/Dependency/03-syzkaller/tools/read_result/ -a2i

   Depending on the fuzzing configuration and the device driver, the time needed differs.
   For cdrom, it should take several minutes. For kvm, it needs several hours.

You can find the results used in our paper in `/home/icse22ae/Dependency/workdir/data`.
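The run-script settings described above can be summarised as follows. This is only a sketch: the three names come from the description of the run script, but how `run.py` actually stores them is not shown here, so the dictionaries `PAPER_SETTINGS` and `ARTIFACT_SETTINGS` and the helper `total_vm_seconds` are hypothetical.

```python
MINUTE = 60
HOUR = 60 * MINUTE

# Values reported for the paper's experiments (time_run is a lower bound).
PAPER_SETTINGS = {"time_run": 48 * HOUR, "number_execute": 3, "number_vm_count": 32}

# Smallest values suggested for artifact evaluation (use 20 * MINUTE for kvm).
ARTIFACT_SETTINGS = {"time_run": 5 * MINUTE, "number_execute": 1, "number_vm_count": 1}


def total_vm_seconds(settings):
    """Rough cost of one experiment: seconds per run x runs x VMs per run."""
    return (
        settings["time_run"]
        * settings["number_execute"]
        * settings["number_vm_count"]
    )
```

The product is only a rough way to compare the cost of the two configurations; it says nothing about how the runs are scheduled.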
- The `dataDependency.bin`, `dataResult.bin`, `dataRunTime.bin`, `statistics.bin` in `./0` or `./1` or `./2` are the results in protobuf format. The protobuf files are in `/home/icse22ae/Dependency/05-proto`.
- `0_coverage.txt` is the coverage of the fuzzing in `./0`. `coverage.txt` is the average coverage of all runs. Each line is `time@number-of-edge`.
- `conditionD.txt` lists all unresolved conditions related to dependency. `conditionND.txt` lists all unresolved conditions not related to dependency. `conditionDN.txt` lists all unresolved conditions related to dependency for which our static analysis cannot find the write statements.
- `intersection.txt` is the intersection coverage of all runs and `union_coverage.txt` is the union coverage of all runs. Each line is the address of an edge.
- `OutsideFunctions.txt` is the result of the `Unreachable Functions Elimination` mentioned in our paper.
- `statistic.txt` contains the statistics used in our paper.
- `uncovered.txt` lists all uncovered edges and their unresolved conditions, and `uncovered_more.txt` lists more details about them.
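The text result files can be read with a few lines of Python. The following is a minimal sketch, assuming that the time and the edge count on each `time@number-of-edge` line are plain decimal numbers and that the edge files hold one address per line. The function names are hypothetical and not part of the artifact.

```python
def parse_coverage(lines):
    """Turn 'time@number-of-edge' lines into (time, edges) float pairs."""
    points = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        time_text, edges_text = line.split("@", 1)
        points.append((float(time_text), float(edges_text)))
    return points


def read_edges(lines):
    """Collect one edge address per non-empty line into a set."""
    return {line.strip() for line in lines if line.strip()}


def combine_runs(runs):
    """Return (intersection, union) of the edge sets of several runs."""
    edge_sets = [read_edges(run) for run in runs]
    if not edge_sets:
        return set(), set()
    return set.intersection(*edge_sets), set.union(*edge_sets)
```

For instance, `parse_coverage(open("coverage.txt"))` would give the averaged coverage curve, and `combine_runs` shows how an intersection and a union of per-run edge sets relate to each other.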
We still use `dev_cdrom` as the example; its results can be found in `data.tar.gz`, as mentioned in Section Evaluation Data.
`conditionD.txt` lists all unresolved conditions related to dependency, for example:
0xffffffff8579b9b7@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279@0xffffffff8579b960@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279@mmc_ioctl_cdrom_read_audio@if.end11.i@
@ @0xffffffff857a3eaa@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2124@1@
@ @0xffffffff8579b421@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2228@0@
@ @0xffffffff8579b05a@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2187@1@
`0xffffffff8579b9b7` is the assembly address of the unresolved branch in the binary and https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279 is the source code of the unresolved dependency. `0xffffffff8579b960` is the assembly address of the condition of the unresolved branch, and https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279 is also its source code. `if.end11.i` is the name of the basic block in LLVM bitcode.
The following lines are the write addresses for the unresolved dependency.
We can then find a file `0xffffffff8579b9b7.txt`, which is named after the assembly address of the unresolved branch.
Inside this file, we can find the number of dominator instructions of this unresolved dependency, the inputs (test cases) from syzkaller that can reach the unresolved dependency, and the inputs that can reach the write address.
We can also find the call chain of the write address, starting from the entry function.
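The record layout shown in the example can be split into fields programmatically. The following is a minimal sketch, assuming the layout is exactly as in the example above: `@`-separated fields, a header line that starts with the branch address, and write-address lines that start with an empty field. The function name and the dictionary keys are hypothetical; in particular, the key `function` assumes that the fifth header field is the enclosing function name, and the meaning of the trailing `0`/`1` value on write lines is not described here, so it is kept as an opaque `flag`.

```python
def parse_condition_records(lines):
    """Group conditionD.txt-style lines into one record per unresolved branch."""
    records = []
    for raw in lines:
        fields = [field.strip() for field in raw.strip().split("@")]
        if fields[0].startswith("0x") and len(fields) >= 6:
            # Header line: branch, condition, and location information.
            records.append({
                "branch_address": fields[0],
                "branch_source": fields[1],
                "condition_address": fields[2],
                "condition_source": fields[3],
                "function": fields[4],      # assumed meaning
                "basic_block": fields[5],
                "writes": [],
            })
        elif records and fields[0] == "" and len(fields) >= 5:
            # Write-address line belonging to the most recent header.
            records[-1]["writes"].append({
                "address": fields[2],
                "source": fields[3],
                "flag": fields[4],          # meaning not documented here
            })
    return records
```

Applied to the four example lines above, this yields one record for `0xffffffff8579b9b7` with three entries in `writes`.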
- `02-dependency`
  - `02-dependency/lib/DMM/`: mapping between assembly address in the binary and basic block in LLVM bitcode
  - `02-dependency/lib/RPC/`: work with fuzzing component (syzkaller) using Protobuf and gRPC
  - `02-dependency/lib/STA/`: work with static analysis component using JSON
  - `02-dependency/lib/DCC/`: output human-readable information and statistics for unresolved conditions
- `03-syzkaller`
  - `03-syzkaller/syz-fuzzer/`: modification for collecting more complete coverage and other related useful information from fuzzing
  - `03-syzkaller/pkg/dra/`: work with mapping component and output results using Protobuf and gRPC
- `05-proto`: all Protobuf files