- This project contains a propagation-based taint analysis built on LLVM DFSan and an inference-based taint analysis, Memlog, built as an LLVM pass. Both are intended to support memory-feedback fuzzing.
-
A toy without much practical use.
-
Suspected causes: external calls and implicit control flow.
-
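I only recently found that Greyone already discusses how DFSan performs in practice, and its conclusions are close to what I found after implementing this project. Because DFSan works through compile-time instrumentation, many external calls have no corresponding implementation, so propagation tends to stop working early in the program's execution. As a result, this project ended up as a showpiece that cannot be used in the real world.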
- Each memory byte is bound to a label that represents the state of that byte (e.g., tainted or not tainted).
- These labels are propagated at run time.
- DFSan maps a large region of shadow memory to record the labels of all memory bytes in the target program.
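The three points above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not DFSan's implementation: a small array stands in for application memory, a parallel array stands in for shadow memory, labels are plain 0/1 flags, and every `toy_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEM_SIZE 64

typedef uint16_t toy_label;                 /* hypothetical label type, 0 = not tainted */

static uint8_t   toy_mem[TOY_MEM_SIZE];     /* stands in for application memory */
static toy_label toy_shadow[TOY_MEM_SIZE];  /* one label per application byte   */

/* Attach a label to a range of bytes. */
void toy_set_label(size_t addr, size_t len, toy_label l) {
    assert(addr + len <= TOY_MEM_SIZE);
    for (size_t i = 0; i < len; i++) {
        toy_shadow[addr + i] = l;
    }
}

/* Read the label of a single byte. */
toy_label toy_get_label(size_t addr) {
    assert(addr < TOY_MEM_SIZE);
    return toy_shadow[addr];
}

/* Copy bytes; the labels travel with the data. */
void toy_copy(size_t dst, size_t src, size_t len) {
    assert(dst + len <= TOY_MEM_SIZE && src + len <= TOY_MEM_SIZE);
    memmove(&toy_mem[dst], &toy_mem[src], len);
    memmove(&toy_shadow[dst], &toy_shadow[src], len * sizeof(toy_label));
}

/* mem[dst] = mem[a] + mem[b]; the result is tainted if either operand is.
 * Bitwise OR is a valid union only because labels are 0/1 flags here. */
void toy_add(size_t dst, size_t a, size_t b) {
    assert(dst < TOY_MEM_SIZE && a < TOY_MEM_SIZE && b < TOY_MEM_SIZE);
    toy_mem[dst] = (uint8_t)(toy_mem[a] + toy_mem[b]);
    toy_shadow[dst] = toy_shadow[a] | toy_shadow[b];
}
```

Marking one byte with `toy_set_label` and then calling `toy_add` or `toy_copy` shows the label following the data, which is the behavior the instrumentation inserts around loads, stores, and arithmetic.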
- The ABI list determines how DFSan handles each external function call.
- discard
- Do nothing, discard return value label.
- functional
- Do nothing, but label the return value with the union of the argument labels.
- custom
- Replace original function call with custom function call.
- func -> __dfsan_func
- DFSan already defines custom wrappers for some glibc functions.
- This project covers common I/O functions.
- user-defined.
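As a rough illustration of the three categories, the sketch below computes the label a caller would see on the return value of an external call. It assumes upstream DFSan's documented semantics (discard yields an unlabelled return value, functional yields the union of the argument labels) and treats labels as 0/1 flags. All `toy_` names are hypothetical and are not part of DFSan or this project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t toy_label;   /* hypothetical label type, 0 = not tainted */

typedef enum { TOY_DISCARD, TOY_FUNCTIONAL, TOY_CUSTOM } toy_category;

/* Hypothetical custom handler: sees the argument labels and decides the
 * return label itself, the way a hand-written wrapper would. */
typedef toy_label (*toy_custom_fn)(const toy_label *arg_labels, int nargs);

/* Label of the return value after an external call in the given category. */
toy_label toy_return_label(toy_category cat, const toy_label *arg_labels,
                           int nargs, toy_custom_fn custom) {
    switch (cat) {
    case TOY_DISCARD:
        return 0;                          /* return value is unlabelled */
    case TOY_FUNCTIONAL: {
        toy_label l = 0;
        for (int i = 0; i < nargs; i++) {
            l |= arg_labels[i];            /* OR is union for 0/1 labels */
        }
        return l;
    }
    case TOY_CUSTOM:
        assert(custom != NULL);
        return custom(arg_labels, nargs);  /* the wrapper decides */
    }
    return 0;
}

/* Example custom rule: only the first argument influences the result. */
toy_label toy_first_arg_only(const toy_label *arg_labels, int nargs) {
    return nargs > 0 ? arg_labels[0] : 0;
}
```

The model makes the trade-off visible: discard is cheap but drops taint at the call boundary, which is one way propagation can be lost across external calls.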
- Original
- DFSan stores only a 2-byte label; the user has to define what each label means (e.g., 1 means tainted, 0 means not tainted).
- Modified
- Similar to Angora; refer to Angora to understand the concept.
- Each label is bound to a unique input offset. When an instruction's operands are tainted with a certain label, we also know which input offsets can affect those operands.
- details
- Change label size from 2 bytes to 4 bytes.
- Maintain a union table to record mapping between label and input offset.
- New union operation.
- Modify the DFSan label API to update the new union table.
- I/O wrappers, especially for input functions, create a label for each input byte and bind it to that byte's input offset.
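The idea behind the union table can be sketched as follows. This is a minimal illustration under assumptions, not the table this project or Angora actually uses: each 4-byte label is either a base label bound to one input offset or a union label that points at two parent labels. The `toy_` names and the table size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_label;           /* 4-byte label, 0 = not tainted */
#define TOY_MAX_LABELS 1024

typedef struct {
    toy_label left, right;  /* parent labels; both 0 for a base label */
    uint32_t  offset;       /* input offset, meaningful for base labels only */
} toy_entry;

static toy_entry toy_table[TOY_MAX_LABELS];
static toy_label toy_next = 1;        /* label 0 is reserved */

/* Create a base label for one input byte, as an input wrapper might do
 * for every byte it reads. */
toy_label toy_create_label(uint32_t offset) {
    assert(toy_next < TOY_MAX_LABELS);
    toy_label l = toy_next++;
    toy_table[l].left = 0;
    toy_table[l].right = 0;
    toy_table[l].offset = offset;
    return l;
}

/* Union of two labels; allocates a new entry only when really needed. */
toy_label toy_union(toy_label a, toy_label b) {
    if (a == 0) return b;
    if (b == 0) return a;
    if (a == b) return a;
    assert(toy_next < TOY_MAX_LABELS);
    toy_label l = toy_next++;
    toy_table[l].left = a;
    toy_table[l].right = b;
    toy_table[l].offset = 0;
    return l;
}

/* Write up to cap input offsets reachable from label l into out and return
 * how many were written. Duplicates are possible in this simple version. */
size_t toy_offsets(toy_label l, uint32_t *out, size_t cap) {
    if (l == 0) return 0;
    const toy_entry *e = &toy_table[l];
    if (e->left == 0 && e->right == 0) {   /* base label */
        if (cap == 0) return 0;
        out[0] = e->offset;
        return 1;
    }
    size_t n = toy_offsets(e->left, out, cap);
    n += toy_offsets(e->right, out + n, cap - n);
    return n;
}
```

For example, taking the union of the labels created for offsets 2 and 8 and passing the result to `toy_offsets` yields both offsets, which is the kind of answer a fuzzer wants when it asks which input bytes affect an operand.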
llvm-11
- Tested on Ubuntu 20.04
- Build DFSan and the Memlog passes with the following commands.
-
cd ~/DTaint
mkdir build
cd build
cmake ..
- Option
- -DDTAINT_DEBUG=ON/OFF
- Output dtaint debug message.
- -DMEMLOG_DEBUG=ON/OFF
- Output memlog debug message.
-
cmake -DMEMLOG_DEBUG=ON ..
- default
- memlog mode
- use dtaint mode
export DTAINT_MODE=1
-
export CC=build/clang-wrapper
export CXX=build/clang-wrapper++
# memlog mode is used by default
... build target
# use dtaint mode
export DTAINT_MODE=1
... build target
- target_abilist.txt is auto-generated from src/abilist/target at compile time.
- user-defined
- You can use the scripts in tools/.
- gen_library_abilist.sh
- When testing a library, the harness needs an abilist that covers the library's API. This script generates the abilist from the library's .a or .so file.
- usage
./gen_library_abilist.sh [library .a, .so] > [output_file] [option]
- option
- functional
- discard
- custom
- Modified from Angora.
- gen_udr_abilist.sh
- Auto-generates an abilist when undefined-reference errors occur at compile time.
- Store the error message manually.
make 2> erro_message
- usage
./gen_udr_abilist.sh [error message file] [output_file]
- Import the generated abilist into the project.
- three ways
-
Move the generated abilist to src/abilist manually, and rebuild DFSan.
-
Add a compiler option when compiling the target program.
-mllvm -dtaint-dfsan-abilist=gen_abilist.txt
-
gen_target_abilist.sh
- Combines all abilist files in the chosen directory. Because clang-wrapper imports target_abilist by default, this works like the first way, except that it only regenerates target_abilist and does not require rebuilding DFSan.
- usage
-
./gen_target_abilist.sh [input_dir] [output_file]
- default output directory
build/lib/share
- dump debug info
- DFSan can dump label information to a file when the program exits.
export DFSAN_OPTIONS=dump_labels_at_exit=filename
- test/hook_test/va_arg_hook1_test.c
-
#define MAXSIZE 512
...
// read file to buf src[MAXSIZE]
fread(src, 1, MAXSIZE, f);
...
// size is affected by input
size = (unsigned char)src[2] + (unsigned char)src[8];
...
for (int i = 0; i < MAXSIZE; i++) {
    if (i >= MAXSIZE - 2) {
        // pointer index is controllable
        src[size + i] = '8';
    }
}
- input
2434127653dsasdawqewqqw
- when -DDTAINT_DEBUG=ON
- output
- Input offsets 2 and 8 are tainted.
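- It was originally designed to be used with AFLplusplus, and the required runtime code is already compiled in, including the shared memory and forkserver code. A modified AFLplusplus can launch the DFSan-instrumented target through a separate forkserver.
- It is now abandoned, however, and I plan to abandon the code on the AFLplusplus side as well.
- The instrumentation code required for inference-based taint analysis.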