
Developer's Guide

Alexandra Sandulescu edited this page Nov 8, 2017 · 13 revisions

Overview

This wiki shows how to use and extend Cardinal. If your Cardinal installation is not complete, follow the instructions in Getting Started.

Please check that your Cardinal installation works and that the executables collection in mongo is not empty.

cd $CARDINAL_ROOT_DIR
./scripts/run.sh <binary-id> 10000 1 1 norebuild genetic
# monitor execution and make sure all tools are up and running
pm2 list
pm2 monit

Check the execution result in the web UI: display the control flow graph and show the stats for your binary.

River.genetic development guide

This section describes how river.genetic is integrated in Cardinal: how input data is obtained and parsed, and how output data is stored and sent through the Cardinal pipeline.

To connect to Mongo, use:

mongo -u <user> -p --authenticationDatabase=admin

Cardinal takes target programs as input and analyzes them with static and dynamic tools. Target programs are called executables.
Each Cardinal executable is stored in the mongo executables collection. Explore the layout:

> db.executables.findOne();

In each Cardinal tool, executables are referenced via their mongo id.

> db.executables.findOne({}, {_id: 1});

For Cardinal integration, river.genetic has three extra command-line parameters:

--config <config-path>
--driver [0|1]
--executableId <executable-id>
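
As an illustration, the three parameters can be assembled programmatically before launching a river.genetic process. The sketch below is in Python and only builds the argument list. The helper name genetic_args is hypothetical, and the path of the river.genetic binary and any other arguments it takes are not covered here.

```python
def genetic_args(config_path, executable_id, driver=0):
    """Build the Cardinal-specific arguments for one river.genetic process.

    Hypothetical helper: only the three flag names come from this guide.
    """
    if driver not in (0, 1):
        raise ValueError("--driver accepts only 0 or 1")
    return [
        "--config", str(config_path),
        "--driver", str(driver),
        "--executableId", str(executable_id),
    ]
```

The returned list can be appended to whatever command line starts river.genetic in your setup, one list per executable id.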

The configuration file describes the database connection and layout, as well as the rabbit queue connections and layout. Each river.genetic process handles a single executable, so the id of that executable, as stored in mongo, must be passed.

River.genetic analyzes the execution traces that tracer.node produces and stores in the trace_<executable-id> database. Traces are stored using mongo GridFS, with each file representing a single trace.

> db.trace_<executable-id>.files.findOne();

GridFS splits each file into chunks, which are stored in a separate collection from the file metadata.
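
The split between file metadata and chunks can be pictured with a small Python sketch. It mimics the GridFS convention of chunk documents carrying files_id, n and data fields, and the GridFS default chunk size of 255 KiB. It does not connect to mongo, and the function names are hypothetical.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes


def split_into_chunks(files_id, data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return GridFS-style chunk documents for one file's bytes."""
    return [
        {"files_id": files_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the file contents by concatenating chunks in order of n."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)
```

Reading a trace back therefore means finding its metadata document, then collecting all chunks with the matching files_id in order of n. The mongo drivers do this for you through their GridFS APIs.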

river.genetic retrieves the traces via TraceImporter.
The method TraceImporter.getTrace returns a new trace from mongo. If the inputString parameter is not null, a new testcase is also written to disk and sent to the Cardinal pipeline for analysis. New testcases generated by river.genetic cannot be traced immediately, so instead of waiting for their traces to be produced, river.genetic retrieves other traces that are ready and analyzes them.

TraceImporter also handles the mongo and rabbit connections.
To retrieve a new trace from mongo, river.genetic listens to a rabbit queue that holds the ids of traced tests. A test id arriving in the queue means that the test was traced and its trace was stored in mongo. river.genetic takes ids from this queue one by one and looks up each trace in the trace_<executable-id> database. The trace is retrieved via mongo handlers and returned to the caller of TraceImporter.getTrace.
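
The decoupling between submitting a testcase and receiving a trace can be sketched in Python as follows. This is a simplified model, not the river.genetic implementation: an in-memory queue stands in for the rabbit queue, a dict stands in for the GridFS trace storage, and the class and attribute names are hypothetical.

```python
import queue


class TraceImporterSketch:
    """Toy model of the getTrace flow described above."""

    def __init__(self, trace_store):
        self.trace_store = trace_store    # stand-in for trace_<executable-id>
        self.traced_ids = queue.Queue()   # stand-in for the rabbit queue of traced test ids
        self.submitted = []               # stand-in for testcases sent to the pipeline

    def get_trace(self, input_string=None, timeout=1.0):
        # A new testcase is handed to the pipeline but not waited for.
        if input_string is not None:
            self.submitted.append(input_string)
        # Return whichever trace is ready next; raises queue.Empty on timeout.
        test_id = self.traced_ids.get(timeout=timeout)
        return self.trace_store[test_id]
```

Note that the trace returned by a call is in general not the trace of the input submitted in that same call. It belongs to whichever test id reached the queue first.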

Traces are parsed in EvalFunctors.processDataStream. Each trace lists the executed basic blocks along with block-specific information, such as: module, offset, cost, jump type (imm, reg, mem), jump instruction (ret, jmp, jxx, call, syscall), number of instructions, (module, offset) for the taken branch and (module, offset) for the not-taken branch.

The taken branch is the basic block that follows when the jump condition evaluates to true. The not-taken branch is the one that follows when the condition evaluates to false.
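
To make the per-block information concrete, here is a minimal Python sketch of one basic block record. The field list follows the description above, but the class names, field names and field types are assumptions; the actual trace encoding is defined by simpletracer and is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

Address = Tuple[str, int]  # (module, offset)


class JumpType(Enum):
    IMM = "imm"
    REG = "reg"
    MEM = "mem"


class JumpInstruction(Enum):
    RET = "ret"
    JMP = "jmp"
    JXX = "jxx"
    CALL = "call"
    SYSCALL = "syscall"


@dataclass(frozen=True)
class BasicBlock:
    module: str
    offset: int
    cost: int                       # assumed to be an integer
    jump_type: JumpType
    jump_instruction: JumpInstruction
    instruction_count: int
    taken: Address                  # successor if the jump condition is true
    not_taken: Address              # successor if the jump condition is false

    def successor(self, condition: bool) -> Address:
        """Return the block address that follows for a given condition value."""
        return self.taken if condition else self.not_taken
```

For example, with made-up values, a block ending in a conditional jump could be built as BasicBlock("a.out", 0x1000, 3, JumpType.IMM, JumpInstruction.JXX, 5, ("a.out", 0x1040), ("a.out", 0x1010)); calling successor(True) on it yields the taken address.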

tracer.node development guide

Tracer.node is a Cardinal component that traces each untraced test from the mongo tests_<executable-id> database. Each tracer process works on a single executable, so the executable id must be passed as a parameter to each tracer.node process.

New tests that are not yet traced are published to rabbit newtests queues, and tracer.node listens to the corresponding queue. Each test is traced using a core tracer composed of river.sdk and node-river, which handles the communication between the core river tracer and the nodejs distributed component. The tracer has two running modes, simple and annotated. The annotated tracer also performs data flow analysis for each test, and the analysis is logged in the corresponding trace in a separate format.
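
The producer side of the pipeline can be modelled in a few lines. The sketch below is written in Python for illustration only (tracer.node itself is a nodejs component) and uses in-memory stand-ins for the rabbit queues and mongo. The function name, the trace_fn callback and the assumption that tracer.node is the component publishing traced test ids are all hypothetical.

```python
import queue


def trace_next_test(newtests, traced_ids, trace_store, trace_fn, annotated=False):
    """Trace one pending test, store its trace, and announce its id.

    newtests holds (test_id, payload) pairs; trace_fn(payload, annotated)
    stands in for the core tracer and returns the trace bytes.
    """
    test_id, payload = newtests.get_nowait()   # raises queue.Empty if nothing is pending
    trace_store[test_id] = trace_fn(payload, annotated)
    traced_ids.put(test_id)                    # consumers can now fetch the trace
    return test_id
```

The id is published only after the trace has been stored, which matches the guarantee described earlier: an id arriving in the queue means its trace is already available.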

river.sdk

This component contains the river tracer, the symbolic execution engine, and other components that allow tracing in-process or external target programs. More information can be found in the corresponding repo. The output format of each trace, either simple or annotated, is shown in simpletracer.

To build and test simpletracer, follow the instructions from here.

process manager

Process manager is a Cardinal component that manages all running Cardinal tools. See processes.json for the configuration of existing tools. To use it, see the usage instructions here.
