Warning: the following steps will require about 15GB of free disk space.
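Before starting, it may help to confirm that enough space is available. The following is a minimal sketch using only the Python standard library. The helper name `has_free_space` is hypothetical and not part of gdrive_download.py, and the threshold assumes the 15GB figure above means roughly 15 * 10^9 bytes.

```python
import shutil

# Approximate requirement from the warning above (assumed to be decimal GB).
REQUIRED_BYTES = 15 * 10**9


def has_free_space(path=".", required=REQUIRED_BYTES):
    """Return True if the filesystem containing `path` has at least `required` bytes free."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    if has_free_space():
        print("Enough free disk space for the download.")
    else:
        print("Less than 15 GB free: consider freeing space before downloading.")
```

The check only looks at the filesystem that contains `path`, so pass the directory where the binaries will actually be unzipped if it lives on a different volume.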
To download the binaries from Google Drive, use the gdrive_download.py
Python3 script and follow the instructions below:
- Install the Python3 virtualenv
- Create a new virtualenv and install the required packages:

      # create a new "env" environment
      python3 -m venv ../env
      # enter the virtual environment
      source ../env/bin/activate
      # install the requirements in the current environment
      pip install -r ../requirements.txt

- Download and unzip the binaries in the corresponding folders:

      python3 ../gdrive_download.py --binaries

  The binaries will be unzipped in the following directories:

      Binaries/Dataset-Vulnerability
      Binaries/Dataset-1
      Binaries/Dataset-2
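
After the download finishes, a quick sanity check can confirm that the three directories listed above exist and are not empty. This is a minimal sketch, not part of the repository: the function name `missing_dataset_dirs` is hypothetical, and the default `Binaries` root is assumed to be relative to the current working directory, so adjust it to your layout.

```python
from pathlib import Path

# Directory names taken from the list above.
EXPECTED_DIRS = ("Dataset-Vulnerability", "Dataset-1", "Dataset-2")


def missing_dataset_dirs(binaries_root="Binaries"):
    """Return the expected dataset directories that are absent or empty."""
    root = Path(binaries_root)
    missing = []
    for name in EXPECTED_DIRS:
        dataset_dir = root / name
        # A directory counts as missing if it does not exist or has no entries.
        if not dataset_dir.is_dir() or not any(dataset_dir.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All dataset directories are present.")
```

The check does not validate the contents of each directory; it only reports directories that are absent or empty.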
- The instructions to cross-compile the binaries of Dataset-1 can be found in the [Compilation scripts](Compilation scripts) folder.
- Binaries from Dataset-2 and Dataset-Vulnerability are a subset of those released by Trex [1]. Link to the original dataset.
[1] Pei, Kexin, et al. "Trex: Learning execution semantics from micro-traces for binary similarity." arXiv preprint arXiv:2012.08680 (2020).