SpuriousEmu is a set of Visual Basic for Applications (VBA) tools that parse VBA files, interpret them, and extract behavioural information for malware analysis.
SpuriousEmu is available on PyPI, so you can install it using
pip install spurious-emu
SpuriousEmu can work with VBA source files, or directly with Office documents. In the latter case, it relies on olevba to extract macros from the files. All commands take a final positional argument that specifies the input file to work with.
If you work with VBA source files, the following convention is used:
- procedural modules have the .bas extension
- class modules have the .cls extension
- standalone script files have the .vbs extension
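The naming convention above can be pictured as a simple lookup. This is only an illustrative sketch: `MODULE_KINDS` and `module_kind` are hypothetical names that are not part of SpuriousEmu, and treating extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Hypothetical mapping that mirrors the naming convention listed above.
MODULE_KINDS = {
    ".bas": "procedural module",
    ".cls": "class module",
    ".vbs": "standalone script",
}


def module_kind(path: str) -> str:
    """Return the kind of VBA source file implied by its extension."""
    # Assumption: extensions are compared case-insensitively.
    suffix = Path(path).suffix.lower()
    try:
        return MODULE_KINDS[suffix]
    except KeyError:
        raise ValueError(f"unrecognised VBA source extension: {suffix!r}") from None


if __name__ == "__main__":
    for name in ("Module1.bas", "Payload.cls", "dropper.vbs"):
        print(f"{name}: {module_kind(name)}")
```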
SpuriousEmu uses different subcommands for its different operating modes.
Static analysis is performed using the static subcommand.
Usually, the first step is to determine the different functions and classes defined, in order to understand the structure of the program. You can for example use it to determine the entry point prior to dynamic analysis. It is the default behaviour when using no flag:
emu static document.xlsm
Additionally, for large files, you can use the -o flag to serialize the information compiled during static analysis into a binary file that you can reuse later, for example with the report command:
emu static -o document.spemu-com document.xlsm
You can trigger dynamic analysis with the dynamic subcommand.
Once you have found the entry point you want to use with the static subcommand, you can execute the file from it by passing its name to the -e flag. For example, to launch the Main function found in doc.xlsm, use
emu dynamic -e Main doc.xlsm
This will display a report of the execution of the program. Additionally, if you want to save the files created during execution, you can use the -o flag, which specifies a directory to save them to. Each created file is stored under its MD5 sum as file name, and a {hash}.filename.txt file contains its original name. You can also save a report of the dynamic analysis using the -r flag. For example:
emu dynamic -o extract_files -r report.spemu-out doc.xlsm
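The storage layout described above can be pictured with the following sketch. It is not SpuriousEmu's code: `save_created_file` is a hypothetical helper, and it assumes that {hash} is the hexadecimal MD5 digest of the file content.

```python
import hashlib
import tempfile
from pathlib import Path


def save_created_file(out_dir: Path, original_name: str, content: bytes) -> Path:
    """Store content under its MD5 digest and record the original name."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the file name is the hex MD5 digest of the content.
    digest = hashlib.md5(content).hexdigest()
    target = out_dir / digest
    target.write_bytes(content)
    # Sidecar file holding the name the macro originally used.
    sidecar = out_dir / f"{digest}.filename.txt"
    sidecar.write_text(original_name, encoding="utf-8")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_created_file(Path(tmp), r"C:\Temp\payload.exe", b"MZ...")
        sidecar = saved.parent / f"{saved.name}.filename.txt"
        print(saved.name)
        print(sidecar.read_text(encoding="utf-8"))
```

Naming files by digest means two dropped files with identical content end up in a single entry, and hostile original names never touch the analyst's file system.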
SpuriousEmu will often fail to interpret VBA programs. However, it should still be able to help you de-obfuscate macros: that is what the deobfuscate command is for.
It works with a document, source file or compiled file and writes to the standard output a de-obfuscated version of macros that have been found. The most basic invocation is
emu deobfuscate document.docm
You can customize de-obfuscation with two options:
- Flag -p allows you to evaluate expressions without side effects. Use -p 0 to disable it, -p 1 to only handle literal expressions (e.g. replace "W" + "Scr" & "ipt" with "WScript") and -p 2 to also handle pure functions (e.g. replace Chr(37) with "%")
- Flag -s renames symbols that seem to be obfuscated with legible names (e.g. 1l11l1l to var_1)
Additionally, you can choose to only output a given symbol with the -e flag. If it is not specified, all the modules will be de-obfuscated.
Thus, to de-obfuscate Document_Open, using clear variable names and decrypting XOR-encrypted static strings, use
emu deobfuscate -e Document_Open -p 2 -s document.spemu-com
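To give an idea of what the -p levels do, here is a minimal constant-folding sketch based on regular expressions. It is not how SpuriousEmu works internally (SpuriousEmu parses VBA with a real grammar): `fold` is a hypothetical function, it only understands string literals and Chr() applied to an integer literal, and it assumes that + between such operands always means string concatenation.

```python
import re

STRING = r'"(?:[^"]|"")*"'        # VBA string literal; "" escapes a quote
CHR_CALL = r'\bChr\(\s*\d+\s*\)'  # Chr() applied to an integer literal


def fold(source: str, level: int = 2) -> str:
    """Fold constant string expressions found in a piece of VBA source.

    level 0 leaves the source untouched, level 1 folds string literals,
    level 2 also evaluates Chr() calls (a pure function).
    """
    if level <= 0:
        return source
    operand = STRING if level == 1 else f"(?:{STRING}|{CHR_CALL})"
    # One operand, optionally followed by more operands joined by + or &.
    chain = re.compile(rf"{operand}(?:\s*[+&]\s*{operand})*", re.IGNORECASE)

    def evaluate(match: re.Match) -> str:
        text = ""
        for part in re.findall(f"{STRING}|{CHR_CALL}", match.group(0), re.IGNORECASE):
            if part.startswith('"'):
                # Strip the delimiters and undo quote doubling.
                text += part[1:-1].replace('""', '"')
            else:
                text += chr(int(re.search(r"\d+", part).group(0)))
        # Emit the folded value as a VBA string literal again.
        return '"' + text.replace('"', '""') + '"'

    return chain.sub(evaluate, source)


if __name__ == "__main__":
    print(fold('Set o = CreateObject("W" + "Scr" & "ipt" & ".Shell")', level=1))
    print(fold('p = Chr(37) & "TEMP" & Chr(37)', level=2))
```

Non-constant operands such as variables are left in place, so only the constant parts of an expression are collapsed.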
Finally, you can use the experimental Markov classifier feature: the variable names to demangle are selected by a classifier that estimates how English a word looks. It is enabled by the -m flag.
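As an illustration of the idea, the sketch below scores identifiers with a character-bigram Markov model. Everything in it is an assumption made for the example (the tiny corpus, the add-one smoothing, the names `train` and `englishness`); SpuriousEmu's actual classifier may be built quite differently.

```python
import math
from collections import Counter

# Tiny illustrative corpus; a real classifier would need far more text.
CORPUS = (
    "document open close file name path value index count result "
    "buffer string number object sheet range workbook counter total"
).split()
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789_"


def train(words):
    """Count character bigrams, with ^ and $ marking word boundaries."""
    pairs = Counter()
    starts = Counter()
    for word in words:
        padded = "^" + word.lower() + "$"
        for a, b in zip(padded, padded[1:]):
            pairs[a, b] += 1
            starts[a] += 1
    return pairs, starts


def englishness(word, model):
    """Average log-probability per transition; higher looks more English."""
    pairs, starts = model
    padded = "^" + word.lower() + "$"
    size = len(ALPHABET) + 1  # possible next symbols, including the end marker
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        # Add-one smoothing keeps unseen transitions from scoring -infinity.
        total += math.log((pairs[a, b] + 1) / (starts[a] + size))
    return total / (len(padded) - 1)


if __name__ == "__main__":
    model = train(CORPUS)
    for name in ("counter", "number", "l1l11l1l", "xqzjkvw"):
        print(f"{name}: {englishness(name, model):.2f}")
```

A name whose score falls below some threshold would then be a candidate for renaming; choosing that threshold is left out of this sketch.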
You can work with .spemu-out and .spemu-com files with the report command.
The report command accepts three mutually exclusive flags, --json, --csv and --table, which change the way reports are displayed.
As with the default static output, you can use the --symbols flag with a .spemu-com file to get the list of functions and classes. For example, to get them as a JSON dump, you can use
emu report --symbols --json program.spemu-com
You can extract the files generated by the execution of a program using the --extract-files flag, which behaves like the -o flag with the dynamic command:
emu report --extract-files files program.spemu-out
A timeline of the events can be produced with the --timeline flag. It can be made easier to read with the --shorten and --skip-streaks flags, as in
emu report --timeline --table --shorten --skip-streaks 10 program.spemu-out
SpuriousEmu was started during an internship at the NATO Cyber Security Centre in the summer of 2020, and is now developed in my spare time. It is highly experimental, so you may expect it to fail on most real-life samples.
Python 3.8 is used, and SpuriousEmu mainly relies on PyParsing for VBA grammar parsing, and oletools to extract VBA macros from Office documents. Report tables are generated using PrettyTable.
nose is used as the testing framework, and mypy to perform static code analysis. lxml and coverage are used to produce test reports.
To set up a development environment, use poetry:
poetry install
Then, use nose to run the test suite:
poetry run nosetests
All test files are in tests, including:
- Python test scripts, starting with test_
- VBA scripts used to test the different stages of the tools, with vbs extensions, stored in source
- expected test results, stored as JSON dumps in result
You can use mypy to perform static code analysis:
poetry run mypy emu/*.py
Both commands produce HTML reports stored in tests/report.