Skip to content

EVMLiSA: an abstract interpretation-based static analyzer for EVM bytecode

License

Notifications You must be signed in to change notification settings

lisa-analyzer/evm-lisa

Repository files navigation

EVMLiSA: an abstract interpretation-based static analyzer for EVM bytecode

GitHub Actions Workflow Status GitHub last commit GitHub commit activity GitHub issues

EVMLiSA is a static analyzer based on abstract interpretation for EVM bytecode of smart contracts deployed on Ethereum blockchain and built upon LiSA. Given a EVM bytecode smart contract, EVMLiSA builds a sound and precise control-flow graph of the smart contract.

EVMLiSA is based on the peer-reviewed publication

Vincenzo Arceri, Saverio Mattia Merenda, Greta Dolcetti, Luca Negrini, Luca Olivieri, Enea Zaffanella. "Towards a Sound Construction of EVM Bytecode Control-Flow Graphs". In Proceedings of the 26th ACM International Workshop on Formal Techniques for Java-like Programs (FTfJP 2024), co-located with ECOOP 2024.

🛠 Building EVMLiSA

Compiling EVMLiSA requires:

You need to:

  • Clone the repository:
    git clone https://github.com/lisa-analyzer/evm-lisa.git
    cd evm-lisa
  • Import the project into the Eclipse/IntelliJ workspace as a Gradle project (optional).

⚙️ Running EVMLiSA

Before running EVMLiSA, ensure you have set up an Environment Variable with your Etherscan API Key. Follow the steps below to set up the environment variable:

  1. Begin by creating a file named .env in the EVMLiSA project.
  2. Inside the .env file, add the following line:
ETHERSCAN_API_KEY=<your_etherscan_api_key>
  1. Replace <your_etherscan_api_key> with your Etherscan API key.

Here you can find how to generate an Etherscan API key.

Once you have set up the environment variable, you can run EVMLiSA via Docker or via Bash.

Via Docker

Build the container:

mkdir -p execution/docker &&
docker build -t evm-lisa:latest .

Then you can run EVMLiSA with:

docker run --rm -it \
-v $(pwd)/.env:/app/.env \
-v $(pwd)/execution/docker:/app/execution/results \
evm-lisa:latest \
-a <smart_contract_address> [options]
  • -v $(pwd)/.env:/app/.env: mount the .env file.
  • -v $(pwd)/execution/docker:/app/execution/results: share the results' folder.

Replace <smart_contract_address> with the address of the Ethereum smart contract you want to analyze.

Via Bash

Build the Project:

./gradlew build

Create Distribution Zip:

./gradlew distZip

Unzip the Distribution:

unzip build/distributions/evm-lisa.zip -d execution

Then you can run EVMLiSA with:

./execution/evm-lisa/bin/evm-lisa \
-a <smart_contract_address> [options]

Replace <smart_contract_address> with the address of the Ethereum smart contract you want to analyze.

This command will initiate the analysis process for the specified smart contract, providing insights and results based on the EVM bytecode of the contract.

Options:
 -a,--address <arg>          address of an Ethereum smart contract
 -b,--benchmark <arg>        filepath of the benchmark suite (i.e., a list of smart contract addresses)
 -C,--cores <arg>            number of cores to use
 -c,--dump-cfg               dump the CFG
 -d,--dump-analysis <arg>    dump the analysis (html, dot)
 -D,--download-bytecode      download the bytecode, without analyzing it
 -f,--filepath <arg>         filepath of an EVM bytecode smart contract
 -o,--output <arg>           output directory path
 -q,--stack-size <arg>       maximal height of stack
 -s,--dump-stats             dump statistics
 -S,--use-live-storage       use the live storage in SLOAD
 -w,--stack-set-size <arg>   maximal size of stack sets

🔍 Abstract Stack Set Domain

In the analysis of EVM bytecode programs, EVMLiSA employs a domain of sets of abstract stacks to enhance precision, particularly when loops are encountered in the source code.

EVMLiSA introduces the abstract stack powerset domain $\texttt{SetSt}_{k,h,l}$ which consists of sets of abstract stacks with at most $l$ elements and an height of at most $h$. This domain allows the analyzer to maintain collections of abstract stacks, avoiding the need to compute the lub and allowing each element of an abstract stack to be a $k$ integer set.

📋 Running example

Here is an example of how to run EVMLiSA. In this example, we will analyze a smart contract at the address 0x7c21C4Bbd63D05Fa9F788e38d14e18FC52E9557B with specific options for stack size, stack-set size, live storage usage and the dump of the CFG:

  • Bash:
./execution/evm-lisa/bin/evm-lisa \
-a 0x7c21C4Bbd63D05Fa9F788e38d14e18FC52E9557B \
--stack-size 64 \
--stack-set-size 10 \
--dump-stats \
--use-live-storage
  • Docker:
docker run --rm -it \
-v $(pwd)/.env:/app/.env \
-v $(pwd)/execution/docker:/app/execution/results \
evm-lisa:latest \
-a 0x7c21C4Bbd63D05Fa9F788e38d14e18FC52E9557B \
--stack-size 64 \
--stack-set-size 10 \
--dump-stats \
--use-live-storage

Use docker run -a stderr to dump only the json report as standard output.

The expected output is as follows:

##############
Total opcodes: 344
Total jumps: 45
Resolved jumps: 44
Definitely unreachable jumps: 1
Maybe unreachable jumps: 0
Unsound jumps: 0
Maybe unsound jumps: 0
##############

Jump classification

  • Resolved: all the destinations of the jump node have been resolved;
  • Definitely unreachable: the jump node is unreachable (i.e., it is reached with the bottom abstract state);
  • Maybe unreachable: the jump node is not reachable in the CFG, starting from its entry point, but it may be reachable via a (maybe) unsound jump node (if any);
  • Unsound: the jump node is reached at least with a stack with an unknown numerical value that may correspond to a valid jump destination as the top element;
  • Maybe unsound: the stack set exceeded the maximal stack size.

Contributors