feat: O(1) selector tables (#3496)
this commit replaces the existing linear entry point search with an O(1)
implementation. there are two methods, depending on whether the compiler
is optimizing for gas or for code size: a hash table with probing (gas),
and perfect hashing using a two-level technique (code size).

the first method divides the selectors into buckets, uses
`method_id % n_buckets` as a "guess" as to where to enter the selector
table, then jumps there and performs the familiar linear search for
the selector ("probing"). to avoid overly large buckets, the jumptable
generator searches a range from ~`n_buckets * 0.85` to
`n_buckets * 1.15` to minimize worst-case probe depth; for 80-100
methods, the largest bucket holds 3 items on average and 4 items in the
worst worst case (presumably if you get really unlucky), see
`_bench_sparse()` in `vyper/codegen/jumptable_utils.py`. the average
bucket size is 1.6 methods.
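
as a rough illustration of the bucket search described above (a minimal sketch, not the code in `jumptable_utils.py`): the helper names `bucketize`, `worst_probe_depth` and `choose_n_buckets` are hypothetical, and the sketch assumes the search range is centered on the number of methods.

```python
import math
from collections import defaultdict


def bucketize(method_ids, n_buckets):
    # group method ids by their "guess" bucket, method_id % n_buckets
    buckets = defaultdict(list)
    for method_id in method_ids:
        buckets[method_id % n_buckets].append(method_id)
    return dict(buckets)


def worst_probe_depth(method_ids, n_buckets):
    # the largest bucket bounds how far the linear probe can run
    return max(len(b) for b in bucketize(method_ids, n_buckets).values())


def choose_n_buckets(method_ids):
    # assumption: the candidate range is centered on the number of methods
    n = len(method_ids)
    lo = max(1, math.floor(n * 0.85))
    hi = max(lo, math.ceil(n * 1.15))
    # smallest worst-case bucket wins; ties go to the smaller bucket count
    return min(
        range(lo, hi + 1),
        key=lambda k: (worst_probe_depth(method_ids, k), k),
    )
```

at runtime, the dispatcher would jump to bucket `method_id % n_buckets` and compare against each selector in that bucket in turn.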

the second method uses a perfect hashing technique. finding a single
magic which produces a perfect hash is infeasible for large `N`
(exponential, and in practice seems to run off a cliff around 10
methods). to "get around" this, the methods are divided into buckets of
roughly size 10, and a magic is computed per bucket. several values of
`n_buckets` are tried, trying to minimize `n_buckets`. the code size
overhead is roughly 5 bytes per bucket, which works out to ~20% per
method, see `_bench_dense()` in `vyper/codegen/jumptable_utils.py`.
the function selector is then looked up in two steps: the code loads the
magic for the bucket given by `method_id % n_buckets`, and then uses the
magic to compute the location of the function selector (and associated
metadata) in the data section. from there it loads the function
metadata, performs the calldatasize, callvalue and method id checks and
jumps into the function.
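
the two-level lookup can be sketched as follows. this is an illustration under stated assumptions, not the compiler's implementation: the hash in `slot_of` (multiply by the magic, drop the low 24 bits, reduce modulo the bucket size), the 2**16 search bound, and all function names are hypothetical.

```python
def slot_of(method_id, magic, bucket_size):
    # hypothetical per-bucket hash; the real hash function may differ
    return ((method_id * magic) >> 24) % bucket_size


def find_magic(members, max_magic=2**16):
    # brute-force search for a magic that maps every member to its own slot
    for magic in range(max_magic):
        slots = {slot_of(m, magic, len(members)) for m in members}
        if len(slots) == len(members):
            return magic
    return None  # no perfect hash found within the search bound


def build_dense_table(method_ids, n_buckets):
    # level 1: split the method ids into buckets by method_id % n_buckets
    buckets = {}
    for m in method_ids:
        buckets.setdefault(m % n_buckets, []).append(m)
    # level 2: one magic per bucket, members stored in slot order
    table = {}
    for bucket_id, members in buckets.items():
        magic = find_magic(members)
        if magic is None:
            return None  # caller would retry with a different n_buckets
        ordered = [None] * len(members)
        for m in members:
            ordered[slot_of(m, magic, len(members))] = m
        table[bucket_id] = (magic, ordered)
    return table


def lookup(table, n_buckets, method_id):
    # step 1: load the bucket's magic; step 2: compute the slot and compare
    entry = table.get(method_id % n_buckets)
    if entry is None:
        return False
    magic, ordered = entry
    return ordered[slot_of(method_id, magic, len(ordered))] == method_id
```

the final comparison stands in for the method id check: the hash is only perfect for known selectors, so an unknown selector still lands on some slot and must be rejected there.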

there is a gas vs code size tradeoff between the two methods - roughly
speaking, the sparse method (hash table with probing) requires ~69 gas
in the best case (~109 gas in the "average" case) and 12-22 bytes of
code per method, while the dense method (perfect hashing) requires ~212
gas across the board, and ~8 bytes of code per method.

to accomplish this implementation-wise, the jumptable info is generated
in a new helper module, `vyper/codegen/jumptable_utils.py`. some
additional refactoring was needed to pull the calldatasize, callvalue
and method id checks out of external function generation and into a new
selector section construction step in `vyper/codegen/module.py`.

additionally, a new IR "data" directive was added, and an associated
assembly directive. the data segments in assembly are moved to the end
of the bytecode to ensure that data bytes which happen to look like
`PUSH` instructions do not mangle valid bytecode which comes after the
data section.
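
to see why data placement matters, here is a small sketch of a linear-sweep decoder (the function name `instruction_offsets` is made up for illustration). it relies only on the fact that EVM opcodes `0x60`..`0x7f` are `PUSH1`..`PUSH32` and consume 1..32 immediate bytes.

```python
def instruction_offsets(bytecode: bytes) -> list[int]:
    # walk the bytecode the way a disassembler would, skipping PUSH immediates
    offsets = []
    pc = 0
    while pc < len(bytecode):
        offsets.append(pc)
        op = bytecode[pc]
        if 0x60 <= op <= 0x7F:
            pc += op - 0x5F  # PUSHn swallows the next n bytes
        pc += 1
    return offsets


code = bytes([0x5B, 0x00])  # JUMPDEST, STOP
data = bytes([0x61])        # a data byte that happens to look like PUSH2

# data first: the fake PUSH2 swallows the JUMPDEST and STOP that follow it
print(instruction_offsets(data + code))
# data last: the code decodes as intended, only the trailing data is misread
print(instruction_offsets(code + data))
```

with the data byte in front, the sweep sees a single instruction and the real code vanishes into its immediate; with the data at the end, nothing valid comes after it to be mangled.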
charles-cooper authored Jul 25, 2023
1 parent 4ca1c81 commit 408929f
Showing 24 changed files with 1,133 additions and 230 deletions.
15 changes: 11 additions & 4 deletions .github/workflows/test.yml
@@ -78,11 +78,18 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [["3.10", "310"], ["3.11", "311"]]
python-version: [["3.11", "311"]]
# run in modes: --optimize [gas, none, codesize]
flag: ["core", "no-opt", "codesize"]
opt-mode: ["gas", "none", "codesize"]
debug: [true, false]
# run across other python versions.# we don't really need to run all
# modes across all python versions - one is enough
include:
- python-version: ["3.10", "310"]
opt-mode: gas
debug: false

name: py${{ matrix.python-version[1] }}-${{ matrix.flag }}
name: py${{ matrix.python-version[1] }}-opt-${{ matrix.opt-mode }}${{ matrix.debug && '-debug' || '' }}

steps:
- uses: actions/checkout@v1
@@ -97,7 +104,7 @@ jobs:
run: pip install tox

- name: Run Tox
run: TOXENV=py${{ matrix.python-version[1] }}-${{ matrix.flag }} tox -r -- --reruns 10 --reruns-delay 1 -r aR tests/
run: TOXENV=py${{ matrix.python-version[1] }} tox -r -- --optimize ${{ matrix.opt-mode }} ${{ matrix.debug && '--enable-compiler-debug-mode' || '' }} --reruns 10 --reruns-delay 1 -r aR tests/

- name: Upload Coverage
uses: codecov/codecov-action@v1
17 changes: 17 additions & 0 deletions docs/compiling-a-contract.rst
@@ -113,6 +113,23 @@ Remix IDE

While the Vyper version of the Remix IDE compiler is updated on a regular basis, it might be a bit behind the latest version found in the master branch of the repository. Make sure the byte code matches the output from your local compiler.

.. _optimization-mode:

Compiler Optimization Modes
===========================

The vyper CLI tool accepts an optimization mode ``"none"``, ``"codesize"``, or ``"gas"`` (default). It can be set using the ``--optimize`` flag. For example, invoking ``vyper --optimize codesize MyContract.vy`` will compile the contract, optimizing for code size. As a rough summary of the differences between gas and codesize mode, in gas optimized mode, the compiler will try to generate bytecode which minimizes gas (up to a point), including:

* using a sparse selector table which optimizes for gas over codesize
* inlining some constants, and
* trying to unroll some loops, especially for data copies.

In codesize optimized mode, the compiler will try hard to minimize codesize by

* using a dense selector table
* out-lining code, and
* using more loops for data copies.


.. _evm-version:

4 changes: 2 additions & 2 deletions docs/structure-of-a-contract.rst
@@ -37,13 +37,13 @@ In the above examples, the contract will only compile with Vyper versions ``0.3.
Optimization Mode
-----------------

The optimization mode can be one of ``"none"``, ``"codesize"``, or ``"gas"`` (default). For instance, the following contract will be compiled in a way which tries to minimize codesize:
The optimization mode can be one of ``"none"``, ``"codesize"``, or ``"gas"`` (default). For example, adding the following line to a contract will cause it to try to optimize for codesize:

.. code-block:: python
#pragma optimize codesize
The optimization mode can also be set as a compiler option. If the compiler option conflicts with the source code pragma, an exception will be raised and compilation will not continue.
The optimization mode can also be set as a compiler option, which is documented in :ref:`optimization-mode`. If the compiler option conflicts with the source code pragma, an exception will be raised and compilation will not continue.

EVM Version
-----------------
4 changes: 2 additions & 2 deletions tests/base_conftest.py
@@ -112,10 +112,10 @@ def w3(tester):
return w3


def _get_contract(w3, source_code, optimize, *args, **kwargs):
def _get_contract(w3, source_code, optimize, *args, override_opt_level=None, **kwargs):
settings = Settings()
settings.evm_version = kwargs.pop("evm_version", None)
settings.optimize = optimize
settings.optimize = override_opt_level or optimize
out = compiler.compile_code(
source_code,
# test that metadata gets generated
4 changes: 2 additions & 2 deletions tests/cli/vyper_json/test_parse_args_vyperjson.py
@@ -57,7 +57,7 @@ def test_to_stdout(tmp_path, capfd):
_parse_args([path.absolute().as_posix()])
out, _ = capfd.readouterr()
output_json = json.loads(out)
assert _no_errors(output_json)
assert _no_errors(output_json), (INPUT_JSON, output_json)
assert "contracts/foo.vy" in output_json["sources"]
assert "contracts/bar.vy" in output_json["sources"]

@@ -71,7 +71,7 @@ def test_to_file(tmp_path):
assert output_path.exists()
with output_path.open() as fp:
output_json = json.load(fp)
assert _no_errors(output_json)
assert _no_errors(output_json), (INPUT_JSON, output_json)
assert "contracts/foo.vy" in output_json["sources"]
assert "contracts/bar.vy" in output_json["sources"]

2 changes: 2 additions & 0 deletions tests/compiler/__init__.py
@@ -0,0 +1,2 @@
# prevent module name collision between tests/compiler/test_pre_parser.py
# and tests/ast/test_pre_parser.py
27 changes: 27 additions & 0 deletions tests/compiler/test_default_settings.py
@@ -0,0 +1,27 @@
from vyper.codegen import core
from vyper.compiler.phases import CompilerData
from vyper.compiler.settings import OptimizationLevel, _is_debug_mode


def test_default_settings():
source_code = ""
compiler_data = CompilerData(source_code)
_ = compiler_data.vyper_module # force settings to be computed

assert compiler_data.settings.optimize == OptimizationLevel.GAS


def test_default_opt_level():
assert OptimizationLevel.default() == OptimizationLevel.GAS


def test_codegen_opt_level():
assert core._opt_level == OptimizationLevel.GAS
assert core._opt_gas() is True
assert core._opt_none() is False
assert core._opt_codesize() is False


def test_debug_mode(pytestconfig):
debug_mode = pytestconfig.getoption("enable_compiler_debug_mode")
assert _is_debug_mode() == debug_mode
10 changes: 9 additions & 1 deletion tests/conftest.py
@@ -10,7 +10,7 @@

from vyper import compiler
from vyper.codegen.ir_node import IRnode
from vyper.compiler.settings import OptimizationLevel
from vyper.compiler.settings import OptimizationLevel, _set_debug_mode
from vyper.ir import compile_ir, optimizer

from .base_conftest import VyperContract, _get_contract, zero_gas_price_strategy
@@ -43,6 +43,7 @@ def pytest_addoption(parser):
default="gas",
help="change optimization mode",
)
parser.addoption("--enable-compiler-debug-mode", action="store_true")


@pytest.fixture(scope="module")
@@ -51,6 +52,13 @@ def optimize(pytestconfig):
return OptimizationLevel.from_string(flag)


@pytest.fixture(scope="session", autouse=True)
def debug(pytestconfig):
debug = pytestconfig.getoption("enable_compiler_debug_mode")
assert isinstance(debug, bool)
_set_debug_mode(debug)


@pytest.fixture
def keccak():
return Web3.keccak
15 changes: 11 additions & 4 deletions tests/parser/functions/test_slice.py
@@ -2,6 +2,7 @@
import pytest
from hypothesis import given, settings

from vyper.compiler.settings import OptimizationLevel
from vyper.exceptions import ArgumentException, TypeMismatch

_fun_bytes32_bounds = [(0, 32), (3, 29), (27, 5), (0, 5), (5, 3), (30, 2)]
@@ -33,12 +34,15 @@ def slice_tower_test(inp1: Bytes[50]) -> Bytes[50]:

@pytest.mark.parametrize("literal_start", (True, False))
@pytest.mark.parametrize("literal_length", (True, False))
@pytest.mark.parametrize("opt_level", list(OptimizationLevel))
@given(start=_draw_1024, length=_draw_1024, length_bound=_draw_1024_1, bytesdata=_bytes_1024)
@settings(max_examples=25, deadline=None)
@settings(max_examples=100, deadline=None)
@pytest.mark.fuzzing
def test_slice_immutable(
get_contract,
assert_compile_failed,
assert_tx_failed,
opt_level,
bytesdata,
start,
literal_start,
@@ -64,7 +68,7 @@ def do_splice() -> Bytes[{length_bound}]:
"""

def _get_contract():
return get_contract(code, bytesdata, start, length)
return get_contract(code, bytesdata, start, length, override_opt_level=opt_level)

if (
(start + length > length_bound and literal_start and literal_length)
@@ -84,12 +88,15 @@ def _get_contract():
@pytest.mark.parametrize("location", ("storage", "calldata", "memory", "literal", "code"))
@pytest.mark.parametrize("literal_start", (True, False))
@pytest.mark.parametrize("literal_length", (True, False))
@pytest.mark.parametrize("opt_level", list(OptimizationLevel))
@given(start=_draw_1024, length=_draw_1024, length_bound=_draw_1024_1, bytesdata=_bytes_1024)
@settings(max_examples=25, deadline=None)
@settings(max_examples=100, deadline=None)
@pytest.mark.fuzzing
def test_slice_bytes(
get_contract,
assert_compile_failed,
assert_tx_failed,
opt_level,
location,
bytesdata,
start,
@@ -133,7 +140,7 @@ def do_slice(inp: Bytes[{length_bound}], start: uint256, length: uint256) -> Byt
"""

def _get_contract():
return get_contract(code, bytesdata)
return get_contract(code, bytesdata, override_opt_level=opt_level)

data_length = len(bytesdata) if location == "literal" else length_bound
if (