Planning for `javy` monitoring #89

ejrgilbert · 2024-07-12T21:08:49Z

Here is the current overarching structure of the javy monitor implementation that associates "fuel" consumption with interpreted JavaScript bytecode.

This monitor is currently implemented on Wizard in Virgil; see this PR.

System Architecture

javy runs compiled JavaScript bytecode on WebAssembly: the QuickJS engine is compiled to Wasm and runs on top of a Wasm VM. The compiled end-user JavaScript code is embedded in the Wasm data section.

The interpreter loop lives in one specific function of the Wasm module and is essentially a set of nested blocks entered through a switch (the br_table Wasm opcode). When a JavaScript call invokes a new function, that function is loaded from memory and a new function data structure is malloc'ed to hold a copy of the function's JavaScript bytecode. A pc, starting at 0 on entry to the interpreter function, then iterates over the interpreted function's bytecode. On each step the pc is used to load a u8 from the function data structure, and that byte is matched in the giant switch to select the opcode handler.
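To make the loop shape concrete, here is a minimal toy model of the fetch-and-dispatch cycle described above. The opcode values and the three-instruction set are hypothetical and are not QuickJS's real encoding; the if/elif chain stands in for the br_table dispatch.

```python
# Toy model of a bytecode interpreter loop.
# Opcode values are hypothetical, NOT QuickJS's real encoding.
OP_PUSH, OP_ADD, OP_HALT = 0x01, 0x02, 0x03


def interpret(bytecode):
    """Run a copied-over function body and return the final operand stack."""
    stack = []
    pc = 0  # index into the interpreted function's bytecode, starts at 0
    while True:
        opcode = bytecode[pc]  # the u8 load that fetches the next opcode
        pc += 1
        # Handler selection; in the Wasm module this is the br_table switch.
        if opcode == OP_PUSH:
            stack.append(bytecode[pc])  # one-byte immediate follows the opcode
            pc += 1
        elif opcode == OP_ADD:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif opcode == OP_HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at pc {pc - 1}")


program = bytes([OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT])
result = interpret(program)
```

The line marked as the u8 load is the analogue of the single Wasm load that the monitor needs to single out from all other memory loads.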

The Monitor

The monitor dynamically collects information about execution in order to tie fuel consumption to user JavaScript code. It does so by instrumenting all function entries and exits (which builds a function call trace) and by instrumenting the specific u8 load Wasm opcode that fetches the next JavaScript opcode. It also instruments other data loads to gather information on JavaScript opcode immediates; the pc opcode load, however, is a special case.
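A rough sketch of the bookkeeping this implies is shown below. It is not the Virgil implementation: the class name, the hook names, and the fuel model (one unit per executed Wasm instruction, charged to the most recently loaded JavaScript pc) are all assumptions made for illustration.

```python
from collections import defaultdict


class FuelMonitor:
    """Hypothetical monitor state; all names here are illustrative only."""

    def __init__(self):
        self.call_stack = []                # currently active Wasm functions
        self.trace = []                     # ordered (event, fn_name) call trace
        self.fuel_by_pc = defaultdict(int)  # JavaScript pc -> fuel charged
        self.current_pc = None              # pc of the last opcode fetch

    def on_fn_entry(self, name):
        self.call_stack.append(name)
        self.trace.append(("entry", name))

    def on_fn_exit(self, name):
        self.trace.append(("exit", name))
        self.call_stack.pop()

    def on_pc_load(self, pc):
        # The special-case load: from here on, fuel belongs to this pc.
        self.current_pc = pc

    def on_wasm_instr(self, cost=1):
        # Assumed fuel model: each Wasm instruction costs one unit.
        if self.current_pc is not None:
            self.fuel_by_pc[self.current_pc] += cost


monitor = FuelMonitor()
monitor.on_fn_entry("interp")
monitor.on_pc_load(0)
for _ in range(3):
    monitor.on_wasm_instr()
monitor.on_pc_load(4)
monitor.on_wasm_instr()
monitor.on_fn_exit("interp")
```

Keying fuel on the last observed pc is what makes the opcode-fetch load special: it is the only event that changes which JavaScript instruction subsequent Wasm work is attributed to.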

This means that whamm! will need to have the following events available for use:

wasm:fn:entry:after, to instrument the entry to Wasm functions
wasm:fn:exit:before, to instrument the exit from Wasm functions
wasm:opcode:load:before / fn_id == 0 && offset == 20 /, to instrument the specific load of the JavaScript `pc` opcode. For now, these indices will be hardcoded after some preliminary static analysis of the bytecode. In the future we can figure out a more generic way to do this.
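The predicate on the load event can be pictured as a filter over every load site, as in the sketch below. This only models the matching semantics; it is not whamm! syntax or its implementation. The `Probe` type, the `fire` helper, and the context keys are hypothetical, and `offset` is treated as an opaque integer identifying the load site.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Probe:
    """Hypothetical probe: an event name plus a predicate over its context."""
    event: str
    predicate: Callable[[dict], bool]
    hits: int = 0


def fire(probes, event, ctx):
    """Return the probes that match this event and context, counting hits."""
    matched = []
    for probe in probes:
        if probe.event == event and probe.predicate(ctx):
            probe.hits += 1
            matched.append(probe)
    return matched


# Hardcoded indices, mirroring `fn_id == 0 && offset == 20` from the list above.
pc_probe = Probe(
    "wasm:opcode:load:before",
    lambda ctx: ctx["fn_id"] == 0 and ctx["offset"] == 20,
)
probes = [pc_probe]

fire(probes, "wasm:opcode:load:before", {"fn_id": 0, "offset": 20})  # matches
fire(probes, "wasm:opcode:load:before", {"fn_id": 0, "offset": 24})  # other load
fire(probes, "wasm:opcode:load:before", {"fn_id": 3, "offset": 20})  # other fn
```

Only the first call matches, which is the intended effect of the hardcoded indices: every other load in the module is left alone.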

We will need a way to pull different pieces of dynamic data to save off for later analysis.

The arg to the Wasm u8 load (the pc, aka the memory offset)
A way to get the result of a load. If the loaded JavaScript opcode is a call opcode, then we need its immediate to get the fn ID of the called JavaScript user function.
The arg to the br_table (the JavaScript opcode)
The name of the function entered/exited
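One possible shape for the saved-off data is sketched below. The record fields follow the list above, but the call opcode value and the two-byte little-endian immediate are placeholders chosen for illustration; the real QuickJS encoding would have to come from the static analysis mentioned earlier.

```python
import struct
from dataclasses import dataclass
from typing import Optional

OP_CALL = 0x22  # hypothetical opcode value, not taken from QuickJS


@dataclass
class OpcodeRecord:
    pc: int                     # arg to the u8 load (the memory offset)
    opcode: int                 # result of the load, also the br_table arg
    call_target: Optional[int]  # fn ID immediate, only for call opcodes
    wasm_fn: str                # name of the Wasm function being executed


def record_load(memory, pc, wasm_fn):
    """Build a record for the opcode fetched at `pc` in linear memory."""
    opcode = memory[pc]
    call_target = None
    if opcode == OP_CALL:
        # Assumed layout: a little-endian u16 immediate right after the opcode.
        (call_target,) = struct.unpack_from("<H", memory, pc + 1)
    return OpcodeRecord(pc, opcode, call_target, wasm_fn)


memory = bytes([0x01, OP_CALL, 0x07, 0x00])
plain = record_load(memory, 0, "interp")
call = record_load(memory, 1, "interp")
```

Keeping the immediate in the same record as the pc means later analysis can join fuel totals against the called function without re-reading memory.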

This feels like a good start to a plan for this monitor use-case.

ejrgilbert added the enhancement and investigation labels on Oct 8, 2024