Switch the runtime from wasmi to wasmtime + cranelift #1700
Conversation
First batch of comments. More to follow.
Another batch
The latest batch of comments. Sorry this review is taking a while, there's quite a lot to go over.
```rust
let ptr = self.arena_start_ptr as usize;

// Safe because we are accessing and immediately dropping the reference to the data.
unsafe { self.memory.data_unchecked_mut()[ptr..(ptr + bytes.len())].copy_from_slice(bytes) }
```
I audited this for safety and I think it checks out. The situation may be a bit more tenuous than the comment implies, though. The comment implies that the safety of this operation can be audited locally, as if only the code within the `unsafe` block matters. This is not true.

Unfortunately, all the code that has access to the path `WasmInstance::memory: Memory` (which includes the set of all code that might access `WasmInstance::instance: wasmtime::Instance` and `WasmInstance::memory_allocate`) is part of the surface area of the unsafety, because these are paths through which an additional reference (shared or not) could be created that aliases this borrow. Note that the `data_unchecked_mut` method doesn't even require `&mut self`! This is one of the most unsafe APIs I've seen in the wild in some time.

This is concerning because this data is accessible by host exports! It is critical to security because this is exactly the kind of place an attacker would target to try to read memory owned by the indexer (like a private key, which we need to make sure an indexer does not have).

Is there a way to sandbox all paths to this memory so that less code has to be audited? Ideally, we could drop the module and the memory into something rock-solid with less than 100 lines of surface area.
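One possible direction, sketched below, is a tiny wrapper that owns the `Memory` handle and only exposes copy-in/copy-out methods, so that no reference into the wasm linear memory can escape to safe code and the two methods are the entire unsafe surface to audit. This is not the PR's code: `ArenaMemory`, `write` and `read` are hypothetical names; `data_unchecked_mut` is the accessor from the diff above and `data_unchecked` is assumed to be its read-only counterpart.

```rust
// Sketch only: confine all access to the wasm linear memory behind two methods.
struct ArenaMemory {
    memory: wasmtime::Memory,
}

impl ArenaMemory {
    /// Copy `bytes` into linear memory starting at `ptr`.
    fn write(&self, ptr: usize, bytes: &[u8]) {
        // Safety: the mutable slice is created and dropped within this call;
        // no other code may hold or leak a reference into this memory.
        unsafe {
            self.memory.data_unchecked_mut()[ptr..ptr + bytes.len()].copy_from_slice(bytes);
        }
    }

    /// Copy `size` bytes out of linear memory starting at `offset`.
    fn read(&self, offset: usize, size: usize) -> Vec<u8> {
        // Safety: same invariant; the shared borrow never escapes this call.
        unsafe { self.memory.data_unchecked()[offset..offset + size].to_vec() }
    }
}
```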
This comment really is insufficient, so I'll explain my reasoning. First, `instance` and `memory` are `!Sync`, at least for now, so these unsafe blocks can't be called concurrently.

Given that, the only way this API can cause unsoundness outside an `unsafe` block is if that block leaks a reference to the memory to safe code. We can adopt as an invariant that no unsafe block does this, documenting and auditing this in the two unsafe blocks we have. The beauty is that if this invariant is locally true for all unsafe blocks, then all our code is safe.

Checking all the paths that use `Instance`, tracking usage of the memory along those paths, and concluding that the code really is sound is one way to audit this, but imo it's the more complicated way, even if we use privacy to further restrict the paths that can access the instance. We can take advantage of the particularities of this API to provide a tighter argument for soundness.
Let's start by implementing `!Sync` for `WasmiInstance` to make sure this situation doesn't change under our feet.
I was definitely taking this too lightly. I've added a doc test that `WasmInstance` is `!Sync`, and written a comment that describes our reasoning for why this is sound. Thanks for verifying this and helping me better elaborate on the safety here.
runtime/wasm/src/module/mod.rs
```rust
// This task is likely to outlive the instance, which is fine.
let interrupt_handle = instance.store().interrupt_handle().unwrap();
let should_interrupt = should_interrupt.clone();
graph::spawn(async move {
```
It is possible to construct a subgraph that intentionally avoids the timeout and enters an infinite loop. There are several problems here:

- Calling `ipfs_map` once is enough to disable the timeout forever. The timeout is never restored, so the subgraph could enter an infinite loop right after this.
- Even if the timeout was restored, this check never triggers a second time if it happens to run while the timeout was disabled.
- The use of atomics is overkill for something that is not on the hot path.
- The code that disables `should_interrupt` is limited to `ipfs_map`, but does not cover e.g. `ipfs_cat`, which may also want to disable the interrupt.
- Even if the interrupt were disabled for just the duration of the `ipfs_map` call and we turned it back on at the end, code spawned in other modules needs to count toward the limit.

We discussed outside the context of this PR that we will want to handle timeouts in a more deterministic way, but before I get to writing that RFC I think we need to do better here, since this is an attack vector as-is.
I suggest writing something like a shared `Arc<Mutex<Stopwatch>>` that can be paused only for the duration of time that is external to any possible wasm code executing (e.g. paused inside the ipfs reading code, but running during `ipfs_map` callbacks). Or maybe just don't disable the timeout at all.
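For concreteness, here is a minimal sketch of such a pausable stopwatch. It is not the PR's code; the type name and fields merely echo the `stopwatch.rs` file that shows up later in this review. The watchdog task would hold an `Arc<Mutex<TimeoutStopwatch>>` clone and only count time while the stopwatch is running.

```rust
use std::time::{Duration, Instant};

pub struct TimeoutStopwatch {
    // Time accumulated while running, up to the most recent `stop()`.
    elapsed: Duration,
    // `Some` while the stopwatch is running, `None` while it is paused.
    start_time: Option<Instant>,
}

impl TimeoutStopwatch {
    pub fn start_new() -> Self {
        TimeoutStopwatch {
            elapsed: Duration::from_secs(0),
            start_time: Some(Instant::now()),
        }
    }

    /// Total running time, excluding paused periods.
    pub fn elapsed(&self) -> Duration {
        match self.start_time {
            Some(start) => self.elapsed + start.elapsed(),
            None => self.elapsed,
        }
    }

    /// Pause: bank the time accrued so far.
    pub fn stop(&mut self) {
        self.elapsed = self.elapsed();
        self.start_time = None;
    }

    /// Resume after a pause; a no-op if already running.
    pub fn start(&mut self) {
        if self.start_time.is_none() {
            self.start_time = Some(Instant::now());
        }
    }
}
```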
Last batch of comments for this pass
```rust
}

impl<E> HostError for HostExportError<E> where E: fmt::Debug + fmt::Display + Send + Sync + 'static {}
fn get(&self, offset: u32, size: u32) -> Vec<u8> {
```
Are we sure that this is deterministic? I think it probably is, but I want to be sure. Imagine that a subgraph reads from a specific memory address that was not given to it by a previous call to `raw_new`. Does WebAssembly guarantee that all of the memory layout is deterministic? Will something like upgrading the wasmtime dependency possibly affect that memory layout? Is some of the memory returned by `instance.get_memory("memory")` controlled by wasmtime? Where did that name come from, and why didn't we use `Memory::new()` to limit this to just the arena?
Can you access the stack with this memory? Can the WebAssembly compiler introduce optimizations that would eliminate dead code, and thereby eliminate parts of the stack that might be read by some other function?
I went digging to find good answers to your questions.
The store represents the global state of a Wasm module, and this includes the memory. In the future there may be multiple memories, but currently there is only one memory per module, and at least AS calls it `"memory"`. There exists the option of having AS import a memory that we initialize ourselves instead of letting it initialize its own memory, but that's not what we've done so far and I'm not sure what difference that makes. There is no uninitialized memory in Wasm; memory is zeroed when initialized or grown.

The Wasm stack is separate from the store and its memory, so it cannot be accessed by the host.

Wasm is deterministic, and what we're doing from the host here is no different from what a Wasm load or store instruction does. I see no indication that accessing memory from the host should be any different from accessing it from Wasm.

And to your point on optimizations: Cranelift is a general-purpose optimizing compiler, and in principle that is scary for determinism. As a minor point, the optimization level is set to the default of `None`, so dead code elimination isn't performed, but that could change. The important thing is that even a general-purpose optimizing compiler should not violate the spec when compiling Wasm. From what I could see, Cranelift considers any store to be a side effect and therefore not a candidate for dead code elimination. I did find one mention of dead store removal, but I think that's just a mistake in the comment, which I asked about here: bytecodealliance/wasmtime#1828. It seems that while it's not implemented, it's theoretically possible. But we can assume that if a store is visible to a host function, then it's not a dead store.

All that said, there is an alternative backend for wasmtime called Lightbeam, which is overall a more conservative backend and is what Substrate uses. I didn't try it because it doesn't support NaN canonicalization.
I am satisfied that this is fully deterministic. Consider putting some of your learnings in the comment.
I've added some comments around memory and cranelift, and also explicitly set `OptLevel::None`.
runtime/wasm/src/host.rs
```rust
// (diff excerpt: `bail!` is replaced by `ensure!`)
bail!(
ensure!(
    !handlers.is_empty(),
    format!(
```
There is no need to trigger another round of review on account of this, but `ensure!` already supports formatting, so there is no need for the `format!` here.
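For illustration, a minimal sketch of what the reviewer means; the message and variable names are made up, and anyhow's `ensure!` is assumed (failure's version accepts format arguments the same way).

```rust
use anyhow::{ensure, Result};

// `ensure!` takes format arguments directly, so no inner `format!` is needed.
fn check_handlers(handlers: &[String], data_source: &str) -> Result<()> {
    ensure!(
        !handlers.is_empty(),
        "data source `{}` defines no event handlers",
        data_source
    );
    Ok(())
}
```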
```rust
let mut config = wasmtime::Config::new();
config.strategy(wasmtime::Strategy::Cranelift).unwrap();
config.interruptable(true); // For timeouts.
config.cranelift_nan_canonicalization(true); // For NaN determinism.
config.cranelift_opt_level(wasmtime::OptLevel::None);
```
I don't think this will be necessary to do, but I won't complain much. As a matter of curiosity, I wouldn't be surprised if this ended up being faster anyway, if a lot of the code in a trigger is run only once per compile.
Yeah, some mappings might be worse off overall in the runtime/compile-time tradeoff.
```rust
let flags = self.asc_get(flags);

// Pause the timeout while running ipfs_map
self.timeout_stopwatch.lock().unwrap().stop();
```
This needs to use a scope-based drop API (I've had good luck with the simplicity of the `defer` crate). As it is currently, if the inner function panics or returns an error then the stopwatch is never restarted, which will result in the timeout task being leaked.
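A minimal hand-rolled sketch of the scope-based pattern being asked for; the `defer` crate packages the same idea behind a nicer API. `TimeoutStopwatch` refers to the stopwatch sketched earlier in this review, and `with_timeout_paused` is a made-up name.

```rust
use std::sync::{Arc, Mutex};

// Restart the stopwatch when this guard is dropped, which also happens on an
// early `?` return or a panic inside the paused section.
struct RestartOnDrop(Arc<Mutex<TimeoutStopwatch>>);

impl Drop for RestartOnDrop {
    fn drop(&mut self) {
        // Tolerate a poisoned lock: we may already be unwinding from a panic.
        if let Ok(mut stopwatch) = self.0.lock() {
            stopwatch.start();
        }
    }
}

fn with_timeout_paused<T>(
    stopwatch: &Arc<Mutex<TimeoutStopwatch>>,
    work: impl FnOnce() -> Result<T, anyhow::Error>,
) -> Result<T, anyhow::Error> {
    stopwatch.lock().unwrap().stop();
    let _restart = RestartOnDrop(stopwatch.clone());
    work() // the stopwatch is restarted no matter how `work` exits
}
```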
```rust
let time_left = timeout.checked_sub(timeout_stopwatch.lock().unwrap().elapsed());
match time_left {
    None => break interrupt_handle.interrupt(), // Timed out.
    Some(time) => tokio::time::delay_for(time).await,
```
Consider having a minimum wait time. There is a diabolical case where the stopwatch pauses with very little time left and this becomes an expensive spin loop that starves other futures. The "right" way to implement this is to wake up when the stopwatch is unpaused, but I think you want to expend as little effort as possible, and the minimum delay seems simpler.
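A sketch of the watchdog loop with such a minimum wait, following the quoted snippet. The 100ms value is an arbitrary assumption, `TimeoutStopwatch` is the stopwatch sketched earlier, and `interrupt` stands in for `interrupt_handle.interrupt()`.

```rust
use std::sync::{Arc, Mutex};
use std::time::Duration;

const MIN_DELAY: Duration = Duration::from_millis(100);

async fn timeout_watchdog(
    timeout: Duration,
    stopwatch: Arc<Mutex<TimeoutStopwatch>>,
    interrupt: impl Fn(),
) {
    loop {
        match timeout.checked_sub(stopwatch.lock().unwrap().elapsed()) {
            None => break interrupt(), // Timed out.
            // Never sleep for less than MIN_DELAY, so a paused stopwatch with
            // an almost exhausted budget cannot turn this into a busy loop.
            Some(time_left) => tokio::time::delay_for(time_left.max(MIN_DELAY)).await,
        }
    }
}
```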
runtime/wasm/src/module/stopwatch.rs
```rust
pub fn stop(&mut self) {
    self.elapsed = self.elapsed();
    self.start_time = None;
    self.split_time = None;
```
I don't care, but as long as we're making changes, note that `split_time` is unused. I'm surprised there is no warning for this.
@That3Percent Ok, that really wasn't hard, thanks to `defer` which is really nifty. Time for the final 🤞 review?
👏
runtime/wasm/src/module/test.rs
```rust
// Ipfs host functions use `block_on` which must be called from a sync context.
tokio::task::block_in_place(|| {
```
This is the problem with these APIs… there are at least 3 functions that sound like the thing you might want: `spawn_blocking`, `block_on`, and `block_in_place`. But all of these require global knowledge of everything happening both up the stack and down the stack, and have some very subtle problems and differences in their semantics.

I'm racking my brain trying to figure out what the right approach is, but I think the best we can get at the moment may be to replicate what graph-node is actually doing and put modules in their own thread, to replicate the circumstances that `ipfs.map` is called in, since that is very important to whether this test reflects how the implementation will behave in production.

Thinking longer-term, the way that web APIs have been designed is that there are no async functions with sync APIs. Full stop. There is no "readFile()", only "readFile(callback)". Because of this, calling async functions within a mapping is a no-no.

The only way we can fix this is to embrace this in our own APIs. With IPFS this is relatively easy: since the file becomes its own data source, the problem is solved automatically because the handler becomes the callback. The only sticky part will be the `entity_get` API. There are 2 approaches here. We could change the `entity_get` API to literally take a callback, but that would break a large number of existing subgraphs for reasons our community may find difficult to understand. The preferred but larger-effort way is that it should be possible to inspect and re-write the WebAssembly in terms of callbacks, in exactly the same way that would happen if we imagine that this was an async method and any call to `entity_get` had an implicit `.await`. It seems like a pretty big lift, but it's the only way I can think of to get this to behave correctly and retain the API we have.
@Jannis I'd be interested in your opinion of the above for a long-term solution.
What the ideal threading/task architecture for us is remains an open question, but for the purpose of fixing these tests I believe `block_in_place` not working is a bug: tokio-rs/tokio#2603. I've switched to using an OS thread in the ipfs tests, to replicate what we do in the real code.

Given the problems we just had with futures `block_on`, and on the principle of not mixing executor libraries, I've switched our functions in …
runtime/wasm/src/module/test/abi.rs
```rust
let res: Result<(), _> = func();
assert!(res.unwrap_err().to_string().contains(TRAP_TIMEOUT));
}
// #[tokio::test(threaded_scheduler)]
```
Now I'm really suspicious that this has been commented out.
Pure accident; it's a bit slower, so I commented it out to run all tests faster when attempting to reproduce the failure.
Fixed, thanks for catching.
```rust
// Ipfs host functions use `block_on` which must be called from a sync context,
// so we replicate what we do in `spawn_module`.
std::thread::spawn(move || {
```
I'm getting fatigued on this PR, so as a future improvement let's put the spawning of the runtime, module, and thread behind a single API to replace the `test_module` API, and just do this in one place for all tests.
That would make sense.
graph/src/task_spawn.rs
```rust
//! underlying executor implementation which currently is tokio. This serves a few purposes:
//! - Avoid depending directly on tokio APIs, making upgrades or a potential switch easier.
//! - Reflect our chosen default semantics of aborting on task panic, offering `*_allow_panic`
//!   functoins to opt out of that.
```
typo -> functions
🎊
@That3Percent Thanks for the intense review and help, merging!
This is primarily so we can use NaN canonicalization to be certain that wasm execution will be deterministic, but it has other advantages. wasmi was only passively maintained, while wasmtime is an incredibly active project, so we'll have access to any cutting-edge wasm features we might need. Wasmtime has backtraces on wasm traps, which should help with debugging subgraph errors.

On performance, wasmtime is faster since it's a JIT engine while wasmi is an interpreter. Timeout checks are also more efficient in wasmtime. Testing with the moloch subgraph, wasmtime spent a total of 150ms in compilation and 100ms in execution of wasm code, while wasmi spent 1 second in execution. This subgraph takes only a couple of minutes to sync, so it's not very representative, but it shows that wasm execution got way faster. It was already relatively fast, though, so this shouldn't be much of an overall sync-time gain for most subgraphs. But now I'd proclaim that the cost of wasm execution is totally negligible for most subgraphs. We'll see if this shows up in the index node CPU usage; I believe it should.
Wasmtime uses `anyhow` as an error library, so I started using it in the runtime and migrated some code in other crates from `failure` to `anyhow`. Working with anyhow has been great: you can use `context` to add information to errors or turn options into errors. It's just important to remember to format the error with `{:#}` so the full chain of causes is printed (see the short example at the end of this description).

With a bit of refactoring in `ipfs_map` and tests, `block_on_allow_panic` is now just a proxy to `futures::block_on`, as it should be.

The `big_int_to_i32` and `i32_to_big_int` host exports are removed since graph-ts stopped depending on those over a year ago.

On the implementation, the main interface to instantiate a wasmtime `Instance` is through a `Linker`. Unlike wasmi, it's not easy to give host exports access to the context they need, so the `WasmInstance` needed to be wrapped in an `Rc<RefCell<...>>` to be shared. This is returned as a `WasmInstanceHandle`, which manages the instance and makes sure it's dropped. The `link!` macro is a convenience for adding host exports. Timeouts are delivered through the asynchronous wasmtime interrupts, so each module has a tokio task which serves as a watchdog to interrupt the instance after the timeout elapses.
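A small sketch of the anyhow patterns mentioned in the first paragraph above; the file name and error messages are made up.

```rust
use anyhow::{anyhow, Context, Result};

fn read_manifest(path: &str) -> Result<String> {
    // `with_context` annotates the underlying I/O error.
    let contents = std::fs::read_to_string(path)
        .with_context(|| format!("failed to read manifest `{}`", path))?;
    // `anyhow!` turns an Option into an error.
    contents
        .lines()
        .next()
        .map(|line| line.to_string())
        .ok_or_else(|| anyhow!("manifest `{}` is empty", path))
}

fn main() {
    if let Err(e) = read_manifest("does-not-exist.yaml") {
        // `{}` prints only the outermost message; `{:#}` prints the full
        // chain of causes separated by ": ".
        println!("{:#}", e);
    }
}
```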