Add lazy Wasm compilation #732

Robbepop · 2023-06-09T19:07:51Z

Follow-up from #729.

~~Blocked by: #729~~

Currently wasmi always validates and translates Wasm to wasmi bytecode for all functions in a Wasm module eagerly.
The advantage of this approach is a simpler architecture, slightly less overhead for direct function calls and better performance in case most or all functions are used upon execution.

Benchmarks conducted in the toywasm repository showed that Wasm runtimes that employ lazy compilation (and validation) have significantly better startup times which not significantly reduce their execution performance. Also their memory consumption is not inflated either.

Despite the performance improvements as indicated by the above benchmarks wasmi does not employ lazy compilation. One reason is that wasmi bytecode and its translation is fairly simple. With the future shift towards register-machine wasmi bytecode the compilation overhead is going to be significantly higher compared to the current stack-machine based wasmi bytecode architecture and it would make lazy compilation likely very advantageous.

This issue tracks progress of lazy compilation.

The text was updated successfully, but these errors were encountered:

tomaka · 2023-07-13T19:02:03Z

I just wanted to mention that the smoldot light client would hugely benefit from this feature.

At initialization, smoldot downloads and compiles the runtime of a chain. The runtime is typically around 1.0 to 1.5 MiB, and this compilation takes around 300ms-400ms on my machine.

After initialization, smoldot only executes very small functions from this runtime (typically small getters).

In the case of smoldot, the start-up time is extremely important, as the user is waiting before they can interact with the chain. 300ms-400ms isn't that much, but smoldot wants to support connecting to dozens of chains at the same time (so dozens of runtimes), and this compilation time quickly adds up.

Robbepop · 2023-07-13T19:42:40Z

@tomaka Thanks a lot for letting me know about your use case.

The register machine wasmi is certainly going to add some compilation time overhead, therefore the lazy compilation strategy is needed in order to balance out the incoming regressions.

Note that Wasm validation will still be eager and thus the 300-400ms smoldot runtime startup time will probably not evaporate entirely.

One drawback with lazy Wasm compilation is that the translation process might fail, for example due to overflowing wasmi limits and if the compilation is interleaved with execution, then execution might fail. On the contrary with lazy Wasm compilation certain Wasm blobs might work that would not work with eager Wasm compilation since the parts that would fail Wasm compilation might simply not be used.

In the contracts-pallet we are going to eagerly compile when uploading a smart contract Wasm blob and lazily compile when executing one. This way we are certain that all uploaded Wasm smart contracts can actually be used fully and at the same time enjoy lazy compilation times for the common case of smart contract execution.

tomaka · 2023-07-14T07:09:47Z

Note that Wasm validation will still be eager and thus the 300-400ms smoldot runtime startup time will probably not evaporate entirely.

Is it not possible to disable the validation as well?

One drawback with lazy Wasm compilation is that the translation process might fail, for example due to overflowing wasmi limits and if the compilation is interleaved with execution, then execution might fail.

It is already not possible for smoldot (or Substrate for that regard) to guarantee ahead of time that execution will succeed.
If the runtime has a bug, execution can fail for example because it panics or because it calls an imported function with invalid parameters or because it reads memory out of bounds.

The fact that the Wasm compilation might fail at execution time is just an extra possible reason. Given that there are already many reasons why execution might fail, it really doesn't change much. It's all under the same "the runtime has a bug" umbrella of reasons.

Similarly, it would also be ok (for smoldot) if a bad Wasm bytecode (for example an invalid opcode) fails at execution time as well.

Robbepop · 2023-07-14T07:25:13Z

Is it not possible to disable the validation as well?

No. While partial Wasm validation is part of the Wasm spec it is also highly controversial as it is a huge safety and security risk and leaders in the Wasm ecosystem say that it was allowed to satisfy some authors of a runtime back in the days and that runtime doesn't even exist anymore today. The only runtime I know of that implements partial Wasm validation is Wasm3.

tomaka · 2023-07-14T08:10:37Z

I can imagine that there could be risks in executing a single function of Wasm blob that hasn't been fully validated.

However here the Wasm blob has been fully validated. Just not by the local machine.

Furthermore, since wasmi is an interpreter, I don't really see what the "risk" could be except for the fact that execution could succeed when it normally shouldn't.

Robbepop · 2023-07-14T08:31:37Z

Furthermore, since wasmi is an interpreter, I don't really see what the "risk" could be except for the fact that execution could succeed when it normally shouldn't.

The wasmi bytecode executor has some unsafe parts that rely on Wasm validation to be correct. If you view wasmi in isolation it is simply not a good idea to take off this mandatory safety belt. Also Wasm validation is comparatively fast. There is a reason why no major Wasm runtime implements partial Wasm validation.

Polkadot-Forum · 2023-11-03T15:09:53Z

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/smoldot-updates-threads/4471/12

Robbepop · 2023-12-08T19:20:41Z

All blockers have been resolved and it is now time to (finally) work on this issue.

Robbepop · 2023-12-10T22:18:54Z

I was able to conduct initial benchmarks in #844.

@tomaka If you are able to disable Wasm validation in smoldot speed-ups in compilation performance for your use case are likely 10x, otherwise more like 2x. You can disable Wasm validation if you know that the bytes you operate on is valid Wasm bytecode.
In case you cannot be sure about this it may be an option to separate Wasm validation and validate the input using multiple threads. Benchmarks needed.

Robbepop · 2023-12-16T20:31:37Z

@tomaka I am mostly done implementing #844.

Benchmark results on our CI machines indicate far better speed-ups than on my local machine:

Spidermonkey timings (big Wasm):
- checked + eager + fueled: 82.22ms
- checked + lazy: 32.64ms (2.5x)
- unchecked + lazy: 3.05ms (27x)
ERC-20 timings (small Wasm):
- checked + eager + fueled: 142.15µs
- checked + lazy: 65.85µs (2.15x)
- unchecked + lazy: 19.09µs (7.4x)
In pallet_contracts we will be able to use the fast unchecked + lazy configuration since we know that the smart contract blobs are valid Wasm and I hope you can do that in smoldot as well.

tomaka · 2023-12-17T07:59:51Z

Ah great!

If I use unchecked but that in reality it turns out that the Wasm is invalid, what can happen?

Robbepop · 2023-12-17T08:40:34Z

Ah great!

If I use unchecked but that in reality it turns out that the Wasm is invalid, what can happen?

@tomaka There be dragons. Undefined behavior at its finest.

tomaka · 2023-12-17T09:09:29Z

I see, so this doesn't fit smoldot, but I can use lazy.

Robbepop · 2023-12-17T09:20:15Z

Are you able to use multiple threads? In this case I could implement parallel Wasm validation since that would greatly help.

tomaka · 2023-12-17T10:41:36Z

No

Robbepop · 2023-12-17T20:13:38Z

Due to the discussions in WebAssembly/design#1464 I did not implement partial Wasm validation for Module::new, however, technically we could implement it in Wasmi without much implementation burden from what I would say.

However, the reasons against partial Wasm validation are ... valid ... in my opinion and I think it can be dangerous to go there.
I just released Wasmi v0.32.0-beta.0 to crates.io. In case you test it out I would love feedback on the improved startup times. Also I'd like to know if you want me to implement partial Wasm validation support in Wasmi even though it is a controversial topic.

edit: To clarify: Partial Wasm validation means that Wasm function bodies are only validated on their first use. The problem with this is that is causes invalid Wasm modules to work as long as the invalid parts are unused and thus invalid Wasm modules can become undetected for a while and create a mess.

edit2: Partial Wasm validation would yield the same 27x perfomance improvements as unchecked lazy Wasm compilation.

tomaka · 2023-12-17T21:44:21Z

I don't really understand why you're okay with implementing unchecked but you're not okay with partial validation. To me, unchecked is in every way "worse" (according to the arguments in the issue you've linked) than partial validation.

tomaka · 2023-12-17T21:54:01Z

As far as I understand, the main argument against partial validation is:

A concrete issue would be if a widely used module has a function with a validation error in it, but the function is never called in practice. On an engine with lazy validation, the module would run fine. On an engine with eager validation, the module would fail to compile and the web page would be broken. In order to resolve this, the second engine would need to implement lazy validation even if it doesn't make sense in their compilation model.

Indeed, when a runtime upgrade happens, the new runtime would have to be compiled with eager validation in order to check whether it is valid.

In the case of the smoldot light client, though, the situation is that we only ever execute the wasm after we have received the guarantee that at least 2/3rds of the validators of the network (who have previously eagerly compiled the wasm) have declared "this wasm is valid". And the smoldot light client is based upon the assumption that at least 2/3rds of the validators are honest, otherwise everything is lost. It seems to 100% fit within the partial validation situation.

I'm not ok with "unchecked" validation, because 1) we want to gracefully inform the user if a chain turns out to be broken rather than go into undefined behavior 2) the same instance of smoldot can be used to connect to multiple chains, and if one chain is broken we don't want the other chain connections to be affected.

Robbepop · 2023-12-18T08:09:21Z

I don't really understand why you're okay with implementing unchecked but you're not okay with partial validation. To me, unchecked is in every way "worse" (according to the arguments in the issue you've linked) than partial validation.

Okay let me elaborate on my design thoughts.

The goal is to avoid partial Wasm validation while allowing for the fastest possible compilation performance when possible.
So we provide two mechanisms:

lazy compilation with eager Wasm validation which is faster than eager compilation and safe.
lazy compilation without Wasm validation for when users can safely assume their Wasm inputs to be valid Wasm.

In case of the pallet_contracts 2) is applicable because it has full control over its input Wasm blobs. Note that Module::new_unchecked is an unsafe API so the responsibilities are on the user side. We never deal with partial validated Wasm blobs since in 2) we assume the Wasm to be valid.

In smoldot I agree that you shouldn't use unchecked Wasm validation since it seems that you do not have this "full control" over the input Wasm blobs. Therefore the middleground solution with lazy compilation & validation would probably make sense for you.

Robbepop added blocked The issue or PR is currently blocked. optimization An performance optimization issue. labels Jun 9, 2023

Robbepop added this to Parity Roadmap Jun 9, 2023

github-project-automation bot moved this to Draft in Parity Roadmap Jun 9, 2023

Robbepop moved this from Draft to Open in Parity Roadmap Jun 9, 2023

Robbepop removed this from Parity Roadmap Jun 9, 2023

Robbepop added this to Parity Roadmap Jun 9, 2023

github-project-automation bot moved this to Draft in Parity Roadmap Jun 9, 2023

Robbepop moved this from Draft to Open in Parity Roadmap Jun 9, 2023

Robbepop self-assigned this Jun 9, 2023

tomaka mentioned this issue Aug 15, 2023

Improve light client time to warp sync finished smol-dot/smoldot#864

Open

Robbepop added register-machine A work item for the register-machine engine. and removed blocked The issue or PR is currently blocked. labels Sep 22, 2023

This was referenced Nov 30, 2023

Remove the stack-machine wasmi engine backend #816

Closed

Add Module::new_unchecked to avoid expensive Wasm function validation #817

Closed

Robbepop mentioned this issue Dec 9, 2023

Implement lazy Wasm to wasmi bytecode translation #844

Merged

7 tasks

Robbepop closed this as completed in #844 Dec 16, 2023

github-project-automation bot moved this from Open to Closed in Parity Roadmap Dec 16, 2023

Robbepop mentioned this issue Dec 18, 2023

Experiment with partial Wasm validation compilation mode #853

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add lazy Wasm compilation #732

Add lazy Wasm compilation #732

Robbepop commented Jun 9, 2023 •

edited

Loading

tomaka commented Jul 13, 2023 •

edited

Loading

Robbepop commented Jul 13, 2023 •

edited

Loading

tomaka commented Jul 14, 2023 •

edited

Loading

Robbepop commented Jul 14, 2023 •

edited

Loading

tomaka commented Jul 14, 2023

Robbepop commented Jul 14, 2023 •

edited

Loading

Polkadot-Forum commented Nov 3, 2023

Robbepop commented Dec 8, 2023

Robbepop commented Dec 10, 2023 •

edited

Loading

Robbepop commented Dec 16, 2023 •

edited

Loading

tomaka commented Dec 17, 2023

Robbepop commented Dec 17, 2023 •

edited

Loading

tomaka commented Dec 17, 2023

Robbepop commented Dec 17, 2023

tomaka commented Dec 17, 2023

Robbepop commented Dec 17, 2023 •

edited

Loading

tomaka commented Dec 17, 2023 •

edited

Loading

tomaka commented Dec 17, 2023 •

edited

Loading

Robbepop commented Dec 18, 2023 •

edited

Loading

Add lazy Wasm compilation #732

Add lazy Wasm compilation #732

Comments

Robbepop commented Jun 9, 2023 • edited Loading

tomaka commented Jul 13, 2023 • edited Loading

Robbepop commented Jul 13, 2023 • edited Loading

tomaka commented Jul 14, 2023 • edited Loading

Robbepop commented Jul 14, 2023 • edited Loading

tomaka commented Jul 14, 2023

Robbepop commented Jul 14, 2023 • edited Loading

Polkadot-Forum commented Nov 3, 2023

Robbepop commented Dec 8, 2023

Robbepop commented Dec 10, 2023 • edited Loading

Robbepop commented Dec 16, 2023 • edited Loading

tomaka commented Dec 17, 2023

Robbepop commented Dec 17, 2023 • edited Loading

tomaka commented Dec 17, 2023

Robbepop commented Dec 17, 2023

tomaka commented Dec 17, 2023

Robbepop commented Dec 17, 2023 • edited Loading

tomaka commented Dec 17, 2023 • edited Loading

tomaka commented Dec 17, 2023 • edited Loading

Robbepop commented Dec 18, 2023 • edited Loading

Robbepop commented Jun 9, 2023 •

edited

Loading

tomaka commented Jul 13, 2023 •

edited

Loading

Robbepop commented Jul 13, 2023 •

edited

Loading

tomaka commented Jul 14, 2023 •

edited

Loading

Robbepop commented Jul 14, 2023 •

edited

Loading

Robbepop commented Jul 14, 2023 •

edited

Loading

Robbepop commented Dec 10, 2023 •

edited

Loading

Robbepop commented Dec 16, 2023 •

edited

Loading

Robbepop commented Dec 17, 2023 •

edited

Loading

Robbepop commented Dec 17, 2023 •

edited

Loading

tomaka commented Dec 17, 2023 •

edited

Loading

tomaka commented Dec 17, 2023 •

edited

Loading

Robbepop commented Dec 18, 2023 •

edited

Loading