Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lazy Wasm compilation #732

Closed
Robbepop opened this issue Jun 9, 2023 · 19 comments · Fixed by #844
Closed

Add lazy Wasm compilation #732

Robbepop opened this issue Jun 9, 2023 · 19 comments · Fixed by #844
Assignees
Labels
optimization An performance optimization issue. register-machine A work item for the register-machine engine.

Comments

@Robbepop
Copy link
Member

Robbepop commented Jun 9, 2023

Follow-up from #729.

Blocked by: #729

Currently wasmi always validates and translates Wasm to wasmi bytecode for all functions in a Wasm module eagerly.
The advantage of this approach is a simpler architecture, slightly less overhead for direct function calls and better performance in case most or all functions are used upon execution.

Benchmarks conducted in the toywasm repository showed that Wasm runtimes that employ lazy compilation (and validation) have significantly better startup times which not significantly reduce their execution performance. Also their memory consumption is not inflated either.

Despite the performance improvements as indicated by the above benchmarks wasmi does not employ lazy compilation. One reason is that wasmi bytecode and its translation is fairly simple. With the future shift towards register-machine wasmi bytecode the compilation overhead is going to be significantly higher compared to the current stack-machine based wasmi bytecode architecture and it would make lazy compilation likely very advantageous.

This issue tracks progress of lazy compilation.

@Robbepop Robbepop added blocked The issue or PR is currently blocked. optimization An performance optimization issue. labels Jun 9, 2023
@Robbepop Robbepop moved this from Draft to Open in Parity Roadmap Jun 9, 2023
@Robbepop Robbepop removed this from Parity Roadmap Jun 9, 2023
@Robbepop Robbepop moved this from Draft to Open in Parity Roadmap Jun 9, 2023
@Robbepop Robbepop self-assigned this Jun 9, 2023
@tomaka
Copy link
Contributor

tomaka commented Jul 13, 2023

I just wanted to mention that the smoldot light client would hugely benefit from this feature.

At initialization, smoldot downloads and compiles the runtime of a chain. The runtime is typically around 1.0 to 1.5 MiB, and this compilation takes around 300ms-400ms on my machine.

After initialization, smoldot only executes very small functions from this runtime (typically small getters).

In the case of smoldot, the start-up time is extremely important, as the user is waiting before they can interact with the chain. 300ms-400ms isn't that much, but smoldot wants to support connecting to dozens of chains at the same time (so dozens of runtimes), and this compilation time quickly adds up.

@Robbepop
Copy link
Member Author

Robbepop commented Jul 13, 2023

@tomaka Thanks a lot for letting me know about your use case.

The register machine wasmi is certainly going to add some compilation time overhead, therefore the lazy compilation strategy is needed in order to balance out the incoming regressions.

Note that Wasm validation will still be eager and thus the 300-400ms smoldot runtime startup time will probably not evaporate entirely.

One drawback with lazy Wasm compilation is that the translation process might fail, for example due to overflowing wasmi limits and if the compilation is interleaved with execution, then execution might fail. On the contrary with lazy Wasm compilation certain Wasm blobs might work that would not work with eager Wasm compilation since the parts that would fail Wasm compilation might simply not be used.

In the contracts-pallet we are going to eagerly compile when uploading a smart contract Wasm blob and lazily compile when executing one. This way we are certain that all uploaded Wasm smart contracts can actually be used fully and at the same time enjoy lazy compilation times for the common case of smart contract execution.

@tomaka
Copy link
Contributor

tomaka commented Jul 14, 2023

Note that Wasm validation will still be eager and thus the 300-400ms smoldot runtime startup time will probably not evaporate entirely.

Is it not possible to disable the validation as well?

One drawback with lazy Wasm compilation is that the translation process might fail, for example due to overflowing wasmi limits and if the compilation is interleaved with execution, then execution might fail.

It is already not possible for smoldot (or Substrate for that regard) to guarantee ahead of time that execution will succeed.
If the runtime has a bug, execution can fail for example because it panics or because it calls an imported function with invalid parameters or because it reads memory out of bounds.

The fact that the Wasm compilation might fail at execution time is just an extra possible reason. Given that there are already many reasons why execution might fail, it really doesn't change much. It's all under the same "the runtime has a bug" umbrella of reasons.

Similarly, it would also be ok (for smoldot) if a bad Wasm bytecode (for example an invalid opcode) fails at execution time as well.

@Robbepop
Copy link
Member Author

Robbepop commented Jul 14, 2023

Is it not possible to disable the validation as well?

No. While partial Wasm validation is part of the Wasm spec it is also highly controversial as it is a huge safety and security risk and leaders in the Wasm ecosystem say that it was allowed to satisfy some authors of a runtime back in the days and that runtime doesn't even exist anymore today. The only runtime I know of that implements partial Wasm validation is Wasm3.

@tomaka
Copy link
Contributor

tomaka commented Jul 14, 2023

I can imagine that there could be risks in executing a single function of Wasm blob that hasn't been fully validated.

However here the Wasm blob has been fully validated. Just not by the local machine.

Furthermore, since wasmi is an interpreter, I don't really see what the "risk" could be except for the fact that execution could succeed when it normally shouldn't.

@Robbepop
Copy link
Member Author

Robbepop commented Jul 14, 2023

Furthermore, since wasmi is an interpreter, I don't really see what the "risk" could be except for the fact that execution could succeed when it normally shouldn't.

The wasmi bytecode executor has some unsafe parts that rely on Wasm validation to be correct. If you view wasmi in isolation it is simply not a good idea to take off this mandatory safety belt. Also Wasm validation is comparatively fast. There is a reason why no major Wasm runtime implements partial Wasm validation.

@Robbepop Robbepop added register-machine A work item for the register-machine engine. and removed blocked The issue or PR is currently blocked. labels Sep 22, 2023
@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/smoldot-updates-threads/4471/12

@Robbepop
Copy link
Member Author

Robbepop commented Dec 8, 2023

All blockers have been resolved and it is now time to (finally) work on this issue.

@Robbepop
Copy link
Member Author

Robbepop commented Dec 10, 2023

I was able to conduct initial benchmarks in #844.

@tomaka If you are able to disable Wasm validation in smoldot speed-ups in compilation performance for your use case are likely 10x, otherwise more like 2x. You can disable Wasm validation if you know that the bytes you operate on is valid Wasm bytecode.
In case you cannot be sure about this it may be an option to separate Wasm validation and validate the input using multiple threads. Benchmarks needed.

@Robbepop
Copy link
Member Author

Robbepop commented Dec 16, 2023

@tomaka I am mostly done implementing #844.

Benchmark results on our CI machines indicate far better speed-ups than on my local machine:

  • Spidermonkey timings (big Wasm):

    • checked + eager + fueled: 82.22ms
    • checked + lazy: 32.64ms (2.5x)
    • unchecked + lazy: 3.05ms (27x)
  • ERC-20 timings (small Wasm):

    • checked + eager + fueled: 142.15µs
    • checked + lazy: 65.85µs (2.15x)
    • unchecked + lazy: 19.09µs (7.4x)

    In pallet_contracts we will be able to use the fast unchecked + lazy configuration since we know that the smart contract blobs are valid Wasm and I hope you can do that in smoldot as well.

@github-project-automation github-project-automation bot moved this from Open to Closed in Parity Roadmap Dec 16, 2023
@tomaka
Copy link
Contributor

tomaka commented Dec 17, 2023

Ah great!

If I use unchecked but that in reality it turns out that the Wasm is invalid, what can happen?

@Robbepop
Copy link
Member Author

Robbepop commented Dec 17, 2023

Ah great!

If I use unchecked but that in reality it turns out that the Wasm is invalid, what can happen?

@tomaka There be dragons. Undefined behavior at its finest.

@tomaka
Copy link
Contributor

tomaka commented Dec 17, 2023

I see, so this doesn't fit smoldot, but I can use lazy.

@Robbepop
Copy link
Member Author

Are you able to use multiple threads? In this case I could implement parallel Wasm validation since that would greatly help.

@tomaka
Copy link
Contributor

tomaka commented Dec 17, 2023

No

@Robbepop
Copy link
Member Author

Robbepop commented Dec 17, 2023

Due to the discussions in WebAssembly/design#1464 I did not implement partial Wasm validation for Module::new, however, technically we could implement it in Wasmi without much implementation burden from what I would say.

However, the reasons against partial Wasm validation are ... valid ... in my opinion and I think it can be dangerous to go there.
I just released Wasmi v0.32.0-beta.0 to crates.io. In case you test it out I would love feedback on the improved startup times. Also I'd like to know if you want me to implement partial Wasm validation support in Wasmi even though it is a controversial topic.

edit: To clarify: Partial Wasm validation means that Wasm function bodies are only validated on their first use. The problem with this is that is causes invalid Wasm modules to work as long as the invalid parts are unused and thus invalid Wasm modules can become undetected for a while and create a mess.

edit2: Partial Wasm validation would yield the same 27x perfomance improvements as unchecked lazy Wasm compilation.

@tomaka
Copy link
Contributor

tomaka commented Dec 17, 2023

I don't really understand why you're okay with implementing unchecked but you're not okay with partial validation. To me, unchecked is in every way "worse" (according to the arguments in the issue you've linked) than partial validation.

@tomaka
Copy link
Contributor

tomaka commented Dec 17, 2023

As far as I understand, the main argument against partial validation is:

A concrete issue would be if a widely used module has a function with a validation error in it, but the function is never called in practice. On an engine with lazy validation, the module would run fine. On an engine with eager validation, the module would fail to compile and the web page would be broken. In order to resolve this, the second engine would need to implement lazy validation even if it doesn't make sense in their compilation model.

Indeed, when a runtime upgrade happens, the new runtime would have to be compiled with eager validation in order to check whether it is valid.

In the case of the smoldot light client, though, the situation is that we only ever execute the wasm after we have received the guarantee that at least 2/3rds of the validators of the network (who have previously eagerly compiled the wasm) have declared "this wasm is valid". And the smoldot light client is based upon the assumption that at least 2/3rds of the validators are honest, otherwise everything is lost. It seems to 100% fit within the partial validation situation.

I'm not ok with "unchecked" validation, because 1) we want to gracefully inform the user if a chain turns out to be broken rather than go into undefined behavior 2) the same instance of smoldot can be used to connect to multiple chains, and if one chain is broken we don't want the other chain connections to be affected.

@Robbepop
Copy link
Member Author

Robbepop commented Dec 18, 2023

I don't really understand why you're okay with implementing unchecked but you're not okay with partial validation. To me, unchecked is in every way "worse" (according to the arguments in the issue you've linked) than partial validation.

Okay let me elaborate on my design thoughts.

The goal is to avoid partial Wasm validation while allowing for the fastest possible compilation performance when possible.
So we provide two mechanisms:

  1. lazy compilation with eager Wasm validation which is faster than eager compilation and safe.
  2. lazy compilation without Wasm validation for when users can safely assume their Wasm inputs to be valid Wasm.

In case of the pallet_contracts 2) is applicable because it has full control over its input Wasm blobs. Note that Module::new_unchecked is an unsafe API so the responsibilities are on the user side. We never deal with partial validated Wasm blobs since in 2) we assume the Wasm to be valid.

In smoldot I agree that you shouldn't use unchecked Wasm validation since it seems that you do not have this "full control" over the input Wasm blobs. Therefore the middleground solution with lazy compilation & validation would probably make sense for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimization An performance optimization issue. register-machine A work item for the register-machine engine.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants