In recent months we have been hard at work improving the reliability and performance of the runtime. We approach this task in many different ways: simplifying the code, cutting out unused modules, and modernizing our testing and fuzzing suites are just some of the approaches we have employed to make progress towards a runtime we can have full confidence in.
Until very recently our fuzzing setup would generate contracts almost randomly. Most of the contracts generated this way were rejected as invalid early on, because the generated WebAssembly code used proposals we do not implement. On the off chance that the generator produced a module that used just the core primitives, we would still reject the contract most of the time during linking: such contracts would often import functions that our runtime is unable to provide, for example env.c0gfJY.
In an effort to improve the efficacy of our fuzzing suite we made some changes to the test case generator. All of this work culminated in a change from an external contributor, @mooori, that tied everything together.
A couple of hours after merging this contribution we started receiving hundreds of automated reports of test cases that crash our runtime! The test cases were quite straightforward to reduce to a single pattern, a contract directly re-exporting an imported host function, which is exactly the area our fuzz suite improvements focused on.
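To make the pattern concrete, here is a minimal sketch that assembles a WebAssembly binary whose only export is the function it imports. It is an illustration, not the reduced test case from the fuzzer: the import name `host_fn`, the export name `main`, and the helper functions are all assumptions chosen for the example.

```rust
// Equivalent text format of the module built below (names are hypothetical):
//   (module
//     (type (func))
//     (import "env" "host_fn" (func (type 0)))
//     (export "main" (func 0)))

/// Append one section: id byte, payload length, payload.
/// The length is written as a single LEB128 byte, so payloads must be < 128 bytes.
fn section(out: &mut Vec<u8>, id: u8, payload: &[u8]) {
    assert!(payload.len() < 128, "sketch only handles short sections");
    out.push(id);
    out.push(payload.len() as u8);
    out.extend_from_slice(payload);
}

/// Append a length-prefixed UTF-8 name (again limited to < 128 bytes).
fn name(out: &mut Vec<u8>, s: &str) {
    assert!(s.len() < 128, "sketch only handles short names");
    out.push(s.len() as u8);
    out.extend_from_slice(s.as_bytes());
}

/// Build a module that imports one `() -> ()` function and re-exports it directly.
fn reexport_module(import_module: &str, import_name: &str, export_name: &str) -> Vec<u8> {
    let mut wasm = b"\0asm".to_vec(); // magic number
    wasm.extend_from_slice(&[1, 0, 0, 0]); // binary format version 1

    // Type section (id 1): one function type with no parameters and no results.
    section(&mut wasm, 1, &[1, 0x60, 0, 0]);

    // Import section (id 2): one function import using type index 0.
    let mut imports = vec![1];
    name(&mut imports, import_module);
    name(&mut imports, import_name);
    imports.extend_from_slice(&[0x00, 0]); // kind = function, type index 0
    section(&mut wasm, 2, &imports);

    // Export section (id 7): export function index 0, which is the import itself,
    // because imported functions come first in the function index space.
    let mut exports = vec![1];
    name(&mut exports, export_name);
    exports.extend_from_slice(&[0x00, 0]); // kind = function, function index 0
    section(&mut wasm, 7, &exports);

    wasm
}
```

Calling `reexport_module("env", "host_fn", "main")` yields a module with no function bodies of its own: the exported function is supplied entirely by the host.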
This crash was ultimately traced to a recent migration away from the high-level API that wasmer exposes towards its lower-level interfaces. Although the upstream wasmer project considers these interfaces a private and unstable implementation detail, we felt this migration was justified by virtue of us now maintaining and developing our own copy of the runtime.
One of the things this refactoring changed is how contract functions are found and invoked. In particular, all calls from the host (neard) to the VM (the runtime) go through a piece of code called a trampoline. The trampoline is responsible, among other things, for placing the function call arguments in the registers and memory locations expected by the functions that the compiler generated.
In the typical scenario, where the WebAssembly module defines a function and exports it, the data structures describing the function have the trampoline specified at the time the compiled code is loaded into memory. The situation where the module directly re-exports an import, however, is special: the definitions of these imported host functions are provided at a later stage, during linking. As a result, they would end up without any trampoline being specified at all. This is what ultimately caused the crash we observed.
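The failure mode can be modelled with a toy example. This is a sketch under stated assumptions, not the wasmer data structures: `FunctionEntry`, `Instance`, `load_buggy` and the function names are hypothetical, and the missing trampoline is reported as an error here rather than crashing.

```rust
/// A compiled function body (hypothetical, single-argument for brevity).
type Body = fn(i64) -> i64;
/// A trampoline receives the body and the host-supplied argument and performs the call.
type Trampoline = fn(Body, i64) -> i64;

/// Stand-in for the real argument-marshalling code.
fn default_trampoline(body: Body, arg: i64) -> i64 {
    body(arg)
}

#[derive(Debug, PartialEq)]
enum CallError {
    NoSuchExport,
    MissingTrampoline,
}

/// Per-function metadata; the trampoline is optional, which is what allows the bug.
struct FunctionEntry {
    body: Body,
    trampoline: Option<Trampoline>,
}

struct Instance {
    exports: Vec<(&'static str, FunctionEntry)>,
}

impl Instance {
    /// Host-to-VM call: look up the export, then go through its trampoline.
    fn call(&self, name: &str, arg: i64) -> Result<i64, CallError> {
        let (_, entry) = self
            .exports
            .iter()
            .find(|(n, _)| *n == name)
            .ok_or(CallError::NoSuchExport)?;
        let trampoline = entry.trampoline.ok_or(CallError::MissingTrampoline)?;
        Ok(trampoline(entry.body, arg))
    }
}

fn double(x: i64) -> i64 {
    x * 2
}

fn host_add_one(x: i64) -> i64 {
    x + 1
}

/// Models the regression: functions defined in the module get a trampoline at
/// load time, while the re-exported host function is left without one.
fn load_buggy() -> Instance {
    Instance {
        exports: vec![
            ("local", FunctionEntry { body: double, trampoline: Some(default_trampoline) }),
            ("reexported", FunctionEntry { body: host_add_one, trampoline: None }),
        ],
    }
}
```

In this model `load_buggy().call("local", 21)` succeeds, while `load_buggy().call("reexported", 1)` returns `Err(CallError::MissingTrampoline)`.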
The original high-level wasmer API we migrated away from handled the problem by setting up the trampoline during the lookup of an exported function. This happened very late in the instance lifetime, and was implemented as part of the high-level API that we had removed. Unaware of this detail, we ended up losing the piece of code responsible for this set-up.
To resolve this oversight, we have made changes to ensure that trampolines for host functions are set up during the linking phase (i.e. as early as possible), so future refactors should never hit this sort of scenario again. Admittedly, though, there are still plenty of refactors that need to happen before similar oversights are ruled out by construction. If this sort of effort sounds like something you'd enjoy contributing to, take a look at our jobs page as we're hiring.
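The idea of attaching trampolines during linking can be sketched as follows. This is a minimal illustration, assuming a simplified linker; `HostImport`, `LinkedFunction`, `link` and `forward` are hypothetical names, not the runtime's actual API. The trampoline field is deliberately not optional, so a linked function without one cannot be constructed.

```rust
use std::collections::HashMap;

/// A host function body (hypothetical, single-argument for brevity).
type Body = fn(i64) -> i64;
/// A trampoline receives the body and the host-supplied argument and performs the call.
type Trampoline = fn(Body, i64) -> i64;

/// Stand-in for the real argument-marshalling code.
fn forward(body: Body, arg: i64) -> i64 {
    body(arg)
}

/// A function after linking. The trampoline is mandatory, not an `Option`.
struct LinkedFunction {
    body: Body,
    trampoline: Trampoline,
}

impl LinkedFunction {
    fn call(&self, arg: i64) -> i64 {
        (self.trampoline)(self.body, arg)
    }
}

/// What the host provides for one import at link time.
struct HostImport {
    name: &'static str,
    body: Body,
}

/// Resolve each `(export_name, import_name)` pair against the host's imports and
/// attach the trampoline immediately, instead of deferring it to export lookup.
fn link(
    host: &[HostImport],
    reexports: &[(&'static str, &'static str)],
) -> Result<HashMap<&'static str, LinkedFunction>, String> {
    let mut exports = HashMap::new();
    for &(export_name, import_name) in reexports {
        let import = host
            .iter()
            .find(|h| h.name == import_name)
            .ok_or_else(|| format!("unresolved import: {}", import_name))?;
        exports.insert(
            export_name,
            LinkedFunction { body: import.body, trampoline: forward },
        );
    }
    Ok(exports)
}

fn host_add_one(x: i64) -> i64 {
    x + 1
}
```

With a host that provides `host_add_one`, linking `("main", "host_add_one")` produces a callable export, while an import the host cannot provide is rejected at link time with an error.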
Timeline
2022-01-11 The refactoring that introduced the issue in question has been merged.
2022-05-18 The PR to augment the fuzzing suite has been merged.
2022-05-20 A crash has been reported by our fuzzing infrastructure, the test case has been minified.
2022-05-24 This issue has been assigned a severity of SEV0 (CODE_RED).
2022-05-30 After validation, the fixes have been published and 1.26.1 has been released with the fix.