Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove no-op reinterpret wasmi instructions #518

Merged
merged 18 commits into from
Nov 4, 2022
Merged

Conversation

Robbepop
Copy link
Member

@Robbepop Robbepop commented Oct 15, 2022

This removes the 4 reinterpret wasmi bytecode instructions since they are doing nothing at runtime anyways.
Turns out that for some weird reason removing those 4 instructions significantly regresses performance of the wasmi interpreter ...
Until this is fixed or we know why that happens I won't merge this PR.

They did nothing before this change so we can also just remove them.
This is due to the fact that wasmi bytecode is untyped.
@paritytech-cicd-pr
Copy link

paritytech-cicd-pr commented Oct 15, 2022

BENCHMARKS

NATIVEWASMTIME
BENCHMARKMASTERPRDIFFMASTERPRDIFFWASMTIME OVERHEAD
execute/
bare_call_0
1.23ms 1.20ms 🟢 -1.98% 902.71µs 908.39µs 🔴 0.65% 🟢 -25%
execute/
bare_call_0/typed
652.44µs 651.51µs ⚪ 0.02% 389.28µs 408.21µs 🔴 4.85% 🟢 -37%
execute/
bare_call_1
1.33ms 1.32ms 🟢 -0.73% 1.11ms 1.22ms 🔴 10.07% 🟢 -8%
execute/
bare_call_16
3.07ms 2.85ms 🟢 -7.35% 4.60ms 4.64ms ⚪ 0.07% 🟡 63%
execute/
bare_call_16/typed
1.53ms 1.36ms 🟢 -11.26% 2.00ms 2.02ms ⚪ 0.68% 🟢 48%
execute/
bare_call_1/typed
776.18µs 764.30µs 🟢 -1.43% 699.57µs 697.45µs ⚪ -0.45% 🟢 -9%
execute/
bare_call_4
1.65ms 1.64ms 🟢 -0.70% 1.72ms 1.75ms 🔴 1.56% 🟢 7%
execute/
bare_call_4/typed
876.78µs 857.93µs 🟢 -2.09% 824.43µs 802.98µs 🟢 -2.70% 🟢 -6%
execute/
br_table
830.81µs 890.46µs ⚪ 5.23% 878.53µs 920.41µs 🔴 4.49% 🟢 3%
execute/
count_until
1.15ms 772.43µs 🟢 -33.07% 1.98ms 2.37ms 🔴 19.67% 🔴 207%
execute/
factorial_iterative
500.72µs 364.59µs 🟢 -27.17% 801.11µs 842.85µs 🔴 5.38% 🔴 131%
execute/
factorial_recursive
761.11µs 700.12µs 🟢 -8.19% 1.23ms 1.27ms 🔴 3.64% 🟡 82%
execute/
fib_iterative
2.54ms 1.67ms 🟢 -34.17% 4.33ms 4.48ms 🔴 3.69% 🔴 168%
execute/
fib_recursive
7.98ms 6.64ms 🟢 -16.87% 11.19ms 11.29ms ⚪ 0.86% 🟡 70%
execute/
global_bump
2.00ms 1.17ms 🟢 -41.59% 3.00ms 3.17ms 🔴 5.57% 🔴 171%
execute/
global_const
1.21ms 898.46µs 🟢 -26.18% 2.21ms 2.33ms 🔴 5.80% 🔴 159%
execute/
host_calls
32.42µs 31.52µs 🟢 -2.72% 35.79µs 37.24µs 🔴 4.00% 🟢 18%
execute/
memory_fill
2.04ms 1.50ms 🟢 -26.73% 3.74ms 3.99ms 🔴 6.75% 🔴 166%
execute/
memory_sum
2.02ms 1.50ms 🟢 -25.87% 3.80ms 3.97ms 🔴 4.58% 🔴 165%
execute/
memory_vec_add
4.42ms 3.16ms 🟢 -28.56% 7.72ms 8.22ms 🔴 6.48% 🔴 161%
execute/
recursive_is_even
1.33ms 1.28ms 🟢 -4.21% 2.02ms 2.10ms 🔴 4.08% 🟡 64%
execute/
recursive_ok
171.82µs 165.80µs 🟢 -3.33% 282.62µs 295.25µs 🔴 4.46% 🟡 78%
execute/
recursive_scan
219.00µs 200.94µs 🟢 -8.34% 359.73µs 375.06µs 🔴 4.10% 🟡 87%
execute/
recursive_trap
16.95µs 16.19µs 🟢 -4.29% 28.84µs 28.87µs ⚪ 0.13% 🟡 78%
execute/
regex_redux
815.50µs 660.73µs 🟢 -18.75% 1.40ms 1.45ms 🔴 3.17% 🔴 119%
execute/
rev_complement
890.52µs 620.62µs 🟢 -30.32% 1.39ms 1.43ms 🔴 2.87% 🔴 130%
execute/
tiny_keccak
784.55µs 470.76µs 🟢 -39.94% 1.09ms 1.19ms 🔴 9.96% 🔴 153%
execute/
trunc_f2i
1.33ms 1.09ms 🟢 -17.77% 2.22ms 2.44ms 🔴 9.77% 🔴 123%
instantiate/
wasm_kernel
70.63µs 70.61µs ⚪ 1.16% 71.43µs 95.52µs 🔴 32.71% 🟢 35%
translate/
erc1155
239.75µs 246.43µs 🔴 2.37% 373.63µs 380.49µs 🔴 1.99% 🟡 54%
translate/
erc20
117.87µs 119.29µs ⚪ 1.45% 181.39µs 185.79µs 🔴 2.23% 🟡 56%
translate/
erc721
167.93µs 169.41µs 🔴 1.44% 264.59µs 270.11µs 🔴 1.97% 🟡 59%
translate/
spidermonkey
0.00ns 0.00ns ⚪ 0.53% 0.00ns 0.00ns 🔴 1.99% 🟢 0%
translate/
wasm_kernel
4.26ms 4.19ms 🟢 -1.92% 7.07ms 7.19ms 🔴 2.16% 🟡 72%

Link to pipeline

@Robbepop
Copy link
Member Author

I used the cargo-asm tool to inspect the Executor::execute function on an assembly level to understand if Rust and LLVM somehow fail to properly optimize the function after removal of the 4 unnecessary instructions. The result is: There is a huge difference in the assembly output!

See yourself:

@codecov-commenter
Copy link

codecov-commenter commented Oct 23, 2022

Codecov Report

Merging #518 (4b3d74d) into master (f5e1c46) will decrease coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #518      +/-   ##
==========================================
- Coverage   79.73%   79.65%   -0.08%     
==========================================
  Files          75       75              
  Lines        6291     6267      -24     
==========================================
- Hits         5016     4992      -24     
  Misses       1275     1275              
Impacted Files Coverage Δ
crates/wasmi/src/engine/bytecode/mod.rs 100.00% <ø> (ø)
crates/wasmi/src/engine/executor.rs 98.04% <ø> (-0.05%) ⬇️
crates/wasmi/src/engine/bytecode/tests.rs 100.00% <100.00%> (ø)
crates/wasmi/src/engine/func_builder/mod.rs 89.72% <100.00%> (-0.13%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Robbepop Robbepop changed the title Refactor wasmi bytecode Remove no-op reinterpret wasmi instructions Oct 23, 2022
@Robbepop
Copy link
Member Author

Robbepop commented Nov 4, 2022

While the changes introduces in this PR were a regression to performance for wasmi with Rust 1.64 they are now significantly improving wasmi performance with Rust 1.65. Therefore it is now ready to merge.

@Robbepop Robbepop merged commit 9e9cbab into master Nov 4, 2022
@Robbepop Robbepop deleted the rf-bytecode-refactor branch November 4, 2022 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants