Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize wasmi::Linker host function setup #989

Merged
merged 11 commits into from
Apr 21, 2024
Merged

Conversation

Robbepop
Copy link
Member

@Robbepop Robbepop commented Apr 20, 2024

Closes #979.

cc @athei

Most inefficiencies are in StringInterner::get_or_insert and in memcmp calls to compare definition names.

image

Findings

  • By using an empty "" name for the module name for all host functions we can improve performance by ~15%.
    • This is likely due to a fast check for empty-string to avoid having to use memcmp.
  • Introduction of struct LenOrder(Arc<str>) and struct LenOrderStr(str) wrapper types with new Ord implementation to avoid memcmp are very effective, improving performance by roughly 50%. Unfortunately this requires a single line of unsafe Rust code since Rust's BTreeMap does not support custom comparators.

image

Copy link

codecov bot commented Apr 20, 2024

Codecov Report

Attention: Patch coverage is 64.46281% with 43 lines in your changes are missing coverage. Please review.

Project coverage is 81.89%. Comparing base (6c58a0f) to head (92aea8d).

Files Patch % Lines
crates/wasmi/src/linker.rs 62.83% 42 Missing ⚠️
crates/wasmi/src/func/mod.rs 87.50% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #989      +/-   ##
==========================================
+ Coverage   81.81%   81.89%   +0.08%     
==========================================
  Files         262      262              
  Lines       24110    24186      +76     
==========================================
+ Hits        19726    19808      +82     
+ Misses       4384     4378       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

According to our benchmarks those string type wrappers and their custom Ord implementation improve Linker performance by roughly 50%. This is probably because the Rust compiler is able to inline the `memcmp` routines whereas with `Arc<str>` and `str` it uses `memcmp` calls which are inefficient for small strings.
The new implementation optimizes the Ord impl of ImportKey since ImportKeys are compared more often than their fields are queried during Linker setup.
This allows us to create new HostFuncTrampolineEntities without an Engine and thus allow for a HostApi abstraction later to further optimize Linker setup.
Also conveniently this allow us to get rid of tons of custom Debug implementations to resolve the no longer existing DedupFuncTypes.
@Robbepop
Copy link
Member Author

The new experimental LinkerBuilder API benchmarks show that LinkerBuilder::finish is up to 6 times faster than re-constructing the Linker which is a great speed-up for certain use cases.
The downside is that this requires to build up the LinkerBuilder beforehand which is roughly as slow as building up a normal Linker. Thus if a user now builds all Linker using the new LinkerBuilder but only actually builds up a single Linker this is going to be roughly 20% slower.

@Robbepop Robbepop marked this pull request as ready for review April 21, 2024 12:31
@Robbepop Robbepop merged commit 023d06a into master Apr 21, 2024
16 of 17 checks passed
@Robbepop Robbepop deleted the rf-optimize-linker-setup branch April 21, 2024 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize Linker setup
1 participant