Skip to content

Commit

Permalink
Improve signature lookup happening during instantiation
Browse files Browse the repository at this point in the history
This commit is intended to be a perf improvement for instantiation of
modules with lots of functions. Previously the `lookup_shared_signature`
callback was showing up quite high in profiles as part of instantiation.

As some background, this callback is used to translate from a module's
`SignatureIndex` to a `VMSharedSignatureIndex` which the instance
stores. This callback is called for two reasons, one is to translate all
of the module's own types into `VMSharedSignatureIndex` for the purposes
of `call_indirect` (the translation of that loads from this table to
compare indices). The second reason is that a `VMCallerCheckedAnyfunc`
is prepared for all functions and this embeds a `VMSharedSignatureIndex`
inside of it.

The slow part today is that the lookup callback was called
once-per-function and each lookup involved hashing a full
`WasmFuncType`. Albeit our hash algorithm is still Rust's default
SipHash algorithm which is quite slow, but we also shouldn't need to
re-hash each signature if we see it multiple times anyway.

The fix applied in this commit is to change this lookup callback to an
`enum` where one variant is that there's a table to lookup from. This
table is a `PrimaryMap` which means that lookup is quite fast. The only
thing we need to do is to prepare the table ahead of time. Currently
this happens on the instantiation path because in my measurments the
creation of the table is quite fast compared to the rest of
instantiation. If this becomes an issue, though, we can look into
creating the table as part of `SigRegistry::register_module` and caching
it somewhere (I'm not entirely sure where but I'm sure we can figure it
out).

There's in generally not a ton of efficiency around the `SigRegistry`
type. I'm hoping though that this fixes the next-lowest-hanging-fruit in
terms of performance without complicating the implementation too much. I
tried a few variants and this change seemed like the best balance
between simplicity and still a nice performance gain.

Locally I measured an improvement in instantiation time for a large-ish
module by reducing the time from ~3ms to ~2.6ms per instance.
  • Loading branch information
alexcrichton committed Apr 8, 2021
1 parent 2ca97ed commit 2b776f7
Show file tree
Hide file tree
Showing 13 changed files with 141 additions and 98 deletions.
1 change: 1 addition & 0 deletions crates/jit/src/instantiate.rs
Original file line number Diff line number Diff line change
Expand Up @@ -298,6 +298,7 @@ impl CompiledModule {
}

/// Returns the map of all finished JIT functions compiled for this module
#[inline]
pub fn finished_functions(&self) -> &PrimaryMap<DefinedFuncIndex, *mut [VMFunctionBody]> {
&self.finished_functions.0
}
Expand Down
10 changes: 0 additions & 10 deletions crates/runtime/src/externref.rs
Original file line number Diff line number Diff line change
Expand Up @@ -828,16 +828,6 @@ impl StackMapRegistry {

let mut inner = self.inner.borrow_mut();

// Check if we've already registered this module.
if let Some(existing_module) = inner.ranges.get(&max) {
assert_eq!(existing_module.range, module_stack_maps.range);
debug_assert_eq!(
existing_module.pc_to_stack_map,
module_stack_maps.pc_to_stack_map,
);
return;
}

// Assert that this chunk of ranges doesn't collide with any other known
// chunks.
if let Some((_, prev)) = inner.ranges.range(max..).next() {
Expand Down
48 changes: 44 additions & 4 deletions crates/runtime/src/instance/allocator.rs
Original file line number Diff line number Diff line change
Expand Up @@ -45,8 +45,8 @@ pub struct InstanceAllocationRequest<'a> {
/// The imports to use for the instantiation.
pub imports: Imports<'a>,

/// A callback for looking up shared signature indexes.
pub lookup_shared_signature: &'a dyn Fn(SignatureIndex) -> VMSharedSignatureIndex,
/// Translation from `SignatureIndex` to `VMSharedSignatureIndex`
pub shared_signatures: SharedSignatures<'a>,

/// The host state to associate with the instance.
pub host_state: Box<dyn Any>,
Expand Down Expand Up @@ -165,6 +165,46 @@ pub unsafe trait InstanceAllocator: Send + Sync {
unsafe fn deallocate_fiber_stack(&self, stack: &wasmtime_fiber::FiberStack);
}

pub enum SharedSignatures<'a> {
/// Used for instantiating user-defined modules
Table(&'a PrimaryMap<SignatureIndex, VMSharedSignatureIndex>),
/// Used for instance creation that has only a single function
Always(VMSharedSignatureIndex),
/// Used for instance creation that has no functions
None,
}

impl SharedSignatures<'_> {
fn lookup(&self, index: SignatureIndex) -> VMSharedSignatureIndex {
match self {
SharedSignatures::Table(table) => table[index],
SharedSignatures::Always(index) => *index,
SharedSignatures::None => unreachable!(),
}
}
}

impl<'a> From<VMSharedSignatureIndex> for SharedSignatures<'a> {
fn from(val: VMSharedSignatureIndex) -> SharedSignatures<'a> {
SharedSignatures::Always(val)
}
}

impl<'a> From<Option<VMSharedSignatureIndex>> for SharedSignatures<'a> {
fn from(val: Option<VMSharedSignatureIndex>) -> SharedSignatures<'a> {
match val {
Some(idx) => SharedSignatures::Always(idx),
None => SharedSignatures::None,
}
}
}

impl<'a> From<&'a PrimaryMap<SignatureIndex, VMSharedSignatureIndex>> for SharedSignatures<'a> {
fn from(val: &'a PrimaryMap<SignatureIndex, VMSharedSignatureIndex>) -> SharedSignatures<'a> {
SharedSignatures::Table(val)
}
}

fn get_table_init_start(
init: &TableInitializer,
instance: &Instance,
Expand Down Expand Up @@ -413,7 +453,7 @@ unsafe fn initialize_vmcontext(instance: &Instance, req: InstanceAllocationReque
let mut ptr = instance.signature_ids_ptr();
for sig in module.types.values() {
*ptr = match sig {
ModuleType::Function(sig) => (req.lookup_shared_signature)(*sig),
ModuleType::Function(sig) => req.shared_signatures.lookup(*sig),
_ => VMSharedSignatureIndex::new(u32::max_value()),
};
ptr = ptr.add(1);
Expand Down Expand Up @@ -453,7 +493,7 @@ unsafe fn initialize_vmcontext(instance: &Instance, req: InstanceAllocationReque

// Initialize the functions
for (index, sig) in instance.module.functions.iter() {
let type_index = (req.lookup_shared_signature)(*sig);
let type_index = req.shared_signatures.lookup(*sig);

let (func_ptr, vmctx) = if let Some(def_index) = instance.module.defined_func_index(index) {
(
Expand Down
4 changes: 2 additions & 2 deletions crates/runtime/src/instance/allocator/pooling.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1366,7 +1366,7 @@ mod test {
memories: &[],
globals: &[],
},
lookup_shared_signature: &|_| VMSharedSignatureIndex::default(),
shared_signatures: VMSharedSignatureIndex::default().into(),
host_state: Box::new(()),
interrupts: std::ptr::null(),
externref_activations_table: std::ptr::null_mut(),
Expand All @@ -1390,7 +1390,7 @@ mod test {
memories: &[],
globals: &[],
},
lookup_shared_signature: &|_| VMSharedSignatureIndex::default(),
shared_signatures: VMSharedSignatureIndex::default().into(),
host_state: Box::new(()),
interrupts: std::ptr::null(),
externref_activations_table: std::ptr::null_mut(),
Expand Down
2 changes: 1 addition & 1 deletion crates/runtime/src/instance/allocator/pooling/uffd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -519,7 +519,7 @@ mod test {
memories: &[],
globals: &[],
},
lookup_shared_signature: &|_| VMSharedSignatureIndex::default(),
shared_signatures: VMSharedSignatureIndex::default().into(),
host_state: Box::new(()),
interrupts: ptr::null(),
externref_activations_table: ptr::null_mut(),
Expand Down
19 changes: 0 additions & 19 deletions crates/wasmtime/src/frame_info.rs
Original file line number Diff line number Diff line change
Expand Up @@ -43,15 +43,6 @@ impl StoreFrameInfo {
.map(|info| (info, module.has_unparsed_debuginfo()))
}

/// Returns whether the `pc` specified is contained within some module's
/// function.
pub fn contains_pc(&self, pc: usize) -> bool {
match self.module(pc) {
Some(module) => module.contains_pc(pc),
None => false,
}
}

/// Fetches trap information about a program counter in a backtrace.
pub fn lookup_trap_info(&self, pc: usize) -> Option<&TrapInformation> {
self.module(pc)?.lookup_trap_info(pc)
Expand Down Expand Up @@ -79,10 +70,6 @@ impl StoreFrameInfo {
// may be a valid PC value
let end = end - 1;

if self.contains_pc(start) {
return;
}

// Assert that this module's code doesn't collide with any other registered modules
if let Some((_, prev)) = self.ranges.range(end..).next() {
assert!(prev.start > end);
Expand Down Expand Up @@ -191,12 +178,6 @@ impl ModuleFrameInfo {
})
}

/// Returns whether the `pc` specified is contained within some module's
/// function.
pub fn contains_pc(&self, pc: usize) -> bool {
self.func(pc).is_some()
}

/// Fetches trap information about a program counter in a backtrace.
pub fn lookup_trap_info(&self, pc: usize) -> Option<&TrapInformation> {
let (index, offset) = self.func(pc)?;
Expand Down
2 changes: 1 addition & 1 deletion crates/wasmtime/src/func.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1810,7 +1810,7 @@ macro_rules! impl_into_func {
// If not given a registry, use a default signature index that is guaranteed to trap
// if the function is called indirectly without first being associated with a store (a bug condition).
let shared_signature_id = registry
.map(|r| r.register(ty.as_wasm_func_type(), Some(trampoline)))
.map(|r| r.register(ty.as_wasm_func_type(), trampoline))
.unwrap_or(VMSharedSignatureIndex::default());

let instance = unsafe {
Expand Down
6 changes: 3 additions & 3 deletions crates/wasmtime/src/instance.rs
Original file line number Diff line number Diff line change
Expand Up @@ -513,14 +513,14 @@ impl<'a> Instantiator<'a> {
unsafe {
let engine = self.store.engine();
let allocator = engine.allocator();
let signatures = self.store.signatures().borrow();
let signatures = signatures.lookup_table(&self.cur.module);

let instance = allocator.allocate(InstanceAllocationRequest {
module: compiled_module.module().clone(),
finished_functions: compiled_module.finished_functions(),
imports: self.cur.build(),
lookup_shared_signature: &self
.store
.lookup_shared_signature(self.cur.module.types()),
shared_signatures: (&signatures).into(),
host_state: Box::new(()),
interrupts: self.store.interrupts(),
externref_activations_table: self.store.externref_activations_table()
Expand Down
68 changes: 63 additions & 5 deletions crates/wasmtime/src/sig_registry.rs
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
//! Implement a registry of function signatures, for fast indirect call
//! signature checking.
use crate::Module;
use std::collections::{hash_map, HashMap};
use std::convert::TryFrom;
use wasmtime_environ::wasm::WasmFuncType;
use wasmtime_environ::entity::PrimaryMap;
use wasmtime_environ::wasm::{SignatureIndex, WasmFuncType};
use wasmtime_runtime::{VMSharedSignatureIndex, VMTrampoline};

/// WebAssembly requires that the caller and callee signatures in an indirect
Expand Down Expand Up @@ -33,12 +35,40 @@ struct Entry {
}

impl SignatureRegistry {
/// Register a signature and return its unique index.
/// Registers all signatures within a module into this registry all at once.
///
/// Note that `trampoline` can be `None` which indicates that an index is
/// desired for this signature but the trampoline for it is not compiled or
/// available.
/// This will also internally register trampolines compiled in the module.
pub fn register_module(&mut self, module: &Module) {
// Register a unique index for all types in this module, even if they
// don't have a trampoline.
let signatures = &module.types().wasm_signatures;
for ty in module.compiled_module().module().types.values() {
if let wasmtime_environ::ModuleType::Function(index) = ty {
self.register_one(&signatures[*index], None);
}
}

// Once we've got a shared index for all types used then also fill in
// any trampolines that the module has compiled as well.
for (index, trampoline) in module.compiled_module().trampolines() {
let shared = self.wasm2index[&signatures[*index]];
let entry = &mut self.index_map[shared.bits() as usize];
if entry.trampoline.is_none() {
entry.trampoline = Some(*trampoline);
}
}
}

/// Register a signature and return its unique index.
pub fn register(
&mut self,
wasm: &WasmFuncType,
trampoline: VMTrampoline,
) -> VMSharedSignatureIndex {
self.register_one(wasm, Some(trampoline))
}

fn register_one(
&mut self,
wasm: &WasmFuncType,
trampoline: Option<VMTrampoline>,
Expand Down Expand Up @@ -80,6 +110,34 @@ impl SignatureRegistry {
self.wasm2index.get(wasm).cloned()
}

/// Builds a lookup table for a module from the possible module's signature
/// indices to the shared signature index within this registry.
pub fn lookup_table(
&self,
module: &Module,
) -> PrimaryMap<SignatureIndex, VMSharedSignatureIndex> {
// For module-linking using modules this builds up a map that is
// too large. This builds up a map for everything in `TypeTables` but
// that's all the types for all modules in a whole module linking graph,
// which our `module` may not be using.
//
// For all non-module-linking-using modules, though, this is not an
// issue. This is optimizing for the non-module-linking case right now
// and it seems like module linking will likely change to the point that
// this will no longer be an issue in the future.
let signatures = &module.types().wasm_signatures;
let mut map = PrimaryMap::with_capacity(signatures.len());
for wasm in signatures.values() {
map.push(
self.wasm2index
.get(wasm)
.cloned()
.unwrap_or(VMSharedSignatureIndex::new(u32::MAX)),
);
}
map
}

/// Looks up information known about a shared signature index.
///
/// Note that for this operation to be semantically correct the `idx` must
Expand Down
69 changes: 21 additions & 48 deletions crates/wasmtime/src/store.rs
Original file line number Diff line number Diff line change
Expand Up @@ -202,7 +202,7 @@ impl Store {
.inner
.signatures
.borrow_mut()
.register(ty.as_wasm_func_type(), Some(trampoline));
.register(ty.as_wasm_func_type(), trampoline);

Box::new(anyfunc)
});
Expand Down Expand Up @@ -248,19 +248,23 @@ impl Store {
&self.inner.signatures
}

pub(crate) fn lookup_shared_signature<'a>(
&'a self,
types: &'a TypeTables,
) -> impl Fn(wasm::SignatureIndex) -> VMSharedSignatureIndex + 'a {
move |index| {
self.signatures()
.borrow()
.lookup(&types.wasm_signatures[index])
.expect("signature not previously registered")
pub(crate) fn register_module(&self, module: &Module) {
// With a module being instantiated into this `Store` we need to
// preserve its jit-code. References to this module's code and
// trampolines are not owning-references so it's our responsibility to
// keep it all alive within the `Store`.
//
// If this module is already present in the store then we skip all
// further registration steps.
let first = self
.inner
.modules
.borrow_mut()
.insert(ArcModuleCode(module.compiled_module().code().clone()));
if !first {
return;
}
}

pub(crate) fn register_module(&self, module: &Module) {
// All modules register their JIT code in a store for two reasons
// currently:
//
Expand All @@ -271,7 +275,10 @@ impl Store {
// * Second when generating a backtrace we'll use this mapping to
// only generate wasm frames for instruction pointers that fall
// within jit code.
self.register_jit_code(module.compiled_module());
self.inner
.frame_info
.borrow_mut()
.register(module.compiled_module());

// We need to know about all the stack maps of all instantiated modules
// so when performing a GC we know about all wasm frames that we find
Expand All @@ -282,20 +289,7 @@ impl Store {
// once-per-module (and once-per-signature). This allows us to create
// a `Func` wrapper for any function in the module, which requires that
// we know about the signature and trampoline for all instances.
self.register_signatures(module);

// And finally with a module being instantiated into this `Store` we
// need to preserve its jit-code. References to this module's code and
// trampolines are not owning-references so it's our responsibility to
// keep it all alive within the `Store`.
self.inner
.modules
.borrow_mut()
.insert(ArcModuleCode(module.compiled_module().code().clone()));
}

fn register_jit_code(&self, module: &Arc<CompiledModule>) {
self.inner.frame_info.borrow_mut().register(module)
self.signatures().borrow_mut().register_module(module);
}

fn register_stack_maps(&self, module: &CompiledModule) {
Expand All @@ -310,27 +304,6 @@ impl Store {
}));
}

fn register_signatures(&self, module: &Module) {
let mut signatures = self.signatures().borrow_mut();
let types = module.types();

// Register a unique index for all types in this module, even if they
// don't have a trampoline.
for (_, ty) in module.compiled_module().module().types.iter() {
if let wasmtime_environ::ModuleType::Function(index) = ty {
let wasm = &types.wasm_signatures[*index];
signatures.register(wasm, None);
}
}

// Afterwards register all compiled trampolines for this module with the
// signature registry as well.
for (index, trampoline) in module.compiled_module().trampolines() {
let wasm = &types.wasm_signatures[*index];
signatures.register(wasm, Some(*trampoline));
}
}

pub(crate) fn bump_resource_counts(&self, module: &Module) -> Result<()> {
let config = self.engine().config();

Expand Down
Loading

0 comments on commit 2b776f7

Please sign in to comment.