Reimplement the pooling instance allocation strategy #5661
Conversation
Subscribe to Label Action: cc @fitzgen, @peterhuene — this issue or pull request has been labeled "fuzzing", "wasmtime:api", "wasmtime:config", and these users have been cc'd because of those labels.
Label Messager (wasmtime:config): It looks like you are changing Wasmtime's configuration options.
This commit is a reimplementation of the strategy by which the pooling instance allocator selects a slot for a module. Previously there was a choice amongst three different algorithms: "reuse affinity", "next available", and "random". The default was "reuse affinity", but some new data has come to light which shows that this may not always be a good default.

Notably, the pooling instance allocator will retain some memory per slot, for example instance data or memory data if so configured. This means that a currently unused, but previously used, slot can contribute to the RSS usage of a program using Wasmtime. Consequently the RSS impact here is O(max slots), which can be counter-intuitive for embedders. This particularly affects "reuse affinity" because its algorithm for picking a slot when there are no affine slots is "pick a random slot", which means eventually all slots get used.

In discussions about possible ways to tackle this, an alternative to "pick a strategy" arose and is now implemented in this commit. Concretely, the new allocation algorithm for a slot is now (sketched in code below):

* First pick the most recently used slot affine to this module, if one exists.
* Otherwise, if the number of unused slots affine to other modules is above some threshold N, pick the least recently used affine slot.
* Otherwise pick a slot that's affine to nothing.

The "N" in this algorithm is configurable: setting it to 0 is the same as the old "next available" strategy, while setting it to infinity is the same as the "reuse affinity" algorithm. Setting it to something in the middle provides a knob allowing a modest "cache" of affine slots while not letting the total set of slots used grow too far beyond the maximal concurrent set of modules. The "random" strategy is no longer possible and was removed to help simplify the allocator.
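For illustration, here is a minimal standalone sketch of that selection policy. The names here (`IndexAllocator`, `alloc`, `free`, `ModuleId`, `SlotId`, and the warm/cold lists) are hypothetical simplifications for this sketch, not the actual types in `index_allocator.rs`:

```rust
use std::collections::VecDeque;

// Hypothetical simplified identifiers; the real allocator uses its own
// index types and tracks more per-slot state.
type ModuleId = u64;
type SlotId = usize;

struct IndexAllocator {
    // Unused slots that are still warm (affine to some module),
    // ordered from least to most recently used.
    unused_warm: VecDeque<(SlotId, ModuleId)>,
    // Unused slots that are affine to nothing.
    cold: Vec<SlotId>,
    // The configurable threshold "N".
    max_unused_warm_slots: usize,
}

impl IndexAllocator {
    fn alloc(&mut self, module: ModuleId) -> Option<SlotId> {
        // 1. Prefer the most recently used slot affine to this module.
        if let Some(pos) = self.unused_warm.iter().rposition(|&(_, m)| m == module) {
            return self.unused_warm.remove(pos).map(|(slot, _)| slot);
        }
        // 2. If unused slots affine to other modules exceed the threshold
        //    N, evict the least recently used one rather than warming yet
        //    another slot and growing RSS further.
        if self.unused_warm.len() > self.max_unused_warm_slots {
            return self.unused_warm.pop_front().map(|(slot, _)| slot);
        }
        // 3. Otherwise pick a slot that's affine to nothing, falling back
        //    to the least recently used warm slot if none remain.
        self.cold
            .pop()
            .or_else(|| self.unused_warm.pop_front().map(|(slot, _)| slot))
    }

    fn free(&mut self, slot: SlotId, module: ModuleId) {
        // Freed slots stay warm and remember their module affinity; the
        // back of the deque is the most recently used end.
        self.unused_warm.push_back((slot, module));
    }
}

fn main() {
    let mut allocator = IndexAllocator {
        unused_warm: VecDeque::new(),
        cold: (0..4).rev().collect(),
        max_unused_warm_slots: 2,
    };
    let slot = allocator.alloc(7).unwrap();
    allocator.free(slot, 7);
    // The slot is still warm and affine to module 7, so it's reused.
    assert_eq!(allocator.alloc(7), Some(slot));
}
```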
This is a really clean and pleasing generalization of the old allocator -- thanks for this!
I didn't see any issues at all with the code, so in the absence of something more substantial, I just have some comment requests and naming nits :-) Overall it's quite clear already, though.
rand::thread_rng().gen()
};
let rng = SmallRng::from_seed(seed);
pub fn new(max_instances: u32, max_unused_warm_slots: u32) -> Self {
debug_assert!(max_unused_warm_slots <= max_instances)?
Adding this assertion would require adding validation to the PoolingAllocatorConfig in wasmtime as well, to provide a better error message than tripping the assertion. Thinking about that, though, I think it may not be worth it, since it's not really a problem if max_unused_warm_slots is bigger than the number of slots. It's a bit silly, but it can also perhaps be helpful to always pass a large value here to say "always keep everything warm".
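For context, a minimal embedder-side sketch of that "always keep everything warm" idea, assuming the post-merge wasmtime embedding API (`PoolingAllocationConfig` with a `max_unused_warm_slots` setter and `InstanceAllocationStrategy::Pooling`); treat the exact names as an assumption about the shipped API rather than a definitive example:

```rust
use wasmtime::{Config, Engine, InstanceAllocationStrategy, PoolingAllocationConfig};

fn main() -> anyhow::Result<()> {
    let mut pool = PoolingAllocationConfig::default();
    // Deliberately larger than any slot count: per the discussion above,
    // this is allowed and simply means "always keep everything warm",
    // i.e. the old "reuse affinity" behavior.
    pool.max_unused_warm_slots(u32::MAX);

    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pool));
    let _engine = Engine::new(&config)?;
    Ok(())
}
```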
Changes LGTM, thanks!