Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make backtrace collection a Config field rather than a cargo feature #4183

Merged
merged 15 commits into from
May 25, 2022

Conversation

pchickey
Copy link
Contributor

No description provided.

@github-actions github-actions bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:c-api Issues pertaining to the C API. wasmtime:config Issues related to the configuration of Wasmtime wasmtime:ref-types Issues related to reference types and GC in Wasmtime labels May 24, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @fitzgen, @peterhuene

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:c-api", "wasmtime:config", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasmtime:ref-types
  • peterhuene: wasmtime:api, wasmtime:c-api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@github-actions
Copy link

github-actions bot commented May 24, 2022

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

  • If you added a new Config method, you wrote extensive documentation for
    it.

    Our documentation should be of the following form:

    Short, simple summary sentence.
    
    More details. These details can be multiple paragraphs. There should be
    information about not just the method, but its parameters and results as
    well.
    
    Is this method fallible? If so, when can it return an error?
    
    Can this method panic? If so, when does it panic?
    
    # Example
    
    Optional example here.
    
  • If you added a new Config method, or modified an existing one, you
    ensured that this configuration is exercised by the fuzz targets.

    For example, if you expose a new strategy for allocating the next instance
    slot inside the pooling allocator, you should ensure that at least one of our
    fuzz targets exercises that new strategy.

    Often, all that is required of you is to ensure that there is a knob for this
    configuration option in wasmtime_fuzzing::Config (or one
    of its nested structs).

    Rarely, this may require authoring a new fuzz target to specifically test this
    configuration. See our docs on fuzzing for more details.

  • If you are enabling a configuration option by default, make sure that it
    has been fuzzed for at least two weeks before turning it on by default.


To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.

Learn more.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

crates/wasmtime/src/trap.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/config.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/config.rs Outdated Show resolved Hide resolved
crates/runtime/src/traphandlers.rs Outdated Show resolved Hide resolved
@@ -112,78 +113,81 @@ pub unsafe fn resume_panic(payload: Box<dyn Any + Send>) -> ! {
#[derive(Debug)]
pub enum Trap {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're feeling ambitious one possibility here is to do something like:

struct Trap { reason: TrapReason, backtrace: Option<Backtrace> }

enum TrapReason {
    User(Error),
    Jit(usize),
    Code(TrapCode),
    Oom,
}

enum UnwindReason {
    Panic(Box<dyn Any>),
    Trap(TrapReason),
}

... hm I'm thinking about this and I'd ideally like to use this to prevent the double capture problem I described below. In any case this is fine to defer to a future PR. My rough thinking is that we would replace raise_{lib,user}_trap with simply raise_trap which takes a Trap and conversion from Error or wasmtime::Trap would appropriately create a wasmtime_runtime::Trap which is configured to not capture a backtrace if one is already contained (or something like that). It still involves a lot of converting back and forth between the wasmtime crate and the wasmtime_runtime crate though which is also something I'd like to avoid so this is an underbaked idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started down this path and it made a lot of code motion but I didn't get it to pay off. I could put more thought into it another time but for now I'm going to leave it be

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty happy with this and I think it'd be good to land 👍

Now that I think about it (and our helpful bot is reminding me), could you thread this option through to the fuzzing configuration? We shouldn't ever look at the trace there and adding it to the fuzzed configuration options I think would beg ood.

crates/wasmtime/src/config.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/config.rs Outdated Show resolved Hide resolved
@pchickey pchickey marked this pull request as ready for review May 25, 2022 17:38
@github-actions github-actions bot added the fuzzing Issues related to our fuzzing infrastructure label May 25, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "fuzzing"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@pchickey pchickey merged commit bffce37 into main May 25, 2022
@pchickey pchickey deleted the pch/trap_backtrace_config branch May 25, 2022 19:25
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request May 27, 2022
I was running tests recently and was surprised that the `--test all`
test was taking more than a minute to run when I didn't recall it ever
taking more than a minute historically. A bisection pointed out bytecodealliance#4183 as
the cause and after re-reviewing I realized I forgot that we capture
unresolved backtraces by default (and don't actually resolve them
anywhere yet but that's a problem for another day) rather than resolved
backtraces. This means that it's intended that we use
`Backtrace::new_unresolved` instead of `Backtrace::new` in the
traphandlers crate.

The reason that tests were running so slowly is that the tests which
deal with deep stacks (e.g. stack overflow) would take forever in
testing as the Rust-based decoding of DWARF information is egregiously
slow in unoptimized mode. I did discover independently that optimizing
these dependencies makes the tests ~6x faster, but that's irrelevant if
we're not symbolicating in the first place.
alexcrichton added a commit that referenced this pull request May 31, 2022
I was running tests recently and was surprised that the `--test all`
test was taking more than a minute to run when I didn't recall it ever
taking more than a minute historically. A bisection pointed out #4183 as
the cause and after re-reviewing I realized I forgot that we capture
unresolved backtraces by default (and don't actually resolve them
anywhere yet but that's a problem for another day) rather than resolved
backtraces. This means that it's intended that we use
`Backtrace::new_unresolved` instead of `Backtrace::new` in the
traphandlers crate.

The reason that tests were running so slowly is that the tests which
deal with deep stacks (e.g. stack overflow) would take forever in
testing as the Rust-based decoding of DWARF information is egregiously
slow in unoptimized mode. I did discover independently that optimizing
these dependencies makes the tests ~6x faster, but that's irrelevant if
we're not symbolicating in the first place.
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Jun 27, 2022
Previous to this commit Wasmtime would use the `GlobalModuleRegistry`
when learning information about a trap such as its trap code, the
symbols for each frame, etc. This has a downside though of holding a
global read-write lock for the duration of this operation which hinders
registration of new modules in parallel. In addition there was a fair
amount of internal duplication between this "global module registry" and
the store-local module registry. Finally relying on global state for
information like this gets a bit more brittle over time as it seems best
to scope global queries to precisely what's necessary rather than
holding extra information.

With the refactoring in wasm backtraces done in bytecodealliance#4183 it's now possible
to always have a `StoreOpaque` reference when a backtrace is collected
for symbolication and otherwise Trap-identification purposes. This
commit adds a `StoreOpaque` parameter to the `Trap::from_runtime`
constructor and then plumbs that everywhere. Note that while doing this
I changed the internal `traphandlers::lazy_per_thread_init` function to
no longer return a `Result` and instead just `panic!` on Unix if memory
couldn't be allocated for a stack. This removed quite a lot of
error-handling code for a case that's expected to quite rarely happen.
If necessary in the future we can add a fallible initialization point
but this feels like a better default balance for the code here.

With a `StoreOpaque` in use when a trap is being symbolicated that means
we have a `ModuleRegistry` which can be used for queries and such. This
meant that the `GlobalModuleRegistry` state could largely be dismantled
and moved to per-`Store` state (within the `ModuleRegistry`, mostly just
moving methods around).

The final state is that the global rwlock is not exclusively scoped
around insertions/deletions/`is_wasm_trap_pc` which is just a lookup and
atomic add. Otherwise symbolication for a backtrace exclusively uses
store-local state now (as intended).

The original motivation for this commit was that frame information
lookup and pieces were looking to get somewhat complicated with the
addition of components which are a new vector of traps coming out of
Cranelift-generated code. My hope is that by having a `Store` around for
more operations it's easier to plumb all this through.
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Jun 27, 2022
Previous to this commit Wasmtime would use the `GlobalModuleRegistry`
when learning information about a trap such as its trap code, the
symbols for each frame, etc. This has a downside though of holding a
global read-write lock for the duration of this operation which hinders
registration of new modules in parallel. In addition there was a fair
amount of internal duplication between this "global module registry" and
the store-local module registry. Finally relying on global state for
information like this gets a bit more brittle over time as it seems best
to scope global queries to precisely what's necessary rather than
holding extra information.

With the refactoring in wasm backtraces done in bytecodealliance#4183 it's now possible
to always have a `StoreOpaque` reference when a backtrace is collected
for symbolication and otherwise Trap-identification purposes. This
commit adds a `StoreOpaque` parameter to the `Trap::from_runtime`
constructor and then plumbs that everywhere. Note that while doing this
I changed the internal `traphandlers::lazy_per_thread_init` function to
no longer return a `Result` and instead just `panic!` on Unix if memory
couldn't be allocated for a stack. This removed quite a lot of
error-handling code for a case that's expected to quite rarely happen.
If necessary in the future we can add a fallible initialization point
but this feels like a better default balance for the code here.

With a `StoreOpaque` in use when a trap is being symbolicated that means
we have a `ModuleRegistry` which can be used for queries and such. This
meant that the `GlobalModuleRegistry` state could largely be dismantled
and moved to per-`Store` state (within the `ModuleRegistry`, mostly just
moving methods around).

The final state is that the global rwlock is not exclusively scoped
around insertions/deletions/`is_wasm_trap_pc` which is just a lookup and
atomic add. Otherwise symbolication for a backtrace exclusively uses
store-local state now (as intended).

The original motivation for this commit was that frame information
lookup and pieces were looking to get somewhat complicated with the
addition of components which are a new vector of traps coming out of
Cranelift-generated code. My hope is that by having a `Store` around for
more operations it's easier to plumb all this through.
alexcrichton added a commit that referenced this pull request Jun 27, 2022
Previous to this commit Wasmtime would use the `GlobalModuleRegistry`
when learning information about a trap such as its trap code, the
symbols for each frame, etc. This has a downside though of holding a
global read-write lock for the duration of this operation which hinders
registration of new modules in parallel. In addition there was a fair
amount of internal duplication between this "global module registry" and
the store-local module registry. Finally relying on global state for
information like this gets a bit more brittle over time as it seems best
to scope global queries to precisely what's necessary rather than
holding extra information.

With the refactoring in wasm backtraces done in #4183 it's now possible
to always have a `StoreOpaque` reference when a backtrace is collected
for symbolication and otherwise Trap-identification purposes. This
commit adds a `StoreOpaque` parameter to the `Trap::from_runtime`
constructor and then plumbs that everywhere. Note that while doing this
I changed the internal `traphandlers::lazy_per_thread_init` function to
no longer return a `Result` and instead just `panic!` on Unix if memory
couldn't be allocated for a stack. This removed quite a lot of
error-handling code for a case that's expected to quite rarely happen.
If necessary in the future we can add a fallible initialization point
but this feels like a better default balance for the code here.

With a `StoreOpaque` in use when a trap is being symbolicated that means
we have a `ModuleRegistry` which can be used for queries and such. This
meant that the `GlobalModuleRegistry` state could largely be dismantled
and moved to per-`Store` state (within the `ModuleRegistry`, mostly just
moving methods around).

The final state is that the global rwlock is not exclusively scoped
around insertions/deletions/`is_wasm_trap_pc` which is just a lookup and
atomic add. Otherwise symbolication for a backtrace exclusively uses
store-local state now (as intended).

The original motivation for this commit was that frame information
lookup and pieces were looking to get somewhat complicated with the
addition of components which are a new vector of traps coming out of
Cranelift-generated code. My hope is that by having a `Store` around for
more operations it's easier to plumb all this through.
afonso360 pushed a commit to afonso360/wasmtime that referenced this pull request Jun 30, 2022
…4325)

Previous to this commit Wasmtime would use the `GlobalModuleRegistry`
when learning information about a trap such as its trap code, the
symbols for each frame, etc. This has a downside though of holding a
global read-write lock for the duration of this operation which hinders
registration of new modules in parallel. In addition there was a fair
amount of internal duplication between this "global module registry" and
the store-local module registry. Finally relying on global state for
information like this gets a bit more brittle over time as it seems best
to scope global queries to precisely what's necessary rather than
holding extra information.

With the refactoring in wasm backtraces done in bytecodealliance#4183 it's now possible
to always have a `StoreOpaque` reference when a backtrace is collected
for symbolication and otherwise Trap-identification purposes. This
commit adds a `StoreOpaque` parameter to the `Trap::from_runtime`
constructor and then plumbs that everywhere. Note that while doing this
I changed the internal `traphandlers::lazy_per_thread_init` function to
no longer return a `Result` and instead just `panic!` on Unix if memory
couldn't be allocated for a stack. This removed quite a lot of
error-handling code for a case that's expected to quite rarely happen.
If necessary in the future we can add a fallible initialization point
but this feels like a better default balance for the code here.

With a `StoreOpaque` in use when a trap is being symbolicated that means
we have a `ModuleRegistry` which can be used for queries and such. This
meant that the `GlobalModuleRegistry` state could largely be dismantled
and moved to per-`Store` state (within the `ModuleRegistry`, mostly just
moving methods around).

The final state is that the global rwlock is not exclusively scoped
around insertions/deletions/`is_wasm_trap_pc` which is just a lookup and
atomic add. Otherwise symbolication for a backtrace exclusively uses
store-local state now (as intended).

The original motivation for this commit was that frame information
lookup and pieces were looking to get somewhat complicated with the
addition of components which are a new vector of traps coming out of
Cranelift-generated code. My hope is that by having a `Store` around for
more operations it's easier to plumb all this through.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fuzzing Issues related to our fuzzing infrastructure wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:c-api Issues pertaining to the C API. wasmtime:config Issues related to the configuration of Wasmtime wasmtime:ref-types Issues related to reference types and GC in Wasmtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants