Clarifying trap semantics #55

Closed
alexcrichton opened this issue Jun 22, 2022 · 6 comments

Comments

@alexcrichton
Collaborator

While implementing post-return I ran into a few questions about the semantics of traps in components. They may be worth calling out explicitly in the canonical ABI explainer, or we can just resolve them here:

  • If a core wasm instance as part of a component instance generates a trap, is the component intended to still be usable? Or should the component be completely blocked from all future calls?
  • What are the semantics of validation errors in lifting/lowering? If a trap in the Python spec code is read as basically throwing an exception, then I think the spec as written has these consequences:
    • In canon_lift, if the lower operation traps/errors, then may_leave is left set to False, which would trigger the prior assert the next time the function is called (if it could even be called)
    • In canon_lift, if the lift operation traps/errors, then may_enter is left as False and all future calls to canon_lift will trap
    • In canon_lower there are similar consequences to canon_lift, but in reverse.

I vaguely recall the idea that when an instance traps it poisons the entire component instance as being impossible to enter, but I'm not sure if that's what's expected (since otherwise that would transitively poison everything right now I think and only an embedder could catch the error). Otherwise though I didn't see many other references to traps and recovery in the current canonical ABI explainer.
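To make the flag behavior above concrete, here is a minimal sketch, assuming a trap is modeled as a Python exception. It is not the actual spec code: `Trap`, `canon_lift_sketch`, `failing`, and `flags_after` are hypothetical names, and the real canon_lift takes more parameters and differs in detail.

```python
class Trap(Exception):
    """Hypothetical stand-in for a trap raised by the spec code."""


class Instance:
    def __init__(self):
        self.may_enter = True
        self.may_leave = True


def canon_lift_sketch(instance, lower_args, callee, lift_results):
    # Simplified model of the flag handling only (an assumption, not the spec).
    if not instance.may_enter:
        raise Trap("instance may not be entered")
    instance.may_enter = False

    instance.may_leave = False
    flat_args = lower_args()             # a Trap here skips the reset on the next line
    instance.may_leave = True

    flat_results = callee(flat_args)
    result = lift_results(flat_results)  # a Trap here skips the reset on the next line
    instance.may_enter = True
    return result


def failing(*_args):
    raise Trap("validation error")


def flags_after(lower_args, lift_results):
    """Run one call that is expected to trap and report the flags left behind."""
    instance = Instance()
    try:
        canon_lift_sketch(instance, lower_args, lambda flat: flat, lift_results)
    except Trap:
        pass
    return instance.may_enter, instance.may_leave


# Lowering traps: both flags are stuck at False.
after_lower_trap = flags_after(failing, lambda flat: flat)
# Lifting traps: may_leave was restored, but may_enter is stuck at False.
after_lift_trap = flags_after(lambda: [], failing)
```

In this model nothing ever resets the flags after the exception, which is the source of the consequences listed above.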

@lukewagner
Member

Great questions:

  • The intention is that, after a trap, the whole component instance (and even all transitively-linked component instances) is locked down. And you're right that there's no way to stop propagation in wasm before it hits the host. (The ability to contain traps is a post-MVP "blast zone" feature for expressing and dealing with partial failure.)
  • The Canonical ABI Python code doesn't implement this "lockdown on trap" behavior (the may_enter/may_leave stuff is meant to cause traps, not implement the lockdown after a trap). I was already thinking that I need to add a new top-level Python spec-function to define whole components (not just individual imports/exports), and I suppose this is where the trap semantics would go.

@alexcrichton
Collaborator Author

Ok, that sounds reasonable. A question along those lines though: in some cases a trap is theoretically recoverable, for example an OOM trap from a realloc function. In such a situation, is it worth making this recoverable, or would it also simply poison the whole instance and everything it's linked to?

@lukewagner
Member

Yeah, that's a good question too. Thinking through the realloc example: memory.grow would return -1, but realloc isn't allowed to fail. Otherwise, we'd have to add a new implicit "OOM" partial-failure mode to every interface-typed call, which has a bunch of tricky follow-on questions. So the realloc core code would explicitly execute unreachable, which, at that point, isn't distinct from any other unreachable and thus would be tricky to specify recovery for without going to the full "blast zone" trap-recovery feature.

Post-MVP, a custom adapter function, which itself controls the call to realloc, could manually convert a realloc failure into, e.g., the error case of an expected return type (as a sort of custom short-circuit). Alternatively, if we didn't want to punt to custom adapter functions, we could add a new canonopt that says to map realloc failures to return cases, but I'd suggest we make that a post-MVP consideration as well.
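The two options can be sketched roughly as follows, modeling linear memory as a Python bytearray. Everything here is an illustrative assumption: `Memory`, `BumpAllocator`, `try_realloc`, and `adapter_store_string` are hypothetical names, the allocator is a toy bump allocator that never frees, and the adapter is only one guess at what a post-MVP adapter function might do.

```python
PAGE_SIZE = 65536  # size of one WebAssembly linear-memory page in bytes


class Trap(Exception):
    """Hypothetical stand-in for a core wasm trap."""


class Memory:
    """Toy linear memory with a page limit, mimicking memory.grow."""

    def __init__(self, pages, max_pages):
        self.data = bytearray(pages * PAGE_SIZE)
        self.max_pages = max_pages

    def grow(self, delta_pages):
        old_pages = len(self.data) // PAGE_SIZE
        if old_pages + delta_pages > self.max_pages:
            return -1  # growth failed, reported the way memory.grow reports it
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return old_pages


class BumpAllocator:
    def __init__(self, memory):
        self.memory = memory
        self.next_free = 0

    def try_realloc(self, old_ptr, old_size, align, new_size):
        """Return a pointer to new_size bytes, or None if memory cannot grow."""
        ptr = (self.next_free + align - 1) & ~(align - 1)  # align: power of two
        end = ptr + new_size
        while end > len(self.memory.data):
            if self.memory.grow(1) == -1:
                return None
        keep = min(old_size, new_size)
        self.memory.data[ptr:ptr + keep] = self.memory.data[old_ptr:old_ptr + keep]
        self.next_free = end
        return ptr

    def realloc(self, old_ptr, old_size, align, new_size):
        """MVP-style realloc: it cannot report failure, so it traps instead."""
        ptr = self.try_realloc(old_ptr, old_size, align, new_size)
        if ptr is None:
            raise Trap("unreachable")  # indistinguishable from any other unreachable
        return ptr


def adapter_store_string(allocator, text):
    """Hypothetical post-MVP adapter: maps allocation failure to an error case."""
    encoded = text.encode("utf-8")
    ptr = allocator.try_realloc(0, 0, 1, len(encoded))
    if ptr is None:
        return ("error", "out-of-memory")
    allocator.memory.data[ptr:ptr + len(encoded)] = encoded
    return ("ok", (ptr, len(encoded)))
```

With `realloc`, the caller only ever sees a generic trap; with the adapter, the failure becomes an ordinary return value because the adapter owns the allocation call.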

@alexcrichton
Collaborator Author

Ok, that sounds like a reasonable stance for now. To summarize: every component will have a "locked" flag which, if set, would cause a trap on entry to the instance in any case. In Python-like pseudo-code we'd set the locked flag to True just before entry and to False after exit with the current Instance bits, is that right? Although now that I type this, it sounds a lot like the may_enter flag, so I may be missing something.

In general though, the intention is that for the MVP a trap is equated with "the instance is poisoned and cannot ever be reused". Runtimes will need to ensure that the embedder either loses access to an instance on a trap or, if an instance is reused, that it never actually goes back into wasm and quickly returns an embedder-level error.

@lukewagner
Member

Yeah, I'd say we could reuse the may_enter flag: the only tweak is that we'd make sure we didn't clear it once set (by trap()) on our way out. Actually, I guess I could just write that as a quick little patch to the existing Python code (having trap() throw an exception caught by canon_lift())... I'll try that in a bit.
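A rough sketch of that idea, assuming trap() raises a Python exception: this is not the patch itself, `canon_lift_sketch` and the demo functions are hypothetical, and only the may_enter handling is modeled.

```python
class Trap(Exception):
    """Hypothetical exception type thrown by trap()."""


class Instance:
    def __init__(self):
        self.may_enter = True


def trap():
    raise Trap()


def canon_lift_sketch(instance, callee, args):
    if not instance.may_enter:
        trap()
    instance.may_enter = False
    try:
        result = callee(args)
    except Trap:
        # Deliberately leave may_enter as False on the way out: the instance
        # stays locked down, and the trap keeps propagating to the caller.
        raise
    instance.may_enter = True
    return result


# Demo: component A's export calls component B's export, which traps.
a, b = Instance(), Instance()
wasm_entries = []


def b_export(args):
    wasm_entries.append("b")
    trap()


def a_export(args):
    wasm_entries.append("a")
    return canon_lift_sketch(b, b_export, args)


try:
    canon_lift_sketch(a, a_export, [])
except Trap:
    pass  # only the host sees the trap

# A second attempt traps at the entry check and never reaches a_export.
try:
    canon_lift_sketch(a, a_export, [])
except Trap:
    pass
```

Because the exception unwinds through both calls, both A and B end up with may_enter stuck at False. This sketch only locks instances that are on the call stack when the trap happens; locking down every transitively-linked instance would need extra bookkeeping that is not shown here.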

@lukewagner
Member

Resolved by #57
