funcref <: anyref, or not? #293
Comments
I do think it would be very strange (as long as @lars-t-hansen, would SpiderMonkey hit this same performance problem? If so, are the potential trade-offs similar to those in V8? |
@tlively, we're heading down path (1) and would be facing the exact same problem. Currently our "wasm view" is just a JSFunction*, and we have been able to live with this because we don't have call_ref yet, but it's not workable in the long run - we'll need to have a specialized representation inside wasm code. We were resolved to just eat the complexity cost of the unwrapping in the hope that "anyref" is not the preferred type for anything performance-sensitive. |
@lars-t-hansen, I agree we could hope that, but for the use case of round-tripping opaque host references through Wasm, there isn't really an alternative to (FWIW, in V8 we also used to have a single |
Just to check my understanding, isn't there an option (2b), where you pass a function pointer along unmodified as well, but implement respective unwrapping logic in down casts to funcref (ref.is/as_func and friends), which perform it by need? Symmetrically, a call in JS could handle raw Wasm functions and implicitly wrap them. That way, the penalty would only be on the rare operations. (In fact, both directions seem independent, so an engine could independently choose between eager or lazy un/wrapping anyref functions on the way in vs out.) That said, I can see that none of these options is a pleasure to implement. AFAICS, plain option (2) is only possible if |
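
To make option (2b) concrete, here is a minimal sketch of lazy unwrapping at the downcast. It is only a model: the types `WasmFunc`, `JSFunc`, `Opaque` and the functions `passToWasm`, `refIsFunc`, `refAsFunc` are hypothetical stand-ins, not any engine's actual data structures. The boundary crossing does no work, and only the (presumably rare) funcref checks and casts look for a Wasm view.

```typescript
// Hypothetical model of option (2b); every name here is illustrative only.
type WasmFunc = { kind: "wasm-func"; name: string };      // the "Wasm view"
type JSFunc = { kind: "js-func"; wasmView?: WasmFunc };    // the "JS view"
type Opaque = { kind: "opaque" };                          // any other host value
type AnyRef = WasmFunc | JSFunc | Opaque;

// Crossing the JS-to-Wasm boundary: the pointer is passed along unmodified.
function passToWasm(v: AnyRef): AnyRef {
  return v;
}

// ref.is_func has to recognise both views, since nothing was normalised.
function refIsFunc(r: AnyRef): boolean {
  return r.kind === "wasm-func" ||
    (r.kind === "js-func" && r.wasmView !== undefined);
}

// ref.as_func unwraps by need; the cost is paid here, not on every call.
function refAsFunc(r: AnyRef): WasmFunc {
  if (r.kind === "wasm-func") return r;
  if (r.kind === "js-func" && r.wasmView !== undefined) return r.wasmView;
  throw new Error("trap: cast to funcref failed");
}

const f: JSFunc = { kind: "js-func", wasmView: { kind: "wasm-func", name: "f" } };
const roundTripped = passToWasm(f);
```

In this model the reference keeps its identity across the boundary, and every operation that eliminates an anyref has to be prepared to see either view.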
I think we should keep |
I think we should have If, in the future, we want to have a native wasm GC type for optimally implementing source-language closures, I think the path forward is to have some new heap type specifically designed for compilation from closures (by, e.g., allowing access to the closure fields before calling the closure) and references to this new closure-heap-type would be subtypes of |
I see some of the arguments for removing the subtyping. In particular, if we postpone func.bind, the use case is much weaker. As Luke suggests, we could later introduce a new type closureref that would be a subtype of anyref. Of course, we can also go the opposite way, leave in the subtyping and later introduce a rawfuncref type that's not an anyref. So it comes down to the relative benefits and disadvantages of introducing the type hierarchy separation now. The pro clearly is that it gives more leeway to implementations in representing funcrefs. If at least one engine could demonstrate a benefit in practice, I'd be convinced. The main downside is more irregularity and complexity in the type system (even before we consider introducing a separate closureref type). In particular, we'd have:
That is not necessarily hard but somewhat unpleasant. |
I added this to the agenda for our meeting on Tuesday. |
Another approach to maintain the subtyping relation and also give JS API users the possibility to pass opaque references without cost, would be to introduce another type in the hierarchy below |
@manoskouk, such a type (basically the complement of what we previously had as externref) might be useful. But it would not solve this specific problem, because like all subtyping, the relation wasmref <: anyref would have to be coercion-free, i.e., downcasting from anyref to wasmref cannot require a representation change. If it did, it would break any efficient implementation of subtyping. Consider this example:
If going from anyref to wasmref required a representation change, then casting from $t to $u would require a full copy of the array. Besides the unbounded cost and the fact that it breaks identity, this is even impossible when state is involved. Consider:
Casting from $st to $su would likewise require a copy, since we have to copy the contained array field. But we cannot do that, since that would silently duplicate the stateful i32 field. So, no, unfortunately, this does not work. If we allow non-normalised references anywhere in the subtype hierarchy, we have to allow them everywhere and handle them in the corresponding elimination forms. If you wanted to distinguish normalised and non-normalised types then they cannot be in a subtype relation. |
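
The copying argument can be illustrated with a small model. This is a sketch under assumptions, not the Wasm types from the comment above: `WasmRef` and `Boxed` are hypothetical stand-ins for two reference types with different representations, and `castFree` / `castCoercing` contrast a coercion-free cast with one that has to convert every element.

```typescript
// Hypothetical model; the names do not correspond to real Wasm types.
type WasmRef = { tag: "wasm"; id: number };
type Boxed = { tag: "boxed"; inner: WasmRef }; // a different representation

// Coercion-free: the very same array is simply viewed at another type.
function castFree(xs: WasmRef[]): ReadonlyArray<WasmRef> {
  return xs;
}

// Coercing: each element changes representation, so the cast must build
// a new array. Cost is linear in the length, and identity is lost.
function castCoercing(xs: WasmRef[]): Boxed[] {
  return xs.map((x): Boxed => ({ tag: "boxed", inner: x }));
}

const arr: WasmRef[] = [{ tag: "wasm", id: 1 }, { tag: "wasm", id: 2 }];
const free = castFree(arr);
const copied = castCoercing(arr);

// A later write through the original array is visible through the
// coercion-free view, but the coerced copy silently keeps the old state.
arr[0] = { tag: "wasm", id: 99 };
```

After the write, `free` and `arr` still denote the same array, while `copied` is a separate object holding stale contents, which is the duplication of state the comment describes.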
@rossberg I see, thanks. An additional question: you mentioned before that we would need separate null values if we have disjoint type hierarchies. Could you elaborate on that? Why cannot we have a single |
Yes, absolutely!
That would defeat the purpose of anyref, though. Its intended role is to be able to import something without constraining it to be either host or Wasm, so that e.g. host things can be freely virtualised in Wasm and vice versa. The only thing we could do is removing the subtyping between anyref and anything below it, and always require explicit conversions (which then could perform a representation change). I'm not sure if that's desirable, though.
The point of disjoint hierarchies is that they can use different internal representations (including different size). Consequently, their null values have to be assumed to have different representations as well. And because of that, the type system must not allow them to be mixed up. For the same reason, we'd need separate bottom types. |
Yes, this is the same thing as I am suggesting: Add a type which can be a wasm reference (with the host's representation) or a host reference, and requires conversions to be cast to and from wasm types (you call it |
Ah, I see.
If funcref and dataref are put in separate hierarchies for the sake of allowing different representations, then you cannot allow a union between them either – like subtyping, a union requires a compatible representation. Imagine datarefs being implemented with one word, funcrefs with two (closures as fat pointers). Clearly, their null values would be incompatible as well. |
Right, but it was my impression from the discussion that the representation problems arise mostly at the wasm/JS boundary. |
Well, that's only one class of problem that could motivate splitting the type hierarchies. In the original discussions, the ability to represent function refs completely differently was another. If we go for the separation, it makes most sense to carry it all the way through, so that all such use cases are covered.
Right. |
This conversation was resolved in #307 |
In the "reference types" proposal, after loooong discussions we decided to drop the previously-planned funcref <: anyref relation, mostly due to a lack of use cases and concerns about limiting implementation flexibility (and, as a consequence, reachable performance).
The GC proposal currently suggests to re-introduce this subtyping relation. I haven't seen any discussion of newly discovered use cases for it. Have I missed it? Has any other reasoning changed, why having this relation has in the meantime become desirable?
Meanwhile, I have recently come upon a concrete performance concern.
The background is that JavaScript and Wasm have different needs from their function/funcref objects; so to make pure-Wasm calls as fast as possible, we use different internal representations for the "JS view" and the "Wasm view" onto the same function reference. (They are both representation-compatible with anyref.) That means that on the boundary between both languages, a conversion/"unwrapping" step is required when a function reference is passed as a parameter or return value of a function call. Additionally, preparing a function to actually be callable from both worlds is a nontrivial amount of work (that's only performed when necessary, for obvious reasons).
The resulting situation is that when an exported Wasm function that takes an anyref parameter is called from JavaScript, and another function F is passed as this parameter, we have two options:
(1) We can perform a relatively complex check whether F is a function that could be called from Wasm (because it originated there, or was prepared for it via an import/export cycle or the "Type Reflection" proposal's new WebAssembly.Function constructor), and if so, "unwrap" it to its Wasm representation, and otherwise pass it along unchanged.
The drawback is that JS-to-Wasm calls get more expensive whenever they have anyref-typed parameters. When the value being passed is a function of any kind, the overhead increases further.
(2) We can unconditionally pass the pointer along, without attempting to unwrap it.
The drawback is that a function passed to Wasm this way always becomes an opaque reference on the Wasm side (which is not out of place for an anyref), even if it originally was a Wasm function. In particular, a ref.is_func check on it would return 0, and it could not be cast to either funcref or a more specific signature.
My inclination is that (2) is preferable, because anyref is in particular useful for round-tripping opaque host references, and performance matters. However, I concede that this behavior is somewhat displeasing, in particular when considering that the same "can't downcast it back" limitation would not apply to structs/arrays coming back from a similar roundtrip through JavaScript. (That's because we're investing a lot of effort to make sure structs/arrays can be passed around with maximum efficiency, i.e. just passing the pointers along -- they don't have to be callable so that doesn't cause performance overhead elsewhere.)
If we decided not to reintroduce the funcref <: anyref relation, then the question wouldn't pose itself, and (2) would be the obvious way to go.
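
The two options can be modelled roughly as follows. This is a sketch under assumptions rather than a description of V8 internals: `WasmFunc`, `JSFunc`, `toWasmEager`, `toWasmOpaque` and `refIsFunc` are hypothetical names, and the "relatively complex check" is reduced to testing for an attached Wasm view.

```typescript
// Hypothetical model of the JS-to-Wasm boundary; all names are illustrative.
type WasmFunc = { kind: "wasm-func"; name: string };        // "Wasm view"
type JSFunc = { kind: "js-func"; wasmView?: WasmFunc };      // "JS view"
type HostValue = JSFunc | { kind: "other"; payload: unknown };
type AnyRef = WasmFunc | HostValue;                          // what Wasm code receives

// Option (1): inspect every anyref-typed argument and unwrap when possible.
// The check runs on each call, even if the value is never used as a function.
function toWasmEager(v: HostValue): AnyRef {
  if (v.kind === "js-func" && v.wasmView !== undefined) {
    return v.wasmView;
  }
  return v;
}

// Option (2): pass the pointer along unconditionally.
function toWasmOpaque(v: HostValue): AnyRef {
  return v;
}

// ref.is_func as seen from Wasm code: only the Wasm view counts as a function.
function refIsFunc(r: AnyRef): number {
  return r.kind === "wasm-func" ? 1 : 0;
}

// A function that was exported from Wasm, and a plain JS function.
const exported: JSFunc = { kind: "js-func", wasmView: { kind: "wasm-func", name: "f" } };
const plain: JSFunc = { kind: "js-func" };
```

Under option (1) the exported function arrives as a funcref again, whereas under option (2) the same function stays opaque and `refIsFunc` reports 0 for it.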