Incorporate RTTs into the Type #119
Comments
RTTs are explicit in order to
They also are a possible extension hook for other features and host-specific customisation, e.g. attaching JS object prototypes or accessors through an external API.
Not once we have type parameters, see the example in Post-MVP. Moreover, those are supposed to support polymorphic recursion (as allowed by languages like C# or OCaml), so there generally can't even be an instantiation-time bound on the number of RTTs needed. |
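To make the point about polymorphic recursion concrete, here is a minimal sketch in Python. Python is untyped, so the `type_name` strings are only a stand-in for the type arguments a typed language would pass implicitly; `nest` and its parameters are hypothetical names, not anything from the proposal. It shows that the number of distinct type instances reached depends on a runtime value.

```python
def nest(value, depth, type_name="i32", instantiations=None):
    """Wrap `value` in `depth` layers of pairs, recording each type instance used."""
    if instantiations is None:
        instantiations = []
    instantiations.append(type_name)
    if depth == 0:
        return value, instantiations
    # The recursive call is at a different type instance: pair<T> instead of T.
    return nest((value, value), depth - 1, f"pair<{type_name}>", instantiations)


# `depth` could come from user input, so no fixed set of type instances
# (and hence of per-instance RTTs) can be known ahead of time.
value, used = nest(7, 3)
print(used)
```

If each type instance needed its own RTT, the list `used` is the set of RTTs this one call would require.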
I'm not sure what the point of this comment is. The suggestion I gave them makes RTTs even more explicit.
The design for type parameters in #114 has multiple problems. One is its misunderstanding of how casting works in languages with parametric polymorphism. #116 specifically discusses why the example you reference is not how pairs are implemented. Another problem, not raised in #116, is that the RTT design fails to incorporate the covariance of pairs. These casting problems with the post-MVP are of course in addition to its undecidability problem. Given these problems, the post-MVP should not be used as an argument against considering alternatives until it has been revised and demonstrated to work for an actual language. On the other hand, there are extensions to the suggestion I make here that have been implemented and demonstrated to work for a major industry language. So the claim of practical extensibility here is not a hypothesis; it is a proven fact. |
Can you elaborate? I don't understand this part. |
@taralx, in production engines, type descriptors, as used by the GC and for other purposes, tend to be represented as separate heap objects. For example, in existing Web engines, these are known as hidden classes, maps (V8), or shapes (SpiderMonkey). They tend to be allocated and managed by fairly elaborate mechanisms, shared between objects, and used in a first-class manner inside the engine. At first glance, making this explicit may seem like overkill for the MVP, but it becomes more relevant once we introduce type parameters:
(An alternative would be to mark type parameters as "static" and "dynamic", but that is more coupled and has the disadvantage of not explicating the cost for allocating new RTTs.) |
The techniques in #116 talking about how existing systems support parametric polymorphism, without allocating new RTTs dependent on RTTs for type parameters, extend to first-class polymorphism and polymorphic recursion. I think the issue you are referring to is these features in combination with reified downcasting. But reified downcasting also does not need to allocate new RTTs dependent on RTTs for type parameters. That is a particular implementation strategy with alternatives that language implementations choose between depending on how they expect the tradeoffs to apply to their application.
In particular, it is a space-saving technique, just like putting the v-table in the descriptor saves space. But unlike v-tables where objects of the same exact class will always have the same methods, objects of the same class will have different type arguments, and so putting those type arguments into the descriptor rather than the object can introduce time overhead and thwart a number of optimizations (e.g. caches). In some cases, like immutable pairs, the type arguments can be completely redundant because the values in the pair already have the relevant RTT information. Also, your use of a cache to ensure a canonical representative requires multithreaded coordination, introducing a synchronization point at every such allocation in multithreaded systems with sharing.
All in all, there's a wide variety of implementation strategies for reified downcasting, and building in a particular one would seem to be counter to WebAssembly's design philosophy. Furthermore, there are already published techniques that specifically support reified downcasting. Those techniques are expressive enough to let the application choose how it implements reified downcasting, including the implementation strategy you are suggesting as well as the others I mentioned. They are compatible with the suggestion made here; in fact, they rely on the suggestion made here.
They also do not require all polymorphic functions to be reified; it is still the application that is in charge of explicitly passing around reification information. In fact, these techniques support optimizations/expressiveness that significantly reduce the amount of reification information that needs to be passed around. The Post-MVP, on the other hand, does not support these optimizations/expressiveness. It also cannot express common reified generic data structures like Java/C# arrays, whereas the aforementioned techniques that build on the suggestion I made here can. Altogether, I believe the suggestion made here addresses the important concerns you raise, and in a manner better than the Post-MVP. |
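As a rough illustration of the synchronization concern raised above, the following Python sketch shows a cache that returns one canonical RTT per combination of generic type and type arguments. `RttCache`, `canonical`, and the string keys are hypothetical names invented for this sketch, and the code is not taken from any engine or from the Post-MVP document; it only shows why a shared canonicalizing table implies a lock (or an equivalent coordination mechanism) on the allocation path.

```python
import threading


class RttCache:
    """Hypothetical cache that hands out one canonical RTT per (generic, args) key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._table = {}

    def canonical(self, generic, args):
        key = (generic, tuple(args))
        # Every lookup takes the lock: this is the synchronization point
        # that threads sharing a heap would pay on such allocations.
        with self._lock:
            rtt = self._table.get(key)
            if rtt is None:
                rtt = object()  # stand-in for a freshly allocated RTT
                self._table[key] = rtt
            return rtt


cache = RttCache()
results = []
threads = [
    threading.Thread(target=lambda: results.append(cache.canonical("pair", ["i32", "i64"])))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads end up with the same descriptor object, which is what canonicalization requires, but each of them had to pass through the lock to get it.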
Could you expand on this? Even if RTTs were to be generated at instantiation-time and referenced statically, the act of creating an RTT would still require a full structural subtyping calculation (assuming no other changes).
Based on this quote, is the idea also to change the way (e.g.) struct types are declared, by requiring that they can only be shallow compositions of RTTs? |
My sentence is referring to problems with value-level subtyping, whereas your comment seems to be referring to type-declaration validation. Value-level subtyping is done many, many times throughout module validation, whereas type-declaration validation is done once. Also, "full structural subtyping calculation" is different for structural and nominal systems.
With nominal systems, fixpoints in types are made explicit, as are the relationships between these fixpoints. So when we have to check if the bodies of two nominal types are compatible (because we declared one nominal type to be a subtype of the other), we can do so without recursively checking the relationships between fixpoints (because they have been explicitly declared). On the other hand, a structural comparison has to recurse into these fixpoints and determine if they are compatible. This is why nominal systems can decidably express irregularly coinductive structures, whereas structural systems can only express regularly coinductive structures.
The Post-MVP mentions that "recursive type definitions with parameters must [be] uniformly recursive", thereby ensuring regularity. But nearly every polymorphic language does not abide by this restriction. Here's a minified ML example of a pattern that is known to occur in practice (see Section 3 of "What are principal typings and what are they good for?" by Trevor Jim):
This datatype is inexpressible in the Post-MVP due to the above restriction, and relaxing that restriction would make the Post-MVP undecidable (even without bounded polymorphism). On the other hand, it is known how to express this datatype in a nominal type system.
Unfortunately, I do not understand your question. |
The main thrust of my question is that just moving RTT generation to instantiation-time would not make the type system nominal, so I'm trying to understand how this is accomplished. Is it still permitted for me to define something like
or would I now write something like
|
The suggestion specifies two steps, the second of which is to incorporate RTTs into the type. |
This is somewhat hard to engage with, given that I quoted that step, asked a question about it, and your response was that you didn't understand me. Is my example accurate? "Incorporating RTTs into the type" doesn't provide a clear picture of how the MVP's existing mechanism for defining types would change in this proposal. |
Oops, sorry. In switching back and forth between the two discussions I lost track of context here. Thanks for your patience. For your example, you would only write something like your first snippet. Every type declaration would specify the structure of the type as well as associate an RTT for the type. So if you had something like
they would be defining two distinct types because their values would have distinct RTTs. (Meta note just for clarity: obviously we're brushing over many details here. These examples are conceptual.) |
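The idea that two declarations with the same structure still define distinct types can be sketched as follows. This is a conceptual Python model, not WebAssembly syntax: `Rtt`, `Ref`, `cast`, and the `meters`/`seconds` names are all hypothetical, and the cast here deliberately ignores subtyping and only accepts an exact RTT match to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)  # eq=False: RTTs compare by identity, not structure
class Rtt:
    name: str
    fields: tuple


@dataclass
class Ref:
    rtt: Rtt       # every reference carries the RTT it was allocated with
    values: list


def cast(ref, target):
    """Checked cast: succeeds only if the value carries exactly the target RTT."""
    if ref.rtt is not target:
        raise TypeError(f"cannot cast {ref.rtt.name} to {target.name}")
    return ref


# Two declarations with identical structure but separate RTTs.
meters = Rtt("meters", ("f64",))
seconds = Rtt("seconds", ("f64",))

distance = Ref(meters, [3.5])
cast(distance, meters)        # fine: same RTT
try:
    cast(distance, seconds)   # same layout, different RTT
except TypeError as err:
    print(err)
```

Under a purely structural reading the two declarations would be interchangeable; here the cast between them fails because identity of the RTT, not the field layout, decides the type.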
@rossberg Thanks for the detailed reply. It's taking me a bit of time to digest; your patience is appreciated. :) |
In this example:
Assuming this is correct, then this seems like a "true MVP" version of SOIL - the RTT is effectively the Am I correct in thinking with this version, there would be no reason for RTTs not to statically list the parent types they're tracking? |
In fact, an RTT needs to explicitly specify its immediate parent (like in
If
See #73
Woot! That's what I was aiming for. |
Oh, my example should have been more specific.
This should be a type error (sidestepping questions of whether
Ah I missed this earlier! I can also make the positive statement that this issue's proposal seems to capture a particularly small delta between the current structural MVP and a sensible nominal MVP, assuming RTTs are appropriately renamed/resyntax'd in a "full" proposal of this. |
Correct. All subtyping is intentional in this suggestion, so |
Closing this in favor of #275, since that is based on concrete implementer feedback and the details of the type system we have settled on for the MVP. |
Following up on #111, this is a suggestion focusing purely on reducing incompatibilities. So unlike #109, this will not add or alter any (important) functionality; it just provides a different way to express the current MVP's key functionality in a way that is more extensible.
In the current MVP, every reference has an associated RTT. These RTTs are generally (always?) determinable at instantiation time. So the suggestion is
$my_rtt_for_pairs
rttref $my_rtt_for_pairs, making explicit the most-precise statically known RTT for the value (which may have more precise RTTs that can be ascertained by downcasting)
This addresses the issue with unamortizable quadratic-time subtyping (#117) because we can now use the constant-time subtyping algorithm for RTTs.
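For readers unfamiliar with constant-time subtype checks on RTTs, here is a minimal Python sketch of one well-known approach, assuming each RTT declares a single immediate parent and stores the chain of its ancestors (often called a display). The names `Rtt`, `display`, and `is_sub_rtt` are hypothetical, and this is an illustration of the general technique rather than the algorithm any particular engine or the MVP specifies.

```python
class Rtt:
    """Hypothetical runtime type descriptor with an explicitly declared parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.depth = 0 if parent is None else parent.depth + 1
        # Display: ancestors from the root down to this RTT itself.
        self.display = (() if parent is None else parent.display) + (self,)


def is_sub_rtt(sub, sup):
    """One bounds check, one indexed load, one identity comparison."""
    return sup.depth < len(sub.display) and sub.display[sup.depth] is sup


point = Rtt("point")
point3d = Rtt("point3d", parent=point)
color = Rtt("color")  # unrelated root: no common base RTT is assumed

print(is_sub_rtt(point3d, point))
print(is_sub_rtt(point, point3d))
print(is_sub_rtt(color, point))
```

The check never walks the parent chain or compares field structure, so its cost does not grow with the size of the types involved.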
This potentially addresses the incompatibility issue with bounded parametric polymorphism (#116) because it makes the static type system nominal and there are known decidable and expressive extensions to nominal subtyping incorporating both parametric polymorphism and existential types.
If we furthermore remove the requirement that all RTTs extend from a common base RTT, then this addresses the incompatibility issue with pointer tagging (#118) because there is no longer a universal top type that all variants must be downcastable from.
Independent of that, if we furthermore remove the expectation that RTTs are derived canonically from their type signatures, then we also provide a mechanism for ensuring data abstraction without requiring dynamic boxing and unboxing as added in WebAssembly/proposal-type-imports#8.