-
Notifications
You must be signed in to change notification settings - Fork 393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Exact Table Types #624
Closed
+250
−0
Closed
Changes from 2 commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,245 @@ | ||
# Exact Table Types | ||
|
||
## Summary | ||
|
||
In Luau, we currently support two kinds of table types: sealed, inexact table types (the kind you | ||
use as parameter types) and unsealed, exact table types (the kind the type inference engine uses | ||
when checking table literals and subsequent assignments to them in the same scope). The RFCs for | ||
[unsealed table types][unsealed-tables] use language like _precision_, but is unclear about their | ||
exact meaning. This RFC aims to clarify the language and syntax around different facets of table | ||
types, and proposes the introduction of sealed, exact table types and a migration plan to make them | ||
the default table type. | ||
|
||
## Motivation | ||
|
||
In the current design of table types, we have conflated two axes of properties, namely whether | ||
tables are sealed vs. unsealed and whether tables are exact vs. inexact. In the interest of both | ||
making the language clearer and in making typechecking behavior easier to implement and explain, we | ||
want to disentangle these concepts. For the sake of precision, this RFC aims to always specify both | ||
facets, but in practice, we believe programmers will only really be concerned with whether or not a | ||
table type is exact. | ||
|
||
The first distinction, sealed vs. unsealed, deals with state during typechecking. Namely, a _sealed_ | ||
table type is one that will not be changed during typechecking. So, if we have a binding with a | ||
sealed table type, we know that the type of the binding itself will not change. By contrast, | ||
_unsealed_ table types are stateful, and an assignment to a binding can grow the set of properties | ||
on that type. All unsealed tables are sealed when they leave the scope they were created in, so | ||
unsealed tables should be thought of as a technique used for typechecking imperative-style table | ||
initialization code. | ||
|
||
The second distinction, exact vs. inexact, deals with how the table type relates to other table | ||
types. In particular, an exact table type indicates that the table has _only_ the properties listed | ||
in its type, while an inexact table type indicates that the table has _at least_ the properties | ||
listed in its type. This corresponds to whether or not _width subtyping_ is allowed on the type. | ||
|
||
With this language, we currently have two table types which are determined contextually, and have no | ||
syntactic difference. When a table type is used as the type of a function parameter, e.g. `function | ||
getx(t : {x: number}) return x end`, we treat is as _sealed_ and _inexact_ meaning the type of the | ||
parameter will _not_ change, but call sites to the function will permit tables that have _more_ | ||
properties than the ones listed. Within the body of the function however, the typechecker should | ||
only permit the use of properties explicitly listed in the type since it was annotated. If the type | ||
was instead inferred, a dual condition should be true --- the properties explicitly listed in | ||
the type are precisely the ones whose existence the function depends on. Meanwhile, when a table | ||
type is used for a local definition, it is given an _unsealed_ and _exact_ type meaning we always | ||
know _precisely_ the set of properties it has, but that the type itself is stateful and assignments | ||
are allowed to grow the set of properties the table has. | ||
|
||
This alludes to a gap of two sorts of table types: __sealed, exact tables_ and _unsealed, inexact | ||
tables_. We believe the latter is undesirable in the sense that there does not appear to be a | ||
compelling use case for _forgetting_ some of the structure of the table you're building statefully. | ||
The main consequence of doing so would be that you would be unable to make further assignments to a | ||
property after forgetting it was present. The former, however, is the novel proposal for this RFC. | ||
|
||
## Design | ||
|
||
A _sealed, exact table_ type is effectively the most strict we could be about a particular table. It | ||
says that the table does not allow width subtyping, and therefore must have _exactly_ the properties | ||
described by the type, and further says that the type itself is final meaning assignments to missing | ||
properties would subsequently be rejected. To imagine why this is useful, we can consider a small | ||
fragment of a hypothetical vector math library: | ||
|
||
```lua | ||
type Vec2 = { x: number, y: number } | ||
type Vec3 = { x: number, y: number, z: number } | ||
|
||
function length(pt: Vec2) | ||
return math.sqrt(pt.x^2 + pt.y^2) | ||
end | ||
``` | ||
|
||
If the table type `Vec2` here is `inexact`, we would allow `length` to be applied to a table of type | ||
`Vec3`. However, in this case, our domain knowledge tells us that `length`'s behavior is incorrect | ||
for `Vec3` since it does not take into account the `z`-component. While we could design a function | ||
that instead worked for both, it may be reasonable to instead say that this function only operates | ||
on the _exact_ table. While this particular example is admittedly a bit contrived, | ||
[similar proposals][ts-exact-types] for other languages like TypeScript have collected many examples | ||
where API designers wanted to be able specify exact types. One general pattern of use cases is | ||
state machines where you might define a number of types representing individual states and then a | ||
union over all such types. With only inexact types, the type system permits undesirable combinations | ||
that can introduce unwanted behavior in code using the state machine. | ||
|
||
```luau | ||
type StartState = { | ||
state: "Start", | ||
startData: number, | ||
} | ||
|
||
type OtherState = { | ||
state: "Other", | ||
otherData: string, | ||
} | ||
|
||
type State = StartState | OtherState | ||
|
||
-- but we can construct a `State` that is neither a `StartState` nor an `OtherState`. | ||
local notRight: State = { state: "Start", startData: 42, otherData: "foo" } | ||
``` | ||
|
||
Since this proposal would involve having two sorts of sealed table types, _exact_ and _inexact_, and | ||
their difference is not something we can distinguish contextually in general (as was the case was | ||
_sealed_ and _unsealed_), we have to propose a syntax for them. This proposal posits that the best | ||
path forward is to make sealed, exact tables the default table that you get when you write, e.g. `{ | ||
x: number, y: number }` as an annotation on a parameter or in a type alias. To annotate a table as | ||
inexact and therefore open to extension, we would include an ellipsis in the type, e.g. `{ x: | ||
number, y: number, ...}`. This was originally proposed as an alternative design in the [width | ||
subtyping][width-subtyping] proposal. | ||
|
||
There are a few apparent benefits to this approach. First, it means that a programmer designing an | ||
API has to make a conscious decision that "yes, in fact, it is sensible to treat all of the unlisted | ||
properties of this parameter as being unrelated to the behavior of this function." The type checker | ||
should prefer approaches that, by default, rule out more possible bugs in API usage while keeping | ||
explicit acceptance of the looser behavior natural and low friction. The motivation for taking this | ||
approach really being that, as a programmer, you are _forced_ to notice when a program you want to | ||
succeed produces an error, but you can remain blissfully unaware when programs you want to be | ||
rejected succeed silently. | ||
|
||
Secondly, though relatedly, making all table types inexact-by-default seems to exacerbate confusions | ||
when considering both the meaning of and the subtyping relations between table types. Consider, for | ||
example, the inexact table type `{...}` (currently written `{}` in the inexact-by-default world). | ||
Right away, the syntax of the type matches the syntax of the value form for the empty table. This | ||
makes it tempting to think of the type as something like "the empty table type" which one might | ||
expect has a single inhabiting value, namely the empty table. However, with inexact-by-default, this | ||
`{}` type is really the type of _every single table_ and it functions as a sort of top type for | ||
tables. If you do mistakenly read it (even momentarily) as the empty table type, you can then be | ||
surprised by the fact that it is _not_ a subtype of any more specific table, such as `{ x: number, | ||
y: number }`. This makes sense with the understanding in mind that it is the type of _all_ tables, | ||
rather than just the empty table, but then you may be surprised that it _is_ a subtype of `{[any]: | ||
any}`, but is not a subtype of, say, `{[any]: number}`. These sorts of surprises and confusions have | ||
come up in internal discussions of subtyping for Luau, and the syntax appears to not be doing us any | ||
favors here. | ||
|
||
Thirdly, we argue that it gives the most coherent and intuitive syntax for distinguishing exact vs. | ||
inexact tables since we believe that being able to distinguish between them and choose the | ||
appropriate one is important. Ellipses have an already established connotation of omission which | ||
maps exactly to how sealed, inexact types are _omitting_ some of the properties that the tables | ||
actually have (since tables with arbitrarily many additional properties can be accepted at an | ||
inexact table type). The author is unaware of existing syntax that carries the same strong | ||
connotation, but for exactness. One proposal discussed in the alternatives would be to follow Flow | ||
in using `{| x: number, y: number |}` to mean a sealed, exact table. | ||
|
||
Though we are proposing to make the default interpretation of table _annotations_ sealed and exact, | ||
we maintain that the sensible default inference behavior is to default to inferring inexact table | ||
types for unannotated table parameters. Consider the following example: | ||
|
||
```lua | ||
function dostuff(t) | ||
... | ||
t.x = 4 | ||
... | ||
t.y = 5 | ||
... | ||
end | ||
``` | ||
|
||
In this case, we expect that `t` will be given the inferred type `{x: number, y: number, ...}`. | ||
However, if we also passed `t` to another function that requires an exact table, we can propagate | ||
this requirement to `dostuff`. | ||
|
||
```lua | ||
function dostuff(t) | ||
... | ||
t.x = 4 | ||
... | ||
t.y = 5 | ||
... | ||
dootherstuff(t) | ||
... | ||
end | ||
|
||
function dootherstuff(pt: {x: number, y: number}) | ||
... | ||
end | ||
``` | ||
|
||
So, in this case, `t` would be given the exact table type `{x: number, y: number}` during inference. | ||
Note that this is still true even if `dostuff` did not refer to some or all of the properties of `t` | ||
outside of passing it into `dootherstuff`. That is, the type will be the same even if the code was | ||
as follows: | ||
|
||
```lua | ||
function dostuff(t) | ||
return dootherstuff(t) | ||
end | ||
|
||
function dootherstuff(pt: {x: number, y: number}) | ||
... | ||
end | ||
``` | ||
|
||
If instead the table is passed to a function requiring an exact table type with _different_ | ||
properties than the ones used in `dostuff`, we will emit an error indicating that they are in | ||
conflict. | ||
|
||
```lua | ||
function dostuff(t) | ||
... | ||
t.z = 4 | ||
... | ||
... | ||
dootherstuff(t) | ||
... | ||
end | ||
|
||
function dootherstuff(pt: {x: number, y: number}) | ||
... | ||
end | ||
``` | ||
|
||
In this case, `t` has two constraints: one indicating that it should be the exact table type | ||
`{x: number, y: number}` and the other indicating that it should have the property `z`. These | ||
constraints cannot be reconciled, and so we will get an error at the call site of `dootherstuff`. | ||
|
||
## Drawbacks | ||
|
||
The main drawback of doing this is that the existing RFC for [sealed table | ||
subtyping][width-subtyping] was already implemented as of March 1st, meaning that this is ultimately | ||
a breaking change. As such, it would be important to manage the change as a gradual rollout to | ||
mitigate problems. Since Luau currently only has sealed, inexact table types we can start by adding | ||
the syntax for explicit inexact tables (e.g. `{x: number, y: number, ...}`) and implement a warning | ||
that explains the default behavior will change in the future and to change it to the explicit | ||
inexact syntax if the programmer would like to keep the current behavior. At a later date, we can | ||
enable exact-by-default, but should provide an explicit suggestion in the error message to tell the | ||
programmer that if they _intended_ to allow tables with additional properties, they can change the | ||
signature of the function to include the `...`. | ||
|
||
## Alternatives | ||
|
||
There are two primary alternatives to this proposal. First, we could stick with the decision that | ||
was already made in the [sealed table subtyping][width-subtyping] proposal and leave sealed, inexact | ||
tables as the default table type. We then have either the choice of doing nothing else or adding | ||
a new type syntax for sealed, exact tables. If we do nothing, we are reducing the ability for API | ||
designers to rule out certain classes of bugs like the one presented with vector types. The | ||
implicit assumption of width subtyping, after all, is that properties not listed in the type are | ||
orthogonal to the function using that type, but of course, this assumption is sometimes wrong. It is | ||
possible still that inexactness is so often necessary that we see little safety benefit from making | ||
exactness available. If we add a new syntax for sealed, exact tables instead, the most sensible | ||
syntax would likely follow Flow in including pipes, so `{| x: number, y: number |}` would be the | ||
type of a sealed, exact table. This approach does not require a migration path, but would likely | ||
reflect the belief that the added safety of exact table types is only rarely necessary. The author | ||
believes it is worth considering, however, that Flow has also been migrating to | ||
[exact-by-default][flow-exact-by-default] for their object types. | ||
|
||
[unsealed-tables]: https://github.com/Roblox/luau/blob/master/rfcs/unsealed-table-literals.md | ||
[ts-exact-types]: https://github.com/microsoft/TypeScript/issues/12936 | ||
[width-subtyping]: https://github.com/Roblox/luau/blob/master/rfcs/sealed-table-subtyping.md | ||
[flow-exact-by-default]: https://medium.com/flow-type/how-to-upgrade-to-exact-by-default-object-type-syntax-7aa44b4d08ab |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the discussion about how inexact tables will work should be more prominent than just a section inside "Drawbacks", or is this intended to go in a different RFC? It feels like an important point, especially migration
e.g., Personally, I have no particular opinion, but were alternatives in syntax considered (such as TypeScript's
interface { }
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is (admittedly brief) mention in the design section about the syntax of inexact tables. Is there something you're concerned isn't discussed? I can, at minimum, say explicitly that the behavior of inexact tables defined using the proposed inexact syntax would be the same as the current behavior (i.e. they support the width subtyping already accepted and implemented in #38.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, I see it now - I must have missed it earlier.
No particular issue for me, just a note that it is very likely to get comments down the road from users about "why was XYZ not chosen", especially for syntax - but this is more of a nitpicking comment