From 104fc4a97ce536b7193bd3a56ff304b5384c5ec2 Mon Sep 17 00:00:00 2001 From: Igor Konnov Date: Mon, 4 Jul 2022 11:12:49 +0200 Subject: [PATCH 1/4] wip: partial update of ADR002 --- docs/src/adr/002adr-types.md | 55 ++++++++++++++++++------------------ 1 file changed, 28 insertions(+), 27 deletions(-) diff --git a/docs/src/adr/002adr-types.md b/docs/src/adr/002adr-types.md index 52e5840bd9..3f3eb5780e 100644 --- a/docs/src/adr/002adr-types.md +++ b/docs/src/adr/002adr-types.md @@ -2,7 +2,7 @@ | authors | revision | revision date | | -------------------------------------- | --------:| --------------:| -| Shon Feder, Igor Konnov, Jure Kukovec | 5 | April 08, 2022 | +| Shon Feder, Igor Konnov, Jure Kukovec | 6 | July 01, 2022 | *This is an architectural decision record. For user documentation, check the [Snowcat tutorial][] and [Snowcat HOWTO][].* @@ -12,7 +12,7 @@ alternative solutions. In this __ADR-002__, we fix one solution that seems to be most suitable. The interchange format for the type inference tools will be discussed in a separate ADR. -1. How to write types in TLA+ (Type System 1). +1. How to write types in TLA+ (Type Systems 1 and 1.2). 1. How to write type annotations (as a user). This document assumes that one can write a simple type checker that computes @@ -20,12 +20,6 @@ the types of all expressions based on the annotations provided by the user. Such an implementation is provided by the type checker Snowcat. See the [manual chapter](../apalache/typechecker-snowcat.md) on Snowcat. -In contrast, the [type inference -algorithm](https://github.com/informalsystems/apalache/tree/types) by @Kukovec -is fully automatic and thus it eliminates the need for type annotations. -Jure's algorithm is using Type System 1 too. The type inference algorithm -is still in the prototype phase. - System engineers often want to write type annotations and quickly check types when writing TLA+ specifications. This document is filling this gap. @@ -33,6 +27,9 @@ when writing TLA+ specifications. This document is filling this gap. ### 1.1. Type grammar (Type System 1, or TS1) +**Upgrade warning.** This system will be replaced with [Type System 1.2](#ts12) +in September of 2022. + We simply write types as strings that follow the type grammar: ``` @@ -41,6 +38,7 @@ T ::= 'Bool' | 'Int' | 'Str' | 'Set' '(' T ')' | 'Seq' '(' T ')' | '<<' T ',' ...',' T '>>' + // warning: the following type will be removed in Type System 1.2 | '[' field ':' T ',' ...',' field ':' T ']' | '(' T ',' ...',' T ')' '=>' T | typeConst @@ -63,7 +61,8 @@ The type rules have the following meaning: - The rule `<>` produces a tuple type over types that are produced by `T`. *Types at different positions may differ*. - The rule `[field: T, ..., field: T]` produces a record type over types that - are produced by `T`. *Types at different positions may differ*. + are produced by `T`. Types at different positions may differ. + *This syntax will change in [Type System 1.2](#ts12).* - The rule `(T, ..., T) => T` defines an operator whose result type and parameter types are produced by `T`. - The rule `typeConst` defines an uninterpreted type (or a reference to a type alias), look for an explanation below. - The rule `typeVar` defines a type variable, look for an explanation below. @@ -127,17 +126,24 @@ of `typeConst`, the name should be an identifier in the upper case. The type che of the constant type. For examples, see [Section 2.4](#useTypeAlias). -### 1.3. Type System 1.2, including precise records, variants, and rows + +### 1.3. Type System 1.2, including precise records, variants, and rows -**This is work in progress.** You can track the progress of this work in [Issue -401][]. Once this work is complete, we will switch to Type System 1.2. +**Feature under test.** By default, Snowcat is using Type System 1. To enable +Type System 1.2, pass the following argument to `typecheck` or another command, +e.g., `check`: + +```sh +$ apalache-mc typecheck --features=rows MySpec.tla +``` As discussed in [ADR014][], many users expressed the need for precise type checking for records in Snowcat. Records in untyped TLA+ are used in two capacities: as plain records and as variants. While the technical proposal is given in [ADR014][], we discuss the extension of the type grammar in this -ADR-002. To this end, we extend the grammar with new records, variants, and -rows as follows: +ADR-002. If you do not know about row typing, it may be useful to check the +Wikipedia page on [Row polymorphism][]. We extend the grammar with new +records, variants, and rows as follows: ``` // Type System 1.2 @@ -159,11 +165,9 @@ T2 ::= | 'Variant' '(' ')' variantOption ::= - // A variant option with a fully defined structure. - | { tag: stringLiteral, field: T2, ..., field: T2 } - // a variant option with a partially defined structure - // (a variant option over a row). - | { tag: stringLiteral, field: T2, ..., field: T2, typeVar } + // A variant option with a fully defined structure, + // tagged with a name that is defined with 'identifier' + identifier '(' T2 ')' // Special syntax for the rows, which is internal to the type checker. row ::= @@ -186,12 +190,8 @@ row ::= Int, b: Str, f: Bool, g: Set(Int) }`. * `v1` is a variant that has one of the two possible shapes: - - - It has the fields `tag` of type `Str` and `a` of type `Int` (if the field - `tag` is equal to `"A"`). - - - It has the fields `tag` of type `Str` and `b` of type `Bool` (if the field - `tag` is equal to `"B"`). + - It is tagged with `A` and is associated a value of type `Int`. + - It is tagged with `B` and is associated a value of type `B`. * `v2` is an empty variant, which admits no options. It has the type `Variant()`. @@ -252,7 +252,7 @@ better tests the field `type` carefully. [tag |-> "2a", bal |-> 2, val |-> 3]} ``` -In Type System 1.2 ([Section 1.3](#ts-1.2)), this set has the type of a set over a variant +In Type System 1.2 ([Section 1.3](#ts12)), this set has the type of a set over a variant type: ```tla @@ -275,7 +275,7 @@ which is used internally by types for user-defined operators, on top of their types for TLA+ expressions that do not contain user-defined operators. We expect that this type system will evolve in the future. That is why we call -it __Type System 1__. [Section 1.3](#ts-1.2) presents its extension to __Type System +it __Type System 1__. [Section 1.3](#ts12) presents its extension to __Type System 1.2__. Feel free to suggest __Type System 2.0__ :-) ## 2. How to write type annotations (as a user) @@ -529,3 +529,4 @@ AtMostOne == [Snowcat HOWTO]: https://apalache.informal.systems/docs/HOWTOs/howto-write-type-annotations.html [ADR014]: https://github.com/informalsystems/apalache/blob/unstable/docs/src/adr/014adr-precise-records.md [Issue 401]: https://github.com/informalsystems/apalache/issues/401 +[Row polymorphism]: https://en.wikipedia.org/wiki/Row_polymorphism From 3ad792bdc6a4ffead002297cd53a23139030897c Mon Sep 17 00:00:00 2001 From: Igor Konnov Date: Tue, 5 Jul 2022 12:46:09 +0200 Subject: [PATCH 2/4] update ADR002 with variants --- docs/src/adr/002adr-types.md | 122 +++++++++++++++++++++-------------- 1 file changed, 73 insertions(+), 49 deletions(-) diff --git a/docs/src/adr/002adr-types.md b/docs/src/adr/002adr-types.md index 3f3eb5780e..c6e63f7b51 100644 --- a/docs/src/adr/002adr-types.md +++ b/docs/src/adr/002adr-types.md @@ -2,7 +2,7 @@ | authors | revision | revision date | | -------------------------------------- | --------:| --------------:| -| Shon Feder, Igor Konnov, Jure Kukovec | 6 | July 01, 2022 | +| Shon Feder, Igor Konnov, Jure Kukovec | 7 | July 05, 2022 | *This is an architectural decision record. For user documentation, check the [Snowcat tutorial][] and [Snowcat HOWTO][].* @@ -147,34 +147,32 @@ records, variants, and rows as follows: ``` // Type System 1.2 -T2 ::= +T12 ::= // all types of Type System 1 T // A new record type with a fully defined structure. - // The set of fields may be empty. - | '{' field ':' T2 ',' ...',' field ':' T2 '}' - // A new record type with a partially defined structure - // (the type variable should be a 'row'). - // The set of fields may be empty. - | '{' field ':' T2 ',' ...',' field ':' T2 ',' typeVar '}' - // A variant that contains several options. - | variantOption '|' ... '|' variantOption - // A variant of undefined structure (the type variable should be a 'row') + // The set of fields may be empty. If typeVar is present, + // the record type is parameterized (typeVar must be of the 'row' kind). + | '{' field ':' T12 ',' ...',' field ':' T12 [',' typeVar] '}' + // A variant that contains several options, + // optionally parameterized (typeVar must be of the 'row' kind). + | variantOption '|' ... '|' variantOption '|' [typeVar] + // A purely parameterized variant (typeVar must be of the 'row' kind). | 'Variant' '(' typeVar ')' - // An empty variant - | 'Variant' '(' ')' variantOption ::= // A variant option with a fully defined structure, // tagged with a name that is defined with 'identifier' - identifier '(' T2 ')' + identifier '(' T12 ')' // Special syntax for the rows, which is internal to the type checker. row ::= - // A row with a fully defined structure. - | '(|' field ':' T2 '|' ...'|' field ':' T2 '|)' - // A row with a partially defined structure (ending with a row). - | '(|' field ':' T2 '|' ...'|' field ':' T2 '|' typeVar '|)' + // A row with a fully defined structure + // (having at least one field). + | '(|' field ':' T12 '|' ...'|' field ':' T12 '|)' + // A row with a partially defined structure + // (having at least one field and ending with a variable of the 'row' kind). + | '(|' field ':' T12 '|' ...'|' field ':' T12 '|' typeVar '|)' ``` **Examples.** @@ -185,21 +183,18 @@ row ::= * `r2` is a record that has the fields `a` of type `Int` and `b` of type `Str` and other fields, whose precise structure is captured with a type variable `c`. The type of `r2` is `{ a: Int, b: Str, c }`. More precisely, the - variable `c` should be a row. For instance, `c` can be equal to the row `(| + variable `c` must be a row. For instance, `c` can be equal to the row `(| f: Bool | g: Set(Int) |)`; in this case, `r2` would be a record of type `{ a: Int, b: Str, f: Bool, g: Set(Int) }`. * `v1` is a variant that has one of the two possible shapes: - - It is tagged with `A` and is associated a value of type `Int`. - - It is tagged with `B` and is associated a value of type `B`. + - Either it carries the tag `A` and an associated value of type `Int`, or + - It carries the tag `B` and an associated value of type `Bool`. + - The type of `v1` is `A(Int) | B(Bool)`. -* `v2` is an empty variant, which admits no options. It has the type - `Variant()`. - -* `v3` is a variant whose structure is defined by the type variable `b`. - The type of `v3` is `Variant(b)`. Note that `b` type variable should be - a row. For instance, it can be equal to the type `(| A: { tag: Str, f: Int } - | B: { tag: Str, g: Str } |)`. +* `v2` is a variant whose structure is entirely defined by the type variable + `b`. The type of `v2` is `Variant(b)`. Note that `b` must be a + row. For instance, it could be equal to `(| A: Int | B: Str |)`. Note that this syntax encapsulates rows in records and variants. We introduce the syntax for row types for completeness. Most likely, the users will never @@ -235,7 +230,7 @@ captures all interesting cases that occur in practice. Obviously, this type system considers ill-typed some perfectly legal TLA+ values. For instance, we cannot assign a reasonable type to `{1, TRUE}`. -**Sets of records in Type System 1.** +**Sets of tagged records in Type System 1.** We can assign a reasonable type to `{[type |-> "1a", bal |-> 1], [type |-> "2a", bal |-> 2, val |-> 3]}`, a pattern that often occurs in practice, e.g., see @@ -245,28 +240,53 @@ probably not what you expected, but it is the best type we can actually compute without having algebraic datatypes in TLA+. It also reminds the user that one better tests the field `type` carefully. -**Sets of records in Type System 1.2.** Consider the following set: +In retrospect, we have found that almost every user of Apalache made a typo in +their record types (including the Apalache developers!). Hence, we are +migrating to Type System 1.2. + +**Sets of tagged records (variants) in Type System 1.2.** Apalache provides the +user with the module [Variants.tla][] that implements operators over [variant +types][]. + +Using variants, we can write the above set of messages as follows: ```tla -{[tag |-> "1a", bal |-> 1], - [tag |-> "2a", bal |-> 2, val |-> 3]} +{ + Variant("M1a", [bal |-> 1]), + Variant("M2a", [bal |-> 2, val |-> 3]) +} ``` In Type System 1.2 ([Section 1.3](#ts12)), this set has the type of a set over a variant type: ```tla -Set({ tag: "1a", bal: Int } | { tag: "2a", bal: Int, val: Int }) + Set( + M1a({ bal: Int }) + | M2a({ bal: Int, val: Int }) + | a + ) +``` + +Note that the variant type is open-ended (parameterized with `a`) in the above +example, as we have not restricted its type. If we want to restrict the type to +exactly two options, we have to do that explicitly: + +```tla + \* @typeAlias: MESSAGE = M1a({ bal: Int }) | M2a({ bal: Int, val: Int }); + LET \* @type: Int => MESSAGE; + M1a(bal) == Variant("M1a", [bal |-> bal]) + IN + LET \* @type: (Int, Int) => MESSAGE; + M2a(bal, val) == Variant("M2a", [bal |-> bal, val |-> val]) + IN + { M1a(1), M2a(2, 3) } ``` -The value of the field `tag` serves as a type tag. However, we have to fix a -set of patterns that turn a variant type into a precise record type. In untyped -TLA+, such pattern is a set comprehension, e.g., `{ r \in S: r.tag = "1a" }`. -In the typed version, we define a minimal set of operators over variants in the -module -[Variants.tla](https://github.com/informalsystems/apalache/blob/unstable/src/tla/Variants.tla). -For instance, instead of writing the set comprehension, we have to use a filter -over a set of variants: `FilterByTag(S, "1a")`. +Many programming languages would automatically declare constructors such as +`M2a` and `M1a` from the type declaration. Since we are extending TLA+ with +types, we have to introduce some idiomatic boilerplate code. This could be +handled better in a surface syntax that is designed with types in mind. **Other type systems.** Type System 1 is also very much in line with the [type system by Stephan Merz and Hernan Vanzetto](https://dblp.org/search?q=Automatic+Verification+of+%7BTLA%7D+%2B+Proof+Obligations+with+%7BSMT%7D+Solvers), @@ -404,12 +424,14 @@ The type checker uses the type annotation to refine the type of an empty set ### 2.4. Introducing and using type aliases -A type alias is introduced with the annotation `@typeAlias: = ;` -on a dummy operator called `TypeAliases`. For example: +A type alias is introduced with the annotation `@typeAlias: = ;`. +Since it is convenient to group type aliases of a module `MyModule` +in one place, we usually use the following idiom: ```tla -\* @typeAlias: ENTRY = [a: Int, b: Bool]; -EXTypeAliases = TRUE +\* @typeAlias: ID = Int; +\* @typeAlias: ENTRY = { a: ID, b: Bool }; +MyModule_typedefs == TRUE VARIABLE \* @type: Set(ENTRY); @@ -422,8 +444,8 @@ Foo(ms, m) == The use of the dummy operator is a convention followed to simplify reasoning about where type aliases belong, and to ensure all aliases are located in one -place. The `` convention protects against name clashes when the module -is extended or instantiated. +place. The prefix such as the module name protects against name clashes when +the module is extended or instantiated. The actual rules around the placement of the `@typeAlias` annotation allows more flexibility: @@ -433,7 +455,7 @@ flexibility: 1. The names of type aliases must be unique in a module. 1. There is no scoping for aliases within a module. Even if an alias is defined - deep in a tree of LET-IN definitions, it can be references at any level in + deep in a tree of LET-IN definitions, it can be referenced at any level in the module. ## 3. Example @@ -469,7 +491,7 @@ CONSTANT Offers VARIABLE - \* @type: INGREDIENT -> [smoking: Bool]; + \* @type: INGREDIENT -> { smoking: Bool }; smokers, \* @type: Set(INGREDIENT); dealer @@ -530,3 +552,5 @@ AtMostOne == [ADR014]: https://github.com/informalsystems/apalache/blob/unstable/docs/src/adr/014adr-precise-records.md [Issue 401]: https://github.com/informalsystems/apalache/issues/401 [Row polymorphism]: https://en.wikipedia.org/wiki/Row_polymorphism +[Variants.tla]: https://github.com/informalsystems/apalache/blob/unstable/src/tla/Variants.tla +[variant types]: https://en.wikipedia.org/wiki/Tagged_union From af175d5e19d670e60e2fbe9878daa8c4dd7a3553 Mon Sep 17 00:00:00 2001 From: Igor Konnov Date: Tue, 5 Jul 2022 12:48:56 +0200 Subject: [PATCH 3/4] add release notes --- .unreleased/documentation/1922.md | 1 + 1 file changed, 1 insertion(+) create mode 100644 .unreleased/documentation/1922.md diff --git a/.unreleased/documentation/1922.md b/.unreleased/documentation/1922.md new file mode 100644 index 0000000000..ee89edc789 --- /dev/null +++ b/.unreleased/documentation/1922.md @@ -0,0 +1 @@ +- Update ADR002 with the new syntax for variants, see #1922 From a42c37992d1deda91ff4822fedae00c870fc404f Mon Sep 17 00:00:00 2001 From: Igor Konnov Date: Tue, 5 Jul 2022 15:11:22 +0200 Subject: [PATCH 4/4] Apply suggestions from code review Co-authored-by: Thomas Pani Co-authored-by: Shon Feder --- .unreleased/documentation/1922.md | 2 +- docs/src/adr/002adr-types.md | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/.unreleased/documentation/1922.md b/.unreleased/documentation/1922.md index ee89edc789..95a987dc3e 100644 --- a/.unreleased/documentation/1922.md +++ b/.unreleased/documentation/1922.md @@ -1 +1 @@ -- Update ADR002 with the new syntax for variants, see #1922 +Update ADR002 with the new syntax for variants, see #1922 diff --git a/docs/src/adr/002adr-types.md b/docs/src/adr/002adr-types.md index c6e63f7b51..8446bc4ef3 100644 --- a/docs/src/adr/002adr-types.md +++ b/docs/src/adr/002adr-types.md @@ -61,7 +61,7 @@ The type rules have the following meaning: - The rule `<>` produces a tuple type over types that are produced by `T`. *Types at different positions may differ*. - The rule `[field: T, ..., field: T]` produces a record type over types that - are produced by `T`. Types at different positions may differ. + are produced by `T`. *Types at different positions may differ.* *This syntax will change in [Type System 1.2](#ts12).* - The rule `(T, ..., T) => T` defines an operator whose result type and parameter types are produced by `T`. - The rule `typeConst` defines an uninterpreted type (or a reference to a type alias), look for an explanation below. @@ -130,7 +130,7 @@ of the constant type. For examples, see [Section 2.4](#useTypeAlias). ### 1.3. Type System 1.2, including precise records, variants, and rows **Feature under test.** By default, Snowcat is using Type System 1. To enable -Type System 1.2, pass the following argument to `typecheck` or another command, +Type System 1.2, pass `--features=rows` to `typecheck` or another command, e.g., `check`: ```sh @@ -240,7 +240,7 @@ probably not what you expected, but it is the best type we can actually compute without having algebraic datatypes in TLA+. It also reminds the user that one better tests the field `type` carefully. -In retrospect, we have found that almost every user of Apalache made a typo in +In retrospect, we have found that almost every user of Apalache made typos in their record types (including the Apalache developers!). Hence, we are migrating to Type System 1.2. @@ -445,7 +445,7 @@ Foo(ms, m) == The use of the dummy operator is a convention followed to simplify reasoning about where type aliases belong, and to ensure all aliases are located in one place. The prefix such as the module name protects against name clashes when -the module is extended or instantiated. +the module is extended or instantiated. The actual rules around the placement of the `@typeAlias` annotation allows more flexibility: