-
-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Json ADR #670
Json ADR #670
Changes from 1 commit
944de9f
e0393f8
9cbb990
c2b5d11
4a3f699
db7ed14
8673ede
21525b4
3500ed4
10bf178
1840df4
785f9c5
bd6ce9c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -32,23 +32,93 @@ The following classes are serializable: | |||||
Every serialization will contain a disambiguation field, named `kind`. This field holds the name of the class being serialized. For example, the serialization of a `NameEx` will have the shape | ||||||
``` | ||||||
{ | ||||||
kind: "NameEx" | ||||||
"kind": "NameEx" | ||||||
... | ||||||
} | ||||||
``` | ||||||
|
||||||
## 3. Serializing typed entities | ||||||
## 3. Serializing tagged entities | ||||||
|
||||||
Serializations of typed entities will have an additional field named `type`, containing the type of the expression (see [ADR-002][], [ADR-004][] for a description of our type system and the syntax for types-as-string-annotations respectively). For example, the integer literal `1` is represented by a `ValEx`, which has type `Int` and is serialized as follows: | ||||||
Serializations of entities annotated with a `TypeTag` will have an additional field named `type`, containing the type of the expression (see [ADR-002][], [ADR-004][] for a description of our type system and the syntax for types-as-string-annotations respectively), if the tag is `Typed`, or `Untyped` otherwise. For example, the integer literal `1` is represented by a `ValEx`, which has type `Int` and is serialized as follows: | ||||||
``` | ||||||
{ | ||||||
kind: "NameEx" | ||||||
type: "Int" | ||||||
"kind": "ValEx", | ||||||
"type": "Int" | ||||||
... | ||||||
} | ||||||
``` | ||||||
in the typed encoding, or | ||||||
``` | ||||||
{ | ||||||
"kind": "ValEx", | ||||||
"type": "Untyped" | ||||||
... | ||||||
} | ||||||
``` | ||||||
in the untyped encoding. | ||||||
|
||||||
## 4. Serialization rules | ||||||
## 4. Source information | ||||||
Entities in the internal representation are usually annotated with source information, of the form `{filename}:{startLine}:{startColumn}-{endLine}:{endColumn}`, relating them to a file range in the provided specification (from which they may have been transformed as part of preprocessing). | ||||||
JSON encodings may, but are not required to, contain a `source` providing this information, of the following shape: | ||||||
``` | ||||||
{ | ||||||
"source": { | ||||||
"filename" : <FILENAME>, | ||||||
"from" : { | ||||||
"line" : <STARTLINE>, | ||||||
"column" : <STARTCOLUMN> | ||||||
}, | ||||||
"to" : { | ||||||
"line" : <ENDLINE>, | ||||||
"column" : <ENDCOLUMN> | ||||||
} | ||||||
} | ||||||
} | ||||||
``` | ||||||
or | ||||||
``` | ||||||
{ | ||||||
"source": "UNKNOWN" | ||||||
} | ||||||
``` | ||||||
if no source information is available (e.g. for expressions generated purely by Apalache). | ||||||
Serializations generated by Apalache are always guaranteed to contain a `source` field entry. | ||||||
|
||||||
Example: | ||||||
``` | ||||||
{ | ||||||
"kind" : "NameEx", | ||||||
"type" : "Int", | ||||||
"name" : "myName", | ||||||
"source": { | ||||||
"filename" : "MyModule.tla", | ||||||
"from" : { | ||||||
"line" : 3, | ||||||
"column" : 5 | ||||||
}, | ||||||
"to" : { | ||||||
"line" : 3, | ||||||
"column" : 10 | ||||||
} | ||||||
} | ||||||
} | ||||||
``` | ||||||
|
||||||
## 5. Version information | ||||||
JSON serializations of `TlaModule`s contain a filed named `version`, holding a string representation of the current JSON encoding version, shaped `{major}.{minor}`. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
This document defines JSON Version 1.0. If and when a different JSON version is defined, this document will be updated accordingly. | ||||||
Apalache may refuse to import, or trigger warnings for, JSON objects with obsolete versions of the encoding in the future. | ||||||
Example: | ||||||
``` | ||||||
{ | ||||||
"kind": "TlaModule", | ||||||
"name": "MyModule", | ||||||
"version": "1.0" | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should introduce a different root object for the json format. We may want to add meta information to it later. So I would suggest adding another object on top of {
"name": "ApalacheIR",
"version": "1.0",
"modules": [{
"kind": "TlaModule",
"name": "MyModule",
...
}]
} |
||||||
... | ||||||
} | ||||||
``` | ||||||
|
||||||
## 6. General serialization rules | ||||||
|
||||||
The goal of the serialization is for the JSON structure to mimic the internal representation as closely as possible, for ease of deserialization. | ||||||
Concretely, whenever a class declares a field `fld: T`, its serialization also contains a field named `fld`, containing the serialization of the field value. | ||||||
|
@@ -70,23 +140,20 @@ OperEx( | |||||
Since both sub-expressions, the literals `1`, are identical, their serializations are equal as well: | ||||||
``` | ||||||
{ | ||||||
kind: "ValEx", | ||||||
type: "Int", | ||||||
value: { | ||||||
kind: "TlaInt", | ||||||
value: 1 | ||||||
"kind": "ValEx", | ||||||
"type": "Int", | ||||||
"value": { | ||||||
"kind": "TlaInt", | ||||||
"value": 1 | ||||||
} | ||||||
} | ||||||
``` | ||||||
|
||||||
Note that in the above example, the outer `value` field contains a JSON serialization of `TlaInt(1)`, and not just the JSON literal `1`. This is because of deserialization: | ||||||
Apalache encodes its TLA+ integers as `BigInt`, which means that it permits values, for which `.isValidInt` does not hold. If we chose to encode values directly as JSON numbers, we would need a) a sensible encoding of `BigInt` values, which are not valid integers and b) a way to distinguish both variants of `BigInt`, as well as decimals, when deserializing (since JSON is not typed). | ||||||
Alternatively, we could encode all values as strings, which would be similarly indistinguishable when deserializing. Importantly, the `type` field is not guaranteed to contain a hint, as it could be `Untyped`. | ||||||
For the above reasons, we choose to instead serialize `TlaValue` as a JSON object, which is more verbose, but trivial to deserialize. It has the following shape | ||||||
Observe that we choose to serialize `TlaValue` as a JSON object, which is more verbose, but trivial to deserialize. It has the following shape | ||||||
``` | ||||||
{ | ||||||
kind: _ | ||||||
value: _ | ||||||
"kind": _ , | ||||||
"value": _ | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what are the underscores? Is this standard JSON semantics? Shall we use the syntax for comments like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The underscores are, at least to me, very obvious meta-language placeholders. No less formal than There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, I surely understand what you mean by |
||||||
} | ||||||
``` | ||||||
|
||||||
|
@@ -95,21 +162,27 @@ The `value` field depends on the kind of `TlaValue`: | |||||
1. For `TlaBool`: a JSON Boolean | ||||||
1. For `TlaInt(bigIntValue)`: | ||||||
1. If `bigIntValue.isValidInt`: a JSON number | ||||||
2. Otherwise: `{ bigInt: bigIntValue.toString() }` | ||||||
2. Otherwise: `{ "bigInt: bigIntValue.toString() }` | ||||||
Kukovec marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
1. For `TlaDecimal(decValue)`: a JSON string `decValue.toString` | ||||||
|
||||||
The reason for the non-uniform treatment of integers, is that Apalache encodes its TLA+ integers as `BigInt`, which means that it permits values, for which `.isValidInt` does not hold. | ||||||
|
||||||
While it might seem more natural to encode the entire `TlaValue` as a JSON primitive, without the added object layer, we would have a much tougher time deserializing. | ||||||
We would need a) a sensible encoding of `BigInt` values, which are not valid integers, and b) a way to distinguish both variants of `BigInt`, as well as decimals, when deserializing (since JSON is not typed). | ||||||
We could encode all values as strings, but they would be similarly indistinguishable when deserializing. Importantly, the `type` field of the `ValEx` expression is not guaranteed to contain a hint, as it could be `Untyped` | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You could cite RFC7159 to justify why we cannot just use unbounded numbers. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think it's about the capabilities of JSON numbers, it has more to do with the unnecessary complexity of interpreting those JSON values as the correct Scala type, which we can avoid pretty much entirely by just having a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The reason for not using JSON number is that they have a bounded range, something like |
||||||
|
||||||
Take `jsonOf1` to be the serialization of `ValEx( TlaInt( 1 ) )( typeTag = Typed( IntT1() ) )` shown above. The serialization of `1 + 1` is then equal to | ||||||
``` | ||||||
{ | ||||||
kind: "OperEx", | ||||||
type: "Int", | ||||||
oper: "(+)", | ||||||
args: [jsonOf1, jsonOf1] | ||||||
"kind": "OperEx", | ||||||
"type": "Int", | ||||||
"oper": "(+)", | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps, but that is the value of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since it is an ADR and we have #726, let's change it to |
||||||
"args": [jsonOf1, jsonOf1] | ||||||
} | ||||||
``` | ||||||
In general, for any given `oper: TlaOper` of `OperEx`, the value of the `oper` field in the serialization equals `oper.name`. | ||||||
|
||||||
## 5. Implementation | ||||||
## 7. Implementation | ||||||
|
||||||
The implementation of the serialization can be found in the class | ||||||
`at.forsyte.apalache.io.json.TlaToJson` of the module `tla-import`, see [TlaToJson][]. | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
package at.forsyte.apalache.io.json | ||
|
||
/** | ||
* JSON serialization versions need not be tied to Apalache versions | ||
* (Apalache versions, obtained from at.forsyte.apalache.tla.tooling.Version | ||
* cannot be read here anyway, due to tooling -> import dependencies) | ||
*/ | ||
object JsonVersion { | ||
val major = 1 | ||
val minor = 0 | ||
def current: String = s"$major.$minor" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I would be a bit more careful here. We should prepare the user that they may face "UNKNOWN" in rare cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, I very explicitly write "or [UNKNOWN] if no source information is available." The next line asserts that a
source
field exists, not that the contents of said field are not "UNKNOWN".There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am concerned about this sentence:
Sometimes, we are missing source information. So "always guaranteed" is way to strong.