algorandfoundation · tristanmenzel · Jun 7, 2024 · May 27, 2024 · May 27, 2024 · May 28, 2024
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,35 @@
+# Algorand TypeScript
+
+Algorand TypeScript is a partial implementation of the TypeScript programming language that runs on the Algorand Virtual Machine (AVM). It includes a statically typed framework for development of Algorand smart contracts and logic signatures, with TypeScript interfaces to underlying AVM functionality that works with standard TypeScript tooling.
+
+It maintains the syntax and semantics of TypeScript such that a developer who knows TypeScript can make safe assumptions
+about the behaviour of the compiled code when running on the AVM. Algorand TypeScript is also executable TypeScript: once transpiled
+to EcmaScript, it can be run and debugged on Node.js and exercised from automated tests.
+
+# Guiding Principals
+
+## Familiarity
+
+Where the base language (TypeScript/EcmaScript) doesn't support a given feature natively (eg. unsigned fixed size integers),
+prior art should be used to inspire an API that is familiar to a user of the base language and transpilation can be used to
+ensure this code executes correctly.
+
+## Leveraging TypeScript type system
+
+TypeScript's type system should be used wherever possible to ensure code is type safe before compilation, creating a fast
+feedback loop and nudging users into the [pit of success](https://blog.codinghorror.com/falling-into-the-pit-of-success/).
+
+## TEALScript compatibility
+
+[TEALScript](https://github.com/algorandfoundation/tealscript/) is an existing compiler from a TypeScript-like language to TEAL. However, its source code is not executable TypeScript, and it does not prioritise semantic compatibility. Wherever possible, Algorand TypeScript should endeavour to be compatible with existing TEALScript contracts and, where that is not possible, to make them migratable with minimal changes.
+
+## Algorand Python
+
+[Algorand Python](https://algorandfoundation.github.io/puya/) is the Python equivalent of Algorand TypeScript. Whilst there is a primary goal to produce an API which makes sense in the TypeScript ecosystem, a secondary goal is to minimise the disparity between the two APIs such that users who choose to, or are required to develop on both platforms are not facing a completely unfamiliar API.
+
+## ABI Abstraction
+
+When possible, Algorand TypeScript should avoid putting the cognitive overhead of ABI encoding/decoding on the developer. For example, there should be no difference between AVM byteslices and ABI encoded strings; they should be directly comparable and compatible until the point of encoding (returning, putting in state, array encoding, logging, etc.).
+
+# Architecture decisions
+
+As part of developing Algorand TypeScript we are documenting key architecture decisions using [Architecture Decision Records (ADRs)](https://adr.github.io/). The following are the key decisions that have been made thus far:
+
+- [2024-05-21: Primitive integer types](./architecture-decisions/2024-05-21_primitive-integer-types.md)
+- [2024-05-21: Primitive byte and string types](./architecture-decisions/2024-05-21_primitive-bytes-and-strings.md)
diff --git a/docs/architecture-decisions/2024-05-21_primitive-bytes-and-strings.md b/docs/architecture-decisions/2024-05-21_primitive-bytes-and-strings.md
@@ -0,0 +1,122 @@
+# Architecture Decision Record - Primitive bytes and strings
+
+- **Status**: Draft
+- **Owner:** Tristan Menzel
+- **Deciders**: Alessandro Cappellato (Algorand Foundation), Joe Polny (Algorand Foundation), Rob Moore (MakerX)
+- **Date created**: 2024-05-21
+- **Date decided**: N/A
+- **Date updated**: 2024-05-22
+
+## Context
+
+See [Architecture Decision Record - Primitive integer types](./2024-05-21_primitive-integer-types.md) for related decision and context.
+
+The AVM's only non-integer type is a variable length byte array. When *not* being interpreted as a `biguint`, leading zeros are significant and length is constant unless explicitly manipulated. Strings can only be represented in the AVM if they are encoded as bytes. The AVM supports byte literals in the form of base16, base64, and utf8 encoded strings. Once a literal has been parsed, the AVM has no concept of the original encoding or of utf8 characters. As a result, whilst a byte array can be indexed to receive a single byte (or a slice of bytes), it cannot be indexed to return a single utf8 *character* unless one assumes all characters in the original string were ASCII (i.e. single byte) characters.
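
The gap between bytes and characters can be seen in plain TypeScript. The following is a minimal sketch that uses the standard `TextEncoder` API (not any Algorand TypeScript API) to show that indexing UTF-8 bytes does not line up with indexing characters once a non-ASCII character is present. The sample strings are illustrative only.

```typescript
// Encode a string containing one multi-byte character ("é", U+00E9) as UTF-8.
const text = "h\u00e9llo";
const utf8Bytes = new TextEncoder().encode(text);

// Five characters, but six bytes: "é" is encoded as the two bytes 0xC3 0xA9.
const charCount = text.length;
const byteCount = utf8Bytes.length;

// Indexing the string yields a whole character;
// indexing the bytes yields only the first half of that character.
const secondChar = text[1];
const secondByte = utf8Bytes[1];

// For pure ASCII input the two views line up one-to-one.
const ascii = "hello";
const asciiBytes = new TextEncoder().encode(ascii);
const asciiAligned = ascii.length === asciiBytes.length;

console.log({ charCount, byteCount, secondChar, secondByte, asciiAligned });
```

This is why a byte-oriented type can only safely offer "character" indexing when the content is known to be ASCII.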
+
+EcmaScript provides two relevant types for bytes and strings.
+
+ - **string**: The native string type. Supports arbitrary length, concatenation, indexation/slicing of characters plus many utility methods (upper/lower/startswith/endswith/charcodeat/trim etc). Supports concat with binary `+` operator.
+ - **Uint8Array**: A variable length mutable array of 8-bit numbers. Supports indexing/slicing of 'bytes'.
+
+TealScript uses a branded string to represent bytes. Base64/Base16 encoding/decoding is performed with specific ops. The prototype of these objects contains string-specific APIs that are not implemented.
+
+Algorand Python has specific [Bytes and String types](https://algorandfoundation.github.io/puya/lg-types.html#avm-types) that have semantics that exactly match the AVM semantics. Python allows for operator overloading so these types also use native operators (where they align to functionality in the underlying AVM).
+
+
+## Requirements
+
+- Support bytes AVM type and a string type that supports ASCII UTF-8 strings
+- Use idiomatic TypeScript expressions for string expressions, including concatenation operator (`+`)
+- Semantic compatibility when executing on Node.js (e.g. in unit tests) and AVM
+
+## Principles
+
+- **[AlgoKit Guiding Principles](https://github.com/algorandfoundation/algokit-cli/blob/main/docs/algokit.md#guiding-principles)** - specifically Seamless onramp, Leverage existing ecosystem, Meet devs where they are
+- **[Algorand Python Principles](https://algorandfoundation.github.io/puya/principles.html#principles)**
+- **[Algorand TypeScript Guiding Principles](../README.md#guiding-principals)**
+
+## Options
+
+
+### Option 1 - Direct use of native EcmaScript types
+
+```ts
+const b1 = "somebytes"
+
+const b2 = new Uint8Array([1, 2, 3, 4])
+
+const b3 = b1 + b1
+```
+
+Whilst binary data is often a representation of a utf-8 string, it is not always - so direct use of the string type is not a natural fit. It doesn't allow us to represent alternative encodings (b16/b64) and the existing api surface is very 'string' centric. Much of the api would also be expensive to implement on the AVM leading to a bunch of 'dead' methods hanging off the type (or a significant amount of work implementing all the methods).
+
+The Uint8Array type is fit for purpose as an encoding mechanism but the API is not as friendly as it could be for writing declarative contracts. The `new` keyword feels unnatural for something that is ostensibly a primitive type. The fact that it is mutable also complicates the implementation the compiler produces for the AVM.
+
+### Option 2 - Define a class to represent Bytes
+
+A `Bytes` class is defined with a very specific API tailored to operations which are available on the AVM:
+
+```ts
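+class Bytes {
+  // Backing value; a plain string is used here purely for illustration
+  private readonly v: string
+
+  constructor(v: string) {
+    this.v = v
+  }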
+
+  concat(other: Bytes): Bytes {
+    return new Bytes(this.v + other.v)
+  }
+
+  at(x: uint64): Bytes {
+    return new Bytes(this.v[x])
+  }
+
+  /* etc */
+}
+
+```
+
+This solution provides great type safety and requires no transpilation to run _correctly_ on Node.js. However, non-primitive values in EcmaScript are compared by reference rather than by value, so two `Bytes` instances with identical contents are not equal under `===` or in a `switch`. Again, the `new` keyword feels unnatural. Due to the lack of operator overloading, `+` will not work as expected. Concatenation does not require the same understanding of "order of operations" and nesting as numeric operations, so a `concat` method isn't as unwieldy (but still isn't idiomatic).
+
+```ts
+const a = new Bytes("Hello")
+const b = new Bytes("World")
+
+function testValue(x: Bytes) {
+  // No compile error, but will work on reference not value
+  switch(x) {
+    case a:
+      return b
+    case b:
+      return a
+  }
+  return new Bytes("default")
+}
+```
+
+To have equality checks behave as expected we would need a transpilation step to replace bytes values in certain expressions with a primitive type.
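
The reference-equality behaviour can be checked directly on Node.js. The sketch below uses a cut-down stand-in for the `Bytes` class. The `equals` method is a hypothetical helper added purely for illustration and is not part of the option as proposed.

```typescript
// Cut-down stand-in for the Bytes class from Option 2 (illustrative only).
class Bytes {
  constructor(private readonly v: string) {}

  // Hypothetical value-equality helper; not part of the proposal above.
  equals(other: Bytes): boolean {
    return this.v === other.v;
  }
}

const first = new Bytes("Hello");
const second = new Bytes("Hello");

// Same contents, different objects: strict equality compares references.
const sameReference = first === second;

// An explicit method can compare by value, at the cost of idiomatic syntax.
const sameValue = first.equals(second);

// Primitive strings, by contrast, compare by value.
const p1: string = "Hello";
const p2: string = "Hello";
const primitivesEqual = p1 === p2;

console.log({ sameReference, sameValue, primitivesEqual });
```

A transpilation step, as described above, would in effect rewrite the first comparison into something with the semantics of the second.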
+
+### Option 3 - Implement bytes as a class but define it as a type + factory
+
+We can iron out some of the rough edges of using a class by only exposing a factory method for `Bytes` and a resulting type `bytes`. This removes the need for the `new` keyword and lets us use a 'primitive looking' type alias (`bytes` versus `Bytes` - much like `string` and `String`). We can use [tagged templates](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates) to improve the ux of multipart concat expressions.
+
+```ts
+
+const a = Bytes("Hello")
+const b = Bytes.fromHex("ABFF")
+const c = Bytes.fromBase64("...")
+const d = Bytes.fromInts(255, 123, 28, 20)
+
+
+function testValue(x: bytes, y: bytes): bytes {
+  return Bytes`${x} and ${y}`
+}
+
+```
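
One way the factory and tagged template could be implemented for execution on Node.js is sketched below. This is an assumption-laden illustration rather than the project's implementation: `BytesValue`, `bytesFactory`, `joinParts`, and `toHex` are hypothetical names, a `Uint8Array` is assumed as the backing store, and only the string and hex constructors are shown.

```typescript
const encoder = new TextEncoder();

// Internal class (hypothetical name). It would not be exported, so callers never use `new`.
class BytesValue {
  constructor(readonly raw: Uint8Array) {}

  // Illustrative helper for inspecting the contents.
  toHex(): string {
    let hex = "";
    for (let i = 0; i < this.raw.length; i++) {
      hex += this.raw[i].toString(16).padStart(2, "0");
    }
    return hex;
  }
}

// The 'primitive looking' alias that contract code would reference.
type bytes = BytesValue;

// Concatenate several byte arrays into one.
function joinParts(parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce((n, p) => n + p.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}

// Overloads: a plain call with a UTF-8 string, or use as a template tag.
function bytesFactory(value: string): bytes;
function bytesFactory(strings: TemplateStringsArray, ...values: bytes[]): bytes;
function bytesFactory(first: string | TemplateStringsArray, ...values: bytes[]): bytes {
  if (typeof first === "string") return new BytesValue(encoder.encode(first));
  const parts: Uint8Array[] = [];
  first.forEach((literal, i) => {
    parts.push(encoder.encode(literal)); // literal text between placeholders
    if (i < values.length) parts.push(values[i].raw); // interpolated bytes value
  });
  return new BytesValue(joinParts(parts));
}

// Assumes an even-length string of valid hex digits.
function fromHex(hex: string): bytes {
  const raw = new Uint8Array(hex.length / 2);
  for (let i = 0; i < raw.length; i++) {
    raw[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return new BytesValue(raw);
}

// Attach the static-style constructor to the callable factory.
const Bytes = Object.assign(bytesFactory, { fromHex });

const x = Bytes("Hello");
const y = Bytes.fromHex("ABFF");
const joined = Bytes`${x} and ${y}`;
console.log(joined.toHex());
```

Note that the values produced here are still objects, so the reference-equality concern from Option 2 applies unless a transpilation step addresses it.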
+
+## Preferred option
+
+TBD
+
+## Selected option
+
+TBD
diff --git a/docs/architecture-decisions/2024-05-21_primitive-integer-types.md b/docs/architecture-decisions/2024-05-21_primitive-integer-types.md
@@ -0,0 +1,157 @@
+# Architecture Decision Record - Primitive integer types
+
+- **Status**: Draft
+- **Owner:** Tristan Menzel
+- **Deciders**: Alessandro Cappellato (Algorand Foundation), Joe Polny (Algorand Foundation), Rob Moore (MakerX)
+- **Date created**: 2024-05-21
+- **Date decided**: N/A
+- **Date updated**: 2024-05-22
+
+## Context
+
+The AVM supports two integer types in its standard set of ops.
+
+* **uint64**: An unsigned 64-bit integer where the AVM will error on over or under flows
+* **biguint**: An unsigned variable bit, big-endian integer represented as an array of bytes with an indeterminate number of leading zeros which are truncated by several math ops. The max size of a biguint is 512-bits. Over and under flows will cause errors.
+
+EcmaScript supports two numeric types.
+
+* **number**: A signed 64-bit floating point value (IEEE 754 double precision) with a max safe integer value of 2^53 - 1. A number can be declared with a numeric literal, or with the `Number(...)` factory method.
+* **bigint**: A signed arbitrary-precision integer with an implementation defined limit based on the platform. In practice this is greater than 512-bit. A bigint can be declared with a numeric literal and `n` suffix, or with the `BigInt(...)` factory method.
+
+Neither EcmaScript nor TypeScript supports operator overloading, despite some [previous](https://github.com/tc39/notes/blob/main/meetings/2023-11/november-28.md#withdrawing-operator-overloading) [attempts](https://github.com/microsoft/TypeScript/issues/2319) to add it.
+
+TealScript [makes use of branded `number` types](https://tealscript.netlify.app/guides/supported-types/numbers/) for all bit sizes from 8 to 512. Since the source code is never executed, the safe limits of the `number` type are not a concern. Compiled code does not perform overflow checks on calculations until a return value is being encoded, meaning a uint<8> is effectively a uint<64> until it's returned.
+
+Algorand Python has specific [UInt64 and BigUint types](https://algorandfoundation.github.io/puya/lg-types.html#avm-types) that have semantics that exactly match the AVM semantics. Python allows for operator overloading so these types also use native operators (where they align to functionality in the underlying AVM).
+
+
+## Requirements
+
+- Support uint64 and biguint AVM types
+- Use idiomatic TypeScript expressions for numeric expressions, including mathematical operators (`+`, `-`, `*`, `/`, etc.)
+- Semantic compatibility when executing on Node.js (e.g. in unit tests) and AVM
+
+## Principles
+
+- **[AlgoKit Guiding Principles](https://github.com/algorandfoundation/algokit-cli/blob/main/docs/algokit.md#guiding-principles)** - specifically Seamless onramp, Leverage existing ecosystem, Meet devs where they are
+- **[Algorand Python Principles](https://algorandfoundation.github.io/puya/principles.html#principles)**
+- **[Algorand TypeScript Guiding Principles](../README.md#guiding-principals)**
+
+## Options
+
+### Option 1 - Direct use of native EcmaScript types
+
+EcmaScript's `number` type is ill-suited to representing either AVM type reliably as it does not have the safe range to cover the full range of a uint64. Being a floating point number, it would also require truncating after division.
+
+EcmaScript's `bigint` is a better fit for both types but does not underflow when presented with a negative number, nor does it overflow at any meaningful limit for the AVM types.
+
+If we solved the over/under flow checking with transpilation we still face an issue that `uint64` and `biguint` would not have discrete types and thus, we would have no type safety against accidentally passing a `biguint` to a method that expects a `uint64` and vice versa.
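
The range and overflow limitations described above can be demonstrated with native values. This is a small sketch using only standard EcmaScript behaviour. The constant name `MAX_UINT64` is introduced here for illustration.

```typescript
// Largest value a uint64 can hold: 2^64 - 1.
const MAX_UINT64 = 2n ** 64n - 1n;

// `number` cannot represent it exactly, so a round trip changes the value.
const asNumber = Number(MAX_UINT64);
const roundTrips = BigInt(asNumber) === MAX_UINT64;

// Above the safe range, adjacent integers collapse to the same `number`.
const collapsed = 2 ** 53 === 2 ** 53 + 1;

// `bigint` keeps full precision but never errors at the AVM limits.
const pastLimit = MAX_UINT64 + 1n; // the AVM would fail with an overflow here
const belowZero = 0n - 1n; // the AVM would fail with an underflow here

// Division: `number` yields a fraction, `bigint` yields an integer result.
const numberDiv = 7 / 2;
const bigintDiv = 7n / 2n;

console.log({ roundTrips, collapsed, pastLimit, belowZero, numberDiv, bigintDiv });
```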
+
+### Option 2 - Define classes to represent the AVM types
+
+A `UInt64` and `BigUint` class could be defined which make use of `bigint` internally to perform maths operations and check for over or under flows after each op.
+
+```ts
+class UInt64 {
+
+  private value: bigint
+
+  constructor(value: bigint | number) {
+    this.value = this.checkBounds(value)
+  }
+
+  add(other: UInt64): UInt64 {
+    return new UInt64(this.value + other.value)
+  }
+
+  /* etc */
+}
+
+```
+
+This solution provides the ultimate in type safety and semantic/syntactic compatibility, and requires no transpilation to run _correctly_ on Node.js. The semantics should be obvious to anyone familiar with Object Oriented Programming. The downside is that neither EcmaScript nor TypeScript support operator overloading which results in more verbose and unwieldy math expressions.
+
+```ts
+const a = new UInt64(500n)
+const b = new UInt64(256)
+
+// Not supported (a compile error in TS, unhelpful behaviour in ES)
+const c1 = a + b
+// Works, but is verbose and unwieldy for more complicated expressions and isn't idiomatic TypeScript
+const c2 = a.add(b)
+
+```
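
To make the trade-off concrete, here is a fuller sketch of such a class that runs on Node.js. The bounds check, the `MAX` constant, and the `sub`, `mul`, and `toBigInt` methods are assumptions added for illustration and are not prescribed by this option.

```typescript
class UInt64 {
  // Largest representable value: 2^64 - 1.
  static readonly MAX = 2n ** 64n - 1n;

  private readonly value: bigint;

  constructor(value: bigint | number) {
    // BigInt(...) itself throws for non-integer numbers such as 1.5.
    this.value = UInt64.checkBounds(BigInt(value));
  }

  // Mirror the AVM: fail on overflow or underflow rather than wrapping.
  private static checkBounds(v: bigint): bigint {
    if (v < 0n || v > UInt64.MAX) {
      throw new RangeError(`uint64 out of range: ${v}`);
    }
    return v;
  }

  add(other: UInt64): UInt64 {
    return new UInt64(this.value + other.value);
  }

  sub(other: UInt64): UInt64 {
    return new UInt64(this.value - other.value);
  }

  mul(other: UInt64): UInt64 {
    return new UInt64(this.value * other.value);
  }

  toBigInt(): bigint {
    return this.value;
  }
}

const p = new UInt64(500n);
const q = new UInt64(256);
const r = new UInt64(2);

// Equivalent of (p + q) * r - q, written with method calls.
const result = p.add(q).mul(r).sub(q);
console.log(result.toBigInt());
```

Even for a three-operator expression, the method-call form is noticeably harder to scan than the infix equivalent, which is the verbosity cost noted above.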
+
+### Option 3 - Use tagged/branded number types
+
+TypeScript allows you to intersect primitive types with a simple interface to brand a value in a way which is incompatible with another primitive branded with a different value within the type system.
+
+```ts
+// Constructors
+declare function UInt64(v): uint64
+declare function BigUint(v): biguint
+
+// Branded types
+type uint64 = bigint & { __type?: 'uint64' }
+type biguint = bigint & { __type?: 'biguint' }
+
+
+const a: uint64 = 323n // Declare with type annotation
+const b = UInt64(12n) // Declare with factory
+
+// c1 type is `bigint`, but we can mandate a type hint with the compiler (c2)
+const c1 = a + b
+const c2: uint64 = a + b
+// No TypeScript type error, but semantically ambiguous - is a+b performed as a biguint op or a uint64 one and then converted?
+// (We could detect this as a compiler error though)
+const c3: biguint = a + b
+
+// Type error on b: Argument of type 'uint64' is not assignable to parameter of type 'biguint'. Nice!
+test(a, b)
+
+function test(x: uint64, y: biguint) {
+  // ...
+}
+
+```
+
+This solution looks most like natural TypeScript / EcmaScript and results in math expressions that are much easier to read. The factory methods mimic native equivalents and should be familiar to existing developers.
+
+The drawbacks of this solution are:
+ - Less implicit type safety as TypeScript will infer the type of any binary math expression to be the base numeric type (`bigint` in the example above). A type annotation will be required wherever an identifier is declared and additional type checking will be required by the compiler to catch instances of assigning one numeric type to the other.
+ - In order to have 'run on Node.js' semantics of a `uint64` or `biguint` match 'run on the AVM', a transpiler will be required to wrap numeric operations in logic that checks for over and under flows.
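
The second drawback can be pictured as follows. This is a hypothetical sketch of what a transpiler might emit so that Node.js execution matches the AVM. The helper `checkUint64` and the shape of the emitted code are assumptions for illustration, not a description of an existing transpiler.

```typescript
type uint64 = bigint & { __type?: "uint64" };

const MAX_UINT64 = 2n ** 64n - 1n;

// Hypothetical runtime helper: fail on overflow or underflow, as the AVM would.
function checkUint64(v: bigint): uint64 {
  if (v < 0n || v > MAX_UINT64) {
    throw new RangeError(`uint64 out of range: ${v}`);
  }
  return v as uint64;
}

// What the developer writes:
//   function total(a: uint64, b: uint64, c: uint64): uint64 {
//     return a + b - c
//   }
//
// What a transpiler could emit, wrapping every intermediate result:
function total(a: uint64, b: uint64, c: uint64): uint64 {
  return checkUint64(checkUint64(a + b) - c);
}

const one = checkUint64(1n);
const two = checkUint64(2n);
const three = checkUint64(3n);
console.log(total(one, two, three));
```

Checking each intermediate result matters: with a single check at the end, `MAX_UINT64 + 1 - 1` would succeed on Node.js even though the addition would already have failed on the AVM.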
+
+A variation of the above with non-optional `__type` tags would prevent accidental implicit assignment errors, but require explicit casting on all ops:
+
+```ts
+declare function UInt64(v): uint64
+declare function BigUint(v): biguint
+
+type uint64 = bigint & { __type: 'uint64' }
+type biguint = bigint & { __type: 'biguint' }
+
+// Require factory or cast on declaration
+const a: uint64 = 323n as uint64
+const b = UInt64(12n)
+
+// Also require factory or cast on math
+let c2: uint64
+
+c2 = a + b // error: the result of `a + b` is a plain `bigint`
+c2 = UInt64(a + b) // ok
+c2 = (a + b) as uint64 // ok
+
+This introduces a degree of type safety at the expense of legibility.
+
+TealScript uses a similar approach to this, but uses `number` as the underlying type rather than `bigint`, which has the aforementioned downside of not being able to safely represent a 64-bit unsigned integer.
+
+
+## Preferred option
+
+TBD
+
+## Selected option
+
+TBD