Generic values (#9)
* redefining Value (and thus CESKM) to be more general

* much, much better readme

* readme bikeshedding

* undoing readme bikeshedding

* Kont<T> is now 'tagless'

* slowly but surely improving the readme; package.json

* a little more readme finesse
gatlin authored Apr 7, 2021
1 parent 65b7459 commit e180fc7
Showing 7 changed files with 522 additions and 507 deletions.
313 changes: 157 additions & 156 deletions README.md
@@ -4,9 +4,11 @@ Precursor is a small, experimental programming language implemented as a pure
TypeScript (or pure JavaScript) library.

You can read more details below in the *overview*, and you can even
[try it out in a live demonstration in your browser](https://niltag.net/code/precursor)
[try it out in a live demonstration in your browser][precursordemo].

Licensed under the `GPL-3`.
[precursordemo]: https://niltag.net/code/precursor

Licensed under the `GPL-3` where it can be, and the `WTFPL` elsewhere.

# build and install from source

@@ -17,201 +19,200 @@ and, in fact, must - simply by running:
npm i
```

# overview

What follows is my best attempt at a summary for anyone who wanders into this
repository and wants to know how to find out more.
I think the best way to get a feel for its usage is to take a look at the unit
tests in the `__tests__` directory.
# synopsis

---
## an attempt with words

Precursor is a small programming language which you may grow and build upon (a
"precursor," if you like).
*precursor*, if you like).

The default distribution consists of 3 components which work together "out of
the box".

## *CESKM* Evaluator
the box":

`ceskm.ts` defines a CESKM [machine][cekarticle] to evaluate Precursor.
It is called a *CESKM* machine because it consists of five components:

- **c**ontrol-string, the program expression being evaluated;
- **e**nvironment, a mapping from variable names to *addresses* or
*definitions*;
- **s**tore, a subsequent mapping from *addresses* to ***values***;
- **k**ontinuation, the current [continuation][contarticle]; and
- **m**eta stack, a control stack used in tandem with the continuation.
- a small [**call-by-push-value**][cbpvarticle] language, `Cbpv`, defined as a
data type that you can manipulate in code (`grammar.ts`);
- a [CESK][cekarticle]-based evaluator which operates on `Cbpv` objects
(`ceskm.ts`);
- a parser for an [s-expression][sexprarticle] syntax, which parses source code
`string`s into `Cbpv` values (`parser.ts`).

[cekarticle]: https://en.wikipedia.org/wiki/CEK_Machine
[contarticle]: https://en.wikipedia.org/wiki/Continuation
[cbpvarticle]: https://en.wikipedia.org/wiki/Call-by-push-value
[sexprarticle]: https://en.wikipedia.org/wiki/S-expression

The language grammar (below) can ultimately be thought of as the operating
instructions for this machine.
You can see examples of the syntax parsed by the default parser in
`__tests__/index.test.ts`.

The objective of a CESKM machine is to evaluate the **c**ontrol string down to
a value.
Precursor (currently) defines the following language of values, meant to
resemble JSON primitives (modified slightly for presentation):
## example

```typescript
type Value
= { tag: 'closure', exp: Cbpv, env: Env }
| { tag: 'continuation', kont: Kont }
| { tag: 'number', v: number }
| { tag: 'boolean', v: boolean }
| { tag: 'string' , v: string }
| { tag: 'record' , v: Record<string, Value> }
| { tag: 'array' , v: Value[] };
```
The following is an example usage of Precursor.
We will use the parser that comes with the library, and create an evaluator
that can compute with `number`, `boolean`, and `null` values.

In addition to numbers, booleans, strings, records, and arrays, we have
First, we sub-class `CESKM` and specify the values our machine can work with.

- *closures*: ongoing computations with a closed environment which have more
work to be done before they can produce a result `Value`, created with the
`!` operator (see `Grammar`); and
- *continuations*: continuations can be bound to variables using `shift`, and
this is what is inside that variable. If you don't know what a continuation
is, I refer you to the [article on the subject above][contarticle].
```typescript
import {
CESKM,
Value,
parse_cbpv // use the pre-fab s-expression parser
} from "precursor-ts";

### Step by step
import { strict as assert } from "assert";

The base class implements a protected method `step` which "purely" acts on a
*state* value consisting of the five components listed above.
The output of each `step` is used as the input to the next `step`; evaluation
terminates when `step` returns a `Value` type instead.
type Val = number | boolean | null ;

The public method `run` implements this algorithm, but you are free to override
it or supplement it with your own (for instance, you might want a "debug mode"
where the machine yields each state to a logging system for review).
// No type parameter here: `<Val>` on the class would shadow the `Val` alias.
class ExampleMachine extends CESKM<Val> {
constructor (program: string) { super(parse_cbpv(program)); }
```
## Grammar
Now we must override the methods `literal` and `primop`.
`grammar.ts` defines the Precursor grammar, `Cbpv`, as a plain-old-JSON type.
Here is that definition, modified slightly for presentation.
`literal` defines how "literal" values are to be converted into `Value`s.
A literal is something like a number (eg, `42`), a `"double-quoted string"`, or
one of the boolean symbols `#t` (true) and `#f` (false).
You decide which of these to accept and how to evaluate them literally.
```typescript
type Cbpv
/* Positive */
= { tag: 'cbpv_number' ; v: number } // eg, 5
| { tag: 'cbpv_boolean' ; v: boolean } // #t, #f
| { tag: 'cbpv_string' ; v: string } // "double-quotes only"
| { tag: 'cbpv_symbol' ; v: string } // immutable
| { tag: 'cbpv_primop' ; op: string; erands: Cbpv[] } // see below
| { tag: 'cbpv_suspend'; exp: Cbpv }
/* Negative */
| { tag: 'cbpv_apply'; op: Cbpv; erands: Cbpv[] } // eg, (op arg1 arg2)
| { tag: 'cbpv_abstract'; args: string[]; body: Cbpv } // eg, (λ (x) (...))
| { tag: 'cbpv_let'; v: string; exp: Cbpv; body: Cbpv } // (let x 5 (...))
| { tag: 'cbpv_letrec'; bindings: [string,Cbpv][]; body: Cbpv } // see below
| { tag: 'cbpv_if'; c: Cbpv; t: Cbpv; e : Cbpv } //the author is iffy on this one
| { tag: 'cbpv_resume'; v: Cbpv } // weird
| { tag: 'cbpv_reset'; exp: Cbpv } // weird
| { tag: 'cbpv_shift'; karg: string; body: Cbpv } // weird
;
protected literal(v: any): Value<Val> {
if ("number" === typeof v
|| "boolean" === typeof v
|| null === v)
{ return { v }; }
throw new Error(`${v} not a primitive value`);
}
```
`Cbpv` stands for [*call-by-push-value*][cbpvarticle], a language foundation
which is neither lazy **nor** strict.
Instead, term evaluation is handled explicitly by two operators: `!`
("suspend") and `?` ("resume").
`primop` defines the *primitive operations* ("primops") your machine can
perform on `Value`s.
The `CESKM` base class defines no primops: by default, the machine can only
"do" what you permit it to do.
[cbpvarticle]: https://en.wikipedia.org/wiki/Call-by-push-value
*Aside*: The built-in parser, by convention, treats all symbols beginning with
`prim:` as primitive operators, eg:
```
(prim:mul 1 2)

=>

{
"tag": "cbpv_primop",
"op": "prim:mul",
"erands": [
{
"tag": "cbpv_literal",
"v": 1
},
{
"tag": "cbpv_literal",
"v": 2
}
]
}
```
### A polarizing subject
There is no brilliant reason for this; it just keeps the interaction between
the parser and the evaluator simple in lieu of a more principled mechanism.
In call-by-push-value, terms are sorted into two (for lack of a better word, oy)
kinds:
```typescript
protected primop(op_sym: string, args: Value<Val>[]): Value<Val> {
switch (op_sym) {
case "prim:mul": {
if (! ("v" in args[0]) || ! ("v" in args[1]))
{ throw new Error(`arguments must be values`); }
if ("number" !== typeof args[0].v || "number" !== typeof args[1].v)
{ throw new Error(`arguments must be numbers`); }
let result: unknown = args[0].v * args[1].v;
return { v: result as Val };
}
// ... other prim ops
default: return super.primop(op_sym, args);
}
}
}
```

*Positive* terms are data: terms which require no further evaluation or work by
the machine in order to render as result values.
Literals (numbers, strings, booleans, etc), variables, and *primops* are all
positive.
Primitive operators are not complete terms by themselves - they aren't
variables you can pass around as an argument.
Think of them as the "assembly" instructions of your evaluator.
You can write functions that call primops and pass *those* around all day.

---

*Primitive operators* ("primops") are basic operations that the Precursor
machine can perform on data.
You might think of them as the basic instruction set for a CPU.
By default the `CESKM` class defines a handful of primops to manipulate the
basic data types but it is easy (and expected and encouraged) for you to add
your own; see the unit tests for a concrete example!
Having supplied the universe of result types and filled in how they relate to
literal expressions and what primitive operators are defined for them, you can
`run` your machine down to a `Value<Val>`.

*Negative* terms are those which express some **irreversible** work to be done.
For example, function abstraction (`cbpv_abstract` above) pops the top frame
from the argument stack; function application pushes a frame on it and
evaluates its (negative) operator; an `if` expression essentially chooses
between two continuations and throws one away; etc.
```typescript
const example_machine = new ExampleMachine(`
(letrec (
(square (λ (n)
(let n (? n) ; prim-op arguments must be *fully* evaluated.
(prim:mul n n)))) ; higher level languages might not expose primops
) ; directly.
((? square) 3) ; a function defined in a `letrec` is automatically
; "suspended" and must be "resumed" with `?` before
; applying it to arguments (in this case, `3`).
)
`);

`!` *suspends* a negative ("active," "ongoing") computation into a closure
value; `?` *resumes* suspended computations in order to evaluate them.
assert.deepStrictEqual(example_machine.run(), { v: 9 });
```
### To be continued
## are there data structures? a type system?
`shift` and `reset` are [delimited continuation][delimccarticle] operators.
A *continuation* is an abstract representation of the control state of the
program (according to Wikipedia).
It represents a point in the computation with a specified amount of remaining
work.
Ultimately I would like to include a type checker for `Cbpv` which supports
*linear call-by-push-value with graded coeffects.*
I'll let you look up the parts of that which interest you.
[delimccarticle]: https://en.wikipedia.org/wiki/Delimited_continuation
As for data structures,
When handled with care, these four operators are very powerful:
1. Nothing yet,
2. look at this:
```
```typescript
const example_machine = new ExampleMachine(`
(letrec (
(load (λ () (shift k
(! (λ (f) ((? (prim:record-get "load" f)) k))))))
(save (λ (v) (shift k
(! (λ (f) ((? (prim:record-get "save" f)) v k))))))
(return (λ (x) (shift k
(! (λ (_) (? x))))))
(run-state (λ (st comp)
(let handle (reset (? comp))
((? handle) (prim:record-new
"load" (! (λ (continue)
(let res (! (continue st))
((? run-state) st res))))
"save" (! (λ (v continue)
(let res (! (continue _))
((? run-state) v res)))))))))
(increment-state (λ ()
(let n ((? load))
(let _ ((? save) (prim:add n 1))
(let n-plus-1 ((? load))
((? return) n-plus-1))))))
(cons (λ (a b) (reset ((shift k k) a b))))
)
((? run-state) 255 (! ((? increment-state))))
(let p1 ((? cons) 3 #f)
p1)
)
; result: 256
`);
console.log(example_machine.run());
```
`!`, `?`, `shift`, and `reset` are here used to implement a small [effect
system][effectsysarticle], in this case modeling a mutable state effect.

[effectsysarticle]: https://en.wikipedia.org/wiki/Effect_system
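
For readers who don't read s-expressions fluently, here is a rough TypeScript
analogue of the state-effect example above.
It is only a sketch: it uses generators as a stand-in for `shift` / `reset`
(a generator is a one-shot continuation), which is *not* how Precursor
implements them, and the names `Request`, `Computation`, `incrementState`, and
`runState` are made up for illustration.

```typescript
// Requests a computation can make of its handler (hypothetical names).
type Request =
  | { tag: "load" }
  | { tag: "save"; value: number };

// A computation yields requests and is resumed with the handler's answer.
type Computation = Generator<Request, number, number>;

// Analogue of `increment-state`: load, save n + 1, load again, return.
function* incrementState(): Computation {
  const n = yield { tag: "load" };
  yield { tag: "save", value: n + 1 };
  const nPlus1 = yield { tag: "load" };
  return nPlus1;
}

// Analogue of `run-state`: answer each request while threading the state.
function runState(initial: number, comp: Computation): number {
  let state = initial;
  let next = comp.next(); // run up to the first request
  while (!next.done) {
    const req = next.value;
    if (req.tag === "save") { state = req.value; }
    // Resume the computation with the current state as the answer.
    next = comp.next(state);
  }
  return next.value;
}

const result = runState(255, incrementState());
console.log(result); // 256
```

The computation never touches the state directly; it only asks for it, and the
handler decides what "load" and "save" mean.
That separation is the part the Precursor version shares with this sketch.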

There's a lot more to say but not a lot of time!
Hopefully though if you are the sort of person whom this could potentially
excite, you'll be excited by now.

## Parser
This prints the following:
```json
{
"_kont": {
"_args": [
{
"v": 3
},
{
"v": false
}
],
"_kont": {}
}
}
```
Precursor comes with a parser for a small *s-expression* (think lisp) surface
language which builds the `Cbpv` expressions evaluated by `CESKM`.
The `parse_cbpv` function in `parser.ts` can be used without any fuss for
exactly this.
This captured a set of arguments being passed to a "function" `(shift k k)` and
converted them into what looks suspiciously like a composite or product value
of some kind.
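
If I'm reading the machine's behaviour correctly, the captured continuation is
roughly "a function waiting for something to apply to `3` and `#f`".
A hedged TypeScript analogue of that idea is a closure-encoded pair; `Pair`,
`cons`, `car`, and `cdr` below are illustrative names and not part of the
library.

```typescript
// A "pair" is a function waiting for a consumer of its two fields.
type Pair<A, B> = <R>(consumer: (a: A, b: B) => R) => R;

// Capture both arguments in a closure, much as the continuation captures them.
const cons = <A, B>(a: A, b: B): Pair<A, B> =>
  (consumer) => consumer(a, b);

// Projections "resume" the pair with a function that picks one field.
const car = <A, B>(p: Pair<A, B>): A => p((a, _b) => a);
const cdr = <A, B>(p: Pair<A, B>): B => p((_a, b) => b);

const p1 = cons(3, false);
console.log(car(p1), cdr(p1)); // 3 false
```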
This language is meant to closely mirror the structure of the grammar itself;
it's not supposed to win any awards for usability or ergonomics.
This is why it exists in a separate module and why `CESKM` consumes a custom
data type and not source code directly.
Stay tuned!
# Questions / Comments / Issues
# questions / comments
Feel free to email the author at `gatlin+precursor@niltag.net`.
You may also use the "Issues" feature on GitHub to report any feedback.
You can submit bugs through the Issues feature at
https://github.com/gatlin/precursor-ts .
You may also email me at `gatlin+precursor@niltag.net`.
I reserve the right to be terrible at replying; you should absolutely not take
it personally.