pomelo is a (hobby, WIP) compiler for Standard ML (SML) '97, implemented in Rust.
So far, I've implemented basic lexing, parsing, and lowering to an intermediate representation (essentially just desugaring derived forms and doing name resolution). My goal is to eventually extend this into a language server and an interpreter. This is extremely rough and incomplete, and I'm learning a lot as I go!
Please see the docs for more detailed information.
A pomelo is a fun (big!) citrus fruit, which also happens to have "ML" in its name.
SML is a statically typed functional language (with some imperative constructs) in the ML family, which also includes OCaml and which influenced later languages such as Haskell.
The ML family has a lot of really cool stuff:
- Hindley-Milner type inference, polymorphism, etc.
- Pattern matching
- Module system
so lots of opportunities for me to learn (/get really confused)!
For now, I don't plan to touch modules or imperative stuff (see Scope below). Just implementing the Core is plenty for me at this point.
The language definition and standard library:
SML compilers:
Here are some similar (but much more complete) projects by others:
Scope: the Core SML language only (so no modules). Also, no imperative stuff, except for maybe basic I/O (so no arrays or references).
The following is a summary of what I've done so far. Please see the docs for more detailed information.
pomelo-lex contains a basic lexer, influenced by rustc_lexer.
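As a rough illustration of the rustc_lexer style (not pomelo's actual code), a lexer of this kind can emit tokens that carry only a kind and a byte length, leaving the caller to slice the source text. The token kinds and function names below are hypothetical, and the sketch covers only a tiny subset of SML (keywords such as `val` are simply lexed as identifiers).

```rust
/// Token kinds for a tiny, hypothetical subset of SML.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Whitespace,
    Ident,
    Int,
    Symbol,
    Unknown,
}

/// A token stores only its kind and byte length, not its text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub len: usize,
}

/// Return the byte length of the longest prefix whose chars satisfy `pred`.
fn eat_while(input: &str, pred: impl Fn(char) -> bool) -> usize {
    input
        .char_indices()
        .find(|&(_, c)| !pred(c))
        .map_or(input.len(), |(i, _)| i)
}

/// Lex a single token from the start of a non-empty input.
fn first_token(input: &str) -> Token {
    let c = input.chars().next().expect("input must be non-empty");
    let (kind, len) = if c.is_whitespace() {
        (TokenKind::Whitespace, eat_while(input, char::is_whitespace))
    } else if c.is_ascii_alphabetic() {
        let is_ident_char = |c: char| c.is_ascii_alphanumeric() || c == '_' || c == '\'';
        (TokenKind::Ident, eat_while(input, is_ident_char))
    } else if c.is_ascii_digit() {
        (TokenKind::Int, eat_while(input, |c| c.is_ascii_digit()))
    } else if "=+-*<>():|".contains(c) {
        (TokenKind::Symbol, c.len_utf8())
    } else {
        (TokenKind::Unknown, c.len_utf8())
    };
    Token { kind, len }
}

/// Lex the whole input; every byte ends up in exactly one token.
pub fn tokenize(mut input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    while !input.is_empty() {
        let tok = first_token(input);
        input = &input[tok.len..];
        tokens.push(tok);
    }
    tokens
}
```

Because whitespace and unknown characters are kept as tokens, the token lengths always add up to the input length, which is what lets a later stage build a lossless syntax tree on top.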
pomelo-parse creates a concrete syntax tree.
This is modeled on rust-analyzer's parser and also uses rowan to create the concrete and abstract/typed syntax trees.
pomelo-hir defines the high-level intermediate representation (HIR).
The HIR is very similar to the AST, except that all of the derived forms (see Appendix A of the Definition) are desugared to their more basic equivalent forms (similar to how loops, etc. are desugared away in rustc's HIR).
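To make the desugaring concrete, here is a minimal sketch (not pomelo's actual types) that lowers three derived forms from Appendix A: `e1 andalso e2` becomes `if e1 then e2 else false`, `e1 orelse e2` becomes `if e1 then true else e2`, and `if e1 then e2 else e3` becomes `case e1 of true => e2 | false => e3`. The `Ast`, `Hir`, and `Pat` enums are hypothetical. For brevity the sketch stops at `case` (which the Definition desugars further into a function application) and treats `true` and `false` as literals rather than constructors.

```rust
/// Hypothetical surface syntax, including three derived forms.
#[derive(Debug, Clone, PartialEq)]
pub enum Ast {
    Var(String),
    Bool(bool),
    AndAlso(Box<Ast>, Box<Ast>),
    OrElse(Box<Ast>, Box<Ast>),
    If(Box<Ast>, Box<Ast>, Box<Ast>),
}

/// Hypothetical patterns; only boolean patterns are needed here.
#[derive(Debug, Clone, PartialEq)]
pub enum Pat {
    Bool(bool),
}

/// Hypothetical desugared form: no `andalso`, `orelse`, or `if`.
#[derive(Debug, Clone, PartialEq)]
pub enum Hir {
    Var(String),
    Bool(bool),
    Case(Box<Hir>, Vec<(Pat, Hir)>),
}

/// `if c then t else e`  ==>  `case c of true => t | false => e`
fn if_to_case(cond: Hir, then: Hir, els: Hir) -> Hir {
    Hir::Case(
        Box::new(cond),
        vec![(Pat::Bool(true), then), (Pat::Bool(false), els)],
    )
}

/// Lower the surface syntax, rewriting each derived form on the way.
pub fn lower(ast: &Ast) -> Hir {
    match ast {
        Ast::Var(name) => Hir::Var(name.clone()),
        Ast::Bool(b) => Hir::Bool(*b),
        // a andalso b  ==>  if a then b else false
        Ast::AndAlso(a, b) => if_to_case(lower(a), lower(b), Hir::Bool(false)),
        // a orelse b  ==>  if a then true else b
        Ast::OrElse(a, b) => if_to_case(lower(a), Hir::Bool(true), lower(b)),
        Ast::If(c, t, e) => if_to_case(lower(c), lower(t), lower(e)),
    }
}
```

The payoff is that later passes only need to handle `Case`, rather than four separate conditional forms.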
Pomelo's HIR is represented as a tree stored in an arena.
The arena is basically just a wrapper around a Vec; see pomelo-hir::arena or la_arena.
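The arena idea can be sketched in a few lines. This is a minimal illustration under my own naming (`Arena`, `Idx`, `alloc`, `get`), loosely in the spirit of la_arena rather than a copy of pomelo-hir::arena: nodes live in a `Vec`, and tree edges are typed indices instead of pointers.

```rust
use std::marker::PhantomData;

/// A typed index into an `Arena<T>`; cheap to copy and free of lifetimes.
pub struct Idx<T> {
    raw: u32,
    _ty: PhantomData<fn() -> T>,
}

// Implemented by hand so that `Idx<T>` is `Copy` even when `T` is not.
impl<T> Clone for Idx<T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T> Copy for Idx<T> {}

/// The arena itself: a thin wrapper around a `Vec`.
pub struct Arena<T> {
    data: Vec<T>,
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Arena { data: Vec::new() }
    }

    /// Store a value and return the index it can be looked up by.
    pub fn alloc(&mut self, value: T) -> Idx<T> {
        let raw = self.data.len() as u32;
        self.data.push(value);
        Idx { raw, _ty: PhantomData }
    }

    pub fn get(&self, idx: Idx<T>) -> &T {
        &self.data[idx.raw as usize]
    }
}

/// A hypothetical expression tree whose children are arena indices.
pub enum Expr {
    Int(i64),
    Add(Idx<Expr>, Idx<Expr>),
}

/// Walk the tree by following indices back into the arena.
pub fn eval(arena: &Arena<Expr>, idx: Idx<Expr>) -> i64 {
    match arena.get(idx) {
        Expr::Int(n) => *n,
        Expr::Add(l, r) => eval(arena, *l) + eval(arena, *r),
    }
}
```

Building `1 + 2` means allocating the two `Int` nodes first and then an `Add` node that refers to their indices; `eval` on that last index follows them back through the arena.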
pomelo-hir also contains the code for lowering from the AST, along with some name resolution: usages of variables or type constructors are annotated with references to the location in the HIR where their identifiers are defined.
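One common way to do this kind of name resolution is a stack of scopes, each mapping names to definition sites. The sketch below assumes that approach; `Resolver` and `DefId` are hypothetical names, and `DefId` stands in for whatever the HIR uses to point at a definition (for example, an arena index).

```rust
use std::collections::HashMap;

/// Hypothetical handle for a definition site in the HIR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DefId(pub usize);

/// A stack of scopes; the innermost scope is last.
pub struct Resolver {
    scopes: Vec<HashMap<String, DefId>>,
    next_id: usize,
}

impl Resolver {
    /// Start with a single top-level scope.
    pub fn new() -> Self {
        Resolver { scopes: vec![HashMap::new()], next_id: 0 }
    }

    /// Called when entering e.g. a `let ... in ... end` body.
    pub fn enter_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }

    pub fn exit_scope(&mut self) {
        self.scopes.pop();
    }

    /// Record a new binding in the innermost scope, shadowing outer ones.
    pub fn define(&mut self, name: &str) -> DefId {
        let id = DefId(self.next_id);
        self.next_id += 1;
        self.scopes
            .last_mut()
            .expect("at least one scope")
            .insert(name.to_string(), id);
        id
    }

    /// Look a usage up from the innermost scope outwards.
    pub fn resolve(&self, name: &str) -> Option<DefId> {
        self.scopes.iter().rev().find_map(|s| s.get(name).copied())
    }
}
```

For `val x = 1` followed by `let val x = 2 in x end`, the inner `x` resolves to the second definition, and after `exit_scope` a later `x` resolves to the first one again.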
pomelo-fmt
is an (incomplete) code formatter for Core SML.
This uses the algorithm from Derek Oppen's "Prettyprinting" (1980).
I did this mostly just for fun, although I do also want to use this to format the pretty-printed HIR (which is itself valid SML).
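For a feel of the core idea, here is a heavily simplified sketch. It is not Oppen's streaming scan/print algorithm and not pomelo-fmt's code: it works on a whole document tree and only implements "consistent" breaking, where a group is printed on one line if it fits in the remaining width and otherwise every break directly inside it becomes a newline. The `Doc` type and the indentation scheme are assumptions made for the example.

```rust
/// Hypothetical document tree for the sketch.
pub enum Doc {
    Text(String),
    /// A space if the enclosing group fits on the line, else a newline.
    Break,
    /// Child documents plus the extra indentation used when broken.
    Group(Vec<Doc>, usize),
}

/// Width of a document if it were printed on a single line.
fn flat_width(doc: &Doc) -> usize {
    match doc {
        Doc::Text(s) => s.chars().count(),
        Doc::Break => 1,
        Doc::Group(items, _) => items.iter().map(flat_width).sum(),
    }
}

fn render(
    doc: &Doc,
    width: usize,
    indent: usize,
    broken: bool,
    col: &mut usize,
    out: &mut String,
) {
    match doc {
        Doc::Text(s) => {
            out.push_str(s);
            *col += s.chars().count();
        }
        // The enclosing group did not fit: start a new, indented line.
        Doc::Break if broken => {
            out.push('\n');
            out.push_str(&" ".repeat(indent));
            *col = indent;
        }
        Doc::Break => {
            out.push(' ');
            *col += 1;
        }
        Doc::Group(items, extra) => {
            // Decide once per group whether it fits in the remaining width.
            let fits = *col + flat_width(doc) <= width;
            for item in items {
                render(item, width, indent + extra, !fits, col, out);
            }
        }
    }
}

/// Lay a document out within `width` columns.
pub fn pretty(doc: &Doc, width: usize) -> String {
    let mut out = String::new();
    let mut col = 0;
    render(doc, width, 0, false, &mut col, &mut out);
    out
}
```

A group holding `f`, `arg1`, and `arg2` separated by breaks prints as `f arg1 arg2` when it fits, and as three lines with the arguments indented when it does not. Oppen's algorithm reaches similar decisions while streaming tokens through a bounded buffer, which this sketch does not attempt.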
This is the part I've been looking forward to the most! Currently reading up on Hindley-Milner, etc.
As of now, I do not plan to use salsa or any other kind of fancy system for caching completed queries, but this could be added later for more fun/learning/perf.