Skip to content

Latest commit

 

History

History
684 lines (613 loc) · 25.2 KB

DenotativeLC-Tangle.org

File metadata and controls

684 lines (613 loc) · 25.2 KB

Design and Implementation of the DenotativeLC Programming Language

Getting Started

What is DenotativeLC

DenotativeLC is a lambda calculus with algebraic data types developed in a unique style which I call “Denotative Relational Literate Programming” with a novel emedding technique I call “Tagless-complete”.

Try it out

  • Requirements: Emacs, Cabal, Haskell

Head to the Main function heading and do C-c, use the following GHCi interperter.

Example

fst (v, s v)

Licenses

Prose

  • CC-BY-SA

A copy of the license is included in the repository.

Code

  • AGPL v3

A copy of the license is included in the repository.

Motivation / Goals

Writing in Denotative Relational Literate Programming

Denotational Design

Denotational design[fn:1] is a methodology developed by Conal Elliott, to improve the state of software development, currently most software is developed without any design and a dose of (let’s just do it), this leaves a bunch of problems like technical debt, leaky abstraction and much more, denotational design at it’s essence is to make a mathematical model of your types and operations by simple math that has been discovered for other reasons, for example we can denotate a stack data structure to be a function that takes in a natural number and returns an a, so Nat -> a, then we will denotate the functions that use the Stack like empty for an empty stack.

Literate Programming

Literate programming[fn:2] is a paradigm where you mix code and prose (concerning the program at hand) at the same location, this has many benefits for example, readability, it is much easier for the reader to understand the underlying code compared to just inline comments or no comments at all, I think this is an extremely relatable factor for a lot us programmers, as Harold Abelson said:

Programs must be written for people to read, and only incidentally for machines to execute.

I am personally interested in exploring combining denotational design and literate programming together, where you specify your design with denotational design and only then, think about the implementation.

Functional Relational Programming

Okay, so we have denotational design and we use literate programming to put it at the forefront, we need a structure or better yet an architecture, for the implementation of said program, this is where functional relational programming[fn:3] comes at the foreground, it is an architecture designed to reduce complexity via using functional and relational programming, it turns out that it actually works nicely with literate programming and by extension denotational design.

Combining all three

Literate programming provides the structure needed, denotational design provides the design and functional relational programming provides an efficient implementation architecture.

  • Structure: Literate Programming
  • Design: Denotational Design
  • Implementation: Functional Relational Programming

Tagless-complete embedding

I went on a quest to find a way to to make a programming language using denotational semantics, after some time, I discovered tagless-final, let’s call it executable denotational semantics.

Tagless-final

Tagless-final[fn:4] is a technique of embedding domain specific languages (DSLs) in to the host langauge (Haskell in my case), instead of using data constructors as a representation of DSL terms, you typically use functions inside of a typeclass (or ML modules), this allows excellent extensibility, it is a fine solution to the expression problem, the main attraction for me is that from the author’s words:

Doing a tagless-final embedding is literally writing a denotational semantics for the DSL – in a host programming language rather than on paper.

That quote got me excited and after understanding the technique, I can say it does a really good job at it. There is one huge pitfall though for creating DSLs and that’s that, A tagless-final embedding is a shallow embedding, meaning we have no AST at our disposal, for applications like parsing or optimization.

The path for a solution

I have looked into many papers to solve this problem like ~Deep embedding with class~[fn:5] and ~Combining Shallow and Deep EDSLs~[fn:6], the former uses an AST data type for a representation and includes a backdoor via existenial type for extension, the issue comes from the fact that it loses the denotational semantic properties that makes tagless-final so attractive, while the latter starts with a deep embedding with a shallow wrapper on top, which automatically makes the technique non-applicable for my needs, a bad solution for this problem is to lift term level functions to the type level and embed those functions to constructors, while this works, it is a far cry from a realistic and satisfactory programming experience since in Haskell, type level programming is inadequate, with no type level lambdas or first class type level functions (called type families).

What is tagless-complete?

Tagless-complete is a combination of tagless-final and tagless-initial, tagless-inital is equivalent to tagless-final except it uses a GADT for the semantic functions (or constructors in this case), we then translate into a final embedding via a translation function, finally we evaluate via tagless-final.

Design

Functionality

DenotativeLC is a simply typed lambda calculus with algebraic data types, the following defines the inference rules in DenotativeLC.

For Expressions:

Variables

Variables are de bruijn indice.

static/img/var.png

Successor

The successor of a variable. static/img/succ.png

Abstractions

Lambda abstraction.

static/img/lam.png

Applications

Lambda application. static/img/app.png

Pair

Pair of two expressions. static/img/pair.png

Fst

First of the pair. static/img/fst.png

Snd

Secondd of the pair. static/img/snd.png

Unit

The unit value. static/img/unit.png

Inr

The right of Either. static/img/inr.png

Inl

The left of Either. static/img/inl.png

Specification

AST

Types

Expressions

Variables

Similar as in the types heading we can also use Haskell’s variable semantics therefore we don’t have to specify variables in our language. Variables are de bruijn indices, we can specify them as:

variable0 :: wrap (h, a) a

And since we use de bruijn we have to specify the inductive case as:

variableSucc :: wrap h a -> wrap (h, any) a
Abstractions

Abstractions correspond to function definitions, which we can think of as a function that takes an indentifier and an expression:

abstraction :: wrap (env, a) b -> wrap env (a -> b)
Application

Application is just function application and can be specified as:

application :: wrap env (a -> b) -> wrap env a -> wrap env b
Pair

Pair is the combination of two types, in Haskell it is the (,) type.

pair :: wrap env a -> wrap env b -> wrap env (a, b)
Fst

fst is grabbing the first value of the pair type.

fst :: wrap env (a, b) -> wrap env a
Snd

Snd is identical to Fst except it grabs the second value.

snd :: wrap env (a, b) -> wrap env b
Unit

Unit is the terminal object of the CCC, we can describe the function as:

unit :: wrap env ()
Inl

inl comes from co-cartesisan, it is the dual of cartesian, in category theory we can easily receieve a dual from just flipping the arrows, this is an example of that, instead of the pair type we get Either.

inl :: wrap h a -> wrap h (Either a b)
Inr

inr is the right side equivalent to inl.

inr :: wrap h b -> wrap h (Either a b)

Parser

The parser is another common component of a programming language, in DenotativeLC, with parser combinators we can build smaller parsers into one big one therefore simplifying my work so that I only need need to specify one function but first we need to build a parser type that we will use.

Parser type

The following is a parser type with Void for errors and String being the streaming type, Parsec comes from the Megaparsec library.

type Parser = Parsec Void String
Parsing specification

We can easily specify a single function from the combined lower level parsers with the following:

parseDenotativeLC :: Parser (wrap h a)

Evaluator

The evaluator is a function that evaluates terms in the closed term

eval :: wrap () a -> a

The full picture

The full API:

wrap env a
variable0 :: wrap (env, a) a
variableSucc :: wrap env a -> wrap (env, any) a
application :: wrap env (a -> b) -> wrap env a -> wrap env b
abstraction :: wrap (env, a) b -> wrap env (a -> b)
pair :: wrap env a -> wrap env b -> wrap env (a, b)
fst :: wrap env (a, b) -> wrap env a
snd :: wrap env (a, b) -> wrap env b
unit :: wrap env ()
inl :: wrap h a -> wrap h (Either a b)
inr :: wrap h b -> wrap h (Either a b)
parseDenotativeLC :: Parser (wrap h a)
eval :: wrap () a -> a  

Denotation

We are going to make semantic functions that map the lambda calculus world to a closed cartesian category[fn:7], first we have to define the semantic domain though.

AST

Note: we are building a denotation with the environment being tuples instead of a type-level list since that is much simpler to reason with (it’s also the way I started it before switching).

Types

The Semantic Domain: The Functor Category

You can pick any closed cartesian category really but the functor category is simple, so it works out. So let’s model it.

_ :: Lambda (wrap a b) => wrap a b -> (a -> b)

Expressions

Variables
exr

In compiling to categories exr, correspond to the snd function in a pair, it’s in the cartesian part of the CCC, it’s defined as:

exr (a, b) = b
exl

exl corresponds to fst and it’s defined as:

exl (a, b) = a
Typing context

Typing context is a tuple that contains the term and it’s type, it looks like this: .

(.)

Simple composition. The composition primitive is necessary for a category to be a category so we can use this primitive.

Back to variables

Generally variabels correspond to identity, id but since we have the typing environment, it infact corresponds to exr.

variables0 = exr

We also have to inductive case to worry about, which can be defined beautifully as:

variablesSucc e1 = e1 . exl 
Abstractions
Curry

Curry is a higher-order function that takes in a function: (a, b) -> c and curries it to be: a -> b -> c. It’s notion in the CCC is the closed part focusing on the expontential type (the function type).

Back to the abstraction function

Abstraction in the tagless-final paper is just curry but it’s type arguments a and b are flipped, I prefer to use the curry semantics, rather than add new functions, so we must consider that the typing context is unsual where the type identifer is the first and the added argument is the second. exl extracts the first element of the tuple and we use the second argument of abstraction to apply the function therefore we gat a function a -> b.

abstraction e1 = curry e1
Applications
△ operator

The operator takes in two terms and constructs a function that is a tuple of those functions, we can specify it as:

f ~△ g = \x -> (f x, g x)

It’s notion is in cartesian part of CCC and it’s the introduction form, cartesian adds products to the category.

apply

apply is a function that takes in a tuple and apply’s the first term to the second. apply is in the closed part of CCC.

Back to application

We have what we need to make denotation.

application a b = apply . a  b
Pair

The corresponds perfectly as the introduction form to the pair.

pair e1 e2 = e1  e2

All the functions concerning products is the cartesian part of the CCC, which has introduction and projections.

Fst

Fst is exl.

fst e1 = exl e1  
Snd

Snd is exr

snd e1 = exr e1  
Unit

The unit function corresponds to the it function earlier.

unit e1 = it e1
Inl

inl and inr have identical representations therefore their denotations are simple.

inl e1 = inl e1
Inr
inr e1 = inr e1

Parsing

Considering, Megaparsec (the library that am using) does not have a denotation in their documentation, I can not in good conscience give a denotation to the parsing function at hand.

Evaluator

The evaluator (for evaluation rather pretty-printing or other purposes) takes in expressions in closed terms meaning only well-typed, we can define the function by applying the argument to unit.

eval e = e ()

The full picture

This shows the complete denotation, I think it shows the beauty and elegance of denotational design, combined with literate programming.

_ :: Lambda (wrap a b) => wrap a b -> (a -> b)
variables0 = exr
abstraction e1 = curry e1
variablesSucc e1 = e1 . exl   
application a b = apply . a  b
pair e1 e2 = e1  e2
fst e1 = exl e1
snd e1 = exr e1
unit e1 = it e1
inl e1 = inl e1
inr e1 = inr e1
eval e = e ()  

Implementation

Infrastructure

Add libraries

Run bash to install MegaParsec.

cabal install --lib megaparsec

Language extensions

Am using advanced GHC extensions to mostly compute at the type level per the requirements of Denotative embedding.

{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE PartialTypeSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE PartialTypeSignatures #-}

Imports

Load imports

GHCi requires us to load imports before using them.

:set -package base
:set -package megaparsec

Importing a parser library and type level programming libraries.

import Text.Megaparsec
import Text.Megaparsec.Char
import Data.Void  

Multi-line

This options allows literate programming with Haskell to be much better where it allows to make multi-line functions, (org-babel connects to ghci).

:set +m

Evaluate everything

Main function

<<extensions>>

module DenotativeLC where

<<imports>>  
<<category-classes>>
<<ast-class>>
<<category-instances>>
<<reader>>
<<ast-instance>>
<<eval>>
<<parser>>
<<translator>>

main :: IO ()
main = do
   line <- getLine
   case parse parseDenotativeLC "" line of
     Right t -> do
       let val = cps t (const undefined)
       let evaluated = eval $ toFinal val
       print evaluated
       main
     Left err -> print err

Essential State

Types

Function type

The main type that we are going to use is the function type (->), it comes built in with Haskell.

Relations

In the out of the tar pit paper, the authors suggest only using relations and more generally the relational algebra for the state part of a program, we adhere to the paper by using record types analogously as relations. As I said before, record types in Haskell can be analogous to relations (tables in SQL), infact, this approach is used in Persistent which is the most popular ORM in Haskell and the native Haskell database Project-M36 (check this project out, it’s really underrated). The main relation is the R relation which has one pair, unR is the attribute’s name and it’s type is the function type. Using the deriving functionality we can show that Reader is actually the a closed bicartesian category. Let’s define it:

newtype Reader h a = MkReader {unReader :: h -> a} deriving (Category, Cartesian, CoCartesian, Closed, Terminal)

The R relation is actually isomorphic to the function type since they are representially the same.

Essential Logic

AST

Classes

Category

First things first, we should show that Reader is actually a bi-cartesian closed category.

class Category wrap where
  id' :: wrap a a
  (<<) :: (wrap b c) -> (wrap a b) -> (wrap a c)
class Cartesian wrap where
  triangle :: (wrap a c) -> (wrap a d) -> (wrap a (c, d))
  exl :: wrap (a, b) a
  exr :: wrap (a, b) b
class CoCartesian wrap where
  inl :: wrap h a -> wrap h (Either a b)
  inr :: wrap h b -> wrap h (Either a b)

class Closed wrap where
  apply :: wrap ((a -> b), a) b
  curry' :: (wrap (a, b)  c) -> (wrap a (b -> c))
  uncurry' :: (wrap a (b -> c)) -> (wrap (a, b) c)

class Terminal wrap where
  it' :: a `wrap` ()
AST

We can now finally make a typeclass to represent each term in our language, we are using HOAS.W e can use the homomorphism principle to show the denotation more clearly.

class (Category wrap, Cartesian wrap, CoCartesian wrap, Closed wrap, Terminal wrap) => AST wrap where
  variable0 :: wrap (h, a) a
  variableSucc :: wrap h a -> wrap (h, any) a
  abstraction :: wrap (h, a) b -> wrap h (a -> b)
  application :: wrap h (a -> b) -> wrap h a -> wrap h b
  pair :: wrap h a -> wrap h b -> wrap h (a, b)
  fst' :: wrap h (a, b) -> wrap h a
  snd' :: wrap h (a, b) -> wrap h b
  unit :: wrap h ()
  inl' :: wrap h a -> wrap h (Either a b)
  inr' :: wrap h b -> wrap h (Either a b)
-- Necessary comment for the where clause to be closed, ob-haskell should be improved :)

Instances and Semantic Functions

Category

Since (->) and Reader are isomorphic, we can write instances of (->) and use the deriving machinery to get it for free for Reader.

instance Category (->) where
  id' =  id
  (<<) =  (.)
instance Cartesian (->) where
  triangle f g = \x -> (f x, g x)
  exl = fst
  exr = snd
instance CoCartesian (->) where
  inl e = \h -> Left (e h)
  inr e = \h -> Right (e h)
instance Closed (->) where
  apply (f, x) = f x
  curry' = curry 
  uncurry' = uncurry
instance Terminal (->) where
  it' = \a -> ()
AST

Now that we have sufficiently shown that DenotativeLC corresponds to a closed bi-cartesisan category, we can make a tagless-final typeclass for the AST.

fe1 e1 = \x -> (unReader e1 x)
fe2 e2 = \x -> (unReader e2 x)

instance AST Reader where
  variable0 = MkReader snd
  variableSucc v = MkReader $ unReader v . fst
  abstraction e1 = MkReader $ curry (unReader e1)
  application e1 e2 = MkReader $ apply . (triangle (fe1 e1) (fe2 e2))
  pair e1 e2 = MkReader $ triangle (fe1 e1) (fe2 e2)
  fst' e1 = MkReader $ \h -> fst $ (unReader e1 h)
  snd' e1 = MkReader $ \h -> snd $ (unReader e1 h)
  unit = MkReader $ it'
  inl' f = MkReader $ (\env -> Left (unReader f env))  
  inr' f = MkReader $ (\env -> Right (unReader f env))

--       

Evaluator

Evaluation is simple with just the function:

eval e = unReader e ()

Accidental State And Control

Other (Interfacing)

Parser

To be able to parse, we need to have a an AST representation which tagless-final does not have, (remember we are using functions instead of constructors). Am going to use a GADT as an initial encoding, after that use Dynamic to fix type differences and only then can we move to functions and tagless-final.

Tree representation

Initial is the GADT that I was talking about, it is nearly equivalent to the AST typeclass except we have constructors instead of functions.

data Initial i h a where
  Eval :: i h a -> Initial i h a
  Variable :: Initial i (h, a) a
  VariableSucc :: Initial i h a -> Initial i (h, any) a
  Abstraction :: Initial i (h, a) b -> Initial i h (a -> b)
  Application :: Initial i h (a -> b) -> Initial i h a -> Initial i h b
  Pair :: Initial i h a -> Initial i h b -> Initial i h (a, b)
  Fst :: Initial i h (a, b) -> Initial i h a
  Snd :: Initial i h (a, b) -> Initial i h b
  Unit :: Initial i h ()
  Inl :: Initial i h a -> Initial i h (Either a b)
  Inr :: Initial i h b -> Initial i h (Either a b)
--

We also need an eliminator to get that type back.

cps :: Exists -> (forall i h a. Initial i h a -> r) -> r
cps (Symantic t) k = k t
 

Back to parser

Simple parser combinations, we are not using the R type but instead we are overloading the operations

<<exist>>
<<cps>>
data Wrap a = A a | B a | C a | D a | F a

type Parser = Parsec Void String  
-- parseVar :: Parser Exists
-- parseVar = string "v" *> pure (Symantic Variable)
parseVar :: Parser Exists
parseVar = string "v" *> pure (Symantic Variable)


parseVariableSucc = do
  string "s"
  i <- parseVar <|> parseVariableSucc
  let b = cps i (const Variable)
  pure (Symantic (VariableSucc b))
  
parseAbstraction :: Parser Exists
parseAbstraction = do
 char '\\'
 space1
 v <- parseVar
 let b = cps v (const Variable)
 char '.'
 pure (Symantic (Abstraction b))
  
parseApplication :: Parser Exists 
parseApplication = do
  a <- parseAbstraction
  b  <- parseVar
  let (x, y) = (cps a (const undefined), cps b (const undefined))
  pure (Symantic (Application x y))

parsePair :: Parser Exists
parsePair = do
  char '('
  space1
  a <- parseDenotativeLC
  let b = cps a (const undefined)
  char ','
  space1
  c <- parseDenotativeLC
  let d = cps c (const undefined)
  char ')'
  pure (Symantic (Pair b d))
  
  
parseFst :: Parser Exists
parseFst = do
  pair <- parsePair
  let a = cps pair (const undefined)
  pure (Symantic (Fst a))

parseSnd :: Parser Exists  
parseSnd = parseFst  
  
parseUnit :: Parser Exists
parseUnit = string "unit" *> pure (Symantic Unit)

parseInl :: Parser Exists
parseInl = do
  string "left"
  space1
  a <- parseDenotativeLC
  let b = cps a (const undefined)
  pure (Symantic (Inl b))
  
parseInr :: Parser Exists
parseInr = do
  string "right"
  space1
  a <- parseDenotativeLC
  let b = cps a (const undefined)
  pure (Symantic (Inl b))

parseDenotativeLC :: Parser Exists
parseDenotativeLC = parseVar <|> parseVariableSucc <|> parseAbstraction <|> parseApplication <|> parsePair <|> parseFst <|> parseSnd <|> parseInl <|> parseInr

Translation to tagless-final (Translator)

Now that we have an AST, we can then use tagless-final, we just have to translate the initial to the final first.

toFinal :: AST wrap => forall h a. Initial wrap h a -> wrap h a
toFinal (Eval i) = i
toFinal (Variable) = variable0
toFinal (VariableSucc a) = variableSucc (toFinal a)
toFinal (Application a b) = application (toFinal a) (toFinal b)
toFinal (Abstraction a) = abstraction (toFinal a)
toFinal (Pair a b) = pair (toFinal a) (toFinal b)
toFinal (Fst a) = fst' (toFinal a)
toFinal (Snd b) = snd' (toFinal b)
toFinal (Unit) = unit
toFinal (Inl a) = inl' (toFinal a) 
toFinal (Inr a) = inr' (toFinal a)   

Footnotes

[fn:1]: https://www.typetheoryforall.com/2022/08/04/21-Conal-Eliott-2.html [fn:2]: https://en.wikipedia.org/wiki/Literate_programming [fn:3]: https://okmij.org/ftp/tagless-final/course/lecture.pdf [fn:4]: https://curtclifton.net/papers/MoseleyMarks06a.pdf [fn:5]: https://dl.acm.org/doi/10.1007/978-3-031-21314-4_3 [fn:6]: https://emilaxelsson.github.io/documents/svenningsson2013combining.pdf [fn:7]: https://keikun555.github.io/documents/lambda.pdf