-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A shallow embedding of K definitions into Lean 4 #4552
Comments
A small script to print the AST for a
output
|
A manual encoding of |
Lean 4 features for translationThe following features have been used to manually translate K's Imp semantics to Lean 4. Some of what follows is outlined above. This is to have a bullet-style list of what is needed for the translation. Type definitionMechanisms we'll need to define types. Inductive TypesThese are particularly useful for user-defined syntax. They can express the productions needed to encode the semantics of Imp. Mutually Inductive TypesEven if the language-specific productions can be modeled with inductive types, K's internals must be defined using mutually inductive types. Example: StructuresStructures will likely have some use. Type ClassesThese are used to overload symbols. We want to overload symbols for injections of one sort into another, and we'll likely want to overload more symbols. Native TypesWhen possible, we should resort to Lean's native types such as Proving Facilities and ChecksWe could also auto-generate some proving facilities to make proving with the generated Lean more time efficient. Additionally, we should have checks in place, in the form of theorems, to ensure that the generated translation makes sense. For instance, we should have theorems ensuring the generated injections satisfy the diamond identity described above. NicetiesSome mechanisms in Lean can help a lot (in my opinion) in the long run when using the generated translation. NotationLean allows the definition of custom notation. Apart from prettifying things, this can help de-clutter horrid definitions and make working with the translation less cognitively taxing. CoercionsIn addition to notation, coercions can also significantly impact the syntactic overhead that one needs to go through to read a term. We should consider which coercions to set by default and how/if we want to enable optional coercions. Although the latter probably doesn't matter at this early stage. AttributesWe will probably want to add attributes such as |
Manual translation of IMP semantics to Lean 4import Lean.Data.AssocList
namespace KoreToLean
open Lean.AssocList
abbrev KId := String
-- Missing: syntax KResult ::= Value
/- TRANSLATION OF -/
/- syntax Expr ::= ... -/
/- Note that the production Value doesn't appear in definition.kore -/
inductive Expr where
| fromInt : Int → Expr -- Value (Int)
| fromBool : Bool → Expr -- Value (Bool)
| fromId : KId → Expr -- Id
| brackets : Expr → Expr -- "(" Expr ")"
| minus : Expr → Expr -- "-" Expr
| not : Expr → Expr -- "!" Expr
| mul : Expr → Expr → Expr -- Expr "*" Expr
| div : Expr → Expr → Expr -- Expr "/" Expr
| add : Expr → Expr → Expr -- Expr "+" Expr
| sub : Expr → Expr → Expr -- Expr "-" Expr
| geq : Expr → Expr → Expr -- Expr ">=" Expr
| gt : Expr → Expr → Expr -- Expr ">" Expr
| leq : Expr → Expr → Expr -- Expr "<=" Expr
| lt : Expr → Expr → Expr -- Expr "<" Expr
| eq : Expr → Expr → Expr -- Expr "==" Expr
| neq : Expr → Expr → Expr -- Expr "!=" Expr
| and : Expr → Expr → Expr -- Expr "&&" Expr
| or : Expr → Expr → Expr -- Expr "||" Expr
deriving Repr, BEq
instance (priority := low): Coe Int Expr where
coe v := Expr.fromInt v
instance (priority := low): Coe Bool Expr where
coe v := Expr.fromBool v
instance (priority := low): Coe String Expr where
coe s := Expr.fromId s
instance (priority := low): OfNat Expr n where
ofNat := Expr.fromInt n
example: 3 = Expr.fromInt 3 := rfl
/- This notation could be inferred from the syntax declaration
Note that for common symbols such as * we resort to *-/
/- Notation precedence is made up, the highest is 1024 -/
notation:max "⟦"v"⟧" => Expr.value v
notation:max "(" e ")'" => Expr.brackets e
notation:max e"⁻¹" => Expr.minus e
notation:max "!'" e:40 => Expr.not e
infixl:200 "*'" => Expr.mul
infixl:250 "/'" => Expr.div
infixl:150 "+'" => Expr.add
infixl:150 "-'" => Expr.sub
infixl:100 ">='" => Expr.geq
infixl:100 ">'" => Expr.gt
infixl:100 "<='" => Expr.leq
infixl:100 "<'" => Expr.lt
infixl:100 "=='" => Expr.eq
infixl:100 "!='" => Expr.neq
infixl:150 "&&'" => Expr.and
infixl:150 "||'" => Expr.or
/- Now we can write expressions such as -/
#check 3 +' true /' false
#check (false⁻¹ ==' true)' ==' 4
inductive Stmt where
| assign : KId -> Expr -> Stmt
| if_else : Expr -> Stmt -> Stmt -> Stmt
| if_single : Expr -> Stmt -> Stmt
| while : Expr -> Stmt -> Stmt
| curly_brakets : Stmt -> Stmt
| empty_brakets : Stmt
| seq_stmt : Stmt -> Stmt -> Stmt -- check priority
deriving Repr, BEq
notation kid "=" e ";" => Stmt.assign kid e
notation "if(" e ")" s₁:200 "else" s₂:200 => Stmt.if_else e s₁ s₂
notation "if(" e ")" s:200 => Stmt.if_single e s
notation "while(" e:200 ")" s:200 => Stmt.while e s
notation "{" s "}" => Stmt.curly_brakets s
notation "{}" => Stmt.empty_brakets
notation:70 s₁:70 "⁀" s₂:70 => Stmt.seq_stmt s₁ s₂
#check "x" = 3;
/- The following should be subsorts of KItem:
EnvCell,
Id,
GeneratedTopCell,
GeneratedCounterCell,
KCell,
Int,
Bool,
Map,
Expr,
Stmt
But we only add the following:
Id,
Int,
Bool,
Expr,
Stmt
-/
inductive KItem where
| inj_Int : Int → KItem
| inj_Bool : Bool → KItem
| inj_Id : String → KItem
| inj_Expr : Expr → KItem
| inj_Stmt : Stmt → KItem
deriving Repr, BEq
instance: OfNat KItem n where
ofNat := KItem.inj_Int n
/- We need to define the canonical injections from any sort to KItem
Canonical injections will be represented by the inj type class -/
/- NOTE: Instances of Inj classes will all have the @[simp] attribute! -/
class Inj (a b : Type) where
inj : a → b
abbrev InjKItem (a : Type) := Inj a KItem
abbrev InjExpr (a : Type) := Inj a Expr
@[simp]
def injExpr {α : Type} [i : InjExpr α] (a : α) : Expr := i.inj a
/- Expr injections for Int, Bool and Nat -/
@[simp]
instance: InjExpr Int where
inj := Expr.fromInt
@[simp]
instance: InjExpr Bool where
inj := Expr.fromBool
@[simp]
instance: InjExpr KId where
inj := Expr.fromId
instance: Coe Int Expr where
coe := injExpr
instance: Coe Bool Expr where
coe := injExpr
instance: Coe KId Expr where
coe := injExpr
@[simp]
def injKItem {α : Type} [i : InjKItem α] (a : α) : KItem := i.inj a
/- KItem injections for Int, Bool and Nat -/
@[simp]
instance: InjKItem Int where
inj := KItem.inj_Int
@[simp]
instance: InjKItem Bool where
inj := KItem.inj_Bool
@[simp]
instance: InjKItem KId where
inj := KItem.inj_Id
instance: Coe Int KItem where
coe := injKItem
instance: Coe Bool KItem where
coe := injKItem
instance: Coe KId KItem where
coe := injKItem
/- KItem injections for Expr -/
@[simp]
instance: InjKItem Expr where
inj expr := match expr with
| Expr.fromInt e => KItem.inj_Int e
| Expr.fromBool e => KItem.inj_Bool e
| Expr.fromId e => KItem.inj_Id e
| e => KItem.inj_Expr e
instance: Coe Expr KItem where
coe := injKItem
#check (3 +' true /' false : Expr)
#check (3 +' true /' false : KItem)
/- With the current definition we have the following triangle:
KItem
/ \
Expr |
\ /
Int/Bool/Id
In which an integer can be considered a KItem via being a expresison or as an integer. The following theorems show that
inj{Expr,KItem} ∘ inj{Int,Expr} = inj{Int,KItem}, i.e.
injKItem ∘ injExpr = injKItem for the domains Int/Bool/Id
-/
set_option pp.all false -- change to `true` for full terms
theorem inj_Int_sound (n : Int) : ((n : Expr) : KItem) = (n : KItem) := by simp
theorem inj_Bool_sound (b : Bool) : ((b : Expr) : KItem) = (b : KItem) := by simp
theorem inj_Id_sound (s : KId) : ((s : Expr) : KItem) = (s : KItem) := by simp
example: 3 = ((3 : Expr) : KItem) := rfl
example: 3 = KItem.inj_Int 3 := rfl
/- Note that 3 : KItem is KItem.inj_Int 3.
With this formalization, elements of the form
`KItem.inj_Expr (Expr.fromInt n)` are dangling -/
example: 3 ≠ KItem.inj_Expr (3 : Int) := KItem.noConfusion
/- KItem injections for Stmt -/
@[simp]
instance: InjKItem Stmt where
inj := KItem.inj_Stmt
instance: Coe Stmt KItem where
coe := injKItem
#check ("x" = 3; : KItem)
#check (3 : KItem)
/- K structures like K, Map, Cells... -/
/- Note that these structures don't belong to the sort KItem -/
abbrev K := List KItem
notation ".K" => @List.nil KItem
infixr:10 "~>" => @List.cons KItem -- Shouldn't have high priority
-- Coercion makes a singleton list
instance : Coe KItem K where
coe kitem := kitem ~> .K
example: ((4 : Int) : KItem) = ((4 : Expr) : KItem) := rfl
abbrev Map := Lean.AssocList KItem KItem
notation ".Map" => @nil KItem KItem
notation M"[" v "<-" k "]" => @cons KItem KItem k v M
#check .Map
#check .Map[3 <- "x"]
structure EnvCell where
env : Map
structure KCell where
k : K
structure GeneratedTopCell where
kCell : KCell
envCell : EnvCell
abbrev Cfg := GeneratedTopCell
notation "<k>" k "</k>" => KCell.mk k
notation "<env>" envMap "</env>" => EnvCell.mk envMap
-- We can't just put ⟪k e⟫. We need to separate them
notation "⟪" k "," e "⟫" => GeneratedTopCell.mk k e
#check (<k> .K </k> : KCell)
#check (<env> .Map </env> : EnvCell)
#check (⟪<k> .K </k>, <env> .Map </env>⟫ : Cfg)
/- We need a function `lookup` to find the value of a key in a Map if we know the key is present -/
theorem contains_find_some {m : Map} {x : KId} (h : contains (x : KItem) m) :
∃ v : KItem, m.find? (x:KItem) = some v := by
induction m <;> simp [contains] at h
case cons k v ms ih =>
cases h <;> simp [find?, *]
case inr h' =>
cases (ih h'); case intro v' s =>
split; exists v; exists v'
def lookup (m : Map) (x : KId) (h : contains (x : KItem) m) : KItem :=
match f: m.find? (x : KItem) with
| some v => v
| none => by
exfalso; cases (contains_find_some h)
case intro k h' => rw [h'] at f; contradiction
--The following proof makes `lookup` noncomputable
/- by
induction m <;> simp[contains] at h
case cons k v ms _ => exact v -/
#print lookup
#reduce lookup (.Map[3 <- "x"]) "x" rfl
#eval lookup (.Map[3 <- "x"]) "x" rfl
example: lookup (.Map[3 <- "x"]) "x" rfl = 3 := by rfl
/- RewriteStep is the 1-step rewrite relation -/
inductive RwS : Cfg → Cfg → Prop where
/- Expr rules -/
--rule - X => 0 -Int X
| neg {n : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n⁻¹ ~> k</k>, e⟫
⟪<k>(0 : Int) - n ~> k</k>, e⟫
-- rule ! B => notBool B
| not {b : Bool} {k : K} {e : EnvCell} :
RwS ⟪<k>!'n ~> k</k>, e⟫
⟪<k>!b ~> k</k>, e⟫
-- rule X + Y => X +Int Y
| add {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n +' m ~> k</k>, e⟫
⟪<k>n + m ~> k</k>, e⟫
-- rule X - Y => X -Int Y
| sub {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n -' m ~> k</k>, e⟫
⟪<k>n - m ~> k</k>, e⟫
-- rule X * Y => X *Int Y
| mul {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n *' m ~> k</k>, e⟫
⟪<k>n * m ~> k</k>, e⟫
-- rule X / Y => X /Int Y (not really /Int)
| div {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n /' m ~> k</k>, e⟫
⟪<k>n / m ~> k</k>, e⟫
-- rule I1 >= I2 => I1 >=Int I2
| ge {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n >=' m ~> k</k>, e⟫
⟪<k>decide (n >= m) ~> k</k>, e⟫
-- rule I1 > I2 => I1 >=Int I2
| gt {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n >' m ~> k</k>, e⟫
⟪<k>decide (n > m) ~> k</k>, e⟫
-- rule I1 <= I2 => I1 >=Int I2
| le {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n <=' m ~> k</k>, e⟫
⟪<k>decide (n <= m) ~> k</k>, e⟫
-- rule I1 < I2 => I1 >=Int I2
| lt {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n <' m ~> k</k>, e⟫
⟪<k>decide (n < m) ~> k</k>, e⟫
-- rule B1 == B2 => B1 ==Bool B2
| eq_bool {n m : Bool} {k : K} {e : EnvCell} :
RwS ⟪<k>n ==' m ~> k</k>, e⟫
⟪<k>n == m ~> k</k>, e⟫
-- rule B1 == B2 => B1 ==Int B2
| eq_Int {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n ==' m ~> k</k>, e⟫
⟪<k>n == m ~> k</k>, e⟫
-- rule B1 != B2 => B1 !=Bool B2
| neq_bool {n m : Bool} {k : K} {e : EnvCell} :
RwS ⟪<k>n !=' m ~> k</k>, e⟫
⟪<k>n != m ~> k</k>, e⟫
-- rule B1 != B2 => B1 !=Int B2
| neq_Int {n m : Int} {k : K} {e : EnvCell} :
RwS ⟪<k>n !=' m ~> k</k>, e⟫
⟪<k>n != m ~> k</k>, e⟫
-- rule true && B => B
| and_t {b : Bool} {k : K} {e : EnvCell} :
RwS ⟪<k>true &&' b ~> k</k>, e⟫
⟪<k>b ~> k</k>, e⟫
-- rule false && _ => false
| and_f {b : Bool} {k : K} {e : EnvCell} :
RwS ⟪<k>false &&' _ ~> k</k>, e⟫
⟪<k>false ~> k</k>, e⟫
-- rule true || _ => true
| or_t {k : K} {e : EnvCell} :
RwS ⟪<k>true ||' _ ~> k</k>, e⟫
⟪<k>true ~> k</k>, e⟫
-- rule false || B => B
| or_f {b : Bool} {k : K} {e : EnvCell} :
RwS ⟪<k>false ||' b ~> k</k>, e⟫
⟪<k>b ~> k</k>, e⟫
/- Calc rules -/
-- rule [step]: <k> S1:Stmt S2:Stmt => S1 ~> S2 ... </k>
| seq_stmt {s₁ s₂ : Stmt} {k : K} {e : EnvCell} :
RwS ⟪<k>s₁⁀s₂ ~> k</k>, e⟫
⟪<k>s₁ ~> s₂ ~> k</k>, e⟫
-- rule [var]:
-- <k> X:Id => V ... </k>
-- <env> X |-> V ... </env>
| eval {x : KId} {k : K} {e : EnvCell} {h : contains (x : KItem) e.env}:
RwS ⟪<k>x ~> k</k>, e⟫
⟪<k>lookup e.env x h ~> k</k>, e⟫
-- rule [assign]:
-- <k> X = V:Value ; => .K ... </k>
-- <env> E => E [ X <- V ] </env>
| assign {x : KId} {v : Expr} {k : K} {env : Map}:
RwS ⟪<k>x = v; ~> k</k>, <env> env </env>⟫
⟪<k>k</k>, <env>env[v <- x]</env>⟫
/- Imp rules -/
-- rule <k> if ( true ) S1 else _ => S1 ... </k>
| if_else_true {s : Stmt} {k : K} {e : EnvCell}:
RwS ⟪<k>if true then s else _ ~> k</k>, e⟫
⟪<k>s ~> k</k>, e⟫
-- rule <k> if ( false ) _ else S2 => S2 ... </k>
| if_else_false {s : Stmt} {k : K} {e : EnvCell}:
RwS ⟪<k>if(false) _ else s ~> k</k>, e⟫
⟪<k>s ~> k</k>, e⟫
-- rule <k> if ( C ) S => if ( C ) S else {} ... </k>
| if {b : Bool} {s : Stmt} {k : K} {e : EnvCell}:
RwS ⟪<k>if(b) s ~> k</k>, e⟫
⟪<k>if(b) s else {} ~> k</k>, e⟫
-- rule <k>
-- while ( C ) S
-- =>
-- if ( C ) {
-- S
-- while ( C ) S
-- }
-- ...
-- </k>
| while {b : Bool} {s : Stmt} {k : K} {e : EnvCell}:
RwS ⟪<k> while(b) s ~> k</k>, e⟫
⟪<k> if(c) {s ⁀ while(b) s} ~> k</k>, e⟫
-- rule <k> { S } => S ~> { } ... </k>
| block {s : Stmt} {k : K} {e : EnvCell}:
RwS ⟪<k> { s } ~> k</k>, e⟫
⟪<k> s ~> {} ~> k</k>, e⟫
-- rule <k> { } => .K ... </k>
| empty {k : K} {e : EnvCell}:
RwS ⟪<k>{} ~> k</k>, e⟫
⟪<k> k </k>, e⟫
infixr:50 "⇛" => RwS
-- set_option pp.notation false
#print RwS.not
#print RwS.neq_Int
#print RwS.seq_stmt
#print RwS.if
example: ⟪<k> "x" = 3; ~> .K </k>, <env> .Map </env>⟫
⇛
⟪<k> .K </k>, <env> .Map[3 <- "x"] </env>⟫
:= by constructor
example:
⟪<k> "x" ~> .K </k>,
<env> .Map[3 <- "x"] </env>⟫
⇛
⟪<k> 3 ~> .K </k>,
<env> .Map[3 <- "x"] </env>⟫
:= by constructor; rw [contains]; rfl
end KoreToLean |
The transformation should be based on
definition.kore
. However, for ease of exposition, examples will be based on the.k
definition.Prelude
We need a manual Lean 4 implementation of (parts of)
domains.md
. The implementation should define sorts likeInt
andMap
, functions over them, and provide useful theorems. Stubs for hooked functions, as well as non-hooked function implementations (hopefully) can be auto-generated.We need to decide if we want to map sorts directly (e.g. K
Int
to Lean 4Int
), or add an extra layer (KInt.mk : Int -> KInt
). In the direct mapping case,abbrev
is useful for aliasing Lean 4 types:The mapping is obvious for some sorts, for others, we might need a custom implementation:
Bool
Bool
Int
Int
String
String
Bytes
ByteArray
List
List KItem
Set
Map
The sort
K
with constructorsdotk : K
andkseq : KItem -> K
can either be modeled asList KItem
, or as generated fromdefinition.kore
.User defined sorts
Sorts that depend on each other should be put in a
mutual
block:It's probably fine to model user lists as
List
.Injection
The sort of ad-hoc polymorphism provided by injections can be modeled using type classes.
It is probably possible to define a transitivity instance to reduce the number of explicitly defined instances, but that complicates type class resolution.
Unfortunately, implementing the
inj
function is not straightforward. To demonstrate the problem, considerHere, we need
Without a deeper embedding, this does not seem directly feasible, but approximations are possible.
Axiomatic approach
Here, we just assert the function and its transitivity. On the downside, all functions calling
inj
have to be markednoncomputable
, so we lose computability.Constructive approach
Here, as expected,
On the downside, the constructors induce spurious terms like
KItem.inj_aexp (AExp.inj_int (KInt.mk 0))
. Since we never refer to such terms in the transition relation, this is probably acceptable.Rewrite rules
It seems straightforward to get an overapproximation of the transition relation by neglecting rule priorities and ensures clauses (and probably other finer details we're just not aware yet).
Free variables become implicit arguments,
requires
becomes an antecedent, and the LHS and RHS are mapped directly.Functions
Ideally, we can map K functions to Lean 4 functions. So far, this seems to be the most challenging part of the transformation. The reason for this is a discrepancy in representation. K function rules have a priority, a
requires
and a pattern to match. These induceif
andmatch
expressions, and the cases for the latter need to be non-overlapping and total.We need a best effort transformation that handles at least the simple cases (like
foo
below). We might also be able to rewrite some functions to a form that's better suited for the transformation. For the rest of the cases, we can still generate a signature, and either handle them axiomatically (by generating theorems fromsimplification
rules), or implement them manually.Totality
The generated Lean 4 functions have to be total. We can handle this by returning
Option
from non-total functions, then expect a proof of definedness for rewrite rules:Here, we implicitly assume the
requires
clauses are non-overlapping (hopefully K enforces this to some extent).Note that we could mark
foo
total
in K. In these case, we can auto generate a proof obligation:From this, we can get the witness for taking rewrite step
do-foo
. An even better solution to generate inline proofs / proof stubs for fallthrough cases:Then the
Option
wrapper and the extra implication can even be omitted:Code generation
The code generator should be implemented in a way that auto generated vs user supplied code is separated.
Also, code generation should be based on an internal model of the generated Lean 4 program (as opposed to appending strings directly as in
ModuleToKore
). In one extreme, the model is the Lean 4 abstract syntax, but full generality is not necessary for our purposes. We should check if there's a Python package that implements the Lean 4 AST.Files representing Lean 4 abstract syntax:
Related:
definition.kore
#4696The text was updated successfully, but these errors were encountered: