Initial commit: 15-unit Lean 4 + PL semantics tutorial curriculum
This commit is contained in:
82
tutorial-02-semantics/08-syntax-representation.md
Normal file
82
tutorial-02-semantics/08-syntax-representation.md
Normal file
@@ -0,0 +1,82 @@
|
||||
# Unit 8 — Representing Syntax
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Encode lambda calculus terms as an inductive type
|
||||
- Understand three binding representations:
|
||||
1. **Named** (strings — simple, but α-equiv isn't definitional)
|
||||
2. **de Bruijn indices** (numbers — α-equiv is free, shifting is painful)
|
||||
3. **Locally nameless** (free vars named, bound vars indexed — compromise)
|
||||
|
||||
We'll use de Bruijn indices (the "heavy lifter") for the rest of this tutorial,
|
||||
with locally nameless for comparison.
|
||||
|
||||
## Sources
|
||||
|
||||
- syndikos/lean4-stlc `Syntax.lean`: https://github.com/syndikos/lean4-stlc
|
||||
- Chris Henson 2025: https://chrishenson.net/posts/2025-05-10-formalized_lambda_calculus.html
|
||||
- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch
|
||||
- Software Foundations Vol.2: https://softwarefoundations.cis.upenn.edu/
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
-- 8.1 — Named representation
|
||||
inductive NamedTerm where
|
||||
| var (x : String)
|
||||
| lam (x : String) (body : NamedTerm)
|
||||
| app (f arg : NamedTerm)
|
||||
deriving Repr
|
||||
|
||||
-- The Church encoding of identity: λx. x
|
||||
def idNamed : NamedTerm := NamedTerm.lam "x" (NamedTerm.var "x")
|
||||
|
||||
-- Encode λx. λy. x (K combinator)
|
||||
def kNamed : NamedTerm :=
|
||||
sorry
|
||||
|
||||
-- Encode λf. λx. f (f x) (Church numeral 2)
|
||||
def twoNamed : NamedTerm :=
|
||||
sorry
|
||||
|
||||
-- 8.2 — de Bruijn representation
|
||||
-- Variables are numbers: 0 = nearest binder, 1 = next, etc.
|
||||
inductive DBTerm where
|
||||
| var (idx : Nat) -- variable reference by binding distance
|
||||
| lam (body : DBTerm) -- λ. body (no name needed!)
|
||||
| app (f arg : DBTerm)
|
||||
deriving Repr
|
||||
|
||||
-- λ. λ. 1 (= λx. λy. x in named form)
|
||||
def kDB : DBTerm := DBTerm.lam (DBTerm.lam (DBTerm.var 1))
|
||||
|
||||
-- λ. 0 (= λx. x in named form)
|
||||
def idDB : DBTerm := DBTerm.lam (DBTerm.var 0)
|
||||
|
||||
-- Encode λf. λx. f (f x) (Church 2)
|
||||
def twoDB : DBTerm :=
|
||||
sorry
|
||||
|
||||
-- 8.3 — Locally nameless
|
||||
-- Free variables are strings, bound variables are de Bruijn indices
|
||||
-- (You don't need to implement this fully — just understand the idea)
|
||||
inductive LNTerm where
|
||||
| fvar (x : String) -- free variable
|
||||
| bvar (idx : Nat) -- bound variable (de Bruijn)
|
||||
| lam (body : LNTerm) -- binder
|
||||
| app (f arg : LNTerm)
|
||||
deriving Repr
|
||||
```
|
||||
|
||||
### Key insight for PL semantics
|
||||
|
||||
When we encode **typing contexts** `Γ = x₁:τ₁, x₂:τ₂, ...`, de Bruijn indices
|
||||
give us "index into the context" for free. The last binding is index 0, the
|
||||
second-last is index 1, etc. This makes the typing rules elegant in Lean —
|
||||
no name-clash avoidance needed.
|
||||
|
||||
---
|
||||
|
||||
← [Tutorial 1 — Unit 7](../tutorial-01-basics/07-dependent-types.md) · Next: [Unit 9 — Substitution](09-substitution.md)
|
||||
81
tutorial-02-semantics/09-substitution.md
Normal file
81
tutorial-02-semantics/09-substitution.md
Normal file
@@ -0,0 +1,81 @@
|
||||
# Unit 9 — Substitution and α-Equivalence
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Implement `subst` (substitution) for de Bruijn terms
|
||||
- Implement `lift` (index shifting) — the key operation that avoids capture
|
||||
- Prove fundamental substitution lemmas
|
||||
|
||||
## Sources
|
||||
|
||||
- syndikos/lean4-stlc `Syntax.lean` and `Reduction.lean`
|
||||
- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch
|
||||
- PLFA Part 2: https://plfa.github.io/
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
open DBTerm -- from Unit 8
|
||||
|
||||
-- 9.1 — Lift: shift all *free* variables above a cutoff by k
|
||||
-- Lift is the engine that makes substitution work under binders.
|
||||
-- When we go under a λ, existing de Bruijn indices need to shift by 1
|
||||
-- because a new binder appears.
|
||||
def lift (t : DBTerm) (cutoff k : Nat) : DBTerm :=
|
||||
match t with
|
||||
| var n => if n < cutoff then var n else var (n + k)
|
||||
| lam body => lam (lift body (cutoff + 1) k)
|
||||
| app f a => app (lift f cutoff k) (lift a cutoff k)
|
||||
|
||||
-- Verify: lifting the identity function
|
||||
-- lift (λ. 0) 0 1 should produce λ. 0 (bound variables inside λ are untouched)
|
||||
#eval lift idDB 0 1
|
||||
|
||||
-- 9.2 — Substitution: replace variable `idx` with `s` in term `t`
|
||||
-- This is capture-avoiding substitution via de Bruijn arithmetic.
|
||||
-- Under a λ, the target index shifts by 1 and the replacement gets lifted.
|
||||
def subst (t : DBTerm) (idx : Nat) (s : DBTerm) : DBTerm :=
|
||||
match t with
|
||||
| var n =>
|
||||
if n < idx then var n
|
||||
else if n == idx then s -- found the variable to replace
|
||||
else var (n - 1) -- shift remaining vars down
|
||||
| lam body => lam (subst body (idx + 1) (lift s 0 1))
|
||||
| app f a => app (subst f idx s) (subst a idx s)
|
||||
|
||||
-- Test: substitute λ. 0 for variable 0 in (λ. 0) -- should be identity
|
||||
-- Actually: [0 ↦ λ.0] (λ. 0) — nothing changes (0 is bound)
|
||||
#eval subst (lam (var 0)) 0 (lam (var 0))
|
||||
|
||||
-- 9.3 — Single-step β-reduction (preview of next unit)
|
||||
-- β: (λ. t) s → subst t 0 s
|
||||
def betaRedex : DBTerm := app (lam (var 0)) (lam (var 0)) -- (λx. x) (λy. y) → ...y...
|
||||
|
||||
-- Compute the result of one β step on this term
|
||||
#eval subst (var 0) 0 (lam (var 0)) -- should be lam (var 0) (= λy. y)
|
||||
|
||||
-- 9.4 — Lemma: Substitution under a lambda
|
||||
-- If we substitute for index 0 in a term that's already shifted,
|
||||
-- the lift in the body handles the shift correctly.
|
||||
-- Prove this property:
|
||||
lemma subst_lift_commute (t s : DBTerm) (n : Nat) :
|
||||
subst (lift t 0 1) (n+1) (lift s 0 1) = lift (subst t n s) 0 1 :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- This lemma is *critical* for proving preservation/type safety later.
|
||||
-- It says that shifting before and after substitution agree.
|
||||
```
|
||||
|
||||
### Why this matters
|
||||
|
||||
Substitution is the core operation of operational semantics. Doing it right
|
||||
(no variable capture) is the hardest part of formalizing lambda calculi.
|
||||
de Bruijn indices make capture impossible — the arithmetic is mechanical.
|
||||
The lemmas here are boilerplate you prove once, then reuse everywhere.
|
||||
|
||||
---
|
||||
|
||||
← [Previous: Unit 8](08-syntax-representation.md) · Next: [Unit 10 — Small-Step Semantics](10-small-step-semantics.md)
|
||||
65
tutorial-02-semantics/10-small-step-semantics.md
Normal file
65
tutorial-02-semantics/10-small-step-semantics.md
Normal file
@@ -0,0 +1,65 @@
|
||||
# Unit 10 — Small-Step Operational Semantics
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Define single-step reduction `→` for call-by-value λ-calculus
|
||||
- Define reflexive-transitive closure `→*`
|
||||
- Prove determinism: `t → t₁ ∧ t → t₂ ⇒ t₁ = t₂`
|
||||
|
||||
## Sources
|
||||
|
||||
- syndikos/lean4-stlc `Reduction.lean`, `Semantics.lean`
|
||||
- Software Foundations Vol.2 "Smallstep"
|
||||
- PLFA Part 2
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
open DBTerm
|
||||
|
||||
-- 10.1 — Single-step cbv reduction
|
||||
-- Call-by-value: reduce the *argument* before substituting into the function body.
|
||||
-- A value is a λ-abstraction (in pure STLC).
|
||||
inductive Value : DBTerm → Prop where
|
||||
| lam : Value (lam b)
|
||||
|
||||
inductive Step : DBTerm → DBTerm → Prop where
|
||||
-- β-reduction: (λ. t) v → subst t 0 v (where v is a value)
|
||||
| beta (t v : DBTerm) (hv : Value v) :
|
||||
Step (app (lam t) v) (subst t 0 v)
|
||||
-- Congruence: if f → f', then f a → f' a
|
||||
| appL (f f' a : DBTerm) (hstep : Step f f') :
|
||||
Step (app f a) (app f' a)
|
||||
-- Congruence: if v is a value and a → a', then v a → v a'
|
||||
| appR (v a a' : DBTerm) (hv : Value v) (hstep : Step a a') :
|
||||
Step (app v a) (app v a')
|
||||
|
||||
-- 10.2 — Multi-step (reflexive-transitive closure)
|
||||
inductive StepStar : DBTerm → DBTerm → Prop where
|
||||
| refl (t : DBTerm) : StepStar t t
|
||||
| step (t₁ t₂ t₃ : DBTerm) (hstep : Step t₁ t₂) (hstar : StepStar t₂ t₃) :
|
||||
StepStar t₁ t₃
|
||||
|
||||
-- 10.3 — Determinism
|
||||
theorem step_deterministic {t t₁ t₂ : DBTerm} (h₁ : Step t t₁) (h₂ : Step t t₂) : t₁ = t₂ :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- Hint: induction on h₁, then cases on h₂. The tricky case:
|
||||
-- when one side is (lam t) and the other is a congruence rule that also
|
||||
-- produces (lam t) — they match.
|
||||
|
||||
-- 10.4 — Stuck terms
|
||||
-- A term is "stuck" if it's not a value and can't take a step.
|
||||
-- Type soundness = "well-typed terms never get stuck".
|
||||
def Stuck (t : DBTerm) : Prop :=
|
||||
¬ Value t ∧ ¬ ∃ t', Step t t'
|
||||
|
||||
-- Find a stuck term in the untyped calculus
|
||||
def stuckExample : DBTerm :=
|
||||
app (var 0) (var 1) -- a free variable applied to another free variable
|
||||
|
||||
-- Show it's stuck (you'll prove something similar in Unit 12 with types)
|
||||
```
|
||||
78
tutorial-02-semantics/11-stlc-type-system.md
Normal file
78
tutorial-02-semantics/11-stlc-type-system.md
Normal file
@@ -0,0 +1,78 @@
|
||||
# Unit 11 — STLC Type System
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Define simple types (`Nat → Bool`, etc.)
|
||||
- Define typing contexts `Γ` as lists of types
|
||||
- Encode the STLC typing judgment `Γ ⊢ t : τ`
|
||||
|
||||
## Sources
|
||||
|
||||
- syndikos/lean4-stlc `Typing.lean`
|
||||
- Software Foundations Vol.2 "Types"
|
||||
- sdemos/type-inference for type inference perspective
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
open DBTerm
|
||||
|
||||
-- 11.1 — Types
|
||||
inductive Ty where
|
||||
| nat
|
||||
| bool
|
||||
| fn (τ₁ τ₂ : Ty) -- τ₁ → τ₂
|
||||
deriving Repr, DecidableEq
|
||||
|
||||
-- Notations: τ₁ ⇒ τ₂ for function types, like in most PL papers
|
||||
notation:40 τ₁ " ⇒ " τ₂ => Ty.fn τ₁ τ₂
|
||||
|
||||
-- 11.2 — Contexts
|
||||
-- A context is a list of types, indexed by de Bruijn index.
|
||||
-- Position 0 = the most recently bound variable.
|
||||
def Ctx := List Ty
|
||||
|
||||
-- 11.3 — Lookup: get the type of variable `i` in context `Γ`
|
||||
def lookup (Γ : Ctx) (i : Nat) : Option Ty :=
|
||||
match Γ with
|
||||
| [] => none
|
||||
| τ :: _ => if i == 0 then some τ else lookup (Γ.tail) (i - 1)
|
||||
|
||||
-- 11.4 — Typing judgment: Γ ⊢ t : τ
|
||||
-- De Bruijn version — no variable names needed!
|
||||
inductive HasType : Ctx → DBTerm → Ty → Prop where
|
||||
| var (Γ : Ctx) (i : Nat) (h : lookup Γ i = some τ) :
|
||||
HasType Γ (var i) τ
|
||||
| lam (Γ : Ctx) (t : DBTerm) (τ₁ τ₂ : Ty) (h : HasType (τ₁ :: Γ) t τ₂) :
|
||||
HasType Γ (lam t) (τ₁ ⇒ τ₂)
|
||||
| app (Γ : Ctx) (t₁ t₂ : DBTerm) (τ₁ τ₂ : Ty)
|
||||
(h₁ : HasType Γ t₁ (τ₁ ⇒ τ₂)) (h₂ : HasType Γ t₂ τ₁) :
|
||||
HasType Γ (app t₁ t₂) τ₂
|
||||
|
||||
-- 11.5 — Exercise: type derivations
|
||||
-- Build a typing derivation for the identity function λx. x
|
||||
theorem id_typed (τ : Ty) : HasType [] (lam (var 0)) (τ ⇒ τ) :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- Build a typing derivation for the K combinator λx. λy. x
|
||||
theorem k_typed (τ₁ τ₂ : Ty) : HasType [] (lam (lam (var 1))) (τ₁ ⇒ τ₂ ⇒ τ₁) :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- Build a typing derivation for the S combinator: λx. λy. λz. x z (y z)
|
||||
-- (This is a good stress test — it exercises both app and lam multiple times)
|
||||
theorem s_typed (τ₁ τ₂ τ₃ : Ty) :
|
||||
HasType [] (lam (lam (lam (app (app (var 2) (var 0)) (app (var 1) (var 0))))))
|
||||
((τ₁ ⇒ τ₂ ⇒ τ₃) ⇒ (τ₁ ⇒ τ₂) ⇒ τ₁ ⇒ τ₃) :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- 11.6 — Weakening: adding extra types to the context preserves typing
|
||||
theorem weakening (Γ Δ : Ctx) (t : DBTerm) (τ : Ty)
|
||||
(h : HasType Γ t τ) (hsub : ∀ i, lookup Γ i = lookup (Δ ++ Γ) i) : HasType (Δ ++ Γ) t τ :=
|
||||
by
|
||||
sorry
|
||||
```
|
||||
104
tutorial-02-semantics/12-type-safety.md
Normal file
104
tutorial-02-semantics/12-type-safety.md
Normal file
@@ -0,0 +1,104 @@
|
||||
# Unit 12 — Type Safety (Progress + Preservation)
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Prove **Progress**: well-typed terms are either values or can step
|
||||
- Prove **Preservation** (Subject Reduction): types survive reduction
|
||||
- Combine them: **Type Soundness** = well-typed programs don't get stuck
|
||||
|
||||
## Sources
|
||||
|
||||
- syndikos/lean4-stlc `Metatheory.lean`
|
||||
- Software Foundations Vol.2
|
||||
- PLFA Part 2: DeBruijn chapter
|
||||
|
||||
## Key Lemmas
|
||||
|
||||
```
|
||||
Canonical Forms:
|
||||
If [] ⊢ v : τ₁ ⇒ τ₂ and v is a value, then v = λ.t
|
||||
|
||||
Substitution Lemma:
|
||||
If τ₁::Γ ⊢ t : τ₂ and Γ ⊢ s : τ₁, then Γ ⊢ subst t 0 s : τ₂
|
||||
|
||||
Preservation:
|
||||
If Γ ⊢ t : τ and t → t', then Γ ⊢ t' : τ
|
||||
|
||||
Progress:
|
||||
If [] ⊢ t : τ, then either Value t or ∃ t', t → t'
|
||||
```
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
open DBTerm
|
||||
open Step
|
||||
open HasType
|
||||
|
||||
-- 12.1 — Canonical forms lemma for functions
|
||||
theorem canonical_forms_fn {t : DBTerm} {τ₁ τ₂ : Ty}
|
||||
(h : HasType [] t (τ₁ ⇒ τ₂)) (hv : Value t) : ∃ body, t = lam body :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- 12.2 — Substitution lemma (the heart of the proof)
|
||||
-- This uses the substitution commutation lemma from Unit 9.
|
||||
theorem substitution_lemma (Γ : Ctx) (t s : DBTerm) (τ₁ τ₂ : Ty)
|
||||
(h₁ : HasType (τ₁ :: Γ) t τ₂) (h₂ : HasType Γ s τ₁) :
|
||||
HasType Γ (subst t 0 s) τ₂ :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- Hint: induction on h₁. The variable case branches on whether the index
|
||||
-- is 0 (use h₂), less than 0 (impossible — Nat), or greater (use h₁).
|
||||
-- The lambda case requires weakening to adjust the context.
|
||||
|
||||
-- 12.3 — Preservation (Subject Reduction)
|
||||
theorem preservation {Γ : Ctx} {t t' : DBTerm} {τ : Ty}
|
||||
(htype : HasType Γ t τ) (hstep : Step t t') : HasType Γ t' τ :=
|
||||
by
|
||||
induction hstep generalizing τ with
|
||||
| beta t v hv =>
|
||||
-- Case (λ.t) v → subst t 0 v
|
||||
-- Invert htype to get HasType (τ₁::Γ) t τ₂ and HasType Γ v τ₁
|
||||
-- Then apply the substitution lemma
|
||||
sorry
|
||||
| appL f f' a hstep_f ih =>
|
||||
-- Case f a → f' a where f → f'
|
||||
sorry
|
||||
| appR v a a' hv hstep_a ih =>
|
||||
-- Case v a → v a' where a → a'
|
||||
sorry
|
||||
|
||||
-- 12.4 — Progress
|
||||
theorem progress {t : DBTerm} {τ : Ty} (h : HasType [] t τ) :
|
||||
Value t ∨ ∃ t', Step t t' :=
|
||||
by
|
||||
induction h with
|
||||
| var Γ i hlook =>
|
||||
-- A variable can't be typed in the empty context
|
||||
sorry
|
||||
| lam Γ body τ₁ τ₂ hbody =>
|
||||
-- A lambda is a value
|
||||
left; exact Value.lam
|
||||
| app Γ f a τ₁ τ₂ hf ha ihf iha =>
|
||||
-- f a: either f is a value, or it can step, etc.
|
||||
sorry
|
||||
|
||||
-- 12.5 — Soundness corollary
|
||||
-- Well-typed programs never get stuck.
|
||||
theorem soundness {t : DBTerm} {τ : Ty} (h : HasType [] t τ) :
|
||||
¬ Stuck t :=
|
||||
by
|
||||
-- A term is stuck if it's not a value AND can't step.
|
||||
-- By progress, it's either a value or can step.
|
||||
-- In both cases, it's not stuck.
|
||||
intro hstuck
|
||||
rcases hstuck with ⟨hnotval, hnostep⟩
|
||||
rcases progress h with (hv | hstep)
|
||||
· exact hnotval hv
|
||||
· apply hnostep
|
||||
exact hstep
|
||||
```
|
||||
119
tutorial-02-semantics/13-hm-declarative.md
Normal file
119
tutorial-02-semantics/13-hm-declarative.md
Normal file
@@ -0,0 +1,119 @@
|
||||
# Unit 13 — Hindley-Milner Declarative Typing Rules
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Extend STLC with **let-polymorphism**
|
||||
- Encode HM types with **type schemes** `∀α. τ`
|
||||
- Formalize the 6 declarative typing rules (Var, App, Abs, Let, Inst, Gen)
|
||||
- Understand the distinction between monomorphic (`τ`) and polymorphic (`σ = ∀α.τ`) types
|
||||
|
||||
## Sources
|
||||
|
||||
- [Wikipedia: Hindley-Milner type system](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system)
|
||||
- [Vaughan 2008](https://www.jeffvaughan.net/docs/hmproof.pdf) (declarative rules, §2)
|
||||
- [sdemos/type-inference](https://github.com/sdemos/type-inference) (Lean formalization)
|
||||
|
||||
## Background
|
||||
|
||||
HM adds **let-polymorphism** to STLC:
|
||||
```
|
||||
let x = e₁ in e₂
|
||||
```
|
||||
In STLC, `x` would have a single monomorphic type. In HM, `x` gets a type
|
||||
*scheme* — e.g., `∀α. α → α` for the identity function — and each use of `x`
|
||||
instantiates the scheme with fresh type variables.
|
||||
|
||||
The declarative rules:
|
||||
```
|
||||
Var: Γ, x:σ ⊢ x : σ
|
||||
App: Γ ⊢ e₁ : τ₁ → τ₂ Γ ⊢ e₂ : τ₁ ⇛ Γ ⊢ e₁ e₂ : τ₂
|
||||
Abs: Γ, x:τ₁ ⊢ e : τ₂ ⇛ Γ ⊢ λx.e : τ₁ → τ₂
|
||||
Let: Γ ⊢ e₁ : σ Γ, x:σ ⊢ e₂ : τ ⇛ Γ ⊢ let x = e₁ in e₂ : τ
|
||||
Inst: Γ ⊢ e : ∀α.τ ⇛ Γ ⊢ e : τ[α := τ']
|
||||
Gen: Γ ⊢ e : τ ∧ α ∉ ftv(Γ) ⇛ Γ ⊢ e : ∀α.τ
|
||||
```
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
-- 13.1 — Types and type schemes
|
||||
inductive MonoType where
|
||||
| tvar (id : Nat) -- type variable α
|
||||
| fn (τ₁ τ₂ : MonoType) -- τ₁ → τ₂
|
||||
deriving Repr, DecidableEq
|
||||
|
||||
notation:40 τ₁ " ⇒ " τ₂ => MonoType.fn τ₁ τ₂
|
||||
|
||||
-- A type scheme: ∀α₁...αₙ. τ (quantified over a set of type variables)
|
||||
structure TypeScheme where
|
||||
vars : List Nat -- bound type variables
|
||||
body : MonoType
|
||||
deriving Repr
|
||||
|
||||
-- 13.2 — HM expression syntax (extend STLC with `let`)
|
||||
inductive HMExpr where
|
||||
| var (i : Nat) -- de Bruijn index
|
||||
| lam (body : HMExpr) -- λ. e
|
||||
| app (f a : HMExpr) -- f a
|
||||
| lett (e₁ e₂ : HMExpr) -- let x = e₁ in e₂
|
||||
deriving Repr
|
||||
|
||||
-- 13.3 — Environments
|
||||
-- Γ maps de Bruijn indices to type schemes
|
||||
def HMEnv := List TypeScheme
|
||||
|
||||
-- 13.4 — Declarative typing rules
|
||||
inductive HMTyping : HMEnv → HMExpr → TypeScheme → Prop where
|
||||
|
||||
-- Var: look up variable in the environment
|
||||
| var (Γ : HMEnv) (i : Nat) (h : lookup Γ i = some σ) : HMTyping Γ (HMExpr.var i) σ
|
||||
|
||||
-- Abs: λx. e — note: x gets a monomorphic type in the lambda body
|
||||
| abs (Γ : HMEnv) (e : HMExpr) (τ₁ τ₂ : MonoType)
|
||||
(h : HMTyping ({vars := [], body := τ₁} :: Γ) e {vars := [], body := τ₂}) :
|
||||
HMTyping Γ (HMExpr.lam e) {vars := [], body := τ₁ ⇒ τ₂}
|
||||
|
||||
-- App: e₁ e₂
|
||||
| app (Γ : HMEnv) (e₁ e₂ : HMExpr) (σ τ₁ τ₂ : MonoType) (σ' : TypeScheme)
|
||||
(h₁ : HMTyping Γ e₁ σ') (h₂ : HMTyping Γ e₂ {vars := [], body := τ₁})
|
||||
(h_eq : σ' = {vars := [], body := τ₁ ⇒ τ₂}) :
|
||||
HMTyping Γ (HMExpr.app e₁ e₂) {vars := [], body := τ₂}
|
||||
|
||||
-- Let: e₁ is generalized and e₂ can use the polymorphic type
|
||||
| lett (Γ : HMEnv) (e₁ e₂ : HMExpr) (σ σ' : TypeScheme) (τ : MonoType)
|
||||
(h₁ : HMTyping Γ e₁ σ) (h₂ : HMTyping (σ :: Γ) e₂ σ') :
|
||||
HMTyping Γ (HMExpr.lett e₁ e₂) σ'
|
||||
|
||||
-- Gen: generalize over free type variables not in the environment
|
||||
| gen (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) (αs : List Nat)
|
||||
(h : HMTyping Γ e σ) (hfresh : ∀ α, α ∈ αs → α ∉ ftv_env Γ) :
|
||||
HMTyping Γ e {σ with vars := αs ++ σ.vars}
|
||||
|
||||
-- Inst: instantiate a type scheme
|
||||
| inst (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) (τ : MonoType) (subst : Nat → MonoType)
|
||||
(h : HMTyping Γ e σ) (hinst : apply_subst subst σ = τ) :
|
||||
HMTyping Γ e {vars := [], body := τ}
|
||||
|
||||
-- 13.5 — Exercise: the identity function is polymorphic
|
||||
def id_type : TypeScheme := {vars := [0], body := MonoType.tvar 0 ⇒ MonoType.tvar 0}
|
||||
|
||||
theorem id_polymorphic : HMTyping [] (HMExpr.lam (HMExpr.var 0)) id_type :=
|
||||
by
|
||||
sorry
|
||||
|
||||
-- 13.6 — Exercise: let x = λy. y in x x (instantiation of polymorphic identity)
|
||||
-- This term should be well-typed — the identity is used at two different types
|
||||
def self_app_id : HMExpr :=
|
||||
HMExpr.lett (HMExpr.lam (HMExpr.var 0)) (HMExpr.app (HMExpr.var 0) (HMExpr.var 0))
|
||||
|
||||
theorem self_app_id_typed (τ : MonoType) :
|
||||
HMTyping [] self_app_id {vars := [], body := τ ⇒ τ} :=
|
||||
by
|
||||
sorry
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
← [Previous: Unit 12](12-type-safety.md) · Next: [Unit 14 — Algorithm W](14-algorithm-w.md)
|
||||
156
tutorial-02-semantics/14-algorithm-w.md
Normal file
156
tutorial-02-semantics/14-algorithm-w.md
Normal file
@@ -0,0 +1,156 @@
|
||||
# Unit 14 — Algorithm W (Hindley-Milner Type Inference)
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Understand type inference as solving unification constraints
|
||||
- Implement **Algorithm W**: the classic HM inference algorithm
|
||||
- Write `unify` (Robinson's unification)
|
||||
- Write `infer` (the main inference loop)
|
||||
|
||||
## Sources
|
||||
|
||||
- [Vaughan 2008, §3](https://www.jeffvaughan.net/docs/hmproof.pdf)
|
||||
- [sdemos/type-inference](https://github.com/sdemos/type-inference)
|
||||
- [Wikipedia: Algorithm W](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#Algorithm_W)
|
||||
|
||||
## Background
|
||||
|
||||
Algorithm W takes an expression and returns a substitution + type:
|
||||
|
||||
```
|
||||
infer(Γ, e) = (S, τ)
|
||||
```
|
||||
|
||||
- `Γ` maps variables to type schemes
|
||||
- `e` is the expression to type
|
||||
- `S` is a type substitution (unifies constraints found during inference)
|
||||
- `τ` is the inferred monotype
|
||||
|
||||
The algorithm uses a supply of *fresh type variables* to build constraints,
|
||||
then unifies them.
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
open MonoType
|
||||
open HMExpr
|
||||
open TypeScheme
|
||||
|
||||
-- 14.1 — Substitutions
|
||||
-- A substitution maps type variables to monotypes
|
||||
abbrev Subst := Nat → MonoType
|
||||
|
||||
-- Identity substitution
|
||||
def idSubst : Subst := fun α => MonoType.tvar α
|
||||
|
||||
-- Apply a substitution to a monotype
|
||||
def applySubst (S : Subst) : MonoType → MonoType
|
||||
| MonoType.tvar α => S α
|
||||
| MonoType.fn τ₁ τ₂ => MonoType.fn (applySubst S τ₁) (applySubst S τ₂)
|
||||
|
||||
-- Compose substitutions: (S₁ ∘ S₂)(α) = S₁(S₂(α))
|
||||
def compose (S₁ S₂ : Subst) : Subst :=
|
||||
fun α => applySubst S₁ (S₂ α)
|
||||
|
||||
-- 14.2 — Unification (Robinson's algorithm)
|
||||
-- Returns a substitution that makes τ₁ and τ₂ equal, if possible.
|
||||
-- Fails if there's a type mismatch (e.g., unifying α → β with α is impossible).
|
||||
def unify (τ₁ τ₂ : MonoType) : Option Subst :=
|
||||
match τ₁, τ₂ with
|
||||
| MonoType.tvar α, MonoType.tvar β =>
|
||||
if α == β then some idSubst
|
||||
else some (fun γ => if γ == α then MonoType.tvar β else MonoType.tvar γ)
|
||||
| MonoType.tvar α, τ =>
|
||||
if occurs α τ then none -- occurs check: α ∉ ftv(τ)
|
||||
else some (fun γ => if γ == α then τ else MonoType.tvar γ)
|
||||
| τ, MonoType.tvar α =>
|
||||
if occurs α τ then none
|
||||
else some (fun γ => if γ == α then τ else MonoType.tvar γ)
|
||||
| MonoType.fn τ₁ᵃ τ₁ᵇ, MonoType.fn τ₂ᵃ τ₂ᵇ =>
|
||||
match unify τ₁ᵃ τ₂ᵃ with
|
||||
| none => none
|
||||
| some S₁ =>
|
||||
match unify (applySubst S₁ τ₁ᵇ) (applySubst S₁ τ₂ᵇ) with
|
||||
| none => none
|
||||
| some S₂ => some (compose S₂ S₁)
|
||||
|
||||
-- Occurs check: does α appear in τ?
|
||||
def occurs (α : Nat) : MonoType → Bool
|
||||
| MonoType.tvar β => α == β
|
||||
| MonoType.fn τ₁ τ₂ => occurs α τ₁ || occurs α τ₂
|
||||
|
||||
-- 14.3 — Fresh variable supply
|
||||
-- We use a counter to generate fresh type variables
|
||||
def freshVar (counter : Nat) : Nat × Nat := (counter, counter + 1)
|
||||
|
||||
-- 14.4 — Generalization: close a type under the environment
|
||||
-- `generalize(Γ, τ)` produces `∀αs. τ` where αs = ftv(τ) \ ftv(Γ)
|
||||
def generalize (Γ : HMEnv) (τ : MonoType) : TypeScheme :=
|
||||
{ vars := (ftv τ).filter (fun α => α ∉ ftv_env Γ)
|
||||
, body := τ }
|
||||
|
||||
-- 14.5 — Algorithm W (the core inference algorithm)
|
||||
-- Returns `(S, τ)` where S is a substitution and τ the inferred type.
|
||||
-- Uses a state monad for the fresh variable counter (simplified here).
|
||||
def inferW (Γ : HMEnv) (e : HMExpr) (counter : Nat) : Option (Subst × MonoType × Nat) :=
|
||||
match e with
|
||||
| HMExpr.var i =>
|
||||
match lookup Γ i with
|
||||
| none => none
|
||||
| some σ => some (idSubst, instantiate σ counter, counter + length σ.vars)
|
||||
|
||||
| HMExpr.lam body =>
|
||||
-- Create a fresh type variable for the parameter
|
||||
let (α, counter') := freshVar counter
|
||||
let τ_param := MonoType.tvar α
|
||||
-- Add x : α to the environment
|
||||
let Γ' := {vars := [], body := τ_param} :: Γ
|
||||
-- Infer the body type
|
||||
match inferW Γ' body counter' with
|
||||
| none => none
|
||||
| some (S, τ_body, counter'') =>
|
||||
some (S, MonoType.fn (applySubst S τ_param) τ_body, counter'')
|
||||
|
||||
| HMExpr.app f a =>
|
||||
match inferW Γ f counter with
|
||||
| none => none
|
||||
| some (S₁, τ_f, counter₁) =>
|
||||
match inferW (applySubstEnv S₁ Γ) a counter₁ with
|
||||
| none => none
|
||||
| some (S₂, τ_a, counter₂) =>
|
||||
let β := freshVar counter₂
|
||||
let α_counter₃ := β.2
|
||||
match unify (applySubst S₂ τ_f) (MonoType.fn τ_a (MonoType.tvar β.1)) with
|
||||
| none => none
|
||||
| some S₃ =>
|
||||
let S := compose S₃ (compose S₂ S₁)
|
||||
some (S, applySubst S₃ (MonoType.tvar β.1), α_counter₃)
|
||||
|
||||
| HMExpr.lett e₁ e₂ =>
|
||||
match inferW Γ e₁ counter with
|
||||
| none => none
|
||||
| some (S₁, τ₁, counter₁) =>
|
||||
let σ₁ := generalize (applySubstEnv S₁ Γ) τ₁
|
||||
let Γ' := σ₁ :: applySubstEnv S₁ Γ
|
||||
match inferW Γ' e₂ counter₁ with
|
||||
| none => none
|
||||
| some (S₂, τ₂, counter₂) =>
|
||||
some (compose S₂ S₁, τ₂, counter₂)
|
||||
|
||||
-- 14.6 — Exercise: infer the type of λx. x
|
||||
def infer_id : Option (Subst × MonoType × Nat) :=
|
||||
inferW [] (HMExpr.lam (HMExpr.var 0)) 0
|
||||
|
||||
-- Should return a substitution mapping α₀ to α₀ and type α₀ → α₀
|
||||
-- #eval infer_id
|
||||
|
||||
-- 14.7 — Exercise: infer the type of let x = λy. y in x x
|
||||
def infer_self_app : Option (Subst × MonoType × Nat) :=
|
||||
inferW [] self_app_id 0
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
← [Previous: Unit 13](13-hm-declarative.md) · Next: [Unit 15 — Soundness and Completeness](15-soundness-completeness.md)
|
||||
85
tutorial-02-semantics/15-soundness-completeness.md
Normal file
85
tutorial-02-semantics/15-soundness-completeness.md
Normal file
@@ -0,0 +1,85 @@
|
||||
# Unit 15 — Soundness and Completeness of Algorithm W
|
||||
|
||||
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
|
||||
|
||||
## Goals
|
||||
|
||||
- Prove that Algorithm W is **sound**: every inferred type is derivable in the declarative system
|
||||
- Prove that Algorithm W is **complete**: every derivable type can be inferred (up to generalization)
|
||||
- Understand the relationship between inference and declarative rules
|
||||
|
||||
## Sources
|
||||
|
||||
- [Vaughan 2008, §4](https://www.jeffvaughan.net/docs/hmproof.pdf)
|
||||
- [sdemos/type-inference](https://github.com/sdemos/type-inference) `Soundness.lean`, `Completeness.lean`
|
||||
|
||||
## The Theorems
|
||||
|
||||
**Soundness**: If `infer(Γ, e) = (S, τ)`, then `Γ ⊢ e : τ` declaratively.
|
||||
|
||||
**Completeness**: If `Γ ⊢ e : τ` declaratively, then there exists `S, τ'` such that
|
||||
`infer(Γ, e) = (S, τ')` and `τ` is a substitution instance of `τ'` (i.e., W finds the *most general* type).
|
||||
|
||||
## Exercises
|
||||
|
||||
```lean
|
||||
open HMTyping
|
||||
open MonoType
|
||||
|
||||
-- 15.1 — Soundness of Algorithm W
|
||||
theorem soundness_W (Γ : HMEnv) (e : HMExpr) (S : Subst) (τ : MonoType) (counter : Nat)
|
||||
(h : inferW Γ e counter = some (S, τ, _)) :
|
||||
HMTyping (applySubstEnv S Γ) e {vars := [], body := τ} :=
|
||||
by
|
||||
induction e generalizing Γ S τ counter with
|
||||
| var i =>
|
||||
-- Case: variable lookup. Need to relate instantiated type scheme to declarative Var rule.
|
||||
sorry
|
||||
| lam body ih =>
|
||||
-- Case: lambda. Use the IH and the Abs rule.
|
||||
sorry
|
||||
| app f a ihf iha =>
|
||||
-- Case: application. The unification produces a substitution that makes types match.
|
||||
-- Need substitution composition lemmas.
|
||||
sorry
|
||||
| lett e₁ e₂ ih₁ ih₂ =>
|
||||
-- Case: let. Generalization produces a type scheme; the body gets the polymorphic type.
|
||||
sorry
|
||||
|
||||
-- 15.2 — Completeness of Algorithm W
|
||||
theorem completeness_W (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme)
|
||||
(h : HMTyping Γ e σ) :
|
||||
∃ (S : Subst) (τ : MonoType) (counter : Nat),
|
||||
inferW Γ e 0 = some (S, τ, counter) ∧
|
||||
∃ (S' : Subst), applySubst S' τ = σ.body :=
|
||||
by
|
||||
induction h with
|
||||
| var Γ i hlook =>
|
||||
sorry
|
||||
| abs Γ e τ₁ τ₂ hbody ih =>
|
||||
sorry
|
||||
| app Γ e₁ e₂ σ τ₁ τ₂ σ' h₁ h₂ h_eq ih₁ ih₂ =>
|
||||
sorry
|
||||
| lett Γ e₁ e₂ σ σ' τ h₁ h₂ ih₁ ih₂ =>
|
||||
sorry
|
||||
| gen Γ e σ αs hbody hfresh ih =>
|
||||
sorry
|
||||
| inst Γ e σ τ subst hbody hinst ih =>
|
||||
sorry
|
||||
|
||||
-- 15.3 — Principal Types Corollary
|
||||
-- Every well-typed expression has a *principal type* — one that all others are instances of.
|
||||
theorem principal_types (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme)
|
||||
(h : HMTyping Γ e σ) :
|
||||
∃ (τ_principal : MonoType),
|
||||
-- W finds the principal type
|
||||
(∃ S c, inferW Γ e 0 = some (S, τ_principal, c)) ∧
|
||||
-- σ's body is an instance of the principal type
|
||||
(∃ S', applySubst S' τ_principal = σ.body) :=
|
||||
by
|
||||
exact completeness_W Γ e σ h
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
← [Previous: Unit 14](14-algorithm-w.md) · [← Tutorial 2 Index](.) · [← Back to README](../README.md)
|
||||
Reference in New Issue
Block a user