Initial commit: 15-unit Lean 4 + PL semantics tutorial curriculum

This commit is contained in:
2026-05-28 18:14:07 +02:00
commit 0cf85517c2
17 changed files with 1454 additions and 0 deletions

View File

@@ -0,0 +1,82 @@
# Unit 8 — Representing Syntax
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Encode lambda calculus terms as an inductive type
- Understand three binding representations:
1. **Named** (strings — simple, but α-equiv isn't definitional)
2. **de Bruijn indices** (numbers — α-equiv is free, shifting is painful)
3. **Locally nameless** (free vars named, bound vars indexed — compromise)
We'll use de Bruijn indices (the "heavy lifter") for the rest of this tutorial,
with locally nameless for comparison.
## Sources
- syndikos/lean4-stlc `Syntax.lean`: https://github.com/syndikos/lean4-stlc
- Chris Henson 2025: https://chrishenson.net/posts/2025-05-10-formalized_lambda_calculus.html
- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch
- Software Foundations Vol.2: https://softwarefoundations.cis.upenn.edu/
## Exercises
```lean
-- 8.1 — Named representation
inductive NamedTerm where
| var (x : String)
| lam (x : String) (body : NamedTerm)
| app (f arg : NamedTerm)
deriving Repr
-- The Church encoding of identity: λx. x
def idNamed : NamedTerm := NamedTerm.lam "x" (NamedTerm.var "x")
-- Encode λx. λy. x (K combinator)
def kNamed : NamedTerm :=
sorry
-- Encode λf. λx. f (f x) (Church numeral 2)
def twoNamed : NamedTerm :=
sorry
-- 8.2 — de Bruijn representation
-- Variables are numbers: 0 = nearest binder, 1 = next, etc.
inductive DBTerm where
| var (idx : Nat) -- variable reference by binding distance
| lam (body : DBTerm) -- λ. body (no name needed!)
| app (f arg : DBTerm)
deriving Repr
-- λ. λ. 1 (= λx. λy. x in named form)
def kDB : DBTerm := DBTerm.lam (DBTerm.lam (DBTerm.var 1))
-- λ. 0 (= λx. x in named form)
def idDB : DBTerm := DBTerm.lam (DBTerm.var 0)
-- Encode λf. λx. f (f x) (Church 2)
def twoDB : DBTerm :=
sorry
-- 8.3 — Locally nameless
-- Free variables are strings, bound variables are de Bruijn indices
-- (You don't need to implement this fully — just understand the idea)
inductive LNTerm where
| fvar (x : String) -- free variable
| bvar (idx : Nat) -- bound variable (de Bruijn)
| lam (body : LNTerm) -- binder
| app (f arg : LNTerm)
deriving Repr
```
### Key insight for PL semantics
When we encode **typing contexts** `Γ = x₁:τ₁, x₂:τ₂, ...`, de Bruijn indices
give us "index into the context" for free. The last binding is index 0, the
second-last is index 1, etc. This makes the typing rules elegant in Lean —
no name-clash avoidance needed.
---
← [Tutorial 1 — Unit 7](../tutorial-01-basics/07-dependent-types.md) · Next: [Unit 9 — Substitution](09-substitution.md)

View File

@@ -0,0 +1,81 @@
# Unit 9 — Substitution and α-Equivalence
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Implement `subst` (substitution) for de Bruijn terms
- Implement `lift` (index shifting) — the key operation that avoids capture
- Prove fundamental substitution lemmas
## Sources
- syndikos/lean4-stlc `Syntax.lean` and `Reduction.lean`
- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch
- PLFA Part 2: https://plfa.github.io/
## Exercises
```lean
open DBTerm -- from Unit 8
-- 9.1 — Lift: shift all *free* variables above a cutoff by k
-- Lift is the engine that makes substitution work under binders.
-- When we go under a λ, existing de Bruijn indices need to shift by 1
-- because a new binder appears.
def lift (t : DBTerm) (cutoff k : Nat) : DBTerm :=
match t with
| var n => if n < cutoff then var n else var (n + k)
| lam body => lam (lift body (cutoff + 1) k)
| app f a => app (lift f cutoff k) (lift a cutoff k)
-- Verify: lifting the identity function
-- lift (λ. 0) 0 1 should produce λ. 0 (bound variables inside λ are untouched)
#eval lift idDB 0 1
-- 9.2 — Substitution: replace variable `idx` with `s` in term `t`
-- This is capture-avoiding substitution via de Bruijn arithmetic.
-- Under a λ, the target index shifts by 1 and the replacement gets lifted.
def subst (t : DBTerm) (idx : Nat) (s : DBTerm) : DBTerm :=
match t with
| var n =>
if n < idx then var n
else if n == idx then s -- found the variable to replace
else var (n - 1) -- shift remaining vars down
| lam body => lam (subst body (idx + 1) (lift s 0 1))
| app f a => app (subst f idx s) (subst a idx s)
-- Test: substitute λ. 0 for variable 0 in (λ. 0) -- should be identity
-- Actually: [0 ↦ λ.0] (λ. 0) — nothing changes (0 is bound)
#eval subst (lam (var 0)) 0 (lam (var 0))
-- 9.3 — Single-step β-reduction (preview of next unit)
-- β: (λ. t) s → subst t 0 s
def betaRedex : DBTerm := app (lam (var 0)) (lam (var 0)) -- (λx. x) (λy. y) → ...y...
-- Compute the result of one β step on this term
#eval subst (var 0) 0 (lam (var 0)) -- should be lam (var 0) (= λy. y)
-- 9.4 — Lemma: Substitution under a lambda
-- If we substitute for index 0 in a term that's already shifted,
-- the lift in the body handles the shift correctly.
-- Prove this property:
lemma subst_lift_commute (t s : DBTerm) (n : Nat) :
subst (lift t 0 1) (n+1) (lift s 0 1) = lift (subst t n s) 0 1 :=
by
sorry
-- This lemma is *critical* for proving preservation/type safety later.
-- It says that shifting before and after substitution agree.
```
### Why this matters
Substitution is the core operation of operational semantics. Doing it right
(no variable capture) is the hardest part of formalizing lambda calculi.
de Bruijn indices make capture impossible — the arithmetic is mechanical.
The lemmas here are boilerplate you prove once, then reuse everywhere.
---
← [Previous: Unit 8](08-syntax-representation.md) · Next: [Unit 10 — Small-Step Semantics](10-small-step-semantics.md)

View File

@@ -0,0 +1,65 @@
# Unit 10 — Small-Step Operational Semantics
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Define single-step reduction `→` for call-by-value λ-calculus
- Define reflexive-transitive closure `→*`
- Prove determinism: `t → t₁ ∧ t → t₂ ⇒ t₁ = t₂`
## Sources
- syndikos/lean4-stlc `Reduction.lean`, `Semantics.lean`
- Software Foundations Vol.2 "Smallstep"
- PLFA Part 2
## Exercises
```lean
open DBTerm
-- 10.1 — Single-step cbv reduction
-- Call-by-value: reduce the *argument* before substituting into the function body.
-- A value is a λ-abstraction (in pure STLC).
inductive Value : DBTerm Prop where
| lam : Value (lam b)
inductive Step : DBTerm DBTerm Prop where
-- β-reduction: (λ. t) v → subst t 0 v (where v is a value)
| beta (t v : DBTerm) (hv : Value v) :
Step (app (lam t) v) (subst t 0 v)
-- Congruence: if f → f', then f a → f' a
| appL (f f' a : DBTerm) (hstep : Step f f') :
Step (app f a) (app f' a)
-- Congruence: if v is a value and a → a', then v a → v a'
| appR (v a a' : DBTerm) (hv : Value v) (hstep : Step a a') :
Step (app v a) (app v a')
-- 10.2 — Multi-step (reflexive-transitive closure)
inductive StepStar : DBTerm DBTerm Prop where
| refl (t : DBTerm) : StepStar t t
| step (t₁ t₂ t₃ : DBTerm) (hstep : Step t₁ t₂) (hstar : StepStar t₂ t₃) :
StepStar t₁ t₃
-- 10.3 — Determinism
theorem step_deterministic {t t₁ t₂ : DBTerm} (h₁ : Step t t₁) (h₂ : Step t t₂) : t₁ = t₂ :=
by
sorry
-- Hint: induction on h₁, then cases on h₂. The tricky case:
-- when one side is (lam t) and the other is a congruence rule that also
-- produces (lam t) — they match.
-- 10.4 — Stuck terms
-- A term is "stuck" if it's not a value and can't take a step.
-- Type soundness = "well-typed terms never get stuck".
def Stuck (t : DBTerm) : Prop :=
¬ Value t ¬ t', Step t t'
-- Find a stuck term in the untyped calculus
def stuckExample : DBTerm :=
app (var 0) (var 1) -- a free variable applied to another free variable
-- Show it's stuck (you'll prove something similar in Unit 12 with types)
```

View File

@@ -0,0 +1,78 @@
# Unit 11 — STLC Type System
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Define simple types (`Nat → Bool`, etc.)
- Define typing contexts `Γ` as lists of types
- Encode the STLC typing judgment `Γ ⊢ t : τ`
## Sources
- syndikos/lean4-stlc `Typing.lean`
- Software Foundations Vol.2 "Types"
- sdemos/type-inference for type inference perspective
## Exercises
```lean
open DBTerm
-- 11.1 — Types
inductive Ty where
| nat
| bool
| fn (τ₁ τ₂ : Ty) -- τ₁ → τ₂
deriving Repr, DecidableEq
-- Notations: τ₁ ⇒ τ₂ for function types, like in most PL papers
notation:40 τ₁ "" τ₂ => Ty.fn τ₁ τ₂
-- 11.2 — Contexts
-- A context is a list of types, indexed by de Bruijn index.
-- Position 0 = the most recently bound variable.
def Ctx := List Ty
-- 11.3 — Lookup: get the type of variable `i` in context `Γ`
def lookup (Γ : Ctx) (i : Nat) : Option Ty :=
match Γ with
| [] => none
| τ :: _ => if i == 0 then some τ else lookup (Γ.tail) (i - 1)
-- 11.4 — Typing judgment: Γ ⊢ t : τ
-- De Bruijn version — no variable names needed!
inductive HasType : Ctx DBTerm Ty Prop where
| var (Γ : Ctx) (i : Nat) (h : lookup Γ i = some τ) :
HasType Γ (var i) τ
| lam (Γ : Ctx) (t : DBTerm) (τ₁ τ₂ : Ty) (h : HasType (τ₁ :: Γ) t τ₂) :
HasType Γ (lam t) (τ₁ τ₂)
| app (Γ : Ctx) (t₁ t₂ : DBTerm) (τ₁ τ₂ : Ty)
(h₁ : HasType Γ t₁ (τ₁ τ₂)) (h₂ : HasType Γ t₂ τ₁) :
HasType Γ (app t₁ t₂) τ₂
-- 11.5 — Exercise: type derivations
-- Build a typing derivation for the identity function λx. x
theorem id_typed (τ : Ty) : HasType [] (lam (var 0)) (τ τ) :=
by
sorry
-- Build a typing derivation for the K combinator λx. λy. x
theorem k_typed (τ₁ τ₂ : Ty) : HasType [] (lam (lam (var 1))) (τ₁ τ₂ τ₁) :=
by
sorry
-- Build a typing derivation for the S combinator: λx. λy. λz. x z (y z)
-- (This is a good stress test — it exercises both app and lam multiple times)
theorem s_typed (τ₁ τ₂ τ₃ : Ty) :
HasType [] (lam (lam (lam (app (app (var 2) (var 0)) (app (var 1) (var 0))))))
((τ₁ τ₂ τ₃) (τ₁ τ₂) τ₁ τ₃) :=
by
sorry
-- 11.6 — Weakening: adding extra types to the context preserves typing
theorem weakening (Γ Δ : Ctx) (t : DBTerm) (τ : Ty)
(h : HasType Γ t τ) (hsub : i, lookup Γ i = lookup (Δ ++ Γ) i) : HasType (Δ ++ Γ) t τ :=
by
sorry
```

View File

@@ -0,0 +1,104 @@
# Unit 12 — Type Safety (Progress + Preservation)
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Prove **Progress**: well-typed terms are either values or can step
- Prove **Preservation** (Subject Reduction): types survive reduction
- Combine them: **Type Soundness** = well-typed programs don't get stuck
## Sources
- syndikos/lean4-stlc `Metatheory.lean`
- Software Foundations Vol.2
- PLFA Part 2: DeBruijn chapter
## Key Lemmas
```
Canonical Forms:
If [] ⊢ v : τ₁ ⇒ τ₂ and v is a value, then v = λ.t
Substitution Lemma:
If τ₁::Γ ⊢ t : τ₂ and Γ ⊢ s : τ₁, then Γ ⊢ subst t 0 s : τ₂
Preservation:
If Γ ⊢ t : τ and t → t', then Γ ⊢ t' : τ
Progress:
If [] ⊢ t : τ, then either Value t or ∃ t', t → t'
```
## Exercises
```lean
open DBTerm
open Step
open HasType
-- 12.1 — Canonical forms lemma for functions
theorem canonical_forms_fn {t : DBTerm} {τ₁ τ₂ : Ty}
(h : HasType [] t (τ₁ τ₂)) (hv : Value t) : body, t = lam body :=
by
sorry
-- 12.2 — Substitution lemma (the heart of the proof)
-- This uses the substitution commutation lemma from Unit 9.
theorem substitution_lemma (Γ : Ctx) (t s : DBTerm) (τ₁ τ₂ : Ty)
(h₁ : HasType (τ₁ :: Γ) t τ₂) (h₂ : HasType Γ s τ₁) :
HasType Γ (subst t 0 s) τ₂ :=
by
sorry
-- Hint: induction on h₁. The variable case branches on whether the index
-- is 0 (use h₂), less than 0 (impossible — Nat), or greater (use h₁).
-- The lambda case requires weakening to adjust the context.
-- 12.3 — Preservation (Subject Reduction)
theorem preservation {Γ : Ctx} {t t' : DBTerm} {τ : Ty}
(htype : HasType Γ t τ) (hstep : Step t t') : HasType Γ t' τ :=
by
induction hstep generalizing τ with
| beta t v hv =>
-- Case (λ.t) v → subst t 0 v
-- Invert htype to get HasType (τ₁::Γ) t τ₂ and HasType Γ v τ₁
-- Then apply the substitution lemma
sorry
| appL f f' a hstep_f ih =>
-- Case f a → f' a where f → f'
sorry
| appR v a a' hv hstep_a ih =>
-- Case v a → v a' where a → a'
sorry
-- 12.4 — Progress
theorem progress {t : DBTerm} {τ : Ty} (h : HasType [] t τ) :
Value t t', Step t t' :=
by
induction h with
| var Γ i hlook =>
-- A variable can't be typed in the empty context
sorry
| lam Γ body τ₁ τ₂ hbody =>
-- A lambda is a value
left; exact Value.lam
| app Γ f a τ₁ τ₂ hf ha ihf iha =>
-- f a: either f is a value, or it can step, etc.
sorry
-- 12.5 — Soundness corollary
-- Well-typed programs never get stuck.
theorem soundness {t : DBTerm} {τ : Ty} (h : HasType [] t τ) :
¬ Stuck t :=
by
-- A term is stuck if it's not a value AND can't step.
-- By progress, it's either a value or can step.
-- In both cases, it's not stuck.
intro hstuck
rcases hstuck with hnotval, hnostep
rcases progress h with (hv | hstep)
· exact hnotval hv
· apply hnostep
exact hstep
```

View File

@@ -0,0 +1,119 @@
# Unit 13 — Hindley-Milner Declarative Typing Rules
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Extend STLC with **let-polymorphism**
- Encode HM types with **type schemes** `∀α. τ`
- Formalize the 6 declarative typing rules (Var, App, Abs, Let, Inst, Gen)
- Understand the distinction between monomorphic (`τ`) and polymorphic (`σ = ∀α.τ`) types
## Sources
- [Wikipedia: Hindley-Milner type system](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system)
- [Vaughan 2008](https://www.jeffvaughan.net/docs/hmproof.pdf) (declarative rules, §2)
- [sdemos/type-inference](https://github.com/sdemos/type-inference) (Lean formalization)
## Background
HM adds **let-polymorphism** to STLC:
```
let x = e₁ in e₂
```
In STLC, `x` would have a single monomorphic type. In HM, `x` gets a type
*scheme* — e.g., `∀α. αα` for the identity function — and each use of `x`
instantiates the scheme with fresh type variables.
The declarative rules:
```
Var: Γ, x:σ ⊢ x : σ
App: Γ ⊢ e₁ : τ₁ → τ₂ Γ ⊢ e₂ : τ₁ ⇛ Γ ⊢ e₁ e₂ : τ₂
Abs: Γ, x:τ₁ ⊢ e : τ₂ ⇛ Γ ⊢ λx.e : τ₁ → τ₂
Let: Γ ⊢ e₁ : σ Γ, x:σ ⊢ e₂ : τ ⇛ Γ ⊢ let x = e₁ in e₂ : τ
Inst: Γ ⊢ e : ∀α.τ ⇛ Γ ⊢ e : τ[α := τ']
Gen: Γ ⊢ e : τ ∧ α ∉ ftv(Γ) ⇛ Γ ⊢ e : ∀α.τ
```
## Exercises
```lean
-- 13.1 — Types and type schemes
inductive MonoType where
| tvar (id : Nat) -- type variable α
| fn (τ₁ τ₂ : MonoType) -- τ₁ → τ₂
deriving Repr, DecidableEq
notation:40 τ₁ "" τ₂ => MonoType.fn τ₁ τ₂
-- A type scheme: ∀α₁...αₙ. τ (quantified over a set of type variables)
structure TypeScheme where
vars : List Nat -- bound type variables
body : MonoType
deriving Repr
-- 13.2 — HM expression syntax (extend STLC with `let`)
inductive HMExpr where
| var (i : Nat) -- de Bruijn index
| lam (body : HMExpr) -- λ. e
| app (f a : HMExpr) -- f a
| lett (e₁ e₂ : HMExpr) -- let x = e₁ in e₂
deriving Repr
-- 13.3 — Environments
-- Γ maps de Bruijn indices to type schemes
def HMEnv := List TypeScheme
-- 13.4 — Declarative typing rules
inductive HMTyping : HMEnv HMExpr TypeScheme Prop where
-- Var: look up variable in the environment
| var (Γ : HMEnv) (i : Nat) (h : lookup Γ i = some σ) : HMTyping Γ (HMExpr.var i) σ
-- Abs: λx. e — note: x gets a monomorphic type in the lambda body
| abs (Γ : HMEnv) (e : HMExpr) (τ₁ τ₂ : MonoType)
(h : HMTyping ({vars := [], body := τ₁} :: Γ) e {vars := [], body := τ₂}) :
HMTyping Γ (HMExpr.lam e) {vars := [], body := τ₁ τ₂}
-- App: e₁ e₂
| app (Γ : HMEnv) (e₁ e₂ : HMExpr) (σ τ₁ τ₂ : MonoType) (σ' : TypeScheme)
(h₁ : HMTyping Γ e₁ σ') (h₂ : HMTyping Γ e₂ {vars := [], body := τ₁})
(h_eq : σ' = {vars := [], body := τ₁ τ₂}) :
HMTyping Γ (HMExpr.app e₁ e₂) {vars := [], body := τ₂}
-- Let: e₁ is generalized and e₂ can use the polymorphic type
| lett (Γ : HMEnv) (e₁ e₂ : HMExpr) (σ σ' : TypeScheme) (τ : MonoType)
(h₁ : HMTyping Γ e₁ σ) (h₂ : HMTyping (σ :: Γ) e₂ σ') :
HMTyping Γ (HMExpr.lett e₁ e₂) σ'
-- Gen: generalize over free type variables not in the environment
| gen (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) (αs : List Nat)
(h : HMTyping Γ e σ) (hfresh : α, α αs α ftv_env Γ) :
HMTyping Γ e {σ with vars := αs ++ σ.vars}
-- Inst: instantiate a type scheme
| inst (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) (τ : MonoType) (subst : Nat MonoType)
(h : HMTyping Γ e σ) (hinst : apply_subst subst σ = τ) :
HMTyping Γ e {vars := [], body := τ}
-- 13.5 — Exercise: the identity function is polymorphic
def id_type : TypeScheme := {vars := [0], body := MonoType.tvar 0 MonoType.tvar 0}
theorem id_polymorphic : HMTyping [] (HMExpr.lam (HMExpr.var 0)) id_type :=
by
sorry
-- 13.6 — Exercise: let x = λy. y in x x (instantiation of polymorphic identity)
-- This term should be well-typed — the identity is used at two different types
def self_app_id : HMExpr :=
HMExpr.lett (HMExpr.lam (HMExpr.var 0)) (HMExpr.app (HMExpr.var 0) (HMExpr.var 0))
theorem self_app_id_typed (τ : MonoType) :
HMTyping [] self_app_id {vars := [], body := τ τ} :=
by
sorry
```
---
← [Previous: Unit 12](12-type-safety.md) · Next: [Unit 14 — Algorithm W](14-algorithm-w.md)

View File

@@ -0,0 +1,156 @@
# Unit 14 — Algorithm W (Hindley-Milner Type Inference)
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Understand type inference as solving unification constraints
- Implement **Algorithm W**: the classic HM inference algorithm
- Write `unify` (Robinson's unification)
- Write `infer` (the main inference loop)
## Sources
- [Vaughan 2008, §3](https://www.jeffvaughan.net/docs/hmproof.pdf)
- [sdemos/type-inference](https://github.com/sdemos/type-inference)
- [Wikipedia: Algorithm W](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#Algorithm_W)
## Background
Algorithm W takes an expression and returns a substitution + type:
```
infer(Γ, e) = (S, τ)
```
- `Γ` maps variables to type schemes
- `e` is the expression to type
- `S` is a type substitution (unifies constraints found during inference)
- `τ` is the inferred monotype
The algorithm uses a supply of *fresh type variables* to build constraints,
then unifies them.
## Exercises
```lean
open MonoType
open HMExpr
open TypeScheme
-- 14.1 — Substitutions
-- A substitution maps type variables to monotypes
abbrev Subst := Nat MonoType
-- Identity substitution
def idSubst : Subst := fun α => MonoType.tvar α
-- Apply a substitution to a monotype
def applySubst (S : Subst) : MonoType MonoType
| MonoType.tvar α => S α
| MonoType.fn τ₁ τ₂ => MonoType.fn (applySubst S τ₁) (applySubst S τ₂)
-- Compose substitutions: (S₁ ∘ S₂)(α) = S₁(S₂(α))
def compose (S₁ S₂ : Subst) : Subst :=
fun α => applySubst S₁ (S₂ α)
-- 14.2 — Unification (Robinson's algorithm)
-- Returns a substitution that makes τ₁ and τ₂ equal, if possible.
-- Fails if there's a type mismatch (e.g., unifying α → β with α is impossible).
def unify (τ₁ τ₂ : MonoType) : Option Subst :=
match τ₁, τ₂ with
| MonoType.tvar α, MonoType.tvar β =>
if α == β then some idSubst
else some (fun γ => if γ == α then MonoType.tvar β else MonoType.tvar γ)
| MonoType.tvar α, τ =>
if occurs α τ then none -- occurs check: α ∉ ftv(τ)
else some (fun γ => if γ == α then τ else MonoType.tvar γ)
| τ, MonoType.tvar α =>
if occurs α τ then none
else some (fun γ => if γ == α then τ else MonoType.tvar γ)
| MonoType.fn τ₁ τ₁, MonoType.fn τ₂ τ₂ =>
match unify τ₁ τ₂ with
| none => none
| some S₁ =>
match unify (applySubst S₁ τ₁) (applySubst S₁ τ₂) with
| none => none
| some S₂ => some (compose S₂ S₁)
-- Occurs check: does α appear in τ?
def occurs (α : Nat) : MonoType Bool
| MonoType.tvar β => α == β
| MonoType.fn τ₁ τ₂ => occurs α τ₁ || occurs α τ₂
-- 14.3 — Fresh variable supply
-- We use a counter to generate fresh type variables
def freshVar (counter : Nat) : Nat × Nat := (counter, counter + 1)
-- 14.4 — Generalization: close a type under the environment
-- `generalize(Γ, τ)` produces `∀αs. τ` where αs = ftv(τ) \ ftv(Γ)
def generalize (Γ : HMEnv) (τ : MonoType) : TypeScheme :=
{ vars := (ftv τ).filter (fun α => α ftv_env Γ)
, body := τ }
-- 14.5 — Algorithm W (the core inference algorithm)
-- Returns `(S, τ)` where S is a substitution and τ the inferred type.
-- Uses a state monad for the fresh variable counter (simplified here).
def inferW (Γ : HMEnv) (e : HMExpr) (counter : Nat) : Option (Subst × MonoType × Nat) :=
match e with
| HMExpr.var i =>
match lookup Γ i with
| none => none
| some σ => some (idSubst, instantiate σ counter, counter + length σ.vars)
| HMExpr.lam body =>
-- Create a fresh type variable for the parameter
let (α, counter') := freshVar counter
let τ_param := MonoType.tvar α
-- Add x : α to the environment
let Γ' := {vars := [], body := τ_param} :: Γ
-- Infer the body type
match inferW Γ' body counter' with
| none => none
| some (S, τ_body, counter'') =>
some (S, MonoType.fn (applySubst S τ_param) τ_body, counter'')
| HMExpr.app f a =>
match inferW Γ f counter with
| none => none
| some (S₁, τ_f, counter₁) =>
match inferW (applySubstEnv S₁ Γ) a counter₁ with
| none => none
| some (S₂, τ_a, counter₂) =>
let β := freshVar counter₂
let α_counter₃ := β.2
match unify (applySubst S₂ τ_f) (MonoType.fn τ_a (MonoType.tvar β.1)) with
| none => none
| some S₃ =>
let S := compose S₃ (compose S₂ S₁)
some (S, applySubst S₃ (MonoType.tvar β.1), α_counter₃)
| HMExpr.lett e₁ e₂ =>
match inferW Γ e₁ counter with
| none => none
| some (S₁, τ₁, counter₁) =>
let σ₁ := generalize (applySubstEnv S₁ Γ) τ₁
let Γ' := σ₁ :: applySubstEnv S₁ Γ
match inferW Γ' e₂ counter₁ with
| none => none
| some (S₂, τ₂, counter₂) =>
some (compose S₂ S₁, τ₂, counter₂)
-- 14.6 — Exercise: infer the type of λx. x
def infer_id : Option (Subst × MonoType × Nat) :=
inferW [] (HMExpr.lam (HMExpr.var 0)) 0
-- Should return a substitution mapping α₀ to α₀ and type α₀ → α₀
-- #eval infer_id
-- 14.7 — Exercise: infer the type of let x = λy. y in x x
def infer_self_app : Option (Subst × MonoType × Nat) :=
inferW [] self_app_id 0
```
---
← [Previous: Unit 13](13-hm-declarative.md) · Next: [Unit 15 — Soundness and Completeness](15-soundness-completeness.md)

View File

@@ -0,0 +1,85 @@
# Unit 15 — Soundness and Completeness of Algorithm W
**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)
## Goals
- Prove that Algorithm W is **sound**: every inferred type is derivable in the declarative system
- Prove that Algorithm W is **complete**: every derivable type can be inferred (up to generalization)
- Understand the relationship between inference and declarative rules
## Sources
- [Vaughan 2008, §4](https://www.jeffvaughan.net/docs/hmproof.pdf)
- [sdemos/type-inference](https://github.com/sdemos/type-inference) `Soundness.lean`, `Completeness.lean`
## The Theorems
**Soundness**: If `infer(Γ, e) = (S, τ)`, then `Γ ⊢ e : τ` declaratively.
**Completeness**: If `Γ ⊢ e : τ` declaratively, then there exists `S, τ'` such that
`infer(Γ, e) = (S, τ')` and `τ` is a substitution instance of `τ'` (i.e., W finds the *most general* type).
## Exercises
```lean
open HMTyping
open MonoType
-- 15.1 — Soundness of Algorithm W
theorem soundness_W (Γ : HMEnv) (e : HMExpr) (S : Subst) (τ : MonoType) (counter : Nat)
(h : inferW Γ e counter = some (S, τ, _)) :
HMTyping (applySubstEnv S Γ) e {vars := [], body := τ} :=
by
induction e generalizing Γ S τ counter with
| var i =>
-- Case: variable lookup. Need to relate instantiated type scheme to declarative Var rule.
sorry
| lam body ih =>
-- Case: lambda. Use the IH and the Abs rule.
sorry
| app f a ihf iha =>
-- Case: application. The unification produces a substitution that makes types match.
-- Need substitution composition lemmas.
sorry
| lett e₁ e₂ ih₁ ih₂ =>
-- Case: let. Generalization produces a type scheme; the body gets the polymorphic type.
sorry
-- 15.2 — Completeness of Algorithm W
theorem completeness_W (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme)
(h : HMTyping Γ e σ) :
(S : Subst) (τ : MonoType) (counter : Nat),
inferW Γ e 0 = some (S, τ, counter)
(S' : Subst), applySubst S' τ = σ.body :=
by
induction h with
| var Γ i hlook =>
sorry
| abs Γ e τ₁ τ₂ hbody ih =>
sorry
| app Γ e₁ e₂ σ τ₁ τ₂ σ' h₁ h₂ h_eq ih₁ ih₂ =>
sorry
| lett Γ e₁ e₂ σ σ' τ h₁ h₂ ih₁ ih₂ =>
sorry
| gen Γ e σ αs hbody hfresh ih =>
sorry
| inst Γ e σ τ subst hbody hinst ih =>
sorry
-- 15.3 — Principal Types Corollary
-- Every well-typed expression has a *principal type* — one that all others are instances of.
theorem principal_types (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme)
(h : HMTyping Γ e σ) :
(τ_principal : MonoType),
-- W finds the principal type
( S c, inferW Γ e 0 = some (S, τ_principal, c))
-- σ's body is an instance of the principal type
( S', applySubst S' τ_principal = σ.body) :=
by
exact completeness_W Γ e σ h
```
---
← [Previous: Unit 14](14-algorithm-w.md) · [← Tutorial 2 Index](.) · [← Back to README](../README.md)