From 0cf85517c2022ee08c5923d76691b4c94a134d76 Mon Sep 17 00:00:00 2001 From: Hermes Agent Date: Thu, 28 May 2026 18:14:07 +0200 Subject: [PATCH] Initial commit: 15-unit Lean 4 + PL semantics tutorial curriculum --- README.md | 48 ++++++ references.md | 39 +++++ tutorial-01-basics/01-types-and-functions.md | 107 ++++++++++++ tutorial-01-basics/02-inductive-types.md | 106 ++++++++++++ .../03-propositions-and-proofs.md | 90 ++++++++++ .../04-quantifiers-and-equality.md | 97 +++++++++++ tutorial-01-basics/05-advanced-tactics.md | 52 ++++++ .../06-structures-type-classes.md | 68 ++++++++ tutorial-01-basics/07-dependent-types.md | 77 +++++++++ .../08-syntax-representation.md | 82 +++++++++ tutorial-02-semantics/09-substitution.md | 81 +++++++++ .../10-small-step-semantics.md | 65 ++++++++ tutorial-02-semantics/11-stlc-type-system.md | 78 +++++++++ tutorial-02-semantics/12-type-safety.md | 104 ++++++++++++ tutorial-02-semantics/13-hm-declarative.md | 119 +++++++++++++ tutorial-02-semantics/14-algorithm-w.md | 156 ++++++++++++++++++ .../15-soundness-completeness.md | 85 ++++++++++ 17 files changed, 1454 insertions(+) create mode 100644 README.md create mode 100644 references.md create mode 100644 tutorial-01-basics/01-types-and-functions.md create mode 100644 tutorial-01-basics/02-inductive-types.md create mode 100644 tutorial-01-basics/03-propositions-and-proofs.md create mode 100644 tutorial-01-basics/04-quantifiers-and-equality.md create mode 100644 tutorial-01-basics/05-advanced-tactics.md create mode 100644 tutorial-01-basics/06-structures-type-classes.md create mode 100644 tutorial-01-basics/07-dependent-types.md create mode 100644 tutorial-02-semantics/08-syntax-representation.md create mode 100644 tutorial-02-semantics/09-substitution.md create mode 100644 tutorial-02-semantics/10-small-step-semantics.md create mode 100644 tutorial-02-semantics/11-stlc-type-system.md create mode 100644 tutorial-02-semantics/12-type-safety.md create mode 100644 tutorial-02-semantics/13-hm-declarative.md create mode 100644 tutorial-02-semantics/14-algorithm-w.md create mode 100644 tutorial-02-semantics/15-soundness-completeness.md diff --git a/README.md b/README.md new file mode 100644 index 0000000..520c8ac --- /dev/null +++ b/README.md @@ -0,0 +1,48 @@ +# Lean 4 + Programming Language Semantics + +A self-study curriculum for computer scientists who know PL theory and want +to mechanize proofs in Lean 4 — from syntax to soundness of the +Hindley-Milner type system. + +## Structure + +``` +├── README.md ← this file +├── tutorial-01-basics/ ← Lean 4 fundamentals +│ ├── 01-types-and-functions.md +│ ├── 02-inductive-types.md +│ ├── 03-propositions-and-proofs.md +│ ├── 04-quantifiers-and-equality.md +│ ├── 05-advanced-tactics.md +│ ├── 06-structures-type-classes.md +│ └── 07-dependent-types.md +├── tutorial-02-semantics/ ← PL semantics in Lean +│ ├── 08-syntax-representation.md +│ ├── 09-substitution.md +│ ├── 10-small-step-semantics.md +│ ├── 11-stlc-type-system.md +│ ├── 12-type-safety.md +│ ├── 13-hm-declarative.md +│ ├── 14-algorithm-w.md +│ └── 15-soundness-completeness.md +└── references.md ← all cited resources +``` + +## How to Use + +1. Install Lean 4 (VS Code + `lean4` extension): + [lean-lang.org/lean4/doc/quickstart.html](https://lean-lang.org/lean4/doc/quickstart.html) +2. Start with **Tutorial 1**, Unit 1. Each unit is a standalone `.md` file with + inline Lean exercises. +3. Type the exercises into a `.lean` file and make it compile. +4. After Tutorial 1, move to **Tutorial 2** to mechanize STLC and HM. + +## Prerequisites + +- Functional programming (Haskell/OCaml familiarity) +- Basic PL theory (lambda calculus, operational semantics, type systems) +- No prior proof assistant experience required + +## References + +See [references.md](references.md) for the full list of resources. diff --git a/references.md b/references.md new file mode 100644 index 0000000..b7fccf3 --- /dev/null +++ b/references.md @@ -0,0 +1,39 @@ +# References + +## Primary Learning Resources + +| Abbr | Title | Link | +|------|-------|------| +| FPIL | *Functional Programming in Lean* | https://lean-lang.org/functional_programming_in_lean/ | +| TPIL | *Theorem Proving in Lean 4* | https://leanprover.github.io/theorem_proving_in_lean4/ | +| MiL | *Mathematics in Lean* | https://leanprover-community.github.io/mathematics_in_lean/ | + +## PL Semantics Formalizations in Lean 4 + +| Resource | What it covers | Link | +|----------|---------------|------| +| syndikos/lean4-stlc | STLC: syntax, semantics, typing, metatheory (Progress + Preservation) | https://github.com/syndikos/lean4-stlc | +| chenson2018/LeanScratch | Untyped λ-calculus: de Bruijn + locally nameless, confluence proofs | https://github.com/chenson2018/LeanScratch | +| sdemos/type-inference | STLC + HM type inference, soundness & completeness of W | https://github.com/sdemos/type-inference | +| rami3l/PLFaLean | PLFA Parts 2–3 translated to Lean 4 | https://github.com/rami3l/PLFaLean | + +## PL Theory (Non-Lean) + +| Resource | Format | Link | +|----------|--------|------| +| Software Foundations Vol.2 | Rocq book | https://softwarefoundations.cis.upenn.edu/ | +| PLFA Part 2 | Agda book | https://plfa.github.io/ | +| Chris Henson: Beginner Resources for Formalizing Lambda Calculi | Blog post | https://chrishenson.net/posts/2025-05-10-formalized_lambda_calculus.html | + +## HM Type System + +| Resource | Format | Link | +|----------|--------|------| +| Wikipedia: Hindley-Milner type system | Reference | https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system | +| Jeff Vaughan: "A proof of correctness for the HM type inference algorithm" | PDF (2008) | https://www.jeffvaughan.net/docs/hmproof.pdf | + +## Community + +- Official site: https://lean-lang.org/ +- Zulip chat: https://leanprover.zulipchat.com/ +- Lean 4 docs: https://lean-lang.org/documentation/ diff --git a/tutorial-01-basics/01-types-and-functions.md b/tutorial-01-basics/01-types-and-functions.md new file mode 100644 index 0000000..36fd57b --- /dev/null +++ b/tutorial-01-basics/01-types-and-functions.md @@ -0,0 +1,107 @@ +# Unit 1 — Types and Functions + +**Tutorial 1: Lean 4 Fundamentals** · [← Back to README](../README.md) + +## Goals + +- Define functions with `def` and lambda notation +- Understand Lean's basic types: `Nat`, `Int`, `String`, `Bool`, `List`, `Option` +- Use pattern matching and `match` expressions +- Run code with `#eval` + +## Source + +*Functional Programming in Lean* (FPIL), Chapters 1–2 +→ https://lean-lang.org/functional_programming_in_lean/ + +## Concepts + +| Concept | Lean syntax | Example | +|---------|------------|---------| +| Function definition | `def f (x : Nat) : Nat :=` | `def double (x : Nat) : Nat := x + x` | +| Lambda | `fun x => ...` | `fun x => x + 1` | +| Pattern matching | `match e with \| pat => ...` | `match n with \| 0 => "zero" \| _ => "non-zero"` | +| Recursion | `def f ... := match ...` | Structural recursion only (no general fixpoint) | +| `#eval` | `#eval expr` | `#eval double 21` prints `42` | + +## Exercises + +> **How to work**: Create a file `Unit1.lean`. Write each exercise, then +> `#eval` to test it (if it's a function) or just make it compile (if it's a +> proof or definition). Fill in the `sorry` placeholders. + +### Exercise 1.1 — Warm-up functions + +```lean +-- (a) Write a function that returns the absolute difference between two nats +def absDiff (a b : Nat) : Nat := + sorry + +#eval absDiff 7 3 -- expected: 4 +#eval absDiff 3 7 -- expected: 4 +#eval absDiff 5 5 -- expected: 0 + +-- (b) Write a function that checks whether a list contains an element +def contains [BEq α] (x : α) (xs : List α) : Bool := + sorry + +#eval contains 3 [1, 2, 3, 4] -- expected: true +#eval contains 5 [1, 2, 3, 4] -- expected: false + +-- (c) Write `map` (without looking at the standard library) +def myMap (f : α → β) (xs : List α) : List β := + sorry + +#eval myMap (fun x => x * 2) [1, 2, 3] -- expected: [2, 4, 6] +``` + +### Exercise 1.2 — Recursion on lists + +```lean +-- (a) Sum of a list +def sumList (xs : List Nat) : Nat := + sorry + +#eval sumList [1, 2, 3, 4] -- expected: 10 + +-- (b) Reverse a list (the slow way is fine) +def myReverse (xs : List α) : List α := + sorry + +#eval myReverse [1, 2, 3] -- expected: [3, 2, 1] + +-- (c) Filter: keep elements satisfying p +def myFilter (p : α → Bool) (xs : List α) : List α := + sorry + +#eval myFilter (fun x => x % 2 == 0) [1, 2, 3, 4, 5, 6] -- expected: [2, 4, 6] +``` + +### Exercise 1.3 — Option type + +```lean +-- (a) Safe head +def safeHead (xs : List α) : Option α := + sorry + +#eval safeHead ([1, 2, 3] : List Nat) -- expected: some 1 +#eval safeHead ([] : List Nat) -- expected: none + +-- (b) Safe division +def safeDiv (a b : Nat) : Option Nat := + sorry + +#eval safeDiv 10 2 -- expected: some 5 +#eval safeDiv 10 0 -- expected: none + +-- (c) Lookup in an association list +def lookup [BEq α] (key : α) (alist : List (α × β)) : Option β := + sorry + +#eval lookup "b" [("a", 1), ("b", 2), ("c", 3)] -- expected: some 2 +#eval lookup "d" [("a", 1), ("b", 2), ("c", 3)] -- expected: none +``` + +--- + +← [Back to README](../README.md) · Next: [Unit 2 — Inductive Types](02-inductive-types.md) diff --git a/tutorial-01-basics/02-inductive-types.md b/tutorial-01-basics/02-inductive-types.md new file mode 100644 index 0000000..140fb09 --- /dev/null +++ b/tutorial-01-basics/02-inductive-types.md @@ -0,0 +1,106 @@ +# Unit 2 — Inductive Types + +**Tutorial 1: Lean 4 Fundamentals** · [← Back to README](../README.md) + +## Goals + +- Define inductive data types with the `inductive` keyword +- Understand constructors and pattern matching +- Write structurally recursive functions over inductive types +- Define `Nat`, `List`, and `Tree` from scratch + +## Sources + +- *Theorem Proving in Lean 4* (TPIL), Chapter 7 "Inductive Types" + → https://leanprover.github.io/theorem_proving_in_lean4/ +- *Functional Programming in Lean* (FPIL), Chapter 1 + +## Concepts + +| Concept | Lean syntax | +|---------|------------| +| Inductive definition | `inductive Name where \| ctor1 : ... → Name` | +| Pattern matching constructors | `match t with \| ctor1 x => ... \| ctor2 y => ...` | +| Structural recursion | Recursion on *structurally smaller* subterms only | +| `deriving` clauses | `deriving Repr, BEq, DecidableEq` for auto-generated code | + +## Exercises + +### Exercise 2.1 — Define `Nat` from scratch + +```lean +inductive MyNat where + | zero : MyNat + | succ : MyNat → MyNat + deriving Repr + +-- (a) Addition +def myAdd (a b : MyNat) : MyNat := + sorry + +-- Test (after defining conversion functions or writing your own checks) + +-- (b) Multiplication +def myMul (a b : MyNat) : MyNat := + sorry + +-- (c) Prove that zero is a right identity: myAdd a MyNat.zero = a +-- (We'll do proofs properly in Unit 3, but try `by rfl` for now) +``` + +### Exercise 2.2 — Define `List α` from scratch + +```lean +inductive MyList (α : Type) where + | nil : MyList α + | cons : α → MyList α → MyList α + deriving Repr + +-- (a) Length +def myLength {α : Type} (xs : MyList α) : Nat := + sorry + +-- (b) Append +def myAppend {α : Type} (xs ys : MyList α) : MyList α := + sorry + +-- (c) Prove: myLength (myAppend xs ys) = myLength xs + myLength ys +-- (Try this after Unit 4/5) +``` + +### Exercise 2.3 — Binary trees + +```lean +inductive BinTree (α : Type) where + | leaf : BinTree α + | node : BinTree α → α → BinTree α → BinTree α + deriving Repr + +-- (a) Count the number of nodes +def size {α : Type} (t : BinTree α) : Nat := + sorry + +-- Example tree: node (node leaf 1 leaf) 2 (node leaf 3 leaf) +def exampleTree : BinTree Nat := + BinTree.node (BinTree.node BinTree.leaf 1 BinTree.leaf) 2 + (BinTree.node BinTree.leaf 3 BinTree.leaf) + +#eval size exampleTree -- expected: 3 + +-- (b) Collect all values into a list (inorder traversal) +def inorder {α : Type} (t : BinTree α) : List α := + sorry + +#eval inorder exampleTree -- expected: [1, 2, 3] + +-- (c) Check if a tree is a binary search tree +-- Assume Int keys. A BST satisfies: left < key < right at every node. +-- Write a function that checks this property. +-- Hint: you may want a helper with min/max bounds. +def isBST (t : BinTree Int) : Bool := + sorry +``` + +--- + +← [Previous: Unit 1](01-types-and-functions.md) · Next: [Unit 3 — Propositions and Proofs](03-propositions-and-proofs.md) diff --git a/tutorial-01-basics/03-propositions-and-proofs.md b/tutorial-01-basics/03-propositions-and-proofs.md new file mode 100644 index 0000000..552e472 --- /dev/null +++ b/tutorial-01-basics/03-propositions-and-proofs.md @@ -0,0 +1,90 @@ +# Unit 3 — Propositions and Proofs + +**Tutorial 1: Lean 4 Fundamentals** · [← Back to README](../README.md) + +## Goals + +- Understand propositions as types (Curry-Howard) +- Write basic proofs using `example` and `theorem` +- Use the tactic language: `intro`, `apply`, `exact`, `have` + +## Source + +*Theorem Proving in Lean 4* (TPIL), Chapters 2–3 +→ https://leanprover.github.io/theorem_proving_in_lean4/ + +## Concepts + +| Concept | Lean | +|---------|------| +| Proposition | `P : Prop` — a type whose terms are proofs | +| Implication `P → Q` | Function from proofs of `P` to proofs of `Q` | +| Conjunction `P ∧ Q` | `And.intro` (constructor), `.left` / `.right` (projections) | +| Disjunction `P ∨ Q` | `Or.inl` / `Or.inr` | +| Negation `¬ P` | `P → False` | +| Tactics | `intro`, `apply`, `exact`, `have`, `assumption` | + +## Exercises + +### Exercise 3.1 — Implicational logic + +```lean +-- (a) Identity +theorem id_prop (A : Prop) : A → A := + by + intro h + exact h + +-- (b) Composition +theorem compose (A B C : Prop) : (A → B) → (B → C) → (A → C) := + by + sorry + +-- (c) Currying +theorem curry (A B C : Prop) : (A → B → C) → (A ∧ B → C) := + by + sorry +``` + +### Exercise 3.2 — Conjunction and disjunction + +```lean +-- (a) Swap conjuncts +theorem and_comm (A B : Prop) : A ∧ B → B ∧ A := + by + sorry + +-- (b) Disjunction is symmetric +theorem or_comm (A B : Prop) : A ∨ B → B ∨ A := + by + sorry + +-- (c) Forward reasoning with `have` +theorem forward_example (A B C : Prop) (h1 : A → B) (h2 : B → C) (ha : A) : C := + by + have hb : B := h1 ha + sorry +``` + +### Exercise 3.3 — Negation + +```lean +-- (a) Contradiction +theorem contrapositive (A B : Prop) : (A → B) → (¬ B → ¬ A) := + by + sorry + +-- (b) Double negation introduction +theorem double_neg_intro (A : Prop) : A → ¬ ¬ A := + by + sorry + +-- (c) Ex falso quodlibet +theorem ex_falso (A : Prop) : False → A := + by + sorry +``` + +--- + +← [Previous: Unit 2](02-inductive-types.md) · Next: [Unit 4 — Quantifiers and Equality](04-quantifiers-and-equality.md) diff --git a/tutorial-01-basics/04-quantifiers-and-equality.md b/tutorial-01-basics/04-quantifiers-and-equality.md new file mode 100644 index 0000000..a4e73bd --- /dev/null +++ b/tutorial-01-basics/04-quantifiers-and-equality.md @@ -0,0 +1,97 @@ +# Unit 4 — Quantifiers and Equality + +**Tutorial 1: Lean 4 Fundamentals** · [← Back to README](../README.md) + +## Goals + +- Use `∀` (forall) and `∃` (exists) quantifiers +- Manipulate equality with `rfl`, `rw`, `calc` +- Combine quantifiers with logical connectives + +## Source + +*Theorem Proving in Lean 4* (TPIL), Chapter 4 +→ https://leanprover.github.io/theorem_proving_in_lean4/ + +## Concepts + +| Concept | Lean | +|---------|------| +| Universal `∀ x, P x` | `intro x` to prove; `h x` or `apply h` to use | +| Existential `∃ x, P x` | `⟨x, h⟩` to prove; `rcases h with ⟨x, hx⟩` to use | +| Equality `a = b` | `rfl` when definitional; `rw [h]` when propositional | +| `calc` block | Chain equalities: `calc a = b := h1; _ = c := h2` | + +## Exercises + +### Exercise 4.1 — Universal quantifier + +```lean +-- (a) Prove a simple forall +theorem all_imp (P Q : Nat → Prop) (h : ∀ n, P n → Q n) (x : Nat) (hp : P x) : Q x := + by + sorry + +-- (b) Distributing ∀ over ∧ +theorem forall_and_distrib (P Q : Nat → Prop) : (∀ n, P n ∧ Q n) ↔ (∀ n, P n) ∧ (∀ n, Q n) := + by + constructor + · sorry -- left-to-right + · sorry -- right-to-left + +-- (c) Prove that if every natural is even, then 42 is even +-- (Use: `Even n := ∃ k, n = 2*k`) +def Even (n : Nat) : Prop := ∃ k, n = 2 * k + +theorem all_even_implies_42 : (∀ n, Even n) → Even 42 := + by + sorry +``` + +### Exercise 4.2 — Existential quantifier + +```lean +-- (a) Prove existence +theorem exists_square (a : Nat) : ∃ n, n = a * a := + by + sorry + +-- (b) Distributing ∃ over ∨ +theorem exists_or_distrib (P Q : Nat → Prop) : (∃ n, P n ∨ Q n) ↔ (∃ n, P n) ∨ (∃ n, Q n) := + by + constructor + · sorry + · sorry + +-- (c) Prove: if something is both even and greater than 10, then something is greater than 5 +theorem even_and_gt10_implies_gt5 : (∃ n, Even n ∧ n > 10) → (∃ n, n > 5) := + by + sorry +``` + +### Exercise 4.3 — Equality and rewriting + +```lean +-- (a) Prove symmetry and transitivity of equality (without using `symm`/`trans`) +theorem eq_symm {α : Type} (a b : α) (h : a = b) : b = a := + by + sorry + +theorem eq_trans {α : Type} (a b c : α) (h1 : a = b) (h2 : b = c) : a = c := + by + sorry + +-- (b) Use `calc` to prove a chain of identities +theorem calc_example (a b c d : Nat) (h1 : a = b) (h2 : b = c + 1) (h3 : c + 1 = d) : a = d := + by + sorry + +-- (c) Rewrite inside a goal +theorem rewrite_example (a b : Nat) (h : a = b + 1) : a * 2 = (b + 1) * 2 := + by + sorry +``` + +--- + +← [Previous: Unit 3](03-propositions-and-proofs.md) · Next: [Unit 5 — Advanced Tactics](05-advanced-tactics.md) diff --git a/tutorial-01-basics/05-advanced-tactics.md b/tutorial-01-basics/05-advanced-tactics.md new file mode 100644 index 0000000..a5eea88 --- /dev/null +++ b/tutorial-01-basics/05-advanced-tactics.md @@ -0,0 +1,52 @@ +# Unit 5 — Advanced Tactics + +**Tutorial 1: Lean 4 Fundamentals** · [← Back to README](../README.md) + +## Goals + +- Prove properties by induction with `induction` +- Use `simp`, `ring`, `omega` for automation +- Use `rcases` and `obtain` for case analysis +- Work with `Nat` arithmetic proofs + +## Sources + +- TPIL Chapters 5–6 → https://leanprover.github.io/theorem_proving_in_lean4/ +- *Mathematics in Lean* (MiL), Chapter 1 → https://leanprover-community.github.io/mathematics_in_lean/ + +## Exercises + +```lean +open Nat + +-- 5.1 — Induction on naturals +theorem add_assoc (a b c : Nat) : (a + b) + c = a + (b + c) := by + sorry + +theorem add_comm (a b : Nat) : a + b = b + a := by + sorry + +theorem mul_comm (a b : Nat) : a * b = b * a := by + sorry + +theorem zero_mul (n : Nat) : 0 * n = 0 := by + sorry + +-- 5.2 — Induction on lists +theorem reverse_reverse (xs : List α) : xs.reverse.reverse = xs := by + sorry + +theorem length_append (xs ys : List α) : (xs ++ ys).length = xs.length + ys.length := by + sorry + +theorem map_id (xs : List α) : xs.map id = xs := by + sorry + +-- 5.3 — Using `simp` and `ring` +theorem square_sum (a b : Nat) : (a + b) ^ 2 = a ^ 2 + 2 * a * b + b ^ 2 := by + sorry +``` + +--- + +← [Previous: Unit 4](04-quantifiers-and-equality.md) · Next: [Unit 6 — Structures and Type Classes](06-structures-type-classes.md) diff --git a/tutorial-01-basics/06-structures-type-classes.md b/tutorial-01-basics/06-structures-type-classes.md new file mode 100644 index 0000000..3f307d1 --- /dev/null +++ b/tutorial-01-basics/06-structures-type-classes.md @@ -0,0 +1,68 @@ +# Unit 6 — Structures and Type Classes + +**Tutorial 1: Lean 4 Fundamentals** · [← Back to README](../README.md) + +## Goals + +- Define `structure` for records with named fields +- Understand type classes as structures + instance resolution +- Use `deriving` for `Repr`, `BEq`, `Inhabited` +- Write monadic code with `do` notation + +## Sources + +- TPIL Chapter 9 (Structures), Chapter 10 (Type Classes) +- FPIL Chapters 3–5 (overloading, monads) + +## Exercises + +```lean +-- 6.1 — Structures +structure Point where + x : Float + y : Float +deriving Repr + +def distance (p q : Point) : Float := + sorry + +#eval distance {x := 0, y := 0} {x := 3, y := 4} -- expected: 5.0 + +-- 6.2 — Type classes +class Semigroup (α : Type) where + mul : α → α → α + mul_assoc : ∀ a b c : α, mul (mul a b) c = mul a (mul b c) + +instance : Semigroup Nat where + mul := Nat.mul + mul_assoc := Nat.mul_assoc + +-- Define Monoid extending Semigroup with an identity +class Monoid (α : Type) extends Semigroup α where + one : α + mul_one : ∀ a : α, mul a one = a + one_mul : ∀ a : α, mul one a = a + +instance : Monoid Nat where + one := 1 + mul_one := sorry + one_mul := sorry + +-- 6.3 — Option monad and `do` notation +def safeDiv (a b : Nat) : Option Nat := + if b == 0 then none else some (a / b) + +-- Rewrite this with `do` notation: +def compute (x y z : Nat) : Option Nat := + do + let a ← safeDiv x y + let b ← safeDiv a z + some (b + 1) + +#eval compute 10 2 3 -- expected: some 2 +#eval compute 10 0 3 -- expected: none +``` + +--- + +← [Previous: Unit 5](05-advanced-tactics.md) · Next: [Unit 7 — Dependent Types](07-dependent-types.md) diff --git a/tutorial-01-basics/07-dependent-types.md b/tutorial-01-basics/07-dependent-types.md new file mode 100644 index 0000000..f0715a2 --- /dev/null +++ b/tutorial-01-basics/07-dependent-types.md @@ -0,0 +1,77 @@ +# Unit 7 — Dependent Types + +**Tutorial 1: Lean 4 Fundamentals** · [← Back to README](../README.md) + +## Goals + +- Understand types that depend on values +- Work with `Fin n` (numbers < n), `Vector α n` (fixed-length lists) +- Write functions whose return type depends on input +- See why dependent types matter for PL semantics + +## Sources + +- TPIL Chapter 2 "Dependent Type Theory" +- FPIL Chapter 7 "Programming with Dependent Types" +- Henson 2025: "Beginner Resources for Formalizing Lambda Calculi" + +## Concepts + +| Concept | Lean | +|---------|------| +| Pi type `(x : α) → β x` | Function where return type depends on argument | +| Sigma type `Σ x : α, β x` | Dependent pair: value + property | +| `Fin n` | Natural numbers `< n` | +| `Vector α n` | Lists with length tracked in the type | +| `h : a = b → ...` | Equality can rewrite types (substitution principle) | + +## Exercises + +```lean +-- 7.1 — Fin n +-- Define a safe nth function for lists that *cannot* fail at runtime +def safeNth {α : Type} (xs : List α) (i : Fin xs.length) : α := + sorry + +-- (You'll need `match` on both `xs` and `i`) +-- Test: `safeNth [1,2,3] ⟨0, by decide⟩` should return 1 + +-- 7.2 — Vector +inductive Vector (α : Type) : Nat → Type where + | nil : Vector α 0 + | cons : α → {n : Nat} → Vector α n → Vector α (n+1) +deriving Repr + +-- (a) Append two vectors (length is m + n) +def vappend {α : Type} {m n : Nat} (v : Vector α m) (w : Vector α n) : Vector α (m + n) := + sorry + +-- (b) Map over a vector +def vmap {α β : Type} {n : Nat} (f : α → β) (v : Vector α n) : Vector β n := + sorry + +-- 7.3 — Dependent types for ASTs +-- Define an AST where each expression *carries its type* +inductive Expr : Type where + | lit : Nat → Expr + | add : Expr → Expr → Expr + | isZero : Expr → Expr +deriving Repr + +-- Now define the *typing predicate* as a dependent relation +-- (Preview of Tutorial 2) +inductive Ty : Type where | nat : Ty | bool : Ty + +-- Fill this in: `HasType e τ` means "expression e has type τ" +inductive HasType : Expr → Ty → Prop where + -- (your constructors here) + +-- Then prove: every well-typed `isZero` expression returns a bool +theorem isZero_returns_bool (e : Expr) : HasType (Expr.isZero e) Ty.nat → HasType e Ty.nat := + by + sorry +``` + +--- + +← [Previous: Unit 6](06-structures-type-classes.md) · [← Tutorial 1 Index](./) · Next: [Tutorial 2 — Unit 8](../tutorial-02-semantics/08-syntax-representation.md) diff --git a/tutorial-02-semantics/08-syntax-representation.md b/tutorial-02-semantics/08-syntax-representation.md new file mode 100644 index 0000000..189eb81 --- /dev/null +++ b/tutorial-02-semantics/08-syntax-representation.md @@ -0,0 +1,82 @@ +# Unit 8 — Representing Syntax + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Encode lambda calculus terms as an inductive type +- Understand three binding representations: + 1. **Named** (strings — simple, but α-equiv isn't definitional) + 2. **de Bruijn indices** (numbers — α-equiv is free, shifting is painful) + 3. **Locally nameless** (free vars named, bound vars indexed — compromise) + +We'll use de Bruijn indices (the "heavy lifter") for the rest of this tutorial, +with locally nameless for comparison. + +## Sources + +- syndikos/lean4-stlc `Syntax.lean`: https://github.com/syndikos/lean4-stlc +- Chris Henson 2025: https://chrishenson.net/posts/2025-05-10-formalized_lambda_calculus.html +- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch +- Software Foundations Vol.2: https://softwarefoundations.cis.upenn.edu/ + +## Exercises + +```lean +-- 8.1 — Named representation +inductive NamedTerm where + | var (x : String) + | lam (x : String) (body : NamedTerm) + | app (f arg : NamedTerm) +deriving Repr + +-- The Church encoding of identity: λx. x +def idNamed : NamedTerm := NamedTerm.lam "x" (NamedTerm.var "x") + +-- Encode λx. λy. x (K combinator) +def kNamed : NamedTerm := + sorry + +-- Encode λf. λx. f (f x) (Church numeral 2) +def twoNamed : NamedTerm := + sorry + +-- 8.2 — de Bruijn representation +-- Variables are numbers: 0 = nearest binder, 1 = next, etc. +inductive DBTerm where + | var (idx : Nat) -- variable reference by binding distance + | lam (body : DBTerm) -- λ. body (no name needed!) + | app (f arg : DBTerm) +deriving Repr + +-- λ. λ. 1 (= λx. λy. x in named form) +def kDB : DBTerm := DBTerm.lam (DBTerm.lam (DBTerm.var 1)) + +-- λ. 0 (= λx. x in named form) +def idDB : DBTerm := DBTerm.lam (DBTerm.var 0) + +-- Encode λf. λx. f (f x) (Church 2) +def twoDB : DBTerm := + sorry + +-- 8.3 — Locally nameless +-- Free variables are strings, bound variables are de Bruijn indices +-- (You don't need to implement this fully — just understand the idea) +inductive LNTerm where + | fvar (x : String) -- free variable + | bvar (idx : Nat) -- bound variable (de Bruijn) + | lam (body : LNTerm) -- binder + | app (f arg : LNTerm) +deriving Repr +``` + +### Key insight for PL semantics + +When we encode **typing contexts** `Γ = x₁:τ₁, x₂:τ₂, ...`, de Bruijn indices +give us "index into the context" for free. The last binding is index 0, the +second-last is index 1, etc. This makes the typing rules elegant in Lean — +no name-clash avoidance needed. + +--- + +← [Tutorial 1 — Unit 7](../tutorial-01-basics/07-dependent-types.md) · Next: [Unit 9 — Substitution](09-substitution.md) diff --git a/tutorial-02-semantics/09-substitution.md b/tutorial-02-semantics/09-substitution.md new file mode 100644 index 0000000..02bfcd8 --- /dev/null +++ b/tutorial-02-semantics/09-substitution.md @@ -0,0 +1,81 @@ +# Unit 9 — Substitution and α-Equivalence + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Implement `subst` (substitution) for de Bruijn terms +- Implement `lift` (index shifting) — the key operation that avoids capture +- Prove fundamental substitution lemmas + +## Sources + +- syndikos/lean4-stlc `Syntax.lean` and `Reduction.lean` +- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch +- PLFA Part 2: https://plfa.github.io/ + +## Exercises + +```lean +open DBTerm -- from Unit 8 + +-- 9.1 — Lift: shift all *free* variables above a cutoff by k +-- Lift is the engine that makes substitution work under binders. +-- When we go under a λ, existing de Bruijn indices need to shift by 1 +-- because a new binder appears. +def lift (t : DBTerm) (cutoff k : Nat) : DBTerm := + match t with + | var n => if n < cutoff then var n else var (n + k) + | lam body => lam (lift body (cutoff + 1) k) + | app f a => app (lift f cutoff k) (lift a cutoff k) + +-- Verify: lifting the identity function +-- lift (λ. 0) 0 1 should produce λ. 0 (bound variables inside λ are untouched) +#eval lift idDB 0 1 + +-- 9.2 — Substitution: replace variable `idx` with `s` in term `t` +-- This is capture-avoiding substitution via de Bruijn arithmetic. +-- Under a λ, the target index shifts by 1 and the replacement gets lifted. +def subst (t : DBTerm) (idx : Nat) (s : DBTerm) : DBTerm := + match t with + | var n => + if n < idx then var n + else if n == idx then s -- found the variable to replace + else var (n - 1) -- shift remaining vars down + | lam body => lam (subst body (idx + 1) (lift s 0 1)) + | app f a => app (subst f idx s) (subst a idx s) + +-- Test: substitute λ. 0 for variable 0 in (λ. 0) -- should be identity +-- Actually: [0 ↦ λ.0] (λ. 0) — nothing changes (0 is bound) +#eval subst (lam (var 0)) 0 (lam (var 0)) + +-- 9.3 — Single-step β-reduction (preview of next unit) +-- β: (λ. t) s → subst t 0 s +def betaRedex : DBTerm := app (lam (var 0)) (lam (var 0)) -- (λx. x) (λy. y) → ...y... + +-- Compute the result of one β step on this term +#eval subst (var 0) 0 (lam (var 0)) -- should be lam (var 0) (= λy. y) + +-- 9.4 — Lemma: Substitution under a lambda +-- If we substitute for index 0 in a term that's already shifted, +-- the lift in the body handles the shift correctly. +-- Prove this property: +lemma subst_lift_commute (t s : DBTerm) (n : Nat) : + subst (lift t 0 1) (n+1) (lift s 0 1) = lift (subst t n s) 0 1 := + by + sorry + +-- This lemma is *critical* for proving preservation/type safety later. +-- It says that shifting before and after substitution agree. +``` + +### Why this matters + +Substitution is the core operation of operational semantics. Doing it right +(no variable capture) is the hardest part of formalizing lambda calculi. +de Bruijn indices make capture impossible — the arithmetic is mechanical. +The lemmas here are boilerplate you prove once, then reuse everywhere. + +--- + +← [Previous: Unit 8](08-syntax-representation.md) · Next: [Unit 10 — Small-Step Semantics](10-small-step-semantics.md) diff --git a/tutorial-02-semantics/10-small-step-semantics.md b/tutorial-02-semantics/10-small-step-semantics.md new file mode 100644 index 0000000..3da5232 --- /dev/null +++ b/tutorial-02-semantics/10-small-step-semantics.md @@ -0,0 +1,65 @@ +# Unit 10 — Small-Step Operational Semantics + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Define single-step reduction `→` for call-by-value λ-calculus +- Define reflexive-transitive closure `→*` +- Prove determinism: `t → t₁ ∧ t → t₂ ⇒ t₁ = t₂` + +## Sources + +- syndikos/lean4-stlc `Reduction.lean`, `Semantics.lean` +- Software Foundations Vol.2 "Smallstep" +- PLFA Part 2 + +## Exercises + +```lean +open DBTerm + +-- 10.1 — Single-step cbv reduction +-- Call-by-value: reduce the *argument* before substituting into the function body. +-- A value is a λ-abstraction (in pure STLC). +inductive Value : DBTerm → Prop where + | lam : Value (lam b) + +inductive Step : DBTerm → DBTerm → Prop where + -- β-reduction: (λ. t) v → subst t 0 v (where v is a value) + | beta (t v : DBTerm) (hv : Value v) : + Step (app (lam t) v) (subst t 0 v) + -- Congruence: if f → f', then f a → f' a + | appL (f f' a : DBTerm) (hstep : Step f f') : + Step (app f a) (app f' a) + -- Congruence: if v is a value and a → a', then v a → v a' + | appR (v a a' : DBTerm) (hv : Value v) (hstep : Step a a') : + Step (app v a) (app v a') + +-- 10.2 — Multi-step (reflexive-transitive closure) +inductive StepStar : DBTerm → DBTerm → Prop where + | refl (t : DBTerm) : StepStar t t + | step (t₁ t₂ t₃ : DBTerm) (hstep : Step t₁ t₂) (hstar : StepStar t₂ t₃) : + StepStar t₁ t₃ + +-- 10.3 — Determinism +theorem step_deterministic {t t₁ t₂ : DBTerm} (h₁ : Step t t₁) (h₂ : Step t t₂) : t₁ = t₂ := + by + sorry + +-- Hint: induction on h₁, then cases on h₂. The tricky case: +-- when one side is (lam t) and the other is a congruence rule that also +-- produces (lam t) — they match. + +-- 10.4 — Stuck terms +-- A term is "stuck" if it's not a value and can't take a step. +-- Type soundness = "well-typed terms never get stuck". +def Stuck (t : DBTerm) : Prop := + ¬ Value t ∧ ¬ ∃ t', Step t t' + +-- Find a stuck term in the untyped calculus +def stuckExample : DBTerm := + app (var 0) (var 1) -- a free variable applied to another free variable + +-- Show it's stuck (you'll prove something similar in Unit 12 with types) +``` diff --git a/tutorial-02-semantics/11-stlc-type-system.md b/tutorial-02-semantics/11-stlc-type-system.md new file mode 100644 index 0000000..bc1bfd0 --- /dev/null +++ b/tutorial-02-semantics/11-stlc-type-system.md @@ -0,0 +1,78 @@ +# Unit 11 — STLC Type System + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Define simple types (`Nat → Bool`, etc.) +- Define typing contexts `Γ` as lists of types +- Encode the STLC typing judgment `Γ ⊢ t : τ` + +## Sources + +- syndikos/lean4-stlc `Typing.lean` +- Software Foundations Vol.2 "Types" +- sdemos/type-inference for type inference perspective + +## Exercises + +```lean +open DBTerm + +-- 11.1 — Types +inductive Ty where + | nat + | bool + | fn (τ₁ τ₂ : Ty) -- τ₁ → τ₂ +deriving Repr, DecidableEq + +-- Notations: τ₁ ⇒ τ₂ for function types, like in most PL papers +notation:40 τ₁ " ⇒ " τ₂ => Ty.fn τ₁ τ₂ + +-- 11.2 — Contexts +-- A context is a list of types, indexed by de Bruijn index. +-- Position 0 = the most recently bound variable. +def Ctx := List Ty + +-- 11.3 — Lookup: get the type of variable `i` in context `Γ` +def lookup (Γ : Ctx) (i : Nat) : Option Ty := + match Γ with + | [] => none + | τ :: _ => if i == 0 then some τ else lookup (Γ.tail) (i - 1) + +-- 11.4 — Typing judgment: Γ ⊢ t : τ +-- De Bruijn version — no variable names needed! +inductive HasType : Ctx → DBTerm → Ty → Prop where + | var (Γ : Ctx) (i : Nat) (h : lookup Γ i = some τ) : + HasType Γ (var i) τ + | lam (Γ : Ctx) (t : DBTerm) (τ₁ τ₂ : Ty) (h : HasType (τ₁ :: Γ) t τ₂) : + HasType Γ (lam t) (τ₁ ⇒ τ₂) + | app (Γ : Ctx) (t₁ t₂ : DBTerm) (τ₁ τ₂ : Ty) + (h₁ : HasType Γ t₁ (τ₁ ⇒ τ₂)) (h₂ : HasType Γ t₂ τ₁) : + HasType Γ (app t₁ t₂) τ₂ + +-- 11.5 — Exercise: type derivations +-- Build a typing derivation for the identity function λx. x +theorem id_typed (τ : Ty) : HasType [] (lam (var 0)) (τ ⇒ τ) := + by + sorry + +-- Build a typing derivation for the K combinator λx. λy. x +theorem k_typed (τ₁ τ₂ : Ty) : HasType [] (lam (lam (var 1))) (τ₁ ⇒ τ₂ ⇒ τ₁) := + by + sorry + +-- Build a typing derivation for the S combinator: λx. λy. λz. x z (y z) +-- (This is a good stress test — it exercises both app and lam multiple times) +theorem s_typed (τ₁ τ₂ τ₃ : Ty) : + HasType [] (lam (lam (lam (app (app (var 2) (var 0)) (app (var 1) (var 0)))))) + ((τ₁ ⇒ τ₂ ⇒ τ₃) ⇒ (τ₁ ⇒ τ₂) ⇒ τ₁ ⇒ τ₃) := + by + sorry + +-- 11.6 — Weakening: adding extra types to the context preserves typing +theorem weakening (Γ Δ : Ctx) (t : DBTerm) (τ : Ty) + (h : HasType Γ t τ) (hsub : ∀ i, lookup Γ i = lookup (Δ ++ Γ) i) : HasType (Δ ++ Γ) t τ := + by + sorry +``` diff --git a/tutorial-02-semantics/12-type-safety.md b/tutorial-02-semantics/12-type-safety.md new file mode 100644 index 0000000..ace5170 --- /dev/null +++ b/tutorial-02-semantics/12-type-safety.md @@ -0,0 +1,104 @@ +# Unit 12 — Type Safety (Progress + Preservation) + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Prove **Progress**: well-typed terms are either values or can step +- Prove **Preservation** (Subject Reduction): types survive reduction +- Combine them: **Type Soundness** = well-typed programs don't get stuck + +## Sources + +- syndikos/lean4-stlc `Metatheory.lean` +- Software Foundations Vol.2 +- PLFA Part 2: DeBruijn chapter + +## Key Lemmas + +``` +Canonical Forms: + If [] ⊢ v : τ₁ ⇒ τ₂ and v is a value, then v = λ.t + +Substitution Lemma: + If τ₁::Γ ⊢ t : τ₂ and Γ ⊢ s : τ₁, then Γ ⊢ subst t 0 s : τ₂ + +Preservation: + If Γ ⊢ t : τ and t → t', then Γ ⊢ t' : τ + +Progress: + If [] ⊢ t : τ, then either Value t or ∃ t', t → t' +``` + +## Exercises + +```lean +open DBTerm +open Step +open HasType + +-- 12.1 — Canonical forms lemma for functions +theorem canonical_forms_fn {t : DBTerm} {τ₁ τ₂ : Ty} + (h : HasType [] t (τ₁ ⇒ τ₂)) (hv : Value t) : ∃ body, t = lam body := + by + sorry + +-- 12.2 — Substitution lemma (the heart of the proof) +-- This uses the substitution commutation lemma from Unit 9. +theorem substitution_lemma (Γ : Ctx) (t s : DBTerm) (τ₁ τ₂ : Ty) + (h₁ : HasType (τ₁ :: Γ) t τ₂) (h₂ : HasType Γ s τ₁) : + HasType Γ (subst t 0 s) τ₂ := + by + sorry + +-- Hint: induction on h₁. The variable case branches on whether the index +-- is 0 (use h₂), less than 0 (impossible — Nat), or greater (use h₁). +-- The lambda case requires weakening to adjust the context. + +-- 12.3 — Preservation (Subject Reduction) +theorem preservation {Γ : Ctx} {t t' : DBTerm} {τ : Ty} + (htype : HasType Γ t τ) (hstep : Step t t') : HasType Γ t' τ := + by + induction hstep generalizing τ with + | beta t v hv => + -- Case (λ.t) v → subst t 0 v + -- Invert htype to get HasType (τ₁::Γ) t τ₂ and HasType Γ v τ₁ + -- Then apply the substitution lemma + sorry + | appL f f' a hstep_f ih => + -- Case f a → f' a where f → f' + sorry + | appR v a a' hv hstep_a ih => + -- Case v a → v a' where a → a' + sorry + +-- 12.4 — Progress +theorem progress {t : DBTerm} {τ : Ty} (h : HasType [] t τ) : + Value t ∨ ∃ t', Step t t' := + by + induction h with + | var Γ i hlook => + -- A variable can't be typed in the empty context + sorry + | lam Γ body τ₁ τ₂ hbody => + -- A lambda is a value + left; exact Value.lam + | app Γ f a τ₁ τ₂ hf ha ihf iha => + -- f a: either f is a value, or it can step, etc. + sorry + +-- 12.5 — Soundness corollary +-- Well-typed programs never get stuck. +theorem soundness {t : DBTerm} {τ : Ty} (h : HasType [] t τ) : + ¬ Stuck t := + by + -- A term is stuck if it's not a value AND can't step. + -- By progress, it's either a value or can step. + -- In both cases, it's not stuck. + intro hstuck + rcases hstuck with ⟨hnotval, hnostep⟩ + rcases progress h with (hv | hstep) + · exact hnotval hv + · apply hnostep + exact hstep +``` diff --git a/tutorial-02-semantics/13-hm-declarative.md b/tutorial-02-semantics/13-hm-declarative.md new file mode 100644 index 0000000..e6a7880 --- /dev/null +++ b/tutorial-02-semantics/13-hm-declarative.md @@ -0,0 +1,119 @@ +# Unit 13 — Hindley-Milner Declarative Typing Rules + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Extend STLC with **let-polymorphism** +- Encode HM types with **type schemes** `∀α. τ` +- Formalize the 6 declarative typing rules (Var, App, Abs, Let, Inst, Gen) +- Understand the distinction between monomorphic (`τ`) and polymorphic (`σ = ∀α.τ`) types + +## Sources + +- [Wikipedia: Hindley-Milner type system](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system) +- [Vaughan 2008](https://www.jeffvaughan.net/docs/hmproof.pdf) (declarative rules, §2) +- [sdemos/type-inference](https://github.com/sdemos/type-inference) (Lean formalization) + +## Background + +HM adds **let-polymorphism** to STLC: +``` +let x = e₁ in e₂ +``` +In STLC, `x` would have a single monomorphic type. In HM, `x` gets a type +*scheme* — e.g., `∀α. α → α` for the identity function — and each use of `x` +instantiates the scheme with fresh type variables. + +The declarative rules: +``` +Var: Γ, x:σ ⊢ x : σ +App: Γ ⊢ e₁ : τ₁ → τ₂ Γ ⊢ e₂ : τ₁ ⇛ Γ ⊢ e₁ e₂ : τ₂ +Abs: Γ, x:τ₁ ⊢ e : τ₂ ⇛ Γ ⊢ λx.e : τ₁ → τ₂ +Let: Γ ⊢ e₁ : σ Γ, x:σ ⊢ e₂ : τ ⇛ Γ ⊢ let x = e₁ in e₂ : τ +Inst: Γ ⊢ e : ∀α.τ ⇛ Γ ⊢ e : τ[α := τ'] +Gen: Γ ⊢ e : τ ∧ α ∉ ftv(Γ) ⇛ Γ ⊢ e : ∀α.τ +``` + +## Exercises + +```lean +-- 13.1 — Types and type schemes +inductive MonoType where + | tvar (id : Nat) -- type variable α + | fn (τ₁ τ₂ : MonoType) -- τ₁ → τ₂ +deriving Repr, DecidableEq + +notation:40 τ₁ " ⇒ " τ₂ => MonoType.fn τ₁ τ₂ + +-- A type scheme: ∀α₁...αₙ. τ (quantified over a set of type variables) +structure TypeScheme where + vars : List Nat -- bound type variables + body : MonoType +deriving Repr + +-- 13.2 — HM expression syntax (extend STLC with `let`) +inductive HMExpr where + | var (i : Nat) -- de Bruijn index + | lam (body : HMExpr) -- λ. e + | app (f a : HMExpr) -- f a + | lett (e₁ e₂ : HMExpr) -- let x = e₁ in e₂ +deriving Repr + +-- 13.3 — Environments +-- Γ maps de Bruijn indices to type schemes +def HMEnv := List TypeScheme + +-- 13.4 — Declarative typing rules +inductive HMTyping : HMEnv → HMExpr → TypeScheme → Prop where + + -- Var: look up variable in the environment + | var (Γ : HMEnv) (i : Nat) (h : lookup Γ i = some σ) : HMTyping Γ (HMExpr.var i) σ + + -- Abs: λx. e — note: x gets a monomorphic type in the lambda body + | abs (Γ : HMEnv) (e : HMExpr) (τ₁ τ₂ : MonoType) + (h : HMTyping ({vars := [], body := τ₁} :: Γ) e {vars := [], body := τ₂}) : + HMTyping Γ (HMExpr.lam e) {vars := [], body := τ₁ ⇒ τ₂} + + -- App: e₁ e₂ + | app (Γ : HMEnv) (e₁ e₂ : HMExpr) (σ τ₁ τ₂ : MonoType) (σ' : TypeScheme) + (h₁ : HMTyping Γ e₁ σ') (h₂ : HMTyping Γ e₂ {vars := [], body := τ₁}) + (h_eq : σ' = {vars := [], body := τ₁ ⇒ τ₂}) : + HMTyping Γ (HMExpr.app e₁ e₂) {vars := [], body := τ₂} + + -- Let: e₁ is generalized and e₂ can use the polymorphic type + | lett (Γ : HMEnv) (e₁ e₂ : HMExpr) (σ σ' : TypeScheme) (τ : MonoType) + (h₁ : HMTyping Γ e₁ σ) (h₂ : HMTyping (σ :: Γ) e₂ σ') : + HMTyping Γ (HMExpr.lett e₁ e₂) σ' + + -- Gen: generalize over free type variables not in the environment + | gen (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) (αs : List Nat) + (h : HMTyping Γ e σ) (hfresh : ∀ α, α ∈ αs → α ∉ ftv_env Γ) : + HMTyping Γ e {σ with vars := αs ++ σ.vars} + + -- Inst: instantiate a type scheme + | inst (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) (τ : MonoType) (subst : Nat → MonoType) + (h : HMTyping Γ e σ) (hinst : apply_subst subst σ = τ) : + HMTyping Γ e {vars := [], body := τ} + +-- 13.5 — Exercise: the identity function is polymorphic +def id_type : TypeScheme := {vars := [0], body := MonoType.tvar 0 ⇒ MonoType.tvar 0} + +theorem id_polymorphic : HMTyping [] (HMExpr.lam (HMExpr.var 0)) id_type := + by + sorry + +-- 13.6 — Exercise: let x = λy. y in x x (instantiation of polymorphic identity) +-- This term should be well-typed — the identity is used at two different types +def self_app_id : HMExpr := + HMExpr.lett (HMExpr.lam (HMExpr.var 0)) (HMExpr.app (HMExpr.var 0) (HMExpr.var 0)) + +theorem self_app_id_typed (τ : MonoType) : + HMTyping [] self_app_id {vars := [], body := τ ⇒ τ} := + by + sorry +``` + +--- + +← [Previous: Unit 12](12-type-safety.md) · Next: [Unit 14 — Algorithm W](14-algorithm-w.md) diff --git a/tutorial-02-semantics/14-algorithm-w.md b/tutorial-02-semantics/14-algorithm-w.md new file mode 100644 index 0000000..f981e28 --- /dev/null +++ b/tutorial-02-semantics/14-algorithm-w.md @@ -0,0 +1,156 @@ +# Unit 14 — Algorithm W (Hindley-Milner Type Inference) + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Understand type inference as solving unification constraints +- Implement **Algorithm W**: the classic HM inference algorithm +- Write `unify` (Robinson's unification) +- Write `infer` (the main inference loop) + +## Sources + +- [Vaughan 2008, §3](https://www.jeffvaughan.net/docs/hmproof.pdf) +- [sdemos/type-inference](https://github.com/sdemos/type-inference) +- [Wikipedia: Algorithm W](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#Algorithm_W) + +## Background + +Algorithm W takes an expression and returns a substitution + type: + +``` +infer(Γ, e) = (S, τ) +``` + +- `Γ` maps variables to type schemes +- `e` is the expression to type +- `S` is a type substitution (unifies constraints found during inference) +- `τ` is the inferred monotype + +The algorithm uses a supply of *fresh type variables* to build constraints, +then unifies them. + +## Exercises + +```lean +open MonoType +open HMExpr +open TypeScheme + +-- 14.1 — Substitutions +-- A substitution maps type variables to monotypes +abbrev Subst := Nat → MonoType + +-- Identity substitution +def idSubst : Subst := fun α => MonoType.tvar α + +-- Apply a substitution to a monotype +def applySubst (S : Subst) : MonoType → MonoType + | MonoType.tvar α => S α + | MonoType.fn τ₁ τ₂ => MonoType.fn (applySubst S τ₁) (applySubst S τ₂) + +-- Compose substitutions: (S₁ ∘ S₂)(α) = S₁(S₂(α)) +def compose (S₁ S₂ : Subst) : Subst := + fun α => applySubst S₁ (S₂ α) + +-- 14.2 — Unification (Robinson's algorithm) +-- Returns a substitution that makes τ₁ and τ₂ equal, if possible. +-- Fails if there's a type mismatch (e.g., unifying α → β with α is impossible). +def unify (τ₁ τ₂ : MonoType) : Option Subst := + match τ₁, τ₂ with + | MonoType.tvar α, MonoType.tvar β => + if α == β then some idSubst + else some (fun γ => if γ == α then MonoType.tvar β else MonoType.tvar γ) + | MonoType.tvar α, τ => + if occurs α τ then none -- occurs check: α ∉ ftv(τ) + else some (fun γ => if γ == α then τ else MonoType.tvar γ) + | τ, MonoType.tvar α => + if occurs α τ then none + else some (fun γ => if γ == α then τ else MonoType.tvar γ) + | MonoType.fn τ₁ᵃ τ₁ᵇ, MonoType.fn τ₂ᵃ τ₂ᵇ => + match unify τ₁ᵃ τ₂ᵃ with + | none => none + | some S₁ => + match unify (applySubst S₁ τ₁ᵇ) (applySubst S₁ τ₂ᵇ) with + | none => none + | some S₂ => some (compose S₂ S₁) + +-- Occurs check: does α appear in τ? +def occurs (α : Nat) : MonoType → Bool + | MonoType.tvar β => α == β + | MonoType.fn τ₁ τ₂ => occurs α τ₁ || occurs α τ₂ + +-- 14.3 — Fresh variable supply +-- We use a counter to generate fresh type variables +def freshVar (counter : Nat) : Nat × Nat := (counter, counter + 1) + +-- 14.4 — Generalization: close a type under the environment +-- `generalize(Γ, τ)` produces `∀αs. τ` where αs = ftv(τ) \ ftv(Γ) +def generalize (Γ : HMEnv) (τ : MonoType) : TypeScheme := + { vars := (ftv τ).filter (fun α => α ∉ ftv_env Γ) + , body := τ } + +-- 14.5 — Algorithm W (the core inference algorithm) +-- Returns `(S, τ)` where S is a substitution and τ the inferred type. +-- Uses a state monad for the fresh variable counter (simplified here). +def inferW (Γ : HMEnv) (e : HMExpr) (counter : Nat) : Option (Subst × MonoType × Nat) := + match e with + | HMExpr.var i => + match lookup Γ i with + | none => none + | some σ => some (idSubst, instantiate σ counter, counter + length σ.vars) + + | HMExpr.lam body => + -- Create a fresh type variable for the parameter + let (α, counter') := freshVar counter + let τ_param := MonoType.tvar α + -- Add x : α to the environment + let Γ' := {vars := [], body := τ_param} :: Γ + -- Infer the body type + match inferW Γ' body counter' with + | none => none + | some (S, τ_body, counter'') => + some (S, MonoType.fn (applySubst S τ_param) τ_body, counter'') + + | HMExpr.app f a => + match inferW Γ f counter with + | none => none + | some (S₁, τ_f, counter₁) => + match inferW (applySubstEnv S₁ Γ) a counter₁ with + | none => none + | some (S₂, τ_a, counter₂) => + let β := freshVar counter₂ + let α_counter₃ := β.2 + match unify (applySubst S₂ τ_f) (MonoType.fn τ_a (MonoType.tvar β.1)) with + | none => none + | some S₃ => + let S := compose S₃ (compose S₂ S₁) + some (S, applySubst S₃ (MonoType.tvar β.1), α_counter₃) + + | HMExpr.lett e₁ e₂ => + match inferW Γ e₁ counter with + | none => none + | some (S₁, τ₁, counter₁) => + let σ₁ := generalize (applySubstEnv S₁ Γ) τ₁ + let Γ' := σ₁ :: applySubstEnv S₁ Γ + match inferW Γ' e₂ counter₁ with + | none => none + | some (S₂, τ₂, counter₂) => + some (compose S₂ S₁, τ₂, counter₂) + +-- 14.6 — Exercise: infer the type of λx. x +def infer_id : Option (Subst × MonoType × Nat) := + inferW [] (HMExpr.lam (HMExpr.var 0)) 0 + +-- Should return a substitution mapping α₀ to α₀ and type α₀ → α₀ +-- #eval infer_id + +-- 14.7 — Exercise: infer the type of let x = λy. y in x x +def infer_self_app : Option (Subst × MonoType × Nat) := + inferW [] self_app_id 0 +``` + +--- + +← [Previous: Unit 13](13-hm-declarative.md) · Next: [Unit 15 — Soundness and Completeness](15-soundness-completeness.md) diff --git a/tutorial-02-semantics/15-soundness-completeness.md b/tutorial-02-semantics/15-soundness-completeness.md new file mode 100644 index 0000000..dd4bbac --- /dev/null +++ b/tutorial-02-semantics/15-soundness-completeness.md @@ -0,0 +1,85 @@ +# Unit 15 — Soundness and Completeness of Algorithm W + +**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md) + +## Goals + +- Prove that Algorithm W is **sound**: every inferred type is derivable in the declarative system +- Prove that Algorithm W is **complete**: every derivable type can be inferred (up to generalization) +- Understand the relationship between inference and declarative rules + +## Sources + +- [Vaughan 2008, §4](https://www.jeffvaughan.net/docs/hmproof.pdf) +- [sdemos/type-inference](https://github.com/sdemos/type-inference) `Soundness.lean`, `Completeness.lean` + +## The Theorems + +**Soundness**: If `infer(Γ, e) = (S, τ)`, then `Γ ⊢ e : τ` declaratively. + +**Completeness**: If `Γ ⊢ e : τ` declaratively, then there exists `S, τ'` such that +`infer(Γ, e) = (S, τ')` and `τ` is a substitution instance of `τ'` (i.e., W finds the *most general* type). + +## Exercises + +```lean +open HMTyping +open MonoType + +-- 15.1 — Soundness of Algorithm W +theorem soundness_W (Γ : HMEnv) (e : HMExpr) (S : Subst) (τ : MonoType) (counter : Nat) + (h : inferW Γ e counter = some (S, τ, _)) : + HMTyping (applySubstEnv S Γ) e {vars := [], body := τ} := + by + induction e generalizing Γ S τ counter with + | var i => + -- Case: variable lookup. Need to relate instantiated type scheme to declarative Var rule. + sorry + | lam body ih => + -- Case: lambda. Use the IH and the Abs rule. + sorry + | app f a ihf iha => + -- Case: application. The unification produces a substitution that makes types match. + -- Need substitution composition lemmas. + sorry + | lett e₁ e₂ ih₁ ih₂ => + -- Case: let. Generalization produces a type scheme; the body gets the polymorphic type. + sorry + +-- 15.2 — Completeness of Algorithm W +theorem completeness_W (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) + (h : HMTyping Γ e σ) : + ∃ (S : Subst) (τ : MonoType) (counter : Nat), + inferW Γ e 0 = some (S, τ, counter) ∧ + ∃ (S' : Subst), applySubst S' τ = σ.body := + by + induction h with + | var Γ i hlook => + sorry + | abs Γ e τ₁ τ₂ hbody ih => + sorry + | app Γ e₁ e₂ σ τ₁ τ₂ σ' h₁ h₂ h_eq ih₁ ih₂ => + sorry + | lett Γ e₁ e₂ σ σ' τ h₁ h₂ ih₁ ih₂ => + sorry + | gen Γ e σ αs hbody hfresh ih => + sorry + | inst Γ e σ τ subst hbody hinst ih => + sorry + +-- 15.3 — Principal Types Corollary +-- Every well-typed expression has a *principal type* — one that all others are instances of. +theorem principal_types (Γ : HMEnv) (e : HMExpr) (σ : TypeScheme) + (h : HMTyping Γ e σ) : + ∃ (τ_principal : MonoType), + -- W finds the principal type + (∃ S c, inferW Γ e 0 = some (S, τ_principal, c)) ∧ + -- σ's body is an instance of the principal type + (∃ S', applySubst S' τ_principal = σ.body) := + by + exact completeness_W Γ e σ h +``` + +--- + +← [Previous: Unit 14](14-algorithm-w.md) · [← Tutorial 2 Index](.) · [← Back to README](../README.md)