# Unit 8 — Representing Syntax

**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)

## Goals

- Encode lambda calculus terms as an inductive type
- Understand three binding representations:
  1. **Named** (strings: simple, but α-equivalence is not definitional)
  2. **de Bruijn indices** (numbers: α-equivalence is free, but shifting is painful)
  3. **Locally nameless** (free variables are named, bound variables are indexed: a compromise)

We'll use de Bruijn indices (the "heavy lifter") for the rest of this tutorial, with locally nameless for comparison.

## Sources

- syndikos/lean4-stlc `Syntax.lean`: https://github.com/syndikos/lean4-stlc
- Chris Henson 2025: https://chrishenson.net/posts/2025-05-10-formalized_lambda_calculus.html
- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch
- Software Foundations, Vol. 2: https://softwarefoundations.cis.upenn.edu/

## Exercises

```lean
-- 8.1 — Named representation
inductive NamedTerm where
  | var (x : String)
  | lam (x : String) (body : NamedTerm)
  | app (f arg : NamedTerm)
  deriving Repr

-- The identity function: λx. x
def idNamed : NamedTerm :=
  NamedTerm.lam "x" (NamedTerm.var "x")

-- Encode λx. λy. x (K combinator)
def kNamed : NamedTerm := sorry

-- Encode λf. λx. f (f x) (Church numeral 2)
def twoNamed : NamedTerm := sorry

-- 8.2 — de Bruijn representation
-- Variables are numbers: 0 = nearest enclosing binder, 1 = the next one out, etc.
inductive DBTerm where
  | var (idx : Nat)     -- variable reference by binding distance
  | lam (body : DBTerm) -- λ. body (no name needed!)
  | app (f arg : DBTerm)
  deriving Repr

-- λ. λ. 1 (= λx. λy. x in named form)
def kDB : DBTerm := DBTerm.lam (DBTerm.lam (DBTerm.var 1))

-- λ. 0 (= λx. x in named form)
def idDB : DBTerm := DBTerm.lam (DBTerm.var 0)

-- Encode λf. λx. f (f x) (Church numeral 2)
def twoDB : DBTerm := sorry

-- 8.3 — Locally nameless
-- Free variables are strings, bound variables are de Bruijn indices
-- (You don't need to implement this fully; just understand the idea)
inductive LNTerm where
  | fvar (x : String)   -- free variable
  | bvar (idx : Nat)    -- bound variable (de Bruijn)
  | lam (body : LNTerm) -- binder
  | app (f arg : LNTerm)
  deriving Repr
```

### Key insight for PL semantics

When we encode **typing contexts** `Γ = x₁:τ₁, x₂:τ₂, ...`, de Bruijn indices give us "index into the context" for free: a variable's index is its position in the context, counted from the most recent binding. The last binding is index 0, the second-last is index 1, etc. This makes the typing rules elegant in Lean, with no name-clash avoidance needed.

---

← [Tutorial 1 — Unit 7](../tutorial-01-basics/07-dependent-types.md) · Next: [Unit 9 — Substitution](09-substitution.md)
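
The "index into the context" idea from the key insight can be sketched in a few lines of Lean. This is only an illustration under stated assumptions: `Ty`, `Ctx`, `lookupTy`, and `exampleCtx` are hypothetical names that do not appear elsewhere in this unit, and the context is assumed to be a list with the most recent binding at the head.

```lean
-- Hypothetical simple types; this unit has not defined types yet.
inductive Ty where
  | base
  | arrow (dom cod : Ty)
  deriving Repr

-- Assumption: a context is a list of types, most recent binding first.
abbrev Ctx := List Ty

-- Resolve de Bruijn index `i` by walking `i` steps into the context.
def lookupTy : Ctx → Nat → Option Ty
  | [], _ => none
  | τ :: _, 0 => some τ
  | _ :: Γ, i + 1 => lookupTy Γ i

-- Γ = x₁ : base → base, x₂ : base   (x₂ is the most recent binding)
def exampleCtx : Ctx := [Ty.base, Ty.arrow Ty.base Ty.base]

example : lookupTy exampleCtx 0 = some Ty.base := rfl
example : lookupTy exampleCtx 1 = some (Ty.arrow Ty.base Ty.base) := rfl
example : lookupTy exampleCtx 2 = none := rfl
```

Because the context stores no names, extending it under a binder is just a cons onto the list, and an out-of-range index is reported as `none` rather than as a name-resolution error.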