Unit 8 — Representing Syntax
Tutorial 2: PL Semantics in Lean
Goals
- Encode lambda calculus terms as an inductive type
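- Understand three binding representations:
  - Named (strings: simple, but α-equivalent terms such as λx. x and λy. y are distinct values, so α-equivalence has to be defined and proved separately)
  - de Bruijn indices (numbers: α-equivalent terms are syntactically identical, but substitution requires shifting indices, which is painful)
  - Locally nameless (free variables named, bound variables indexed: a compromise between the two)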
We'll use de Bruijn indices (the "heavy lifter") for the rest of this tutorial, with locally nameless for comparison.
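To make the α-equivalence point concrete, here is a minimal sketch of a named-to-de Bruijn conversion. The namespace `Sketch` and the helpers `indexOf`, `toDB`, `idX`, and `idY` are illustrative names, not taken from the tutorial's sources, and the two types are repeated (with `BEq` added) only so the block stands alone.

```lean
namespace Sketch

inductive NamedTerm where
  | var (x : String)
  | lam (x : String) (body : NamedTerm)
  | app (f arg : NamedTerm)
  deriving Repr, BEq

inductive DBTerm where
  | var (idx : Nat)
  | lam (body : DBTerm)
  | app (f arg : DBTerm)
  deriving Repr, BEq

/-- Position of `x` in `scope`, counting from the head (the innermost binder). -/
def indexOf (x : String) : List String → Option Nat
  | [] => none
  | y :: ys => if x == y then some 0 else (indexOf x ys).map (· + 1)

/-- Convert a named term under the binders in `scope`;
    returns `none` if a variable is not bound by any enclosing `lam`. -/
def toDB (scope : List String) : NamedTerm → Option DBTerm
  | .var x => (indexOf x scope).map DBTerm.var
  | .lam x body => (toDB (x :: scope) body).map DBTerm.lam
  | .app f arg => do
      let f' ← toDB scope f
      let arg' ← toDB scope arg
      pure (DBTerm.app f' arg')

-- λx. x and λy. y are α-equivalent but differ as named terms.
def idX : NamedTerm := .lam "x" (.var "x")
def idY : NamedTerm := .lam "y" (.var "y")

#eval idX == idY                  -- false
#eval toDB [] idX == toDB [] idY  -- true

end Sketch
```

Both named terms convert to `lam (var 0)`, so after conversion structural equality coincides with α-equivalence. This sketch simply rejects free variables; handling them is one motivation for the locally nameless representation.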
Sources
- syndikos/lean4-stlc (Syntax.lean): https://github.com/syndikos/lean4-stlc
- Chris Henson 2025: https://chrishenson.net/posts/2025-05-10-formalized_lambda_calculus.html
- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch
- Software Foundations Vol.2: https://softwarefoundations.cis.upenn.edu/
Exercises
-- 8.1 — Named representation
inductive NamedTerm where
  | var (x : String)
  | lam (x : String) (body : NamedTerm)
  | app (f arg : NamedTerm)
  deriving Repr
-- The identity function: λx. x (I combinator)
def idNamed : NamedTerm := NamedTerm.lam "x" (NamedTerm.var "x")
-- Encode λx. λy. x (K combinator)
def kNamed : NamedTerm :=
  sorry
-- Encode λf. λx. f (f x) (Church numeral 2)
def twoNamed : NamedTerm :=
  sorry
-- 8.2 — de Bruijn representation
-- Variables are numbers: 0 = nearest enclosing binder, 1 = the next one out, etc.
inductive DBTerm where
  | var (idx : Nat)      -- variable reference by binding distance
  | lam (body : DBTerm)  -- λ. body (no name needed!)
  | app (f arg : DBTerm)
  deriving Repr
-- λ. λ. 1 (= λx. λy. x in named form)
def kDB : DBTerm := DBTerm.lam (DBTerm.lam (DBTerm.var 1))
-- λ. 0 (= λx. x in named form)
def idDB : DBTerm := DBTerm.lam (DBTerm.var 0)
-- Encode λf. λx. f (f x) (Church numeral 2)
def twoDB : DBTerm :=
  sorry
-- 8.3 — Locally nameless
-- Free variables are strings, bound variables are de Bruijn indices
-- (You don't need to implement this fully — just understand the idea)
inductive LNTerm where
  | fvar (x : String)    -- free variable
  | bvar (idx : Nat)     -- bound variable (de Bruijn)
  | lam (body : LNTerm)  -- binder
  | app (f arg : LNTerm)
  deriving Repr
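To get a feel for the locally nameless idea, the following sketch encodes one term with a free variable and shows the usual "opening" operation, which replaces a bound index with a fresh name when going under a binder. The namespace `LNSketch` and the names `applyToY`, `openRec`, `body`, and `opened` are hypothetical, chosen for illustration; the type is repeated (with `BEq` added) so the block stands alone.

```lean
namespace LNSketch

inductive LNTerm where
  | fvar (x : String)
  | bvar (idx : Nat)
  | lam (body : LNTerm)
  | app (f arg : LNTerm)
  deriving Repr, BEq

-- The body `x y` of λx. x y, with y free:
-- the bound x becomes index 0, the free y keeps its name.
def body : LNTerm := .app (.bvar 0) (.fvar "y")

-- λx. x y itself.
def applyToY : LNTerm := .lam body

/-- Replace bound index `k` with the free variable `x`.
    The index is incremented under each binder so it keeps
    pointing at the same binder. -/
def openRec (k : Nat) (x : String) : LNTerm → LNTerm
  | .fvar y => .fvar y
  | .bvar i => if i == k then .fvar x else .bvar i
  | .lam b => .lam (openRec (k + 1) x b)
  | .app f arg => .app (openRec k x f) (openRec k x arg)

-- Opening the body with the name "z" yields `z y`.
def opened : LNTerm := .app (.fvar "z") (.fvar "y")

#eval openRec 0 "z" body == opened  -- true

end LNSketch
```

The sketch assumes the chosen name is fresh; a full development would also track which names are already in use.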
Key insight for PL semantics
When we encode typing contexts Γ = x₁:τ₁, x₂:τ₂, ..., de Bruijn indices
give us "index into the context" for free: the names drop out and Γ is just
a list of types. The most recent binding has index 0, the one before it has
index 1, and so on. This keeps the typing rules simple in Lean, since no
name-clash avoidance is needed.
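A minimal sketch of that lookup, assuming a context is stored as a list of types with the most recent binding at the head. The names `CtxSketch`, `Ty`, `Ctx`, `lookup`, and `ctx` are illustrative and not taken from the tutorial's sources.

```lean
namespace CtxSketch

/-- Simple types, introduced only for this illustration. -/
inductive Ty where
  | base
  | arrow (dom cod : Ty)
  deriving Repr, BEq

/-- A typing context: a list of types, most recent binding first. -/
abbrev Ctx := List Ty

/-- The type of de Bruijn index `n` is the `n`-th entry of the context. -/
def lookup : Ctx → Nat → Option Ty
  | [], _ => none
  | τ :: _, 0 => some τ
  | _ :: rest, n + 1 => lookup rest n

-- Γ = x₁ : base, x₂ : base → base, stored with x₂ (the last binding) first.
def ctx : Ctx := [Ty.arrow Ty.base Ty.base, Ty.base]

#eval lookup ctx 0 == some (Ty.arrow Ty.base Ty.base)  -- true
#eval lookup ctx 1 == some Ty.base                     -- true
#eval (lookup ctx 2).isNone                            -- true

end CtxSketch
```

Going under a binder is then just consing the bound variable's type onto the context, with no renaming step.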