# Unit 8 — Representing Syntax

**Tutorial 2: PL Semantics in Lean** · [← Back to README](../README.md)

## Goals

- Encode lambda calculus terms as an inductive type
- Understand three binding representations:
  1. **Named** (strings: simple, but α-equivalent terms are not definitionally equal)
  2. **de Bruijn indices** (numbers: α-equivalence is free, but shifting is painful)
  3. **Locally nameless** (free variables named, bound variables indexed: a compromise)

We'll use de Bruijn indices (the "heavy lifter") for the rest of this tutorial,
with locally nameless for comparison.

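To make the first two trade-offs concrete, here is a minimal sketch. The names `Named`, `DB`, and the namespace `AlphaSketch` are placeholders chosen for this illustration (the exercises define their own versions). It only shows that `λx. x` and `λy. y` are distinct values in a named encoding, while under de Bruijn indices both are written as the same term.

```lean
namespace AlphaSketch

-- Hypothetical named encoding, with structural Boolean equality derived.
inductive Named where
  | var (x : String)
  | lam (x : String) (body : Named)
  | app (f arg : Named)
  deriving Repr, BEq

-- Hypothetical de Bruijn encoding.
inductive DB where
  | var (idx : Nat)
  | lam (body : DB)
  | app (f arg : DB)
  deriving Repr, BEq

-- λx. x and λy. y are α-equivalent, but they differ in the binder name.
def idX : Named := .lam "x" (.var "x")
def idY : Named := .lam "y" (.var "y")

#eval idX == idY   -- false: structural equality sees the different names

-- Under de Bruijn indices both terms are written λ. 0, so they are one value.
def idXDB : DB := .lam (.var 0)
def idYDB : DB := .lam (.var 0)

example : idXDB = idYDB := rfl

end AlphaSketch
```

With names, relating `idX` and `idY` needs a separate α-equivalence relation (or a renaming step); with indices, ordinary equality already does the job.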
## Sources

- syndikos/lean4-stlc `Syntax.lean`: https://github.com/syndikos/lean4-stlc
- Chris Henson 2025: https://chrishenson.net/posts/2025-05-10-formalized_lambda_calculus.html
- chenson2018/LeanScratch: https://github.com/chenson2018/LeanScratch
- Software Foundations Vol.2: https://softwarefoundations.cis.upenn.edu/

## Exercises

```lean
-- 8.1: Named representation
inductive NamedTerm where
  | var (x : String)
  | lam (x : String) (body : NamedTerm)
  | app (f arg : NamedTerm)
  deriving Repr

-- The identity function: λx. x
def idNamed : NamedTerm := NamedTerm.lam "x" (NamedTerm.var "x")

-- Encode λx. λy. x (K combinator)
def kNamed : NamedTerm :=
  sorry

-- Encode λf. λx. f (f x) (Church numeral 2)
def twoNamed : NamedTerm :=
  sorry

-- 8.2: de Bruijn representation
-- Variables are numbers: 0 = nearest binder, 1 = next, etc.
inductive DBTerm where
  | var (idx : Nat)       -- variable reference by binding distance
  | lam (body : DBTerm)   -- λ. body (no name needed!)
  | app (f arg : DBTerm)
  deriving Repr

-- λ. λ. 1 (= λx. λy. x in named form)
def kDB : DBTerm := DBTerm.lam (DBTerm.lam (DBTerm.var 1))

-- λ. 0 (= λx. x in named form)
def idDB : DBTerm := DBTerm.lam (DBTerm.var 0)

-- Encode λf. λx. f (f x) (Church numeral 2)
def twoDB : DBTerm :=
  sorry

-- 8.3: Locally nameless
-- Free variables are strings, bound variables are de Bruijn indices
-- (You don't need to implement this fully; just understand the idea)
inductive LNTerm where
  | fvar (x : String)     -- free variable
  | bvar (idx : Nat)      -- bound variable (de Bruijn)
  | lam (body : LNTerm)   -- binder
  | app (f arg : LNTerm)
  deriving Repr
```

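To get a feel for the locally nameless idea in 8.3, the following sketch shows the usual "opening" step: when going under a binder, the bound index that refers to it is replaced by a free name. The names `LN`, `LN.openRec`, and the namespace `LNSketch` are assumptions made for this illustration and are not part of the exercise file. Picking a genuinely fresh name is left out.

```lean
namespace LNSketch

-- Same shape as `LNTerm` in 8.3, under a hypothetical name to avoid clashes.
inductive LN where
  | fvar (x : String)
  | bvar (idx : Nat)
  | lam (body : LN)
  | app (f arg : LN)
  deriving Repr

-- Replace the bound index `k` by the free variable `x`.
-- `k` grows by one under each binder, so it keeps pointing at the same binder.
def LN.openRec (k : Nat) (x : String) : LN → LN
  | .fvar y    => .fvar y
  | .bvar i    => if i = k then .fvar x else .bvar i
  | .lam body  => .lam (LN.openRec (k + 1) x body)
  | .app f arg => .app (LN.openRec k x f) (LN.openRec k x arg)

-- The body of λ. 0 (λ. 1), i.e. λa. a (λb. a), opened with the name "a".
example :
    LN.openRec 0 "a" (.app (.bvar 0) (.lam (.bvar 1)))
      = LN.app (.fvar "a") (.lam (.fvar "a")) := rfl

end LNSketch
```

After opening, every variable that pointed at the outer binder is an ordinary named free variable, while the inner binder still uses indices. That split is the compromise the representation is named for.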
### Key insight for PL semantics

When we encode **typing contexts** `Γ = x₁:τ₁, x₂:τ₂, ...`, a de Bruijn index
doubles as a position in the context, so variable lookup comes for free. The
last binding is index 0, the second-last is index 1, etc. This keeps the typing
rules elegant in Lean: no name-clash avoidance is needed.

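A minimal sketch of this lookup, under stated assumptions: the unit does not fix a type language, so `Ty`, `Tm`, `Ctx`, `lookup`, and `HasType` are hypothetical names, and the unannotated `lam` rule is just one possible choice. Contexts are lists with the most recent binding at the head, so the variable rule is a plain list lookup.

```lean
namespace CtxSketch

-- Hypothetical simple types: one base type and function types.
inductive Ty where
  | base
  | arrow (dom cod : Ty)
  deriving Repr

-- de Bruijn terms, same shape as `DBTerm` in 8.2.
inductive Tm where
  | var (idx : Nat)
  | lam (body : Tm)
  | app (f arg : Tm)
  deriving Repr

-- A context is a list of types; the head is the most recent binding (index 0).
abbrev Ctx := List Ty

-- Index `i` selects the `i`-th most recent binding, if there is one.
def lookup : Ctx → Nat → Option Ty
  | [],     _     => none
  | τ :: _, 0     => some τ
  | _ :: Γ, i + 1 => lookup Γ i

-- Typing relation: no names, so no freshness or renaming side conditions.
inductive HasType : Ctx → Tm → Ty → Prop where
  | var {Γ : Ctx} {i : Nat} {τ : Ty} :
      lookup Γ i = some τ → HasType Γ (.var i) τ
  | lam {Γ : Ctx} {t : Tm} {σ τ : Ty} :
      HasType (σ :: Γ) t τ → HasType Γ (.lam t) (.arrow σ τ)
  | app {Γ : Ctx} {f a : Tm} {σ τ : Ty} :
      HasType Γ f (.arrow σ τ) → HasType Γ a σ → HasType Γ (.app f a) τ

-- λ. 0 has type base → base in the empty context.
example : HasType [] (.lam (.var 0)) (.arrow .base .base) :=
  .lam (.var rfl)

end CtxSketch
```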
---
← [Tutorial 1 — Unit 7](../tutorial-01-basics/07-dependent-types.md) · Next: [Unit 9 — Substitution](09-substitution.md)