Unit 8 — Representing Syntax

Tutorial 2: PL Semantics in Lean · ← Back to README

Goals

  • Encode lambda calculus terms as an inductive type
  • Understand three binding representations:
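    1. Named (variables are strings: simple, but α-equivalent terms such as λx. x and λy. y are distinct values, so α-equivalence isn't definitional)
    2. de Bruijn indices (variables are numbers: α-equivalent terms are syntactically identical, but shifting indices is painful)
    3. Locally nameless (free variables are named, bound variables are indexed: a compromise between the two)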

We'll use de Bruijn indices (the workhorse representation) for the rest of this tutorial, and show locally nameless only for comparison.
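
To make the first two trade-offs concrete, here is a minimal sketch. The types `Named` and `DB` and the namespace `AlphaSketch` are hypothetical stand-ins for the types built in the exercises below, repeated here only so the snippet is self-contained.

```lean
namespace AlphaSketch

-- Named terms: binders and variables carry strings.
inductive Named where
  | var (x : String)
  | lam (x : String) (body : Named)
  | app (f arg : Named)
deriving DecidableEq

-- de Bruijn terms: a variable counts the binders between it and its own binder.
inductive DB where
  | var (idx : Nat)
  | lam (body : DB)
  | app (f arg : DB)
deriving DecidableEq

-- λx. x and λy. y are α-equivalent, yet they are different Named values.
#eval decide (Named.lam "x" (.var "x") = Named.lam "y" (.var "y"))  -- false

-- Both are written λ. 0 in de Bruijn form, so there is only one term to compare.
example : DB.lam (.var 0) = DB.lam (.var 0) := rfl

end AlphaSketch
```

With named terms, α-equivalence would have to be defined and proved as a separate relation. With de Bruijn indices it collapses into ordinary equality.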

Sources

Exercises

-- 8.1 — Named representation
inductive NamedTerm where
  | var (x : String)
  | lam (x : String) (body : NamedTerm)
  | app (f arg : NamedTerm)
deriving Repr

-- The identity function (I combinator): λx. x
def idNamed : NamedTerm := NamedTerm.lam "x" (NamedTerm.var "x")

-- Encode λx. λy. x  (K combinator)
def kNamed : NamedTerm :=
  sorry

-- Encode λf. λx. f (f x)  (Church numeral 2)
def twoNamed : NamedTerm :=
  sorry

-- 8.2 — de Bruijn representation
-- Variables are numbers: 0 = nearest binder, 1 = next, etc.
inductive DBTerm where
  | var (idx : Nat)    -- variable reference by binding distance
  | lam (body : DBTerm) -- λ. body  (no name needed!)
  | app (f arg : DBTerm)
deriving Repr

-- λ. λ. 1  (= λx. λy. x  in named form)
def kDB : DBTerm := DBTerm.lam (DBTerm.lam (DBTerm.var 1))

-- λ. 0  (= λx. x  in named form)
def idDB : DBTerm := DBTerm.lam (DBTerm.var 0)

-- Encode λf. λx. f (f x)  (Church 2)
def twoDB : DBTerm :=
  sorry

-- 8.3 — Locally nameless
-- Free variables are strings, bound variables are de Bruijn indices
-- (You don't need to implement this fully — just understand the idea)
inductive LNTerm where
  | fvar (x : String)   -- free variable
  | bvar (idx : Nat)    -- bound variable (de Bruijn)
  | lam (body : LNTerm) -- binder
  | app (f arg : LNTerm)
deriving Repr
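
As an illustration of the locally nameless idea, the sketch below encodes an open term and shows what happens when we go under a binder. It repeats the `LNTerm` definition inside a hypothetical namespace `LNSketch` so it stands alone. The helper `openAt` is an assumption made for this example, not something the tutorial requires you to write.

```lean
namespace LNSketch

-- Same shape as LNTerm above, repeated so this block is self-contained.
inductive LNTerm where
  | fvar (x : String)
  | bvar (idx : Nat)
  | lam (body : LNTerm)
  | app (f arg : LNTerm)
deriving Repr

-- λx. x y, where y is free: the bound x becomes index 0, y keeps its name.
def applyToY : LNTerm := .lam (.app (.bvar 0) (.fvar "y"))

-- Hypothetical helper: replace bound index k with the free variable `name`.
-- The index is bumped under each binder so it keeps pointing at the same lambda.
def openAt (k : Nat) (name : String) : LNTerm → LNTerm
  | .fvar x    => .fvar x
  | .bvar i    => if i = k then .fvar name else .bvar i
  | .lam body  => .lam (openAt (k + 1) name body)
  | .app f arg => .app (openAt k name f) (openAt k name arg)

-- Opening the body of applyToY with the fresh name "z" yields the term
-- app (fvar "z") (fvar "y"): no bound indices remain at the top level.
#eval openAt 0 "z" (.app (.bvar 0) (.fvar "y"))

end LNSketch
```

The point of the compromise is that free variables never need shifting, while bound variables still get α-equivalence for free.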

Key insight for PL semantics

When we encode typing contexts Γ = x₁:τ₁, x₂:τ₂, ..., a de Bruijn index is simply a position in the context, counted from the right: the last (most recently added) binding is index 0, the second-last is index 1, and so on. Variable lookup therefore comes for free, and the typing rules stay elegant in Lean because they never need to avoid name clashes.
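
The "index into the context" reading can be sketched as plain list lookup. Everything here is hypothetical: `Ty` is a placeholder type language, and `Ctx`, `lookup`, and `exampleCtx` are names chosen for this example. Later units may set things up differently. Note that the list stores the newest binding first, so the head of the list is index 0.

```lean
namespace CtxSketch

-- Placeholder type syntax: one base type and function types.
inductive Ty where
  | base
  | arrow (dom cod : Ty)
deriving Repr

-- A context is a list of types, most recent binding at the head.
abbrev Ctx := List Ty

-- Looking up de Bruijn index i is ordinary list indexing.
def lookup : Ctx → Nat → Option Ty
  | [],     _     => none
  | τ :: _, 0     => some τ
  | _ :: Γ, i + 1 => lookup Γ i

-- Newest binding has type base; the one before it has type base → base.
def exampleCtx : Ctx := [Ty.base, Ty.arrow Ty.base Ty.base]

#eval lookup exampleCtx 0  -- index 0: the newest binding (base)
#eval lookup exampleCtx 1  -- index 1: the binding before it (arrow base base)
#eval lookup exampleCtx 2  -- out of range: none

end CtxSketch
```

Because no names are stored, extending the context under a binder is just a cons, and there is nothing to rename.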


Tutorial 1 — Unit 7 · Next: Unit 9 — Substitution