# No Inverses for Injective but Non-Surjective Functions?

“I cannot prove that if a function is injective, it has an inverse,” Hideki Hashimoto posed this challenge to me a while ago. It turned out that this was not the property he really wanted, but it got me thinking: is it possible at all?

### Preliminaries

Let us start with some basic definitions. A relation from A to B is denoted by the wavy arrow:

```
_↝_ : Set → Set → Set1
A ↝ B = A → B → Set
```

Given a relation `R : A ↝ B` we can always take its converse `R ˘ : B ↝ A`, and a function can be lifted to a relation by `fun`:

```
_˘ : ∀ {A B} → (A ↝ B) → (B ↝ A)
(R ˘) b a = R a b

fun : ∀ {A B} → (A → B) → (A ↝ B)
fun f a b = f a ≡ b
```

A relation `R : A ↝ B` is simple if it does not map one input to multiple outputs. It is entire if everything in `A` is mapped to something in `B` — a more familiar word may be “total”.

```
simple : ∀ {A B} → (A ↝ B) → Set
simple R = ∀ {a b₁ b₂} → R a b₁ → R a b₂ → b₁ ≡ b₂

entire : ∀ {A B} → (A ↝ B) → Set
entire R = ∀ a → ∃ (λ b → R a b)
```

A function is a relation that is simple and entire. Indeed, one can show that `fun f` is simple and entire for every `f`. Injectivity and surjectivity are similar notions, defined on the converse of `R`:

```
injective : ∀ {A B} → (A ↝ B) → Set
injective R = ∀ {a₁ a₂ b} → R a₁ b → R a₂ b → a₁ ≡ a₂

surjective : ∀ {A B} → (A ↝ B) → Set
surjective R = ∀ b → ∃ (λ a → R a b)
```
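
To back up the earlier claim that `fun f` is simple and entire for every `f`, here is a sketch; the names `fun-simple` and `fun-entire` are mine, and `refl`, `sym`, and `trans` are the usual operations on `_≡_` from the standard library:

```
fun-simple : ∀ {A B} → (f : A → B) → simple (fun f)
fun-simple f fa≡b₁ fa≡b₂ = trans (sym fa≡b₁) fa≡b₂

fun-entire : ∀ {A B} → (f : A → B) → entire (fun f)
fun-entire f a = (f a , refl)
```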

The (constructive variant of the) axiom of choice states that an entire relation `A ↝ B` can be refined to a function `A → B`:

```
ac : ∀ {A B} → (R : A ↝ B) →
     (∀ a → ∃ (λ b → R a b)) → ∃ {A → B} (λ f → ∀ a → R a (f a))
ac R R-entire = ((λ a → proj₁ (R-entire a)) , λ a → proj₂ (R-entire a))
```

See the axiom of choice homepage, or the Stanford Encyclopedia of Philosophy for more information on this axiom.

### Inverting Injective and Surjective Functions

Now, let us restate Hashimoto san’s challenge:

Let `fun f : A → B` be injective. Prove that `f` has a left inverse. That is, exhibit some `f⁻¹` such that `f⁻¹ (f a) = a` for all `a`.

It turned out that he forgot a condition: `f` is also surjective. If `f` is also (provably) surjective, one can pick some `g ⊆ f˘` using the axiom of choice (since `f` is surjective if and only if `f˘` is total) and further prove that `g ∘ f = id` using injectivity:

```
inv-sur : ∀ {A B} → (f : A → B) →
          injective (fun f) → surjective (fun f) →
          ∃ {B → A} (λ f⁻¹ → (∀ a → f⁻¹ (f a) ≡ a))
inv-sur f f-inj f-sur with ac ((fun f) ˘) f-sur
... | (g , fgb≡b) = (g , λ a → f-inj {g (f a)} {a} {f a} (fgb≡b (f a)) refl)
```

Like the proof of the constructive axiom of choice, the proof above does not really do much. The proof of surjectivity of `f` has already provided, for every `b : B`, an `a : A` such that `f a ≡ b`. So we simply let `f⁻¹` return that `a`.

Can we lift the restriction that `f` must be surjective? That is, can this be proved?

```
inv : ∀ {A B} → (f : A → B) → injective (fun f) →
      ∃ {B → A} (λ f⁻¹ → (∀ a → f⁻¹ (f a) ≡ a))
```

To make the scenario clear: we have a (total) function `f : A → B` that is injective but not necessarily surjective. The set `B` could be “larger” than `A` in the sense that there could be some elements `b : B` for which no `f a` equals `b` — that is, `B` may not be “fully covered.” Can we construct `f⁻¹ : B → A` such that `f⁻¹ (f a) ≡ a` for all `a : A`?

At first glance it did not look like something terribly difficult to do. Given `b`, if it equals some `f a`, let `f⁻¹` simply return that `a`. Otherwise `f⁻¹` could just map `b` to any element in `A`, since this `b` is not used in any invocation of `f⁻¹ (f a)` anyway. It should be possible as long as `A` is not empty, right?

I tried to construct this `f⁻¹` but could not help noticing something funny going on. It turns out that had this function existed, we could, again, prove the law of excluded middle. That is, for any predicate `P : B → Set` and any `b : B`, there would be a decision procedure telling us whether `P b` is true or not.

Provided that we assume proof irrelevance, that is.

### Excluded Middle, Again

Here we go. Let `B` be some type having decidable equality. That is, there exists some `eqB`:

```
eqB : (b₁ b₂ : B) → (b₁ ≡ b₂) ⊎ (¬ (b₁ ≡ b₂))
```

where `_⊎_` is disjoint sum.

Now take some predicate `P : B → Set`. Let `A` be defined by:

```
A : (B → Set) → Set
A P = Σ B (λ b → P b)
```

That is, `A P` is the subset of `B` for which `P` holds. Each element of `A P` is a pair `(b, Pb)` where `Pb` is a proof of `P b`.

Finally, take

```
f : ∀ {P} → A P → B
f = proj₁
```

Thus `f (b, Pb) = b`.

The function `f` is injective if we assume proof irrelevance. That is, if `f (b, Pb) = b` and `f (b', Pb') = b`, we must have `b = b'` and (due to proof irrelevance) `Pb = Pb'`, and therefore `(b, Pb) = (b', Pb')`. Indeed, if we postulate proof irrelevance:

```
postulate irr : (P : B → Set) → ∀ {b} → (p₁ : P b) → (p₂ : P b) → p₁ ≡ p₂
```

We can construct a proof that `f` is injective:

```
f-inj : ∀ {P} → injective (fun (f {P}))
f-inj {P} {(.b , Pb₁)} {(.b , Pb₂)} {b} refl refl =
  cong (λ p → (b , p)) (irr P Pb₁ Pb₂)
```

Assume that we have proved `inv`. We can now apply `inv` and obtain some `f⁻¹`, the left inverse of `f`.

However, with this particular choice of `A`, `f`, and `f⁻¹`, we can construct a deciding procedure for `P`. That is, for any `P` and `b`, we can determine whether `P b` holds:

```
em : {P : B → Set} → ∀ b → P b ⊎ ¬ (P b)
```

This is how `em` works. Given some `b`, let’s apply `f⁻¹` to `b`. The result is a pair `(b', Pb')`. Let’s compare `b` and `b'` using `eqB`:

```
em {P} b with inv f (f-inj {P})
...      | (f⁻¹ , f⁻¹fa≡a) with inspect (f⁻¹ b)
...                        | (b' , Pb') with-≡ _ with eqB b b'
```

If `b ≡ b'`, `Pb'` is a proof of `P b'` and also a proof of `P b`. Let us just return it (after some type casting):

```
em {P} b | (f⁻¹ , f⁻¹fa≡a) | (b' , Pb') with-≡ _ | inj₁ b≡b' =
  inj₁ (subst P (sym b≡b') Pb')
```

Consider the case that `b` does not equal `b'`. We want to show that `P b` is not true; that is, a proof of `P b` leads to a contradiction. Assume we have a proof `Pb` of `P b`. Since `f (b , Pb) ≡ b`, we have `f⁻¹ b ≡ (b , Pb)`:

```
em {P} b | (f⁻¹ , f⁻¹fa≡a) | (b' , Pb') with-≡ b'Pb'≡f⁻¹b | inj₂ ¬b≡b' =
  inj₂ (λ Pb →
    let f⁻¹b≡bPb : f⁻¹ b ≡ (b , Pb)
        f⁻¹b≡bPb = f⁻¹fa≡a (b , Pb)
```

The assumption says that `f⁻¹ b = (b' , Pb')`. By transitivity we have `(b , Pb) ≡ (b' , Pb')`.

```
        bPb≡b'Pb' : (b , Pb) ≡ (b' , Pb')
        bPb≡b'Pb' = sym (trans b'Pb'≡f⁻¹b f⁻¹b≡bPb)
```

But if we take the first component of both sides, we get `b ≡ b'`. That contradicts our assumption that `b` does not equal `b'`:

```
        b≡b' : b ≡ b'
        b≡b' = cong proj₁ bPb≡b'Pb'
    in ¬b≡b' b≡b')
```

which completes the proof.

In retrospect, `f⁻¹` cannot exist because for it to work, it has to magically know whether `b` is in the range of `f`, that is, whether `P b` is true or not.

### Nakano’s Challenge

When I talked about this to Keisuke Nakano he posed me another related challenge. Set-theoretically, we understand that if there exists an injective function `f : A → B` and another injective function `g : B → A`, the sets `A` and `B` are of the same cardinality, and by the Cantor–Schröder–Bernstein theorem there ought to be a bijection `A → B`. Can we construct this bijection? That is, can we prove this theorem?

```
nakano : {A B : Set} →
         (f : A → B) → injective (fun f) →
         (g : B → A) → injective (fun g) →
         ∃ {A → B} (λ h → injective (fun h) × surjective (fun h))
```

I do not yet have a solution. Anyone wanna give it a try?

# Proof Irrelevance, Extensional Equality, and the Excluded Middle

It was perhaps in our first lesson in constructive logic that we learnt about the absence of the law of excluded middle, which under a constructive interpretation would imply a decision procedure for every proposition. Therefore I was puzzled by the fact, stated in a number of places including the Stanford Encyclopedia of Philosophy (SEP), that the axiom of choice, proof irrelevance, and extensional equality (definitions to be given later) together entail the law of excluded middle. Since a constructive variant of the axiom of choice is provable in intuitionistic logic, and both proof irrelevance and extensional equality are properties we would like to have in a type system, it is worrying that they lead to implausible consequences. Thus I was curious to find out what happened.

Observational type theory promises us the best of everything — extensional equality without losing canonicity (please do implement it in Agda soon!). It does rely on proof irrelevance. There is even an Agda embedding (or at least, with the information given, it is not hard to reconstruct one), so I conducted some experiments in the embedding. For this post, however, it is sufficient to do everything in plain Agda and postulate the missing properties.

### Decidable Equality for All Types?

The construction in SEP makes use of functions on propositions, which are not available in the Agda embedding of observational type theory. Instead I experimented with another construction from Benjamin Werner‘s On the Strength of Proof-Irrelevant Type Theories: with a particular interpretation of the axiom of choice and proof irrelevance, one can construct a function that, given `a` and `b` of any type `A`, decides whether `a` equals `b`.

This could also lead to horrifying consequences — we could even compare whether two infinite structures, or two functions, are equal, in a finite amount of time.

#### Axiom of Choice

The axiom of choice, as described on the very informative homepage for the axiom maintained by Eric Schechter, is considered by some the “last great controversy of mathematics.” The axiom comes in many forms, and one of the simplest could be:

Given a collection of non-empty sets, we can choose a member from each set in that collection.

A set of `B`s is usually represented by a characteristic function `B → Set`. Letting the collection of non-empty sets of `B`s be indexed by `A`, the collection can be modelled by a function mapping indexes in `A` to sets of `B`s, that is, a relation `A → B → Set`. The action of “choosing” is modelled by the existence of a function returning the choice, that is, a function taking an index in `A` and returning an element in `B` that is chosen. One possible formulation of the axiom of choice would thus be:

```
ac : {A B : Set} → (R : A → B → Set) →
     (∀ x → ∃ (λ y → R x y)) →
     ∃ {A → B} (λ f → ∀ x → R x (f x))
```

In words: given a collection of sets represented by `A → B → Set`, if `R x` is non-empty for every `x : A`, there exists a function `f : A → B`, such that `f x` is in `R x` for every `x : A`. Another way to see it is that the axiom simply says we can refine a relation `R : A → B → Set` to a function `A → B` provided that the former is total.

It is surprising to some that in constructive logic, the axiom of choice is in fact provable:

```
ac R g = ((λ x → proj₁ (g x)) , λ x → proj₂ (g x))
```

The technical reason is that a proof of `∀ x → ∃ (λ y → R x y)` contains a witness, which we can just pick as the result of the choice function. I will leave the mathematical and philosophical explanation and implications to better sources, such as the axiom of choice homepage, and the two pages in SEP on the axiom and the axiom in type theory.

#### Proof Irrelevance

We define a small language of propositions about equalities: `† A` is the language of propositions whose atoms are equalities between elements of type `A` (`_≐_`) and between Boolean values (`_≘_`):

```
data † (A : Set) : Set where
  TT  : † A
  FF  : † A
  _≐_ : A → A → † A
  _≘_ : Bool → Bool → † A
  _∧_ : † A → † A → † A
  _⇒_ : † A → † A → † A
  _∨_ : † A → † A → † A

￢ : ∀ {A} → † A → † A
￢ P = P ⇒ FF
```

The semantics is more or less what one would expect: `TT` is the unit type, `FF` the empty type, conjunction encoded by products, and implication by functions. In particular, disjunction is encoded by the sum type:

```
⌈_⌉ : ∀ {A} → † A → Set
⌈ TT ⌉    = ⊤
⌈ FF ⌉    = ⊥
⌈ a ≐ b ⌉ = a ≡ b
⌈ a ≘ b ⌉ = a ≡ b
⌈ P ∧ Q ⌉ = ⌈ P ⌉ × ⌈ Q ⌉
⌈ P ⇒ Q ⌉ = ⌈ P ⌉ → ⌈ Q ⌉
⌈ P ∨ Q ⌉ = ⌈ P ⌉ ⊎ ⌈ Q ⌉
```

We shall define the if-and-only-if relation `_⇔_` between terms of `† A`:

```
_⇔_ : ∀ {A} → † A → † A → Set
P ⇔ Q = ⌈ (P ⇒ Q) ∧ (Q ⇒ P) ⌉
```

The current example, however, could have worked too if we simply take `_⇔_` to be `_≡_`. Its role will be more significant for the next example.

Proof irrelevance asserts that we do not distinguish between proofs of the same proposition. Let `p` and `q` be proofs of `P` and `Q` respectively. If `P` and `Q` turn out to be equivalent propositions, then `p` and `q` must be equal:

```
postulate irr : ∀ {A} (P Q : † A) → (p : ⌈ P ⌉)(q : ⌈ Q ⌉) → P ⇔ Q → p ≅ q
```

where `_≅_` stands for heterogeneous equality, needed here because `p` and `q` appear to have different types. Note that in Agda, if we assume uniqueness of identity proofs (i.e. `refl` being the only proof of `a ≅ b`), the axiom `irr` holds for `TT`, `FF`, `_≐_`, `_≘_`, and `_∧_`, and would hold for `_⇒_` if we had extensional equality for functions, but not for disjunction `_∨_`.
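
To see why `irr` fails for disjunction, note that even when the two disjuncts are the same proposition, the sum type still has two distinguishable canonical proofs. A small sketch (the names `p₁` and `p₂` are mine):

```
p₁ : ⌈ TT ∨ TT ⌉        -- that is, ⊤ ⊎ ⊤
p₁ = inj₁ tt

p₂ : ⌈ TT ∨ TT ⌉
p₂ = inj₂ tt

-- inj₁ and inj₂ are distinct constructors, so the two proofs differ
p₁≢p₂ : ¬ (p₁ ≡ p₂)
p₁≢p₂ ()
```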

#### Decidable Equality

For `a b : A`, let `oneof a b` be the type of things that are either `a` or `b`, paired with a proof of equality:

```
oneof : {A : Set} → (a b : A) → Set
oneof {A} a b = Σ A (λ x → ⌈ (x ≐ a) ∨ (x ≐ b) ⌉)
```

The relation `Ψ a b : oneof a b → Bool → Set` relates a `z : oneof a b` value and a Boolean `e` if `e` tells us which value `z` actually is:

```
Ψ : {A : Set} → (a b : A) → oneof a b → Bool → Set
Ψ {A} a b z e = ⌈ ((e ≘ false) ∧ (proj₁ z ≐ a)) ∨
                  ((e ≘ true)  ∧ (proj₁ z ≐ b)) ⌉
```

Certainly `Ψ` is total: for any `z : oneof a b`, either `true` or `false` satisfies `Ψ a b z`. We can prove the totality:

```
Ψ-total : {A : Set} → (a b : A) →
          (z : oneof a b) → ∃ (λ e → Ψ a b z e)
Ψ-total a b (x , inj₁ x≡a) = (false , inj₁ (refl , x≡a))
Ψ-total a b (x , inj₂ x≡b) = (true  , inj₂ (refl , x≡b))
```

and therefore extract, using the axiom of choice, a function that actually computes the Boolean:

```
Ψ-fun : {A : Set} → (a b : A) →
        ∃ {oneof a b → Bool}
          (λ f → (z : oneof a b) → Ψ a b z (f z))
Ψ-fun a b = ac (Ψ a b) (Ψ-total a b)
```

Now we show how to construct a decision procedure for `a ≐ b`. Given `a` and `b`, we inject them to `oneof a b` and call the result `a'` and `b'` respectively:

```
eq_dec : {A : Set} → (a b : A) → ⌈ (a ≐ b) ∨ (￢ (a ≐ b)) ⌉
eq_dec {A} a b = body
  where
    a' : oneof a b
    a' = (a , inj₁ refl)

    b' : oneof a b
    b' = (b , inj₂ refl)
```

In the function body, we extract a choice function `f` (and also get a proof that `f` satisfies the relation `Ψ`) and apply it to both `a'` and `b'`. The results could come in four combinations. If `f a'` is `true`, whatever `f b'` is, from the proof of `Ψ a b a' (f a')` we have a proof of `a ≡ b`. If `f a'` and `f b'` are both `false`, we also get a proof of `b ≡ a`:

```
    body : (a ≡ b) ⊎ (¬ (a ≡ b))
    body with Ψ-fun a b
    ... | (f , f⊆Ψ) with f⊆Ψ a' | f⊆Ψ b'
    ...   | inj₂ (fa'≡tt , a≡b) | _                   = inj₁ a≡b
    ...   | inj₁ (fa'≡ff , a≡a) | inj₁ (fb'≡ff , b≡a) = inj₁ (sym b≡a)
    ...   | inj₁ (fa'≡ff , a≡a) | inj₂ (fb'≡tt , b≡b) = inj₂ (cont fa'≡ff fb'≡tt)
```

For the case `f a'` is `false` but `f b'` is `true`, we call `cont` to create a contradiction if `a ≡ b`.

How do we get the contradiction? If we can somehow conclude `a' ≡ b'`, by congruence we have `f a' ≡ f b'`. However, that gives us `false ≡ f a' ≡ f b' ≡ true`, which contradicts the assumption that `false` and `true` are two distinct elements of `Bool`:

```
      where
        cont : f a' ≡ false → f b' ≡ true → a ≢ b
        cont fa'≡ff fb'≡tt a≡b =
          let ....
              fa'≡fb' : f a' ≡ f b'
              fa'≡fb' = cong f a'≡b'
          in let open EqR (PropEq.setoid Bool)
             in ff≢tt (begin
                  false ≈⟨ sym fa'≡ff ⟩
                  f a'  ≈⟨ fa'≡fb' ⟩
                  f b'  ≈⟨ fb'≡tt ⟩
                  true  ∎)
```

So the task narrows down to showing that `a' ≡ b'` given `a ≡ b`, which is where proof irrelevance comes in. Recall that `a'` is `a` paired with a proof `P : (a ≐ a) ∨ (a ≐ b)`, and `b'` is `b` paired with a proof `Q : (b ≐ a) ∨ (b ≐ b)`. Now that `a ≡ b`, the two propositions are the same, which by proof irrelevance means that their proofs must be equal too. Here is the actual proof in Agda:

```
        P≡Q : ((a ≐ a) ∨ (a ≐ b)) ≡ ((b ≐ a) ∨ (b ≐ b))
        P≡Q = let open EqR (PropEq.setoid († A))
              in begin
                   (a ≐ a) ∨ (a ≐ b)
                     ≈⟨ cong (λ x → (a ≐ a) ∨ (a ≐ x)) (sym a≡b) ⟩
                   (a ≐ a) ∨ (a ≐ a)
                     ≈⟨ cong (λ x → (x ≐ a) ∨ (x ≐ x)) a≡b ⟩
                   (b ≐ a) ∨ (b ≐ b)
                 ∎

        a'≡b' : a' ≡ b'
        a'≡b' = cong-× a≡b
                  (irr ((a ≐ a) ∨ (a ≐ b)) ((b ≐ a) ∨ (b ≐ b))
                       (proj₂ a') (proj₂ b') P≡Q)
```

So, what happened here? When I posed the question to Conor McBride, his first question was “Which variant of the axiom of choice did you use?” (“The one that has a trivial proof,” I answered). The second question was “How did you encode disjunction?” Indeed, disjunction is where we “smuggled” in something that should not be there. The two proofs `P : (a ≐ a) ∨ (a ≐ b)` and `Q : (b ≐ a) ∨ (b ≐ b)` can be cast to `(a ≐ a) ∨ (a ≐ a)` when `a ≡ b`. In Agda, there can be two distinct proofs of `(a ≐ a) ∨ (a ≐ a)`. But proof irrelevance deploys some magic to unify `P` and `Q` into one proof exactly when `a ≡ b`, which reveals information we can exploit.

Will we see the same phenomenon in observational equality? “That’s why we don’t have disjunction in the propositions of observational type theory!” Conor replied. Or, when we do need disjunction in propositions, it must be done in a safe manner. Perhaps Conor or Thorsten Altenkirch knows how to do that?

### Excluded Middle

The construction in SEP deriving the law of excluded middle works in a similar way but at a different level — we will be reasoning about propositions and predicates. We need one more ingredient: extensional equality. Equality of propositions, as mentioned in the previous section, is based on the if-and-only-if relation `_⇔_`. Equality of functions, on the other hand, is defined pointwise — two functions are equal if they map (extensionally) equal values to (extensionally) equal results. In particular, equality of predicates (functions that return propositions) is given by:

```
postulate ext : {A B : Set} → (f g : A → † B) → (∀ a → f a ⇔ g a) → f ≡ g
```

Let `X` be the proposition for whose validity we are constructing a decision procedure. We define two predicate constants `U X` and `V X`:

```
U : ∀ {A} → † A → Bool → † A
U X b = X ∨ (b ≘ false)

V : ∀ {A} → † A → Bool → † A
V X b = X ∨ (b ≘ true)
```

And a type for predicates (on Booleans) that are known to be either `U X` or `V X`:

```
UorV : ∀ A → † A → Set
UorV A X = Σ (Bool → † A) (λ P → ⌈ (P ≐ U X) ∨ (P ≐ V X) ⌉)
```

Given a predicate `P : UorV A X` that is either `U X` or `V X`, the relation `Φ` relates `P` and Boolean `b` if `P` (precisely, `proj₁ P`) is true at `b`:

```
Φ : ∀ {A X} → (P : UorV A X) → Bool → Set
Φ P b = ⌈ proj₁ P b ⌉
```

Again `Φ` can be shown to be total, and we may extract a choice function which, given a proposition `P : UorV A X`, returns a Boolean for which `P` is true.

```
Φ-total : ∀ {A X} → (P : UorV A X) → ∃ (λ b → Φ P b)
Φ-fun   : ∀ {A X} → ∃ {UorV A X → Bool} (λ f → ∀ P → Φ P (f P))
```
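
The proofs were omitted above; they follow the same pattern as `Ψ-total` and `Ψ-fun`. A sketch (assuming `subst` from the standard library; when `P` is known to be `U X`, the Boolean `false` makes the second disjunct of `U X false` true, and symmetrically for `V X`):

```
Φ-total (P , inj₁ P≡U) =
  (false , subst (λ Q → ⌈ Q false ⌉) (sym P≡U) (inj₂ refl))
Φ-total (P , inj₂ P≡V) =
  (true  , subst (λ Q → ⌈ Q true ⌉)  (sym P≡V) (inj₂ refl))

Φ-fun = ac Φ Φ-total
```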

Now, like in the previous example, we inject `U X` and `V X` to `UorV A X`:

```
U' : ∀ {A} X → UorV A X
U' X = (U X , inj₁ refl)

V' : ∀ {A} X → UorV A X
V' X = (V X , inj₂ refl)
```

To determine the validity of `X`, we apply `f` to `U' X` and `V' X`:

```
ex_mid : ∀ {A} → (X : † A) → ⌈ X ∨ (￢ X) ⌉
ex_mid {A} X with Φ-fun {X = X}
... | (f , f⊆Φ) with f⊆Φ (U' X) | f⊆Φ (V' X)
...   | inj₁ pX      | _            = inj₁ pX
...   | _            | inj₁ pX      = inj₁ pX
...   | inj₂ fU'≡ff  | inj₂ fV'≡tt  = inj₂ negX
```

If either yields `inj₁`, we have a proof of `X`. For the last case, on the other hand, we have that `f (U' X)` is `false` and `f (V' X)` is `true`, and we try to create a contradiction given a proof of `X`. Like in the previous example, we do so by claiming `U' X ≡ V' X`, therefore `f (U' X) ≡ f (V' X)`, and therefore `false ≡ true`:

```
  where
    negX : ⌈ (￢ X) ⌉
    negX pX = let ...

              in ff≢tt (begin
                   false     ≈⟨ sym fU'≡ff ⟩
                   f (U' X)  ≈⟨ fU'≡fV' ⟩
                   f (V' X)  ≈⟨ fV'≡tt ⟩
                   true      ∎)
```

But this time we must first show that `U X ≡ V X`. Expanding the definitions, we have `U X b = X ∨ (b ≘ false)` and `V X b = X ∨ (b ≘ true)`, and recall that we have a proof of `X`. We may prove a lemma that:

```
lemma : ∀ {A}{P Q R : † A} → ⌈ P ⌉ → (P ∨ Q) ⇔ (P ∨ R)
```
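
A proof of the lemma is immediate, since the given proof of `P` lets us answer `inj₁` in both directions (a sketch):

```
lemma : ∀ {A}{P Q R : † A} → ⌈ P ⌉ → (P ∨ Q) ⇔ (P ∨ R)
lemma p = ((λ _ → inj₁ p) , (λ _ → inj₁ p))
```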

from which we may conclude `U X b ⇔ V X b` and, by extensional equality of functions, `U X ≡ V X`:

```
    negX pX = let U≡V : U X ≡ V X
                  U≡V = let UXb⇔VXb : ∀ b → U X b ⇔ V X b
                            UXb⇔VXb b = lemma {_}{X}{b ≘ false}{b ≘ true} pX
                        in ext (U X) (V X) UXb⇔VXb
```

The rest is the same as in the previous example.

# Algebra of programming in Agda: dependent types for relational program derivation

S-C. Mu, H-S. Ko, and P. Jansson. In Journal of Functional Programming, Vol. 19(5), pp. 545-579. Sep. 2009 [PDF]

Relational program derivation is the technique of stepwise refining a relational specification to a program by algebraic rules. The program thus obtained is correct by construction. Meanwhile, dependent type theory is rich enough to express various correctness properties to be verified by the type checker.

We have developed a library, AoPA, to encode relational derivations in the dependently typed programming language Agda. A program is coupled with an algebraic derivation whose correctness is guaranteed by the type system.

Two non-trivial examples are presented: an optimisation problem, and a derivation of quicksort where well-founded recursion is used to model terminating hylomorphisms in a language with inductive types.

This article extends the paper we published in Mathematics of Program Construction 2008. Code accompanying the paper has been developed into an Agda library AoPA.

# “General Recursion using Coinductive Monad” Got Right

In the previous post I talked about encoding general recursion in Agda using a coinductive monad. One oddity was that Agda does not automatically unfold coinductive definitions, and a solution was to unfold them manually when needed. Therefore, instead of proving, for example, for a coinductively defined `f`:

```
n<k⇒fkn↓=k : forall k n -> n <′′ k -> f k n ↓= k
```

we prove a pair of (possibly mutually recursive) properties:

```
n<k⇒⟨fkn⟩↓=k : forall k n -> n <′′ k -> unfold (f k n) ↓= k
n<k⇒fkn↓=k   : forall k n -> n <′′ k -> f k n ↓= k
```

where `unfold` (a user-defined function) performs the unfolding, and the second property follows from the first by `≡-subst` and the fact that `unfold (f k n) ≡ f k n`.
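For concreteness, such an `unfold` is just an identity function that pattern-matches on its argument, which forces the coinductive definition to expose its outermost constructor. A sketch, using the `Comp` codatatype from the previous post (`unfold-eq` is my name for the accompanying fact):

```
unfold : {a : Set} -> Comp a -> Comp a
unfold (return x) = return x
unfold (m ⟩⟩= f)  = m ⟩⟩= f

-- the unfolding changes nothing, as a case analysis shows
unfold-eq : {a : Set} -> (m : Comp a) -> unfold m ≡ m
unfold-eq (return x) = ≡-refl
unfold-eq (m ⟩⟩= f)  = ≡-refl
```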

While the use of `unfold` appears to be a direct workaround, one may argue that this is merely treating the symptom. To resolve this and some other oddities, Anton Setzer argued in his letter “Getting codata right” (hence the title of this post) to the Agda mailing list that we should go back to a category-theoretical view of codata, and Dan Doel soon successfully experimented with the ideas. The following, however, is based on my current understanding of their work, which may not be mature yet.

### What `codata` Really Is

Inductive datatypes are defined by describing how a value is constructed. The following inductive definition:

```
data List (a : Set) : Set where
  []  : List a
  _∷_ : a -> List a -> List a
```

states that `[]` is a list and, given an element `x` and a list `xs`, one can construct a list `x ∷ xs`.

Coinductive datatypes, on the other hand, are specified by how a covalue can be observed. The following definition of stream:

```
codata Stream (a : Set) : Set where
  []  : Stream a
  _∷_ : a -> Stream a -> Stream a
```

while looking almost identical to `List` apart from the use of the keyword `codata`, is in fact a definition very different in nature from that of `List`. Setzer suggests seeing it as a shorthand for the following collection of definitions (in Setzer’s proposed notation but my naming scheme):

```
mutual
  coalg Stream (a : Set) : Set where
    _* : Stream a -> StObserve a

  data StObserve (a : Set) : Set where
    empty    : StObserve a
    nonempty : a -> Stream a -> StObserve a
```

Notice that `Stream a` appears in the domain, rather than the range, of `_*` because we are defining a coalgebra and `_*` is a deconstructor. My interpretation is that `_*` triggers an observation. Given a stream `xs`, an observation `xs *` yields two possible results: either it is `empty`, or `nonempty x xs'`, in which case we get its head `x` and tail `xs'`. The stream “constructors”, on the other hand, can be defined by:

```
[] : {a : Set} -> Stream a
[] * = empty

_∷_ : {a : Set} -> a -> Stream a -> Stream a
(x ∷ xs) * = nonempty x xs
```

which states that `[]` is the stream which, when observed, yields `empty` and `x ∷ xs` is the stream whose observation is `nonempty x xs`.

A coinductive definition:

`f x ~ e`

is a shorthand for

`(f x)* = e *`

That is, `f x` defines a coinductive object whose observation shall be the same as that of `e`. Setzer proposes explicit notation (e.g. `~ (x ∷ xs)`) for pattern matching on covalues. Back in current Agda, we may see pattern matching on a covalue as implicitly applying `_*` and pattern-matching on the result. For example, the Agda program:

```
f : {a : Set} -> Stream a -> ...
f []       = ...
f (x ∷ xs) = ...
```

can be seen as syntax sugar for:

```
f : {a : Set} -> Stream a -> ...
f ys with ys *
... | empty         = ...
... | nonempty x xs = ...
```

### Covalue-Indexed Types

What about (co)datatypes that are indexed by covalues? Take, for example, the codatatype `Comp a` for possibly non-terminating computations, defined in my previous post:

```
codata Comp (a : Set) : Set where
  return : a -> Comp a
  _⟩⟩=_  : Comp a -> (a -> Comp a) -> Comp a
```

and the datatype `_↓=_` claiming that certain computation terminates and yields a value:

```
data _↓=_ {a : Set} : Comp a -> a -> Set where
  ↓=-return : forall {x} -> (return x) ↓= x
  ↓=-bind   : forall {m f x y} ->
              m ↓= x -> (f x) ↓= y -> (m ⟩⟩= f) ↓= y
```

What is the definition actually trying to say? Given `m : Comp a` and `x`, `m ↓= x` is a type. In the case that `m` is `return x`, there is an immediate proof, denoted by `↓=-return`, of `(return x) ↓= x` (that is, `return x` terminates with value `x`). For the case `m ⟩⟩= f`, we can construct its termination proof, denoted by constructor `↓=-bind`, from termination proofs of `m` and `f x`. Setzer suggests the following definition, which corresponds more literally to the verbal description above:

```
mutual
  _↓=_ : {a : Set} -> Comp a -> a -> Set
  return x ↓= y  = ↓=-Return x y
  (m ⟩⟩= f) ↓= x = ↓=-Bind m f x

  data ↓=-Return {a : Set} : a -> a -> Set where
    ↓=-return : forall {x} -> ↓=-Return x x

  data ↓=-Bind {a : Set} : Comp a -> (a -> Comp a) -> a -> Set where
    ↓=-bind : forall {m f x y} ->
              m ↓= x -> (f x) ↓= y -> ↓=-Bind m f y
```

Now `↓=-return` and `↓=-bind` are the only constructors of their own types, and `_↓=_` maps `Comp a` typed covalues to their termination proofs according to their observations. Notice that `↓=-Return` is exactly the built-in identity type `_≡_`: `return x` terminates with `y` if and only if `x` equals `y`.
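
As a quick sanity check, termination proofs for small closed computations are now direct. A sketch (`ex₁` and `ex₂` are my names):

```
ex₁ : return 3 ↓= 3            -- reduces to ↓=-Return 3 3
ex₁ = ↓=-return

ex₂ : (return 1 ⟩⟩= \x -> return (suc x)) ↓= 2
ex₂ = ↓=-bind ↓=-return ↓=-return
```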

For more examples, compare the valueless termination type `_↓` in the previous post:

```
data _↓ {a : Set} : Comp a -> Set where
  ↓-return : forall {x} -> (return x) ↓
  ↓-bind   : forall {m f} ->
             m ↓ -> (forall x -> m ↓= x -> f x ↓) -> (m ⟩⟩= f) ↓
```

and its “correct” representation:

```
mutual
  _↓ : {a : Set} -> Comp a -> Set
  return x ↓  = ↓-Return x
  (m ⟩⟩= f) ↓ = ↓-Bind m f

  data ↓-Return {a : Set} : a -> Set where
    ↓-return : forall {x} -> ↓-Return x

  data ↓-Bind {a : Set} : Comp a -> (a -> Comp a) -> Set where
    ↓-bind : forall {m f} ->
             m ↓ -> (forall x -> m ↓= x -> f x ↓) -> ↓-Bind m f
```

Here is a non-recursive example. The datatype `Invoked-By`, characterising the sub-computation relation, is written:

```
data Invoked-By {a : Set} : Comp a -> Comp a -> Set where
  invokes-prev : forall {m f} -> Invoked-By m (m ⟩⟩= f)
  invokes-fun  : forall {m f x} -> m ↓= x -> Invoked-By (f x) (m ⟩⟩= f)
```

Like above, we define a function for dispatching `Comp a` covalues according to their observation:

```
data Invoked-By-Bind {a : Set} : Comp a -> Comp a -> (a -> Comp a) -> Set where
  invokes-prev : forall {m f} -> Invoked-By-Bind m m f
  invokes-fun  : forall {m f x} -> m ↓= x -> Invoked-By-Bind (f x) m f

invoked-by : {a : Set} -> Comp a -> Comp a -> Set
invoked-by m (return x) = ⊥
invoked-by m (n ⟩⟩= f)  = Invoked-By-Bind m n f
```

Indeed, `return x` consists of no sub-computations, therefore `invoked-by m (return x)` is an empty relation. On the other hand, there are two possibilities for `invoked-by m (n ⟩⟩= f)`, represented by the two cases of `Invoked-By-Bind`.

### Termination Proof

The lengthy discussion above is of more than pedantic interest. While the mutually recursive indexed type definitions may look rather confusing to some, the effort pays off when we prove properties about them. Redoing the exercises in the previous post using the new definitions, the first piece of good news is that the definitions of `Safe`, `↓-safe`, `eval-safe`, and `eval` stay almost the same. Second, we no longer have to split the termination proofs into pairs.

Take for example the McCarthian Maximum function:

```
f : ℕ -> ℕ -> Comp ℕ
f k n with k ≤′′? n
... | inj₁ k≤n ~ return n
... | inj₂ n<k ~ f k (suc n) ⟩⟩= \x -> f k x
```

Its termination proof is based on two lemmas, together they state that `f k n` terminates with the maximum of `k` and `n`:

```
k≤n⇒fkn↓=n : forall k n -> k ≤′′ n -> f k n ↓= n
k≤n⇒fkn↓=n k n k≤n with k ≤′′? n
... | inj₁ _   = ↓=-return
... | inj₂ n<k = ⊥-elim (¬[m<n∧n≤m] (n<k , k≤n))

n<k⇒fkn↓=k : forall k n -> n <′′ k -> f k n ↓= k
n<k⇒fkn↓=k k n n<k with k ≤′′? n
n<k⇒fkn↓=k k n n<k | inj₁ k≤n = ⊥-elim (¬[m<n∧n≤m] (n<k , k≤n))
n<k⇒fkn↓=k ._ n _ | inj₂ ≤′′-refl =
    ↓=-bind {_}{f (suc n) (suc n)}{\x -> f (suc n) x}{suc n}
      (k≤n⇒fkn↓=n (suc n) (suc n) ≤′′-refl)
      (k≤n⇒fkn↓=n (suc n) (suc n) ≤′′-refl)
n<k⇒fkn↓=k ._ n ≤′′-refl | inj₂ (≤′′-step 2+n≤n) =
    ⊥-elim (¬1+n≤n 2+n≤n)
n<k⇒fkn↓=k k n (≤′′-step 2+n≤k) | inj₂ (≤′′-step _) =
    ↓=-bind {_}{f k (suc n)}{\x -> f k x}{k}
      (n<k⇒fkn↓=k k (suc n) 2+n≤k)
      (k≤n⇒fkn↓=n k k ≤′′-refl)
```

### Divergence Proof

For some reason (lack of space, perhaps?), Megacz did not talk about divergence in his PLPV paper. For completeness, let us give it a try.

A `return` command always terminates. On the other hand, `m ⟩⟩= f` diverges if either `m` does, or `m ↓= x` and `f x` diverges. This is expressed as:

```
mutual
  _↑ : {a : Set} -> Comp a -> Set
  return x ↑  = ⊥
  (m ⟩⟩= f) ↑ = ↑-Bind m f

  codata ↑-Bind {a : Set} : Comp a -> (a -> Comp a) -> Set where
    ↑-prev : forall {m f} -> m ↑ -> ↑-Bind m f
    ↑-fun  : forall {m f} ->
             (forall x -> m ↓= x -> f x ↑) -> ↑-Bind m f
```

For an example, the following is a proof that `length` diverges when applied to an infinite stream:

```
length : {a : Set} -> Stream a -> Comp ℕ
length []       ~ return zero
length (x ∷ xs) ~ length xs ⟩⟩= \n -> return (suc n)

ones : Stream ℕ
ones ~ 1 ∷ ones

length-ones↑ : length ones ↑
length-ones↑ ~ ↑-prev {_}{length ones} length-ones↑
```

The implicit argument `length ones` must be given; otherwise `length-ones↑` fails to typecheck when the file is loaded, although it does typecheck when `length-ones↑` is constructed stepwise using the “Give” command. I do not know whether this is a bug in Agda.

# General Recursion using Coinductive Monad

Agda, like many dependently typed functional languages, recognises a variation of structural recursion. We need some tricks to encode general recursion. With well-founded recursion, you add an extra argument to the general-recursive function you would like to define. The extra argument is a proof that one input of the function and the corresponding value passed to the recursive call are related by a well-founded ordering. Therefore, the recursion cannot go on forever. The function passes the termination check because it is structurally recursive on the proof term.

I like well-founded recursion because it relates well with how, in my understanding, termination proofs are done in relational program derivation. For real programming, however, the extra argument clutters the code. For more complex cases, such as the McCarthian maximum function where the termination proof depends on the result of the function, the function has to be altered in ways making it rather incomprehensible. It would be nice if we could write the program, and prove its termination elsewhere.

Adam Megacz published an interesting paper: A Coinductive Monad for Prop-Bounded Recursion, in which he summarised a number of approaches to perform general recursion in a dependently typed language before presenting his own, an extension to Venanzio Capretta‘s General Recursion via Coinductive Types. As a practice, I tried to port his code to Agda.

### A Monad for Non-Terminating Computation

The codatatype `Comp`, representing a possibly non-terminating computation, is a monad-like structure:

```
codata Comp (a : Set) : Set where
   return : a -> Comp a
   _⟩⟩=_ : Comp a -> (a -> Comp a) -> Comp a
```

While the monadic bind operator has type `M a -> (a -> M b) -> M b`, the `_⟩⟩=_` operator for `Comp` has a more specialised type. This suffices when we use it only at the point of recursive calls. I suspect that we may need to generalise the type when we construct mutually recursive functions.
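For comparison, here is a hypothetical Haskell sketch (not from Megacz’s paper) of what the fully general bind would look like. With a GADT the intermediate type `b` is hidden existentially, so bind may change the result type:

```haskell
{-# LANGUAGE GADTs #-}

-- A sketch of Comp with the fully general bind: the intermediate
-- type b does not appear in the result type Comp a.
data Comp a where
  Return :: a -> Comp a
  Bind   :: Comp b -> (b -> Comp a) -> Comp a
```

The specialised `Comp a -> (a -> Comp a) -> Comp a` used in the post corresponds to fixing `b = a`, which is all we need when bind marks recursive calls of a single function.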

General-recursive programs are written in monadic style. For example, the function `div`, for integral division, can be written as:

```
div : ℕ -> ℕ -> Comp ℕ
div zero n ~ return zero
div (suc m) zero ~ return (suc m)
div (suc m) (suc n) ~
       div (suc m ∸ suc n) (suc n) ⟩⟩= \h ->
       return (suc h)
```

An example harder to tackle would be McCarthy’s 91 function. To be consistent with my earlier blog entry (well, the real reason is that I am lazy. And Megacz has proved the termination of the 91 function anyway), consider the following simplification:

```
f : ℕ -> ℕ -> Comp ℕ
f k n with k ≤′′? n
... | inj₁ k≤n ~ return n
... | inj₂ n<k ~ f k (suc n) ⟩⟩= \x -> f k x
```

The function, seen in Bengt Nordström‘s Terminating General Recursion, returns the maximum of `k` and `n`, and will be referred to as the McCarthian Maximum function.
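As a sanity check, the same recursion pattern transcribed into plain (possibly non-terminating) Haskell does compute the maximum: the inner call climbs `n` up to `k`, and the outer call then stops immediately.

```haskell
-- The McCarthian maximum, transcribed directly. It terminates only
-- because f k (n + 1) eventually returns k, on which f k k stops.
f :: Integer -> Integer -> Integer
f k n | k <= n    = n
      | otherwise = f k (f k (n + 1))
```

For instance, `f 5 0` unfolds through `f 5 1`, …, `f 5 5` and returns `5`, the maximum of the two arguments.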

### Terminating Computation

In the definitions of `div` and `f`, we have merely built some (potentially infinite) codata objects representing the computations. Now we will build an interpreter that would actually carry out a computation, provided that it is terminating. The idea here is similar to what we did in well-founded recursion: construct a proof that the computation terminates, and the interpreter would be structurally recursive on the proof term.

The datatype `_↓` characterises terminating computations:

```
data _↓ {a : Set} : Comp a -> Set where
   ↓-return : forall {x} -> (return x) ↓
   ↓-bind : forall {m f} ->
         m ↓ -> (forall x -> m ↓= x -> f x ↓) -> (m ⟩⟩= f) ↓
```

Certainly, `return x` for all `x` always terminates. For `m ⟩⟩= f` to terminate, both `m` and `f`, given the input from `m`, must terminate. It would be too strong to demand that `f x` terminates for all `x`. We only need `f x` to terminate for those `x` that could actually be returned by `m`. For that we define another datatype `_↓=_ ` such that `m ↓= x` states that a computation `m` terminates and yields a value `x`:

```
data _↓=_ {a : Set} : Comp a -> a -> Set where
   ↓=-return : forall {x} -> (return x) ↓= x
   ↓=-bind : forall {m f x y} ->
         m ↓= x -> (f x) ↓= y -> (m ⟩⟩= f) ↓= y
```

I also needed the following lemmas. The lemma `↓=-unique` states that a computation yields at most one value, if it terminates at all:

```
↓=-unique : {a : Set} {m : Comp a} {x y : a} ->
      m ↓= x -> m ↓= y -> x ≡ y
↓=-unique ↓=-return ↓=-return = ≡-refl
↓=-unique {x = x} (↓=-bind m↓=x fx↓=y) (↓=-bind m↓=x' fx'↓=y')
   with ↓=-unique m↓=x m↓=x'
... | ≡-refl with ↓=-unique fx↓=y fx'↓=y'
... | ≡-refl = ≡-refl
```

The next lemma says that a computation that yields a value terminates:

```
↓=⇒↓ : {a : Set} {m : Comp a} {x : a} -> m ↓= x -> m ↓
↓=⇒↓ ↓=-return = ↓-return
↓=⇒↓ (↓=-bind {m}{f} m↓=x fx↓=y) =
  ↓-bind (↓=⇒↓ m↓=x)
    (\z m↓=z -> ≡-subst (\w -> f w ↓) (↓=-unique m↓=x m↓=z) (↓=⇒↓ fx↓=y))
```

### Safe Computation

Now we construct a function `eval` evaluating terminating computations. Recall, however, that an important thing about well-founded recursion is that the datatype `Acc` consists of only one constructor. Thus, like the `newtype` construct in Haskell, the constructor is superfluous and can be optimised away by the compiler. For `eval`, we would like the proof of termination to be a one-constructor type too. We define the following datatype `Safe`. A computation is safe, if all its sub-computations are:

```
data Safe {a : Set} : Comp a -> Set where
   safe-intro : forall m ->
      (forall m' -> InvokedBy m' m -> Safe m') -> Safe m
```

Given computations `m` and `m'`, the relation `InvokedBy m' m` holds if `m'` is a sub-computation of `m`. The relation `InvokedBy` can be defined by:

```
data InvokedBy {a : Set} : Comp a -> Comp a -> Set where
  invokes-prev : forall {m f} -> InvokedBy m (m ⟩⟩= f)
  invokes-fun : forall {m f x} ->
    m ↓= x -> InvokedBy (f x) (m ⟩⟩= f)
```

Notice that `Safe` is exactly a specialised version of `Acc`. `Safe m` states that `m` is accessible under relation `InvokedBy`. That is, there is no infinite chain of computations starting from `m`.

Programmers prove termination of programs by constructing terms of type `_↓`. The following lemma states that terminating computations are safe:

```
↓-safe : {a : Set} -> (m : Comp a) -> m ↓ -> Safe m
↓-safe (return x) ↓-return = safe-intro _ fn
  where fn : (m' : Comp _) -> InvokedBy m' (return x) -> Safe m'
        fn _ ()
↓-safe (m ⟩⟩= f) (↓-bind m↓ m↓=x⇒fx↓) = safe-intro _ fn
  where fn : (m' : Comp _) -> InvokedBy m' (m ⟩⟩= f) -> Safe m'
        fn .m invokes-prev = ↓-safe m m↓
        fn ._ (invokes-fun m↓=x) = ↓-safe _ (m↓=x⇒fx↓ _ m↓=x)
```

We can then use the lemma to convert `m ↓` to `Safe m`, which will be used by `eval-safe`:

```
eval-safe : {a : Set} -> (m : Comp a) -> Safe m -> ∃ (\x -> m ↓= x)
eval-safe (return x) (safe-intro ._ safe) = (x , ↓=-return)
eval-safe (m ⟩⟩= f) (safe-intro ._ safe)
  with eval-safe m (safe m invokes-prev)
... | (x , m↓=x) with eval-safe (f x) (safe (f x) (invokes-fun m↓=x))
...   | (y , fx↓y) = (y , ↓=-bind m↓=x fx↓y)
```

It is structurally recursive because `safe` is a sub-component of `safe-intro ._ safe`. Our `eval` function simply calls `eval-safe`:

```
eval : {a : Set} (m : Comp a) -> m ↓ -> a
eval m m↓ with eval-safe m (↓-safe m m↓)
... | (x , m↓=x) = x
```
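For intuition only: outside a proof assistant one cannot pattern match on a termination proof, but a fuel-bounded runner plays a similar role. This is an analogy of `eval`, not a translation; `Comp` and `run` below are my own hypothetical names.

```haskell
{-# LANGUAGE GADTs #-}

-- A Haskell model of Comp, with bind reified as a constructor.
data Comp a where
  Return :: a -> Comp a
  Bind   :: Comp b -> (b -> Comp a) -> Comp a

-- Fuel-bounded runner: Nothing stands in for a missing termination
-- proof, where Agda's eval would simply not be callable.
run :: Int -> Comp a -> Maybe a
run _ (Return x) = Just x
run 0 (Bind _ _) = Nothing
run n (Bind m f) = do x <- run (n - 1) m
                      run (n - 1) (f x)
```

For example, `run 10 (Bind (Return 1) (Return . (+ 1)))` yields `Just 2`, while a diverging computation exhausts its fuel and yields `Nothing`.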

### Example: Div

Recall the function `div`:

```
div : ℕ -> ℕ -> Comp ℕ
div zero n ~ return zero
div (suc m) zero ~ return (suc m)
div (suc m) (suc n) ~
       div (suc m ∸ suc n) (suc n) ⟩⟩= \h ->
       return (suc h)
```

To prove its termination we use strong induction: if `div k n` terminates for all `k < m`, we have `div m n` terminates as well. The strong induction principle is given by:

```
strong-induction : (P : ℕ -> Set) ->
  (forall n -> (forall m -> m <′ n -> P m) -> P n) ->
    (forall n -> P n)
strong-induction P Pind n = Pind n (ind n)
  where ind : forall n -> forall m -> m <′ n -> P m
        ind zero _ ()
        ind (suc n) ._ ≤′-refl = strong-induction P Pind n
        ind (suc n) m (≤′-step m<n) = ind n m m<n
```

Notice how it resembles the accessibility proof `ℕ<-wf`. Indeed, accessibility is strong induction. We are doing the same proof, but organised differently.

There is a little extra complication, however. In Agda, co-recursively defined functions do not automatically unfold. We have to unfold them manually, a trick mentioned by Ulf Norell and Dan Doel on the Agda mailing list. The following function `unfold`, through pattern matching, expands the definition of a coinductively defined function:

```
unfold : {a : Set} -> Comp a -> Comp a
unfold (return x) = return x
unfold (m ⟩⟩= f) = m ⟩⟩= f
```

To prove a property about `m`, we could prove the same property for `unfold m`, and then use the fact that `unfold m ≡ m`, proved below:

```
≡-unfold : {a : Set} {m : Comp a} -> unfold m ≡ m
≡-unfold {_}{return x} = ≡-refl
≡-unfold {_}{m ⟩⟩= f} = ≡-refl
```

Here is the proof that `div m n` terminates for all `m`, using strong induction:

```
div↓ : forall m n -> div m n ↓
div↓ m n = strong-induction (\m -> div m n ↓)
            (\m f -> ≡-subst _↓ ≡-unfold (ind m f)) m
  where
   ind : forall m -> (forall k -> k <′ m -> div k n ↓) ->
              unfold (div m n) ↓
   ind zero f = ↓-return
   ind (suc m) f with n
   ... | zero = ↓-return
   ... | (suc n') = ↓-bind (f (suc m ∸ suc n') (sm-sn<sm m n'))
                      (\h _ -> ↓-return)
```

where `sm-sn<sm : forall m n -> (m ∸ n) <′ suc m`.

### Example: McCarthian Maximum

Now, let us consider the McCarthian Maximum:

```
f : ℕ -> ℕ -> Comp ℕ
f k n with k ≤′′? n
... | inj₁ k≤n ~ return n
... | inj₂ n<k ~ f k (suc n) ⟩⟩= \x -> f k x
```

Two peculiarities make this function special. Firstly, for `n < k`, the value of `n` gets closer to `k` at each recursive call. The function terminates immediately for `n ≥ k`. Therefore, to reason about its termination, we need a downward induction, where `n = k` is the base case and termination for `n = i` is inducted from `n = i+1, i+2, .. , k`. To make this induction easier, I created another datatype expressing "less than or equal to":

```
data _≤′′_ : ℕ -> ℕ -> Set where
  ≤′′-refl : forall {n} -> n ≤′′ n
  ≤′′-step : forall {m n} -> suc m ≤′′ n -> m ≤′′ n

_<′′_ : ℕ -> ℕ -> Set
m <′′ n = suc m ≤′′ n
```

Secondly, to reason about the termination of `f` we need information about the value it returns. In the branch for `inj₂ n<k`, termination of the first recursive call follows from downward induction: `suc n` is closer to `k` than `n`. On the other hand, we know that the second call terminates only after we notice that `f` returns the maximum of its arguments. Therefore, `f k (suc n)` yields `k`, and `f k k` terminates.

I proved the termination of `f` in two parts, respectively for the case when `k ≤ n`, and `n < k`:

```
k≤n⇒fkn↓=n : forall k n -> k ≤′′ n -> f k n ↓= n
n<k⇒fkn↓=k : forall k n -> n <′′ k -> f k n ↓= k
```

It then follows immediately that:

```
f↓ : forall k n -> f k n ↓
f↓ k n with k ≤′′? n
... | inj₁ k≤n = ↓=⇒↓ (k≤n⇒fkn↓=n k n k≤n)
... | inj₂ n<k = ↓=⇒↓ (n<k⇒fkn↓=k k n n<k)
```

Both `k≤n⇒fkn↓=n` and `n<k⇒fkn↓=k`, on the other hand, were proved in two stages. I needed to prove an unfolded version of the lemma, which may or may not be mutually recursive on the lemma to be proved. Take `n<k⇒fkn↓=k` for example:

```
mutual
  n<k⇒⟨fkn⟩↓=k : forall k n -> n <′′ k -> unfold (f k n) ↓= k
  n<k⇒⟨fkn⟩↓=k k n n<k = ... {- tedious proof, I don't think you want to read it -}

  n<k⇒fkn↓=k : forall k n -> n <′′ k -> f k n ↓= k
  n<k⇒fkn↓=k k n n<k =
      ≡-subst (\m -> m ↓= k) (≡-unfold {_}{f k n}) (n<k⇒⟨fkn⟩↓=k k n n<k)
```

As mentioned above, this two-stage pattern was discussed on the Agda mailing list. I do not know whether it will become a recommended programming pattern. Some follow-up discussions seemed to show that it is not always necessary. I have got to read more.

# Typed λ Calculus Interpreter in Agda

Update (Oct. 5, 2010): updated the code to use the current version of Standard Library (with universe polymorphism); index the environment by typing context.

It has been a typical exercise for beginners to write an evaluator for some variation of λ calculus. In a recent meeting I talked to my assistants about using dependent types to guarantee that, in an evaluator for λ calculus using de Bruijn indices, variable look-up always succeeds. The basic idea is to represent the environment using the `Vec` datatype, and label each λ term with the number of enclosing λ’s, which must match the length of the environment.

I thought it could also be an easy exercise for them to write up such an evaluator. Yet again, things turned out to be a little harder than I expected, so I am recording the code here for reference. I do not remember where I have seen similar code before, but I believe the following is more or less standard.

### Evaluating Simply-Typed λ Calculus

Take simply-typed λ calculus with natural numbers as the only base type. It is common in Haskell to use such a tagged value type:

```data Val = Num Int | Fun (Val -> Val) ```

However, `Val` is not a legitimate type in Agda, since recursion occurs in a negative position in the case of `Fun`. Indeed, `Val -> Val` is a “larger” type than `Val`, and we would need a more complicated semantics to accommodate the former into the latter.
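For contrast, Haskell accepts the negative occurrence, and that is precisely what lets us write non-terminating terms. A small hypothetical illustration (`selfApply` is my own example, not from the post):

```haskell
data Val = Num Int | Fun (Val -> Val)

-- Self-application type checks only because Fun stores a function on
-- Val itself. Evaluating selfApply (Fun selfApply) loops forever,
-- which is why a total language such as Agda must reject Val.
selfApply :: Val -> Val
selfApply v@(Fun f) = f v
selfApply v         = v
```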

Oleg Kiselyov suggested defunctionalising `Val`. For example, represent the value as either

```data Val = Num Int | Closure [Val] Term ```

or

```data Val n = Num n Int | Closure [Val m] (Val (n+1)) ```

I may try these alternatives later. At the time when I wrote the code, the right thing to do seemed to be switching to a tagless representation for values and integrate type checking into the term type.

The type `Term Γ a` represents a term that, in the typing context `Γ`, evaluates to a value having type `a`. Since we are about to use de Bruijn notation, the typing context is simply a vector of types:

```
Cxt : ℕ → Set1
Cxt n = Vec Set n
```

Note that it is a vector storing types (in `Set1`) rather than values (in `Set`). Before Agda introduced universe polymorphism, one would have to use `Vec₁`, a `Set1` variation of `Vec`.

The type `Term Γ a` can then be defined by:

```
data Term {n : ℕ} (Γ : Cxt n) : Set → Set1 where
   lit : ℕ → Term Γ ℕ
   var : (x : Fin n) → Term Γ (lookup x Γ)
   _·_ : ∀ {a b} → Term Γ (a → b) → Term Γ a → Term Γ b
   add : Term Γ ℕ → Term Γ ℕ → Term Γ ℕ
   ƛ : ∀ {a b} → Term (a ∷ Γ) b → Term Γ (a → b)
```

The type `Fin n` denotes natural numbers between `0` and `n-1`. Notice that in a (de Bruijn) variable `var x`, the value of `x` is bounded by `n`, the length of the typing context. The `lookup` function is thus guaranteed to succeed. The typing context is extended with one more element within each `ƛ`. The type `Vec`, its look-up function, and the type `Fin` for bounded natural numbers are all defined in the standard library:

```
data Vec {a} (A : Set a) : ℕ → Set a where
   [] : Vec A zero
   _∷_ : ∀ {n} (x : A) (xs : Vec A n) → Vec A (suc n)
```

```
data Fin : ℕ → Set where
   zero : {n : ℕ} → Fin (suc n)
   suc : {n : ℕ} (i : Fin n) → Fin (suc n)
```

```
lookup : ∀ {a n} {A : Set a} → Fin n → Vec A n → A
lookup zero (x ∷ xs) = x
lookup (suc i) (x ∷ xs) = lookup i xs
```

Now the evaluator. We need an environment allowing us to store heterogeneous values. One possibility is to index the environment by typing contexts: given a typing context `Γ` of length `n`, `Env Γ` is essentially a vector of `n` values whose types match those in `Γ`:

```
data Env : {n : ℕ} → Cxt n → Set1 where
   [] : Env []
   _∷_ : ∀ {a n} {Γ : Cxt n} → a → Env Γ → Env (a ∷ Γ)
```

Agda allows us to reuse the constructors `[]` and `_∷_`. For an example, `0 ∷ 1 ∷ (λ v → suc v) ∷ []` may have type `Env (ℕ ∷ ℕ ∷ (ℕ → ℕ) ∷ [])`.

We also need a function that looks up an environment:

```
envLookup : ∀ {n}{Γ : Cxt n} → (i : Fin n) → Env Γ → lookup i Γ
envLookup zero (v ∷ ρ) = v
envLookup (suc n) (v ∷ ρ) = envLookup n ρ
```

Notice the resulting type `lookup i Γ` — the type of the `i`th element in an environment is the `i`th type in the typing context.

With the setting up, the evaluator looks rather standard:

```
eval : ∀ {a n} {Γ : Cxt n} → (ρ : Env Γ) → Term Γ a → a
eval ρ (lit m) = m
eval ρ (var x) = envLookup x ρ
eval ρ (e₁ · e₂) = (eval ρ e₁) (eval ρ e₂)
eval ρ (add e₁ e₂) = (eval ρ e₁) + (eval ρ e₂)
eval ρ (ƛ e) = λ v → eval (v ∷ ρ) e
```

### An Environment as a Vector of Type-Value Pairs

In my earlier attempt I tried to reuse the `Vec` type for the environment as well. After all, an environment can be seen as a list of (type, value) pairs:

```
Env : ℕ → Set1
Env n = Vec (Σ Set (λ a → a)) n
```

where `Σ` is a pair whose second component’s type may depend on the first. The two projection functions of a pair are respectively `proj₁` and `proj₂`. In `Σ Set (λ a → a)`, the type of the second component is exactly the first; that is, we may have environments like `ρ = (ℕ , 0) ∷ (ℕ , 1) ∷ (ℕ → ℕ , (λ v → suc v)) ∷ []`. To extract the `i`th value, we simply apply `proj₂ (lookup i ρ)`, which by the definition of `Σ` has type `proj₁ (lookup i ρ)`. Applying `map proj₁` to `ρ` gives us the typing context `ℕ ∷ ℕ ∷ (ℕ → ℕ) ∷ []`. The `eval` function now has type:

`eval : ∀ {a n} → (ρ : Env n) → Term (map proj₁ ρ) a → a`

In dependently typed programming, however, it is not always easy to reuse general purpose datatypes. In this case, the price for reusing `Vec` and `lookup` is an additional proof obligation in the `var x` case of `eval`:

```
eval : ∀ {a n} → (ρ : Env n) → Term (map proj₁ ρ) a → a
eval ρ (lit m) = m
eval ρ (var x) rewrite lookup-proj x ρ = proj₂ (lookup x ρ)
eval ρ (e₁ · e₂) = eval ρ e₁ (eval ρ e₂)
eval ρ (add e₁ e₂) = (eval ρ e₁) + (eval ρ e₂)
eval ρ (ƛ e) = λ v → eval ((_ , v) ∷ ρ) e
```

The reason is that the right-hand side, `proj₂ (lookup x ρ)`, has type `proj₁ (lookup x ρ)`, while the type checker, having just matched the `var` constructor, expects something having type `lookup x (map proj₁ ρ)`. We thus have to convince Agda that they are actually the same type:

```
lookup-proj : ∀ {n} (x : Fin n) (ρ : Env n) →
     lookup x (map proj₁ ρ) ≡ proj₁ (lookup x ρ)
lookup-proj zero (_ ∷ _) = refl
lookup-proj (suc i) (_ ∷ xs) = lookup-proj i xs
```

### Higher-Order Abstract Syntax

My original purpose was to demonstrate the use of vectors to guarantee static properties. If we decide not to use de Bruijn notation and switch to higher-order abstract syntax instead, we could have a much simpler evaluator:

```
data Term : Set → Set1 where
  lit : ℕ → Term ℕ
  var : ∀ {a} → a → Term a
  _·_ : ∀ {a b} → Term (a → b) → Term a → Term b
  add : Term ℕ → Term ℕ → Term ℕ
  ƛ : ∀ {a b} → (a → Term b) → Term (a → b)

eval : ∀ {a} → Term a → a
eval (lit n) = n
eval (var v) = v
eval (e₁ · e₂) = eval e₁ (eval e₂)
eval (add e₁ e₂) = eval e₁ + eval e₂
eval (ƛ e) = λ x → eval (e x)
```

It is a bit odd to me that `var` and `lit` are actually the same thing. Nothing forbids you from writing `ƛ (λ x → lit x)` or `ƛ (λ x → var 3)`.

Oleg Kiselyov has written a lot on typed tagless interpreters. Besides the initial representation above, there is a dual, final representation. The purpose is to experiment with staged interpreters. Their paper Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages is a good read which I would recommend.
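As a rough illustration of the final style (my own minimal sketch, not the interface from their paper, where `Sym` and `Eval` are hypothetical names): the object language becomes a type class, a term is a value polymorphic in the representation, and an interpreter is just an instance.

```haskell
-- Final encoding: one class method per syntactic construct, with the
-- object-level type carried in the index of the representation.
class Sym repr where
  lit :: Int -> repr Int
  add :: repr Int -> repr Int -> repr Int
  lam :: (repr a -> repr b) -> repr (a -> b)
  app :: repr (a -> b) -> repr a -> repr b

-- The evaluator instance: repr is the identity, wrapped in a newtype.
newtype Eval a = Eval { eval :: a }

instance Sym Eval where
  lit n   = Eval n
  add x y = Eval (eval x + eval y)
  lam f   = Eval (eval . f . Eval)
  app f x = Eval (eval f (eval x))
```

With this sketch, `eval (app (lam (\x -> add x (lit 1))) (lit 41))` evaluates to `42`, and ill-typed object terms are rejected by the Haskell type checker with no tags involved.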

### Code

• Eval.agda: an early version, using `Vec` for the environment.
• Eval2.agda: using higher-order abstract syntax.
• Eval3.agda: the code presented in the main text of this post.

# Algebra of programming using dependent types

S-C. Mu, H-S. Ko, and P. Jansson. In Mathematics of Program Construction 2008, LNCS 5133, pp 268-283. July 2008. [PDF]

Dependent type theory is rich enough to express that a program satisfies an input/output relational specification, but it could be hard to construct the proof term. On the other hand, squiggolists know very well how to show that one relation is included in another by algebraic reasoning. We demonstrate how to encode functional and relational derivations in a dependently typed programming language. A program is coupled with an algebraic derivation from a specification, whose correctness is guaranteed by the type system.

Code accompanying the paper has been developed into an Agda library AoPA.

# Well-Foundedness and Reductivity

Before learning dependent types, all I knew about program termination was from the work of Henk Doornbos and Roland Backhouse [DB95, DB96]. I am curious about the connection between their formulation of well-foundedness by relational calculation and the type theoretic view.

It is generally accepted that a relation _<_ is well-founded if, for all x, there is no infinite chain x > x₁ > x₂ > x₃ > …. A related concept is inductivity. We say that a relation admits induction if we can use it to perform induction. For example, the less-than ordering on natural numbers is inductive.

Most of the type theoretical papers I have seen equate well-foundedness to the concept of accessibility, which I talked about in a previous post. For example, Bengt Nordström [Nor88] introduced the datatype `Acc` as a “constructively more useful description” of well-foundedness.

Doornbos and Backhouse generalised well-foundedness and other related notions to an arbitrary F-algebra. Well-foundedness, however, was generalised to “having a unique solution”: a relation R : FA ← A is F-well-founded if the equation X = S . FX . R has a unique solution in X.

### Reductivity

Inductivity, on the other hand, is generalised to F-reductivity. While F-reductivity is a generalisation of strong induction, Doornbos and Backhouse called it reductivity “in order to avoid confusion with existing notions of inductivity.” [DB95] (Nevertheless, an equivalent concept in Algebra of Programming modelled using membership relation rather than monotype factors, calls itself “inductivity”. Different usage of similar terminology can be rather confusing sometimes.)

A relation R : FA ← A is F-reductive if

μ(λP . R⧷FP) = id

where ⧷ is the monotype factor, defined by ran(R . A) ⊆ B ≡ A ⊆ R⧷B. The function λP . R⧷FP takes a coreflexive P (a relation that is a subset of id) to the coreflexive R⧷FP, and μ takes its least fixed-point.

Well, it is a definition that is very concise but also very hard to understand, unless you are one of the few who really like this sort of thing. If we take P to be a predicate, R⧷FP expands to a predicate that is true of x if

∀ y . y R x ⇒ FP y

Taking the fixed point means to look for the minimum P such that

∀ x . (∀ y . y R x ⇒ FP y) ⇒ P x

The expression now resembles strong induction we all know very well. Computationally, how do we take the fixed point? Of course, by defining a recursive datatype:

```
data Acc {a : Set}(R : a -> a -> Set) : a -> Set where
  acc : (x : a) -> (forall y -> R y x -> f (Acc R) y) -> Acc R x
```

where `f : (a -> Set) -> (F a -> Set)` is the arrow-operation of functor `F`. When we take `F` to be the identity functor, it is exactly the definition of accessibility in my previous post! Finally, the requirement that μ(λP . R⧷FP) = id means every element in `a` is accessible.

So, it seems that the reductivity of Doornbos and Backhouse is in fact accessibility, which is often taken by type theorists to be an alternative definition of well-foundedness. Doornbos and Backhouse, however, showed that reductivity guarantees unique solutions of hylo equations (which is their definition of F-well-foundedness). The converse is not true in general.

### New Reductive Relations from Old

Doornbos and Backhouse also developed some principles to build new reductive relations from existing ones. Let us try to see their Agda counterparts. For simplicity, we take `F` to be the identity functor in the discussion below.

All initial algebras are reductive. For example, given `Pred` relating `n` and `suc n`, one can show that `Pred` is reductive.

```
data Pred : ℕ -> ℕ -> Set where
  pre : (n : ℕ) -> Pred n (suc n)

ℕPred-wf : well-found Pred
ℕPred-wf n = acc n (access n)
  where
    access : (n : ℕ) -> (m : ℕ) -> Pred m n -> Acc Pred m
    access .(suc n) .n (pre n) = acc n (access n)
```

We have seen in the previous post that the less-than relation is reductive.

If R is reductive and S ⊆ R, then S is reductive too:

```
acc-⊑ : {a : Set}{S R : a -> a -> Set} ->
      S ⊑ R -> (x : a) -> Acc R x -> Acc S x
acc-⊑ {a}{S} S⊑R x (acc .x h) = acc x access
  where
    access : (y : a) -> S y x -> Acc S y
    access y ySx = acc-⊑ S⊑R y (h y (S⊑R y x ySx))
```

If f . R . f˘ is reductive for some function f, so is R. This is how we often prove termination of loops using f as the variant.

```
acc-fRf˘ : {a b : Set}{R : a -> a -> Set}{f : a -> b} ->
    (x : a) -> Acc (fun f ○ R ○ ((fun f) ˘)) (f x) -> Acc R x
acc-fRf˘ {a}{b}{R}{f} x (acc ._ h) = acc x access
  where
    access : (y : a) -> R y x -> Acc R y
    access y xRy = acc-fRf˘ y
      (h (f y) (exists x (exists y (≡-refl , xRy) , ≡-refl)))
```

where the operators for relational composition (`_○_`) and converse (`_˘`) are those defined in AoPA.

Finally, for the special case where F is the identity functor, R is reductive if and only if its transitive closure R⁺ is. Let us first define transitive closure:

```
data _⁺ {a : Set}(R : a -> a -> Set) : a -> a -> Set where
  ⁺-base : forall {x y} -> R x y -> (R ⁺) x y
  ⁺-step : forall {x z} -> ∃ a (\y -> R y z × (R ⁺) x y) -> (R ⁺) x z
```

The “if” direction is easy. Let us prove the “only-if” direction.

```
acc-tc : {a : Set}{R : a -> a -> Set} ->
    (x : a) -> Acc R x -> Acc (R ⁺) x
acc-tc {a}{R} x ac = acc x (access x ac)
  where
    access : (x : a) -> Acc R x ->
        (y : a) -> (R ⁺) y x -> Acc (R ⁺) y
    access x (acc .x h) y (⁺-base yRx) = acc-tc y (h y yRx)
    access x (acc .x h) y (⁺-step (exists z (zRx , yR⁺z))) =
      access z (h z zRx) y yR⁺z
```

Combined with the proof that `Pred` is reductive, we have another way to show that the less-than relation is reductive. One may compare it with the proof of `well-found _<′_` in the previous post.

### References

[DB95] Henk Doornbos and Roland Backhouse. Induction and recursion on datatypes. In B. Möller, editor, Mathematics of Program Construction, 3rd International Conference, volume 947 of LNCS, pages 242-256. Springer-Verlag, July 1995.

[DB96] Henk Doornbos and Roland Backhouse. Reductivity. Science of Computer Programming, 26, pp. 217--236, 1996.

[Nor88] Bengt Nordström. Terminating general recursion. BIT archive, Volume 28, Issue 3, pp 605-619. 1988.

# Well-Founded Recursion and Accessibility

Since Nils Anders Danielsson showed me how to use `Induction.WellFounded` in Agda’s Standard Library, I have been studying issues related to well-founded recursion.

In dependently typed programming, we would like to ensure that proof terms are terminating. Apart from the type checker, Agda employs a separate termination checker. How it exactly works is still at times mysterious to me, but basically it recognises a program as terminating if some of its arguments become “structurally” smaller after each recursive call. This is already far more expressive than I thought — quite a number of programs can be written by structural recursion. Still, we occasionally need some more flexibility. One such occasion was when I wished to express terminating unfolds in AoPA. For another example, consider this function mentioned in the paper Terminating General Recursion by Bengt Nordström:

f (n, k) = if k ≤ n then n else f (f (n + 1, k), k)

It is a variation of McCarthy’s 91 function, used as a popular example due to its non-trivial recursion pattern. Can we express this function in Agda while informing the termination checker that it does terminate?

Given a relation < that has been shown to be well-founded, we know that a recursively defined function is terminating if its arguments become “smaller” with respect to < at each recursive call. We can encode such well-founded recursion in Agda’s style of structural recursion. The earliest paper I could find about such an encoding is Terminating General Recursion by Bengt Nordström. It may have appeared in other forms, however.

### Accessibility

Given a binary relation `_<_ : a -> a -> Set`, an element `x` is in the well-founded subset of `<` if there is no infinite chain of `x₁, x₂, x₃ ... ` such that `x > x₁ > x₂ > x₃ > ...`. The relation is well-founded if all elements of `a` are in its well-founded subset. The less-than ordering on natural numbers is well-founded, while the greater-than ordering is not. A more constructive way to talk about well-foundedness is via accessibility. Given a relation `_<_ : a -> a -> Set`, the proposition `Acc _<_ x` states that `x` is accessible under relation `<`. An element `x : a` is accessible if all `y : a` "smaller" than `x` are accessible:

```
data Acc {a : Set}(_<_ : a -> a -> Set) : a -> Set where
  acc : (x : a) -> (forall y -> y < x -> Acc _<_ y) -> Acc _<_ x
```

A relation `_<_ : a -> a -> Set` is well-founded if all elements in `a` are accessible:

```
well-found : {a : Set} -> (a -> a -> Set) -> Set
well-found _<_ = forall x -> Acc _<_ x
```

For an example, consider the less-than-or-equal-to ordering on natural numbers. It is usually modelled by a datatype stating that `0 ≤ n` for all `n` and that `1+m ≤ 1+n` follows from `m ≤ n`. The strictly-less-than ordering is defined by `m < n = 1+m ≤ n`. For well-foundedness proofs, however, the following definitions turn out to be more convenient:

```
data _≤′_ : ℕ -> ℕ -> Set where
  ≤′-refl : forall {n} -> n ≤′ n
  ≤′-step : forall {m n} -> m ≤′ n -> m ≤′ suc n

_<′_ : ℕ -> ℕ -> Set
m <′ n = suc m ≤′ n
```

Let's prove that `<′` is well-founded:

```
ℕ<-wf : well-found _<′_
ℕ<-wf n = acc n (access n)
  where
    access : (n : ℕ) -> (m : ℕ) -> m <′ n -> Acc _<′_ m
    access zero m ()
    access (suc n) .n ≤′-refl = acc n (access n)
    access (suc n) m (≤′-step m<n) = access n m m<n
```

An operational intuition for `access`: knowing that `m <′ n`, `access` demonstrates that it is possible to reach `m` from `n` in a finite number of steps.

### Well-Founded Recursion

As an inductively defined datatype, `Acc` naturally induces a fold operator:

```
acc-fold : {a : Set} (_<_ : a -> a -> Set) {P : a -> Set} ->
  ((x : a) -> (forall y -> y < x -> P y) -> P x) ->
    (z : a) -> Acc _<_ z -> P z
acc-fold _<_ Φ z (acc .z h) =
  Φ z (\y y<z -> acc-fold _<_ Φ y (h y y<z))
```

Its type says that if we can provide a function `Φ` that proves some property `P x` for all `x` provided that `P y` has been proved for all `y < x`, then `P` holds for all accessible elements in `a`. Recall strong induction.

When `<` is well-founded, all elements are accessible. Therefore we can define:

```
rec-wf : {a : Set} {_<_ : a -> a -> Set} {P : a -> Set} ->
  well-found _<_ ->
    ((x : a) -> ((y : a) -> y < x -> P y) -> P x) ->
      (x : a) -> P x
rec-wf {a}{_<_} wf f x = acc-fold _<_ f x (wf x)
```

For an example, consider the function `div : ℕ -> ℕ -> ℕ` performing integral division (but letting `div zero zero` be `zero`):

```
div : ℕ -> ℕ -> ℕ
div zero n = zero
div (suc m) zero = suc m
div (suc m) (suc n) = suc (div (suc m ∸ suc n) (suc n))
```

If we define `div` by well-founded recursion with explicit fixed-point, it looks like:

```
div-wf : ℕ -> ℕ -> ℕ
div-wf m n = rec-wf ℕ<-wf (body n) m
  where
    body : (n : ℕ) -> (m : ℕ) -> ((k : ℕ) -> k <′ m -> ℕ) -> ℕ
    body n zero rec = zero
    body zero (suc m) rec = suc m
    body (suc n) (suc m) rec =
        suc (rec (suc m ∸ suc n) (sm-sn<sm m n))
```

where `sm-sn<sm : forall m n -> (m ∸ n) <′ suc m`. Notice how `rec-wf ℕ<-wf` behaves like a fixed-point operator taking the "fixed-point" of `body`. The recursive call is replaced by a function `rec`, having type `(k : ℕ) -> k <′ m -> ℕ`. To call `rec`, the programmer has to provide a proof that the argument is indeed getting smaller with respect to the relation `_<_`.

The function could also be written without explicit fixed-point, but with explicit pattern matching with the accessibility proof:

```
div-wf' : forall m -> ℕ -> Acc _<′_ m -> ℕ
div-wf' zero n _ = zero
div-wf' (suc m) zero _ = suc m
div-wf' (suc m) (suc n) (acc .(suc m) h) =
  suc (div-wf' (suc m ∸ suc n) (suc n)
        (h (suc m ∸ suc n) (sm-sn<sm m n)))
```

### Upper-Bounded Reverse Ordering

As an exercise, let us consider another ordering. For a given natural number `k`, define `n ◁ m = m < n ≤ k`. That is, for `m` and `n` both bounded above by `k`, the one closer to `k` is considered smaller. Evidently `◁` is well-founded. I tried to encode it in Agda by:

```
_<_≤_ : ℕ -> ℕ -> ℕ -> Set
m < n ≤ k = (m <′ n) × (n ≤′ k)

◁ : ℕ -> ℕ -> ℕ -> Set
(◁ k) m n = n < m ≤ k
```

However, the accessibility proof became very difficult to construct. I ended up defining yet another variation of the less-than-or-equal-to ordering:

```
data _≤′′_ : ℕ -> ℕ -> Set where
  ≤′′-refl : forall {n} -> n ≤′′ n
  ≤′′-step : forall {m n} -> suc m ≤′′ n -> m ≤′′ n
```

and defined `m < n ≤ k = (m <′′ n) × (n ≤′′ k)`.

A number of properties can be proved about `≤′′`:

```
m≤n⇒m≤1+n : forall {m n} -> m ≤′′ n -> m ≤′′ (suc n)
m<n⇒¬n≤m : forall {m n} -> m <′′ n -> ¬ (n ≤′′ m)
¬[m<n∧n≤m] : forall {m n} -> ¬ (m <′′ n × n ≤′′ m)
≤′′-trans : forall {m n k} -> m ≤′′ n -> n ≤′′ k -> m ≤′′ k
≤′′-antisym : forall {m n} -> m ≤′′ n -> n ≤′′ m -> m ≡ n
```

Some of the properties above depend on the following:

`¬1+n≤n : forall {n} -> ¬ (suc n ≤′′ n)`

which I have not yet proved.

Now let us show that `◁ k` is well-founded.

```
ℕ◁-wf : forall k -> well-found (◁ k)
ℕ◁-wf k m = acc m access
  where
   access : forall n -> (m < n ≤ k) -> Acc (◁ k) n
   access n (m<n , n≤k) =
    ◁-access k m n (≤′′-trans m<n n≤k) (m<n , n≤k)
```

The function `◁-access` needs, apart from proofs of `m < n` and `n ≤ k`, a proof of `m < k`, which can be established by transitivity of `≤′′`. The definition is shown below:

```
◁-access : forall k m n ->
    (m <′′ k) -> (m < n ≤ k) -> Acc (◁ k) n
◁-access .(suc m) m .(suc m) ≤′′-refl (≤′′-refl , 1+m≤1+m) =
  acc (suc m)
    (\y 1+m<y∧y≤1+m -> ⊥-elim (¬[m<n∧n≤m] 1+m<y∧y≤1+m))
◁-access k m .(suc m) (≤′′-step 1+m<k) (≤′′-refl , 1+m≤k) =
  acc (suc m) (\y -> ◁-access k (suc m) y 1+m<k)
◁-access .(suc m) m n ≤′′-refl (≤′′-step 1+m<n , n≤1+m) =
  acc n (⊥-elim (m<n⇒¬n≤m 1+m<n n≤1+m))
◁-access k m n (≤′′-step 1+m<k) (≤′′-step 1+m<n , n≤k) =
  ◁-access k (suc m) n 1+m<k (1+m<n , n≤k)
```

It is perhaps quite hard to understand unless you try to construct it yourself. The reader only needs to notice that in the two recursive calls to `◁-access` (in the second and the last clauses), the argument `m` is incremented to `suc m`. That is why we need to include a proof term of `m <′′ k` as an argument: it becomes structurally smaller in the recursive calls. This is how we inform the termination checker that, although `m` increases, the distance between `m` and `k` shrinks. The proof of `n ≤ k`, on the other hand, is used to establish the contradictions in the first and third clauses.
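The idea behind `◁ k` can be seen in miniature in plain Python (a sketch, not from the Agda development): the argument grows in each recursive call, yet the recursion terminates because the distance to the bound shrinks:

```python
def count_up(n, k):
    # collect n, n+1, ..., k; the measure k - n strictly decreases
    # in the recursive call even though n itself grows, which is
    # exactly what the ◁ ordering captures
    if n >= k:
        return [k]
    return [n] + count_up(n + 1, k)
```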

### A Variation of McCarthy's 91 Function

Back to the function `f` mentioned in the beginning, whose naive Agda translation:

```
f k n with k ≤? n
... | yes _ = n
... | no _ = f k (f k (suc n))
```

certainly does not pass the termination check.

In fact, the second argument (`n`) to `f` becomes smaller with respect to `◁ k`. This is obvious for the inner recursive call (`f k (suc n)`) since `k ≰ n` implies `n < 1 + n ≤ k`. To prove the termination of the outer recursive call, we have to show that `f k n` actually computes the maximum of `k` and `n`. Since `k ≰ n`, we have `f k (suc n) = k`. Therefore, `f k (suc n)` is smaller than `n` with respect to `◁ k`.
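To make the termination argument concrete, here is the function in plain Python (a sketch, where no termination checker is there to object); one can observe that it indeed computes the maximum of `k` and `n`:

```python
def f(k, n):
    if k <= n:
        return n
    # k ≰ n here: the inner argument (n + 1) and the outer argument
    # (f(k, n + 1), which equals k) are both ◁ k-smaller than n
    return f(k, f(k, n + 1))
```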

To prove the termination of `f`, we therefore have to prove something about its result as well. One possible way to do so is to define:

```
P : ℕ -> ℕ -> Set
P k n = ∃ ℕ (\fn -> n ≤′′ k -> fn ≡ k)
```

and extend the type of `f` to:

`f : (k : ℕ) -> (n : ℕ) -> P k n`

Now that `f` returns an existential type, we have to pattern match on its result. A definition of `f` may look like:

```
f k = rec-wf (ℕ◁-wf k) body
 where
  body : (i : ℕ) -> ((j : ℕ) -> i < j ≤ k -> P k j) -> P k i
  body n rec with (k ≤′′? n)
  ... | inj₁ k≤n = exists n (\n≤k -> ≤′′-antisym n≤k k≤n)
  ... | inj₂ n<k with rec (1 + n) (≤′′-refl , n<k)
  ...  | exists m 1+n≤k⇒m≡k with rec m (n<m , m≤k)
  ...    | exists h m≤k⇒h≡k = exists h (\_ -> h≡k)
   where
    m≡k = 1+n≤k⇒m≡k n<k
    n<m = ≡-subst (\i -> n <′′ i) (≡-sym m≡k) n<k
    m≤k = ≡-subst (\i -> m ≤′′ i) m≡k ≤′′-refl
    h≡k = m≤k⇒h≡k m≤k
```

However, Agda's scoping rules in the presence of both `with` and `where` still confuse me. Eventually I had to inline the subordinate proofs, which gives me:

```
f : (k : ℕ) -> (n : ℕ) -> P k n
f k = rec-wf (ℕ◁-wf k) body
 where
  body : (i : ℕ) -> ((j : ℕ) -> i < j ≤ k -> P k j) -> P k i
  body n rec with (k ≤′′? n)
  ... | inj₁ k≤n = exists n (\n≤k -> ≤′′-antisym n≤k k≤n)
  ... | inj₂ n<k with rec (1 + n) (≤′′-refl , n<k)
  ...  | exists m 1+n≤k⇒m≡k with 1+n≤k⇒m≡k n<k
  ...   | m≡k with rec m (≡-subst (\i -> n <′′ i) (≡-sym m≡k) n<k ,
        ≡-subst (\i -> m ≤′′ i) m≡k ≤′′-refl)
  ...    | exists h m≤k⇒h≡k =
          exists h (\_ -> m≤k⇒h≡k (≡-subst (\i -> m ≤′′ i) m≡k ≤′′-refl))
```

Unfortunately it looks quite remote from the original `f`. Nevertheless, it does pass the termination checker of Agda.

# Terminating Unfolds (2)

After seeing our code, Nils Anders Danielsson suggested two improvements. The first is to wrap the bound in the seed. The terminating `unfoldr↓` then has a simpler type as well as a simpler implementation:

```
unfoldr↓ : {a : Set}(b : ℕ -> Set){n : ℕ} ->
    (f : forall {i} -> b (suc i) -> ⊤ ⊎ (a × b i)) ->
      b n -> [ a ]
unfoldr↓ b {zero} f y = []
unfoldr↓ b {suc i} f y with f y
... | inj₁ _ = []
... | inj₂ (x , y') = x ∷ unfoldr↓ b {i} f y'
```

The definition passes the termination check, apparently, because `unfoldr↓` is defined inductively on `n`.
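The same structure can be sketched in plain Python (hypothetical names, not from the post), with the type-level index turned into an explicit fuel argument on which the recursion is structural:

```python
def unfoldr_bounded(f, seed, n):
    # n plays the role of the index: the recursion is on n,
    # so the result has at most n elements
    if n == 0:
        return []
    step = f(seed)            # None to stop, or (x, seed') to continue
    if step is None:
        return []
    x, seed2 = step
    return [x] + unfoldr_bounded(f, seed2, n - 1)

def down(n):
    # the seed is its own bound, as with the Wrap datatype below
    return unfoldr_bounded(lambda i: None if i == 0 else (i - 1, i - 1), n, n)
```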

To generate a descending list, one may invent a datatype `Wrap` that wraps the seed, whose bound is simply the seed itself:

```
data Wrap : ℕ -> Set where
  W : (n : ℕ) -> Wrap n
```

The list may then be generated by an unfold:

```
dec : forall {i} -> Wrap (suc i) -> ⊤ ⊎ (ℕ × Wrap i)
dec {i} (W .(suc i)) = inj₂ (i , W i)

down↓ : ℕ -> [ ℕ ]
down↓ n = unfoldr↓ Wrap dec (W n)
```

Of course, this would defeat my original goal of reusing the non-dependently typed `dec`, which is probably a bad idea anyway.

To show that the bound need not be exact, let’s try to generate a descending list whose elements are decremented by 2 in each step. We may use this slightly generalised wrapper:

```
data Wrap2 : ℕ -> Set where
  W2 : (x : ℕ) -> (bnd : ℕ) -> x ≤ bnd -> Wrap2 bnd
```

and a function `dec2` that decrements a `suc i`-bounded seed by 2, while showing that the new seed is bounded by `i`:

```
dec2 : forall {i} -> Wrap2 (suc i) -> ⊤ ⊎ (ℕ × Wrap2 i)
dec2 {i} (W2 0 .(1 + i) _) = inj₁ tt
dec2 {i} (W2 1 .(1 + i) _) = inj₁ tt
dec2 {i} (W2 (suc (suc n)) .(1 + i) 2+n≤1+i) =
    inj₂ (n , W2 n i (suc-≤-weaken-l (≤-pred 2+n≤1+i)))
```

The list can then be unfolded by:

```
down↓2 : ℕ -> [ ℕ ]
down↓2 n = unfoldr↓ Wrap2 dec2 (W2 n n ≤-refl)
```

where `suc-≤-weaken-l` is a proof of `forall {m n} -> suc m ≤ n -> m ≤ n`.

### Unfolding a Tree

It is an easy exercise to come up with a tree version of the unfold above:

```
unfoldT↓ : {a : Set}(b : ℕ -> Set){n : ℕ} ->
  (f : forall {i} -> b (suc i) -> a ⊎ (b i × b i)) ->
    b n -> Tree a
unfoldT↓ b {0} f y = Nul
unfoldT↓ b {suc i} f y with f y
... | inj₁ x = Tip x
... | inj₂ (y₁ , y₂) =
    Bin (unfoldT↓ b {i} f y₁) (unfoldT↓ b {i} f y₂)
```
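The tree version can likewise be mimicked in plain Python (a hypothetical fuel-based sketch, not the dependently typed code): as long as the fuel is at least the depth of the tree, a roughly balanced tree can be unfolded from a list:

```python
def unfoldT_bounded(f, seed, n):
    # structural recursion on the fuel n, which bounds the depth
    if n == 0:
        return None                       # Nul
    step = f(seed)                        # ('tip', x) or ('bin', s1, s2)
    if step[0] == 'tip':
        return step
    _, s1, s2 = step
    return ('bin', unfoldT_bounded(f, s1, n - 1),
                   unfoldT_bounded(f, s2, n - 1))

def split(xs):
    # generator: a singleton becomes a tip; otherwise split in half
    if len(xs) == 1:
        return ('tip', xs[0])
    mid = len(xs) // 2
    return ('bin', xs[:mid], xs[mid:])

def to_tree(xs):
    # len(xs) generously over-approximates the depth of the tree
    return unfoldT_bounded(split, xs, len(xs))
```

For example, `to_tree([1, 2, 3])` yields `('bin', ('tip', 1), ('bin', ('tip', 2), ('tip', 3)))`.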

To deal with the second task of building a roughly balanced binary tree, one may try this wrapper:

```
data Split (a : Set) : ℕ -> Set where
  Sp : (xs : [ a ]) -> (bnd : ℕ) ->
    length xs < bnd -> Split a bnd
```

and try to code up a generator function `split↓` having this type:

```
split↓ : forall {a i} -> Split a (suc i) -> a ⊎ (Split a i × Split a i)
```

The function `split↓` I eventually came up with, however, is much more complicated than I had wished. Even worse, it is now `split↓` that fails to pass the termination check.

Ulf Norell suggested some possible fixes. The difficulties, however, are probably a hint that something was wrong with my approach in the first place. Rather than trying to fix it, Nils Anders showed me how he would tackle the problem from the beginning.

### Using Well-Founded Recursion

Nils Anders showed me how to define `unfoldT↓` using well-founded recursion. For a simplified explanation, the Standard Library provides a function `<-rec` having type (after normalisation):

```
<-rec : (P : ℕ -> Set) ->
  (f : (i : ℕ) -> (rec : (j : ℕ) -> j <′ i -> P j) -> P i) ->
    (x : ℕ) -> P x
```

With `<-rec` one can define functions on natural numbers by recursion, provided that the argument strictly decreases in each recursive call. `P` is the type of the result, parameterised by the input. The function `f` is a template that specifies the body of the recursion which, given `i`, computes some result of type `P i`. The functional argument `rec` is supposed to be the recursive call. The constraint `j <′ i`, however, guarantees that it accepts only inputs strictly smaller than `i` (the ordering `<′ ` is a variation of `<` that is more suitable for this purpose). One may perhaps think of `<-rec` as a fixed-point operator computing the fixed-point of `f`, only that `f` has to take `i` before `rec` because the latter depends on the former.
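The calling convention can be imitated in plain Python (a rough analogue with hypothetical names, where the `j <′ i` proof obligation is degraded to a runtime check): the template receives its argument together with a guarded recursive call:

```python
def rec_lt(f, x):
    # a Python analogue of <-rec: f gets the current argument x and
    # a 'rec' that only accepts strictly smaller arguments
    def rec(j):
        assert j < x, "recursive call must use a smaller argument"
        return rec_lt(f, j)
    return f(x, rec)

def half_body(i, rec):
    # a template: integer halving written against rec
    return 0 if i < 2 else 1 + rec(i - 2)
```

Here `rec_lt(half_body, 7)` evaluates to `3`; calling `rec` with a non-decreasing argument fails the check instead of looping forever.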

Let us try to define an unfold on trees using `<-rec`. The "base-functor" of the datatype `Tree⁺ a` is `F b = a ⊎ b × b`. One of the lessons we have learnt is that it would be more convenient for the generating function to return the bound information. We could use a type like this:

`F a b k = a ⊎ ∃ (ℕ × ℕ) (\(i , j) -> b i × b j × i <′ k × j <′ k)`

But it is perhaps more reader-friendly to define the base functor as a datatype:

```
data Tree⁺F (a : Set) (b : ℕ -> Set) : ℕ -> Set where
  tip : forall {k} -> a -> Tree⁺F a b k
  bin : forall {i j k} ->
      b i -> b j -> i <′ k -> j <′ k -> Tree⁺F a b k
```

The generating function for the unfold should thus have type `forall {a b i} -> b i -> Tree⁺F a b i`.

The function `unfoldT↓F` is the template for `unfoldT↓`:

```
unfoldT↓F : {a : Set} {b : ℕ -> Set} ->
  (f : forall {i} -> b i -> Tree⁺F a b i) ->
   (n : ℕ) -> ((i : ℕ) -> i <′ n -> b i -> Tree⁺ a) ->
    b n -> Tree⁺ a
unfoldT↓F f n rec y with f y
... | tip a = Tip⁺ a
... | bin {i} {j} y₁ y₂ i<n j<n =
    Bin⁺ (rec i i<n y₁) (rec j j<n y₂)
```

Now `unfoldT↓` can be defined by:

```
unfoldT↓ : {a : Set} {b : ℕ -> Set} ->
  (f : forall {i} -> b i -> Tree⁺F a b i) ->
    forall {n} -> b n -> Tree⁺ a
unfoldT↓ {a}{b} f {n} y =
    <-rec (\n -> b n -> Tree⁺ a) (unfoldT↓F f) n y
```

None of the definitions above makes a recursive call; all the tricks that get us through the termination check are hidden in `<-rec`. How is it defined?

### Well-Founded Recursion Defined on <′

Currently, well-founded recursion is defined in `Induction` and its sub-modules. They are very interesting modules to study, but the definitions there are rather abstract. To understand how `<-rec` works, let us try to implement our own.

This is the definition of `<′` from `Data.Nat`:

```
data _≤′_ : Rel ℕ where
  ≤′-refl : forall {n} -> n ≤′ n
  ≤′-step : forall {m n} -> m ≤′ n -> m ≤′ suc n

_<′_ : Rel ℕ
m <′ n = suc m ≤′ n
```

Recall that the recursion template `f` has type `forall i -> (forall j -> j <′ i -> P j) -> P i`. That is, given `i`, `f` computes `P i`, provided that we know of a method to compute `P j` for all `j <′ i`. Let us try to construct such a method using `f`. The function `guard f i` applies `f`, provided that the input is strictly smaller than `i`:

```
guard : {P : ℕ -> Set} ->
   (forall i -> (forall j -> j <′ i -> P j) -> P i) ->
      forall i -> forall j -> j <′ i -> P j
guard f zero _ ()
guard f .(suc j) j ≤′-refl = f j (guard f j)
guard f (suc i) j (≤′-step j<i) = guard f i j j<i
```

Observe that `guard` terminates because it is defined inductively on `i`. If we discard the type information, all `guard f i j` does is make a call to `f j`. Before doing so, however, it walks through the proof to make sure that `j` is strictly smaller than `i`. In `f j (guard f j)`, the call to `guard f j` ensures that subsequent calls to `f` are given arguments smaller than `j`.

Now `<-rec` can be defined by:

```
<-rec : (P : ℕ -> Set) ->
   (forall i -> (forall j -> j <′ i -> P j) -> P i) ->
      forall n -> P n
<-rec P f n = f n (guard f n)
```
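Stripped of the dependent types, the construction looks like this in Python (a sketch with hypothetical names; the `≤′` proof is replaced by stepping an index down until the `≤′-refl` case is reached):

```python
def guard(f, i):
    # returns a method that computes the result for any j < i,
    # mirroring the Agda guard defined by induction on i
    def method(j):
        k = i
        while k != j + 1:          # the ≤′-step clause: descend from i
            assert k > 0, "j is not strictly smaller than i"
            k -= 1
        return f(j, guard(f, j))   # the ≤′-refl clause
    return method

def lt_rec(f, n):
    # <-rec P f n = f n (guard f n)
    return f(n, guard(f, n))

def fib_body(i, rec):
    # an example template: Fibonacci, with both calls strictly smaller
    return i if i < 2 else rec(i - 1) + rec(i - 2)
```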

In `Induction.Nat`, `<-rec` is an instance of well-founded recursion defined using the concept of accessibility from `Induction.WellFounded`. They are very interesting modules, about which I hope to understand more.

To be continued...