# Terminating Unfolds (1)

The AoPA(Algebra of Programming in Agda) library allows one to derive programs in a style resembling that in the book Algebra of Programming in Agda. For example, the following is a derivation of insertion sort:

sort-der : ∃ ([ Val ] -> [ Val ]) (\f -> ordered? ○ permute ⊒ fun f )
sort-der = exists (
⊒-begin
ordered? ○ permute
⊒⟨ (\vs -> ·-monotonic ordered? (permute-is-fold vs)) ⟩
ordered? ○ foldR combine nil
⊒⟨ foldR-fusion ordered? ins-step ins-base ⟩
foldR (fun (uncurry insert)) nil
⊒⟨ foldR-to-foldr insert [] ⟩
fun (foldr insert [])
⊒∎)

isort : [ Val ] -> [ Val ]
isort = witness sort-der

The type of sort-der is a proposition that there exists a program of type [ Val ] → [ Val ] that is contained in ordered ? ◦ permute, a relation mapping a list to one of its ordered permutations. The combinator fun embeds a function in a relation. Relations on each line are related by relational inclusion (), while the proof of inclusion are given in angle brackets. In the final step, the relation foldR (fun (uncurry insert)) nil is refined to a function using the rule:

foldR-to-foldr : {A B : Set} -> (f : A -> B -> B) -> (e : B) ->
foldR (fun (uncurry f)) (singleton e) ⊒ fun (foldr f e)

which says that a relational fold is a functional fold if the algebra (f) is a function. When the proof is completed, an algorithm isort is obtained by extracting the witness of the proposition. It is an executable program that is backed by the type system to meet the speciﬁcation.

One may wonder: why insertion sort, rather than something more interesting such as mergesort or quicksort? Because we have not yet figured out how to properly deal with hylomorphism yet. One of the things still missing in AoPA is a treatment of unfolds. AoPA attempts to model the category of sets and relations. In this context one can talk about unfolds because they are simply the converses of folds. We perform the derivations in terms of folds, and refine the converse of a relational fold to a functional unfold provided that the coalgebra is a function:

foldR˘-to-unfoldr : {A B : Set} -> (f : B -> ⊤ ⊎ (A × B)) ->
(foldR ((fun f) ˘ ○ (fun inj₂)) (\b -> isInj₁ (f b))) ˘ ⊒ fun (unfoldr f)

where ˘ denotes relational converse and unfoldr is given by the usual definition (one that a Haskell programmer would expect, modulo syntax):

unfoldr : {A B : Set} -> (B -> ⊤ ⊎ (A × B)) -> B -> [ A ]
unfoldr f b with f b
... | inj₁ _ = []
... | inj₂ (a , b') = a ∷ unfoldr f b'

The only problem is that this unfoldr cannot pass the termination check of Agda.

My first thought was to wait until Agda supports codata and coinduction. However, the reason I wanted an unfold in AoPA in the first place was to compose it with a fold to form a hylomorphism, which is not allowed if the unfold returns codata. I then attempted to model unfolds in a way similar to what we did in relational program derivation. We know that the recursive equation:

$R = S ○$FR ○ T


has a unique solution R = ⦇S⦈ ○ ⦇T⦈˘ if and only if T is well-founded. In program derivation, the proof of well-foundedness is usually given separately. Therefore I hope to come up with a variation of unfoldr which, given a proof of well-foundedness, somehow passes the termination check.

During DTP 2008, I consulted a number of people and was suggested at least three solutions. I am going to summarise them in turn.

### Canonical Examples

With each approach I would like to tackle two tasks. Given the following function dec:

dec : ℕ -> ⊤ ⊎ (ℕ × ℕ)
dec zero = inj₁ tt
dec (suc n) = inj₂ (n , n)

One may attempt to define a function down using unfoldr such that down n produces a list from n to 0. As mentioned above, unfoldr cannot pass the termination check. In fact, even the following definition fails the termination check:

down : ℕ -> [ ℕ ]
down n with dec n
... | inj₁ _ = []
... | inj₂ (a , n') with down n'
...    | x = a ∷ x

The first task is to define down using some kind of unfold.

In the second tasks, given a non-empty list, we would like to distribute its elements to a roughly balanced binary tree. The list and the tree are define by:

data List⁺ (a : Set) : Set where
[_]⁺ : a -> List⁺ a
_∷⁺_ : a -> List⁺ a -> List⁺ a
data Tree⁺ (a : Set) : Set where
Tip⁺ : a -> Tree⁺ a
Bin⁺ : Tree⁺ a -> Tree⁺ a -> Tree⁺ a

If we apply, for example, foldT merge wrap, we get merge sort (I use non-empty lists to save the constructor for empty trees). In Haskell, one might first define a function that splits a list to roughly two halfs, something equivalent to this:

split⁺ : {a : Set} -> List⁺ a -> a ⊎ (List⁺ a × List⁺ a)
split⁺ [ x ]⁺ = inj₁ x
split⁺ (x ∷⁺ xs) with split⁺ xs
... | inj₁ y = inj₂ ([ y ]⁺ , [ x ]⁺)
... | inj₂ (ys , zs) = inj₂ (zs , x ∷⁺ ys)

and use an unfold on Tree⁺. The second task is to construct a function expand : {a : Set} -> List⁺ a -> Tree⁺ a that builds a roughly balanced tree, using some kind of unfold that makes use of split⁺, or at least some variation of it.

### Josh Ko’s Approach

Josh and I tried to extend unfolds with extra arguments representing a bound and proofs that the bound decreases with each call to f:

unfoldr↓ : {a b : Set} {_≾_ : b -> ℕ -> Set} ->
(f : b -> ⊤ ⊎ (a × b)) -> (y : b) -> (t : ℕ) ->
y ≾ t -> (forall {y} -> y ≾ zero -> f y ≡ inj₁ tt) ->
(forall {y} t' {x y'} ->
y ≾ suc t' -> f y ≡ inj₂ (x , y') -> y' ≾ t')
-> [ a ]

Apart from the generating function f and the seed b, the function unfoldr↓ takes a bound t, a natural number. The seed and the bound is related by , to be defined for each specific problem. Before producing the list, unfoldr↓ demands that the current seed is bounded by the bound (y ≾ t), a proof that f must yield inj₁ when the bound is zero, and a proof that if f returns inj₂ (x , y'), the new seed y' is bounded by a strictly smaller bound.

To define unfoldr↓ we need the inspect idiom (see the explanation in Relation.Binary.PropositionalEquality):

unfoldr↓ f y t y≾t p1 p2 with inspect (f y)
... | (inj₁ tt) with-≡ inj₁≡fy = []
... | (inj₂ (x , y')) with-≡ inj₂≡fy with t
...  | zero = contradiction (p1 y≾t)
(\fy≡inj₁ -> inj₁≢inj₂ (≡-sym (≡-trans inj₂≡fy fy≡inj₁)))
...  | suc t' = x ∷ unfoldr↓ f y' t'
(p2 t' y≾t (≡-sym inj₂≡fy)) p1 p2

Notice that unfoldr↓ passes the termination check because it pattern matches on t.

#### Descending List

For the first task, the seed is a natural number, and we may simply define to be :

_≾_ : ℕ -> ℕ -> Set
n ≾ c = n ≤ c

Then down↓ can be defined in terms of unfoldr↓ and dec by:

stop : forall {n} -> n ≾ zero -> dec n ≡ inj₁ tt
stop {.zero} z≤n = ≡-refl

proceed : forall {n} c' {x n'} -> n ≾ suc c' -> dec n ≡ inj₂ (x , n') -> n' ≾ c'
proceed {.zero} c' {x} {n'} z≤n dec-zero≡inj₂ = contradiction dec-zero≡inj₂ inj₁≢inj₂
proceed {suc m} c' {.m} {.m} (s≤s m≤c') ≡-refl = m≤c'

down↓ : ℕ -> [ ℕ ]
down↓ n = unfoldr↓ dec n n ≤-refl stop proceed

where ≤-refl : forall x -> x ≤ x.

#### Expanding a Binary Tree

The unfold above can be generalised to trees in the obvious way:

unfoldT↓ : {a b : Set} {_≼_ : b -> ℕ -> Set} ->
(f : b -> a ⊎ b × b) -> (y : b) -> (n : ℕ) ->
y ≼ n -> (forall y -> y ≼ zero -> ∃ a (\x -> f y ≡ inj₁ x)) ->
(forall y i y₁ y₂ ->
y ≼ suc i -> f y ≡ inj₂ (y₁ , y₂) -> (y₁ ≼ i × y₂ ≼ i))
-> Tree⁺ a
unfoldT↓ f y n y≼n p₁ p₂ with inspect (f y)
... | inj₁ x with-≡ _ = Tip⁺ x
... | inj₂ (y₁ , y₂) with-≡ inj₂≡fy with n
...  | zero = contradiction (proof (p₁ y y≼n))
(\fy≡inj₁ -> inj₁≢inj₂ (≡-sym (≡-trans inj₂≡fy fy≡inj₁)))
...  | suc m with p₂ y m y₁ y₂ y≼n (≡-sym inj₂≡fy)
...    | (y₁≼m , y₂≼m) = Bin⁺ (unfoldT↓ f y₁ m y₁≼m p₁ p₂)
(unfoldT↓ f y₂ m y₂≼m p₁ p₂)

One would wonder whether the condition (y₁ ≼ i × y₂ ≼ i) is too restrictive. When the proposition has to be proved inductively, we may need a stronger inductive hypothesis, for example, that both y₁ and y₂ are bounded by half of suc i. The current definition luckily works for our second task. We may need some generalisation later.

Repeated calls to split⁺ eventually terminates because the lengths of the lists are strictly decreasing. Therefore we define:

_#≤_ : {A : Set} -> List⁺ A -> ℕ -> Set
xs #≤ n = pred (length⁺ xs) ≤ n

With properly defined lemma1 and lemma2, we can expand the tree by unfoldT↓:

expand↓ : {a : Set} -> List⁺ a -> Tree⁺ a
expand↓ {a} xs = unfoldT↓ {a} {List⁺ a} {_#≤_} split⁺ xs
(pred (length⁺ xs)) ≤-refl lemma1 lemma2

where the two lemmas respectively have types:

lemma1 : {a : Set}(xs : List⁺ a) ->
xs #≤ zero -> ∃ a (\x -> split⁺ xs ≡ inj₁ x)
lemma2 : {a : Set} (xs : List⁺ a) (i : ℕ)
(ys : List⁺ a) (zs : List⁺ a) ->
xs #≤ suc i -> split⁺ xs ≡ inj₂ (ys , zs) -> ys #≤ i × zs #≤ i

Definition of lemma1 is a simple application of contradiction. On the other hand, lemma2 is very tedious to define. It appears to me that the definition inevitably becomes cumbersome because split⁺ and lemma2 are defined separately. It could be much easier if split⁺ is defined to return the computed results as well as some proofs about them. For my application in AoPA, however, I would prefer to be able to reuse the old split⁺. I don’t know a good solution yet.

Nils Anders Danielsson suggested that it would be nicer to integrate the bound into the type of the seed, while Conor McBride suggested yet another approach. I will try to summarise them next time.

To be continued…

# AoPA — Algebra of Programming in Agda

2011.06.01 Part of the library is updated to use universe polymorphism, and it now type checks with Agda 2.2.11. This is a temporary update yet to be finished. The unfinished parts are commented out in Everything.agda.

An Agda library accompanying the paper Algebra of Programming in Agda: Dependent Types for Relational Program Derivation, developed in co-operation with Hsiang-Shang Ko and Patrik Jansson.

Dependent type theory is rich enough to express that a program satisfies an input/output relational specification, but it could be hard to construct the proof term. On the other hand, squiggolists know very well how to show that one relation is included in another by algebraic reasoning. The AoPA library allows one to encode Algebra of Programming style program derivation, both functional and relational, in Agda.

### Example

The following is a derivation of insertion sort in progress:

isort-der : ∃ (\f → ordered? ○ permute ⊒ fun f )
isort-der = (_ , (
⊒-begin
ordered? ○ permute
⊒⟨ (\vs -> ·-monotonic ordered? (permute-is-fold vs)) ⟩
ordered? ○ foldR combine nil
⊒⟨ foldR-fusion ordered? ins-step ins-base ⟩
foldR (fun (uncurry insert)) nil
⊒⟨ { foldR-to-foldr insert []}0
{ fun (foldr insert [])
⊒∎ }1))

isort : [ Val ] -> [ Val ]
isort = proj₁ isort-der

The type of isort-der is a proposition that there exists a function f that is contained in ordered ? ◦ permute , a relation mapping a list to one of its ordered permutations. The proof proceeds by derivation from the speciﬁcation towards the algorithm. The ﬁrst step exploits monotonicity of and that permute can be expressed as a fold. The second step makes use of relational fold fusion. The shaded areas denote interaction points — fragments of (proof ) code to be completed. The programmer can query Agda for the expected type and the context of the shaded expression. When the proof is completed, an algorithm isort is obtained by extracting the witness of the proposition. It is an executable program that is backed by the type system to meet the speciﬁcation.

The complete program is in the Example directory of the code.

### The Code

The code consists of the following files and folders:

• AlgebraicReasoning: a number of modules supporting algebraic reasoning. At present we implement our own because the PreorderReasoning module in earlier versions of the Standard Library was not expressive enough for our need. We may adapt to the new Standard Library later.
• Data: defining relational fold, unfold, hylomorphism (using well-founded recursion), the greedy theorem, and the converse-of-a-function theorem, etc, for list and binary tree.
• Examples: currently we have prepared four examples: a functional derivation of the maximum segment sum problem, a relational derivation of insertion sort and quicksort (following the paper Functional Algorithm Design by Richard Bird), and solving an optimisation problem using the greedy theorem.
• Relations: modules defining various properties of relations.
• Sets: a simple encoding of sets, upon with Relations are built.

To grab the latest code, install darcs and check our the code from the repository:

darcs get http://pc-scm.iis.sinica.edu.tw/repos/AoPA

AoPA makes use of the Standard Library, to install which you will need darcs.

# Maximum Segment Sum, Agda Approved

To practise using the Logic.ChainReasoning module in Agda, Josh proved the foldr-fusion theorem, which he learnt in the program derivation lecture in FLOLAC where we used the maximum segment sum (MSS) as the main example. Seeing his proof, I was curious to know how much program derivation I can do in Agda and tried coding the entire derivation of MSS. I thought it would be a small exercise I could do over the weekend, but ended up spending the entire week.

As a reminder, given a list of (possibly negative) numbers, the MSS is about computing the maximum sum among all its consecutive segments. Typically, the specification is:

mss = max ○ map⁺ sum ○ segs

where segs = concat⁺ ○ map⁺ inits ○ tails computes all segments of a list.

A dependent pair is defined by:

data _∣_ (A : Set) (P : A -> Set) : Set where
sub : (x : A) -> P x -> A ∣ P

such that sub x p is a pair where the type of the second component p depends on the value of the first component x. The idea is to use a dependent pair to represent a derivation:

mss-der : (x : List Val) -> (Val ∣ \m -> (max ○ map⁺ sum ○ segs) x ≡ m)
mss-der x =
sub ?
(chain> (max ○ map⁺ sum ○ segs) x
=== ?)

It says that mss-der is a function taking a list x and returns a value of type Val, with the constraint that the value returned must be equal to (max ○ map⁺ sum ○ segs) x.

My wish was to use the interactive mechanism of the Agda Emacs mode to “derive” the parts marked by ?, until we come to an implementation:

mss-der : (x : List Val) -> (Val ∣ \m -> (max ○ map⁺ sum ○ segs) x ≡ m)
mss-der x =
sub RESULT
(chain> (max ○ map⁺ sum ○ segs) x
=== ...
=== ...
=== RESULT)

If it works well, we can probably use Agda as a tool for program derivation!

Currently, however, I find it harder to use than expected, perhaps due to my being unfamiliar with the way Agda reports type errors. Nevertheless, Agda does force me to make every details right. For example, the usual definition of max I would use in a paper would be:

max = foldr _↑_ -∞

But then I would have to define numbers with lower bound -∞. A sloppy alternative definition:

max = foldr _↑_ 0

led me to prove a base case 0 ↑ max x ≡ max x, which is not true. That the definition does work in practice relies on the fact that segs always returns empty list as one of the possible segment. Alternatively, I could define max on non-empty lists only:

max : List⁺ Val -> Val
max = foldr⁺ _↑_ id

where List⁺ A is defined by:

data List⁺ (A : Set) : Set where
[_]⁺ : A -> List⁺ A
_::⁺_ : A -> List⁺ A -> List⁺ A

and refine the types of inits, tails, etc, to return non-empty lists. This is the approach I took.

Eventually, I was able to give a derivation of mss this way:

mss-der : (x : List Val) -> (Val ∣ \m -> (max ○ map⁺ sum ○ segs) x ≡ m)
mss-der x =
sub ((max ○ scanr _⊗_ ø) x)
(chain> (max ○ map⁺ sum ○ segs) x
=== (max ○ map⁺ sum ○ concat⁺ ○ map⁺ inits ○ tails) x
by refl
=== (max ○ concat⁺ ○ map⁺ (map⁺ sum) ○ map⁺ inits ○ tails) x
by cong max (sym (concat⁺-map⁺ ((map⁺ inits ○ tails) x)))
=== (max ○ map⁺ max ○ map⁺ (map⁺ sum) ○ map⁺ inits ○ tails) x
by max-concat⁺ ((map⁺ (map⁺ sum) ○ map⁺ inits ○ tails) x)
=== (max ○ map⁺ max ○ map⁺ (map⁺ sum ○ inits) ○ tails) x
by cong (\xs -> max (map⁺ max xs)) (sym (map⁺-compose (tails x)))
=== (max ○ map⁺ (max ○ map⁺ sum ○ inits) ○ tails) x
by cong max (sym (map⁺-compose (tails x)))
=== (max ○ map⁺ (foldr _⊗_ ø) ○ tails) x
by cong max (map⁺-eq max-sum-prefix (tails x))
=== (max ○ scanr _⊗_ ø) x
by cong max (scanr-pf x) )
where _⊕_ : Val -> List⁺ Val -> List⁺ Val
a ⊕ y = ø ::⁺ map⁺ (_+_ a) y

_⊗_ : Val -> Val -> Val
a ⊗ b = ø ↑ (a + b)

where max-sum-prefix consists of two fold fusion:

max-sum-prefix : (x : List Val) -> max (map⁺ sum (inits x)) ≡ foldr _⊗_ ø
max-sum-prefix x =
chain> max (map⁺ sum (inits x))
=== max (foldr _⊕_ [ ø ]⁺ x)
by cong max (foldr-fusion (map⁺ sum) lemma1 x)
=== foldr _⊗_ ø x
by foldr-fusion max lemma2 x
where lemma1 : (a : Val) -> (xs : List⁺ (List Val)) ->
map⁺ sum (ini a xs) ≡ (a ⊕ (map⁺ sum xs))
lemma1 a xs = ?
lemma2 : (a : Val) -> (x : List⁺ Val) ->
max (a ⊕ x) ≡ (a ⊗ max x)
lemma2 a x = ?

The two lemmas are given in the files attached below. Having the derivation, mss is given by:

mss : List Val -> Val
mss x = depfst (mss-der x)

it is an executable program that is proved to be correct.

The complete Agda program is split into five files:

• MSS.agda: the main program importing all the sub-modules.
• ListProperties.agda: some properties I need about lists, such as fold fusion, concat ○ map (map f) = map f ○ concat, etc. Later in the development I realised that I should switch to non-empty lists, so not all of the properties here are used.
• List+.agda: non-empty lists and some of its properties.
• Derivation.agda: the main derivation of MSS. The derivation is parameterised over any numerical data and operators + and such that + is associative, and a + (b ↑ c) = (a ↑ b) + (a ↑ c). The reason of this parameterisation, however, was that I did not know how to prove the properties above, until Josh showed me the proof.
• IntRNG.agda: proofs from Josh that Data.Int actually satisfy the properties above. (Not quite complete yet.)

# Agda Exercise: Proving that Mergesort Returns Ordered Lists

Continuing with my Adga programming practice. This time I am implementing the example in another section of Why Dependent Types Matter by Altenkirch, McBride, and McKinna in Agda: proving that mergesort returns an ordered list.

> module Main where

> open import Prelude
> open import Data.Nat hiding (_≤_)
> open import Data.List

### Order

To begin with, a term of type m ≤ n is a proof that a natural number m is at most n:

> data _≤_ : Nat -> Nat -> Set where
>  le0 : {n : Nat} -> zero ≤ n
>  le1 : {m n : Nat} -> m ≤ n -> (1 + m) ≤ (1 + n)

An auxiliary function weaken, which I will need later, weakens a proof of (1+m) ≤ n to a proof of m ≤ n:

> weaken : {y x : Nat} -> (1 + y) ≤ x -> y ≤ x
> weaken (le1 le0) = le0
> weaken (le1 (le1 p)) = le1 (weaken (le1 p))

Given two natural numbers m and n, the function order, to be defined later, determines their order by returning a proof. The proof has to be wrapped in a datatype Order:

> data Order : Nat -> Nat -> Set where
>   leq : {m n : Nat} -> m ≤ n -> Order m n
>   gt : {m n : Nat} -> (1 + n) ≤ m -> Order m n

Agda does not provide a generic case expression, so I have to define my own for Order:

> caseord : {m n : Nat} {a : Set} -> Order m n
>             -> ((m ≤ n) -> a) -> (((1 + n) ≤ m) -> a) -> a
> caseord (leq p) f g = f p
> caseord (gt p) f g = g p

The function order can thus be defined by:

> order : (m : Nat) -> (n : Nat) -> Order m n
> order zero n = leq le0
> order (suc m) zero = gt (le1 le0)
> order (suc m) (suc n) =
>   caseord (order m n) (leq ∘ le1) (gt ∘ le1)

### Ordered List

There are many ways one can express the property that a list is sorted. The choice depends on what one needs for the proof. To prove that merge sorts the list, Altenkirch, McBride, and McKinna suggests the following “external bound” representation of sortedness:

> data OList : Nat -> Set where
>   Nil : {b : Nat} -> OList b
>   Cons : {b : Nat} -> (x : Nat) -> (b ≤ x) -> OList x -> OList b

The datatype OList b represents a sorted list of natural numbers whose elements are all greater than or equal to the bound b. The bound, however, is supplied externally and need not be precise. Every sorted list of natural numbers has type OList 0, for instance. The Cons constructor demands that the newly added element x is bounded by the externally supplied bound b, and that the tail can be bounded by x.

### Merging Sorted Lists

With OList, merge can be verified very easily:

> merge : {b : Nat} -> OList b -> OList b -> OList b
> merge Nil ys = ys
> merge xs Nil = xs
> merge (Cons x blex xs) (Cons y bley ys) =
>   caseord (order x y)
>     (\\ xley -> Cons x blex (merge {x} xs (Cons y xley ys)))
>     (\\ pylex -> Cons y bley (merge {y} (Cons x (weaken pylex) xs) ys))

Note, for example, that Cons y bley ys : OList b is updated to Cons y xley ys : OList x simply by using a different proof.

### Mergesort

The rest are just some routine code: dealing elements to a tree, and perform successive merging by a fold on the tree. Mergesort is defined by mergeT ∘ dealT. The code below is similar to that in my previous blog entry, but without the size constraint.

> data Parity : Set where
>   Even : Parity
>   Odd : Parity

> data Tree (a : Set) : Set where
>   Nul : Tree a
>   Tip : a -> Tree a
>   Bin : Parity -> Tree a -> Tree a -> Tree a

> foldT : {a b : Set}
>           -> (Parity -> b -> b -> b)
>           -> (a -> b)
>           -> b
>           -> Tree a -> b
> foldT f g e Nul = e
> foldT f g e (Tip x) = g x
> foldT f g e (Bin p t u) = f p (foldT f g e t) (foldT f g e u)

> insert : {a : Set} -> a -> Tree a -> Tree a
> insert x Nul = Tip x
> insert x (Tip y) = Bin Even (Tip x) (Tip y)
> insert x (Bin Even t u) =
>    Bin Odd (insert x t) u
> insert x (Bin Odd t u) =
>    Bin Even t (insert x u)

> dealT : List Nat -> Tree Nat
> dealT = foldr insert Nul

> mergeT : Tree Nat -> OList 0
> mergeT = foldT (\p xs ys -> merge xs ys)
>                (\x -> Cons x le0 Nil)
>                Nil

> msort : List Nat -> OList 0
> msort = mergeT ∘ dealT

# Agda Exercise: Sized Mergesort

Continuing with my Adga programming practice. Part of the mergesort example became easy once Max showed me how I should perform the pattern matching in insert.

> open import Prelude
> open import Logic.Identity
> open import Data.Bool
> open import Data.Nat
> open import Data.Nat.Properties

> module Main where

### Length-Indexed Lists

Firstly, define length-indexed lists and its fold function.

> data List (a : Set) : Nat -> Set where
>   [] : List a zero
>   _::_ : {n : Nat} -> a -> List a n -> List a (suc n)

> foldr : {a : Set} -> {n : Nat} -> (b : {m : Nat} -> List a m -> Set)
>           -> ({m : Nat} -> {xs : List a m}
>                   -> (x : a) -> b xs -> b (x :: xs))
>           -> b []
>           -> (xs : List a n) -> b xs
> foldr b f e [] = e
> foldr b f e (x :: xs) = f x (foldr b f e xs)

Apart from the extra parameter b, the body of foldr is not too different from the ordinary fold on lists. For genericity, however, its type is more complicated. I wonder whether it is still appropriate to call it a “fold”, now that xs appears as a parameter. Perhaps that is why people talk about “eliminators” rather than folds or catamorphisms.

### Size-Indexed Balanced Trees

This is one of the possible ways to define sized-indexed balanced trees. The parity bit keep notes of whether the number of the elements is even or odd.

> data Parity : Set where
>   Even : Parity
>   Odd : Parity

> parity : Parity -> Nat
> parity Even = 0
> parity Odd = 1

The Bin constructor enforces the constraint that if the size is even, the two subtrees must have the same size. Otherwise the left subtree contains exactly one more element than the right.

> data Tree (a : Set) : Nat -> Set where
>   Nul : Tree a 0
>   Tip : a -> Tree a 1
>   Bin : {n : Nat} -> (p : Parity) ->
>           Tree a (parity p + n) -> Tree a n -> Tree a (parity p + (n + n))

Fold on trees is defined in a way similar to fold on lists.

> foldT : {a : Set} -> {n : Nat} -> (b : {m : Nat} -> Tree a m -> Set)
>           -> ({m : Nat} -> (p : Parity)
>                 -> {t : Tree a (parity p + m)} -> {u : Tree a m}
>                 -> b t -> b u -> b (Bin p t u))
>           -> ((x : a) -> b (Tip x))
>           -> b Nul
>           -> (xs : Tree a n) -> b xs
> foldT b f g e Nul = e
> foldT b f g e (Tip x) = g x
> foldT b f g e (Bin p t u) = f p (foldT b f g e t) (foldT b f g e u)

### Mergesort

The function insert adds one element to a balanced tree. I did not know how to get the case for Bin Odd t u type-check until Max pointed out that I could just pattern match the implicit argument {n} of Bin.

> insert : {a : Set} -> {n : Nat} ->
>            a -> Tree a n -> Tree a (suc n)
> insert x Nul = Tip x
> insert x (Tip y) = Bin Even (Tip x) (Tip y)
> insert x (Bin Even t u) =
>    Bin Odd (insert x t) u
> insert {a} x (Bin {n} Odd t u) =
>    subst (\\ i -> Tree a (suc i)) (sym (+suc n n))
>      (Bin Even t (insert x u))

The function dealT, dealing elements of the input list to a balanced tree, can thus be defined as a fold:

> dealT : {a : Set} -> {n : Nat} -> List a n -> Tree a n
> dealT {a} = foldr (\{m} xs -> Tree a m) insert Nul

The function merge has been discussed before:

> merge : {m n : Nat} ->
>           List Nat m -> List Nat n -> List Nat (m + n)
> merge [] ys = ys
> merge {suc m} {zero} (x :: xs) Nil =
>   subst (List Nat) (+zero (suc m)) (x :: xs)
> merge {suc m} {suc n} (x :: xs) (y :: ys) =
>    if x < y then
>        x :: merge xs (y :: ys)
>      else (y ::
>            subst (List Nat) (+suc m n) (merge (x :: xs) ys))

The merging phase of mergesort can be seen as a fold on a binary tree:

> mergeT : {n : Nat} -> Tree Nat n -> List Nat n
> mergeT  = foldT (\\ {m} t -> List Nat m)
>                 (\\ {m} p xs ys ->
>                   subst (List Nat) (+assoc (parity p) m m) (merge xs ys))
>                 (\\ x -> x :: [])
>                 []

I needed +assoc to type-cast (parity p + m) + m into parity p + (m + m).

Finally, here comes mergesort.

> msort : {n : Nat} -> List Nat n -> List Nat n
> msort = mergeT  ∘ dealT

Of course, when it comes to sorting, one would expect that we prove that it actually sorts the list than merely that it preserves the length. It will perhaps be left to the next exercise.

# My First Agda Program: Append, Reverse, and Merge

I have been looking for a programming language to practice dependent type with. During the WG 2.1 meeting, my attention was brought to Agda (in fact, Adga 2). The word Agda may refer to both the proof assistant and its input language. It has been under long development, is relatively matured, has good library support. It supports real dependent type, and, pardon me for my personal preference, it is Haskell-like. So far it seems like a perfect choice.

So, here is my first Agda program. I am merely repeating some simple exercises on lists, which I tried using Haskell in a previous entry Developing Programs and Proofs Spontaneously using GADT.

> open import Logic.Identity
> open import Data.Bool
> open import Data.Nat
> open import Data.Nat.Properties
> module Main where

### List Concatenation

Adga has natural numbers defined in Data.Nat. With Nat, one can define lists indexed by their lengths as:

> data List (a : Set) : Nat -> Set where
>  [] : List a zero
>  _::_ : {n : Nat} -> a -> List a n -> List a (suc n)

The definition of append naturally follows:

> append : {a : Set} -> {m , n : Nat} ->
>             List a m -> List a n -> List a (m + n)
> append [] ys = ys
> append (x :: xs) ys = x :: append xs ys

### List Reversal

Unlike in Haskell, identifiers in Adga have to be defined earlier in the file before they can be referred to. In the usual definition of linear-time list reversal using an accumulating parameter, the recursive call yields a list of length m + suc n. I have got to somehow provide a proof that it equals suc m + n.

How does one provide such a proof? In Omega, one would make a list of all lemmas needed in a where clause and the compiler would mysteriously figure out a way to use them. (Well, I am sure it is well documented. I just have not understood its algorithm.) I would like to have more control over what is happening, and therefore would prefer to present the proof as a type cast from f (m + suc n) to f (suc m + n) for arbitrary type functor f. This is what one would do in Epigram and Adga.

psuc : {m : Nat} -> {n : Nat} -> {f : Nat -> Set} ->
f (m + suc n) -> f (suc m + n)
psuc {zero} {n} {f} xs = xs
psuc {suc m} {n} {f} xs = psuc {m} {n} {f'} xs
where f' : Nat -> Set
f' n = f (suc n)

At this point I did not yet know that I could use the lemmas defined in Data.Nat.Properties instead of rolling my own. The function revcat can then be defined as below:

revcat : {a : Set} -> {m n : Nat} ->
List a m -> List a n -> List a (m + n)
revcat {a} {zero} {n} [] ys = ys
revcat {a} {suc m} {n} (x :: xs) ys =
psuc {m} {n} {List a} (revcat xs (x :: ys))

One thing I have yet to figure out is how much of the type information can be inferred. The first clause can be simplified to revcat [] ys = ys. If I omit the type information in the right-hand side of the second clause:

revcat {a} {suc m} {n} (x :: xs) ys =
psuc (revcat xs (x :: ys))

I would expect Agda to tell me that it cannot infer the types. However, the program still appears to typecheck. When I try to execute revcat (1 :: (3 :: [])) (2 :: (4 :: [])) in the interpreter, I get _86 1 1 (3 :: []) (2 :: (4 :: [])), which seems to be an unevaluated expression. Does it not violate the slogan that “well-typed programs don’t go wrong?”

Using the predefined lemma in Data.Nat.Properties:

+suc : (n m : Nat) -> n + suc m ≡ suc (n + m)
+suc  zero   m = refl
+suc (suc n) m = cong suc (+suc n m)

The type cast in revcat can be performed using subst:

>  revcat : {a : Set} -> {m n : Nat} ->
>              List a m -> List a n -> List a (m + n)
>  revcat {a} {zero} {n} [] ys = ys
>  revcat {a} {suc m} {n} (x :: xs) ys =
>     subst (List a) (sym (+suc m n)) (revcat xs (x :: ys))

### Merging

Once I got revcat working, merge is not too difficult.

> merge : {m n : Nat} ->
>           List Nat m -> List Nat n -> List Nat (m + n)
> merge [] ys = ys
> merge {suc m} {zero} (x :: xs) Nil =
>   subst (List Nat) (+zero (suc m)) (x :: xs)
> merge {suc m} {suc n} (x :: xs) (y :: ys) =
>    if x < y then
>        x :: merge xs (y :: ys)
>      else (y ::
>            subst (List Nat) (+suc m n) (merge (x :: xs) ys))

Curiously, the parentheses around the else branch can not be omitted. Without them the expression would be parsed as (if ... then .. else y) :: subst.... The precedence for _::_, which is not declared, is assumed to be higher than if_then_else_.

### Balanced Trees

I tried to go on to define balanced trees, like in Altenkirch, McBride, and McKinna’s introduction to Epigram:

data Parity : Set where
Even : Parity
Odd : Parity

parity : Parity -> Nat
parity Even = 0
parity Odd = 1

data Tree (a : Set) : Nat -> Set where
Nul : Tree a zero
Tip : a -> Tree a (suc zero)
Bin : {m n : Nat} -> (p : Parity) ->
Tree a (parity p + n) -> Tree a n -> Tree a (parity p + (n + n))

The Bin constructor needs a parity bit. If the parity bit is Even, the two subtrees have the same size. Otherwise they differ by one.

However, I have not figured out how to pattern match the size parameter: when defining a function f : {a :Set} -> {n : Nat} -> Tree a n-> ..., Agda complained when I tried this pattern f {a} {n} (Bin Odd t u) = ... because parity p + (n' + n') != n. Perhaps I should start reading some more documents.