Terminating Unfolds (1)

The AoPA(Algebra of Programming in Agda) library allows one to derive programs in a style resembling that in the book Algebra of Programming in Agda. For example, the following is a derivation of insertion sort:

sort-der : ∃ ([ Val ] -> [ Val ]) (\f -> ordered? ○ permute ⊒ fun f )
sort-der = exists (
      ordered? ○ permute
  ⊒⟨ (\vs -> ·-monotonic ordered? (permute-is-fold vs)) ⟩
      ordered? ○ foldR combine nil
  ⊒⟨ foldR-fusion ordered? ins-step ins-base ⟩
      foldR (fun (uncurry insert)) nil
  ⊒⟨ foldR-to-foldr insert [] ⟩
      fun (foldr insert [])

isort : [ Val ] -> [ Val ]
isort = witness sort-der

The type of sort-der is a proposition that there exists a program of type [ Val ] → [ Val ] that is contained in ordered ? ◦ permute, a relation mapping a list to one of its ordered permutations. The combinator fun embeds a function in a relation. Relations on each line are related by relational inclusion (), while the proof of inclusion are given in angle brackets. In the final step, the relation foldR (fun (uncurry insert)) nil is refined to a function using the rule:

foldR-to-foldr : {A B : Set} -> (f : A -> B -> B) -> (e : B) ->
    foldR (fun (uncurry f)) (singleton e) ⊒ fun (foldr f e)

which says that a relational fold is a functional fold if the algebra (f) is a function. When the proof is completed, an algorithm isort is obtained by extracting the witness of the proposition. It is an executable program that is backed by the type system to meet the specification.

One may wonder: why insertion sort, rather than something more interesting such as mergesort or quicksort? Because we have not yet figured out how to properly deal with hylomorphism yet. One of the things still missing in AoPA is a treatment of unfolds. AoPA attempts to model the category of sets and relations. In this context one can talk about unfolds because they are simply the converses of folds. We perform the derivations in terms of folds, and refine the converse of a relational fold to a functional unfold provided that the coalgebra is a function:

foldR˘-to-unfoldr : {A B : Set} -> (f : B -> ⊤ ⊎ (A × B)) ->
    (foldR ((fun f) ˘ ○ (fun inj₂)) (\b -> isInj₁ (f b))) ˘ ⊒ fun (unfoldr f)

where ˘ denotes relational converse and unfoldr is given by the usual definition (one that a Haskell programmer would expect, modulo syntax):

unfoldr : {A B : Set} -> (B -> ⊤ ⊎ (A × B)) -> B -> [ A ]
unfoldr f b with f b
... | inj₁ _ = []
... | inj₂ (a , b') = a ∷ unfoldr f b'

The only problem is that this unfoldr cannot pass the termination check of Agda.

My first thought was to wait until Agda supports codata and coinduction. However, the reason I wanted an unfold in AoPA in the first place was to compose it with a fold to form a hylomorphism, which is not allowed if the unfold returns codata. I then attempted to model unfolds in a way similar to what we did in relational program derivation. We know that the recursive equation:

R = S ○ FR ○ T

has a unique solution R = ⦇S⦈ ○ ⦇T⦈˘ if and only if T is well-founded. In program derivation, the proof of well-foundedness is usually given separately. Therefore I hope to come up with a variation of unfoldr which, given a proof of well-foundedness, somehow passes the termination check.

During DTP 2008, I consulted a number of people and was suggested at least three solutions. I am going to summarise them in turn.

Canonical Examples

With each approach I would like to tackle two tasks. Given the following function dec:

dec : ℕ -> ⊤ ⊎ (ℕ × ℕ)
dec zero = inj₁ tt
dec (suc n) = inj₂ (n , n)

One may attempt to define a function down using unfoldr such that down n produces a list from n to 0. As mentioned above, unfoldr cannot pass the termination check. In fact, even the following definition fails the termination check:

down : ℕ -> [ ℕ ]
down n with dec n
... | inj₁ _ = []
... | inj₂ (a , n') with down n'
...    | x = a ∷ x

The first task is to define down using some kind of unfold.

In the second tasks, given a non-empty list, we would like to distribute its elements to a roughly balanced binary tree. The list and the tree are define by:

data List⁺ (a : Set) : Set where
    [_]⁺ : a -> List⁺ a
    _∷⁺_ : a -> List⁺ a -> List⁺ a
data Tree⁺ (a : Set) : Set where
    Tip⁺ : a -> Tree⁺ a
    Bin⁺ : Tree⁺ a -> Tree⁺ a -> Tree⁺ a

If we apply, for example, foldT merge wrap, we get merge sort (I use non-empty lists to save the constructor for empty trees). In Haskell, one might first define a function that splits a list to roughly two halfs, something equivalent to this:

split⁺ : {a : Set} -> List⁺ a -> a ⊎ (List⁺ a × List⁺ a)
split⁺ [ x ]⁺ = inj₁ x
split⁺ (x ∷⁺ xs) with split⁺ xs
... | inj₁ y = inj₂ ([ y ]⁺ , [ x ]⁺)
... | inj₂ (ys , zs) = inj₂ (zs , x ∷⁺ ys)

and use an unfold on Tree⁺. The second task is to construct a function expand : {a : Set} -> List⁺ a -> Tree⁺ a that builds a roughly balanced tree, using some kind of unfold that makes use of split⁺, or at least some variation of it.

Josh Ko’s Approach

Josh and I tried to extend unfolds with extra arguments representing a bound and proofs that the bound decreases with each call to f:

unfoldr↓ : {a b : Set} {_≾_ : b -> ℕ -> Set} ->
  (f : b -> ⊤ ⊎ (a × b)) -> (y : b) -> (t : ℕ) ->
  y ≾ t -> (forall {y} -> y ≾ zero -> f y ≡ inj₁ tt) ->
    (forall {y} t' {x y'} ->
         y ≾ suc t' -> f y ≡ inj₂ (x , y') -> y' ≾ t')
      -> [ a ]

Apart from the generating function f and the seed b, the function unfoldr↓ takes a bound t, a natural number. The seed and the bound is related by , to be defined for each specific problem. Before producing the list, unfoldr↓ demands that the current seed is bounded by the bound (y ≾ t), a proof that f must yield inj₁ when the bound is zero, and a proof that if f returns inj₂ (x , y'), the new seed y' is bounded by a strictly smaller bound.

To define unfoldr↓ we need the inspect idiom (see the explanation in Relation.Binary.PropositionalEquality):

unfoldr↓ f y t y≾t p1 p2 with inspect (f y)
... | (inj₁ tt) with-≡ inj₁≡fy = []
... | (inj₂ (x , y')) with-≡ inj₂≡fy with t
...  | zero = contradiction (p1 y≾t)
        (\fy≡inj₁ -> inj₁≢inj₂ (≡-sym (≡-trans inj₂≡fy fy≡inj₁)))
...  | suc t' = x ∷ unfoldr↓ f y' t'
            (p2 t' y≾t (≡-sym inj₂≡fy)) p1 p2

Notice that unfoldr↓ passes the termination check because it pattern matches on t.

Descending List

For the first task, the seed is a natural number, and we may simply define to be :

_≾_ : ℕ -> ℕ -> Set
n ≾ c = n ≤ c

Then down↓ can be defined in terms of unfoldr↓ and dec by:

stop : forall {n} -> n ≾ zero -> dec n ≡ inj₁ tt
stop {.zero} z≤n = ≡-refl

proceed : forall {n} c' {x n'} -> n ≾ suc c' -> dec n ≡ inj₂ (x , n') -> n' ≾ c'
proceed {.zero} c' {x} {n'} z≤n dec-zero≡inj₂ = contradiction dec-zero≡inj₂ inj₁≢inj₂
proceed {suc m} c' {.m} {.m} (s≤s m≤c') ≡-refl = m≤c'

down↓ : ℕ -> [ ℕ ]
down↓ n = unfoldr↓ dec n n ≤-refl stop proceed

where ≤-refl : forall x -> x ≤ x.

Expanding a Binary Tree

The unfold above can be generalised to trees in the obvious way:

unfoldT↓ : {a b : Set} {_≼_ : b -> ℕ -> Set} ->
  (f : b -> a ⊎ b × b) -> (y : b) -> (n : ℕ) ->
   y ≼ n -> (forall y -> y ≼ zero -> ∃ a (\x -> f y ≡ inj₁ x)) ->
    (forall y i y₁ y₂ ->
        y ≼ suc i -> f y ≡ inj₂ (y₁ , y₂) -> (y₁ ≼ i × y₂ ≼ i))
    -> Tree⁺ a
unfoldT↓ f y n y≼n p₁ p₂ with inspect (f y)
... | inj₁ x with-≡ _ = Tip⁺ x
... | inj₂ (y₁ , y₂) with-≡ inj₂≡fy with n
...  | zero = contradiction (proof (p₁ y y≼n))
      (\fy≡inj₁ -> inj₁≢inj₂ (≡-sym (≡-trans inj₂≡fy fy≡inj₁)))
...  | suc m with p₂ y m y₁ y₂ y≼n (≡-sym inj₂≡fy)
...    | (y₁≼m , y₂≼m) = Bin⁺ (unfoldT↓ f y₁ m y₁≼m p₁ p₂)
      (unfoldT↓ f y₂ m y₂≼m p₁ p₂)

One would wonder whether the condition (y₁ ≼ i × y₂ ≼ i) is too restrictive. When the proposition has to be proved inductively, we may need a stronger inductive hypothesis, for example, that both y₁ and y₂ are bounded by half of suc i. The current definition luckily works for our second task. We may need some generalisation later.

Repeated calls to split⁺ eventually terminates because the lengths of the lists are strictly decreasing. Therefore we define:

_#≤_ : {A : Set} -> List⁺ A -> ℕ -> Set
xs #≤ n = pred (length⁺ xs) ≤ n

With properly defined lemma1 and lemma2, we can expand the tree by unfoldT↓:

expand↓ : {a : Set} -> List⁺ a -> Tree⁺ a
expand↓ {a} xs = unfoldT↓ {a} {List⁺ a} {_#≤_} split⁺ xs
      (pred (length⁺ xs)) ≤-refl lemma1 lemma2

where the two lemmas respectively have types:

lemma1 : {a : Set}(xs : List⁺ a) ->
   xs #≤ zero -> ∃ a (\x -> split⁺ xs ≡ inj₁ x)
lemma2 : {a : Set} (xs : List⁺ a) (i : ℕ)
  (ys : List⁺ a) (zs : List⁺ a) ->
  xs #≤ suc i -> split⁺ xs ≡ inj₂ (ys , zs) -> ys #≤ i × zs #≤ i

Definition of lemma1 is a simple application of contradiction. On the other hand, lemma2 is very tedious to define. It appears to me that the definition inevitably becomes cumbersome because split⁺ and lemma2 are defined separately. It could be much easier if split⁺ is defined to return the computed results as well as some proofs about them. For my application in AoPA, however, I would prefer to be able to reuse the old split⁺. I don’t know a good solution yet.

Nils Anders Danielsson suggested that it would be nicer to integrate the bound into the type of the seed, while Conor McBride suggested yet another approach. I will try to summarise them next time.

To be continued…

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>