Tag Archives: Haskell

Developing Programs and Proofs Spontaneously using GADT

This entry was also posted to the Haskell-Cafe mailing list.

I am curious about the possibility of developing Haskell programs spontaneously with proofs about their properties and have the type checker verify the proofs for us, in a way one would do in a dependently typed language.

In the exercise below, I tried to redo part of the merge-sort example in Altenkirch, McBride, and McKinna’s introduction to Epigram: deal the input list into a binary tree, and fold the tree by the function merging two sorted lists into one. The property I am going to show is merely that the length of the input list is preserved.

Given that dependent types and GADTs are such popular topics, I believe the same must have been done before, and there may be better ways to do it. If so, please give me some comments or references. Any comments are welcomed.

> {-# OPTIONS_GHC -fglasgow-exts #-}

To begin with, we define the usual type-level representation of natural numbers and lists indexed by their lengths.

> data Z = Z       deriving Show
> data S a = S a   deriving Show

> data List a n where
>   Nil :: List a Z
>   Cons :: a -> List a n -> List a (S n)


To warm up, let us see the familiar “append” example. Unfortunately, unlike Omega, Haskell does not provide type functions. I am not sure which is the best way to emulate type functions. One possibility is to introduce the following GADT:

> data Plus m n k where    --- m + n = k
>   PlusZ :: Plus Z n n
>   PlusS :: Plus m n k -> Plus (S m) n (S k)

such that Plus m n k represents a proof that m + n = k.

Not having type functions, one of the possible ways to do append is to have the function, given two lists of lengths m and n, return a list of length k and a proof that m + n = k. Thus, the type of append would be:

   append :: List a m -> List a n
               -> exists k. (List a k, Plus m n k)

In Haskell, the existential quantifier is mimicked by forall. We define:

> data DepSum a p = forall i . DepSum (a i) (p i)

The term “dependent sum” is borrowed from the Omega tutorial of Sheard, Hook, and Linger. Derek Elkins and Dan Licata explained to me why they are called dependent sums, rather than products.

The function append can thus be defined as:

> append :: List a m -> List a n -> DepSum (List a) (Plus m n)
> append Nil ys = DepSum ys PlusZ
> append (Cons x xs) ys =
>    case (append xs ys) of
>      DepSum zs p -> DepSum (Cons x zs) (PlusS p)

Another possibility is to provide append a proof that m + n = k. The type and definition of append would be:

< append :: Plus m n k -> List a m -> List a n -> List a k
< append PlusZ Nil ys = ys
< append (PlusS pf) (Cons x xs) ys = Cons x (append pf xs ys)

I thought the second append would be more difficult to use: to append two lists, I have to provide a proof about their lengths! It turns out that this append actually composes easier with other parts of the program. We will come to this later.

Some Lemmas

Here are some lemmas represented as functions on terms. The function incAssocL, for example, converts a proof of m + (1+n) = k to a proof of (1+m) + n = k.

> incAssocL :: Plus m (S n) k -> Plus (S m) n k
> incAssocL PlusZ = PlusS PlusZ
> incAssocL (PlusS p) = PlusS (incAssocL p)

> incAssocR :: Plus (S m) n k -> Plus m (S n) k
> incAssocR (PlusS p) = plusMono p

> plusMono :: Plus m n k -> Plus m (S n) (S k)
> plusMono PlusZ = PlusZ
> plusMono (PlusS p) = PlusS (plusMono p)

For example, the following function revcat performs list reversal by an accumulating parameter. The invariant we maintain is m + n = k. To prove that the invariant holds, we have to use incAssocL.

> revcat :: List a m -> List a n -> DepSum (List a) (Plus m n)
> revcat Nil ys = DepSum ys PlusZ
> revcat (Cons x xs) ys =
>     case revcat xs (Cons x ys) of
>        DepSum zs p -> DepSum zs (incAssocL p)


Apart from the proof manipulations, the function merge is not very different from what one would expect:

> merge :: Ord a => List a m -> List a n -> DepSum (List a) (Plus m n)
> merge Nil ys = DepSum ys PlusZ
> merge (Cons x xs) Nil = append (Cons x xs) Nil
> merge (Cons x xs) (Cons y ys)
>   | x <= y = case merge xs (Cons y ys) of
>                  DepSum zs p -> DepSum (Cons x zs) (PlusS p)
>   | otherwise = case merge (Cons x xs) ys of
>                   DepSum zs p -> DepSum (Cons y zs) (plusMono p)

The lemma plusMono is used to convert a proof of m + n = k to a proof of m + (1+n) = 1+k.

Sized Trees

We also index binary trees by their sizes:

> data Tree a n where
>   Nul :: Tree a Z
>   Tip :: a -> Tree a (S Z)
>   Bin :: Tree a n1 -> Tree a n ->
>             (Plus p n n1, Plus n1 n k) -> Tree a k

The two trees given to the constructor Bin have sizes n1 and n respectively. The resulting tree, of size k, comes with a proof that n1 + n = k. Furthermore, we want to maintain an invariant that n1 either equals n, or is bigger than n by one. This is represented by the proof Plus p n n1. In the definition of insertT later, p is either PlusZ or PlusS PlusZ.

Lists to Trees

The function insertT inserts an element into a tree:

> insertT :: a -> Tree a n -> Tree a (S n)
> insertT x Nul = Tip x
> insertT x (Tip y) = Bin (Tip x) (Tip y) (PlusZ, PlusS PlusZ)
> insertT x (Bin t u (PlusZ, p)) =
>      Bin (insertT x t) u (PlusS PlusZ, PlusS p)
> insertT x (Bin t u (PlusS PlusZ, p)) =
>      Bin t (insertT x u) (PlusZ, PlusS (incAssocR p))

Note that whenever we construct a tree using Bin, the first proof, corresponding to the difference in size of the two subtrees, is either PlusZ or PlusS PlusZ.

The counterpart of foldr on indexed list is defined by:

> foldrd :: (forall k . (a -> b k -> b (S k))) -> b Z
>               -> List a n -> b n
> foldrd f e Nil = e
> foldrd f e (Cons x xs) = f x (foldrd f e xs)

The result is also an indexed type (b n).

The function deal :: List a n -> Tree a n, building a tree out of a list, can be defined as a fold:

> deal :: List a n -> Tree a n
> deal = foldrd insertT Nul

Trees to Lists, and Merge Sort

The next step is to fold through the tree by the function merge. The first two clauses are simple:

> mergeT :: Ord a => Tree a n -> List a n
> mergeT Nul = Nil
> mergeT (Tip x) = Cons x Nil

For the third clause, one would wish that we could write something as simple as:

   mergeT (Bin t u (_,p1)) =
      case merge (mergeT t) (mergeT u) of
        DepSum xs p -> xs

However, this does not type check. Assume that t has size n1, and u has size n. The DepSum returned by merge consists of a list of size i, and a proof p of type Plus m n i, for some i. The proof p1, on the other hand, is of type P m n k for some k. Haskell does not know that Plus m n is actually a function and cannot conclude that i=k.

To explicitly state the equality, we assume that there is a function plusFn which, given a proof of m + n = i and a proof of m + n = k, yields a function converting an i in any context to a k. That is:

   plusFn :: Plus m n i -> Plus m n k
               -> (forall f . f i -> f k)

The last clause of mergeT can be written as:

mergeT (Bin t u (_,p1)) =
   case merge (mergeT t) (mergeT u) of
     DepSum xs p -> plusFn p p1 xs

How do I define plusFn? On Haskell-Cafe, Jim Apple suggested this implementation (by the way, he maintains a very interesting blog on typed programming):

  plusFn :: Plus m n h -> Plus m n k -> f h -> f k
  plusFn PlusZ PlusZ x = x
  plusFn (PlusS p1) (PlusS p2) x =
      case plusFn p1 p2 Equal of
        Equal -> x

  data Equal a b where
      Equal :: Equal a a

Another implementation, which looks closer to the previous work on type equality [3, 4, 5], was suggested by apfelmus:

> newtype Equal a b = Proof { coerce :: forall f . f a -> f b }

> newtype Succ f a  = InSucc { outSucc :: f (S a) }

> equalZ :: Equal Z Z
> equalZ = Proof id

> equalS :: Equal m n -> Equal (S m) (S n)
> equalS (Proof eq) = Proof (outSucc . eq . InSucc)

> plusFnEq :: Plus m n i -> Plus m n k -> Equal i k
> plusFnEq PlusZ     PlusZ     = Proof id
> plusFnEq (PlusS x) (PlusS y) = equalS (plusFn x y)

> plusFn :: Plus m n i -> Plus m n k -> f i -> f k
> plusFn p1 p2 = coerce (plusFnEq p1 p2)

Now that we have both deal and mergeT, merge sort is simply their composition:

> msort :: Ord a => List a n -> List a n
> msort = mergeT . deal

The function mergeT can be defined using a fold on trees as well. Such a fold might probably look like this:

> foldTd :: (forall m n k . Plus m n k -> b m -> b n -> b k)
>           -> (a -> b (S Z)) -> b Z
>           -> Tree a n -> b n
> foldTd f g e Nul = e
> foldTd f g e (Tip x) = g x
> foldTd f g e (Bin t u (_,p)) =
>    f p (foldTd f g e t) (foldTd f g e u)

   mergeT :: Ord a => Tree a n -> List a n
   mergeT = foldTd merge' (\x -> Cons x Nil) Nil
    where merge' p1 xs ys =
            case merge xs ys of
              DepSum xs p -> plusFn p p1 xs

I am not sure whether this is a "reasonable" type for foldTd.

Passing in the Proof as an Argument

Previously I thought the second definition of append would be more difficult to use, because we will have to construct a proof about the lengths before calling append. In the context above, however, it may actually be more appropriate to use this style of definitions.

An alternative definition of merge taking a proof as an argument can be defined by:

< merge :: Ord a => Plus m n k -> List a m ->
<                       List a n -> List a k
< merge PlusZ Nil ys = ys
< merge pf (Cons x xs) Nil = append pf (Cons x xs) Nil
< merge (PlusS p) (Cons x xs) (Cons y ys)
<   | x <= y    = Cons x (merge p xs (Cons y ys))
<   | otherwise = Cons y (merge (incAssocL p) (Cons x xs) ys)

A definition of mergeT using this definition of merge follows immediately because we can simply use the proof coming with the tree:

< mergeT :: Ord a => Tree a n -> List a n
< mergeT Nul = Nil
< mergeT (Tip x) = Cons x Nil
< mergeT (Bin t u (_,p1)) =
<   merge p1 (mergeT t) (mergeT u)

I don't know which approach can be called more "natural". However, both Jim Apple and apfelmus pointed out that type families, a planned new feature of Haskell, may serve as a kind of type functions. With type families, the append example would look like ( code from apfelmus):

  type family   Plus :: * -> * -> *
  type instance Plus Z n     = n
  type instance Plus (S m) n = S (Plus m n)

  append :: (Plus m n ~ k) => List a m -> List a n -> List a k
  append Nil         ys = ys
  append (Cons x xs) ys = Cons x (append xs ys)

apfelmus commented that "viewed with the dictionary translation for type classes in mind, this is probably exactly the alternative type of append you propose: append :: Plus m n k -> List a m -> List a n -> List a k".


  • [1] Thorsten Altenkirch, Conor McBride, and James McKinna. Why Dependent Types Matter.
  • [2] Tim Sheard, James Hook, and Nathan Linger. GADTs + Extensible Kinds = Dependent Programming.
  • [3] James Cheney, Ralf Hinze. A lightweight implementation of generics and dynamics.
  • [4] Stephanie Weirich, Type-safe cast: (functional pearl), ICFP 2000.
  • [5]Arthur I. Baars , S. Doaitse Swierstra, Typing dynamictyping, ACM SIGPLAN Notices, v.37 n.9, p.157-166, September 2002

Appendix. Auxiliary Functions

> instance Show a => Show (List a n) where
>  showsPrec _ Nil = ("[]"++)
>  showsPrec _ (Cons x xs) = shows x . (':':) . shows xs

> instance Show (Plus m n k) where
>   showsPrec _ PlusZ = ("pZ"++)
>   showsPrec p (PlusS pf) = showParen (p>=10) (("pS " ++) .
>                              showsPrec 10 pf)

> instance Show a => Show (Tree a n) where
>   showsPrec _ Nul = ("Nul"++)
>   showsPrec p (Tip x) = showParen (p >= 10) (("Tip " ++) . shows x)
>   showsPrec p (Bin t u pf) =
>     showParen (p>=10)
>      (("Bin "++) . showsPrec 10 t . (' ':) . showsPrec 10 u .
>       (' ':) . showsPrec 10 pf)

Encoding Inductive and Coinductive Types in Polymorphic Lambda Calculus

This was written about two years ago. Mostly inspired by Philip Wadler‘s note Recursive types for free.

Inductive types

Inductive types are least fixed-points. An indcutive type μF is simulated by

 forall x . (F x -> x) -> x

In Haskell, we need a constructor to wrap a “forall” quantifier.

> newtype LFix f = LF (forall x. (f x -> x) -> x)
> instance Show (LFix f) where
>   show _ = ""

> foldL :: Functor f => (f x -> x) -> LFix f -> x
> foldL f (LF x) = x f

> inL :: Functor f => f (LFix f) -> LFix f
> inL fx = LF (\\f -> f . fmap (foldL f) $ fx)

> outL :: Functor f => LFix f -> f (LFix f)
> outL = foldL (fmap inL)

Example: lists.

> data F a x = L | R a x deriving Show
> instance Functor (F a) where
>   fmap f L = L
>   fmap f (R a x) = R a (f x)

> type ListL a = LFix (F a)

A ListL can only be constructed out of nil and cons.

    Main> consL 1 (consL 2 (consL 3 nilL))
> nilL = inL L
> consL a x = inL (R a x)

We can convert a ListL to a ‘real’ Haskell list.

   Main> lList $ consL 1 (consL 2 (consL 3 nilL))
> lList :: LFix (F a) -> [a]
> lList = foldL ll 
>   where ll L = []
>         ll (R a x) = a : x

However, a ListL has to be explicitly built, so there is no infinite ListL.

Coinductive types

A coinductive type νF is simulatd by

 exists x . (x -> F x, x)

We represent it in Haskell using the property exists x . F x == forall y . (forall x . F x -> y) -> y:

> data GFix f = forall x . GF (x -> f x, x)
> instance Show (GFix f) where
>   show _ = ""

> unfoldG :: Functor f => (x -> f x) -> x -> GFix f
> unfoldG g x = GF (g,x)

> outG :: Functor f => GFix f -> f (GFix f)
> outG (GF (g,x)) = fmap (unfoldG g) . g $ x

> inG :: Functor f => f (GFix f) -> GFix f
> inG = unfoldG (fmap outG) 


> type ListG a = GFix (F a)

ListG can be constructed out of nil and cons, as well as an unfold.

> nilG = inG L
> consG a x = inG (R a x)

> fromG :: Int -> ListG Int
> fromG m = unfoldG ft m
>    where ft m = R m (m+1)

However, one can perform only a finite number of outG.

  Main> fmap outG . outG . fromG $ 1 
  R 1 (R 2 )


The function force witnesses the isomorphism between LFix and GFix.

> force :: Functor f => GFix f -> LFix f
> force = inL . fmap force . outG     -- recursion!

If LFix and GFix are isomorphic, we are allowed to build hylomorphisms:

> fromTo :: Int -> Int -> LFix (F Int)
> fromTo m n = takeLessThan n . force . fromG $ m

> takeLessThan :: Int -> LFix (F Int) -> LFix (F Int)
> takeLessThan n = foldL (tlt n)
>   where tlt n L = nilL
>         tlt n (R m x) | n <= m = nilL
>                       | otherwise = consL m x

  Main> lList (fromTo 1 10)

However, force has to be defined using general recursion. The price is that it is now possible to write programs that do not terminate.

Polymorphic Types in Haskell without Constructors?

I wrote this during Feb. 2005. Since it occurs to me every once in a while and I kept forgetting the details, I’d better keep a note.

I was trying to simulate church numerals and primitive recursion in second rank polymorphism of Haskell, and I thought I could do it without introducing a constructor by newtype. Church’s encoding of natural numbers can be expressed by the type:

> type N = forall a . (a -> a) -> a -> a


We first define zero, succ, and some constants:

> zero :: N 
> zero = \f -> \a ->  a

> succ :: N -> N 
> succ n = \f -> \a -> f (n f a)

> toInt :: N -> Int
> toInt n = n (1+) 0

> one, two, three :: N
> one = succ zero
> two = succ one
> three = succ two

Addition and multiplication can be defined as usual:

> add, mul :: N -> N -> N
> add m n = \f -> \a -> m f (n f a)
> mul m n = \f -> \a -> m (n f) a


Even pred typechecks — at least this particular version of pred:

> pred :: N -> N
> pred n f x = n (\g h -> h (g f)) (const x) id

I am more familiar with another way to implement pred, which uses pairs. This pred, on the other hand, can be understood by observing that the “fold” part returns a function that abstracts away the outermost f:

  pred n f x = dec n id
    where dec 0 = \h -> x
          dec (1+n) = \h -> h (f^n x)


I know of two ways to define exp. However, only this one typechecks:

> exp :: N -> N -> N
> exp m n = n m

Another definition, making use of mul, does not typecheck!

  exp :: N -> N -> N
  exp m n = n (mul m) one

GHC returns a rather cryptic error message “Inferred type is less polymorphic than expected. Quantified type variable `a’ escapes.” The real reason is that polymorphic types in Haskell can only be instantiated with monomorphic types. If we spell out the explicit type applications, we get:

   exp  (m::N) (n::N) = n [N] ((mul m)::N->N) (one::N)

which is not allowed. On the other hand, the previous definition of exp does not instantiate type variables to polymorphic types:

   exp (m::N) (n::N) = /\b -> \(f::(b->b)) -> \(x::b) -> n[b->b] (m[b]) f x

For the same reason, primitive recursion does not type check:

 primrec :: (N -> a -> a) -> a -> N -> a
 primrec f c n =
   snd (n (\z -> pair (succ (fst z)) 
                      (f (fst z) (snd z)))
          (pair zero c))

nor does the pair-based definition of pred.

To overcome the problem we have to wrap N in newtype. The solution was presented by Oleg Kiselyov.

Zipping Half of a List with Its Reversal

I found this note with one page of Haskell code, without comments, when I was cleaning up my files. It took me quite a while to figure out what it might probably be about: given a list, zip its first half with the reverse of its second half, in only “one and a half” traversals of the list. It gives us a fast algorithm to, for example, check whether a list is a palindrome. I thought it was probably something we discussed in AoP, but it turns out that I must have taken the note when people in IPL were studying There and Back Again of Danvy and Goldberg. Recorded in the note was my attempt to understand it.

Let convolute (xs,ys) = zip xs (reverse ys) and half be a function cutting a list in half, the problem is simply specified by revzip = convolute . half. The first trick is that convolute can be expressed in terms of its generalisation:

> convolute :: ([a],[b]) -> [(a,b)]
> convolute = fstn . uncurry walk
>   where fstn (xs,[]) = xs

where walk is defined by:

  walk xs ys = (zip xs (reverse (take n ys)), drop n ys)
      where n = length xs

The reason introducing walk is that it can be defined as a fold:

> walk = foldr ((.).shift) (split (nil, id))
> shift x (zs,y:ys) = ((x,y):zs, ys)
> nil _ = []

But the definition below is probably a lot more comprehensible:

  walk [] ys = ([],ys)
  walk (x:xs) ys = shift x (walk xs ys)

Some of these auxiliary functions should be part of the Haskell Prelude:

> prod (f,g) (a,b) = (f a, g b)
> split (f,g) a = (f a, g a)

The following implementation of half splits a list into two halfs using two pointers:

> half :: [a] -> ([a],[a])
> half = race . dup 
> race (xs,[]) = ([],xs)
> race (_:xs,[_]) = ([],xs)
> race (x:xs,_:_:ys) = cross ((x:),id) (race (xs,ys))
> dup a = (a,a)

Furthermore, uncurry f = apply . prod (f,id) where

> apply (f,a) = f a

Therefore the specification has become:

  revzip = fstn . apply . prod (walk,id) . race . dup

Now we try to fuse prod (walk,id) . race and get

> wrace (xs,[]) = (split (nil, id), xs)
> wrace (_:xs, [_]) = (split (nil, id), xs)
> wrace (x:xs, _:_:ys) = prod ((shift x .), id) (wrace (xs,ys))

Hence the result:

> revzip = fstn . apply . wrace . dup

Countdown: a case study in origami programming

R. S. Bird and S-C. Mu. In Journal of Functional Programming Vol. 15(5), pp. 679-702, 2005.
[GZipped Postscript]

Countdown is the name of a game in which one is given a list of source numbers and a target number, with the aim of building an arithmetic expression out of the source numbers to get as close to the target
as possible. Starting with a relational specification we derive a number of functional programs for solving Countdown. These programs are obtained by exploiting the properties of the folds and unfolds of various data types, a style of programming Gibbons has aptly called origami programming. Countdown is attractive as a case study in origami programming both as an illustration of how different algorithms can emerge from a single specification, as well as the space and time trade-offs that have to be taken into account in comparing functional programs.

Quantum functional programming

S-C. Mu and R. S. Bird. In 2nd Asian Workshop on Programming Languages and Systems , KAIST, Dajeaon, Korea, December 17-18, 2001.
[GZipped Postscript]

It has been shown that non-determinism, both angelic and demonic, can be encoded in a functional language in different representation of sets. In this paper we see quantum programming as a special kind of non-deterministic programming where negative probabilities are allowed. The point is demonstrated by coding two simple quantum algorithms in Haskell. A monadic style of quantum programming is also proposed. Programs are written in an imperative style but the programmer is encouraged to think in terms of values rather than quantum registers.

A Haskell Quine

A Haskell quine. That is, a program whose output is itself.

main = putStr (s ++ [';',' ','s',' ','=',' '] ++ show s);
s = "main = putStr (s ++ [';',' ','s',' ','=',' '] ++ show s)"

To deal with string quoting, it is difficult to write a quine not assuming a particular encoding (e.g. ASCII). In Haskell, however, it is side-stepped by using the built-in show function for strings. See also the quine page.

Functional programmers would notice that such programs resembles the lambda expression (\x -> x x) (\x -> x x), which reduces to itself. As a simple extension, this self-expanding program inserts to itself one line of empty comment each time it is run. It resembles the Y combinator.

Yokoyama later came up with a more elegant quine.