Difference lists

23 Sep 2014

This post is an extended version of a lightning talk I gave at the Brisbane Functional Programming Group. It introduces the difference list, a simple trick for improving the performance of certain kinds of list-building functions in Haskell, and goes on to explore the connections to monoids and folds.

The first half is aimed at Haskell novices, while the latter parts might be interesting to intermediates.

A first example

Let’s reverse some lists. I’m sure you’ve seen this before:

reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]

It’s a simple and clear definition, but it performs badly. List concatenation, (++), traverses its left argument, and since that’s a recursive call, this version takes quadratic time overall.

We can get to linear time by introducing an accumulator:

reverse :: [a] -> [a]
reverse xs = go xs []
  where
    go :: [a] -> [a] -> [a]
    go [] acc = acc
    go (x:xs) acc = go xs (x:acc)

This is a worker-wrapper, where the wrapper initialises the accumulator to the empty list, []. The worker, go, is tail recursive, and accumulates the result in reverse order as it forward-traverses the input.

Although efficient, this is less clear. Let’s tidy up the worker by eliminating the accumulator parameter, acc:

reverse :: [a] -> [a]
reverse xs = go xs []
  where
    go :: [a] -> [a] -> [a]
    go [] = \acc -> acc
    go (x:xs) = \acc -> go xs (x:acc)

The first case (line 5) is just the identity function, id, while we can write the second case as a function composition:

reverse :: [a] -> [a]
reverse xs = go xs []
  where
    go :: [a] -> [a] -> [a]
    go [] = id
    go (x:xs) = go xs . (x:)

If we now compare the worker, go, with our original version of reverse, we see a similar structure, with some differences:

The empty list, [], becomes the identity function, id.
List concatenation, (++), becomes function composition, (.).
The singleton list, [x], becomes a function, (x:), which prepends x to its argument.
The list type, [a], becomes the function type, [a] -> [a], in the result.

For the last point, recall that a -> b -> c parses as a -> (b -> c).

Tree flattening

For a second example, let’s take a binary tree and flatten it to a list of elements in left-to-right order. Again, this version is clear, but since there is a recursive call to flatten on the left of list append, (++), it takes quadratic time in the worst case:

data Tree a = Leaf | Branch (Tree a) a (Tree a)

flatten :: Tree a -> [a]
flatten Leaf = []
flatten (Branch l x r) = flatten l ++ [x] ++ flatten r

As before, the linear-time version uses an accumulator to build the result in reverse:

flatten :: Tree a -> [a]
flatten t = go t []
  where
    go :: Tree a -> [a] -> [a]
    go Leaf acc = acc
    go (Branch l x r) acc = go l (x : go r acc)

If we rearrange, and compare the worker, go, with the original, inefficient version, we can see the same patterns we saw in the reverse function:

flatten :: Tree a -> [a]
flatten t = go t []
  where
    go :: Tree a -> [a] -> [a]
    go Leaf = id
    go (Branch l x r) = go l . (x:) . go r

Difference lists

Having identified the pattern, we can define the type of difference lists, together with its basic operations:

type DList a = [a] -> [a]

-- O(1)
empty :: DList a
empty = id

-- O(1)
singleton :: a -> DList a
singleton = (:)

-- O(n)
fromList :: [a] -> DList a
fromList = (++)

-- O(1)
append :: DList a -> DList a -> DList a
append = (.)

-- O(1)
toList :: DList a -> [a]
toList d = d []

This means a difference list is a function which prepends some number of items to its argument, without inspecting that argument. For example, this is a difference list:

myDList :: DList Int
myDlist r = 1:2:3:r

But the following is not a difference list, because the reverse function inspects its argument, and because it does not incorporate the argument in its result:

notDList :: DList a
notDlist = reverse

Comparing the difference list operations with the corresponding ones on ordinary lists, we see that that difference list append takes constant time, whereas for ordinary lists, (++) takes time linear in the size of the left argument. This means difference lists are good for building lists using left-associated concatenation.

But there’s a catch!

The time complexities are amortised, and they are only valid when the difference list is used ephemerally. If you attempt to use a difference list persistently, toList may degrade to linear time.

The reason for this is that there are two costs associated with append. The first is paid when append builds a closure for the function composition. The second is paid every time the function composition is applied to an argument. That is, whenever toList is evaluated.

This means that toList actually takes time that is linear in the number of function compositions it has to apply. However, if we are only interested in the overall time taken by a sequence of operations, we can assign the cost of evaluating the function composition to append, even though the evaluation occurs in toList. And so we can say that both append and toList take amortised constant time.

Of course, that analysis assumes that toList is only evaluated once. This is what it means to restrict ourselves to ephemeral usage.

A bad example

To show that difference lists are not magical, let’s see how we might use them inappropriately. Since difference lists are good for appending, perhaps they could be used to implement a FIFO queue:

type Queue a = DList a

put :: Queue a -> a -> Queue a
put q x = append q (singleton x)

-- No!
take :: Queue a -> Maybe (a, Queue a)
take q = case toList q of
  [] -> Nothing
  x:xs -> Just (x, fromList xs)

The put operation seems fine, provided we don’t attempt to use the Queue persistently. For take, we find that we need to use toList before we can inspect the contents of the queue. Worst of all, we must use the linear-time fromList to construct the new Queue.

Since switching back and forth between difference lists and ordinary lists is inefficient, difference lists are primarily useful in build-then-consume algorithms. We can use a difference list to efficiently build a list, before converting to an ordinary list that is then consumed.

Some fun with folds

While it’s good to know when and how to use difference lists, it’s even nicer to understand the connections to folds and monoids. Let’s reexamine our difference-list-based reverse:

reverse :: [a] -> [a]
reverse xs = go xs []
  where
    go :: [a] -> [a] -> [a]
    go [] = id
    go (x:xs) = go xs . (x:)

The only places we actually construct any lists are the empty list, [], at the end of line 2, and the singleton difference list, (x:), at the end of line 6. Let’s abstract over those, calling them z and f x respectively:

reverse :: (a -> [a] -> [a]) -> [a] -> [a] -> [a]
reverse f z xs = go xs z
  where
    go :: [a] -> [a] -> [a]
    go [] = id
    go (x:xs) = go xs . f x

We can then generalise the type without touching the implementation, and arrive at a familiar function:

foldl :: forall a b. (a -> b -> b) -> b -> [a] -> b
foldl f z xs = go xs z
  where
    go :: [a] -> b -> b
    go [] = id
    go (x:xs) = go xs . f x

The type of argument f is flipped with respect to the Prelude version, but the meaning is the same. So we can say that foldl is just reverse with the constructors abstracted.

Similarly, we can say that foldr is just a list traversal with the constructors abstracted. In fact, we can get the right-fold on Tree using the same generalisation on flatten. For lists, we can get foldr by simply flipping the function composition in line 6:

foldr :: forall a b. (a -> b -> b) -> b -> [a] -> b
foldr f z xs = go xs z
  where
    go :: [a] -> b -> b
    go [] = id
    go (x:xs) = f x . go xs

Monoids

To make precise the comparison between the two versions of each of our examples, we need to recognise that we are using the corresponding operations of two different monoids, the list monoid and the Endo monoid:

instance Monoid [a] where
  mempty = []
  (<>) = (++)

newtype Endo b = Endo { appEndo :: b -> b }

instance Monoid (Endo a) where
  mempty = Endo id
  Endo f <> Endo g = Endo (f . g)

Notice that in the Endo monoid, we’ve generalised from the type [a] -> [a] that we used for difference lists, to b -> b, just as we did when we generalised from reverse to foldl.

We also need a way to inject a single element into a monoid. We use the unit method of the Reducer class, from the reducers package. We can then rewrite reverse and flatten like this:

reverse' :: (Monoid m, Reducer a m) => [a] -> m
reverse' [] = mempty
reverse' (x:xs) = reverse' xs <> unit x

flatten' :: (Monoid m, Reducer a m) => Tree a -> m
flatten' Leaf = mempty
flatten' (Branch l x r) = flatten' l <> unit x <> flatten' r

If we treat the result as a list, we get the inefficient versions:

reverse :: [a] -> [a]
reverse = reverse'

flatten :: Tree a -> [a]
flatten = flatten'

But if we add an appropriate wrapper, we get the efficient versions:

extract :: Endo [a] -> [a]
extract xs = appEndo xs []

reverse :: [a] -> [a]
reverse = extract . reverse'

flatten :: Tree a -> [a]
flatten = extract . flatten'

More fun with folds

Now that we know about Endo, let’s reexamine foldr:

foldr :: forall a b. (a -> b -> b) -> b -> [a] -> b
foldr f z xs = go xs z
  where
    go :: [a] -> b -> b
    go [] = id
    go (x:xs) = f x . go xs

First let’s rearrange the arguments a little, swapping z and xs:

foldr :: forall a b. (a -> b -> b) -> [a] -> b -> b
foldr f xs z = go xs z
  where
    go :: [a] -> b -> b
    go [] = id
    go (x:xs) = f x . go xs

We can see the type b -> b in lots of places, and we’re using the function id and composition as well. That’s just the Endo monoid, so let’s rewrite like this:

foldr :: forall a b. (a -> Endo b) -> [a] -> Endo b
foldr f = go
  where
    go :: [a] -> Endo b
    go [] = mempty
    go (x:xs) = f x <> go xs

Without touching the implementation, we can generalise the type to any monoid, and rename:

foldMap :: forall a m. Monoid m => (a -> m) -> [a] -> m
foldMap f = go
  where
    go :: [a] -> m
    go [] = mempty
    go (x:xs) = f x <> go xs

The wrapper is no longer doing anything, so we have just this:

foldMap :: Monoid m => (a -> m) -> [a] -> m
foldMap f [] = mempty
foldMap f (x:xs) = f x <> foldMap f xs

Notice how foldMap follows the structure of the list data type. We can recover foldr by specialising to the Endo monoid, and extracting the result by applying to the argument z:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z xs = appEndo (foldMap (Endo . f) xs) z

To get foldl, we would need a version of Endo in which the monoid append (i.e. the function composition) is reversed. Dual is a monoid transformer which just flips the monoid append operation, <>, with respect to the base monoid:

foldl :: (a -> b -> b) -> b -> [a] -> b
foldl f z xs = appEndo (getDual $ foldMap (Dual . Endo . f) xs) z

What about Tree flattening? Well, Tree has a foldMap, too:

foldMap :: Monoid m => (a -> m) -> Tree a -> m
foldMap f Leaf = mempty
foldMap f (Branch l x r) = foldMap f l <> f x <> foldMap f r

Again, notice how foldMap follows the structure of the Tree data type. We could reuse the same definitions of foldr and foldl above, just by changing the types.

In fact, we’ve just reinvented the Foldable type class from the base package:

class Foldable t where
  foldMap :: Monoid m => (a -> m) -> t a -> m

Just by implementing foldMap for your Foldable data types, you get foldl and foldr for free. This means you can use whichever fold is most appropriate for your situation. For example:

reverse :: Foldable t => t a -> [a]
reverse = foldl (flip (:)) []

flatten :: Foldable t => t a -> [a]
flatten = foldr (:) []

first :: Foldable t => t a -> Maybe a
first = foldr (const . Just) Nothing

last :: Foldable t => t a -> Maybe a
last = foldl (const Just) Nothing

What’s more, if your foldMap follows the structure of your data type, foldl and foldr will both be maximally lazy and efficient. For example, first is O(1) on lists, including infinite lists, and O(log n) on Data.Set, while last is O(n) on lists, but still O(log n) on Data.Set.

Analogy to free monads

When defining a term language and one or more interpreters, the free monad makes it easy to write computions in the term language without reference to any specific interpretation.

Janis Voitgländer showed how to improve the asymptotic complexity of computations over free monads. The method for free monad computations is analogous to the difference-list transformation for list constructions, but operating in a different category. Whereas list types are free monoid objects in the category Hask, free monads are the corresponding objects in the endofunctor category.

Following Voitgländer, we can write our extract function with a more restrictive rank-2 type:

extract :: forall a. (forall m. (Monoid m, Reducer m a) => m) -> [a]
extract xs = appEndo xs []

This ensures that the computed list is in fact a difference list, by restricting the construction to mempty, (<>) and unit.