In this talk, we show how to match
in Coq, using techniques described by Adam
Chlipala in his book, Certified Programming with Dependent Types.
We first review what it means to pattern-match on inductive families, contrasting Coq with Agda, and examine what it is about Coq that complicates pattern matching. Using a simple running example, we’ll show how to use Coq match annotations to eliminate nonsense cases, and the convoy pattern for refining the types of things already in scope. Finally, we’ll show that by equipping an inductive family with some well-chosen combinators, it is often possible to regain some semblance of elegance.
This is a recording of a practice run for a talk at YOW! Lambda Jam. You can download the video as MP4 or WebM, and slides as PDF. There is some code on GitHub. Also on YouTube.
The first half is aimed at Haskell novices, while the latter parts might be interesting to intermediates.
Let’s reverse some lists. I’m sure you’ve seen this before:
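A minimal sketch of the naive definition described here. The name reverseNaive is my own, chosen so the example does not clash with the Prelude’s reverse.

```haskell
-- Naive list reversal: clear, but quadratic, because (++) re-traverses
-- the already-reversed tail at every step.
-- The name reverseNaive is an assumption made for this sketch.
reverseNaive :: [a] -> [a]
reverseNaive []     = []
reverseNaive (x:xs) = reverseNaive xs ++ [x]
```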

It’s a simple and clear definition, but it performs badly. List concatenation, (++), traverses its left argument, and since that’s a recursive call, this version takes quadratic time overall.
We can get to linear time by introducing an accumulator:
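A sketch of the accumulator version, assuming the usual worker-wrapper layout. The name reverseAcc is hypothetical.

```haskell
-- Linear-time reversal: the wrapper starts the accumulator at [],
-- and the tail-recursive worker pushes each element onto it.
-- The name reverseAcc is an assumption made for this sketch.
reverseAcc :: [a] -> [a]
reverseAcc xs = go xs []
  where
    go :: [a] -> [a] -> [a]
    go []     acc = acc
    go (y:ys) acc = go ys (y : acc)
```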

This is a worker-wrapper, where the wrapper initialises the accumulator to the empty list, []. The worker, go, is tail recursive, and accumulates the result in reverse order as it forward-traverses the input.
Although efficient, this is less clear. Let’s tidy up the worker by eliminating the accumulator parameter, acc:
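One way this step might look, with acc moved from the left of each equation into a lambda on the right. The name reverseLam is hypothetical.

```haskell
-- The worker no longer names its accumulator on the left-hand side;
-- each case now returns a function from accumulator to result.
-- The name reverseLam is an assumption made for this sketch.
reverseLam :: [a] -> [a]
reverseLam xs = go xs []
  where
    go :: [a] -> [a] -> [a]
    go []     = \acc -> acc               -- the first case (line 5 in the text)
    go (y:ys) = \acc -> go ys (y : acc)   -- the second case
```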

The first case (line 5) is just the identity function, id, while we can write the second case as a function composition:
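A sketch of the result of that rewrite, under the same assumptions as before. The name reverseDiff is hypothetical.

```haskell
-- The worker written with id and function composition.
-- go ys . (y :) first prepends y to the accumulator, then continues with ys.
-- The name reverseDiff is an assumption made for this sketch.
reverseDiff :: [a] -> [a]
reverseDiff xs = go xs []
  where
    go :: [a] -> [a] -> [a]
    go []     = id
    go (y:ys) = go ys . (y :)
```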

If we now compare the worker, go, with our original version of reverse, we see a similar structure, with some differences:
[] becomes the identity function, id.
(++) becomes function composition, (.).
[x] becomes a function, (x:), which prepends x to its argument.
[a] becomes the function type, [a] -> [a], in the result.
For the last point, recall that a -> b -> c parses as a -> (b -> c).
For a second example, let’s take a binary tree and flatten it to a list of elements in left-to-right order. Again, this version is clear, but since there is a recursive call to flatten on the left of list append, (++), it takes quadratic time in the worst case:
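A sketch of such a flatten. The Tree declaration here is an assumption (a leaf carries nothing, a node carries one element between two subtrees), and the name flattenNaive is hypothetical.

```haskell
-- Assumed tree shape; the post's own Tree declaration may differ.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- In-order flattening. The recursive call on the left of (++) is what
-- makes this slow for left-leaning trees.
flattenNaive :: Tree a -> [a]
flattenNaive Leaf         = []
flattenNaive (Node l x r) = flattenNaive l ++ [x] ++ flattenNaive r
```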

As before, the lineartime version uses an accumulator to build the result in reverse:
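A sketch of the accumulator version, again assuming a Tree with empty leaves and one element per node. The name flattenAcc is hypothetical.

```haskell
-- Assumed tree shape; the post's own Tree declaration may differ.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- The worker visits the right subtree first, so the accumulator is
-- built from the back of the result towards the front.
flattenAcc :: Tree a -> [a]
flattenAcc t = go t []
  where
    go :: Tree a -> [a] -> [a]
    go Leaf         acc = acc
    go (Node l x r) acc = go l (x : go r acc)
```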

If we rearrange, and compare the worker, go, with the original, inefficient version, we can see the same patterns we saw in the reverse function:
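A sketch of the rearranged worker, with the same assumed Tree shape. The name flattenDiff is hypothetical.

```haskell
-- Assumed tree shape; the post's own Tree declaration may differ.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Compared with the naive flatten: [] became id, (++) became (.),
-- and [x] became (x :).
flattenDiff :: Tree a -> [a]
flattenDiff t = go t []
  where
    go :: Tree a -> [a] -> [a]
    go Leaf         = id
    go (Node l x r) = go l . (x :) . go r
```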

Having identified the pattern, we can define the type of difference lists, together with its basic operations:
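A minimal sketch of such a type. The operation names fromList, toList and append come from the text that follows; empty and singleton, and the choice of a newtype, are assumptions.

```haskell
-- A difference list wraps a function that prepends items to its argument.
newtype DList a = DList ([a] -> [a])

-- Convert from an ordinary list: prepend all of xs.
fromList :: [a] -> DList a
fromList xs = DList (xs ++)

-- Convert to an ordinary list: apply the function to the empty list.
toList :: DList a -> [a]
toList (DList f) = f []

-- The empty difference list prepends nothing.
empty :: DList a
empty = DList id

-- A single-element difference list prepends just x.
singleton :: a -> DList a
singleton x = DList (x :)

-- Appending is function composition.
append :: DList a -> DList a -> DList a
append (DList f) (DList g) = DList (f . g)
```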

This means a difference list is a function which prepends some number of items to its argument, without inspecting that argument. For example, this is a difference list:
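A small sketch of a function with this property, written as a bare function rather than wrapped in any difference list type. The name example and the element values are hypothetical.

```haskell
-- Prepends 1, 2 and 3 to whatever list it is given, without ever
-- pattern matching on that list.
example :: [Int] -> [Int]
example = (1 :) . (2 :) . (3 :)
```

Because the argument is never inspected, the first three elements of the result are available even when the argument is undefined.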

But the following is not a difference list, because the reverse function inspects its argument, and because it does not incorporate the argument in its result:

Comparing the difference list operations with the corresponding ones on ordinary lists, we see that difference list append takes constant time, whereas for ordinary lists, (++) takes time linear in the size of the left argument. This means difference lists are good for building lists using left-associated concatenation.
But there’s a catch!
The time complexities are amortised, and they are only valid when the difference list is used ephemerally. If you attempt to use a difference list persistently, toList may degrade to linear time.
The reason for this is that there are two costs associated with append. The first is paid when append builds a closure for the function composition. The second is paid every time the function composition is applied to an argument. That is, whenever toList is evaluated.
This means that toList actually takes time that is linear in the number of function compositions it has to apply. However, if we are only interested in the overall time taken by a sequence of operations, we can assign the cost of evaluating the function composition to append, even though the evaluation occurs in toList. And so we can say that both append and toList take amortised constant time.
Of course, that analysis assumes that toList is only evaluated once. This is what it means to restrict ourselves to ephemeral usage.
To show that difference lists are not magical, let’s see how we might use them inappropriately. Since difference lists are good for appending, perhaps they could be used to implement a FIFO queue:
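A sketch of what such a queue might look like. The names Queue, put, take, toList and fromList come from the surrounding text; emptyQueue and the exact representation are assumptions.

```haskell
import Prelude hiding (take)

-- Difference list, repeated here so the sketch stands alone.
newtype DList a = DList ([a] -> [a])

fromList :: [a] -> DList a
fromList xs = DList (xs ++)

toList :: DList a -> [a]
toList (DList f) = f []

-- A FIFO queue backed by a difference list.
newtype Queue a = Queue (DList a)

-- emptyQueue is an assumed helper, not named in the text.
emptyQueue :: Queue a
emptyQueue = Queue (DList id)

-- Enqueue at the back: one more function composition.
put :: a -> Queue a -> Queue a
put x (Queue (DList f)) = Queue (DList (f . (x :)))

-- Dequeue from the front: forces a full toList, then rebuilds the
-- remainder with fromList.
take :: Queue a -> Maybe (a, Queue a)
take (Queue q) = case toList q of
  []       -> Nothing
  (x : xs) -> Just (x, Queue (fromList xs))
```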

The put operation seems fine, provided we don’t attempt to use the Queue persistently. For take, we find that we need to use toList before we can inspect the contents of the queue. Worst of all, we must use the linear-time fromList to construct the new Queue.
Since switching back and forth between difference lists and ordinary lists is inefficient, difference lists are primarily useful in build-then-consume algorithms. We can use a difference list to efficiently build a list, before converting to an ordinary list that is then consumed.
While it’s good to know when and how to use difference lists, it’s even nicer to understand the connections to folds and monoids. Let’s re-examine our difference-list-based reverse:

The only places we actually construct any lists are the empty list, [], at the end of line 2, and the singleton difference list, (x:), at the end of line 6. Let’s abstract over those, calling them z and f x respectively:

We can then generalise the type without touching the implementation, and arrive at a familiar function:
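A sketch of the generalised function. The body is the difference-list reverse with [] replaced by z and (x:) replaced by f x. The name foldlFlipped is hypothetical, chosen because f takes its arguments in the opposite order to the Prelude’s foldl.

```haskell
-- With b specialised to [a], f = (:) and z = [], this is reverse again.
-- The name foldlFlipped is an assumption made for this sketch.
foldlFlipped :: (a -> b -> b) -> b -> [a] -> b
foldlFlipped f z xs = go xs z
  where
    go []     = id
    go (y:ys) = go ys . f y
```

For example, foldlFlipped (:) [] recovers reverse, and foldlFlipped (+) 0 sums a list.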

The type of argument f is flipped with respect to the Prelude version, but the meaning is the same. So we can say that foldl is just reverse with the constructors abstracted.
Similarly, we can say that foldr is just a list traversal with the constructors abstracted. In fact, we can get the right-fold on Tree using the same generalisation on flatten. For lists, we can get foldr by simply flipping the function composition in line 6:
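A sketch of the flipped version, under the same assumptions as the left fold. The name foldrViaGo is hypothetical.

```haskell
-- Identical to the left fold's worker except that the composition in
-- the second case is the other way round.
-- The name foldrViaGo is an assumption made for this sketch.
foldrViaGo :: (a -> b -> b) -> b -> [a] -> b
foldrViaGo f z xs = go xs z
  where
    go []     = id
    go (y:ys) = f y . go ys
```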

To make precise the comparison between the two versions of each of our examples, we need to recognise that we are using the corresponding operations of two different monoids, the list monoid and the Endo monoid:
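Rather than redefine the instances, this sketch uses the ones that ship with base (Endo lives in Data.Monoid) and shows the same expression evaluated in each monoid. The names listSide and endoSide are hypothetical.

```haskell
import Data.Monoid (Endo (..))

-- List monoid: mempty is [], and (<>) is (++).
listSide :: [Int]
listSide = ([1] <> [2]) <> mempty

-- Endo monoid: mempty is Endo id, and (<>) composes the wrapped functions.
endoSide :: Endo [Int]
endoSide = (Endo (1 :) <> Endo (2 :)) <> mempty
```

Applying endoSide to the empty list with appEndo gives the same list as listSide.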

Notice that in the Endo monoid, we’ve generalised from the type [a] -> [a] that we used for difference lists, to b -> b, just as we did when we generalised from reverse to foldl.
We also need a way to inject a single element into a monoid. We use the unit method of the Reducer class, from the reducers package. We can then rewrite reverse and flatten like this:

If we treat the result as a list, we get the inefficient versions:

But if we add an appropriate wrapper, we get the efficient versions:
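A sketch that puts the last three steps together. To stay self-contained it defines a local stand-in for the Reducer class with just a unit method; the real class and instances in the reducers package may differ. The Tree shape and all function names are assumptions.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Data.Monoid (Endo (..))

-- Local stand-in for Reducer: inject one element c into a monoid m.
class Reducer c m where
  unit :: c -> m

instance Reducer a [a] where
  unit x = [x]

instance Reducer a (Endo [a]) where
  unit x = Endo (x :)

-- Assumed tree shape.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Monoid-generic versions: the structure of the naive definitions.
reverseM :: (Monoid m, Reducer a m) => [a] -> m
reverseM []     = mempty
reverseM (x:xs) = reverseM xs <> unit x

flattenM :: (Monoid m, Reducer a m) => Tree a -> m
flattenM Leaf         = mempty
flattenM (Node l x r) = flattenM l <> unit x <> flattenM r

-- Treating the result as a list gives the inefficient versions.
reverseSlow :: [a] -> [a]
reverseSlow = reverseM

flattenSlow :: Tree a -> [a]
flattenSlow = flattenM

-- Choosing Endo [a] and applying to [] gives the efficient versions.
reverseFast :: [a] -> [a]
reverseFast xs = appEndo (reverseM xs) []

flattenFast :: Tree a -> [a]
flattenFast t = appEndo (flattenM t) []
```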

Now that we know about Endo, let’s re-examine foldr:

First let’s rearrange the arguments a little, swapping z and xs:

We can see the type b -> b in lots of places, and we’re using the function id and composition as well. That’s just the Endo monoid, so let’s rewrite like this:
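A sketch of foldr after both steps: z now comes after xs, and the worker builds an Endo value instead of a bare function. The name foldrEndo is hypothetical, and Endo is taken from Data.Monoid.

```haskell
import Data.Monoid (Endo (..))

-- id became mempty, and composition became (<>) on Endo.
-- The wrapper unwraps the final Endo with appEndo.
foldrEndo :: (a -> b -> b) -> [a] -> b -> b
foldrEndo f xs = appEndo (go xs)
  where
    go []     = mempty
    go (y:ys) = Endo (f y) <> go ys
```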

Without touching the implementation, we can generalise the type to any monoid, and rename:

The wrapper is no longer doing anything, so we have just this:
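A sketch of what remains once the wrapper is dropped. It is named foldMapList here only to avoid clashing with the Prelude’s foldMap.

```haskell
-- One equation per list constructor: [] maps to mempty, and (:) maps
-- to (<>) between the image of the head and the fold of the tail.
foldMapList :: Monoid m => (a -> m) -> [a] -> m
foldMapList _ []     = mempty
foldMapList f (x:xs) = f x <> foldMapList f xs
```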

Notice how foldMap follows the structure of the list data type. We can recover foldr by specialising to the Endo monoid, and extracting the result by applying to the argument z:
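A sketch of that specialisation, using the Prelude’s foldMap on lists and Endo from Data.Monoid. The name foldrVia is hypothetical.

```haskell
import Data.Monoid (Endo (..))

-- Map each element x to the endomorphism f x, combine by composition,
-- then apply the combined function to z.
foldrVia :: (a -> b -> b) -> b -> [a] -> b
foldrVia f z xs = appEndo (foldMap (Endo . f) xs) z
```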

To get foldl, we would need a version of Endo in which the monoid append (i.e. the function composition) is reversed. Dual is a monoid transformer which just flips the monoid append operation, <>, with respect to the base monoid:
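A sketch of a left fold built this way, using Dual and Endo from Data.Monoid. It keeps the flipped argument order for f used earlier in the text; the name foldlVia is hypothetical.

```haskell
import Data.Monoid (Dual (..), Endo (..))

-- Dual reverses the order of composition, so the function for the
-- first element is applied first.
foldlVia :: (a -> b -> b) -> b -> [a] -> b
foldlVia f z xs = appEndo (getDual (foldMap (Dual . Endo . f) xs)) z
```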

What about Tree flattening? Well, Tree has a foldMap, too:
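A sketch for an assumed Tree with empty leaves and one element per node. The name foldMapTree is hypothetical.

```haskell
-- Assumed tree shape; the post's own Tree declaration may differ.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- One equation per constructor, combining left subtree, element and
-- right subtree in order.
foldMapTree :: Monoid m => (a -> m) -> Tree a -> m
foldMapTree _ Leaf         = mempty
foldMapTree f (Node l x r) = foldMapTree f l <> f x <> foldMapTree f r
```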

Again, notice how foldMap follows the structure of the Tree data type. We could reuse the same definitions of foldr and foldl above, just by changing the types.
In fact, we’ve just reinvented the Foldable type class from the base package:

Just by implementing foldMap for your Foldable data types, you get foldl and foldr for free. This means you can use whichever fold is most appropriate for your situation. For example:
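A sketch of how this might look. The names first and last come from the text that follows; their definitions here, and the Tree shape, are assumptions.

```haskell
import Prelude hiding (last)

-- Assumed tree shape; the post's own Tree declaration may differ.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Only foldMap is defined; foldr and foldl come from the class defaults.
instance Foldable Tree where
  foldMap _ Leaf         = mempty
  foldMap f (Node l x r) = foldMap f l <> f x <> foldMap f r

-- First element, via a right fold that ignores the rest.
first :: Foldable t => t a -> Maybe a
first = foldr (\x _ -> Just x) Nothing

-- Last element, via a left fold that ignores what came before.
last :: Foldable t => t a -> Maybe a
last = foldl (\_ x -> Just x) Nothing
```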

What’s more, if your foldMap follows the structure of your data type, foldl and foldr will both be maximally lazy and efficient. For example, first is O(1) on lists, including infinite lists, and O(log n) on Data.Set, while last is O(n) on lists, but still O(log n) on Data.Set.
When defining a term language and one or more interpreters, the free monad makes it easy to write computations in the term language without reference to any specific interpretation.
Janis Voigtländer showed how to improve the asymptotic complexity of computations over free monads. The method for free monad computations is analogous to the difference-list transformation for list constructions, but operating in a different category. Whereas list types are free monoid objects in the category Hask, free monads are the corresponding objects in the endofunctor category.
Following Voigtländer, we can write our extract function with a more restrictive rank-2 type:
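A sketch of such a signature. It again uses a local stand-in for the Reducer class rather than the one from the reducers package, so the class and instance shown here are assumptions.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE RankNTypes            #-}

import Data.Monoid (Endo (..))

-- Local stand-in for Reducer: inject one element c into a monoid m.
class Reducer c m where
  unit :: c -> m

instance Reducer a (Endo [a]) where
  unit x = Endo (x :)

-- The argument must work for every monoid m that can absorb elements
-- of type a, so the caller cannot build it from arbitrary functions.
-- extract then picks Endo [a] and applies the result to [].
extract :: (forall m. (Monoid m, Reducer a m) => m) -> [a]
extract m = appEndo m []
```

For example, extract (unit 'a' <> unit 'b') type checks, whereas passing Endo reverse does not, because the argument is not allowed to know that m is Endo.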

This ensures that the computed list is in fact a difference list, by restricting the construction to mempty, (<>) and unit.
main()? Why? References, please!

I have written the program so that it needs no #include directives, and therefore you can be sure there is not a single typedef, decltype or auto specifier anywhere in or out of sight. That means there’s only one way that so-called universal references can arise.
However, you might find one or two other little surprises. Oh, and Clang 3.3 and GCC 4.8.1 don’t even agree on this program, so there’s not much point in cheating! I know where I’m putting my money, though…
I’ve had loads of fun helping organise BFPG, and I’m still enjoying it. So why quit? On a personal level, I do have a bunch of projects that need more focus. But that’s not the only reason.
This decision is largely due to two excellent people who joined the team within the last year: Ben Kolera and Katie Miller. I’m really proud of both Ben and Katie, having watched them rapidly grow from nervous first-time presenters into confident organisers and mentors, and of course, good friends. I’m now keen to see where they choose to take the group.
To Ben and Katie, congratulations and best wishes. You’ve earned this!
Lots of people and organisations have contributed their time, energy and hard-earned cash to make BFPG such an enjoyable group. Thanks to all of you!
To Tom Adams, for starting this thing, and for ongoing support.
To OJ Reeves, for building community, welcoming many new members, and organising and personally sponsoring some spectacular events. You’re a tough act to follow, so I hope I’ve done you proud!
To Tony Morris, for lots of interesting talks, and for a commitment to sharing knowledge and experience.
To Nick Partridge, for mentoring new speakers, and being a fountain of ideas. To Kristian Domagala, also for mentoring, and for bringing samples of some very tasty brews.
To Rob Manthey, for paying for a Vimeo Plus subscription, and for shooting, editing and uploading many hours of talk videos.
To Microsoft, Ventyx and Suncorp for making great venues available to us, and to Charles O’Farrell, Jason Stevens, Richard Glew, Steven McCormick, Frank Valks and others, for organising access and equipment, and taking responsibility for our use of those venues.
To iseek Communications, for shouting us pizza every month, and to Chris MacKay for organising delivery. Also to Tom Wilson and ThoughtWorks for doing the same in the past.
To YOW! conference organisers, past and present, including Dave Thomas and Craig Smith for reaching out to local groups, for bringing the first ever Lambda Jam to Brisbane, for giving us great group discounts, and for allowing us to borrow some fantastic speakers for our own events.
And of course, thanks to everyone who has given a talk, or turned up to listen and discuss.
To everyone in BFPG, hopefully you can see the pattern: no-one can do this forever! I’m joining a growing group of organiser alumni. As much as we all still enjoy the group, we’ve all inevitably reached the point where other things demand our attention, or where we realise that other people have fresher ideas than we do about how to grow and evolve the group. This is a good thing! But one of the reasons it has worked is that there has always been plenty of overlap.
Ben and Katie are doing a great job, but don’t take them for granted. If you enjoy the group, give them your support, and get involved now.
Based on my experience, by far the best way to get involved is to make a commitment to give a talk. As you negotiate a topic and content, and possibly do a practice run or two, you’ll get to know the team and how it works, and if you want further involvement, it will naturally evolve from there.
Preparing a talk, especially your first, is a fair bit of work, but BFPG is an unusually safe and supportive place to start, and you’ll get a lot more out than you ever put in.
You can download the video as MP4 or WebM, as well as the slides I used for my YLJ13 talk, in PDF or Keynote. There’s some code on GitHub. Sorry, no subtitles yet, but you can read along with this script. Also on YouTube.
Thanks to Tony Morris, Greg Davis and Clinton Freeman for giving me the idea. Thanks to everyone else for not giving me too much shit about my noisy hole.
You can download the video as MP4 with embedded subtitles, WebM with separate subtitles, or find it on YouTube.