The Universal Properties of Map, Fold, and Filter

A lot of people who like functional programming often give the reason that the functional style is simply more elegant than the imperative style. When compelled or inspired to explain (as I did in my old post, How I Learned to Love Functional Programming), they often point to the three “higher-order” functions map, fold, and filter, as providing a unifying framework for writing and reasoning about programs. But how unifying are they, really? In this post we’ll give characterizations of these functions in terms of universal properties, and we’ll see that in fact fold is the “most” universal of the three, and its natural generalization gives a characterization of transformations of standard compound data types.

By being universal or having a universal property, we don’t mean that map, fold, and filter are somehow mystically connected to all programming paradigms. That might be true, but it’s not the point of this article. Rather, by saying something has a universal property we are making a precise mathematical statement that it is either an initial or final object (or the unique morphism determined by such) in some category.

That means that, as a fair warning to the reader, this post assumes some basic knowledge of category theory, but no more than what we’ve covered on this blog in the past. Of particular importance for the first half of this post is the concept of a universal property, and in the followup post we’ll need some basic knowledge of functors.

Map, Filter, and Fold

Recalling their basic definitions, map is a function which accepts as input a list $ L$ whose elements all have the same type (call it $ A$), and a function $ f$ which maps $ A$ to another type $ B$. Map then applies $ f$ to every entry of $ L$ to produce a new list whose entries all have type $ B$.

In most languages, implementing the map function is an elementary exercise. Here is one possible definition in ML.

fun map(_, nil) = nil
  | map(f, (head::tail)) = f(head) :: map(f, tail)

Next, filter takes as input a boolean-valued predicate $ p : A \to \textup{bool}$ on types $ A$ and a list $ L$ whose entries have type $ A$, and produces a list of those entries of $ L$ which satisfy the predicate. Again, it’s implementation in ML might be:

fun filter(_, nil) = nil
  | filter(p, (head::tail)) = if p(head)
                                 then (head :: filter(p, tail))
                                 else filter(p, tail)

Finally, fold is a function which “reduces” a list $ L$ of entries with type $ A$ down to a single value of type $ B$. It accepts as input a function $ f : A \times B \to B$, and an initial value $ v \in B$, and produces a value of type $ B$ by recursively applying $ f$ as follows:

fun fold(_, v, nil) = v
  | fold(f, v, (head::tail)) = f(head, fold(f, v, tail))

(If this definition is mysterious, you’re probably not ready for the rest of this post.)

One thing that’s nice about fold is that we can define map and filter in terms of it:

fun map(f, L) = fold((fn x, xs => (f(x):xs)), [], L)
fun filter(p, L) = fold((fn x, xs => if p(x) then x:xs else xs end), [], L)

We’ll see that this is no accident: fold happens to have the “most general” universality of the three functions, but as a warm-up and a reminder we first investigate the universality of map and filter.

Free Monoids and Map

Map is the easiest function of the three to analyze, but to understand it we need to understand something about monoids. A monoid is simple to describe, it’s just a set $ M$ with an associative binary operation denoted by multiplication and an identity element for that operation.

monoid homomoprhism between two monoids $ M, N$ is a function $ f: M \to N$ such that $ f(xy) = f(x)f(y)$. Here the multiplication on the left hand side of the equality is the operation from $ M$ and on the right it’s the one from $ N$ (this is the same definition as a group homomorphism). As is to be expected, monoids with monoid homomorphisms form a category.

We encounter simple monoids every day in programming which elucidate this definition. Strings with the operation of string concatenation form a monoid, and the empty string acts as an identity because concatenating a string to the empty string has no effect. Similarly, lists with list concatenation form a monoid, where the identity is the empty list. A nice example of a monoid homomorphism is the length function. We know it’s a homomorphism because the length of a concatenation of two lists is just the sum of the lengths of the two lists.

Integers also form a monoid, with addition as the operation and zero as the identity element. However, the list and string monoids have an extra special property that integers do not. For a number $ n$ you can always find $ -n$ so that $ n + (-n) = 0$ is the identity element. But for lists, the only way to concatenate two lists and get the empty list is if both of the lists were empty to begin with. A monoid with this property is called free, and to fully understand it we should make a definition which won’t seem to mean anything at first.

Definition: Let $ \mathbf{C}$ be a category. Given a set $ A$, the free object over $ A$, denoted $ F(A)$, is an object of $ \mathbf{C}$ which is universal with respect to set-maps $ A \to B$ for any object $ B$ in $ \mathbf{C}$.

As usual with a definition by universal property, we should elaborate as to exactly what’s going on. Let $ \mathbf{C}$ be a category whose objects are sets and let $ A$ a set, possibly not in this category. We can construct a new category, $ \mathbf{C}_{A \to X}$ whose objects are set-maps

$ \displaystyle f: A \to X$

and whose morphisms are commutative diagrams of the form

free-object-cat-morphism

where $ \varphi$ is a morphism in $ \mathbf{C}$. In other words, we simply require that $ \varphi(f_1(a)) = f_2(a)$ for every $ a \in A$.

By saying that the free monoid on $ A$ satisfies this universal property for the category of monoids, we really mean that it is initial in this category of set-maps and commutative diagrams. That is, there is a monoid $ F(A)$ and a set-map $ i_A: A \to F(A)$ so that for every monoid $ Y$ and set-map $ f: A \to Y$ there is a unique monoid homomorphism from $ i_A$ to $ f$. In other words, there is a unique monoid homomorphism $ \varphi$ making the following diagram commute:

free-object-univ-propFor example, if $ A$ is the set of all unicode characters, then $ F(A)$ is precisely the monoid of all strings on those characters. The map $ i_A$ is then the map taking a character $ a$ to the single-character string “$ a$”. More generally, if $ T$ is any type in the category of types and computable functions, then $ F(T)$ is the type of homogeneous lists whose elements have type $ T$.

We will now restrict our attention to lists and types, and we will denote the free (list) monoid on a type $ A$ as $ [A]$. The universal property of map is then easy to describe, it’s just a specific instance of this more general universal property for the free monoids. Specifically, we work in the sub-category of homogeneous list monoids (we only allow objects which are $ [T]$ for some type $ T$). The morphism $ i_A: A \to [A]$ just takes a value $ a$ of type $ A$ and spits out a single-item list $ [a]$. We might hope that for any mapping $ A \to [B]$, there is a unique monoid homomorphism $ [A] \to [B]$ making the following diagram commute.

bad-map

And while this is true, the diagram lies because “map” does not achieve what $ \varphi$ does. The map $ f$ might send all of $ A$ to the empty list, and this would cause $ \varphi$ to be the trivial map. But if we decomposed $ f$ further to require it to send $ a$ to a single $ b$, such as

still-bad-map

then things work out nicely. In particular, specifying a function $ f: A \to B$ uniquely defines how a list homomorphism operates on lists. In particular, the diagram forces the behavior for single-element lists: $ [a] \mapsto [f(a)]$. And the property of $ \varphi$ being a monoid homomorphism forces $ \varphi$ to preserve order.

Indeed, we’ve learned from this analysis that the structure of the list involved is crucial in forming the universal property. Map is far from the only computable function making the diagram commute, but it is clearly the only monoid homomorphism. As we’ll see, specifying the order of operations is more generally described by fold, and we’ll be able to restate the universal property of map without relying on monoid homomorphisms.

Filter

The filter function is a small step up in complexity from map, but its universal property is almost exactly the same. Again filter can be viewed through the lens of a universal property for list monoids, because filter itself takes a predicate and produces a list monoid. We know this because filtering two lists by the same predicate and concatenating them is the same as concatenating them first and then filtering them. Indeed, the only difference here is that the diagram now looks like this

filter-bad

where $ p$ is our predicate, and $ j_A$ is defined by $ (a, T) \mapsto [a]$ and $ (a,F) \mapsto []$. The composition $ j_A \circ p$ gives the map $ A \to [A]$ that we can extend uniquely to a monoid homomorphism $ [A] \to [A]$. We won’t say any more on it, except to mention that again maintaining the order of the list is better described via fold.

Fold, and its Universal Property

Fold is different from map and filter in a very concrete way. Even though map and filter do specify that the order of the list should be preserved, it’s not an important part of their definition: filter should filter, and map should map, but no interaction occurs between different entries during the computation. In a sense, we got lucky that having everything be monoid homomorphisms resulted in preserving order without our specific intention.

Because it doesn’t necessarily produce lists and the operation can be quite weird, fold cannot exist without an explicit description of the order of operations. Let’s recall the definition of fold, and stick to the “right associative” order of operations sometimes specified by calling the function “foldr.” We will use foldr and fold interchangeably.

fun foldr(_, v, nil) = v
  | foldr(f, v, (head::tail)) = f(head, foldr(f, v, tail))

Even more, we can’t talk about foldr in terms of monoids. Even after fixing $ f$ and $ v$, foldr need not produce a monoid homomorphism. So if we’re going to find a universal property of foldr, we’ll need a more general categorical picture.

One first try would be to view foldr as a morphism of sets

$ \displaystyle B \times \textup{Hom}(A \times B, B) \to \textup{Hom}([A], B)$

The mapping is just $ f,b \mapsto \textup{foldr } f \textup{ } b$, and this is just the code definition we gave above. One might hope that this mapping defines an isomorphism in the category of types and programs, but it’s easy to see that it does not. For example, let

$ A = \left \{ 1 \right \}, B = \left \{ 1,2 \right \}$

Then the left hand side of the mapping above is a set of size 8 (there are eight ways to combine a choice of element in $ B$ with a map from $ A \times B \to B$). But the right hand size is clearly infinite. In fact, it’s uncountably infinite, though not all possible mappings are realizable in programs of a reasonable length (in fact, very few are). So the morphism can’t possibly be a surjective and hence is not an isomorphism.

So what can we say about fold? The answer is a bit abstract, but it works out nicely.

Say we fix a type for our lists, $ A$. We can define a category $ \mathbf{C}_A$ which has as objects the following morphisms

fold objects

By $ 1$ we mean the type with one value (null, if you wish), and $ f$ is a morphism from a coproduct (i.e. there are implicit parentheses around $ A \times B$). As we saw in our post on universal properties, a morphism from a coproduct is the same thing as a pair of functions which operates on each piece. Here one operates on $ 1$ and the other on $ A \times B$. Since morphisms from $ 1$ are specified by the image of the sole value $ f(1) = b$, we will often write such a function as $ b \amalg h$, where $ h: A \times B \to B$.

Now the morphisms in $ \mathbf{C}_A$ are the pairs of morphisms $ \varphi, \overline{\varphi}$ which make the following diagram commute:

fold morphimsHere we write $ \overline{\varphi}$ because it is determined by $ \varphi$ in a canonical way. It maps $ 1 \to 1$ in the only way one can, and it maps $ (a,b) \mapsto (a, \varphi(b))$. So we’re really specifying $ \varphi$ alone, but as we’ll see it’s necessary to include $ \overline{\varphi}$ as well; it will provide the “inductive step” in the definition of fold.

Now it turns out that fold satisfies the universal property of being initial in this category. Well, not quite. We’re saying first that the following object $ \mathscr{A}$ is initial,

initial object fold

Where cons is the list constructor $ A \times [A] \to [A]$ which sends $ (a_0, [a_1, \dots, a_n]) \mapsto [a_0, a_1, \dots, a_n]$. By being initial, we mean that for any object $ \mathscr{B}$ given by the morphism $ b \amalg g: 1 \coprod A \times B \to B$, there is a unique morphism from $ \mathscr{A} \to \mathscr{B}$. The “fold” is precisely this unique morphism, which we denote by $ (\textup{fold g, b})$. We implicitly know its “barred” counterpart making the following diagram commute.

fold-univ-propertyThis diagram has a lot going on in it, so let’s go ahead and recap. The left column represents an object $ \mathscr{A}$ we’re claiming is initial in this crazy category we’ve made. The right hand side is an arbitrary object $ \mathscr{B}$, which is equivalently the data of an element $ b \in B$ and a mapping $ g: A \times B \to B$. This is precisely the data needed to define fold. The dashed lines represent the unique morphisms making the diagram commute, whose existence and uniqueness is the defining property of what it means for an object to be initial in this category. Finally, we’re claiming that foldr, when we fix $ g$ and $ b$, makes this diagram commute, and is hence the very same unique morphism we seek.

To prove all of this, we need to first show that the object $ \mathscr{A}$ is initial. That is, that any two morphisms we pick from $ \mathscr{A} \to \mathscr{B}$ must be equal. The first thing to notice is that the two objects $ 1 \coprod A \times [A]$ and $ [A]$ are really the same thing. That is, $ \textup{nil} \amalg \textup{cons}$ has a two-sided inverse which makes it an isomorphism in the category of types. Specifically, the inverse is the map $ \textup{split}$ sending

$ \textup{nil} \mapsto \textup{nil}$
$ [a_1, \dots, a_n] \mapsto (a_1, [a_2, \dots, a_n])$

So if we have a morphism $ \varphi, \overline{\varphi}$ from $ \mathscr{A} \to \mathscr{B}$, and the diagram commutes, we can see that $ \varphi = (b \amalg g) \circ \overline{\varphi} \circ \textup{split}$. We’re just going the long way around the diagram.

Supposing that we have two such morphisms $ \varphi_1, \varphi_2$, we can prove they are equal by induction on the length of the input list. It is trivial to see that they both must send the empty list to $ b$. Now suppose that for lists of length $ n-1$ the two are equal. Given a list $ [a_1, \dots, a_n]$ we see that

$ \displaystyle \varphi_1([a_1, \dots, a_n]) = \varphi_1 \circ \textup{cons} (a_1, [a_2, \dots, a_n]) = g \circ \overline{\varphi_1} (a_1, [a_2, \dots, a_n])$

where the last equality holds by the commutativity of the diagram. Continuing,

$ \displaystyle g \circ \overline{\varphi_1} (a_1, [a_2, \dots, a_n]) = g (a_1, \varphi_1([a_2, \dots, a_n])) = g(a_1, \varphi_2(a_2, \dots, a_n))$

where the last equality holds by the inductive hypothesis. From here, we can reverse the equalities using $ \varphi_2$ and it’s “barred” version to get back to $ \varphi_2([a_1, \dots, a_n])$, proving the equality.

To show that fold actually makes the diagram commute is even simpler. In order to make the diagram commute we need to send the empty list to $ b$, and we need to inductively send $ [a_1, \dots, a_n]$ to $ g(a_1, (\textup{fold g b})([a_2, \dots, a_n]))$, but this is the very definition of foldr!

So we’ve established that fold has this universal property. It’s easy now to see how map and filter fit into the picture. For mapping types $ A$ to $ C$ via $ f$, just use the type $ [C]$ in place of $ B$ above, and have $ g(a, L) = \textup{cons}(f(a), L)$, and have $ b$ be the empty list. Filter similarly just needs a special definition for $ g$.

A skeptical reader might ask: what does all of this give us? It’s a good question, and one that shouldn’t be taken lightly because I don’t have an immediate answer. I do believe that with some extra work we could use universal properties to give a trivial proof of the third homomorphism theorem for lists, which says that any function expressible as both a foldr and a foldl can be expressed as a list homomorphism. The proof would involve formulating a universal property for foldl, which is very similar to the one for foldr, and attaching the diagrams in a clever way to give the universal property of a monoid homomorphism for lists. Caveat emptor: this author has not written such a proof, but it seems likely that it would work.

More generally, any time we can represent the requirements of a list computation by an object like $ \mathscr{B}$, we can represent the computation as a foldr. What good does this do us? Well, it might already be obvious when you can and can’t use fold. But in addition to proving when it’s possible to use fold, this new perspective generalizes very nicely to give us a characterization of arbitrary computations on compound data types. One might want to know when to perform fold-like operations on trees, or other sufficiently complicated beasts, and the universal property gives us a way to see when such a computation is possible.

That’s right, I said it: there’s more to the world than lists. Shun me if you must, but I will continue dream of great things.

In an effort to make the egregiously long posts on this blog slightly shorter, we’ll postpone our generalization of the universal property of fold until next time. There we’ll define “initial algebras” and show how to characterize “fold-like” computations any compound data type.

Until then!

Functoriality

Last time we worked through some basic examples of universal properties, specifically singling out quotients, products, and coproducts. There are many many more universal properties that we will mention as we encounter them, but there is one crucial topic in category theory that we have only hinted at: functoriality.

As we’ve repeatedly stressed, the meat of category theory is in the morphisms. One natural question one might ask is, what notion of morphism is there between categories themselves? Indeed, the most straightforward way to see category theoretic concepts in classical mathematics is in a clever choice of functor. For example (and this example isn’t necessary for the rest of the article) one can “associate” to each topological space a group, called the homology group, in such a way that continuous functions on topological spaces translate to group homomorphisms. Moreover, this translation is functorial in the following sense: the group homomorphism associated to a composition is the composition of the associated group homomorphisms. If we denote the association by a subscripted asterisk, then we get the following common formula.

$ \displaystyle (fg)_* = f_* g_*$

This is the crucial property that maintains the structure of morphisms. Again, this should reinforce the idea that the crucial ingredient of every definition in category theory is its effect on morphisms.

Functors: a Definition

In complete generality, a functor is a mapping between two categories which preserves the structure of morphisms. Formally,

Definition: Let $ \mathbf{C}, \mathbf{D}$ be categories. A functor $ \mathscr{F}$ consists of two parts:

  • For each object $ C \in \mathbf{C}$ an associated object $ \mathscr{F}(C) \in \mathbf{D}$.
  • For each morphism $ f \in \textup{Hom}_{\mathbf{C}}(A,B)$ a corresponding morphism $ \mathscr{F}(f) \in \textup{Hom}_{\mathbf{D}}(\mathscr{F}(A), \mathscr{F}(B))$. Specifically, for each $ A,B$ we have a set-function $ \textup{Hom}_{\mathbf{C}}(A,B) \to \textup{Hom}_{\mathbf{C}}(\mathscr{F}(A), \mathscr{F}(B))$.

There are two properties that a functor needs to satisfy to “preserve structure.” The first is that the identity morphisms are preserved for every object; that is, $ \mathscr{F}(1_A) = 1_{\mathscr{F}(A)}$ for every object $ A$. Second, composition must be preserved. That is, if $ f \in \textup{Hom}_{\mathbf{C}}(A,B)$ and $ g \in \textup{Hom}_{\mathbf{C}}(B,C)$, we have

$ \displaystyle \mathscr{F}(gf) = \mathscr{F}(g) \mathscr{F}(f)$

We often denote a functor as we would a function $ \mathscr{F}: \mathbf{C} \to \mathbf{D}$, and use the function application notation as if everything were with sets.

Let’s look at a few simple examples.

Let $ \mathbf{FiniteSet}$ be the poset category of finite sets with subsets as morphisms, and let $ \mathbf{Int}$ be the category whose objects are integers where there is a unique morphism from $ x \to y$ if $ x \leq y$. Then the size function is a functor $ \mathbf{Set} \to \mathbf{Int}$. Continuing with $ \mathbf{Int}$, remember that $ \mathbf{Int}$ forms a group under addition (also known as $ \mathbb{Z}$). And so by its very definition any group homomorphism $ \mathbb{Z} \to \mathbb{Z}$ is a functor from $ \mathbf{Int} \to \mathbf{Int}$. A functor from a category to itself is often called an endofunctor.

There are many examples of functors from the category $ \mathbf{Top}$ of topological spaces to the category $ \mathbf{Grp}$ of groups. These include some examples we’ve seen on this blog, such as the fundamental group and homology groups.

One trivial example of a functor is called the forgetful functor. Let $ \mathbf{C}$ be a category whose objects are sets and whose morphisms are set-maps with additional structure, for example the category of groups. Define a functor $ \mathbf{C} \to \mathbf{Set}$ which acts as the identity on both sets and functions. This functor simply “forgets” the structure in $ \mathbf{C}$. In the realm of programs and types (this author is thinking Java), one can imagine this as a ‘type-cast’ from String to Object.  In the same vein, one could define an “identity” endofunctor $ \mathbf{C} \to \mathbf{C}$ which does absolutely nothing.

One interesting way to think of a functor is as a “realization” of one category inside another. In particular, because the composition structure of $ \mathbf{C}$ is preserved by a functor $ \mathbf{C} \to \mathbf{D}$, it must be the case that all commutative diagrams are “sent” to commutative diagrams. In addition, isomorphisms are sent to isomorphisms because if $ f, g$ are inverses of each other, $ 1_{\mathscr{F}(A)} = \mathscr{F}(1_A) = \mathscr{F}(gf) = \mathscr{F}(g)\mathscr{F}(f)$ and likewise for the reverse composition. And so if we have a functor $ \mathscr{F}$ from a poset category (say, the real numbers with the usual inequality) to some category $ \mathbf{C}$, then we can realize the structure of the poset sitting inside of $ \mathbf{C}$ (perhaps involving only some of the objects of $ \mathbf{C}$). This view comes in handy in a few places we’ll see later in our series on computational topology.

The Hom Functor

There is a very important and nontrivial example called the “hom functor” which is motivated by the category of vector spaces. We’ll stick to the concrete example of vector spaces, but the generalization to arbitrary categories is straightforward. If the reader knows absolutely nothing about vector spaces, replace “vector space” with “object” and “linear map” with “morphism.” It won’t quite be correct, but it will get the idea across.

To each vector space $ V$ one can define a dual vector space of functions $ V \to \mathbb{C}$ (or whatever the field of scalars for $ V$ is). Following the lead of hom sets, the dual vector space is denoted $ \textup{Hom}(V,\mathbb{C})$. Here the morphisms in the set are those from the category of vector spaces (that is, linear maps $ V \to \mathbb{C}$). Indeed, this is a vector space: one can add two functions pointwise ($ (f+g)(v) = f(v) + g(v)$) and scale them ($ (\lambda f)(v) = \lambda f(v)$), and the properties for a vector spaces are trivial to check.

Now the mapping $ \textup{Hom}(-, \mathbb{C})$ which takes $ V$ and produces $ \textup{Hom}(V, \mathbb{C})$ is a functor called the hom functor. But let’s inspect this one more closely. The source category is obviously the category of vector spaces, but what is the target category? The objects are clear: the hom sets $ \textup{Hom}(V,\mathbb{C})$ where $ V$ is a vector space. The morphisms of the category are particularly awkward. Officially, they are written as

$ \displaystyle \textup{Hom}(\textup{Hom}(V, \mathbb{C}), \textup{Hom}(W, \mathbb{C}))$

so a morphism in this category takes as input a linear map $ V \to \mathbb{C}$ and produces as output one $ W \to \mathbb{C}$. But what are the morphisms in words we can understand? And how can we compose them? Before reading on, think about what a morphism of morphisms should look like.

Okay, ready?

The morphisms in this category can be thought of as linear maps $ W \to V$. More specifically, given a morphism $ \varphi: V \to \mathbb{C}$ and a linear map $ g: W \to V$, we can construct a linear map $ W \to \mathbb{C}$ by composing $ \varphi g$.

And so if we apply the $ \textup{Hom}$ functor to a morphism $ f: W \to V$, we get a morphism in $ \textup{Hom}(\textup{Hom}(V, \mathbb{C}), \textup{Hom}(W, \mathbb{C}))$. Let’s denote the application of the hom functor using an asterisk so that $ f \mapsto f_*$.

But wait a minute! The mapping here is going in the wrong direction: we took a map in one category going from the $ V$ side to the $ W$ side, and after applying the functor we got a map going from the $ W$ side ($ \textup{Hom}(W, \mathbb{C})$) to the $ V$ side ($ \textup{Hom}(V, \mathbb{C})$). It seems there is no reasonable way to take a map $ V \to \mathbb{C}$ and get a map in $ W \to \mathbb{C}$ using just $ f$, but the other way is obvious. The hom functor “goes backward” in a sense. In other words, the composition property for our “functor” makes the composite $ (g f)_*$ to the map taking $ \varphi$ to $ \varphi g f$. On the other hand, there is no way to compose $ g_* f_*$, as they operate on the wrong domains! It must be the other way around:

$ \displaystyle (gf)_* = f_* g_*$

We advise the reader to write down the commutative diagram and trace out the compositions to make sure everything works out. But this is a problem, because it makes the hom functor fail the most important requirement. In order to fix this reversal “problem,” we make the following definition:

Definition: A functor $ \mathscr{F} : \mathbf{C} \to \mathbf{D}$ is called covariant if it preserves the order of morphism composition, so that $ \mathscr{F}(gf) = \mathscr{F}(g) \mathscr{F}(f)$. If it reverses the order, we call it contravariant.

And so the hom functor on vector spaces is a contravariant functor, while all of the other functors we’ve defined in this post are covariant.

There is another way to describe a contravariant functor as a covariant functor which is often used. It involves the idea of an “opposite” category. For any category $ \mathbf{C}$ we can define the opposite category $ \mathbf{C}^{\textup{op}}$ to be a category with the same objects as $ \mathbf{C}$, but with all morphisms reversed. That is, we define

$ \displaystyle \textup{Hom}_{\mathbf{C}^{\textup{op}}}(A,B) = \textup{Hom}_{\mathbf{C}}(B,A)$

We leave it to the reader to verify that this is indeed a category. It is also not hard to see that $ (\mathbf{C}^{\textup{op}})^{\textup{op}} = \mathbf{C}$. Opposite categories give us a nice recharacterization of a contrvariant functor. Indeed, because composition in opposite categories is reversed, a contravariant functor $ \mathbf{C} \to \mathbf{D}$ is just a covariant functor on the opposite category $ \mathbf{C}^{\textup{op}} \to \mathbf{D}$. Or equivalently, one $ \mathbf{C} \to \mathbf{D}^{\textup{op}}$. More than anything, opposite categories are syntactical sugar. Composition is only reversed artificially to make domains and codomains line up, but the actual composition is the same as in the original category.

Functors as Types

Before we move on to some code, let’s take a step back and look at the big picture (we’ve certainly plowed through enough details up to this point). The main thesis is that functoriality is a valuable property for an operation to have, but it’s not entirely clear why. Even the brightest of readers can only assume such properties are useful for mathematical analysis. It seems that the question we started this series out with, “what does category theory allow us to do that we couldn’t do before?” still has the answer, “nothing.” More relevantly, the question of what functoriality allows us to do is unclear. Indeed, once again the answer is “nothing.” Rather, functoriality in a computation allows one to analyze the behavior of a program. It gives the programmer a common abstraction in which to frame operations, and ease in proving the correctness of one’s algorithms.

In this light, the best we can do in implementing functors in programs is to give a type definition and examples. And in this author’s opinion this series is quickly becoming boring (all of the computational examples are relatively lame), so we will skip the examples in favor of the next post which will analyze more meaty programming constructs from a categorical viewpoint.

So recall the ML type definition of a category, a tuple of operations for source, target, identity, and composition:

datatype ('object, 'arrow)Category =
    category of ('arrow -> 'object) *
                ('arrow -> 'object) *
                ('object -> 'arrow) *
                ('arrow * 'arrow -> 'arrow)

And so a functor consists of the two categories involved (as types), and the mapping on objects, and the mapping on morphisms.

datatype ('cObject, 'cArrow, 'dObject, 'dArrow)Functor =
   aFunctor of ('cObject, 'cArrow)Category *
               ('cObject -> 'dObject) *
               ('cArrow -> 'dArrow) *
               ('dObject -> 'dArrow)Category

We encourage the reader who is uncomfortable with these type definitions to experiment with them by implementing some of our simpler examples (say, the size functor from sets to integers). Insofar as the basic definitions go, functors are not all that interesting. They become much more interesting when additional structure is imposed on them, and in the distant future we will see a glimpse of this in the form of adjointness. We hope to get around to analyzing statements like “syntax and semantics are adjoint functors.” For the next post in this series, we will take the three beloved functions of functional programming (map, foldl(r), and filter), and see what their categorical properties are.

Until then!

Universal Properties

Previously in this series we’ve seen the definition of a category and a bunch of examples, basic properties of morphisms, and a first look at how to represent categories as types in ML. In this post we’ll expand these ideas and introduce the notion of a universal property. We’ll see examples from mathematics and write some programs which simultaneously prove certain objects have universal properties and construct the morphisms involved.

A Grand Simple Thing

One might go so far as to call universal properties the most important concept in category theory. This should initially strike the reader as odd, because at first glance universal properties are so succinctly described that they don’t seem to be very interesting. In fact, there are only two universal properties and they are that of being initial and final.

Definition: An object $ A$ in a category $ \mathbf{C}$ is called initial if for every object $ B$ there is a unique morphism $ A \to B$. An object $ Z$ is called final if for every object $ B$ there is a unique morphism $ B \to Z$. If an object satisfies either of these properties, it is called universal. If an object satisfies both, it is called a zero object.

In both cases, the existence of a unique morphism is the same as saying the relevant Hom set is a singleton (i.e., for initial objects $ A$, the Hom set $ \textup{Hom}_{\mathbf{C}}(A,B)$ consists of a single element). There is one and only one morphism between the two objects. In the particular case of $ \textup{Hom}(A,A)$ when $ A$ is initial (or final), the definition of a category says there must be at least one morphism, the identity, and the universal property says there is no other.

There’s only one way such a simple definition could find fruitful applications, and that is by cleverly picking categories. Before we get to constructing interesting categories with useful universal objects, let’s recognize some universal objects in categories we already know.

In $ \mathbf{Set}$ the single element set is final, but not initial; there is only one set-function to a single-element set. It is important to note that the single-element set is far from unique. There are infinitely many (uncountably many!) singleton sets, but as we have already seen all one-element sets are isomorphic in $ \mathbf{Set}$ (they all have the same cardinality). On the other hand, the empty set is initial, since the “empty function” is the only set-mapping from the empty set to any set. Here the initial object truly is unique, and not just up to isomorphism.

It turns out universal objects are always unique up to isomorphism, when they exist. Here is the official statement.

Proposition: If $ A, A’$ are both initial in $ \mathbf{C}$, then $ A \cong A’$ are isomorphic. If $ Z, Z’$ are both final, then $ Z \cong Z’$.

Proof. Recall that a mophism $ f: A \to B$ is an isomorphism if it has a two sided inverse, a $ g:B \to A$ so that $ gf = 1_A$ and $ fg=1_B$ are the identities. Now if $ A,A’$ are two initial objects there are unique morphisms $ f : A \to A’$ and $ g: A’ \to A$. Moreover, these compose to be morphisms $ gf: A \to A$. But since $ A$ is initial, the only morphism $ A \to A$ is the identity. The situation for $ fg : A’ \to A’$ is analogous, and so these morphisms are actually inverses of each other, and $ A, A’$ are isomorphic. The proof for final objects is identical. $ \square$

Let’s continue with examples. In the category of groups, the trivial group $ \left \{ 1 \right \}$ is both initial and final, because group homomorphisms must preserve the identity element. Hence the trivial group is a zero object. Again, “the” trivial group is not unique, but unique up to isomorphism.

In the category of types with computable (halting) functions as morphisms, the null type is final. To be honest, this depends on how we determine whether two computable functions are “equal.” In this case, we only care about the set of inputs and outputs, and for the null type all computable functions have the same output: null.

Partial order categories are examples of categories which need not have universal objects. If the partial order is constructed from subsets of a set $ X$, then the initial object is the empty set (by virtue of being a subset of every set), and $ X$ as a subset of itself is obviously final. But there are other partial orders, such as inequality of integers, which have no “smallest” or “largest” objects. Partial order categories which have particularly nice properties (such as initial and final objects, but not quite exactly) are closely related to the concept of a domain in denotational semantics, and the language of universal properties is relevant to that discussion as well.

The place where universal properties really shine is in defining new constructions. For instance, the direct product of sets is defined by the fact that it satisfies a universal property. Such constructions abound in category theory, and they work via the ‘diagram categories’ we defined in our introductory post. Let’s investigate them now.

Quotients

Let’s recall the classical definition from set theory of a quotient. We described special versions of quotients in the categories of groups and topological spaces, and we’ll see them all unified via the universal property of a quotient in a moment.

Definition: An equivalence relation on a set $ X$ is a subset of the set product $ \sim \subset X \times X$ which is reflexive, symmetric, and transitive. The quotient set $ X / \sim$ is the set of equivalence classes on $ X$. The canonical projection $ \pi : X \to X/\sim$ is the map sending $ x$ to its equivalence class under $ \sim$.

The quotient set $ X / \sim$ can also be described in terms of a special property: it is the “largest” set which agrees with the equivalence relation $ \sim$. On one hand, it is the case that whenever $ a \sim b$ in $ X$ then $ \pi(a) = \pi(b)$. Moreover, for any set $ Y$ and any map $ g: X \to Y$ which equates equivalent things ($ g(a) = g(b)$ for all $ a \sim b$), then there is a unique map $ f : X/\sim \to Y$ such that $ f \pi = g$. This word salad is best translated into a diagram.

universal-quotient

Here we use a dashed line to assert the existence of a morphism (once we’ve proven such a morphism exists, we use a solid line instead), and the symbol $ \exists !$ signifies existence ($ \exists$) and uniqueness (!).

We can prove this explicitly in the category $ \mathbf{Set}$. Indeed, if $ g$ is any map such that $ g(a) = g(b)$ for all equivalent $ a,b \in X$, then we can define $ f$ as follows: for any $ a \in X$ whose equivalence class is denoted by $ [a]$ in $ X / \sim$, and define $ f([a]) = g(a)$. This map is well defined because if $ a \sim b$, then $ f([a]) = g(a) = g(b) = f([b])$. It is unique because if $ f \pi = g = h \pi$ for some other $ h: X / \sim \to Y$, then $ h([x]) = g(x) = f([x])$; this is the only possible definition.

Now the “official” way to state this universal property is as follows:

The quotient set $ X / \sim$ is universal with respect to the property of mapping $ X$ to a set so that equivalent elements have the same image.

But as we said earlier, there are only two kinds of universal properties: initial and final. Now this $ X / \sim$ looks suspiciously like an initial object ($ f$ is going from $ X / \sim$, after all), but what exactly is the category we’re considering? The trick to dissecting this sentence is to notice that this is not a statement about just $ X / \sim$, but of the morphism $ \pi$.

That is, we’re considering a diagram category. In more detail: fix an object $ X$ in $ \mathbf{Set}$ and an equivalence relation $ \sim$ on $ X$. We define a category $ \mathbf{Set}_{X,\sim}$ as follows. The objects in the category are morphisms $ f:X \to Y$ such that $ a \sim b$ in $ X$ implies $ f(a) = f(b)$ in $ Y$. The morphisms in the category are commutative diagrams

diagrams-quotient-category

Here $ f_1, f_2$ need to be such that they send equivalent things to equal things (or they wouldn’t be objects in the category!), and by the commutativity of the diagram $ f_2 = \varphi f_1$. Indeed, the statement about quotients is that $ \pi : X \to X / \sim$ is an initial object in this category. In fact, we have already proved it! But note the abuse of language in our offset statement above: it’s not really $ X / \sim$ that is the universal object, but $ \pi$. Moreover, the statement itself doesn’t tell us what category to inspect, nor whether we care about initial or final objects in that category. Unfortunately this abuse of language is widespread in the mathematical world, and for arguably good reason. Once one gets acquainted with these kinds of constructions, reading between the limes becomes much easier and it would be a waste of time to spell it out. After all, once we understand $ X / \sim$ there is no “obvious” choice for a map $ X \to X / \sim$ except for the projection $ \pi$. This is how $ \pi$ got its name, the canonical projection.

Two last bits of terminology: if $ \mathbf{C}$ is any category whose objects are sets (and hence, where equivalence relations make sense), we say that $ \mathbf{C}$ has quotients if for every object $ X$ there is a morphism $ \pi$ satisfying the universal property of a quotient. Another way to state the universal property is to say that all maps respecting the equivalence structure factor through the quotient, in the sense that we get a diagram like the one above.

What is the benefit of viewing $ X / \sim$ by its universal property? For one, the set $ X / \sim$ is unique up to isomorphism. That is, if any other pair $ (Z, g)$ satisfies the same property, we automatically get an isomorphism $ X / \sim \to Z$. For instance, if $ \sim$ is defined via a function $ f : X \to Y$ (that is, $ a \sim b$ if $ f(a) = f(b)$), then the pair $ (\textup{im}(f), f)$ satisfies the universal property of a quotient. This means that we can “decompose” any function into three pieces:

$ \displaystyle X \to X / \sim \to \textup{im}(f) \to Y$

The first map is the canonical projection, the second is the isomorphism given by the universal property of the quotient, and the last is the inclusion map into $ Y$. In a sense, all three of these maps are “canonical.” This isn’t so magical for set-maps, but the same statement (and essentially the same proof) holds for groups and topological spaces, and are revered as theorems. For groups, this is called The First Isomorphism Theorem, but it’s essentially the claim that the category of groups has quotients.

This is getting a bit abstract, so let’s see how the idea manifests itself as a program. In fact, it’s embarrassingly simple. Using our “simpler” ML definition of a category from last time, the constructive proof that quotient sets satisfy the universal property is simply a concrete version of the definition of $ f$ we gave above. In code,

fun inducedMapFromQuotient(setMap(x, pi, q), setMap(x, g, y)) =
   setMap(q, (fn a => g(representative(a))), y)

That is, once we have $ \pi$ and $ X / \sim$ defined for our given equivalence relation, this function accepts as input any morphism $ g$ and produces the uniquely defined $ f$ in the diagram above. Here the “representative” function just returns an arbitrary element of the given set, which we added to the abstract datatype for sets. If the set $ X$ is empty, then all functions involved will raise an “empty” exception upon being called, which is perfectly fine. We leave the functions which explicitly construct the quotient set given $ X, \sim$ as an exercise to the reader.

Products and Coproducts

Just as the concept of a quotient set or quotient group can be generalized to a universal property, so can the notion of a product. Again we take our intuition from $ \mathbf{Set}$. There the product of two sets $ X,Y$ is the set of ordered pairs

$ \displaystyle X \times Y = \left \{ (x,y) : x \in X, y \in Y \right \}$

But as with quotients, there’s much more going on and the key is in the morphisms. Specifically, there are two obvious choices for morphisms $ X \times Y \to X$ and $ X \times Y \to Y$. These are the two projections onto the components, namely $ \pi_1(x,y) = x$ and $ \pi_2(x,y) = y$. These projections are also called “canonical projections,” and they satisfy their own universal property.

The product of sets is universal with respect to the property of having two morphisms to its factors.

Indeed, this idea is so general that it can be formulated in any category, not just categories whose objects are sets. Let $ X,Y$ be two fixed objects in a category $ \mathbf{C}$. Should it exist, the product $ X \times Y$ is defined to be a final object in the following diagram category. This category has as objects pairs of morphisms

product-diagram-object

and as morphisms it has commutative diagrams

product-diagram-morphism

In words, to say products are final is to say that for any object in this category, there is a unique map $ \varphi$ that factors through the product, so that $ \pi_1 \varphi = f$ and $ \pi_2 \varphi = g$. In a diagram, it is to claim the following commutes:

product-universal-property

If the product $ X \times Y$ exists for any pair of objects, we declare that the category $ \mathbf{C}$ has products.

Indeed, many familiar product constructions exist in pure mathematics: sets, groups, topological spaces, vector spaces, and rings all have products. In fact, so does the category of ML types. Given two types ‘a and ‘b, we can form the (aptly named) product type ‘a * ‘b. The canonical projections exist because ML supports parametric polymorphism. They are

fun leftProjection(x,y) = x
fun rightProjection(x,y) = y

And to construct the unique morphism to the product,

fun inducedMapToProduct(f,g) = fn a => (f(a), g(a))

We leave the uniqueness proof to the reader as a brief exercise.

There is a similar notion called a coproduct, denoted $ X \coprod Y$, in which everything is reversed: the arrows in the diagram category go to $ X \coprod Y$ and the object is initial in the diagram category. Explicitly, for a fixed $ X, Y$ the objects in the diagram category are again pairs of morphisms, but this time with arrows going to the central object

coprod-diagram-object

The morphisms are again commutative diagrams, but with the connecting morphism going away from the central object

coprod-diagram-morphism

And a coproduct is defined to be an initial object in this category. That is, a pair of morphisms $ i_1, i_2$ such that there is a unique connecting morphism in the following diagram.

coprod-universal-property

Coproducts are far less intuitive than products in their concrete realizations, but the universal property is no more complicated.  For the category of sets, the coproduct is a disjoint union (in which shared elements of two sets $ X, Y$ are forcibly considered different), and the canonical morphisms are “inclusion” maps (hence the switch from $ \pi$ to $ i$ in the diagram above). Specifically, if we define the coproduct

$ \displaystyle X \coprod Y = (X \times \left \{ 1 \right \}) \cup (Y \times \left \{ 2 \right \})$

as the set of “tagged” elements (the right entry in a tuple signifies which piece of the coproduct the left entry came from), then the inclusion maps $ i_1(x) = (x,1)$ and $ i_2(y) = (y,2)$ are the canonical morphisms.

There are similar notions of disjoint unions for topological spaces and graphs, which are their categories’ respective coproducts. However, in most categories the coproducts are dramatically different from “unions.” In group theory, it is a somewhat complicated object known as the free product. We mentioned free products in our hasty discussion of the fundamental group, but understanding why and where free groups naturally occur is quite technical. Similarly, coproducts of vector spaces (or $ R$-modules) are more like a product, with the stipulation that at most finitely many of the entries of a tuple are nonzero (e.g., formal linear combinations of things from the pieces). Even without understanding these examples, the reader should begin to believe that relatively simple universal properties can yield very useful objects with potentially difficult constructions in specific categories. The ubiquity of the concepts across drastically different fields of mathematics is part of why universal properties are called “universal.”

Luckily, the category of ML types has a nice coproduct which feels like a union, but it is not supported as a “native” language feature like products types are. Specifically, given two types ‘a, ‘b we can define the “coproduct type”

datatype ('a, 'b)Coproduct = left of 'a | right of 'b

Let’s prove this is actually a coproduct: fix two types ‘a and ‘b, and let $ i_1, i_2$ be the functions

fun leftInclusion(x) = left(x)
fun rightInclusion(y) = right(y)

Then given any other pair of functions $ f,g$ which accept as input types ‘a and ‘b, respectively, there is a unique function $ \varphi$ operating on the coproduct type. We construct it as follows.

fun inducedCoproductMap(f,g) =
   let
      theMap(left(a)) = f(a)
      theMap(right(b)) = g(b)
   in
      theMap
   end

The uniqueness of this construction is self-evident. This author finds it fascinating that these definitions are so deep and profound (indeed, category theory is heralded as the queen of abstraction), but their realizations are trivially obvious to the working programmer. Perhaps this is a statement about how well-behaved the category of ML types is.

Continuing On

So far we have seen three relatively simple examples of universal properties, and explored how they are realized in some categories. We should note before closing two things. The first is that not every category has objects with these universal properties. Unfortunately poset categories don’t serve as a good counterexample for this (they have both products and coproducts; what are they?), but it may be the case that in some categories only some pairs of objects have products or coproducts, while others do not.

Second, there are many more universal properties that we haven’t covered here. For instance, the notion of a limit is a universal property, as is the notion of a “free” object. There are kernels, pull-backs, equalizers, and many other ad-hoc universal properties without names. And for every universal property there is a corresponding “dual” property that results from reversing the arrows in every diagram, as we did with coproducts. We will visit the relevant ones as they come up in our explorations.

In the next few posts we’ll encounter functors and the concept of functoriality, and start asking some poignant questions about familiar programmatic constructions.

Until then!

Properties of Morphisms

This post is mainly mathematical. We left it out of our introduction to categories for brevity, but we should lay these definitions down and some examples before continuing on to universal properties and doing more computation. The reader should feel free to skip this post and return to it later when the words “isomorphism,” “monomorphism,” and “epimorphism” come up again. Perhaps the most important part of this post is the description of an isomorphism.

Isomorphisms, Monomorphisms, and Epimorphisms

Perhaps the most important paradigm shift in category theory is the focus on morphisms as the main object of study. In particular, category theory stipulates that the only knowledge one can gain about an object is in how it relates to other objects. Indeed, this is true in nearly all fields of mathematics: in groups we consider all isomorphic groups to be the same. In topology, homeomorphic spaces are not distinguished. The list goes on. The only way to do determine if two objects are “the same” is by finding a morphism with special properties. Barry Mazur gives a more colorful explanation by considering the meaning of the number 5 in his essay, “When is one thing equal to some other thing?” The point is that categories, more than existing to be a “foundation” for all mathematics as a formal system (though people are working to make such a formal system), exist primarily to “capture the essence” of mathematical discourse, as Mazur puts it. A category defines objects and morphisms, but literally all of the structure of a category lies in its morphisms. And so we study them.

The strongest kind of morphism we can consider is an isomorphism. An isomorphism is the way we say two objects in a category are “the same.” We don’t usually care whether two objects are equal, but rather whether some construction is unique up to isomorphism (more on that when we talk of universal properties). The choices made in defining morphisms in a particular category allow us to strengthen or weaken this idea of “sameness.”

Definition: A morphism $ f : A \to B$ in a category $ \mathbf{C}$ is an isomorphism if there exists a morphism $ g: B \to A$ so that both ways to compose $ f$ and $ g$ give the identity morphisms on the respective objects. That is,

$ gf = 1_A$ and $ fg = 1_B$.

The most basic (usually obvious, but sometimes very deep) question in approaching a new category is to ask what the isomorphisms are. Let us do this now.

In $ \mathbf{Set}$ the morphisms are set-functions, and it is not hard to see that any two sets of equal cardinality have a bijection between them. As all bijections have two-sided inverses, two objects in $ \mathbf{Set}$ are isomorphic if and only if they have the same cardinality. For example, all sets of size 10 are isomorphic. This is quite a weak notion of “sameness.” In contrast, there is a wealth of examples of groups of equal cardinality which are not isomorphic (the smallest example has cardinality 4). On the other end of the spectrum, a poset category $ \mathbf{Pos}_X$ has no isomorphisms except for the identity morphisms. The poset categories still have useful structure, but (as with objects within a category) the interesting structure is in how a poset category relates to other categories. This will become clearer later when we look at functors, but we just want to dissuade the reader from ruling out poset categories as uninteresting due to a lack of interesting morphisms.

Consider the category $ \mathbf{C}$ of ML types with ML functions as morphisms. An isomorphism in this category would be a function which has a two-sided inverse. Can the reader think of such a function?

Let us now move on to other, weaker properties of morphisms.

Definition: A morphism $ f: A \to B$ is a monomorphism if for every object $ C$ and every pair of morphisms $ g,h: C \to A$ the condition $ fg = fh$ implies $ g = h$.

The reader should parse this notation carefully, but truly think of it in terms of the following commutative diagram:

monomorphism-diagram

Whenever this diagram commutes and $ f$ is a monomorphism, then we conclude (by definition) that $ g=h$. Remember that a diagram commuting just means that all ways to compose morphisms (and arrive at morphisms with matching sources and targets) result in an identical morphism. In this diagram, commuting is the equivalent of claiming that $ fg = fh$, since there are only two nontrivial ways to compose.

The idea is that monomorphisms allow one to “cancel” $ f$ from the left side of a composition (although, confusingly, this means the cancelled part is on the right hand side of the diagram).

The corresponding property for cancelling on the right is defined identically.

Definition: A morphism $ f: A \to B$ is an epimorphism if for every object $ C$ and every pair of morphism $ g,h: B \to C$ the condition $ gf = hf$ implies $ g = h$.

Again, the relevant diagram.

epimorphism-diagram

Whenever $ f$ is an epimorphism and this diagram commutes, we can conclude $ g=h$.

Now one of the simplest things one can do when considering a category is to identify the monomorphisms and epimorphisms. Let’s do this for a few important examples.

Monos and Epis in Various Categories

In the category $ \mathbf{Set}$, monomorphisms and epimorphisms correspond to injective and surjective functions, respectively. Lets see why this is the case for monomorphisms. Recall that an injective function $ f$ has the property that $ f(x) = f(y)$ implies $ x=y$. With this property we can show $ f$ is a monomorphism because if $ f(g(x)) = f(h(x))$ then the injective property gives us immediately that $ g(x) = h(x)$. Conversely, if $ f$ is a monomorphism and $ f(x) = f(y)$, we will construct a set $ C$ and two convenient functions $ g, h: C \to A$ to help us show that $ x=y$. In particular, pick $ C$ to be the one point set $ \left \{ c \right \}$, and define $ g(c) = x, h(c) = y$. Then as functions $ fg = fh$. Indeed, there is only one value in the domain, so saying this amounts to saying $ f(x) = fg(c) = fh(c) = f(y)$, which we know is true by assumption. By the monomorphism property $ g = h$, so $ x = g(c) = h(c) = y$.

Now consider epimorphisms. It is clear that a surjective map is an epimorphism, but the converse is a bit trickier. We prove by contraposition. Instead of now picking the “one-point set,” for our $ C$, we must choose something which is one element bigger than $ B$. In particular, define $ g, h : B \to B’$, where $ B’$ is $ B$ with one additional point $ x$ added (which we declare to not already be in $ B$). Then if $ f$ is not surjective, and there is some $ b_0 \in B$ which is missed by $ f$, we define $ g(b_0) = x$ and $ g(b) = b$ otherwise. We can also define $ h$ to be the identity on $ B$, so that $ gf = hf$, but $ g \neq h$. So epimorphisms are exactly the surjective set-maps.

There is one additional fact that makes the category of sets well-behaved: a morphism in $ \mathbf{Set}$ is an isomorphism if and only if it is both a monomorphism and an epimorphism. Indeed, isomorphisms are set-functions with two-sided inverses (bijections) and we know from classical set theory that bijections are exactly the simultaneous injections and surjections. A warning to the reader: not all categories are like this! We will see in a moment an example of a nontrivial category in which isomorphisms are not the same thing as simultaneous monomorphisms and epimorphisms.

The category $ \mathbf{Grp}$ is very similar to $ \mathbf{Set}$ in regards to monomorphisms and epimorphisms. The former are simply injective group homomorphisms, while the latter are surjective group homomorphisms. And again, a morphisms is an isomorphism if and only if it is both a monomorphism and an epimorphism. We invite the reader to peruse the details of the argument above and adapt it to the case of groups. In both cases, the hard decision is in choosing $ C$ when necessary. For monomorphisms, the “one-point group” does not work because we are constrained to send the identity to the identity in any group homomorphism. The fortuitous reader will avert their eyes and think about which group would work, and otherwise we suggest trying $ C = \mathbb{Z}$. After completing the proof, the reader will see that the trick is to find a $ C$ for which only one “choice” can be made. For epimorphisms, the required $ C$ is a bit more complex, but we invite the reader to attempt a proof to see the difficulties involved.

Why do these categories have the same properties but they are acquired in such different ways? It turns out that although these proofs seem different in execution, they are the same in nature, and they follow from properties of the category as a whole. In particular, the “one-point object” (a singleton set for $ \mathbf{Set}$ and $ \mathbb{Z}$ for $ \mathbf{Grp}$) is more categorically defined as the “free object on one generator.” We will discuss this more when we get to universal properties, but a “free object on $ n$ generators” is roughly speaking an object $ A$ for which any morphism with source $ A$ must make exactly $ n$ “choices” in its definition. With sets that means $ n$ choices for the images of elements, for groups that means $ n$ consistent choices for images of group elements. On the epimorphism side, the construction is a sort of “disjoint union object” which is correctly termed a “coproduct.” But  momentarily putting aside all of this new and confusing terminology, let us see some more examples of morphisms in various categories.

Our recent primer on rings was well-timed, because the category $ \mathbf{Ring}$ of rings (with identity) is an example of a not-so-well-behaved category. As with sets and groups, we do have that monomorphisms are equivalent to injective ring homomorphisms, but the argument is trickier than it was for groups. It is not obvious which “convenient” object $ C$ to choose here, since maps $ \mathbb{Z} \to R$ yield no choices: 1 maps to 1, 0 maps to 0, and the properties of a ring homomorphism determine everything else (in fact, the abelian group structure and the fact that units are preserved is enough). This makes $ \mathbb{Z}$ into what’s called an “initial object” in $ \mathbf{Ring}$; more on that when we study universal properties. In fact, we invite the reader to return to this post after we talk about the universal property of polynomial rings. It turns out that $ \mathbb{Z}[x]$ is a suitable choice for $ C$, and the “choice” made is where to send the indeterminate $ x$.

On the other hand, things go awry when trying to apply analogous arguments to epimorphisms. While it is true that every surjective ring homomorphism is an epimorphism (it is already an epimorphism in $ \mathbf{Set}$, and the argument there applies), there are ring epimorphisms which are not surjections! Consider the inclusion map of rings $ i : \mathbb{Z} \to \mathbb{Q}$. The map $ i$ is not surjective, but it is an epimorphism. Suppose $ g, h : \mathbb{Q} \to R$ are two parallel ring morphisms, and they agree on $ \mathbb{Z}$ (they will always do so, since there is only one ring homomorphism $ \mathbb{Z} \to R$). Then $ g,h$ must also agree on $ \mathbb{Q}$, because if $ p,q \in \mathbb{Z}$ with $ q \neq 0$, then

$ \displaystyle g(p/q) = g(p)g(q^{-1}) = g(p)g(q)^{-1} = h(p)h(q)^{-1} = h(p/q)$

Because the map above is also an injection, the category of rings is a very naturally occurring example of a category which has morphisms that are both epimorphisms and monomorphisms, but not isomorphisms.

There are instances in which monomorphisms and epimorphisms are trivial. Take, for instance any poset category. There is at most one morphism between any two objects, and so the conditions for an epimorphism and monomorphism vacuously hold. This is an extreme example of a time when simultaneous monomorphisms and epimorphisms are not the same thing as isomorphisms! The only isomorphisms in a poset category are the identity morphisms.

Morals about Good and Bad Categories

The inspection of epimorphisms and monomorphisms is an important step in the analysis of a category. It gives one insight into how “well-behaved” the category is, and picks out the objects which are special either for their useful properties or confounding trickery.

This reminds us of a quote of Alexander Grothendieck, one of the immortal gods of mathematics who popularized the use of categories in mainstream mathematics.

A good category containing some bad objects is preferable to a bad category containing only good objects.

I suppose the thesis here is that the having only “good” objects yields less interesting and useful structure, and that one should be willing to put up with the bad objects in order to have access to that structure.

Next time we’ll jump into a discussion of universal properties, and start constructing some programs to prove that various objects satisfy them.

Until then!