Perhaps primarily due to the prominence of monads in the Haskell programming language, programmers are often curious about category theory. Proponents of Haskell and other functional languages can put category-theoretic concepts on a pedestal or in a mexican restaurant, and their benefits can seem as mysterious as they are magical. For instance, the most common use of a monad in Haskell is to simulate the mutation of immutable data. Others include suspending and backtracking computations, and even untying tangled rope.

Category theory is often mocked (or praised) as the be-all and end-all of mathematical abstraction, and as such (and for other reasons I’ve explored on this blog) many have found it difficult to digest and impossible to master. However, in truth category theory arose from a need for organizing mathematical ideas based on their shared structure. In this post, I want to give a brief overview of what purpose category theory serves within mathematics, and emphasize what direction we’ll be going with it on this blog.

We should also note (writing this after the fact), that this article is meant to be a motivation to our future series on category theory. It is very difficult to explain what category theory is without going into very specific details, but we can explain by analogy what category theory achieves for us. That is the goal of this post.

## Category Theory as a Modern Language

It would be a silly question to ask why we don’t program entirely in binary. It’s a slow, inefficient process prone to mistakes of all sorts, and so we generally avoid it (although a well-rounded programmer can readily work in binary when it is necessary). But once upon a time there was no choice. Eventually people found that certain programmatic constructions were ubiquitous, and the developers of the next generation of languages abstracted these ideas to make types, lists, loops, if statements, and functions. The cycle continued: we found that we needed to further allow programmers to define custom data types, polymorphic operations, protected data, and others. Another iteration and we have list comprehensions, Python decorators, Javascript promises, and a whole host of other programming paradigms that have turned into features [1].

So it is with programming as it is in mathematics. I would digress by detailing the history of numbers as symbols, or of the transition to set-based mathematics in the late 1800′s to early 1900′s. But within the last fifty years there has been a major revolution in the discourse of modern mathematics: the integration of category theory.

I have to contextualize this immediately, because I don’t want to enter a metamathematical discussion about what the proper foundation of all mathematics should be. This is a huge can of worms riddled with strong opinions and logical fallacies. I’m not saying that mathematicians have unanimously agreed that category theory is the correct basis for all logic and mathematics. I wouldn’t claim *that* just as I wouldn’t claim that all programmers agree that object-oriented programming is the best model for all programs, because that would be ridiculous and myopic.

I am saying, however, that enough mathematicians have agreed category theory is useful and convenient. As a result category theory is the contemporary baseline for serious discussion in many fields of pure mathematics. In short, anyone who wants to do serious mathematics in these fields must learn the language of category theory.

One goal of this blog’s category theory series is to gain fair fluency in this modern language.

## Category Theory as an Organizing Principle

As most readers will readily understand, people who study and develop programming languages think differently about language features than people who just use programming languages. For instance, a systems programmer friend of mine was pleased to discover that Python supports multiple variable assignments in a single statement (e.g. a,b = 7, False). Now a language designer might smile and think, “Well that’s just syntactic sugar.” But to a working programmer, this kind of feature is quite convenient and extensible (to one-line variable swapping, multiple return values, etc.). On the other hand, when a language designer claims to have invented a wonderful new paradigm, say, continuations, the working programmer would think, “What good is that for? It’s just an ugly way to write functions!” It’s not until someone uses the feature to do amazing things like eliminate the need for a stack or provide lightweight extensible stack inspection that the working programmer appreciates the abstraction.

Analogously, those who study category theory from a logical viewpoint see it very differently than those who use it to come up with new mathematics. Many would argue it was not until the work of Alexander Grothendieck (in the late 1950′s and 60′s) that non-category-theorists saw the true value of applying category theory to their fields. Grothendieck implemented a sweeping reform of a number of mathematical fields, his most notable stemming from the invention of étale cohomology. Now I’m not a number theorist or an algebraic geometer, but I know enough to understand category theory’s role there.

The main benefit to using category theory is as a way to organize and synthesize information. This is particularly true of the concept of a *universal property*. We will hear more about this in due time, but as it turns out most important mathematical structures can be phrased in terms of universal properties. Moreover, a universal property jumps right to the heart of *why* a construction is important. For example, one new to mathematics might wonder why polynomials are so ubiquitous and important. The answer is (vaguely) that they satisfy a universal property which makes them canonical extensions of certain computations.

I want to make this point very clear, because most newcomers to category theory are never told this. Category theory exists because it fills a *need*. Even if that need is a need for better organization and a refocusing of existing definitions. It was not just an attempt to build higher abstractions, but a successful adaptation of mathematics to a more complex world.

And so as category theory has spread through the mathematical world, more and more definitions are phrased in terms of various universal constructions in special categories. This is a good thing precisely because there are *only two* universal properties. As we’ll see, by stating that an object satisfies a universal property, one immediately understands how the proof will progress, and many properties like uniqueness and invariance will follow trivially. Familiar readers of this blog will remember our posts on groups (and will read future posts on rings and fields), in which we state and prove theorems about quotients and products and isomorphism theorems which are all essentially the same across the various fields. Viewing the problem abstractly in category theory allows one to prove all of these theorems simultaneously, and study the differences between the objects via the properties of the category as a whole.

To reiterate, category theory streamlines the process of making precise technical definitions and proving their well-definition. One hopes, then, that very general theorems proved within category theory can apply to a wide breadth of practical areas. [2]

## Category as a Tool to Gain More Knowledge

When someone invents a new programming tool, the primary goal is usually to allow a programmer to do something that he couldn’t do previously (or was difficult/inconvenient to do). For instance, programmers invented version control to allow for easy project collaboration and rollbacks. Before then, managing multiple versions of a file was a horrendous task.

In mathematics, we can ask the poignant question: what does category theory allow us to do that we couldn’t do before? This should be read as *besides* having a new way to think about mathematical structures and *besides* having a more efficient language for discourse. Of course, this is a highly philosophical question. Could it be that there is some (non-categorical) theorem that can’t be proved unless you resort to category-theoretical arguments? In my optimistic mind the answer must certainly be no. Moreover, it appears that most proofs that “rely” on category theory only really do so because they’re so deeply embedded in the abstraction that unraveling them to find non-category-theoretical proofs would be a tiresome and fruitless process.

In programming we can ask a related question: what insights does category theory give us about programming? Can we write programs better if we resort to organizing things in terms of categories? Would it be easier to prove correctness of our programs or to discover a good algorithm to solve a task?

I think it goes without saying that we certainly can’t do anything that we couldn’t have done before (this would violate the notion that our usual languages are Turing complete). But these other soft questions should have positive answers. While in the previous two sections I gave concrete reasons why one might want to learn category theory, here the reason is very vague. Supposedly, learning category theory makes one a better programmer by forcing one to make connections between structures and computation. Then when a new problem comes along, it becomes easy (almost natural!) to fit it into a categorical organization of the world, and the right solution just “falls into your lap.”

While I don’t necessarily espouse this line of thinking (I believe any mathematics makes you better at analyzing problems), this is essentially the argument for why functional programming is a good thing to learn.

## What We’ll Do With Categories

I don’t necessarily have any amazing applications of category theory in mind for this blog. Instead, I want to develop a fair fluency and categorical organization (the first to sections of this article) among my readers. Along the way, we will additionally *implement* the concepts of category theory in code. This will give us a chance to play with the ideas as we learn, and hopefully will make all of the abstract nonsense much more concrete.

So next time we’ll start with the definition of a category, and give a wealth of examples. Until then!

**[1]** I’m obviously oversimplifying the history of programming languages, but the spirit is true, and the same as for all technological developments: incremental improvements based on a need for convenience and removing repetitive tasks. (back)

**[2]** One place this is particularly convenient is actually in the theory of persistent homology. Though on this blog the plan is to avoid the theory side before we investigate the algorithm from a high-level, once we get to the theory we will see an effective use of category theory in action. (back)

The sentence “Could it be that there is some (non-categorical) theorem that can’t be proved unless you resort to category-theoretical arguments? In my optimistic mind the answer must certainly be no.” worries me a lot. If someone said “Could it be that there is some (non- complex analytic) theorem that can’t be proved unless you resort to complex analytic arguments? In my optimistic mind the answer must certainly be no” pretty much any modern mathematician would laugh and name any number of theorems from number theory alone, yet there was a time when complex numbers were a level of abstraction too far for many people. Category theory has proven itself an extremely useful tool for doing mathematics, in addition to being a convenient way to think about mathematics.

Anyway I look forward to the next post on this.

What I had in mind was more of the basic results: we want to prove that some object has said universal property? One can work with abstract nonsense and chase diagrams, or explicitly construct the object in the category of sets and prove does what it should (say, for dealing with abliean categories where elements make sense). The former is obviously more convenient and perhaps more elegant, but it may not be necessary.

But perhaps I just am foolish enough to believe that there are always more ways to prove a theorem. Have I been lulled into a false sense of security by the fundamental theorem of algebra? Maybe, but I think you’d be hard pressed to find a theorem for which you could definitively conclude has no proof which does not use complex-analytic methods (I’m excluding, of course, theorems that one proves which are inextricably complex-analytic).

But then again I have no idea how one would even go about proving such a claim. I’m not trying to say that category theory or complex analysis aren’t the best way to get their respective jobs done, but to say there is no other way is a dreary outlook to me.

I mean this exactly the sort of thing that the incompleteness theorem guarantees. Statements which are about number theory (or any area of mathematics) which are not provable within number theory. I don’t have any specific examples in mind which are provable within complex analysis and (provably) unprovable within Peano arithmetic (axiomatic elementary number theory), however there are famous examples of natural (i.e. not just made to have this property) theorems provably not provable within PA, the Paris-Huntington theorem is a famous example of such. So I guess my point is we can’t a priori be so sure that a simpler (non-categorical, non-complex analytic, etc) proof of a statement exists.

That’s a very good point. I think I may have been referring too much to set theory, and of course set theory suffers from incompleteness. I’m totally inept in the ways of formal logic, but would I be right to guess that category theory also suffers from incompleteness? Then there’s the obvious question: is category theory a strict strengthening of ZFC? I’m not even entirely sure that incompleteness applies to category theory. I think I meant for my philosophical musings to imply that “non-categorical” could imply presently unknown techniques that may later come along, and I would be very surprised to hear of a theorem that could not be proved with any (possibly future) methods that don’t lie in category theory.

On the other hand, it’s clear these are questions I just spit out and don’t think through.

Yes, category theory still suffers from incompleteness (any sufficiently strong formal logic will). If you are interested, Yanofsky wrote a fascinating paper (based on one by Lawvere — I don’t really understand either, not being well-versed in category theory) that develops several classical paradoxes, including Godel’s First Incompleteness Theorem, in the language of categories.

Though I should note that that paper isn’t so much about incompleteness of category theory, as it is proving incompleteness using category theoretical language. Just thought it might be interesting.

(hopefully this shows up in the right place)

Asking whether category theory suffers from incompleteness is a bit strange… The category axioms, like the group axioms define a class of structures, rather than a single “theory”. Any category is a model of these axioms, so rather trivially, category theory is incomplete.

On the other hand, asking whether certain categorical foundations (E.g., ETCS or HoTT) are complete is sensible, but the answer had better be “no.”: the Incompleteness Theorem says that if a theory can do arithmetic, it is incomplete.

Counter-intuitively, these theories are almost always weaker than ZFC from a pure “what is provable?” standpoint. (ETCS can easily be strengthened to be “equivalent” to ZFC). This is actually really obvious once you realize that pure category theory can’t prove or disprove strict equality; we take it for granted that “equality” means “up to canonical isomorphism”, and we are literally lying when we say “THE limit”. The thing is, this weakness buys us expressiveness– ZFC is cumbersome to work with structurally, precisely because it is so concrete, so rigid.

Something else I’d like to point out is that HoTT (univalent foundations) is really not possible without category theory; the central result (which says that identity types behave like higher homotopies) is inexpressible without categorical language, and requires a categorical perspective to see.

One aspect of category theory that deserves mentioning is that categories (and higher categories) give a much more satisfying answer to “when are two things ‘the same’?”

In set theory, equality ends up too strict. You cannot distinguish the circle group and the group of unit complex numbers through group homomorphisms. You can’t distinguish pairs of real numbers from the complex plane using real-valued linear maps. You can’t distinguish the real line and an open interval using continuous maps. You can’t distinguish a disk from a point using homotopies.

In any modern branch of mathematics, we end up studying not equality “on the nose”. Doing so makes unnecessary distinctions. What we want, instead, is to consider spaces which are “essentially” equal. That is, they contain the same information given the structure they are endowed with.

This is the notion of isomorphism.

In category theory, we prefer things to only worry about things “up to isomorphism”. Universal properties, the bread and butter of category theory, don’t even uniquely define objects. They define them merely up to isomorphism.

Depending on your foundations, objects can still be “strictly equal” in categories. However, there’s a strong preference to ignore this fact. Making use of strict equality is humorously referred to as “evil” in category theory. And some foundations seek to do away with the notion of strict equality entirely. This is essentially the principle behind the infamous Univalence Axiom: isomorphism *is* equality.

This question of equality also motivates some of the research in the subject. In basic category theory, objects can be isomorphic, but morphisms have no such weak notion of equality — they are always strictly equal. This gets you into trouble when you talk about the category of categories. This category has not just objects (with weak “isomorphic” equality) and morphisms (with only strict equality), but also natural transformations (morphisms between morphisms).

You end up with categories which are “essentially the same”, but which aren’t equal because the morphisms aren’t equal in a strict sense. Higher category theory gives each layer of the cake its own happy notion of weak equality.

For readers who didn’t quite get that, here’s a silly analogy. When driving, no one obeys the speed limit *all the time*. And no cop pulls you over for speeding when they catch you doing 5mph over. What’s important isn’t that the law is respected to the letter — what *is* important is that you don’t cause too much trouble. Isomorphism is treating non-equal things equal as long as you can’t get yourself into trouble doing so.

Well said.

The closest I’ve come to fiddling with morphisms in the sense of higher categories is probably homotopy categories. I honestly don’t know how useful higher categories will be for what I intend to do with this series, but chances are I will be learning much more about them before it’s done.

Barry Mazur wrote an article about some of these questions called “When is one thing equal to some other thing?” that you might find interesting. It is available at http://www.math.harvard.edu/~mazur/preprints/when_is_one.pdf

He talks about how category theory is a “bring your own set theory” party, where you can do a lot of work without even mentioning which set of axioms underlies your Set category.

I’ve been really looking forward to you writing a series on category theory! I’ve never had the time/chance to really study it, and that’s definitely a weakness in my mathematics.

I happened to click through on a LinkedIn update: lucky me! I’m looking forward to this series, and I hope that you’ll consider posting your updates on G+ as you did for a while, last year. Er, two years ago. Er, whenever.