The Čech Complex and the Vietoris-Rips Complex

It’s about time we got back to computational topology. Previously in this series we endured a lightning tour of the fundamental group and homology, then we saw how to compute the homology of a simplicial complex using linear algebra.

What we really want to do is talk about the inherent shape of data. Homology allows us to compute some qualitative features of a given shape, i.e., find and count the number of connected components or a given shape, or the number of “2-dimensional holes” it has. This is great, but data doesn’t come in a form suitable for computing homology. Though they may have originated from some underlying process that follows nice rules, data points are just floating around in space with no obvious connection between them.

Here is a cool example of Thom Yorke, the lead singer of the band Radiohead, whose face was scanned with a laser scanner for their music video “House of Cards.”

radiohead-still

Radiohead’s Thom Yorke in the music video for House of Cards (click the image to watch the video).

Given a point cloud such as the one above, our long term goal (we’re just getting started in this post) is to algorithmically discover what the characteristic topological features are in the data. Since homology is pretty coarse, we might detect the fact that the point cloud above looks like a hollow sphere with some holes in it corresponding to nostrils, ears, and the like. The hope is that if the data set isn’t too corrupted by noise, then it’s a good approximation to the underlying space it is sampled from. By computing the topological features of a point cloud we can understand the process that generated it, and Science can proceed.

But it’s not always as simple as Thom Yorke’s face. It turns out the producers of the music video had to actually degrade the data to get what you see above, because their lasers were too precise and didn’t look artistic enough! But you can imagine that if your laser is mounted on a car on a bumpy road, or tracking some object in the sky, or your data comes from acoustic waves traveling through earth, you’re bound to get noise. Or more realistically, if your data comes from thousands of stock market prices then the process generating the data is super mysterious. It changes over time, it may not follow any discernible pattern (though speculators may hope it does), and you can’t hope to visualize the entire dataset in any useful way.

But with persistent homology, so the claim goes, you’d get a good qualitative understanding of the dataset. Your results would be resistant to noise inherent in the data. It also wouldn’t be sensitive to the details of your data cleaning process. And with a dash of ingenuity, you can come up with a reasonable mathematical model of the underlying generative process. You could use that model to design algorithms, make big bucks, discover new drugs, recognize pictures of cats, or whatever tickles your fancy.

But our first problem is to resolve the input data type error. We want to use homology to describe data, but our data is a point cloud and homology operates on simplicial complexes. In this post we’ll see two ways one can do this, and see how they’re related.

The Čech complex

Let’s start with the Čech complex. Given a point set X in some metric space and a number \varepsilon > 0, the Čech complex C_\varepsilon is the simplicial complex whose simplices are formed as follows. For each subset S \subset X of points, form a (\varepsilon/2)-ball around each point in S, and include S as a simplex (of dimension |S|) if there is a common point contained in all of the balls in S. This structure obviously satisfies the definition of a simplicial complex: any sub-subset S' \subset S of a simplex S will be also be a simplex. Here is an example of the epsilon balls.

Image credit: Robert Ghrist

An example of a point cloud (left) and a corresponding choice of (epsilon/2)-balls. To get the Cech complex, we add a k-simplex any time we see a subset of k points with common intersection.  [Image credit: Robert Ghrist]

Let me superscript the Čech complex to illustrate the pieces. Specifically, we’ll let C_\varepsilon^{j} denote all the simplices of dimension up to j. In particular, C_\varepsilon^1 is a graph where an edge is placed between x,y if d(x,y) < \varepsilon, and C_{\varepsilon}^2 places triangles (2-simplices) on triples of points whose balls have a three-way intersection.

A topologist will have a minor protest here: the simplicial complex is supposed to resemble the structure inherent in the underlying points, but how do we know that this abstract simplicial complex (which is really hard to visualize!) resembles the topological space we used to make it? That is, X was sitting in some metric space, and the union of these epsilon-balls forms some topological space X(\varepsilon) that is close in structure to X. But is the Čech complex C_\varepsilon close to X(\varepsilon)? Do they have the same topological structure? It’s not a trivial theorem to prove, but it turns out to be true.

The Nerve Theorem: The homotopy types of X(\varepsilon) and C_\varepsilon are the same.

We won’t remind the readers about homotopy theory, but suffice it to say that when two topological spaces have the same homotopy type, then homology can’t distinguish them. In other words, if homotopy type is too coarse for a discriminator for our dataset, then persistent homology will fail us for sure.

So this theorem is a good sanity check. If we want to learn about our point cloud, we can pick a \varepsilon and study the topology of the corresponding Čech complex C_\varepsilon. The reason this is called the “Nerve Theorem” is because one can generalize it to an arbitrary family of convex sets. Given some family F of convex sets, the nerve is the complex obtained by adding simplices for mutually overlapping subfamilies in the same way. The nerve theorem is actually more general, it says that with sufficient conditions on the family F being “nice,” the resulting Čech complex has the same topological structure as F.

The problem is that Čech complexes are tough to compute. To tell whether there are any 10-simplices (without additional knowledge) you have to inspect all subsets of size 10. In general computing the entire complex requires exponential time in the size of X, which is extremely inefficient. So we need a different kind of complex, or at least a different representation to compensate.

The Vietoris-Rips complex

The Vietoris-Rips complex is essentially the same as the Čech complex, except instead of adding a d-simplex when there is a common point of intersection of all the (\varepsilon/2)-balls, we just do so when all the balls have pairwise intersections. We’ll denote the Vietoris-Rips complex with parameter \varepsilon as VR_{\varepsilon}.

Here is an example to illustrate: if you give me three points that are the vertices of an equilateral triangle of side length 1, and I draw (1/2)-balls around each point, then they will have all three pairwise intersections but no common point of intersection.

intersection

Three balls which intersect pairwise, but have no point of triple intersection. With appropriate parameters, the Cech and V-R complexes are different.

So in this example the Vietoris-Rips complex is a graph with a 2-simplex, while the Čech complex is just a graph.

One obvious question is: do we still get the benefits of the nerve theorem with Vietoris-Rips complexes? The answer is no, obviously, because the Vietoris-Rips complex and Čech complex in this triangle example have totally different topology! But everything’s not lost. What we can do instead is compare Vietoris-Rips and Čech complexes with related parameters.

Theorem: For all \varepsilon > 0, the following inclusions hold

\displaystyle C_{\varepsilon} \subset VR_{2 \varepsilon} \subset C_{2\varepsilon}

So if the Čech complexes for both \varepsilon and 2\varepsilon are good approximations of the underlying data, then so is the Vietoris-Rips complex. In fact, you can make this chain of inclusions slightly tighter, and if you’re interested you can see Theorem 2.5 in this recent paper of de Silva and Ghrist.

Now your first objection should be that computing a Vietoris-Rips complex still requires exponential time, because you have to scan all subsets for the possibility that they form a simplex. It’s true, but one nice thing about the Vietoris-Rips complex is that it can be represented implicitly as a graph. You just include an edge between two points if their corresponding balls overlap. Once we want to compute the actual simplices in the complex we have to scan for cliques in the graph, so that sucks. But it turns out that computing the graph is the first step in other more efficient methods for computing (or approximating) the VR complex.

Let’s go ahead and write a (trivial) program that computes the graph representation of the Vietoris-Rips complex of a given data set.

import numpy
def naiveVR(points, epsilon):
   points = [numpy.array(x) for x in points]   
   vrComplex = [(x,y) for (x,y) in combinations(points, 2) if norm(x - y) < 2*epsilon]
   return numpy.array(vrComplex)

Let’s try running it on a modestly large example: the first frame of the Radiohead music video. It’s got about 12,000 points in \mathbb{R}^4 (x,y,z,intensity), and sadly it takes about twenty minutes. There are a couple of ways to make it more efficient. One is to use specially-crafted data structures for computing threshold queries (i.e., find all points within \varepsilon of this point). But those are only useful for small thresholds, and we’re interested in sweeping over a range of thresholds. Another is to invoke approximations of the data structure which give rise to “approximate” Vietoris-Rips complexes.

Other stuff

In a future post we’ll implement a method for speeding up the computation of the Vietoris-Rips complex, since this is the primary bottleneck for topological data analysis. But for now the conceptual idea of how Čech complexes and Vietoris-Rips complexes can be used to turn point clouds into simplicial complexes in reasonable ways.

Before we close we should mention that there are other ways to do this. I’ve chosen the algebraic flavor of topological data analysis due to my familiarity with algebra and the work based on this approach. The other approaches have a more geometric flavor, and are based on the Delaunay triangulation, a hallmark of computational geometry algorithms. The two approaches I’ve heard of are called the alpha complex and the flow complex. The downside of these approaches is that, because they are based on the Delaunay triangulation, they have poor scaling in the dimension of the data. Because high dimensional data is crucial, many researchers have been spending their time figuring out how to speed up approximations of the V-R complex. See these slides of Afra Zomorodian for an example.

Until next time!

Fixing Bugs in “Computing Homology”

A few awesome readers have posted comments in Computing Homology to the effect of, “Your code is not quite correct!” And they’re right! Despite the almost year since that post’s publication, I haven’t bothered to test it for more complicated simplicial complexes, or even the basic edge cases! When I posted it the mathematics just felt so solid to me that it had to be right (the irony is rich, I know).

As such I’m apologizing for my lack of rigor and explaining what went wrong, the fix, and giving some test cases. As of the publishing of this post, the Github repository for Computing Homology has been updated with the correct code, and some more examples.

The main subroutine was the simultaneousReduce function which I’ll post in its incorrectness below

def simultaneousReduce(A, B):
   if A.shape[1] != B.shape[0]:
      raise Exception("Matrices have the wrong shape.")

   numRows, numCols = A.shape # col reduce A

   i,j = 0,0
   while True:
      if i >= numRows or j >= numCols:
         break

      if A[i][j] == 0:
         nonzeroCol = j
         while nonzeroCol < numCols and A[i,nonzeroCol] == 0:
            nonzeroCol += 1

         if nonzeroCol == numCols:
            j += 1
            continue

         colSwap(A, j, nonzeroCol)
         rowSwap(B, j, nonzeroCol)

      pivot = A[i,j]
      scaleCol(A, j, 1.0 / pivot)
      scaleRow(B, j, 1.0 / pivot)

      for otherCol in range(0, numCols):
         if otherCol == j:
            continue
         if A[i, otherCol] != 0:
            scaleAmt = -A[i, otherCol]
            colCombine(A, otherCol, j, scaleAmt)
            rowCombine(B, j, otherCol, -scaleAmt)

      i += 1; j+= 1

   return A,B

It’s a beast of a function, and the persnickety detail was just as beastly: this snippet should have an i += 1 instead of a j.

if nonzeroCol == numCols:
   j += 1
   continue

This is simply what happens when we’re looking for a nonzero entry in a row to use as a pivot for the corresponding column, but we can’t find one and have to move to the next row. A stupid error on my part that would be easily caught by proper test cases.

The next mistake is a mathematical misunderstanding. In short, the simultaneous column/row reduction process is not enough to get the \partial_{k+1} matrix into the right form! Let’s see this with a nice example, a triangulation of the Mobius band. There are a number of triangulations we could use, many of which are seen in these slides. The one we’ll use is the following.

mobius-triangulation

It’s first and second boundary maps are as follows (in code, because latex takes too much time to type out)

mobiusD1 = numpy.array([
   [-1,-1,-1,-1, 0, 0, 0, 0, 0, 0],
   [ 1, 0, 0, 0,-1,-1,-1, 0, 0, 0],
   [ 0, 1, 0, 0, 1, 0, 0,-1,-1, 0],
   [ 0, 0, 0, 1, 0, 0, 1, 0, 1, 1],
])

mobiusD2 = numpy.array([
   [ 1, 0, 0, 0, 1],
   [ 0, 0, 0, 1, 0],
   [-1, 0, 0, 0, 0],
   [ 0, 0, 0,-1,-1],
   [ 0, 1, 0, 0, 0],
   [ 1,-1, 0, 0, 0],
   [ 0, 0, 0, 0, 1],
   [ 0, 1, 1, 0, 0],
   [ 0, 0,-1, 1, 0],
   [ 0, 0, 1, 0, 0],
])

And if we were to run the above code on it we’d get a first Betti number of zero (which is incorrect, it’s first homology group has rank 1). Here are the reduced matrices.

>>> A1, B1 = simultaneousReduce(mobiusD1, mobiusD2)
>>> A1
array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]])
>>> B1
array([[ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [ 0,  1,  0,  0,  0],
       [ 1, -1,  0,  0,  0],
       [ 0,  0,  0,  0,  1],
       [ 0,  1,  1,  0,  0],
       [ 0,  0, -1,  1,  0],
       [ 0,  0,  1,  0,  0]])

The first reduced matrix looks fine; there’s nothing we can do to improve it. But the second one is not quite fully reduced! Notice that rows 5, 8 and 10 are not linearly independent. So we need to further row-reduce the nonzero part of this matrix before we can read off the true rank in the way we described last time. This isn’t so hard (we just need to reuse the old row-reduce function we’ve been using), but why is this allowed? It’s just because the corresponding column operations for those row operations are operating on columns of all zeros! So we need not worry about screwing up the work we did in column reducing the first matrix, as long as we only work with the nonzero rows of the second.

Of course, nothing is stopping us from ignoring the “corresponding” column operations, since we know we’re already done there. So we just have to finish row reducing this matrix.

This changes our bettiNumber function by adding a single call to a row-reduce function which we name so as to be clear what’s happening. The resulting function is

def bettiNumber(d_k, d_kplus1):
   A, B = numpy.copy(d_k), numpy.copy(d_kplus1)
   simultaneousReduce(A, B)
   finishRowReducing(B)

   dimKChains = A.shape[1]
   kernelDim = dimKChains - numPivotCols(A)
   imageDim = numPivotRows(B)

   return kernelDim - imageDim

And running this on our Mobius band example gives:

>>> bettiNumber(mobiusD1, mobiusD2))
1

As desired. Just to make sure things are going swimmingly under the hood, we can check to see how finishRowReducing does after calling simultaneousReduce

>>> simultaneousReduce(mobiusD1, mobiusD2)
(array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]]), array([[ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [ 0,  1,  0,  0,  0],
       [ 1, -1,  0,  0,  0],
       [ 0,  0,  0,  0,  1],
       [ 0,  1,  1,  0,  0],
       [ 0,  0, -1,  1,  0],
       [ 0,  0,  1,  0,  0]]))
>>> finishRowReducing(mobiusD2)
array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

Indeed, finishRowReducing finishes row reducing the second boundary matrix. Note that it doesn’t preserve how the rows of zeros lined up with the pivot columns of the reduced version of \partial_1 as it did in the previous post, but since in the end we’re only counting pivots it doesn’t matter how we switch rows. The “zeros lining up” part is just for a conceptual understanding of how the image lines up with the kernel for a valid simplicial complex.

In fixing this issue we’ve also fixed an issue another commenter mentioned, that you couldn’t blindly plug in the zero matrix for \partial_0 and get zeroth homology (which is the same thing as connected components). After our fix you can.

Of course there still might be bugs, but I have so many drafts lined up on this blog (and research papers to write, experiments to run, theorems to prove), that I’m going to put off writing a full test suite. I’ll just have to update this post with new bug fixes as they come. There’s just so much math and so little time :) But extra kudos to my amazing readers who were diligent enough to run examples and spot my error. I’m truly blessed to have you on my side.

Also note that this isn’t the most efficient way to represent the simplicial complex data, or the most efficient row reduction algorithm. If you’re going to run the code on big inputs, I suggest you take advantage of sparse matrix algorithms for doing this sort of stuff. You can represent the simplices as entries in a dictionary and do all sorts of clever optimizations to make the algorithm effectively linear time in the number of simplices.

Until next time!

Guest Post: Torus-Knotted Baklava

Guest Post: Torus-Knotted Baklava at Baking And Math, a blog by my friend and colleague, Yen.

Computing Homology

Update: the mistakes made in the code posted here are fixed and explained in a subsequent post (one minor code bug was fixed here, and a less minor conceptual bug is fixed in the linked post).

In our last post in this series on topology, we defined the homology group. Specifically, we built up a topological space as a simplicial complex (a mess of triangles glued together), we defined an algebraic way to represent collections of simplices called chains as vectors in a vector space, we defined the boundary homomorphism \partial_k as a linear map on chains, and finally defined the homology groups as the quotient vector spaces

\displaystyle H_k(X) = \frac{\textup{ker} \partial_k}{\textup{im} \partial_{k+1}}.

The number of holes in X was just the dimension of this quotient space.

In this post we will be quite a bit more explicit. Because the chain groups are vector spaces and the boundary mappings are linear maps, they can be represented as matrices whose dimensions depend on our simplicial complex structure. Better yet, if we have explicit representations of our chains by way of a basis, then we can use row-reduction techniques to write the matrix in a standard form.

Of course the problem arises when we want to work with two matrices simultaneously (to compute the kernel-mod-image quotient above). This is not computationally any more difficult, but it requires some theoretical fiddling. We will need to dip a bit deeper into our linear algebra toolboxes to see how it works, so the rusty reader should brush up on their linear algebra before continuing (or at least take some time to sort things out if or when confusion strikes).

Without further ado, let’s do an extended example and work our ways toward a general algorithm. As usual, all of the code used for this post is available on this blog’s Github page.

Two Big Matrices

Recall our example simplicial complex from last time.

circle-wedge-sphere

We will compute H_1 of this simplex (which we saw last time was \mathbb{Q}) in a more algorithmic way than we did last time.

Once again, we label the vertices 0-4 so that the extra “arm” has vertex 4 in the middle, and its two endpoints are 0 and 2. This gave us orientations on all of the simplices, and the following chain groups. Since the vertex labels (and ordering) are part of the data of a simplicial complex, we have made no choices in writing these down.

\displaystyle C_0(X) = \textup{span} \left \{ 0,1,2,3,4 \right \}

\displaystyle C_1(X) = \textup{span} \left \{ [0,1], [0,2], [0,3], [0,4], [1,2], [1,3],[2,3],[2,4] \right \}

\displaystyle C_2(X) = \textup{span} \left \{ [0,1,2], [0,1,3], [0,2,3], [1,2,3] \right \}

Now given our known definitions of \partial_k as an alternating sum from last time, we can give a complete specification of the boundary map as a matrix. For \partial_1, this would be

\displaystyle \partial_1 = \bordermatrix{  & [0,1] & [0,2] & [0,3] & [0,4] & [1,2] & [1,3] & [2,3] & [2,4] \cr  0 & -1 & -1 & -1 & -1 & 0 & 0 & 0 & 0\cr  1 & 1 & 0 & 0 & 0 & -1 & -1 & 0 & 0\cr  2 & 0 & 1 & 0 & 0 & 1 & 0 & -1 & -1 \cr  3 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 \cr  4 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 },

where the row labels are the basis for C_0(X) and the column labels are the basis for C_1(X). Similarly, \partial_2 is

\displaystyle \partial_2 = \bordermatrix{  & [0,1,2] & [0,1,3] & [0,2,3] & [1,2,3] \cr  [0,1] & 1 & 1 & 0 & 0\cr  [0,2] & -1 & 0 & 1 & 0\cr  [0,3] & 0 & -1 & -1 & 0\cr  [0,4] & 0 & 0 & 0 & 0\cr  [1,2] & 1 & 0 & 0 & 1\cr  [1,3] & 0 & 1 & 0 & -1\cr  [2,3] & 0 & 0 & 1 & 1\cr  [2,4] & 0 & 0 & 0 & 0}

The reader is encouraged to check that these matrices are written correctly by referring to the formula for \partial as given last time.

Remember the crucial property of \partial, that \partial^2 = \partial_k \partial_{k+1} = 0. Indeed, the composition of the two boundary maps just corresponds to the matrix product of the two matrices, and one can verify by hand that the above two matrices multiply to the zero matrix.

We know from basic linear algebra how to compute the kernel of a linear map expressed as a matrix: column reduce and inspect the columns of zeros. Since the process of row reducing is really a change of basis, we can encapsulate the reduction inside a single invertible matrix A, which, when left-multiplied by \partial, gives us the reduced form of the latter. So write the reduced form of \partial_1 as \partial_1 A.

However, now we’re using two different sets of bases for the shared vector space involved in \partial_1 and \partial_2. In general, it will no longer be the case that \partial_kA\partial_{k+1} = 0. The way to alleviate this is to perform the “corresponding” change of basis in \partial_{k+1}. To make this idea more transparent, we return to the basics.

Changing Two Matrices Simultaneously

Recall that a matrix M represents a linear map between two vector spaces f : V \to W. The actual entries of M depend crucially on the choice of a basis for the domain and codomain. Indeed, if v_i form a basis for V and w_j for W, then the k-th column of the matrix representation M is defined to be the coefficients of the representation of f(v_k) in terms of the w_j. We hope to have nailed this concept down firmly in our first linear algebra primer.

Recall further that row operations correspond to changing a basis for the codomain, and column operations correspond to changing a basis for the domain. For example, the idea of swapping columns i,j in M gives a new matrix which is the representation of f with respect to the (ordered) basis for V which swaps the order of v_i , v_j. Similar things happen for all column operations (they all correspond to manipulations of the basis for V), while analogously row operations implicitly transform the basis for the codomain. Note, though, that the connection between row operations and transformations of the basis for the codomain are slightly more complicated than they are for the column operations. We will explicitly see how it works later in the post.

And so if we’re working with two maps A: U \to V and B: V \to W, and we change a basis for V in B via column reductions, then in order to be consistent, we need to change the basis for V in A via “complementary” row reductions. That is, if we call the change of basis matrix Q, then we’re implicitly sticking Q in between the composition BA to get (BQ)A. This is not the same map as BA, but we can make it the same map by adding a Q^{-1} in the right place:

\displaystyle BA = B(QQ^{-1})A = (BQ)(Q^{-1}A)

Indeed, whenever Q is a change of basis matrix so is Q^{-1} (trivially), and moreover the operations that Q performs on the columns of B are precisely the operations that Q^{-1} performs on the rows of A (this is because elementary row operations take different forms when multiplied on the left or right).

Coming back to our boundary operators, we want a canonical way to view the image of \partial_{k+1} as sitting inside the kernel of \partial_k. If we go ahead and use column reductions to transform \partial_k into a form where the kernel is easy to read off (as the columns consisting entirely of zeroes), then the corresponding row operations, when performed on \partial_{k+1} will tell us exactly the image of \partial_{k+1} inside the kernel of \partial_k.

This last point is true precisely because \textup{im} \partial_{k+1} \subset \textup{ker} \partial_k. This fact guarantees that the irrelevant rows of the reduced version of \partial_{k+1} are all zero.

Let’s go ahead and see this in action on our two big matrices above. For \partial_1, the column reduction matrix is

\displaystyle A =  \begin{pmatrix}  0 & 1 & 0 & 0 & 1 & 1 & 0 & 0\\  0 & 0 & 1 & 0 & -1 & 0 & 1 & 1\\  0 & 0 & 0 & 1 & 0 & -1 & -1 & 0\\  -1 & -1 & -1 & -1 & 0 & 0 & 0 & -1\\  0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\  0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\  0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\  0 & 0 & 0 & 0 & 0 & 0 & 0 & 1  \end{pmatrix}

And the product \partial_1 A is

\displaystyle \partial_1 A =  \begin{pmatrix}  1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\  0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\  0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\  0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\  -1 & -1 & -1 & -1 & 0 & 0 & 0 & 0  \end{pmatrix}

Now the inverse of A, which is the corresponding basis change for \partial_2, is

\displaystyle A^{-1} =  \begin{pmatrix}  -1 & -1 & -1 & -1 & -0 & -0 & -0 & -0\\  1 & 0 & 0 & 0 & -1 & -1 & 0 & 0\\  0 & 1 & 0 & 0 & 1 & 0 & -1 & -1\\  0 & 0 & 1 & 0 & 0 & 1 & 1 & 0\\  0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\  0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\  0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\  0 & 0 & 0 & 0 & 0 & 0 & 0 & 1  \end{pmatrix}

and the corresponding reduced form of \partial_2 is

\displaystyle A^{-1} \partial_2 =  \begin{pmatrix}  0 & 0 & 0 & 0\\  0 & 0 & 0 & 0\\  0 & 0 & 0 & 0\\  0 & 0 & 0 & 0\\  1 & 0 & 0 & 1\\  0 & 1 & 0 & -1\\  0 & 0 & 1 & 1\\  0 & 0 & 0 & 0  \end{pmatrix}

As a side note, we got these matrices by slightly modifying the code from our original post on row reduction to output the change of basis matrix in addition to performing row reduction. It turns out one can implement column reduction as row reduction of the transpose, and the change of basis matrix you get from this process will be the transpose of the change of basis matrix you want (by (AB)^\textup{T} = (B^\textup{T}A^\textup{T})). Though the code is particularly ad-hoc, we include it with the rest of the code used in this post on this blog’s Github page.

Now let’s inspect the two matrices \partial_1 A and A^{-1} \partial_2 more closely. The former has four “pivots” left over, and this corresponds to the rank of the matrix being 4. Moreover, the four basis vectors representing the columns with nonzero pivots, which we’ll call v_1, v_2, v_3, v_4 (we don’t care what their values are), span a complementary subspace to the kernel of \partial_1. Hence, the remaining four vectors (which we’ll call v_5, v_6, v_7, v_8) span the kernel. In particular, this says that the kernel has dimension 4.

On the other hand, we performed the same transformation of the basis of C_1(X) for \partial_2. Looking at the matrix that resulted, we see that the first four rows and the last row (representing v_1, v_2, v_3, v_4, v_8) are entirely zeros and so the image of \partial_2 intersects their span trivially. and the remaining three rows (representing v_5, v_6, v_7) have nonzero pivots. This tells us exactly that the image of \partial_2 is spanned by v_5, v_6, v_7.

And now, the coup de grâce, the quotient to get homology is simply

\displaystyle \frac{ \textup{span} \left \{ v_5, v_6, v_7, v_8 \right \}}{ \textup{span} \left \{ v_5, v_6, v_7 \right \}} = \textup{span} \left \{ v_8 \right \}

And the dimension of the homology group is 1, as desired.

The General Algorithm

It is no coincidence that things worked out at nicely as they did. The process we took of simultaneously rewriting two matrices with respect to a common basis is the bulk of the algorithm to compute homology. Since we’re really only interested in the dimensions of the homology groups, we just need to count pivots. If the number of pivots arising in \partial_k is y and the number of pivots arising in \partial_{k+1} is z, and the dimension of C_k(X) is n, then the dimension is exactly

(n-y) - z = \textup{dim}(\textup{ker} \partial_k) - \textup{dim}(\textup{im}\partial_{k+1})

And it is no coincidence that the pivots lined up so nicely to allow us to count dimensions this way. It is a minor exercise to prove it formally, but the fact that the composition \partial_k \partial_{k+1} = 0 implies that the reduced version of \partial_{k+1} will have an almost reduced row-echelon form (the only difference being the rows of zeros interspersed above, below, and possibly between pivot rows).

As the reader may have guessed at this point, we don’t actually need to compute A and A^{-1}. Instead of this, we can perform the column/row reductions simultaneously on the two matrices. The above analysis helped us prove the algorithm works, and with that guarantee we can throw out the analytical baggage and just compute the damn thing.

Indeed, assuming the input is already processed as two matrices representing the boundary operators with respect to the standard bases of the chain groups, computing homology is only slightly more difficult than row reducing in the first place. Putting our homology where our mouth is, we’ve implemented the algorithm in Python. As usual, the entire code used in this post is available on this blog’s Github page.

The first step is writing auxiliary functions to do elementary row and column operations on matrices. For this post, we will do everything in numpy (which makes the syntax shorter than standard Python syntax, but dependent on the numpy library).

import numpy

def rowSwap(A, i, j):
   temp = numpy.copy(A[i, :])
   A[i, :] = A[j, :]
   A[j, :] = temp

def colSwap(A, i, j):
   temp = numpy.copy(A[:, i])
   A[:, i] = A[:, j]
   A[:, j] = temp

def scaleCol(A, i, c):
   A[:, i] *= c*numpy.ones(A.shape[0])

def scaleRow(A, i, c):
   A[i, :] *= c*numpy.ones(A.shape[1])

def colCombine(A, addTo, scaleCol, scaleAmt):
   A[:, addTo] += scaleAmt * A[:, scaleCol]

def rowCombine(A, addTo, scaleRow, scaleAmt):
   A[addTo, :] += scaleAmt * A[scaleRow, :]

From here, the main meat of the algorithm is doing column reduction on one matrix, and applying the corresponding row operations on the other.

def simultaneousReduce(A, B):
   if A.shape[1] != B.shape[0]:
      raise Exception("Matrices have the wrong shape.")

   numRows, numCols = A.shape # col reduce A

   i,j = 0,0
   while True:
      if i >= numRows or j >= numCols:
         break

      if A[i][j] == 0:
         nonzeroCol = j
         while nonzeroCol < numCols and A[i,nonzeroCol] == 0:
            nonzeroCol += 1

         if nonzeroCol == numCols:
            i += 1
            continue

         colSwap(A, j, nonzeroCol)
         rowSwap(B, j, nonzeroCol)

      pivot = A[i,j]
      scaleCol(A, j, 1.0 / pivot)
      scaleRow(B, j, 1.0 / pivot)

      for otherCol in range(0, numCols):
         if otherCol == j:
            continue
         if A[i, otherCol] != 0:
            scaleAmt = -A[i, otherCol]
            colCombine(A, otherCol, j, scaleAmt)
            rowCombine(B, j, otherCol, -scaleAmt)

      i += 1; j+= 1

   return A,B

This more or less parallels the standard algorithm for row-reduction (with the caveat that all the indices are swapped to do column-reduction). The only somewhat confusing line is the call to rowCombine, which explicitly realizes the corresponding row operation as the inverse of the performed column operation. Note that for row operations, the correspondence between operations on the basis and operations on the rows is not as direct as it is for columns. What’s given above is the true correspondence. Writing down lots of examples will reveal why, and we leave that as an exercise to the reader.

Then the actual algorithm to compute homology is just a matter of counting pivots. Here are two pivot-counting functions in a typical numpy fashion:

def numPivotCols(A):
   z = numpy.zeros(A.shape[0])
   return [numpy.all(A[:, j] == z) for j in range(A.shape[1])].count(False)

def numPivotRows(A):
   z = numpy.zeros(A.shape[1])
   return [numpy.all(A[i, :] == z) for i in range(A.shape[0])].count(False)

And the final function is just:

def bettiNumber(d_k, d_kplus1):
   A, B = numpy.copy(d_k), numpy.copy(d_kplus1)
   simultaneousReduce(A, B)

   dimKChains = A.shape[1]
   kernelDim = dimKChains - numPivotCols(A)
   imageDim = numPivotRows(B)

   return kernelDim - imageDim

And there we have it! We’ve finally tackled the beast, and written a program to compute algebraic features of a topological space.

The reader may be curious as to why we didn’t come up with a more full-bodied representation of a simplicial complex and write an algorithm which accepts a simplicial complex and computes all of its homology groups. We’ll leave this direct approach as a (potentially long) exercise to the reader, because coming up in this series we are going to do one better. Instead of computing the homology groups of just one simplicial complex using by repeating one algorithm many times, we’re going to compute all the homology groups of a whole family of simplicial complexes in a single bound. This family of simplicial complexes will be constructed from a data set, and so, in grandiose words, we will compute the topological features of data.

If it sounds exciting, that’s because it is! We’ll be exploring a cutting-edge research field known as persistent homology, and we’ll see some of the applications of this theory to data analysis.

Until then!