# Elliptic Curves as Python Objects

Last time we saw a geometric version of the algorithm to add points on elliptic curves. We went quite deep into the formal setting for it (projective space $\mathbb{P}^2$), and we spent a lot of time talking about the right way to define the “zero” object in our elliptic curve so that our issues with vertical lines would disappear.

With that understanding in mind we now finally turn to code, and write classes for curves and points and implement the addition algorithm. As usual, all of the code we wrote in this post is available on this blog’s Github page.

## Points and Curves

Every introductory programming student has probably written the following program in some language for a class representing a point.

class Point(object):
def __init__(self, x, y):
self.x = x
self.y = y


It’s the simplest possible nontrivial class: an x and y value initialized by a constructor (and in Python all member variables are public).

We want this class to represent a point on an elliptic curve, and overload the addition and negation operators so that we can do stuff like this:

p1 = Point(3,7)
p2 = Point(4,4)
p3 = p1 + p2


But as we’ve spent quite a while discussing, the addition operators depend on the features of the elliptic curve they’re on (we have to draw lines and intersect it with the curve). There are a few ways we could make this happen, but in order to make the code that uses these classes as simple as possible, we’ll have each point contain a reference to the curve they come from. So we need a curve class.

It’s pretty simple, actually, since the class is just a placeholder for the coefficients of the defining equation. We assume the equation is already in the Weierstrass normal form, but if it weren’t one could perform a whole bunch of algebra to get it in that form (and you can see how convoluted the process is in this short report or page 115 (pdf p. 21) of this book). To be safe, we’ll add a few extra checks to make sure the curve is smooth.

class EllipticCurve(object):
def __init__(self, a, b):
# assume we're already in the Weierstrass form
self.a = a
self.b = b

self.discriminant = -16 * (4 * a*a*a + 27 * b * b)
if not self.isSmooth():
raise Exception("The curve %s is not smooth!" % self)

def isSmooth(self):
return self.discriminant != 0

def testPoint(self, x, y):
return y*y == x*x*x + self.a * x + self.b

def __str__(self):
return 'y^2 = x^3 + %Gx + %G' % (self.a, self.b)

def __eq__(self, other):
return (self.a, self.b) == (other.a, other.b)


And here’s some examples of creating curves

>>> EllipticCurve(a=17, b=1)
y^2 = x^3 + 17x + 1
>>> EllipticCurve(a=0, b=0)
Traceback (most recent call last):
[...]
Exception: The curve y^2 = x^3 + 0x + 0 is not smooth!


So there we have it. Now when we construct a Point, we add the curve as the extra argument and a safety-check to make sure the point being constructed is on the given elliptic curve.

class Point(object):
def __init__(self, curve, x, y):
self.curve = curve # the curve containing this point
self.x = x
self.y = y

if not curve.testPoint(x,y):
raise Exception("The point %s is not on the given curve %s" % (self, curve))


Note that this last check will serve as a coarse unit test for all of our examples. If we mess up then more likely than not the “added” point won’t be on the curve at all. More precise testing is required to be bullet-proof, of course, but we leave explicit tests to the reader as an excuse to get their hands wet with equations.

Some examples:

>>> c = EllipticCurve(a=1,b=2)
>>> Point(c, 1, 2)
(1, 2)
>>> Point(c, 1, 1)
Traceback (most recent call last):
[...]
Exception: The point (1, 1) is not on the given curve y^2 = x^3 + 1x + 2


Before we go ahead and implement addition and the related functions, we need to be decide how we want to represent the ideal point $[0 : 1 : 0]$. We have two options. The first is to do everything in projective coordinates and define a whole system for doing projective algebra. Considering we only have one point to worry about, this seems like overkill (but could be fun). The second option, and the one we’ll choose, is to have a special subclass of Point that represents the ideal point.

class Ideal(Point):
def __init__(self, curve):
self.curve = curve

def __str__(self):
return "Ideal"


Note the inheritance is denoted by the parenthetical (Point) in the first line. Each function we define on a Point will require a 1-2 line overriding function in this subclass, so we will only need a small amount of extra bookkeeping. For example, negation is quite easy.

class Point(object):
...
def __neg__(self):
return Point(self.curve, self.x, -self.y)

class Ideal(Point):
...
def __neg__(self):
return self


Note that Python allows one to override the prefix-minus operation by defining __neg__ on a custom object. There are similar functions for addition (__add__), subtraction, and pretty much every built-in python operation. And of course addition is where things get more interesting. For the ideal point it’s trivial.

class Ideal(Point):
...
def __add__(self, Q):
return Q


Why does this make sense? Because (as we’ve said last time) the ideal point is the additive identity in the group structure of the curve. So by all of our analysis, $P + 0 = 0 + P = P$, and the code is satisfyingly short.

For distinct points we have to follow the algorithm we used last time. Remember that the trick was to form the line $L(x)$ passing through the two points being added, substitute that line for $y$ in the elliptic curve, and then figure out the coefficient of $x^2$ in the resulting polynomial. Then, using the two existing points, we could solve for the third root of the polynomial using Vieta’s formula.

In order to do that, we need to analytically solve for the coefficient of the $x^2$ term of the equation $L(x)^2 = x^3 + ax + b$. It’s tedious, but straightforward. First, write

$\displaystyle L(x) = \left ( \frac{y_2 - y_1}{x_2 - x_1} \right ) (x - x_1) + y_1$

The first step of expanding $L(x)^2$ gives us

$\displaystyle L(x)^2 = y_1^2 + 2y_1 \left ( \frac{y_2 - y_1}{x_2 - x_1} \right ) (x - x_1) + \left [ \left (\frac{y_2 - y_1}{x_2 - x_1} \right ) (x - x_1) \right ]^2$

And we notice that the only term containing an $x^2$ part is the last one. Expanding that gives us

$\displaystyle \left ( \frac{y_2 - y_1}{x_2 - x_1} \right )^2 (x^2 - 2xx_1 + x_1^2)$

And again we can discard the parts that don’t involve $x^2$. In other words, if we were to rewrite $L(x)^2 = x^3 + ax + b$ as $0 = x^3 - L(x)^2 + ax + b$, we’d expand all the terms and get something that looks like

$\displaystyle 0 = x^3 - \left ( \frac{y_2 - y_1}{x_2 - x_1} \right )^2 x^2 + C_1x + C_2$

where $C_1, C_2$ are some constants that we don’t need. Now using Vieta’s formula and calling $x_3$ the third root we seek, we know that

$\displaystyle x_1 + x_2 + x_3 = \left ( \frac{y_2 - y_1}{x_2 - x_1} \right )^2$

Which means that $x_3 = \left ( \frac{y_2 - y_1}{x_2 - x_1} \right )^2 - x_2 - x_1$. Once we have $x_3$, we can get $y_3$ from the equation of the line $y_3 = L(x_3)$.

Note that this only works if the two points we’re trying to add are different! The other two cases were if the points were the same or lying on a vertical line. These gotchas will manifest themselves as conditional branches of our add function.

class Point(object):
...
def __add__(self, Q):
if isinstance(Q, Ideal):
return self

x_1, y_1, x_2, y_2 = self.x, self.y, Q.x, Q.y

if (x_1, y_1) == (x_2, y_2):
# use the tangent method
...
else:
if x_1 == x_2:
return Ideal(self.curve) # vertical line

# Using Vieta's formula for the sum of the roots
m = (y_2 - y_1) / (x_2 - x_1)
x_3 = m*m - x_2 - x_1
y_3 = m*(x_3 - x_1) + y_1

return Point(self.curve, x_3, -y_3)



First, we check if the two points are the same, in which case we use the tangent method (which we do next). Supposing the points are different, if their $x$ values are the same then the line is vertical and the third point is the ideal point. Otherwise, we use the formula we defined above. Note the subtle and crucial minus sign at the end! The point $(x_3, y_3)$ is the third point of intersection, but we still have to do the reflection to get the sum of the two points.

Now for the case when the points $P, Q$ are actually the same. We’ll call it $P = (x_1, y_1)$, and we’re trying to find $2P = P+P$. As per our algorithm, we compute the tangent line $J(x)$ at $P$. In order to do this we need just a tiny bit of calculus. To find the slope of the tangent line we implicitly differentiate the equation $y^2 = x^3 + ax + b$ and get

$\displaystyle \frac{dy}{dx} = \frac{3x^2 + a}{2y}$

The only time we’d get a vertical line is when the denominator is zero (you can verify this by taking limits if you wish), and so $y=0$ implies that $P+P = 0$ and we’re done. The fact that this can ever happen for a nonzero $P$ should be surprising to any reader unfamiliar with groups! But without delving into a deep conversation about the different kinds of group structures out there, we’ll have to settle for such nice surprises.

In the other case $y \neq 0$, we plug in our $x,y$ values into the derivative and read off the slope $m$ as $(3x_1^2 + a)/(2y_1)$. Then using the same point slope formula for a line, we get $J(x) = m(x-x_1) + y_1$, and we can use the same technique (and the same code!) from the first case to finish.

There is only one minor wrinkle we need to smooth out: can we be sure Vieta’s formula works? In fact, the real problem is this: how do we know that $x_1$ is a double root of the resulting cubic? Well, this falls out again from that very abstract and powerful theorem of Bezout. There is a lot of technical algebraic geometry (and a very interesting but complicated notion of dimension) hiding behind the curtain here. But for our purposes it says that our tangent line intersects the elliptic curve with multiplicity 2, and this gives us a double root of the corresponding cubic.

And so in the addition function all we need to do is change the slope we’re using. This gives us a nice and short implementation

def __add__(self, Q):
if isinstance(Q, Ideal):
return self

x_1, y_1, x_2, y_2 = self.x, self.y, Q.x, Q.y

if (x_1, y_1) == (x_2, y_2):
if y_1 == 0:
return Ideal(self.curve)

# slope of the tangent line
m = (3 * x_1 * x_1 + self.curve.a) / (2 * y_1)
else:
if x_1 == x_2:
return Ideal(self.curve)

# slope of the secant line
m = (y_2 - y_1) / (x_2 - x_1)

x_3 = m*m - x_2 - x_1
y_3 = m*(x_3 - x_1) + y_1

return Point(self.curve, x_3, -y_3)


What’s interesting is how little the data of the curve comes into the picture. Nothing depends on $b$, and only one of the two cases depends on $a$. This is one reason the Weierstrass normal form is so useful, and it may bite us in the butt later in the few cases we don’t have it (for special number fields).

Here are some examples.

>>> C = EllipticCurve(a=-2,b=4)
>>> P = Point(C, 3, 5)
>>> Q = Point(C, -2, 0)
>>> P+Q
(0.0, -2.0)
>>> Q+P
(0.0, -2.0)
>>> Q+Q
Ideal
>>> P+P
(0.25, 1.875)
>>> P+P+P
Traceback (most recent call last):
...
Exception: The point (-1.958677685950413, 0.6348610067618328) is not on the given curve y^2 = x^3 + -2x + 4!

>>> x = -1.958677685950413
>>> y = 0.6348610067618328
>>> y*y - x*x*x + 2*x - 4
-3.9968028886505635e-15


And so we crash headfirst into our first floating point arithmetic issue. We’ll vanquish this monster more permanently later in this series (in fact, we’ll just scrap it entirely and define our own number system!), but for now here’s a quick fix:

>>> import fractions
>>> frac = fractions.Fraction
>>> C = EllipticCurve(a = frac(-2), b = frac(4))
>>> P = Point(C, frac(3), frac(5))
>>> P+P+P
(Fraction(-237, 121), Fraction(845, 1331))


Now that we have addition and negation, the rest of the class is just window dressing. For example, we want to be able to use the subtraction symbol, and so we need to implement __sub__

def __sub__(self, Q):
return self + -Q


Note that because the Ideal point is a subclass of point, it inherits all of these special functions while it only needs to override __add__ and __neg__. Thank you, polymorphism! The last function we want is a scaling function, which efficiently adds a point to itself $n$ times.

class Point(object):
...
def __mul__(self, n):
if not isinstance(n, int):
raise Exception("Can't scale a point by something which isn't an int!")
else:
if n < 0:
return -self * -n
if n == 0:
return Ideal(self.curve)
else:
Q = self
R = self if n & 1 == 1 else Ideal(self.curve)

i = 2
while i <= n:
Q = Q + Q

if n & i == i:
R = Q + R

i = i << 1
return R

def __rmul__(self, n):
return self * n

class Ideal(Point):
...
def __mul__(self, n):
if not isinstance(n, int):
raise Exception("Can't scale a point by something which isn't an int!")
else:
return self


The scaling function allows us to quickly compute $nP = P + P + \dots + P$ ($n$ times). Indeed, the fact that we can do this more efficiently than performing $n$ additions is what makes elliptic curve cryptography work. We’ll take a deeper look at this in the next post, but for now let’s just say what the algorithm is doing.

Given a number written in binary $n = b_kb_{k-1}\dots b_1b_0$, we can write $nP$ as

$\displaystyle b_0 P + b_1 2P + b_2 4P + \dots + b_k 2^k P$

The advantage of this is that we can compute each of the $P, 2P, 4P, \dots, 2^kP$ iteratively using only $k$ additions by multiplying by 2 (adding something to itself) $k$ times. Since the number of bits in $n$ is $k= \log(n)$, we’re getting a huge improvement over $n$ additions.

The algorithm is given above in code, but it’s a simple bit-shifting trick. Just have $i$ be some power of two, shifted by one at the end of every loop. Then start with $Q_0$ being $P$, and replace $Q_{j+1} = Q_j + Q_j$, and in typical programming fashion we drop the indices and overwrite the variable binding at each step (Q = Q+Q). Finally, we have a variable $R$ to which $Q_j$ is added when the $j$-th bit of $n$ is a 1 (and ignored when it’s 0). The rest is bookkeeping.

Note that __mul__ only allows us to write something like P * n, but the standard notation for scaling is n * P. This is what __rmul__ allows us to do.

We could add many other helper functions, such as ones to allow us to treat points as if they were lists, checking for equality of points, comparison functions to allow one to sort a list of points in lex order, or a function to transform points into more standard types like tuples and lists. We have done a few of these that you can see if you visit the code repository, but we’ll leave flushing out the class as an exercise to the reader.

Some examples:

>>> import fractions
>>> frac = fractions.Fraction
>>> C = EllipticCurve(a = frac(-2), b = frac(4))
>>> P = Point(C, frac(3), frac(5))
>>> Q = Point(C, frac(-2), frac(0))
>>> P-Q
(Fraction(0, 1), Fraction(-2, 1))
>>> P+P+P+P+P
(Fraction(2312883, 1142761), Fraction(-3507297955, 1221611509))
>>> 5*P
(Fraction(2312883, 1142761), Fraction(-3507297955, 1221611509))
>>> Q - 3*P
(Fraction(240, 1), Fraction(3718, 1))
>>> -20*P
(Fraction(872171688955240345797378940145384578112856996417727644408306502486841054959621893457430066791656001, 520783120481946829397143140761792686044102902921369189488390484560995418035368116532220330470490000), Fraction(-27483290931268103431471546265260141280423344817266158619907625209686954671299076160289194864753864983185162878307166869927581148168092234359162702751, 11884621345605454720092065232176302286055268099954516777276277410691669963302621761108166472206145876157873100626715793555129780028801183525093000000))


As one can see, the precision gets very large very quickly. One thing we’ll do to avoid such large numbers (but hopefully not sacrifice security) is to work in finite fields, the simplest version of which is to compute modulo some prime.

So now we have a concrete understanding of the algorithm for adding points on elliptic curves, and a working Python program to do this for rational numbers or floating point numbers (if we want to deal with precision issues). Next time we’ll continue this train of thought and upgrade our program (with very little work!) to work over other simple number fields. Then we’ll delve into the cryptographic issues, and talk about how one might encode messages on a curve and use algebraic operations to encode their messages.

Until then!

About these ads

# Simulating a Biased Coin with a Fair Coin

This is a guest post by my friend and colleague Adam Lelkes. Adam’s interests are in algebra and theoretical computer science. This gem came up because Adam gave a talk on probabilistic computation in which he discussed this technique.

Problem: simulate a biased coin using a fair coin.

Solution: (in Python)

def biasedCoin(binaryDigitStream, fairCoin):
for d in binaryDigitStream:
if fairCoin() != d:
return d


Discussion: This function takes two arguments, an iterator representing the binary expansion of the intended probability of getting 1 (let us denote it as $p$) and another function that returns 1 or 0 with equal probability. At first glance this might seem like an overcomplicated way of solving this problem: why can’t the probability be a floating point number?

The point is that $p$ can have infinite precision! Assuming that fairCoin() gives us a perfectly random stream of 1′s and 0′s (independently and with probability 1/2) and we can read each bit of the binary expansion of $p$, this function returns 1 with probability exactly $p$ even if $p$ is irrational or a fraction with infinite decimal expansion. If we used floating point arithmetic there would be a small chance we get unlucky and exhaust the precision available. We would only get an approximation of the true bias at best.

Now let us explain why this algorithm works. We keep tossing our fair coins to get a sequence of random bits, until one of our random bits is different from the corresponding bit in the binary expansion of $p$. If we stop after $i$ steps, that means that the first $i-1$ bits in the two binary sequences were the same, which happens with probability $\frac{1}{2^{i-1}}$. Given that this happens, in the $i$th step we will return the $i$th bit of $p$; let us denote this bit by $p_i$. Then the probability of returning 1 is $\sum_{i=1}^\infty \frac{p_i}{2^{i-1}}$, which is the binary expansion of $p$.

This algorithm is also efficient. By efficient here we mean that the expected running time is constant. Of course, to show this we need to make some assumption about the computational complexity of calculating the bits of $p$. If we assume that the bits of $p$ are efficiently computable in the sense that the time required to compute $p_i$ is bounded by a polynomial in $i$, then this algorithm does run in constant expected time.

Indeed, the expected running time is $\sum_{i=0}^\infty \frac{i^n}{2^i}$. Showing that this sum is a constant is an easy calculus exercise: using the ratio test we get that

$\displaystyle \textup{limsup}_{i \to \infty} \left | \frac{\frac{(i+1)^n}{2^{i+1}}}{\frac{i^n}{2^i}} \right | = \limsup_{i\to\infty} \frac{\left(\frac{i+1}{i}\right)^n}{2} = \frac{1}{2} < 1$,

thus the series is convergent.

Now that we proved that our algorithm works, it’s time to try it! Let’s say that we want to simulate a coin which gives “heads” with probability 1/3.
We need to construct our binary digit stream. Since 1/3 is 0.010101… in binary, we could use the following simple generator:

def oneThird():
while True:
yield 0
yield 1


However, we might want to have a more general generator that gives us the binary representation of any number. The following function, which takes a number between 0 and 1 as its argument, does the job:

def binaryDigits(fraction):
while True:
fraction *= 2
yield int(fraction)
fraction = fraction % 1


We also need a fair coin simulator. For this simulation, let’s just use Python’s built-in pseudo-random number generator:

def fairCoin():
return random.choice([0,1])


Let us toss our biased coin 10000 times and take the sum. We expect the sum to be around 3333. Indeed, when I tried

>>> sum(biasedCoin(oneThird(), fairCoin) for i in range(10000))
3330


It might be worth noting oneThird() is approximately ten times faster than binaryDigits(fractions.Fraction(1,3)), so when a large number of biased coins is needed, you can hardwire the binary representation of $p$ into the program.

# Simulating a Fair Coin with a Biased Coin

This is a guest post by my friend and colleague Adam Lelkes. Adam’s interests are in algebra and theoretical computer science. This gem came up because Adam gave a talk on probabilistic computation in which he discussed this technique.

Problem: Simulate a fair coin given only access to a biased coin.

Solution: (in Python)

def fairCoin(biasedCoin):
coin1, coin2 = 0,0
while coin1 == coin2:
coin1, coin2 = biasedCoin(), biasedCoin()
return coin1


Discussion: This is originally von Neumann’s clever idea. If we have a biased coin (i.e. a coin that comes up heads with probability different from 1/2), we can simulate a fair coin by tossing pairs of coins until the two results are different. Given that we have different results, the probability that the first is “heads” and the second is “tails” is the same as the probability of “tails” then “heads”. So if we simply return the value of the first coin, we will get “heads” or “tails” with the same probability, i.e. 1/2.

Note that we did not have to know or assume anything about our biasedCoin function other than it returns 0 or 1 every time, and the results between function calls are independent and identically distributed. In particular, we do not need to know the probability of getting 1. (However, that probability should be strictly between 0 or 1.) Also, we do not use any randomness directly, only through the biasedCoin function.

Here is a simple simulation:

from random import random
def biasedCoin():
return int(random() < 0.2)


This function will return 1 with probability 0.2. If we try

sum(biasedCoin() for i in range(10000))


with high probability we will get a number that is close to 2000. I got 2058.

On the other hand, if we try

sum(fairCoin(biasedCoin) for i in range(10000))


we should see a value that is approximately 5000. Indeed, when I tried it, I got 4982, which is evidence that fairCoin(biasedCoin) returns 1 with probability 1/2 (although I already gave a proof!).

One might wonder how many calls to biasedCoin we expect to make before the function returns. One can recognize the experiment as a geometric distribution and use the known expected value, but it is short so here is a proof. Let $s$ be the probability of seeing two different outcomes in the biased coin flip, and $t$ the expected number of trials until that happens. If after two flips we see the same outcome (HH or TT), then by independence the expected number of flips we need is unchanged. Hence

$t = 2s + (1-s)(2 + t)$

Simplifying gives $t = 2/s$, and since we know $s = 2p(1-p)$ we expect to flip the coin $\frac{1}{p(1-p)}$ times.

For a deeper dive into this topic, see these notes by Michael Mitzenmacher from Harvard University. They discuss strategies for simulating a fair coin from a biased coin that are optimal in the expected number of flips required to run the experiment once. He has also written a book on the subject of randomness in computing.

# Fixing Bugs in “Computing Homology”

A few awesome readers have posted comments in Computing Homology to the effect of, “Your code is not quite correct!” And they’re right! Despite the almost year since that post’s publication, I haven’t bothered to test it for more complicated simplicial complexes, or even the basic edge cases! When I posted it the mathematics just felt so solid to me that it had to be right (the irony is rich, I know).

As such I’m apologizing for my lack of rigor and explaining what went wrong, the fix, and giving some test cases. As of the publishing of this post, the Github repository for Computing Homology has been updated with the correct code, and some more examples.

The main subroutine was the simultaneousReduce function which I’ll post in its incorrectness below

def simultaneousReduce(A, B):
if A.shape[1] != B.shape[0]:
raise Exception("Matrices have the wrong shape.")

numRows, numCols = A.shape # col reduce A

i,j = 0,0
while True:
if i >= numRows or j >= numCols:
break

if A[i][j] == 0:
nonzeroCol = j
while nonzeroCol < numCols and A[i,nonzeroCol] == 0:
nonzeroCol += 1

if nonzeroCol == numCols:
j += 1
continue

colSwap(A, j, nonzeroCol)
rowSwap(B, j, nonzeroCol)

pivot = A[i,j]
scaleCol(A, j, 1.0 / pivot)
scaleRow(B, j, 1.0 / pivot)

for otherCol in range(0, numCols):
if otherCol == j:
continue
if A[i, otherCol] != 0:
scaleAmt = -A[i, otherCol]
colCombine(A, otherCol, j, scaleAmt)
rowCombine(B, j, otherCol, -scaleAmt)

i += 1; j+= 1

return A,B


It’s a beast of a function, and the persnickety detail was just as beastly: this snippet should have an $i += 1$ instead of a $j$.

if nonzeroCol == numCols:
j += 1
continue


This is simply what happens when we’re looking for a nonzero entry in a row to use as a pivot for the corresponding column, but we can’t find one and have to move to the next row. A stupid error on my part that would be easily caught by proper test cases.

The next mistake is a mathematical misunderstanding. In short, the simultaneous column/row reduction process is not enough to get the $\partial_{k+1}$ matrix into the right form! Let’s see this with a nice example, a triangulation of the Mobius band. There are a number of triangulations we could use, many of which are seen in these slides. The one we’ll use is the following.

It’s first and second boundary maps are as follows (in code, because latex takes too much time to type out)

mobiusD1 = numpy.array([
[-1,-1,-1,-1, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0,-1,-1,-1, 0, 0, 0],
[ 0, 1, 0, 0, 1, 0, 0,-1,-1, 0],
[ 0, 0, 0, 1, 0, 0, 1, 0, 1, 1],
])

mobiusD2 = numpy.array([
[ 1, 0, 0, 0, 1],
[ 0, 0, 0, 1, 0],
[-1, 0, 0, 0, 0],
[ 0, 0, 0,-1,-1],
[ 0, 1, 0, 0, 0],
[ 1,-1, 0, 0, 0],
[ 0, 0, 0, 0, 1],
[ 0, 1, 1, 0, 0],
[ 0, 0,-1, 1, 0],
[ 0, 0, 1, 0, 0],
])


And if we were to run the above code on it we’d get a first Betti number of zero (which is incorrect, it’s first homology group has rank 1). Here are the reduced matrices.

>>> A1, B1 = simultaneousReduce(mobiusD1, mobiusD2)
>>> A1
array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]])
>>> B1
array([[ 0,  0,  0,  0,  0],
[ 0,  0,  0,  0,  0],
[ 0,  0,  0,  0,  0],
[ 0,  0,  0,  0,  0],
[ 0,  1,  0,  0,  0],
[ 1, -1,  0,  0,  0],
[ 0,  0,  0,  0,  1],
[ 0,  1,  1,  0,  0],
[ 0,  0, -1,  1,  0],
[ 0,  0,  1,  0,  0]])


The first reduced matrix looks fine; there’s nothing we can do to improve it. But the second one is not quite fully reduced! Notice that rows 5, 8 and 10 are not linearly independent. So we need to further row-reduce the nonzero part of this matrix before we can read off the true rank in the way we described last time. This isn’t so hard (we just need to reuse the old row-reduce function we’ve been using), but why is this allowed? It’s just because the corresponding column operations for those row operations are operating on columns of all zeros! So we need not worry about screwing up the work we did in column reducing the first matrix, as long as we only work with the nonzero rows of the second.

Of course, nothing is stopping us from ignoring the “corresponding” column operations, since we know we’re already done there. So we just have to finish row reducing this matrix.

This changes our bettiNumber function by adding a single call to a row-reduce function which we name so as to be clear what’s happening. The resulting function is

def bettiNumber(d_k, d_kplus1):
A, B = numpy.copy(d_k), numpy.copy(d_kplus1)
simultaneousReduce(A, B)
finishRowReducing(B)

dimKChains = A.shape[1]
kernelDim = dimKChains - numPivotCols(A)
imageDim = numPivotRows(B)

return kernelDim - imageDim


And running this on our Mobius band example gives:

>>> bettiNumber(mobiusD1, mobiusD2))
1


As desired. Just to make sure things are going swimmingly under the hood, we can check to see how finishRowReducing does after calling simultaneousReduce

>>> simultaneousReduce(mobiusD1, mobiusD2)
(array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]]), array([[ 0,  0,  0,  0,  0],
[ 0,  0,  0,  0,  0],
[ 0,  0,  0,  0,  0],
[ 0,  0,  0,  0,  0],
[ 0,  1,  0,  0,  0],
[ 1, -1,  0,  0,  0],
[ 0,  0,  0,  0,  1],
[ 0,  1,  1,  0,  0],
[ 0,  0, -1,  1,  0],
[ 0,  0,  1,  0,  0]]))
>>> finishRowReducing(mobiusD2)
array([[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])


Indeed, finishRowReducing finishes row reducing the second boundary matrix. Note that it doesn’t preserve how the rows of zeros lined up with the pivot columns of the reduced version of $\partial_1$ as it did in the previous post, but since in the end we’re only counting pivots it doesn’t matter how we switch rows. The “zeros lining up” part is just for a conceptual understanding of how the image lines up with the kernel for a valid simplicial complex.

In fixing this issue we’ve also fixed an issue another commenter mentioned, that you couldn’t blindly plug in the zero matrix for $\partial_0$ and get zeroth homology (which is the same thing as connected components). After our fix you can.

Of course there still might be bugs, but I have so many drafts lined up on this blog (and research papers to write, experiments to run, theorems to prove), that I’m going to put off writing a full test suite. I’ll just have to update this post with new bug fixes as they come. There’s just so much math and so little time But extra kudos to my amazing readers who were diligent enough to run examples and spot my error. I’m truly blessed to have you on my side.

Also note that this isn’t the most efficient way to represent the simplicial complex data, or the most efficient row reduction algorithm. If you’re going to run the code on big inputs, I suggest you take advantage of sparse matrix algorithms for doing this sort of stuff. You can represent the simplices as entries in a dictionary and do all sorts of clever optimizations to make the algorithm effectively linear time in the number of simplices.

Until next time!