Connecting Elliptic Curves with Finite Fields

So here we are. We’ve studied the general properties of elliptic curves, written a program for elliptic curve arithmetic over the rational numbers, and taken a long detour to get some familiarity with finite fields (the mathematical background and a program that implements arbitrary finite field arithmetic).

And now we want to get back on track and hook our elliptic curve program up with our finite field program to make everything work. And indeed, for most cases it’s just that simple! For example, take the point $ P = (2,1)$ on the elliptic curve $ y = x^3 + x + 1$ with coefficients in $ \mathbb{Z}/5$. Using purely code produced in previous posts, we can do arithmetic:

>>> F5 = FiniteField(5, 1)
>>> C = EllipticCurve(a=F5(1), b=F5(1))
>>> P = Point(C, F5(2), F5(1))
>>> P
(2 (mod 5), 1 (mod 5))
>>> 2*P
(2 (mod 5), 4 (mod 5))
>>> 3*P

Here’s an example of the same curve $ y^2 = x^3 + x + 1$ with coefficients over the finite field of order 25 $ \mathbb{F}_{5^2}$.

>>> F25 = FiniteField(5,2)
>>> F25.idealGenerator
3 + 0 t^1 + 1 t^2
>>> curve = EllipticCurve(a=F25([1]), b=F25([1]))
>>> x = F25([2,1])
>>> y = F25([0,2])
>>> y*y - x*x*x - x - 1
0 ∈ F_{5^2}
>>> curve.testPoint(x,y)
>>> P = Point(curve, x, y)
>>> -P
(2 + 1 t^1, 0 + 3 t^1)
>>> P+P
(3 + 1 t^1, 2)
>>> 4*P
(3 + 2 t^1, 4 + 4 t^1)
>>> 9*P

There are some subtle issues, though, in that we shouldn’t use the code we have to work over any finite field. But we’ve come very far and covered a lot of technical details, so let’s briefly remember how we got here.

Taking a Step Back

At the beginning there was only $ \mathbb{Q}$, the field of rational numbers. We had a really nice geometric picture of elliptic curves over this field, and using that picture we developed an algorithm for (geometrically) adding points.

add-points-exampleIf we assume the equation of the elliptic curve had this nice form (the so-called Weierstrass normal form, $ y^2 = x^3 + ax + b$), then we were able to translate the geometric algorithm into an algebraic one. This made it possible to write a program to perform the additions, and this was our first programmatic milestone. Along the way, we learned about groups and projective geometry, which I explained was the proper mathematical setting for elliptic curves. In that setting, we saw that for most fields, every elliptic curve could be modified into one in Weierstrass normal form without changing the algebraic structure of the set of solutions. Moreover, we saw that you can replace the field $ \mathbb{Q}$ with the field of your choice. The set of solutions to an elliptic curve still forms a group and the same algebraic point-adding algorithm works. It’s just an interesting quirk of mathematics that one way to represent elements of finite fields are as polynomial remainders when dividing by a “prime” polynomial (analogous to modular arithmetic with integers). So we spent a while actually implementing finite fields in terms of this representation.

The reader has probably heard of this, but in practice one uses a (very large) finite field for the coefficients of their elliptic curve. Often this is $ \mathbb{Z}/p$ for some really large prime $ p$, or the field of $ 2^m$ elements for some large integer $ m$. But one would naturally complain: there are so many (infinitely many!) finite fields to choose from! Which one should we use, and how did they choose these?

As with most engineering problems the answer is a trade-off, in this case between efficiency and security. Arithmetic is faster in fields of characteristic 2 (and easy to implement at the hardware level!) but a lot is known about the finite field of $ 2^m$ elements. In fact, if you are sloppy in picking $ m$ you’ll get no security at all! One prominent example is the so-called Weil descent attack, which breaks security assumptions for elliptic curve cryptography when $ m$ is not prime. These attacks use some sophisticated machinery, but this is how it goes. An abstract mathematical breakthrough can immediately invalidate cryptography based on certain elliptic curves.

But before we get neck-deep in cryptography we have an even bigger problem: for some finite fields, not every elliptic curve has a Weierstrass normal form! So our program isn’t expressive enough to represent all elliptic curves we might want to. We could avoid these curves in our applications, but that would be unnecessarily limiting. With a bit more careful work, we can devise a more general algorithm (and a different normal form) that works for all fields. But let’s understand the problem first.

In general, you can have an elliptic curve of the form $ \sum_{i+j=3} a_{i,j}x^iy^j = 0$. That is, it’s just a really general degree 3 polynomial in two variables. If we assume the discriminant of this polynomial is nonzero, we’ll get a smooth curve. And then to get to the Weierstrass normal form involves a bunch of changes of variables. The problem is that the algebraic manipulations you do require you to multiply and divide by 2 and 3. In a field of either characteristic, these operations are either destructive (multiplying by zero) or totally illegal (dividing by zero), and they ruin Weierstrass’s day.

So what can we do?

Well it turns out that there is a more general Weierstrass normal form, unsurprisingly called the generalized Weierstrass normal form. It looks like this

$ \displaystyle y^2 + a_1 xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$

The same geometric idea of drawing lines works for this curve as well. It’s just that now the formula is way more complicated. It involves computing a bunch of helper constants and computing far more arithmetic. My colleague Daniel Ngheim was kind enough to code up the algorithm, and here it is

    def __add__(self, Q):
        if isinstance(Q, Ideal):
            return Point(self.curve, self.x, self.y)

        a1,a2,a3,a4,a6 = (self.curve.a1, self.curve.a2, self.curve.a3, self.curve.a4, self.curve.a6)

        if self.x == Q.x:
            x = self.x
            if self.y + Q.y + a1*x + a3 == 0:
                return Ideal(self.curve)
                c = ((3*x*x + 2*a2*x + a4 - a1*self.y) / (2*self.y + a1*x + a3))
                d = (-(x*x*x) + a4*x + 2*a6 - a3*self.y) / (2*self.y + a1*x + a3)
                Sum_x = c*c + a1*c - a2 - 2*self.x
                Sum_y = -(c + a1) * Sum_x - d - a3
                return Point(self.curve, Sum_x, Sum_y)
            c =  (Q.y - self.y) / (Q.x - self.x)
            d =  (self.y*Q.x - Q.y*self.x) / (Q.x - self.x)
            Sum_x = c*c + a1*c - a2 - self.x - Q.x
            Sum_y = -(c + a1)*Sum_x - d - a3
            return Point(self.curve, Sum_x, Sum_y)

   def __neg__(self):
      return Point(self.curve, self.x, -self.y - self.curve.a1*self.x - self.curve.a3)

I trust that the devoted reader could derive this algorithm by hand, but for a more detailed derivation see the book of Silverman (it’s a graduate level text, but the point is that if you’re not really serious about implementing elliptic curve cryptography then you shouldn’t worry about this more general algorithm).

One might start to wonder: are there still other forms of elliptic curves that we could use to get around some of the difficulties of the Weierstrass normal form? The answer is yes, but we’ll defer their discussion to a future post. The brief explanation is that through a different choice of variable changes you can get to a different form of curve, and the algorithms you get from writing out the algebraic equations for adding points are slightly more efficient.

For the remainder of this series we’ll just work with one family of finite fields, those fields of the form $ \mathbb{Z}/p$ for some large $ p$. There is one particularly famous elliptic curve over this field that is used in some of the most secure applications in existence, and this will roughly be our target. In either case, we have provided the combined elliptic curve and finite field code (and the generalized elliptic curve class) on this blog’s Github page.

So in the next post we’ll actually start talking about cryptography and how to use elliptic curves to do things like generate a shared secret key.

Until then!

(Finite) Fields — A Primer

So far on this blog we’ve given some introductory notes on a few kinds of algebraic structures in mathematics (most notably groups and rings, but also monoids). Fields are the next natural step in the progression.

If the reader is comfortable with rings, then a field is extremely simple to describe: they’re just commutative rings with 0 and 1, where every nonzero element has a multiplicative inverse. We’ll give a list of all of the properties that go into this “simple” definition in a moment, but an even more simple way to describe a field is as a place where “arithmetic makes sense.” That is, you get operations for $ +,-, \cdot , /$ which satisfy the expected properties of addition, subtraction, multiplication, and division. So whatever the objects in your field are (and sometimes they are quite weird objects), they behave like usual numbers in a very concrete sense.

So here’s the official definition of a field. We call a set $ F$ a field if it is endowed with two binary operations addition ($ +$) and multiplication ($ \cdot$, or just symbol juxtaposition) that have the following properties:

  • There is an element we call 0 which is the identity for addition.
  • Addition is commutative and associative.
  • Every element $ a \in F$ has a corresponding additive inverse $ b$ (which may equal $ a$) for which $ a + b = 0$.

These three properties are just the axioms of a (commutative) group, so we continue:

  • There is an element we call 1 (distinct from 0) which is the identity for multiplication.
  • Multiplication is commutative and associative.
  • Every nonzero element $ a \in F$ has a corresponding multiplicative inverse $ b$ (which may equal $ a$) for which $ ab = 1$.
  • Addition and multiplication distribute across each other as we expect.

If we exclude the existence of multiplicative inverses, these properties make $ F$ a commutative ring, and so we have the following chain of inclusions that describes it all

$ \displaystyle \textup{Fields} \subset \textup{Commutative Rings} \subset \textup{Rings} \subset \textup{Commutative Groups} \subset \textup{Groups}$

The standard examples of fields are the real numbers $ \mathbb{R}$, the rationals $ \mathbb{Q}$, and the complex numbers $ \mathbb{C}$. But of course there are many many more. The first natural question to ask about fields is: what can they look like?

For example, can there be any finite fields? A field $ F$ which as a set has only finitely many elements?

As we saw in our studies of groups and rings, the answer is yes! The simplest example is the set of integers modulo some prime $ p$. We call them $ \mathbb{Z} / p \mathbb{Z},$ or sometimes just $ \mathbb{Z}/p$ for short, and let’s rederive what we know about them now.

As a set, $ \mathbb{Z}/p$ consists of the integers $ \left \{ 0, 1, \dots, p-1 \right \}$. The addition and multiplication operations are easy to define, they’re just usual addition and multiplication followed by a modulus. That is, we add by $ a + b \mod p$ and multiply with $ ab \mod p$. This thing is clearly a commutative ring (because the integers form a commutative ring), so to show this is a field we need to show that everything has a multiplicative inverse.

There is a nice fact that allows us to do this: an element $ a$ has an inverse if and only if the only way for it to divide zero is the trivial way $ 0a = 0$. Here’s a proof. For one direction, suppose $ a$ divides zero nontrivially, that is there is some $ c \neq 0$ with $ ac = 0$. Then if $ a$ had an inverse $ b$, then $ 0 = b(ac) = (ba)c = c$, but that’s very embarrassing for $ c$ because it claimed to be nonzero. Now suppose $ a$ only divides zero in the trivial way. Then look at all possible ways to multiply $ a$ by other nonzero elements of $ F$. No two can give you the same result because if $ ax = ay$ then (without using multiplicative inverses) $ a(x-y) = 0$, but we know that $ a$ can only divide zero in the trivial way so $ x=y$. In other words, the map “multiplication by $ a$” is injective. Because the set of nonzero elements of $ F$ is finite you have to hit everything (the map is in fact a bijection), and some $ x$ will give you $ ax = 1$.

Now let’s use this fact on $ \mathbb{Z}/p$ in the obvious way. Since $ p$ is a prime, there are no two smaller numbers $ a, b < p$ so that $ ab = p$. But in $ \mathbb{Z}/p$ the number $ p$ is equivalent to zero (mod $ p$)! So $ \mathbb{Z}/p$ has no nontrivial zero divisors, and so every element has an inverse, and so it’s a finite field with $ p$ elements.

The next question is obvious: can we get finite fields of other sizes? The answer turns out to be yes, but you can’t get finite fields of any size. Let’s see why.

Characteristics and Vector Spaces

Say you have a finite field $ k$ (lower-case k is the standard letter for a field, so let’s forget about $ F$). Beacuse the field is finite, if you take 1 and keep adding it to itself you’ll eventually run out of field elements. That is, $ n = 1 + 1 + \dots + 1 = 0$ at some point. How do I know it’s zero and doesn’t keep cycling never hitting zero? Well if at two points $ n = m \neq 0$, then $ n-m = 0$ is a time where you hit zero, contradicting the claim.

Now we define $ \textup{char}(k)$, the characteristic of $ k$, to be the smallest $ n$ (sums of 1 with itself) for which $ n = 0$. If there is no such $ n$ (this can happen if $ k$ is infinite, but doesn’t always happen for infinite fields), then we say the characteristic is zero. It would probably make more sense to say the characteristic is infinite, but that’s just the way it is. Of course, for finite fields the characteristic is always positive. So what can we say about this number? We have seen lots of example where it’s prime, but is it always prime? It turns out the answer is yes!

For if $ ab = n = \textup{char}(k)$ is composite, then by the minimality of $ n$ we get $ a,b \neq 0$, but $ ab = n = 0$. This can’t happen by our above observation, because being a zero divisor means you have no inverse! Contradiction, sucker.

But it might happen that there are elements of $ k$ that can’t be written as $ 1 + 1 + \dots + 1$ for any number of terms. We’ll construct examples in a minute (in fact, we’ll classify all finite fields), but we already have a lot of information about what those fields might look like. Indeed, since every field has 1 in it, we just showed that every finite field contains a smaller field (a subfield) of all the ways to add 1 to itself. Since the characteristic is prime, the subfield is a copy of $ \mathbb{Z}/p$ for $ p = \textup{char}(k)$. We call this special subfield the prime subfield of $ k$.

The relationship between the possible other elements of $ k$ and the prime subfield is very neat. Because think about it: if $ k$ is your field and $ F$ is your prime subfield, then the elements of $ k$ can interact with $ F$ just like any other field elements. But if we separate $ k$ from $ F$ (make a separate copy of $ F$), and just think of $ k$ as having addition, then the relationship with $ F$ is that of a vector space! In fact, whenever you have two fields $ k \subset k’$, the latter has the structure of a vector space over the former.

Back to finite fields, $ k$ is a vector space over its prime subfield, and now we can impose all the power and might of linear algebra against it. What’s it’s dimension? Finite because $ k$ is a finite set! Call the dimension $ m$, then we get a basis $ v_1, \dots, v_m$. Then the crucial part: every element of $ k$ has a unique representation in terms of the basis. So they are expanded in the form

$ \displaystyle f_1v_1 + \dots + f_mv_m$

where the $ f_i$ come from $ F$. But now, since these are all just field operations, every possible choice for the $ f_i$ has to give you a different field element. And how many choices are there for the $ f_i$? Each one has exactly $ |F| = \textup{char}(k) = p$. And so by counting we get that $ k$ has $ p^m$ many elements.

This is getting exciting quickly, but we have to pace ourselves! This is a constraint on the possible size of a finite field, but can we realize it for all choices of $ p, m$? The answer is again yes, and in the next section we’ll see how.  But reader be warned: the formal way to do it requires a little bit of familiarity with ideals in rings to understand the construction. I’ll try to avoid too much technical stuff, but if you don’t know what an ideal is, you should expect to get lost (it’s okay, that’s the nature of learning new math!).

Constructing All Finite Fields

Let’s describe a construction. Take a finite field $ k$ of characteristic $ p$, and say you want to make a field of size $ p^m$. What we need to do is construct a field extension, that is, find a bigger field containing $ k$ so that the vector space dimension of our new field over $ k$ is exactly $ m$.

What you can do is first form the ring of polynomials with coefficients in $ k$. This ring is usually denoted $ k[x]$, and it’s easy to check it’s a ring (polynomial addition and multiplication are defined in the usual way). Now if I were speaking to a mathematician I would say, “From here you take an irreducible monic polynomial $ p(x)$ of degree $ m$, and quotient your ring by the principal ideal generated by $ p$. The result is the field we want!”

In less compact terms, the idea is exactly the same as modular arithmetic on integers. Instead of doing arithmetic with integers modulo some prime (an irreducible integer), we’re doing arithmetic with polynomials modulo some irreducible polynomial $ p(x)$. Now you see the reason I used $ p$ for a polynomial, to highlight the parallel thought process. What I mean by “modulo a polynomial” is that you divide some element $ f$ in your ring by $ p$ as much as you can, until the degree of the remainder is smaller than the degree of $ p(x)$, and that’s the element of your quotient. The Euclidean algorithm guarantees that we can do this no matter what $ k$ is (in the formal parlance, $ k[x]$ is called a Euclidean domain for this very reason). In still other words, the “quotient structure” tells us that two polynomials $ f, g \in k[x]$ are considered to be the same in $ k[x] / p$ if and only if $ f – g$ is divisible by $ p$. This is actually the same definition for $ \mathbb{Z}/p$, with polynomials replacing numbers, and if you haven’t already you can start to imagine why people decided to study rings in general.

Let’s do a specific example to see what’s going on. Say we’re working with $ k = \mathbb{Z}/3$ and we want to compute a field of size $ 27 = 3^3$. First we need to find a monic irreducible polynomial of degree $ 3$. For now, I just happen to know one: $ p(x) = x^3 – x + 1$. In fact, we can check it’s irreducible, because to be reducible it would have to have a linear factor and hence a root in $ \mathbb{Z}/3$. But it’s easy to see that if you compute $ p(0), p(1), p(2)$ and take (mod 3) you never get zero.

So I’m calling this new ring

$ \displaystyle \frac{\mathbb{Z}/3[x]}{(x^3 – x + 1)}$

It happens to be a field, and we can argue it with a whole lot of ring theory. First, we know an irreducible element of this ring is also prime (because the ring is a unique factorization domain), and prime elements generate maximal ideals (because it’s a principal ideal domain), and if you quotient by a maximal ideal you get a field (true of all rings).

But if we want to avoid that kind of argument and just focus on this ring, we can explicitly construct inverses. Say you have a polynomial $ f(x)$, and for illustration purposes we’ll choose $ f(x) = x^4 + x^2 – 1$. Now in the quotient ring we could do polynomial long division to find remainders, but another trick is just to notice that the quotient is equivalent to the condition that $ x^3 = x – 1$. So we can reduce $ f(x)$ by applying this rule to $ x^4 = x^3 x$ to get

$ \displaystyle f(x) = x^2 + x(x-1) – 1 = 2x^2 – x – 1$

Now what’s the inverse of $ f(x)$? Well we need a polynomial $ g(x) = ax^2 + bx + c$ whose product with $ f$ gives us something which is equivalent to 1, after you reduce by $ x^3 – x + 1$. A few minutes of algebra later and you’ll discover that this is equivalent to the following polynomial being identically 1

$ \displaystyle (a-b+2c)x^2 + (-3a+b-c)x + (a – 2b – 2c) = 1$

In other words, we get a system of linear equations which we need to solve:

$ \displaystyle \begin{aligned} a & – & b & + & 2c & = 0 \\ -3a & + & b & – & c &= 0 \\ a & – & 2b & – & 2c &= 1 \end{aligned}$

And from here you can solve with your favorite linear algebra techniques. This is a good exercise for working in fields, because you get to abuse the prime subfield being characteristic 3 to say terrifying things like $ -1 = 2$ and $ 6b = 0$. The end result is that the inverse polynomial is $ 2x^2 + x + 1$, and if you were really determined you could write a program to compute these linear systems for any input polynomial and ensure they’re all solvable. We prefer the ring theoretic proof.

In any case, it’s clear that taking a polynomial ring like this and quotienting by a monic irreducible polynomial gives you a field. We just control the size of that field by choosing the degree of the irreducible polynomial to our satisfaction. And that’s how we get all finite fields!

One Last Word on Irreducible Polynomials

One thing we’ve avoided is the question of why irreducible monic polynomials exist of all possible degrees $ m$ over any $ \mathbb{Z}/p$ (and as a consequence we can actually construct finite fields of all possible sizes).

The answer requires a bit of group theory to prove this, but it turns out that the polynomial $ x^{p^m} – x$ has all degree $ m$ monic irreducible polynomials as factors. But perhaps a better question (for computer scientists) is how do we work over a finite field in practice? One way is to work with polynomial arithmetic as we described above, but this has some downsides: it requires us to compute these irreducible monic polynomials (which doesn’t sound so hard, maybe), to do polynomial long division every time we add, subtract, or multiply, and to compute inverses by solving a linear system.

But we can do better for some special finite fields, say where the characteristic is 2 (smells like binary) or we’re only looking at $ F_{p^2}$. The benefit there is that we aren’t forced to use polynomials. We can come up with some other kind of structure (say, matrices of a special form) which happens to have the same field structure and makes computing operations relatively painless. We’ll see how this is done in the future, and see it applied to cryptography when we continue with our series on elliptic curve cryptography.

Until then!

Elliptic Curves as Python Objects

Last time we saw a geometric version of the algorithm to add points on elliptic curves. We went quite deep into the formal setting for it (projective space $ \mathbb{P}^2$), and we spent a lot of time talking about the right way to define the “zero” object in our elliptic curve so that our issues with vertical lines would disappear.

With that understanding in mind we now finally turn to code, and write classes for curves and points and implement the addition algorithm. As usual, all of the code we wrote in this post is available on this blog’s Github page.

Points and Curves

Every introductory programming student has probably written the following program in some language for a class representing a point.

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

It’s the simplest possible nontrivial class: an x and y value initialized by a constructor (and in Python all member variables are public).

We want this class to represent a point on an elliptic curve, and overload the addition and negation operators so that we can do stuff like this:

p1 = Point(3,7)
p2 = Point(4,4)
p3 = p1 + p2

But as we’ve spent quite a while discussing, the addition operators depend on the features of the elliptic curve they’re on (we have to draw lines and intersect it with the curve). There are a few ways we could make this happen, but in order to make the code that uses these classes as simple as possible, we’ll have each point contain a reference to the curve they come from. So we need a curve class.

It’s pretty simple, actually, since the class is just a placeholder for the coefficients of the defining equation. We assume the equation is already in the Weierstrass normal form, but if it weren’t one could perform a whole bunch of algebra to get it in that form (and you can see how convoluted the process is in this short report or page 115 (pdf p. 21) of this book). To be safe, we’ll add a few extra checks to make sure the curve is smooth.

class EllipticCurve(object):
   def __init__(self, a, b):
      # assume we're already in the Weierstrass form
      self.a = a
      self.b = b

      self.discriminant = -16 * (4 * a*a*a + 27 * b * b)
      if not self.isSmooth():
         raise Exception(&quot;The curve %s is not smooth!&quot; % self)

   def isSmooth(self):
      return self.discriminant != 0

   def testPoint(self, x, y):
      return y*y == x*x*x + self.a * x + self.b

   def __str__(self):
      return 'y^2 = x^3 + %Gx + %G' % (self.a, self.b)

   def __eq__(self, other):
      return (self.a, self.b) == (other.a, other.b)

And here’s some examples of creating curves

&gt;&gt;&gt; EllipticCurve(a=17, b=1)
y^2 = x^3 + 17x + 1
&gt;&gt;&gt; EllipticCurve(a=0, b=0)
Traceback (most recent call last):
Exception: The curve y^2 = x^3 + 0x + 0 is not smooth!

So there we have it. Now when we construct a Point, we add the curve as the extra argument and a safety-check to make sure the point being constructed is on the given elliptic curve.

class Point(object):
   def __init__(self, curve, x, y):
      self.curve = curve # the curve containing this point
      self.x = x
      self.y = y

      if not curve.testPoint(x,y):
         raise Exception(&quot;The point %s is not on the given curve %s&quot; % (self, curve))

Note that this last check will serve as a coarse unit test for all of our examples. If we mess up then more likely than not the “added” point won’t be on the curve at all. More precise testing is required to be bullet-proof, of course, but we leave explicit tests to the reader as an excuse to get their hands wet with equations.

Some examples:

&gt;&gt;&gt; c = EllipticCurve(a=1,b=2)
&gt;&gt;&gt; Point(c, 1, 2)
(1, 2)
&gt;&gt;&gt; Point(c, 1, 1)
Traceback (most recent call last):
Exception: The point (1, 1) is not on the given curve y^2 = x^3 + 1x + 2

Before we go ahead and implement addition and the related functions, we need to be decide how we want to represent the ideal point $ [0 : 1 : 0]$. We have two options. The first is to do everything in projective coordinates and define a whole system for doing projective algebra. Considering we only have one point to worry about, this seems like overkill (but could be fun). The second option, and the one we’ll choose, is to have a special subclass of Point that represents the ideal point.

class Ideal(Point):
   def __init__(self, curve):
      self.curve = curve

   def __str__(self):
      return &quot;Ideal&quot;

Note the inheritance is denoted by the parenthetical (Point) in the first line. Each function we define on a Point will require a 1-2 line overriding function in this subclass, so we will only need a small amount of extra bookkeeping. For example, negation is quite easy.

class Point(object):
   def __neg__(self):
      return Point(self.curve, self.x, -self.y)

class Ideal(Point):
   def __neg__(self):
      return self

Note that Python allows one to override the prefix-minus operation by defining __neg__ on a custom object. There are similar functions for addition (__add__), subtraction, and pretty much every built-in python operation. And of course addition is where things get more interesting. For the ideal point it’s trivial.

class Ideal(Point):
   def __add__(self, Q):
      return Q

Why does this make sense? Because (as we’ve said last time) the ideal point is the additive identity in the group structure of the curve. So by all of our analysis, $ P + 0 = 0 + P = P$, and the code is satisfyingly short.

For distinct points we have to follow the algorithm we used last time. Remember that the trick was to form the line $ L(x)$ passing through the two points being added, substitute that line for $ y$ in the elliptic curve, and then figure out the coefficient of $ x^2$ in the resulting polynomial. Then, using the two existing points, we could solve for the third root of the polynomial using Vieta’s formula.

In order to do that, we need to analytically solve for the coefficient of the $ x^2$ term of the equation $ L(x)^2 = x^3 + ax + b$. It’s tedious, but straightforward. First, write

$ \displaystyle L(x) = \left ( \frac{y_2 – y_1}{x_2 – x_1} \right ) (x – x_1) + y_1$

The first step of expanding $ L(x)^2$ gives us

$ \displaystyle L(x)^2 = y_1^2 + 2y_1 \left ( \frac{y_2 – y_1}{x_2 – x_1} \right ) (x – x_1) + \left [ \left (\frac{y_2 – y_1}{x_2 – x_1} \right ) (x – x_1) \right ]^2$

And we notice that the only term containing an $ x^2$ part is the last one. Expanding that gives us

$ \displaystyle \left ( \frac{y_2 – y_1}{x_2 – x_1} \right )^2 (x^2 – 2xx_1 + x_1^2)$

And again we can discard the parts that don’t involve $ x^2$. In other words, if we were to rewrite $ L(x)^2 = x^3 + ax + b$ as $ 0 = x^3 – L(x)^2 + ax + b$, we’d expand all the terms and get something that looks like

$ \displaystyle 0 = x^3 – \left ( \frac{y_2 – y_1}{x_2 – x_1} \right )^2 x^2 + C_1x + C_2$

where $ C_1, C_2$ are some constants that we don’t need. Now using Vieta’s formula and calling $ x_3$ the third root we seek, we know that

$ \displaystyle x_1 + x_2 + x_3 = \left ( \frac{y_2 – y_1}{x_2 – x_1} \right )^2$

Which means that $ x_3 = \left ( \frac{y_2 – y_1}{x_2 – x_1} \right )^2 – x_2 – x_1$. Once we have $ x_3$, we can get $ y_3$ from the equation of the line $ y_3 = L(x_3)$.

Note that this only works if the two points we’re trying to add are different! The other two cases were if the points were the same or lying on a vertical line. These gotchas will manifest themselves as conditional branches of our add function.

class Point(object):
   def __add__(self, Q):
      if isinstance(Q, Ideal):
         return self

      x_1, y_1, x_2, y_2 = self.x, self.y, Q.x, Q.y

      if (x_1, y_1) == (x_2, y_2):
         # use the tangent method
         if x_1 == x_2:
            return Ideal(self.curve) # vertical line

         # Using Vieta's formula for the sum of the roots
         m = (y_2 - y_1) / (x_2 - x_1)
         x_3 = m*m - x_2 - x_1
         y_3 = m*(x_3 - x_1) + y_1

         return Point(self.curve, x_3, -y_3)

First, we check if the two points are the same, in which case we use the tangent method (which we do next). Supposing the points are different, if their $ x$ values are the same then the line is vertical and the third point is the ideal point. Otherwise, we use the formula we defined above. Note the subtle and crucial minus sign at the end! The point $ (x_3, y_3)$ is the third point of intersection, but we still have to do the reflection to get the sum of the two points.

Now for the case when the points $ P, Q$ are actually the same. We’ll call it $ P = (x_1, y_1)$, and we’re trying to find $ 2P = P+P$. As per our algorithm, we compute the tangent line $ J(x)$ at $ P$. In order to do this we need just a tiny bit of calculus. To find the slope of the tangent line we implicitly differentiate the equation $ y^2 = x^3 + ax + b$ and get

$ \displaystyle \frac{dy}{dx} = \frac{3x^2 + a}{2y}$

The only time we’d get a vertical line is when the denominator is zero (you can verify this by taking limits if you wish), and so $ y=0$ implies that $ P+P = 0$ and we’re done. The fact that this can ever happen for a nonzero $ P$ should be surprising to any reader unfamiliar with groups! But without delving into a deep conversation about the different kinds of group structures out there, we’ll have to settle for such nice surprises.

In the other case $ y \neq 0$, we plug in our $ x,y$ values into the derivative and read off the slope $ m$ as $ (3x_1^2 + a)/(2y_1)$. Then using the same point slope formula for a line, we get $ J(x) = m(x-x_1) + y_1$, and we can use the same technique (and the same code!) from the first case to finish.

There is only one minor wrinkle we need to smooth out: can we be sure Vieta’s formula works? In fact, the real problem is this: how do we know that $ x_1$ is a double root of the resulting cubic? Well, this falls out again from that very abstract and powerful theorem of Bezout. There is a lot of technical algebraic geometry (and a very interesting but complicated notion of dimension) hiding behind the curtain here. But for our purposes it says that our tangent line intersects the elliptic curve with multiplicity 2, and this gives us a double root of the corresponding cubic.

And so in the addition function all we need to do is change the slope we’re using. This gives us a nice and short implementation

def __add__(self, Q):
      if isinstance(Q, Ideal):
         return self

      x_1, y_1, x_2, y_2 = self.x, self.y, Q.x, Q.y

      if (x_1, y_1) == (x_2, y_2):
         if y_1 == 0:
            return Ideal(self.curve)

         # slope of the tangent line
         m = (3 * x_1 * x_1 + self.curve.a) / (2 * y_1)
         if x_1 == x_2:
            return Ideal(self.curve)

         # slope of the secant line
         m = (y_2 - y_1) / (x_2 - x_1)

      x_3 = m*m - x_2 - x_1
      y_3 = m*(x_3 - x_1) + y_1

      return Point(self.curve, x_3, -y_3)

What’s interesting is how little the data of the curve comes into the picture. Nothing depends on $ b$, and only one of the two cases depends on $ a$. This is one reason the Weierstrass normal form is so useful, and it may bite us in the butt later in the few cases we don’t have it (for special number fields).

Here are some examples.

&gt;&gt;&gt; C = EllipticCurve(a=-2,b=4)
&gt;&gt;&gt; P = Point(C, 3, 5)
&gt;&gt;&gt; Q = Point(C, -2, 0)
&gt;&gt;&gt; P+Q
(0.0, -2.0)
&gt;&gt;&gt; Q+P
(0.0, -2.0)
&gt;&gt;&gt; Q+Q
&gt;&gt;&gt; P+P
(0.25, 1.875)
&gt;&gt;&gt; P+P+P
Traceback (most recent call last):
Exception: The point (-1.958677685950413, 0.6348610067618328) is not on the given curve y^2 = x^3 + -2x + 4!

&gt;&gt;&gt; x = -1.958677685950413
&gt;&gt;&gt; y = 0.6348610067618328
&gt;&gt;&gt; y*y - x*x*x + 2*x - 4

And so we crash headfirst into our first floating point arithmetic issue. We’ll vanquish this monster more permanently later in this series (in fact, we’ll just scrap it entirely and define our own number system!), but for now here’s a quick fix:

&gt;&gt;&gt; import fractions
&gt;&gt;&gt; frac = fractions.Fraction
&gt;&gt;&gt; C = EllipticCurve(a = frac(-2), b = frac(4))
&gt;&gt;&gt; P = Point(C, frac(3), frac(5))
&gt;&gt;&gt; P+P+P
(Fraction(-237, 121), Fraction(845, 1331))

Now that we have addition and negation, the rest of the class is just window dressing. For example, we want to be able to use the subtraction symbol, and so we need to implement __sub__

def __sub__(self, Q):
   return self + -Q

Note that because the Ideal point is a subclass of point, it inherits all of these special functions while it only needs to override __add__ and __neg__. Thank you, polymorphism! The last function we want is a scaling function, which efficiently adds a point to itself $ n$ times.

class Point(object):
   def __mul__(self, n):
      if not isinstance(n, int):
         raise Exception(&quot;Can't scale a point by something which isn't an int!&quot;)
            if n &lt; 0:
                return -self * -n
            if n == 0:
                return Ideal(self.curve)
                Q = self
                R = self if n &amp; 1 == 1 else Ideal(self.curve)

                i = 2
                while i &lt;= n:
                    Q = Q + Q

                    if n &amp; i == i:
                        R = Q + R

                    i = i &lt;&lt; 1
   return R

   def __rmul__(self, n):
      return self * n

class Ideal(Point):
    def __mul__(self, n):
        if not isinstance(n, int):
            raise Exception(&quot;Can't scale a point by something which isn't an int!&quot;)
            return self

The scaling function allows us to quickly compute $ nP = P + P + \dots + P$ ($ n$ times). Indeed, the fact that we can do this more efficiently than performing $ n$ additions is what makes elliptic curve cryptography work. We’ll take a deeper look at this in the next post, but for now let’s just say what the algorithm is doing.

Given a number written in binary $ n = b_kb_{k-1}\dots b_1b_0$, we can write $ nP$ as

$ \displaystyle b_0 P + b_1 2P + b_2 4P + \dots + b_k 2^k P$

The advantage of this is that we can compute each of the $ P, 2P, 4P, \dots, 2^kP$ iteratively using only $ k$ additions by multiplying by 2 (adding something to itself) $ k$ times. Since the number of bits in $ n$ is $ k= \log(n)$, we’re getting a huge improvement over $ n$ additions.

The algorithm is given above in code, but it’s a simple bit-shifting trick. Just have $ i$ be some power of two, shifted by one at the end of every loop. Then start with $ Q_0$ being $ P$, and replace $ Q_{j+1} = Q_j + Q_j$, and in typical programming fashion we drop the indices and overwrite the variable binding at each step (Q = Q+Q). Finally, we have a variable $ R$ to which $ Q_j$ is added when the $ j$-th bit of $ n$ is a 1 (and ignored when it’s 0). The rest is bookkeeping.

Note that __mul__ only allows us to write something like P * n, but the standard notation for scaling is n * P. This is what __rmul__ allows us to do.

We could add many other helper functions, such as ones to allow us to treat points as if they were lists, checking for equality of points, comparison functions to allow one to sort a list of points in lex order, or a function to transform points into more standard types like tuples and lists. We have done a few of these that you can see if you visit the code repository, but we’ll leave flushing out the class as an exercise to the reader.

Some examples:

&gt;&gt;&gt; import fractions
&gt;&gt;&gt; frac = fractions.Fraction
&gt;&gt;&gt; C = EllipticCurve(a = frac(-2), b = frac(4))
&gt;&gt;&gt; P = Point(C, frac(3), frac(5))
&gt;&gt;&gt; Q = Point(C, frac(-2), frac(0))
&gt;&gt;&gt; P-Q
(Fraction(0, 1), Fraction(-2, 1))
&gt;&gt;&gt; P+P+P+P+P
(Fraction(2312883, 1142761), Fraction(-3507297955, 1221611509))
&gt;&gt;&gt; 5*P
(Fraction(2312883, 1142761), Fraction(-3507297955, 1221611509))
&gt;&gt;&gt; Q - 3*P
(Fraction(240, 1), Fraction(3718, 1))
&gt;&gt;&gt; -20*P
(Fraction(872171688955240345797378940145384578112856996417727644408306502486841054959621893457430066791656001, 520783120481946829397143140761792686044102902921369189488390484560995418035368116532220330470490000), Fraction(-27483290931268103431471546265260141280423344817266158619907625209686954671299076160289194864753864983185162878307166869927581148168092234359162702751, 11884621345605454720092065232176302286055268099954516777276277410691669963302621761108166472206145876157873100626715793555129780028801183525093000000))

As one can see, the precision gets very large very quickly. One thing we’ll do to avoid such large numbers (but hopefully not sacrifice security) is to work in finite fields, the simplest version of which is to compute modulo some prime.

So now we have a concrete understanding of the algorithm for adding points on elliptic curves, and a working Python program to do this for rational numbers or floating point numbers (if we want to deal with precision issues). Next time we’ll continue this train of thought and upgrade our program (with very little work!) to work over other simple number fields. Then we’ll delve into the cryptographic issues, and talk about how one might encode messages on a curve and use algebraic operations to encode their messages.

Until then!

Elliptic Curves as Algebraic Structures

Last time we looked at the elementary formulation of an elliptic curve as the solutions to the equation

$ y^2 = x^3 + ax + b$

where $ a,b$ are such that the discriminant is nonzero:

$ -16(4a^3 + 27b^2) \neq 0$

We have yet to explain why we want our equation in this form, and we will get to that, but first we want to take our idea of intersecting lines as far as possible.

Fair warning: this post will start out at the same level as the previous post, but we intend to gradually introduce some mathematical maturity. If you don’t study mathematics, you’ll probably see terminology and notation somewhere between mysterious and incomprehensible. In particular, we will spend a large portion of this post explaining projective coordinates, and we use the blackboard-bold $ \mathbb{R}$ to denote real numbers.

Skimming difficult parts, asking questions in the comments, and pushing through to the end are all encouraged.

The Algorithm to Add Points

The deep idea, and the necessary insight for cryptography, is that the points on an elliptic curve have an algebraic structure. What I mean by this is that you can “add” points in a certain way, and it will satisfy all of the properties we expect of addition with true numbers. You may have guessed it based on our discussion in the previous post: adding two points will involve taking the line passing between them and finding the third point of intersection with the elliptic curve. But in order to make “adding points” rigorous we need to deal with some special cases (such as the vertical line problem we had last time).

So say we have two points $ P, Q$ on an elliptic curve $ E$ defined by $ y^2 = x^3 + ax + b$. By saying they’re “on the curve” we mean their coordinates satisfy the equation defining the curve. Then to add $ P + Q$, we do the following geometric algorithm:

  1. Form the line $ y = L(x)$ connecting $ P$ and $ Q$.
  2. Compute the third intersection point of $ L$ with $ E$ (the one that’s not $ P$ or $ Q$). Call it $ R$.
  3. Reflect $ R$ across the $ x$-axis to get the final point $ P + Q$.

Here’s that shown visually on our practice curve $ E: y^2 = x^3 – x + 1$.


Adding P and Q on an elliptic curve.

This algorithm might seem dubious, but it’s backed up by solid mathematics. For example, it’s almost immediately obvious that step 1 will always work (that you can always form such a line), the only exception being when $ P = Q$. And it’s almost a theorem that step 2 will always work (that there is always a third point of intersection), the only exception being the vertical line. If we ignore these exceptional cases, then the correctness of the algorithm is easy to prove, because we can just generalize the idea from last time.

Solving the joint system of the curve and the line is equivalent to solving

$ L(x)^2 = x^3 + ax + b$

Since $ L(x)$ is a degree 1 polynomial, this equation is a cubic polynomial in $ x$

$ x^3 – L(x)^2 + ax + b = 0$

If we already have two solutions to this equation with distinct $ x$-values (two points on the curve that don’t form a vertical line) then there has to be a third. Why? Because having a root of a polynomial means you can factor, and we have two distinct roots, so we know that our polynomial has as a divisor

$ (x – p_1)(x – q_1)$

But then the remainder must be a linear polynomial, and because the leading term is $ x^3$ it has to look like $ (x – r_1)$ for some $ r_1$. And so $ (r_1, L(r_1))$ is our third point. Moreover, $ p_1 + r_1 + q_1$ must be equal to the opposite of the coefficient of $ x^2$ in the equation above, so we can solve for it without worry about how to factor cubic polynomials. When we get down to some nitty-gritty code we’ll need to be more precise here with equations and such, but for now this is good enough.

Pause, Breathe, Reflect

It’s time to take a step back. There is a big picture overarching our work here. We have this mathematical object, a bunch of points on a curve in space, and we’re trying to say that it has algebraic structure. As we’ve been saying, we want to add points on the curve using our algorithm and always get a point on the curve as a result.

But beyond the algorithm, two important ideas are at work here. The first is that any time you make a new mathematical definition with the intent of overloading some operators (in this case, + and -), you want to make sure the operators behave like we expect them to. Otherwise “add” is a really misleading name!

The second idea is that we’re encoding computational structure in an elliptic curve. The ability to add and negate opens up a world of computational possibilities, and so any time we find algebraic structure in a mathematical object we can ask questions about the efficiency of computing functions within that framework (and reversing them!). This is precisely what cryptographers look for in a new encryption scheme: functions which are efficient to compute, but very hard to reverse if all you know is the output of the function.

So what are the properties we need to make sure addition behaves properly?

  1. We need there to be an the additive identity, which we’ll call zero, for which $ P + 0 = 0 + P = P$.
  2. We need every point $ P$ to have an inverse $ -P$ for which $ P + (-P) = 0$.
  3. We want adding to commute, so that $ P + Q = Q + P$. This property of an algebraic structure is called abelian or commutative.
  4. We need addition to be associative, so that $ (P + Q) + R = P + (Q + R)$.

In fact, if you just have a general collection of things (a set, in the mathematical parlance) and an operation which together satisfy these four properties, we call that a commutative group. By the end of this series we’ll to switch to the terminology of groups to get a mathematically mature viewpoint on elliptic curves, because it turns out that not all types of algebraic structure are the same (there are lots of different groups). But we’ll introduce it slowly as we see why elliptic curves form groups under the addition of their points.

There are still some things about our adding algorithm above that aren’t quite complete. We still need to know:

  1. What will act as zero.
  2. How to get the additive inverse of a point.
  3. How to add a point to itself.
  4. What to do if the two points form a vertical line.

The first, second, and fourth items are all taken care of in one fell swoop (and this is the main bit of mathematical elbow grease we have been hinting at), but the third point has a quick explanation if you’re willing to postpone a technical theorem. If you want to double a point, or add $ P + P = 2P$, then you can’t “take the line joining those two points” because you need two distinct points to define a line. But you can take the tangent line to the curve at $ P$ and look for the second point of intersection. Again we still have to worry about the case of a vertical line. But ignoring that, the reason there will always be a second point of intersection is called Bezout’s theorem, and this theorem is so strong and abstract that it’s very difficult to discuss it with what we presently know, but it has to do with counting multiplicity of roots of polynomial equations. Seeing that it’s mostly a technical tool, we’ll just be glad it’s there.

So with that, let’s get to the most important bit.

Projective Space and the Ideal Line

The shortest way to describe what will act as zero in our elliptic curve is to say that we invent a new point which is the “intersection” of all vertical lines. Because it’s the intersection of vertical lines, we’ll sometimes call it “the point at infinity.” We’ll also call it “zero” because it’s supposed to be the additive identity, we’ll demand it lies on every elliptic curve, and we’ll enforce that if we “reflect” zero across the $ x$-axis we still get zero. And then everything works!

If you want to add two points that form a vertical line, well, now the third point of intersection is zero and reflecting zero still gives you zero. If you want to get the additive inverse of a point $ P = (x,y)$, just have it be the point $ -P = (x, -y)$ reflected across the $ x$-axis; the two points form a vertical line and so by our algorithm they “add up” to zero. So it’s neat and tidy.

But wait, wait, wait. Points at infinity? All vertical lines intersect? This isn’t like any geometry I’ve ever seen before. How do we know we can do this without getting some massive contradictions!?

This is the best question one could ask, and we refuse to ignore it. Many articles aimed at a general technically literate audience get very fuzzy at this part. They make it seem like the mathematics we’re doing is magic, and they ignore the largest mathematical elephant in the room. Of course it’s not magic, and all of this has a solid foundation called projective space. We’re going to explore its basics now.

There is a long history of arguing over Euclid’s geometric axioms and how the postulate that parallel lines never intersect doesn’t follow from the other axioms. This is exactly what we’re going for, but blah blah blah let’s just see the construction already!

The crazy brilliant idea is that we want to make a geometric space where points are actually lines.

What will happen is that our elliptic curve equations will get an extra variable to account for this new kind of geometry, and a special value for this new variable will allow us to recover the usual pictures of elliptic curves in the plane (this is intentionally vague and will become more precise soon).

So how might one make such a geometry? Well you can do it with or without linear algebra. The way without linear algebra is to take three dimensional Euclidean space $ \mathbb{R}^3$, and look at the lines that pass through the origin. Make each of these lines its own point, and call the resulting space $ \mathbb{P}^2$, the projective plane. (For avid readers of this blog, this is exactly the same construction as $ \mathbb{R}\textup{P}^2$ we gave in our second topology primer, just seen from a very different angle. Leave a comment if you want to hear more.)

The problem with the non-linear-algebra approach is that we get no natural coordinate system, and we’re dying for coordinates (we’re here to compute stuff, after all!). So we need to talk about vectors. Every nonzero vector $ v$ in $ \mathbb{R}^3$ spans a line (by taking all multiples $ \lambda v$ for $ \lambda \in \mathbb{R}$). So instead of representing a point in projective space by a line, we can represent it by a vector, with the additional condition that two points are the same if they’re multiples of each other.

Here’s a picture of this idea. The two vectors displayed are equal to each other because they lie on the same line (they are multiples of each other).

These two vectors are equivalent because they lie on (span) the same line.

These two vectors are equivalent because they lie on (span) the same line.

Don’t run away because a very detailed explanation will follow what I’m about to say, but the super formal way of saying this is that projective space is the quotient space

$ \displaystyle \mathbb{P}^2 = (\mathbb{R}^3 – \left \{ 0 \right \} ) / v \sim \lambda v$

Still here? Okay great. Let’s talk about coordinates. If we are working with vectors in $ \mathbb{R}^3$, then we’re looking at coordinates like

$ (x, y, z)$

where $ x,y,z$ are real numbers. Trivial. But now in projective space we’re asserting two things. First, $ (0,0,0)$ is not allowed. And second, whenever we have $ (x,y,z)$, we’re declaring that it’s the same thing as $ (2x,2y,2z)$ or $ (-6x, -6y, -6z)$ or any other way to scale every component by the same amount. To denote the difference between usual vectors (parentheses) and our new coordinates, we use square brackets and colons. So a point in 2-dimensional projective space is

$ [x:y:z]$

where $ x,y,z$ are real numbers that are not all zero, and $ [x:y:z] = \lambda [x : y : z] = [\lambda x : \lambda y : \lambda z]$ for any $ \lambda \in \mathbb{R}$.

Now we can make some “canonical choices” for our coordinates, and start exploring how the assertions we made shape the geometry of this new space. For example, if $ P = [x:y:z]$ with $ z \neq 0$ then we can always scale by $ 1/z$ so that the point looks like

$ [x/z : y/z : 1]$

Now $ x/z$ and $ y/z$ can be anything (just think of it as $ [a : b : 1]$), and different choices of $ a,b$ give distinct points. Why? Because if we tried to scale to make two points equal we’d be screwing up the 1 that’s fixed in the third coordinate. So when $ z$ is nonzero we have this special representation (often called an affine slice), and it’s easy to see that all of these points form a copy of the usual Euclidean plane sitting inside $ \mathbb{P}^2$. There is a nice way to visualize exactly how this copy can be realized as “usual Euclidean space” using the picture below:


The indigo plane is the “affine slice” given by z = 1. It’s called “affine” because the indigo plane doesn’t intersect the origin. Image source:

Each line (each vector with $ z \neq 0$) intersects the indigo plane in exactly one point, so this describes a one-to-one mapping of points in the affine slice to the Euclidean plane.

But then when $ z = 0$, we get some other stuff that makes up the rest of $ \mathbb{P}^2$. Since $ x,y$ can’t both also be zero, we’ll suppose $ y$ is not zero, and then we can do the same normalization trick to see that all the points we get are

$ [a : 1 : 0]$

Since again $ a$ can be anything and you get distinct points for different choices of $ a$, this forms a copy of the real line inside of $ \mathbb{P}^2$ but outside of the affine slice. This line is sometimes called the ideal line to distinguish it from the lines that lie inside the affine slice $ z=1$. Actually, the ideal line is more than just a line. It’s (gasp) a circle! Why do I say that? Well think about what happens as $ a$ gets really large (or really negatively large). We have

$ [a : 1 : 0] = [1 : 1/a : 0]$

and the right hand side approaches $ [1 : 0 : 0]$, the last missing point! Another way to phrase our informal argument is to say $ [1 : 0 : 0]$ is the boundary of the line $ [a : 1 : 0]$, and (you guessed it) the circle we get here is the boundary of the affine slice $ [a : b : 1]$. And we can see exactly what it means for “two parallel lines” to intersect. Take the two lines given by

$ [a : 1 : 1], [b : 2 : 1]$

If we think of these as being in the affine slice of $ \mathbb{R}^2$ where $ z = 1$, it’s the lines given by $ (a, 1), (b, 2)$, which are obviously parallel. But where do they intersect as $ a,b$ get very large (or very negatively large)? at

$ [1 : 1/a : 1/a], [1 : 2/b : 1/b]$

which both become $ [1 : 0 : 0]$ in the limit. I’m being a bit imprecise here appealing to limits, but it works because projective space inherits some structure when we pass to the quotient (for the really technically inclined, it inherits a metric that comes from the sphere of radius 1). This is why we feel compelled to call it a quotient despite how confusing quotients can be, and it illustrates the power of appealing to these more abstract constructions.

In any case, now we have this image of projective space:

Our mental image of the projective plane: a big copy of the Euclidean plane (affine slice) whose boundary is the ideal line, whose boundary is in turn the single point [1:0:0].

Our mental image of the projective plane: a big copy of the Euclidean plane (affine slice z=1) whose boundary is the ideal line, whose boundary is in turn the single point [1:0:0].

It should be pretty clear that the choice of $ z=1$ to represent the affine slice is arbitrary, and we could have used $ x=1$ or $ y=1$ to realize different “copies” of the Euclidean plane sitting inside projective space. But in any case, we can use our new understanding to turn back to elliptic curves.

Homogeneous Equations and the Weierstrass Normal Form

Elliptic curves are officially “projective objects” in the sense that they are defined by homogeneous equations over projective space. That is, an elliptic curve equation is any homogeneous degree three equation whose discriminant is zero. By homogeneous I mean all the powers of the terms add up to three, so it has the general form

$ 0 = a_0 z^3 + a_1 z^2y + a_2 z^2x + a_3y^2x + \dots$

And note that now the solutions to this equation are required to be projective points $ [x : y : z]$. As an illuminating exercise, prove that $ [x : y : z]$ is a solution if and only if $ [\lambda x : \lambda y: \lambda z]$ is, i.e. that our definition of “solution” and the use of homogeneous equations is well-defined.

But to work with projective language forever is burdensome, something that only mathematicians are required to do. And in the case of elliptic curves we only use one new point from projective space (the intersection of the vertical lines). Once we get to writing programs we’ll have a special representation for points that aren’t in the affine slice, so we will be able to use regular Euclidean coordinates without much of a fuss.

That being said, we can now officially explain why we want the special form of elliptic curve $ y^2 = x^3 + ax + b$, called the Weierstrass normal form (pronounced VY-er-shtrahss). Specifically, elliptic curves can look very weird in their natural coordinates.

An elliptic curve before being converted to Weierstrass normal form.

An elliptic curve before being converted to Weierstrass normal form. Blech, it’s not even symmetric!

So to bring some order to the chaos, we want the projective point $ [0 : 1 : 0]$ to be our zero point. If we choose our axes appropriately (swapping the letters $ x, y, z$), then $ [0:1:0]$ lies at the intersection of all vertical lines. Now that point isn’t a solution to all homogeneous degree-three curves, but it is a solution to homogeneous equations that look like this (plug it in to see).

$ \displaystyle y^2z = x^3 + ax z^2 + b z^3$

Starting to look familiar? It turns out there is a theorem (that requires either heavy mathematical machinery or LOTS of algebra to prove) that says that for any homogeneous degree three equation you start with, you can always pick your “projective axes” (that is, apply suitable projective transformations) to get an equivalent equation of the form above. I mean equivalent in the sense that the transformation we applied took solutions of the original equation to solutions of this new equation and didn’t create any new ones. The casual reader should think of all this machinery as really clever changes of variables.

And then if we pick our classical Euclidean slice to be $ [x : y : 1]$, we get back to the standard form $ y^2 = x^3 + ax + b$. This is the Weierstrass normal form.

So that was a huge detour…

Back to adding points on elliptic curves. Now that we’ve defined zero one can check that addition makes sense. Zero has the needed property $ P + 0 = P$ since the “third” point of intersection of the vertical line passing through $ P$ is the reflection of $ P$ across the $ x$-axis, and reflecting that across the $ x$-axis give you $ P$. For the same reason $ P + (-P) = 0$. Even more, properties like $ 0 = -0$ naturally fall out of our definitions for projective coordinates, since $ [0:1:0] = [0:-1:0] = -1[0:1:0]$. So projective space, rather than mathematical hocus-pocus, is the correct setting to think about algebra happening on elliptic curves.

It’s also clear that addition is commutative because lines passing through points don’t care about the order of the points. The only real issue is whether addition is associative. That is whether $ (P + Q) + R = P + (Q + R)$ no matter what $ P,Q,R$ are. This turns out to be difficult, and it takes a lot of algebra and the use of that abstract Bezout’s theorem we mentioned earlier, so the reader will have to trust us that everything works out. (Although, Wikipedia has a slick animation outlining one such proof.)

So “adding points,” and that pesky “point at infinity” now officially makes sense!

What we’ve shown in the mathematical parlance is that the solutions to an elliptic curve form a group under this notion of addition. However, one thing we haven’t talked about is where these numbers come from. Recall from last time that we were interested in integer points on the elliptic curve, but it was clear from our example that adding two integer-valued points on an elliptic curve might not give you an integer-valued point.

However, if we require that our equation has coefficients in a field, and we allow our points to have coordinates in that same field, then adding two points with coordinates in the field always gives you a point with coordinates in the field. We haven’t ever formally talked about fields on this blog, but we’re all familiar with them: they have addition, multiplication, and division in the expected ways (everything except 0 has a multiplicative inverse, multiplication distributes across addition, etc.). The usual examples are $ \mathbb{R}, \mathbb{C}$ the real and complex numbers, and $ \mathbb{Q}$, the rational numbers (fractions of integers). There are also finite fields, which are the proper setting for elliptic curve cryptography, but we’ll save those for another post.

But why must it work if we use a field? Because the operations we need to perform the point-adding algorithm only use field operations (addition, multiplication, and division by other numbers in the field). So performing those operations on numbers in a field always give you back numbers in that field. Since all fields have 0 and 1, we also have the point at infinity $ [0 : 1 : 0]$.

This gives a natural definition.

Definition: Let $ k$ be a field and let $ E$ be the equation of an elliptic curve in Weierstrass form. Define $ E(k)$ to be the set of projective points on $ E$ with coordinates in $ k$ along with the ideal point $ [0:1:0]$. As we have discussed $ E(k)$ is a group under the operation of adding points, so we call it the elliptic curve group for $ E$ over $ k$.

Now that we’ve waded through a few hundred years of mathematics, it should be no surprise that this definition feels so deep and packed full of implicit understanding.

However, there are still a few mathematical peccadilloes to worry about. One is that if the chosen field $ k$ is particularly weird (as are many of those finite fields I mentioned), then we can’t transform any equation with coefficients in $ k$ into the Weierstrass normal form. This results in a few hiccups when you’re trying to do cryptography, but for simplicity’s sake we’ll avoid those areas and stick to nicer fields.

We have only scratched the surface of the algebraic structure of elliptic curves by showing elliptic curves have such structure at all. The next goal for a mathematician is to classify all possible algebraic structures for elliptic curves , and find easy ways to tell which from the coefficients of the equation. Indeed, we intend to provide a post at the end of this series (after we get to the juicy programs) that describes what’s known in this area from a more group-theoretic standpoint (i.e., for someone who has heard of the classification of finitely generated abelian groups).

But before that we’re ready to jump headfirst into some code. Next time we’ll implement the algorithms for adding points on elliptic curves using Python objects.

Until then!