Last time we looked at the elementary formulation of an elliptic curve as the solutions to the equation

$$y^2 = x^3 + ax + b$$

where $a, b$ are such that the discriminant is nonzero:

$$-16(4a^3 + 27b^2) \neq 0$$

We have yet to explain why we want our equation in this form, and we will get to that, but first we want to take our idea of intersecting lines as far as possible.

Fair warning: this post will start out at the same level as the previous post, but we intend to gradually introduce some mathematical maturity. If you don’t study mathematics, you’ll probably see terminology and notation somewhere between mysterious and incomprehensible. In particular, we will spend a large portion of this post explaining projective coordinates, and we use the blackboard-bold $\mathbb{R}$ to denote real numbers.

Skimming difficult parts, asking questions in the comments, and pushing through to the end are all encouraged.

## The Algorithm to Add Points

The deep idea, and the necessary insight for cryptography, is that the points on an elliptic curve have an *algebraic structure*. What I mean by this is that you can “add” points in a certain way, and it will satisfy all of the properties we expect of addition with ordinary numbers. You may have guessed it based on our discussion in the previous post: adding two points will involve taking the line passing through them and finding the third point of intersection with the elliptic curve. But in order to make “adding points” rigorous we need to deal with some special cases (such as the vertical line problem we had last time).

So say we have two points $P, Q$ on an elliptic curve $E$ defined by $y^2 = x^3 + ax + b$. By saying they’re “on the curve” we mean their coordinates satisfy the equation defining the curve. Then to add $P + Q$, we do the following geometric algorithm:

- Form the line $y = L(x)$ connecting $P$ and $Q$.
- Compute the third intersection point of $L$ with $E$ (the one that’s not $P$ or $Q$). Call it $R$.
- Reflect $R$ across the $x$-axis to get the final point $P + Q$.

Here’s that shown visually on our practice curve from the previous post.

This algorithm might seem dubious, but it’s backed up by solid mathematics. For example, it’s *almost* immediately obvious that step 1 will always work (that you can always form such a line), the only exception being when $P = Q$. And it’s *almost* a theorem that step 2 will always work (that there is always a third point of intersection), the only exception being the vertical line. If we ignore these exceptional cases, then the correctness of the algorithm is easy to prove, because we can just generalize the idea from last time.

Solving the joint system of the curve and the line is equivalent to solving

$$L(x)^2 = x^3 + ax + b$$

Since $L(x)$ is a degree 1 polynomial, this equation is a cubic polynomial in $x$

$$x^3 - L(x)^2 + ax + b = 0$$

If we already have two solutions $x_1, x_2$ to this equation with distinct $x$-values (two points on the curve that don’t form a vertical line) then there *has* to be a third. Why? Because having a root of a polynomial means you can factor, and we have *two distinct* roots, so we know that our polynomial has as a divisor

$$(x - x_1)(x - x_2)$$

But then the remaining factor must be a linear polynomial, and because the leading term is $x^3$ it has to look like $(x - x_3)$ for some $x_3$. And so $(x_3, L(x_3))$ is our third point. Moreover, $x_1 + x_2 + x_3$ must be equal to the opposite of the coefficient of $x^2$ in the equation above, so we can solve for it without worrying about how to factor cubic polynomials. When we get down to some nitty-gritty code we’ll need to be more precise here with equations and such, but for now this is good enough.
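To make that last step concrete, here is the expansion written out. This is only a sketch: it assumes $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ with $x_1 \neq x_2$, and it writes the line as $L(x) = mx + c$, where $m$ and $c$ are our own labels for the slope and intercept.

$$x^3 - (mx + c)^2 + ax + b = x^3 - m^2 x^2 + (a - 2mc)\,x + (b - c^2)$$

Comparing with $(x - x_1)(x - x_2)(x - x_3) = x^3 - (x_1 + x_2 + x_3)\,x^2 + \dots$ gives $x_1 + x_2 + x_3 = m^2$, so

$$x_3 = m^2 - x_1 - x_2, \qquad m = \frac{y_2 - y_1}{x_2 - x_1}.$$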

## Pause, Breathe, Reflect

It’s time to take a step back. There is a big picture overarching our work here. We have this mathematical object, a bunch of points on a curve in space, and we’re trying to say that it has *algebraic structure*. As we’ve been saying, we want to add points on the curve using our algorithm and always get a point on the curve as a result.

But beyond the algorithm, two important ideas are at work here. The first is that any time you make a new mathematical definition with the intent of overloading some operators (in this case, + and -), you want to make sure the operators behave like we expect them to. Otherwise “add” is a really misleading name!

The second idea is that we’re encoding *computational structure* in an elliptic curve. The ability to add and negate opens up a world of computational possibilities, and so any time we find algebraic structure in a mathematical object we can ask questions about the efficiency of computing functions within that framework (and reversing them!). This is precisely what cryptographers look for in a new encryption scheme: functions which are efficient to compute, but very hard to reverse if all you know is the output of the function.

So what are the properties we need to make sure addition behaves properly?

- We need there to be an additive identity, which we’ll call zero, for which $P + 0 = 0 + P = P$.
- We need every point $P$ to have an inverse $-P$ for which $P + (-P) = 0$.
- We want adding to commute, so that $P + Q = Q + P$. This property of an algebraic structure is called *abelian* or *commutative*.
- We need addition to be *associative*, so that $(P + Q) + R = P + (Q + R)$.

In fact, if you just have a general collection of things (a set, in the mathematical parlance) and an operation which together satisfy these four properties, we call that a *commutative group*. By the end of this series we’ll switch to the terminology of groups to get a mathematically mature viewpoint on elliptic curves, because it turns out that not all types of algebraic structure are the same (there are lots of different groups). But we’ll introduce it slowly as we see why elliptic curves form groups under the addition of their points.

There are still some things about our adding algorithm above that aren’t quite complete. We still need to know:

- What will act as zero.
- How to get the additive inverse of a point.
- How to add a point to itself.
- What to do if the two points form a vertical line.

The first, second, and fourth items are all taken care of in one fell swoop (and this is the main bit of mathematical elbow grease we have been hinting at), but the third point has a quick explanation if you’re willing to postpone a technical theorem. If you want to double a point, or add $P + P$, then you can’t “take the line joining those two points” because you need two *distinct* points to define a line. But you can take the *tangent* line to the curve at $P$ and look for the second point of intersection. Again we still have to worry about the case of a vertical line. But ignoring that, the reason there will always be a second point of intersection is called Bezout’s theorem, and this theorem is so strong and abstract that it’s very difficult to discuss it with what we presently know, but it has to do with counting *multiplicity* of roots of polynomial equations. Seeing that it’s mostly a technical tool, we’ll just be glad it’s there.
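For reference, the slope of that tangent line can be worked out by implicit differentiation. This is a sketch under the assumption that $P = (x_1, y_1)$ lies on $y^2 = x^3 + ax + b$ and that $y_1 \neq 0$ (otherwise the tangent is vertical); the name $m$ for the slope is our own.

$$2y\,\frac{dy}{dx} = 3x^2 + a \quad\Longrightarrow\quad m = \frac{3x_1^2 + a}{2y_1}$$

Because $x_1$ now counts as a double root of the cubic, the same sum-of-roots argument as before puts the other intersection at $x_3 = m^2 - 2x_1$.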

So with that, let’s get to the most important bit.

## Projective Space and the Ideal Line

The shortest way to describe what will act as zero in our elliptic curve is to say that we *invent* *a new point* which is the “intersection” of all vertical lines. Because it’s the intersection of vertical lines, we’ll sometimes call it “the point at infinity.” We’ll also call it “zero” because it’s supposed to be the additive identity, we’ll demand it lies on every elliptic curve, and we’ll enforce that if we “reflect” zero across the $x$-axis we still get zero. And then everything works!

If you want to add two points that form a vertical line, well, now the third point of intersection is zero and reflecting zero still gives you zero. If you want to get the additive inverse of a point $P$, just have it be the point reflected across the $x$-axis; the two points form a vertical line and so by our algorithm they “add up” to zero. So it’s neat and tidy.
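Putting the pieces so far together, the whole adding algorithm can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation this series will eventually build: the point at infinity is represented by `None` (our choice), points are `(x, y)` tuples assumed to lie on the curve, the names `ZERO`, `negate`, and `add` are hypothetical, and the sample curve $y^2 = x^3 - x + 1$ is chosen only for illustration.

```python
from fractions import Fraction

ZERO = None  # assumption: the point at infinity is represented by None


def negate(P):
    """Reflect P across the x-axis; zero reflects to itself."""
    return ZERO if P is ZERO else (P[0], -P[1])


def add(a, P, Q):
    """Add two points on y^2 = x^3 + a*x + b (b is not needed by the formulas)."""
    if P is ZERO:
        return Q
    if Q is ZERO:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        # Vertical line (this also covers a vertical tangent, where y1 == 0).
        return ZERO
    if x1 == x2:
        # P == Q: use the slope of the tangent line.
        m = Fraction(3 * x1 * x1 + a) / Fraction(2 * y1)
    else:
        # Distinct x-values: use the slope of the line through P and Q.
        m = Fraction(y2 - y1) / Fraction(x2 - x1)
    x3 = m * m - x1 - x2           # the three roots sum to m^2
    y3 = m * (x3 - x1) + y1        # third intersection point on the line
    return (x3, -y3)               # reflect across the x-axis


# Sample curve y^2 = x^3 - x + 1, so a = -1.
print(add(-1, (0, 1), (3, 5)))     # (Fraction(-11, 9), Fraction(17, 27))
print(add(-1, (0, 1), (0, -1)))    # None, i.e. zero
```

Notice that the first sum starts from two integer points and lands on a point with fractional coordinates, which is why the sketch uses `Fraction` rather than integers.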

But wait, wait, wait. Points at infinity? All vertical lines intersect? This isn’t like any geometry I’ve ever seen before. How do we know we can do this without getting some massive contradictions!?

This is the best question one could ask, and we refuse to ignore it. Many articles aimed at a general technically literate audience get very fuzzy at this part. They make it seem like the mathematics we’re doing is magic, and they ignore the largest mathematical elephant in the room. Of course it’s not magic, and all of this has a solid foundation called *projective space*. We’re going to explore its basics now.

There is a long history of arguing over Euclid’s geometric axioms and how the postulate that parallel lines never intersect doesn’t follow from the other axioms. This is exactly what we’re going for, but blah blah blah let’s just see the construction already!

The crazy brilliant idea is that we want to make a geometric space where points are actually *lines*.

What will happen is that our elliptic curve equations will get an extra variable to account for this new kind of geometry, and a special value for this new variable will allow us to recover the usual pictures of elliptic curves in the plane (this is intentionally vague and will become more precise soon).

So how might one make such a geometry? Well you can do it with or without linear algebra. The way without linear algebra is to take three dimensional Euclidean space $\mathbb{R}^3$, and look at the lines that pass through the origin. Make each of these lines its own point, and call the resulting space $\mathbb{P}^2(\mathbb{R})$, *the projective plane*. (For avid readers of this blog, this is exactly the same construction as we gave in our second topology primer, just seen from a very different angle. Leave a comment if you want to hear more.)

The problem with the non-linear-algebra approach is that we get no natural coordinate system, and we’re *dying* for coordinates (we’re here to compute stuff, after all!). So we need to talk about vectors. Every nonzero vector $v$ in $\mathbb{R}^3$ spans a line (by taking all multiples $\lambda v$ for $\lambda \in \mathbb{R}$). So instead of representing a point in projective space by a line, we can represent it by a vector, with the additional condition that two points are the same if they’re multiples of each other.

Here’s a picture of this idea. The two vectors displayed are equal to each other because they lie on the same line (they are multiples of each other).

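Don’t run away because a very detailed explanation will follow what I’m about to say, but the super formal way of saying this is that projective space is the *quotient space*

$$\mathbb{P}^2(\mathbb{R}) = (\mathbb{R}^3 \setminus \{0\}) \,/\, (v \sim \lambda v \text{ for } \lambda \neq 0)$$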

Still here? Okay great. Let’s talk about coordinates. If we are working with vectors in $\mathbb{R}^3$, then we’re looking at coordinates like

$$(x, y, z)$$

where $x, y, z$ are real numbers. Trivial. But now in projective space we’re asserting two things. First, $(0, 0, 0)$ is not allowed. And second, whenever we have $(x, y, z)$, we’re declaring that it’s the *same thing* as $(2x, 2y, 2z)$ or $(-\pi x, -\pi y, -\pi z)$ or any other way to scale every component by the same nonzero amount. To denote the difference between usual vectors (parentheses) and our new coordinates, we use square brackets and colons. So a point in 2-dimensional projective space is

$$[x : y : z]$$

where $x, y, z$ are real numbers that are not all zero, and $[x : y : z] = [\lambda x : \lambda y : \lambda z]$ for any nonzero $\lambda \in \mathbb{R}$.

Now we can make some “canonical choices” for our coordinates, and start exploring how the assertions we made shape the geometry of this new space. For example, if $[x : y : z]$ is a point with $z \neq 0$ then we can always scale by $1/z$ so that the point looks like

$$[x/z : y/z : 1]$$

Now $x/z$ and $y/z$ can be anything (just think of it as $[a : b : 1]$), and different choices of $a, b$ give *distinct* points. Why? Because if we tried to scale to make two points equal we’d be screwing up the 1 that’s fixed in the third coordinate. So when $z$ is nonzero we have this special representation (often called an *affine slice*), and it’s easy to see that all of these points form a copy of the usual Euclidean plane sitting inside $\mathbb{P}^2(\mathbb{R})$. There is a nice way to visualize exactly how this copy can be realized as “usual Euclidean space” using the picture below:

Each line (each vector with $z \neq 0$) intersects the indigo plane $z = 1$ in exactly one point, so this describes a one-to-one mapping of points in the affine slice to the Euclidean plane.
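The two rules for projective coordinates (no all-zero triple, and scaling doesn’t change the point) translate into code fairly directly. The sketch below is only illustrative: the function names `same_point` and `to_affine` are our own, points are plain 3-tuples, and `Fraction` is used so that the scaling by $1/z$ stays exact.

```python
from fractions import Fraction


def same_point(p, q):
    """True when [p] = [q], i.e. p and q are nonzero multiples of each other."""
    if not any(p) or not any(q):
        raise ValueError("(0, 0, 0) is not a projective point")
    (x1, y1, z1), (x2, y2, z2) = p, q
    # Two nonzero vectors are proportional exactly when all 2x2 minors vanish.
    return (x1 * y2 == x2 * y1 and
            x1 * z2 == x2 * z1 and
            y1 * z2 == y2 * z1)


def to_affine(p):
    """Scale [x:y:z] with z != 0 to [x/z : y/z : 1] and return (x/z, y/z)."""
    x, y, z = p
    if z == 0:
        raise ValueError("point lies on the ideal line z = 0")
    return (Fraction(x) / Fraction(z), Fraction(y) / Fraction(z))


print(same_point((1, 2, 3), (2, 4, 6)))   # True
print(same_point((1, 2, 3), (1, 2, 4)))   # False
print(to_affine((2, 4, 2)))               # (Fraction(1, 1), Fraction(2, 1))
```

Comparing cross-multiplied coordinates avoids having to pick which coordinate to divide by, so the same test works for points with $z = 0$.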

But then when $z = 0$, we get some other stuff that makes up the rest of $\mathbb{P}^2(\mathbb{R})$. Since $x, y$ can’t both also be zero, we’ll suppose $y$ is not zero, and then we can do the same normalization trick to see that all the points we get are

$$[a : 1 : 0]$$

Since again $a$ can be anything and you get distinct points for different choices of $a$, this forms a copy of the real line inside of $\mathbb{P}^2(\mathbb{R})$ but outside of the affine slice. This line is sometimes called the *ideal line* to distinguish it from the lines that lie inside the affine slice $z = 1$. Actually, the ideal line is more than just a line. It’s (gasp) a circle! Why do I say that? Well think about what happens as $a$ gets really large (or really negatively large). We have

$$[a : 1 : 0] = [1 : 1/a : 0]$$

and the right hand side approaches $[1 : 0 : 0]$, the last missing point! Another way to phrase our informal argument is to say $[1 : 0 : 0]$ is the *boundary* of the line $[a : 1 : 0]$, and (you guessed it) the circle we get here is the boundary of the affine slice $[a : b : 1]$. And we can see exactly what it means for “two parallel lines” to intersect. Take the two lines given by

$$[1 : a : 1], \quad [2 : b : 1]$$

If we think of these as being in the affine slice of $\mathbb{P}^2(\mathbb{R})$ where $z = 1$, it’s the lines given by $x = 1, x = 2$, which are obviously parallel. But where do they intersect as $a, b$ get very large (or very negatively large)? At

$$[1/a : 1 : 1/a], \quad [2/b : 1 : 1/b]$$

which both become $[0 : 1 : 0]$ in the limit. I’m being a bit imprecise here appealing to limits, but it works because projective space inherits some structure when we pass to the quotient (for the really technically inclined, it inherits a metric that comes from the sphere of radius 1). This is why we feel compelled to call it a quotient despite how confusing quotients can be, and it illustrates the power of appealing to these more abstract constructions.
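The limit argument can also be checked without limits. The sketch below leans on a standard fact about homogeneous coordinates that this post does not cover, so treat it as an aside: a line $Ax + By + Cz = 0$ can be stored as the triple $(A, B, C)$, and the cross product of two such triples is a point lying on both lines. The helper name `cross` and the variable names are our own.

```python
def cross(u, v):
    """Cross product of two 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


# A projective line is the set of [x:y:z] with A*x + B*y + C*z = 0,
# stored here as the coefficient triple (A, B, C).
line_x_equals_1 = (1, 0, -1)   # x - z = 0, which is x = 1 on the slice z = 1
line_x_equals_2 = (1, 0, -2)   # x - 2z = 0, which is x = 2 on the slice z = 1

meet = cross(line_x_equals_1, line_x_equals_2)
print(meet)  # (0, 1, 0)
```

The two vertical lines meet at $[0 : 1 : 0]$, a point with $z = 0$, so it sits on the ideal line rather than in the affine slice.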

In any case, now we have this image of projective space:

It should be pretty clear that the choice of $z = 1$ to represent the affine slice is arbitrary, and we could have used $x = 1$ or $y = 1$ to realize different “copies” of the Euclidean plane sitting inside projective space. But in any case, we can use our new understanding to turn back to elliptic curves.

## Homogeneous Equations and the Weierstrass Normal Form

Elliptic curves are officially “projective objects” in the sense that they are defined by *homogeneous* equations over projective space. That is, an elliptic curve equation is any *homogeneous* degree three equation whose discriminant is nonzero. By homogeneous I mean the powers of the variables in each term add up to three, so it has the general form

$$\sum_{i + j + k = 3} a_{ijk}\, x^i y^j z^k = 0$$

And note that now the solutions to this equation are required to be *projective* points $[x : y : z]$. As an illuminating exercise, prove that $[x : y : z]$ is a solution if and only if $[\lambda x : \lambda y : \lambda z]$ is for every nonzero $\lambda$, i.e. that our definition of “solution” and the use of homogeneous equations is well-defined.
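One way to get started on this exercise, offered as a sketch rather than a full proof: look at what scaling does to a single monomial of total degree three. Here $\lambda$ is a nonzero scalar and $F$ is our own name for the left-hand side of a homogeneous cubic.

$$(\lambda x)^i (\lambda y)^j (\lambda z)^k = \lambda^{i+j+k}\, x^i y^j z^k = \lambda^3\, x^i y^j z^k \quad \text{when } i + j + k = 3$$

Every term picks up the same factor, so $F(\lambda x, \lambda y, \lambda z) = \lambda^3 F(x, y, z)$, and because $\lambda \neq 0$ one side vanishes exactly when the other does.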

But to work with projective language forever is burdensome, something that only mathematicians are required to do. And in the case of elliptic curves we only use *one* new point from projective space (the intersection of the vertical lines). Once we get to writing programs we’ll have a special representation for points that aren’t in the affine slice, so we will be able to use regular Euclidean coordinates without much of a fuss.

That being said, we can now officially explain why we want the special form of elliptic curve $y^2 = x^3 + ax + b$, called the *Weierstrass normal form* (pronounced VY-er-shtrahss). Specifically, elliptic curves can look very weird in their natural coordinates.

So to bring some order to the chaos, we want the projective point $[0 : 1 : 0]$ to be our zero point. If we choose our axes appropriately (swapping the letters $x, y, z$), then $[0 : 1 : 0]$ lies at the intersection of all vertical lines. Now that point isn’t a solution to all homogeneous degree-three curves, but it *is* a solution to homogeneous equations that look like this (plug it in to see).

$$y^2 z = x^3 + a x z^2 + b z^3$$

Starting to look familiar? It turns out there is a theorem (that requires either heavy mathematical machinery or LOTS of algebra to prove) that says that for any homogeneous degree three equation you start with, you can always pick your “projective axes” (that is, apply suitable projective transformations) to get an equivalent equation of the form above. I mean equivalent in the sense that the transformation we applied took solutions of the original equation to solutions of this new equation and didn’t create any new ones. **The casual reader should think of all this machinery as really clever changes of variables.**

And then if we pick our classical Euclidean slice to be $z = 1$, we get back to the standard form $y^2 = x^3 + ax + b$. This is the Weierstrass normal form.
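These claims are easy to spot-check numerically. The sketch below assumes the homogeneous form $y^2 z = x^3 + a x z^2 + b z^3$, which is what you get by adding $z$’s to the standard form until every term has degree three; the function name `F` and the sample curve $y^2 = x^3 - x + 1$ are our own choices for illustration.

```python
def F(a, b, x, y, z):
    """Homogeneous Weierstrass cubic; it is zero exactly on the projective curve."""
    return y * y * z - (x ** 3 + a * x * z * z + b * z ** 3)


a, b = -1, 1                      # sample curve y^2 = x^3 - x + 1

print(F(a, b, 0, 1, 0))           # 0: the point [0:1:0] is on the curve
print(F(a, b, 3, 5, 1))           # 0: the affine point (3, 5), seen on the slice z = 1
print(F(a, b, 6, 10, 2))          # 0: the same projective point, scaled by 2
print(F(a, b, 1, 2, 0))           # -1: another ideal point, not on the curve
```

On the ideal line $z = 0$ the equation collapses to $x^3 = 0$, so $[0 : 1 : 0]$ is the only point of the curve outside the affine slice.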

## So that was a huge detour…

Back to adding points on elliptic curves. Now that we’ve defined zero one can check that addition makes sense. Zero has the needed property $P + 0 = P$ since the “third” point of intersection of the vertical line passing through $P$ is the reflection of $P$ across the $x$-axis, and reflecting that across the $x$-axis gives you $P$. For the same reason $P + (-P) = 0$. Even more, properties like $-0 = 0$ naturally fall out of our definitions for projective coordinates, since $[0 : 1 : 0] = [0 : -1 : 0]$. So projective space, rather than mathematical hocus-pocus, is the correct setting to think about algebra happening on elliptic curves.

It’s also clear that addition is commutative because lines passing through points don’t care about the order of the points. The only real issue is whether addition is *associative*. That is, whether $(P + Q) + R = P + (Q + R)$ no matter what $P, Q, R$ are. This turns out to be difficult, and it takes a lot of algebra and the use of that abstract Bezout’s theorem we mentioned earlier, so the reader will have to trust us that everything works out. (Although, Wikipedia has a slick animation outlining one such proof.)

So “adding points,” and that pesky “point at infinity” now officially makes sense!

What we’ve shown in the mathematical parlance is that the solutions to an elliptic curve form a *group* under this notion of addition. However, one thing we haven’t talked about is where these numbers come from. Recall from last time that we were interested in *integer* points on the elliptic curve, but it was clear from our example that adding two integer-valued points on an elliptic curve might not give you an integer-valued point.

However, if we require that our equation has coefficients in a *field*, and we allow our points to have coordinates in that same field, then adding two points with coordinates in the field always gives you a point with coordinates in the field. We haven’t ever formally talked about fields on this blog, but we’re all familiar with them: they have addition, multiplication, and division in the expected ways (everything except 0 has a multiplicative inverse, multiplication distributes across addition, etc.). The usual examples are the real and complex numbers, $\mathbb{R}$ and $\mathbb{C}$, and $\mathbb{Q}$, the rational numbers (fractions of integers). There are also finite fields, which are the proper setting for elliptic curve cryptography, but we’ll save those for another post.

But why must it work if we use a field? Because the operations we need to perform the point-adding algorithm only use field operations (addition, subtraction, multiplication, and division by nonzero numbers in the field). So performing those operations on numbers in a field always gives you back numbers in that field. Since all fields have 0 and 1, we also have the point at infinity $[0 : 1 : 0]$.

This gives a natural definition.

**Definition:** Let $k$ be a field and let $E$ be the equation of an elliptic curve in Weierstrass form. Define $E(k)$ to be the set of projective points on $E$ with coordinates in $k$ along with the ideal point $[0 : 1 : 0]$. As we have discussed, $E(k)$ is a group under the operation of adding points, so we call it the *elliptic curve group* for $E$ over $k$.

Now that we’ve waded through a few hundred years of mathematics, it should be no surprise that this definition feels so deep and packed full of implicit understanding.

However, there are still a few mathematical peccadilloes to worry about. One is that if the chosen field $k$ is particularly weird (as are many of those finite fields I mentioned), then we *can’t* always transform an equation with coefficients in $k$ into the Weierstrass normal form. This results in a few hiccups when you’re trying to do cryptography, but for simplicity’s sake we’ll avoid those areas and stick to nicer fields.

We have only scratched the surface of the algebraic structure of elliptic curves by showing elliptic curves have such structure at all. The next goal for a mathematician is to *classify all possible algebraic structures* for elliptic curves, and find easy ways to tell which structure a given curve has from the coefficients of its equation. Indeed, we intend to provide a post at the end of this series (after we get to the juicy programs) that describes what’s known in this area from a more group-theoretic standpoint (i.e., for someone who has heard of the classification of finitely generated abelian groups).

But before that we’re ready to jump headfirst into some code. Next time we’ll implement the algorithms for adding points on elliptic curves using Python objects.

Until then!

Thank you for continuing with this series of posts on elliptic curves! I have to admit I felt a lot less “mathematically comfortable” this time around. One of the things that bugged me was in the assertion that the Euclidean plane (affine slice z = 1) has the ideal line as boundary. It was very clear and natural that starting with the set of points z = 0 and normalizing the y coordinate gives rise to the point at infinity [1:0:0], but I don’t see how that translates to the affine slice having a boundary.

The way I see it is that if we have a point [a:b:1] and consider arbitrarily large a and b, we can choose to normalize either on the x axis or the y axis, which would yield [1:0:0] or [0:1:0] on the limit, respectively. On a related note, if we take two parallel lines [1:a:1] and [2:b:1], wouldn’t they intersect at [0:1:0] as a, b go to infinity? Wouldn’t this make [0:1:0] also a point at infinity along with [1:0:0]?

Lastly, judging by the abstract result at the end, talking about how elliptic curves can be used to create a group on any field, not just the reals, makes me wonder how this could even work on the integers. Does this imply that all the theorems mentioned work on arbitrary fields?

[1:0:0] and [0:1:0] are two distinct points on the ideal line, and for the purpose of EC’s it doesn’t matter which we use (we just have to pick the axes in a way that makes the lines we care about “vertical”). The standard choice is to pick [0:1:0] because that corresponds to the Weierstrass normal form (and if we picked a different one we’d get a different kind of normal form).

But there are many other points on the ideal line z=0. They are all of the form [a:b:0], and so if you start with [a:b:1] and choose any way to make something go to infinity, for example a goes to infinity like t and b goes to infinity like 2t, then you get $[t : 2t : 1] = [1 : 2 : 1/t]$, which tends to $[1 : 2 : 0]$. The point is that any parameterization you pick which has one or both coordinates tending to infinity gives you a point on the ideal line. Which point you get depends on the relative speed of the parameterization (for parameterization given by lines, this depends solely on the slope of the line).

To your last point, the integers are not a field. Fields require every element to have a multiplicative inverse in the same field, and the only integers with integer inverses are {1, -1}. But if you do have a field then yes, all of the theorems work (except maybe the Weierstrass normal form, which I mentioned is a problem for some fields).

Did that answer your questions? Don’t hesitate to keep asking, and thanks for reading!

“Did that answer your questions?”

You bet! That was a perfect explanation on projective geometry. I now feel as if I ‘really get it’. Also, I think I got a little ahead of myself when implying integers are a field. I was thinking about prime fields and binary fields because I remember them being mentioned in other articles. After all we’ll need something that’s actually computable for cryptography, but I guess that’s the topic of a future post. (That pesky multiplicative inverse property though.)

“This algorithm might seem dubious, but it’s backed up by solid mathematics. For example, it’s almost immediately obvious that step 1 will always work (that you can always form such a line), the only exception being when P = Q. And it’s almost a theorem that step 2 will always work (that there is always a third point of intersection), the only exception being the vertical line. If we ignore these exceptional cases, then the correctness of the algorithm is easy to prove, because we can just generalize the idea from last time.”

What if you take Q such that dy/dx = 0 at Q, and P horizontal to Q? If you do that, I don’t see how you can get a third solution…

Ah, this is another case of the “repeated roots” issue. I mention this a bit later in the post when I talk about tangent lines, but here you get that the third point of intersection is again Q. In other words, the cubic you get will look like $(x - x_1)(x - x_2)(x - x_3)$ where $x_2 = x_3$. The algorithm we describe will work in this case without modification (the algebra just works because we’re guaranteed to get three solutions counting multiplicity, and Vieta’s formula still holds).

You have a good nose for this stuff!

First of all, excellent job trying to make this content accessible to non-mathematicians. Unfortunately, there is a small conceptual error when describing projective space. Projective space RP^2 is not a quotient vector space because it does not inherit a vector space structure from R^3. You can’t add two points [a:b:c] and [d:e:f] in a well-defined way; if you define their sum to be [a+d:b+e:c+f], this is (in general) not the same as [xa:xb:xc]+[d:e:f] even though [xa:xb:xc]=[a:b:c] when x is not zero.

What we are actually doing (at least over R) is taking a topological quotient, where we view R^3 as a topological space, and then quotient out by the equivalence relation {(a,b,c)~(xa:xb:xc) | x is non-zero, (a,b,c) in R^3 not zero}. This gives RP^2 (which is this quotient, minus the equivalence class of the origin) a topological structure. In general, quotients of topological spaces can be badly behaved: a quotient of a metric space isn’t necessarily a metric space (in fact it isn’t necessarily even Hausdorff!). As it happens RP^2 is rather well-behaved, in that we can put a metric on it coming from the sphere (as the action on the sphere is freely discontinuous everywhere) but this doesn’t necessarily follow as a general fact from the construction.

Yes, I was being overly sloppy there, so sloppy as to be completely wrong🙂. The metric comes from the fact that we view RP^2 as a quotient of the sphere (mod antipodal points) and the quotient is nice enough to make RP^2 locally isometric to the sphere. I never use any vector space structure, just the appeal to limits, so the metric is all I needed.

I think of this as “natural” when you view the starting point as the sphere, but I can imagine it’s not so natural when you view the starting point as R^3.

Hello! First thank you for writing this primer, I was looking for a gentle introduction to elliptic curves and this is great. There’s a few things I wanted to get clarified about the Weierstrass normal form and, in particular, the effect of these “projective transformations” on the shape of the elliptic curve.

So from what I understand, the elliptic curve is a set of solutions [x:y:z] that satisfy a 3rd degree homogeneous polynomial in x,y,z (and where [kx:ky:kz] is the same as [x:y:z]). Now you say we want to have the point [0:1:0] in our solution set, and make it the zero point, in order to “bring order to chaos”. So what precisely does having [0:1:0] on the curve imply about its shape that makes it better/convenient? My guess would be that it has to do with the fact that the curve will, in this case, have slope that becomes vertical the larger x becomes (and following along the curve for large x makes the projective point approach [0:1:0], so the curve looks more like a vertical line, unlike the picture which looks like a 45 degree slope line). So is this correct, and if so, does the Weierstrass form have additional desirable geometric properties? It seems that other forms would also satisfy the [0:1:0] requirement, such as including terms like xy^2 or xyz which are omitted in the Weierstrass form.

Second, if I understand it, we get the usual R^2 solution sets of the elliptic curve by setting z = 1 in the homogeneous equation and restricting to the affine slice, and also we add the point [0:1:0] as a “point at infinity” corresponding to the limit as we follow along the curve for increasing values of x (and also corresponding to the intersection of all vertical lines). But aren’t we missing some additional solutions? In the homogeneous form, we could set z = 0, and we get a homogeneous 3rd degree polynomial in x and y which defines a curve [x,y,0] on the ideal line. So how do the points on this curve which are distinct from [0:1:0] fit into the R^2 representation of the elliptic curve? Though I do notice that in the Weierstrass form, the only solutions for z = 0 are indeed [0:y:0] = [0:1:0], so maybe that’s another reason to use this form?

> So what precisely does having [0:1:0] on the curve imply about its shape that makes it better/convenient?

In principle nothing changes if you fix a different point as the “zero” point, you just get different “normal forms” of equations. The nice fact is that no matter what you pick as your zero point for an elliptic curve C, there is a projective mapping that turns C with your zero point into some other curve C’ whose zero point is [0:1:0]. In other words, they’re all equivalent in principle, so we can try to pick a zero point that makes the algebra less messy. [0:1:0] is a good candidate, but you’d get similar results with [1:0:0] or [0:0:1]. Lots of zeros means clean algebra. Any other potential geometric properties that we’d really care about would be intrinsic to elliptic curves regardless of which zero point you pick (a mathematician would say: otherwise you couldn’t really call them “properties” of elliptic curves!). We also get nice classification pictures like the one on Wikipedia: https://en.wikipedia.org/wiki/Elliptic_curve I think one more thing is that curves in Weierstrass normal form are easy to check for singularities, because there’s an easy formula for the discriminant.

> In the homogeneous form, we could set z = 0, and we get a homogeneous 3rd degree polynomial in x and y which defines a curve [x,y,0] on the ideal line.

Try it for a few examples and see what you get. Thinking abstractly, the ideal line is something which is 1 dimensional, and a curve is also 1 dimensional; two 1-dimensional polynomial things are either the same or intersect in a discrete set of points. It’s tricky working with the notation because something even like the set $\{ [x : y : 0] : y = 2x \}$ is just a single point [1:2:0].

Maybe what’s confusing is this: if you give me any homogeneous degree 3 equation, then by a projective mapping I can turn it into an equivalent curve C with [0:1:0] on it (and a new coordinate system), and with that being the only point in the ideal line [x:y:0] that is on C. This is nontrivial to prove, and I glossed over it in the post. But if you want to go backwards to find out what this homogeneous form is, just start with the Weierstrass normal form and add z’s until everything is homogeneous of degree 3. If you want to read more about this, look up the Wikipedia entry on projective isomorphism and read about stuff like the cross ratio. https://en.wikipedia.org/wiki/Homography

Highly appreciated! The figures you posted for visualization were a great aid towards developing an understanding of EC in Projective Space. Thanks.
