Visualizing an Assassin Puzzle

Over at Math3ma, Tai-Danae Bradley shared the following puzzle, which she also featured in a fantastic (spoiler-free) YouTube video. If you’re seeing this for the first time, watch the video first.

Consider a square in the xy-plane, and let A (an “assassin”) and T (a “target”) be two arbitrary-but-fixed points within the square. Suppose that the square behaves like a billiard table, so that any ray (a.k.a “shot”) from the assassin will bounce off the sides of the square, with the angle of incidence equaling the angle of reflection.

Puzzle: Is it possible to block any possible shot from A to T by placing a finite number of points in the square?

This puzzle found its way to me through Tai-Danae’s video, via category theorist Emily Riehl, via a talk by the recently deceased Fields Medalist Maryam Mirzakhani, who studied the problem in more generality. I’m not familiar with her work, but knowing mathematicians it’s probably set in an arbitrary complex n-manifold.

See Tai-Danae’s post for a proof, which left such an impression on me I had to dig deeper. In this post I’ll discuss a visualization I made—now posted at the end of Tai-Danae’s article—as well as here and below (to avoid spoilers). In the visualization, mouse movement chooses the firing direction for the assassin, and the target is in green. Dragging the target with the mouse updates the position of the guards. The source code is on Github.

Outline

The visualization uses d3 library, which was made for visualizations that dynamically update with data. I use it because it can draw SVGs real nice.

The meat of the visualization is in two geometric functions.

  1. Decompose a ray into a series of line segments—its path as it bounces off the walls—stopping if it intersects any of the points in the plane.
  2. Compute the optimal position of the guards, given the boundary square and the positions of the assassin and target.

Both of these functions, along with all the geometry that supports them, is in geometry.js. The rest of the demo is defined in main.js, in which I oafishly trample over d3 best practices to arrive miraculously at a working product. Critiques welcome 🙂

As with most programming and software problems, the key to implementing these functions while maintaining your sanity is breaking it down into manageable pieces. Incrementalism is your friend.

Vectors, rays, rectangles, and ray splitting

We start at the bottom with a Vector class with helpful methods for adding, scaling, and computing norms and inner products.

function innerProduct(a, b) {
  return a.x * b.x + a.y * b.y;
}

class Vector {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }

  normalized() { ... }
  norm() { ... }
  add(vector) { ... }
  subtract(vector) { ... }
  scale(length) { ... }
  distance(vector) { ... }
  midpoint(b) { ... }
}

This allows one to compute the distance between two points, e.g., with vector.subtract(otherVector).norm().

Next we define a class for a ray, which is represented by its center (a vector) and a direction (a vector).

class Ray {
  constructor(center, direction, length=100000) {
    this.center = center;
    this.length = length;

    if (direction.x == 0 && direction.y == 0) {
      throw "Can't have zero direction";
    }
    this.direction = direction.normalized();
  }

  endpoint() {
    return this.center.add(this.direction.scale(this.length));
  }

  intersects(point) {
    let shiftedPoint = point.subtract(this.center);
    let signedLength = innerProduct(shiftedPoint, this.direction);
    let projectedVector = this.direction.scale(signedLength);
    let differenceVector = shiftedPoint.subtract(projectedVector);

    if (signedLength > 0
        && this.length > signedLength
        && differenceVector.norm() < intersectionRadius) {
      return projectedVector.add(this.center);
    } else {
      return null;
    }
  }
}

The ray must be finite for us to draw it, but the length we've chosen is so large that, as you can see in the visualization, it's effectively infinite. Feel free to scale it up even longer.

assassin-puzzle

The interesting bit is the intersection function. We want to compute whether a ray intersects a point. To do this, we use the inner product as a decision rule to compute the distance of a point from a line. If that distance is very small, we say they intersect.

In our demo points are not infinitesimal, but rather have a small radius described by intersectionRadius. For the sake of being able to see anything we set this to 3 pixels. If it’s too small the demo will look bad. The ray won’t stop when it should appear to stop, and it can appear to hit the target when it doesn’t.

Next up we have a class for a Rectangle, which is where the magic happens. The boilerplate and helper methods:

class Rectangle {
  constructor(bottomLeft, topRight) {
    this.bottomLeft = bottomLeft;
    this.topRight = topRight;
  }

  topLeft() { ... }
  center() { ... }
  width() { .. }
  height() { ... }
  contains(vector) { ... }

The function rayToPoints that splits a ray into line segments from bouncing depends on three helper functions:

  1. rayIntersection: Compute the intersection point of a ray with the rectangle.
  2. isOnVerticalWall: Determine if a point is on a vertical or horizontal wall of the rectangle, raising an error if neither.
  3. splitRay: Split a ray into a line segment and a shorter ray that’s “bounced” off the wall of the rectangle.

(2) is trivial, computing some x- and y-coordinate distances up to some error tolerance. (1) involves parameterizing the ray and checking one of four inequalities. If the bottom left of the rectangle is (x_1, y_1) and the top right is (x_2, y_2) and the ray is written as \{ (c_1 + t v_1, c_2 + t v_2) \mid t > 0 \}, then—with some elbow grease—the following four equations provide all possibilities, with some special cases for vertical or horizontal rays:

\displaystyle \begin{aligned} c_2 + t v_2 &= y_2 & \textup{ and } \hspace{2mm} & x_1 \leq c_1 + t v_1 \leq x_2 & \textup{ (intersects top)} \\ c_2 + t v_2 &= y_1 & \textup{ and } \hspace{2mm} & x_1 \leq c_1 + t v_1 \leq x_2 & \textup{ (intersects bottom)} \\ c_1 + t v_1 &= x_1 & \textup{ and } \hspace{2mm} & y_1 \leq c_2 + t v_2 \leq y_2 & \textup{ (intersects left)} \\ c_1 + t v_1 &= x_2 & \textup{ and } \hspace{2mm} & y_1 \leq c_2 + t v_2 \leq y_2 & \textup{ (intersects right)} \\ \end{aligned}

In code:

  rayIntersection(ray) {
    let c1 = ray.center.x;
    let c2 = ray.center.y;
    let v1 = ray.direction.x;
    let v2 = ray.direction.y;
    let x1 = this.bottomLeft.x;
    let y1 = this.bottomLeft.y;
    let x2 = this.topRight.x;
    let y2 = this.topRight.y;

    // ray is vertically up or down
    if (epsilon > Math.abs(v1)) {
      return new Vector(c1, (v2 > 0 ? y2 : y1));
    }

    // ray is horizontally left or right
    if (epsilon > Math.abs(v2)) {
      return new Vector((v1 > 0 ? x2 : x1), c2);
    }

    let tTop = (y2 - c2) / v2;
    let tBottom = (y1 - c2) / v2;
    let tLeft = (x1 - c1) / v1;
    let tRight = (x2 - c1) / v1;

    // Exactly one t value should be both positive and result in a point
    // within the rectangle

    let tValues = [tTop, tBottom, tLeft, tRight];
    for (let i = 0; i  epsilon && this.contains(intersection)) {
        return intersection;
      }
    } 

    throw "Unexpected error: ray never intersects rectangle!";
  }

Next, splitRay splits a ray into a single line segment and the “remaining” ray, by computing the ray’s intersection with the rectangle, and having the “remaining” ray mirror the direction of approach with a new center that lies on the wall of the rectangle. The new ray length is appropriately shorter. If we run out of ray length, we simply return a segment with a null ray.

  splitRay(ray) {
    let segment = [ray.center, this.rayIntersection(ray)];
    let segmentLength = segment[0].subtract(segment[1]).norm();
    let remainingLength = ray.length - segmentLength;

    if (remainingLength < 10) {
      return {
        segment: [ray.center, ray.endpoint()],
        ray: null
      };
    }

    let vertical = this.isOnVerticalWall(segment[1]);
    let newRayDirection = null;

    if (vertical) {
      newRayDirection = new Vector(-ray.direction.x, ray.direction.y);
    } else {
      newRayDirection = new Vector(ray.direction.x, -ray.direction.y);
    }

    let newRay = new Ray(segment[1], newRayDirection, length=remainingLength);
    return {
      segment: segment,
      ray: newRay
    };
  }

As you have probably guessed, rayToPoints simply calls splitRay over and over again until the ray hits an input “stopping point”—a guard, the target, or the assassin—or else our finite ray length has been exhausted. The output is a list of points, starting from the original ray’s center, for which adjacent pairs are interpreted as line segments to draw.

  rayToPoints(ray, stoppingPoints) {
    let points = [ray.center];
    let remainingRay = ray;

    while (remainingRay) {
      // check if the ray would hit any guards or the target
      if (stoppingPoints) {
        let hardStops = stoppingPoints.map(p => remainingRay.intersects(p))
          .filter(p => p != null);
        if (hardStops.length > 0) {
          // find first intersection and break
          let closestStop = remainingRay.closestToCenter(hardStops);
          points.push(closestStop);
          break;
        }
      }

      let rayPieces = this.splitRay(remainingRay);
      points.push(rayPieces.segment[1]);
      remainingRay = rayPieces.ray;
    } 

    return points;
  }

That’s sufficient to draw the shot emanating from the assassin. This method is called every time the mouse moves.

Optimal guards

The function to compute the optimal position of the guards takes as input the containing rectangle, the assassin, and the target, and produces as output a list of 16 points.

/*
 * Compute the 16 optimal guards to prevent the assassin from hitting the
 * target.
 */
function computeOptimalGuards(square, assassin, target) {
...
}

If you read Tai-Danae’s proof, you’ll know that this construction is to

  1. Compute mirrors of the target across the top, the right, and the top+right of the rectangle. Call this resulting thing the 4-mirrored-targets.
  2. Replicate the 4-mirrored-targets four times, by translating three of the copies left by the entire width of the 4-mirrored-targets shape, down by the entire height, and both left-and-down.
  3. Now you have 16 copies of the target, and one assassin. This gives 16 line segments from assassin-to-target-copy. Place a guard at the midpoint of each of these line segments.
  4. Finally, apply the reverse translation and reverse mirroring to return the guards to the original square.

Due to WordPress being a crappy blogging platform I need to migrate off of, the code snippets below have been magically disappearing. I’ve included links to github lines as well.

Step 1 (after adding simple helper functions on Rectangle to do the mirroring):

  // First compute the target copies in the 4 mirrors
  let target1 = target.copy();
  let target2 = square.mirrorTop(target);
  let target3 = square.mirrorRight(target);
  let target4 = square.mirrorTop(square.mirrorRight(target));
  target1.guardLabel = 1;
  target2.guardLabel = 2;
  target3.guardLabel = 3;
  target4.guardLabel = 4;

Step 2:

  // for each mirrored target, compute the four two-square-length translates
  let mirroredTargets = [target1, target2, target3, target4];
  let horizontalShift = 2 * square.width();
  let verticalShift = 2 * square.height();
  let translateLeft = new Vector(-horizontalShift, 0);
  let translateRight = new Vector(horizontalShift, 0);
  let translateUp = new Vector(0, verticalShift);
  let translateDown = new Vector(0, -verticalShift);

  let translatedTargets = [];
  for (let i = 0; i < mirroredTargets.length; i++) {
    let target = mirroredTargets[i];
    translatedTargets.push([
      target,
      target.add(translateLeft),
      target.add(translateDown),
      target.add(translateLeft).add(translateDown),
    ]);
  }

Step 3, computing the midpoints:

  // compute the midpoints between the assassin and each translate
  let translatedMidpoints = [];
  for (let i = 0; i  t.midpoint(assassin)));
  }

Step 4, returning the guards back to the original square, is harder than it seems, because the midpoint of an assassin-to-target-copy segment might not be in the same copy of the square as the target-copy being fired at. This means you have to detect which square copy the midpoint lands in, and use that to determine which operations are required to invert. This results in the final block of this massive function.

  // determine which of the four possible translates the midpoint is in
  // and reverse the translation. Since midpoints can end up in completely
  // different copies of the square, we have to check each one for all cases.
  function untranslate(point) {
    if (point.x  square.bottomLeft.y) {
      return point.add(translateRight);
    } else if (point.x >= square.bottomLeft.x && point.y <= square.bottomLeft.y) {
      return point.add(translateUp);
    } else if (point.x < square.bottomLeft.x && point.y <= square.bottomLeft.y) {
      return point.add(translateRight).add(translateUp);
    } else {
      return point;
    }
  }

  // undo the translations to get the midpoints back to the original 4-mirrored square.
  let untranslatedMidpoints = [];
  for (let i = 0; i  square.topRight.x && point.y > square.topRight.y) {
      return square.mirrorTop(square.mirrorRight(point));
    } else if (point.x > square.topRight.x && point.y <= square.topRight.y) {
      return square.mirrorRight(point);
    } else if (point.x  square.topRight.y) {
      return square.mirrorTop(point);
    } else {
      return point;
    }
  }

  return untranslatedMidpoints.map(unmirror);

And that’s all there is to it!

Improvements, if I only had the time

There are a few improvements I’d like to make to this puzzle, but haven’t made the time (I’m writing a book, after all!).

  1. Be able to drag the guards around.
  2. Create new guards from an empty set of guards, with a button to “reveal” the solution.
  3. Include a toggle that, when pressed, darkens the entire region of the square that can be hit by the assassin. For example, this would allow you to see if the target is in the only possible safe spot, or if there are multiple safe spots for a given configuration.
  4. Perhaps darken the vulnerable spots by the number of possible paths that hit it, up to some limit.
  5. The most complicated one: generalize to an arbitrary polygon (convex or not!), for which there may be no optional solution. The visualization would allow you to look for a solution using 2-4.

Pull requests are welcome if you attempt any of these improvements.

Until next time!

Guest Post: Torus-Knotted Baklava

Guest Post: Torus-Knotted Baklava at Baking And Math, a blog by my friend and colleague, Yen.

Topological Spaces — A Primer

In our last primer we looked at a number of interesting examples of metric spaces, that is, spaces in which we can compute distance in a reasonable way. Our goal for this post is to relax this assumption. That is, we want to study the geometric structure of space without the ability to define distance. That is not to say that some notion of distance necessarily exists under the surface somewhere, but rather that we include a whole new class of spaces for which no notion of distance makes sense. Indeed, even when there is a reasonable notion of a metric, we’ll still want to blur the lines as to what kinds of things we consider “the same.”

The reader might wonder how we can say anything about space if we can’t compute distances between things. Indeed, how could it even really be “space” as we know it? The short answer is: the reader shouldn’t think of a topological space as a space in the classical sense. While we will draw pictures and say some very geometric things about topological spaces, the words we use are only inspired by their classical analogues. In fact the general topological space will be a much wilder beast, with properties ranging from absolute complacency to rampant hooliganism. Even so, topological spaces can spring out of every mathematical cranny. They bring at least a loose structure to all sorts of problems, and so studying them is of vast importance.

Just before we continue, we should give a short list of how topological spaces are applied to the real world. In particular, this author is preparing a series of posts dedicated to the topological study of data. That is, we want to study the loose structure of data potentially embedded in a very high-dimensional metric space. But in studying it from a topological perspective, we aim to eliminate the dependence on specific metrics and parameters (which can be awfully constricting, and even impertinent to the overall structure of the data). In addition, topology has been used to study graphics, image analysis and 3D modelling, networks, semantics, protein folding, solving systems of polynomial equations, and loads of topics in physics.

Recalling Metric Spaces, and Open Sets

Now we turn to generalizing metric spaces. The key property which we wish to generalize is that of open sets. For a metric space, and the reader should be thinking of the real line, the Euclidean plane, or three-dimensional Euclidean space, the open sets are easy to find. One can think of them as just “things without a boundary.” On the real line these look like open intervals (a, b) and unions of open intervals. In the plane, these would be more like open balls with a fixed center. In other words, it would be the interior of a disk.

To characterize this more mathematically, we define an open ball centered at x with radius \varepsilon in the real plane to be the set

\displaystyle B(x, \varepsilon) = \left \{ y \in \mathbb{R}^2 | d(x,y) < \varepsilon \right \}

where d is the usual Euclidean metric on points in the plane. Whenever someone says open ball, the reader should picture the following:

An open ball of radius r, centered at the point x. [Wolfram Mathworld]

Now of course this doesn’t categorize all of the open sets, since we would expect the union of two of these things to also be open. In fact, it is not hard to see that even if we take an infinite (or uncountable!) union of these open balls centered at any points with any radii, we would still get something that “has no boundary.”

In addition, it appears we can also take intersections. That is, the intersection of two open balls should be open. But we have to be a bit more careful here, because we can break our intuition quite easily. In the case of the real line, I can take an intersection of open intervals which is definitely not open. For example, take the set of intervals \left \{ (1-1/n, 1+1/n) : n \in \mathbb{N} \right \}. If we look at the intersection over all of these intervals, it is not hard to see that

\displaystyle \bigcap_{n \in \mathbb{N}} (1- 1/n, 1+1/n) = \left \{ 1 \right \}

Specifically, the number 1 is in the intersection since it is contained in all of the open intervals. But any number x > 1 cannot be in the intersection because for some large enough n it must be that 1 + 1/n < x (just solve this equation for n as a real number, and then take the ceiling). The case is similar case for x < 1, so the intersection can only be the singleton set 1. This is clearly not an open interval.

So we just found that our intuition for open sets breaks down if we allow for infinite intersections, but everything else seems to work out. Furthermore, the definition of an open ball relied on nothing about Euclidean space except that it has a metric. We’re starting to smell a good definition:

Definition: Let X be a metric space with metric d. An open set in X is either:

  • A union of any collection of open balls B(x, \varepsilon) where x \in X, or
  • finite intersection of such open balls.

A set is closed if it is the complement of an open set.

In fact, this characterization of open sets is so good that we can redefine a bunch of properties of metric spaces just in terms of open sets. This is important because in a minute we will actually define a topological space by declaring which sets are open. Before we do that, let’s remain in the friendly world of metric spaces to investigate some of those redefinitions.

Neighborhoods, Sequences, and Continuous Functions

There is an essential switch in going from metric spaces to topological spaces that one must take, and it involves the concepts of neighborhoods.

Definition: Let x \in X be a point in a metric space X. A neighborhood of x is any open set U containing x. More specifically, we can distinguish between an open neighborhood and a closed neighborhood, but without qualifiers we will always mean an open neighborhood.

In particular, the concept of a neighborhood will completely replace the idea of a metric. We will say things like, “for any neighborhood…” and “there exists a neighborhood…”, which will translate in the case of metric spaces to, “for any sufficiently close point…” and “there exists a sufficiently close point…” The main point for this discussion, however, is that if open sets were defined in some other way, the definition would still apply.

Perhaps the simplest example of such a definition is that of a sequence converging. Recall the classical definition in terms of metrics:

DefinitionLet X be a metric space with metric d, and let a_n be a sequence of elements in X. We say a_n converges to a \in X if for any \varepsilon > 0, there is some sufficiently large N so that the distance d(a_n, a) < \varepsilon whenever n > N.

In other words, after the N-th point in the sequence, the values will always stay within a tiny distance of a, and we can pick that tiny distance (\varepsilon) arbitrarily close to a. So the sequence must converge to a.

This naturally gives rise to a definition in terms of open neighborhoods of a:

DefinitionLet X, a_n, a be as in the previous definition. We say that a_n converges to a if for any open neighborhood U of a, there is some sufficiently large N so that a_n \in U for all n > N.

In particular, these two definitions are equivalent. Before we give the proof, the reader should be warned that pictures will make this proof obvious (but not rigorous), so we encourage the reader to follow along with a piece of paper. Open balls are drawn as circles despite the dimension of the space, and open neighborhoods are usually just drawn as “blobs” containing a certain point.

An open neighborhood V of a point p, and an open ball around p contained in V

To see the definitions are equivalent, suppose a_n converges as in the second definition. Then given an \varepsilon, we can choose a particular choice of open neighborhood to satisfy the constraints of the first definition: just choose the open ball B(a, \varepsilon). This will translate in terms of the metric precisely to the first definition. Conversely if the first definition holds, all we need to show is that for any open neighborhood U of any point y, we can always find an open ball B(y, \varepsilon) contained entirely in U. We can apply this to pick that open ball around a, and use the first definition to show that all of the a_n will be inside that open ball (and hence inside U) forever onward.

The fact that we can always find such an open ball follows from the triangle inequality. If the open set U in question is a union of open balls, then the point y lies within some open ball B(x, r) where x \in U. The following picture should convince the reader that we can pick a ball around y contained in B(x, r)

Finding an open ball centered at y inside an open ball centered at x. (source: Wikibooks)

Specifically pick the radius \varepsilon so that d(x,y) + \varepsilon < r, any point z inside the ball centered at y is also in the ball centered at x, and we can see this by simply drawing the triangle connecting these three points, and applying the triangle inequality to show that d(x,z) < r. A similar idea works if U is a finite intersection of open balls B_i, where we just take the smallest ball around y of those we get by applying the above picture to each B_i.

The other main definition we want to convert to the language of open sets is that of a continuous function. In particular, when we study metric spaces in pure mathematics, we are interested in the behavior of continuous functions between them (more so, even, than the properties of the spaces themselves). Indeed, when we study calculus in high school and university, this is all we care about: we want to look at minima and maxima of continuous functions, we want to study derivatives (the instantaneous rate of change) of a continuous function, and we want to prove theorems that hold for all continuous functions (such as the mean value theorem).

Identically, in topology, we are interested in the behavior of continuous functions on topological spaces. In fact, we will use special kinds of continuous functions to “declare” two spaces to be identical. We will see by the end of this post how this works, but first we need a definition of a continuous function in terms of open sets. As with neighborhoods, recall the classical definition:

Definition: A function f:X \to Y of metric spaces with metrics d_X, d_Y is called continuous if for all \varepsilon > 0 there is a \delta > 0 such that whenever d_X(x, y) < \delta the distance d_Y(f(x), f(y)) < \varepsilon.

In words, whenever x,y are close in X, it follows that f(x), f(y) are close in Y.

Naturally, the corresponding definition in terms of open sets would be something along the lines of “for any open neighborhood U of x in X, there is an open neighborhood V of f(x) in Y which contains f(U).” In fact, this is an equivalent definition (which the reader may verify), but there is a much simpler version that works better.

Definition: A function f:X \to Y is called continuous if the preimage of an open set under f is again an open set. That is, whenever V \subset Y is open, then f^{-1}(V) is open in X.

The reason this is a better definition will become apparent later (in short: a general topology need not have “good” neighborhoods of a given point y). But at least we can verify these three definitions all coincide for metric spaces. These dry computations are very similar to the one we gave for convergent sequences, so we leave it to those readers with a taste for blood. We will just simply mention that, for example, all polynomial functions are continuous with respect to this definition.

Topological Spaces, a World without Distance

We are now ready to define a general topological space.

Definition: Let X be any set. A topology on X is a family of subsets T of X for which the following three properties hold:

  • The empty set and the subset X are both in T.
  • Any union of sets in T is again in T.
  • Any finite intersection of sets in T is again in T.

We call the pair (X,T)topological space, and we call any set in T and open set of X.

Definition: A set U in a topological space X is closed if its complement is open.

As we have already seen, any metric space (X,d) is a topological space, where the topology is the set of all open balls centered at all points of X. We say the topology on X is induced by the metric d. When X is either \mathbb{R}^n or \mathbb{C}^n, we call the topology induced by the Euclidean metric the Euclidean topology on X.

But these topological spaces are very well-behaved. We will work extensively with them in our applications, but there are a few classical examples that every student of the subject must know.

If we have any set X, we may define a very silly topology on X by defining every subset of X to be open. This family of subsets trivially satisfies the requirements of a topology, and it is called the discrete topology. Perhaps the only interesting question we can ask about this topology is whether it is induced by some metric d on the underlying space. The avid reader of this blog should be able to answer this question quite easily.

The natural second example after the discrete topology is called the indiscrete topology. Here we simply define the topology as T = \left \{ \emptyset, X \right \}. Again we see that this is a well-defined topology, and it’s duller than a conversation with the empty set.

As a third and slightly less trivial example, we point the reader to our proof gallery, where we define a topology on the integers and use it to prove that there are infinitely many primes.

Note that we can also define a topology on X by specifying a family of closed sets, as long as any intersection of closed sets is closed, and a finite union of closed sets is closed. This is because of the way unions, intersections, and complements interact. (\cup U_i)^{\textup{c}} = \cap U_i^{\textup{c}} and vice versa for intersections; proving this is a simple exercise in set theory.

Here is an extended (and vastly more interesting) example. Let X = \mathbb{R}^n, and define a set U \subset X to be closed if it is the set of common roots of a collection of polynomials in n variables (which in our example below will be x and y, but in general are often written x_1, \dots, x_n). The set of roots is also called the zero locus of the collection of polynomials. This topology is called the Zariski topology on X, and it is an extremely important topology in the field of algebraic geometry.

Before we verify that this is indeed a topology on X, let us see a quick example. If X = \mathbb{R}^2, the zero locus of the single polynomial y^2 - x^3 - x^2 is the curve pictured below:

A nodal cubic curve (source Wikipedia).

The red curve is thus a closed set in the Zariski topology, and its complement is an open set. If we add in another polynomial (with a few exceptions) it is not hard to see that their common set of zeroes will either be the empty set or a finite set of points. Indeed, in the Zariski topology every finite set of points is closed. The intrepid reader can try to show that any finite set can be defined using exactly two polynomials (hint: you’ll get a better idea of how to do this in a moment, and without loss of generality, you can ensure one of the two is an interpolating polynomial).

Verifying that the Zariski topology is indeed a topology, it is clear that the empty set and the entire set are closed: the constant polynomial 1 has no roots, and the zero polynomial has all points as its roots. Now, the intersection of any two closed sets is just the union of two collections \left \{ f_{\alpha} \right \} \cup \left \{ g_{\beta} \right \}. By adding more constraints, we only keep the points with are solutions to both the f_{\alpha} and the g_{\beta} (despite the union symbol, this truly corresponds to an intersection). Moreover, it is clear that we can take arbitrary unions of families of polynomials and still get a single family of polynomials, which still defines a closed set.

On the other hand, given two closed sets defined by the families of polynomials \left \{ f_{\alpha} \right \} , \left \{ g_{\beta} \right \}, we can achieve their union by looking at the closed set defined by the set of polynomial products \left \{ f_{\alpha}g_{\beta} \right \} for all possible pairs \alpha, \beta. To show this defines the union, take any point x \in \mathbb{R}^n which is in the union of the two closed sets. In other words, x is simultaneously a zero of all f_{\alpha} or all g_{\beta}. Since every polynomial in this new collection has either an f factor or a g factor, it follows that x is simultaneously a root of all of them. Conversely, let x is a simultaneous root of all of the f_{\alpha}g_{\beta}. If it weren’t a common zero of all the f_{\alpha} and it weren’t a common zero of all the g_{\beta}, then there would be some \alpha^* for which x is not a root of f_{\alpha^*} and similarly some \beta^* for which x is not a root of g_{\beta^*}. But then x could not be a root of f_{\alpha^*}g_{\beta^*}, contradicting that x is in the closed set to begin with. Thus we have verified that this actually defines the union of two closed sets. By induction, this gives us finite unions of closed sets being closed.

So the Zariski topology is in fact a valid topology on \mathbb{R}^n, and it is not hard to see that if k is any field, then there is a well-defined Zariski topology on the set k^n. In fact, studying this topology very closely yields a numerous amount of computational tools to solve problems like robot motion planning and automated theorem proving. We plan to investigate these topics in the future of this blog once we cover a little bit of ring theory, but for now the Zariski topology serves as a wonderful example of a useful topology.

Homeomorphisms

One major aspect of mathematics is how to find the correct notion of calling two things “equivalent.” In the theory of metric spaces, the strongest possible such notion is called an isometry. That is, two metric spaces X, Y with metrics d_X, d_Y are called isometric if there exists a surjective function f: X \to Y which preserves distance (i.e. d_X(x,y) = d_Y(f(x), f(y)) for all x,y \in X, and the image f(X) is all of Y). It is not hard to see that such functions are automatically both continuous and injective. The function f is called an isometry.

Now we can call two metric spaces “the same” if they are isometric. And they really are the same for all intents and purposes: the isometry f simply relabels the points of X as points of Y, and maintains the appropriate distances. Indeed, isometry is such a strong notion of equivalence that isometries of Euclidean space are completely classified.

However, because we don’t have distances in a topological space, the next best thing is a notion of equivalence based on continuity. This gives rise to the following definition.

Definition: A function f: X \to Y between topological spaces is a homeomorphism if it is continuous, invertible, and its inverse f^{-1} is also continuous. In this case we call X and Y homeomorphic, and we write X \cong Y.

In other words, we consider two topological spaces to be “the same” if one can be continuously transformed into the other in an invertible way. In still other words, a homeomorphism is a way to show that two topologies “agree” with each other. Indeed, since a topology is the only structure we have on our spaces, saying that two topologies agree is the strongest thing that can be said. (Of course, for many topological spaces we will impose other kinds of structure, but the moral still holds.)

As a first example, it is not hard to see that one can continuously transform a square into a circle (where these are considered subsets of the plane \mathbb{R}^2 with the Euclidean topology):

Transform a circle into a square by projecting from the center of the circle (source: Quora).

To see how this is done, take any point x on the circle, and draw a ray from the center of the circle through x. This line will intersect the square somewhere, and we can define f(x) to be that point of intersection. It is easy to see that a slight perturbation of the choice of x will only slightly change the image f(x), and that this mapping is invertible. This flavor of proof is standard in topology, because giving an argument in complete rigor (that is, defining an explicit homeomorphism) is extremely tedious and neither enlightening nor satisfying. And while there are a few holes in our explanation (for instance, what exactly is the topology of the square?), the argument is morally correct and conveys to the reader one aspect of what a homeomorphism can do.

On the other hand, in our first two examples of topological space, the discrete and indiscrete topologies, homeomorphisms are nonsensical. In fact, any two spaces with the discrete topology whose underlying sets have the same cardinality are homeomorphic, and the same goes for the indiscrete topology. This is simply because every function from a discrete space is continuous, and any function to an indiscrete space is continuous. In some sense, such topological spaces are considered pathological, because no topological tools can be used to glean any information about their structure.

As expected, the composition of two homeomorphisms is again a homeomorphism. From this it follows that homeomorphism is an equivalence relation, and so we can try to classify all topological spaces (or some interesting family of topological spaces) up to homeomorphism.

Of course, there are some very simple spaces that cannot be homeomorphic. For instance (again in the Euclidean topology), a circle is not homeomorphic to a line. While we will not prove this directly (that would require more tedious computations), there are good moral reasons why it is true. We will later identify a list of so-called topological invariants. These are properties of a topological space that are guaranteed to be preserved by homeomorphisms. In other words, if a space X has one of these properties and another space Y does not, then X and Y cannot be homeomorphic. A simple-minded topological invariant relevant to the question at hand is the existence of a “hole” in the space. Since the circle has a hole but the line does not, they cannot be homeomorphic. We will spend quite a lot of time developing more advanced topological invariants, but in the next primer we will list a few elementary and useful ones.

Of course there are many beautiful and fascinating topological spaces in higher dimensions. We will close this post with a description of a few of the most famous ones in dimension two (and, of course, we are ignoring what “dimension” rigorously means).

One nice space is the torus:

The torus (credit Wikipedia)

Otherwise known as the surface of a donut, a common mathematical joke is that a topologist cannot tell the difference between a donut and a coffee cup. Indeed, the two spaces are homeomorphic, so they are the same from a topologist’s point of view:

An explicit homeomorphism between a torus and a coffee cup (source Wikipedia).

This is a testament to the flexibility of homeomorphisms.

Another nice space is the Klein Bottle:

The Klein Bottle (source Wikipedia)

The Klein bottle is a fascinating object, because it does not “live” in three dimensions. Despite that it appears to intersect itself in the picture above, this is just a visualization of the Klein Bottle. It actually lives in four-dimensional space (which is impossible to visualize) and in this setting the space does not intersect itself. We say that the Klein Bottle can be embedded into \mathbb{R}^4, but not \mathbb{R}^3, and we will make this notion rigorous in the next primer. While this is not at all obvious, the torus and the Klein bottle are not homeomorphic.

The last space we will introduce is the real projective plane. This space, commonly denoted \mathbb{R}\textup{P}^2, also does not embed in three-dimensional Euclidean space. Unlike the Klein Bottle, \mathbb{R}\textup{P}^2 has no reasonable visualization, so a picture would be futile. Instead, we can think of it as a particular modification of a sphere: take a hollow sphere and “glue together” any pair of antipodal points (that is, points which are on the same line through the center of the sphere). This operation of “gluing,” although it may seem awkward, does define a perfectly good topological space (we will cover the details in the next primer). Of course, it is extremely hard to get a good idea of what it looks like, except to say that it is “kind of” like a sphere with some awkward twists in it. Again, \mathbb{R}\textup{P}^2 is not homeomorphic to either of the torus or the Klein Bottle.

This only scratches the surface of commonly seen topological spaces (the Möbius strip comes to mind, for instance). While we don’t have nearly enough space or time on this blog to detail very many of them, next time we will investigate ways to take simple topological spaces and put them together to make more complex spaces. We will rigorize the notion of “gluing” spaces together, along with other common operations. We will also spend some time developing topological invariants which allow us to “count” the number of “holes” in a space. These invariants will become the sole focus of our applications of topology to data analysis.

Until then!