NP-hard does not mean hard

When NP-hardness pops up on the internet, say because some silly blogger wants to write about video games, it’s often tempting to conclude that the problem being proved NP-hard is actually very hard!

“Scientists proved Super Mario is NP-hard? I always knew there was a reason I wasn’t very good at it!” Sorry, these two are unrelated. NP-hardness means hard in a narrow sense this post should hopefully make clear. After that, we’ll explore what “hard” means in a mathematical sense that you can apply beyond NP-hardness to inform your work as a programmer.

When a problem is NP-hard, that simply means the problem is expressive enough to encode logic, by which I mean boolean formulas built from AND, OR, and NOT. In the Super Mario example, the “problem” is a bundle of (1) the controls for the player, (2) the allowed tiles and characters that make up a level, and (3) the goal of getting from the start to the end. Logical formulas are encoded in the design of a level, and solving the problem (completing the level) is the same as finding an assignment of the variables that makes the formula true.

[Figure: the clause gadget for the original Super Mario Brothers, encoding an OR of three variables.]

In this sense, NP-hardness doesn’t make all of Super Mario hard. The levels designed to encode logical formulas are contrived, convoluted, and contorted. They abuse the rules of the game in order to cram boolean logic into it. These are worst case levels. It’s using Mario for a completely unintended purpose, not unlike hacking. And so NP-hardness is a worst case claim.

To reiterate, NP-hardness means that Super Mario has expressive power. So expressive that it can emulate other problems we believe are hard in the worst case. And, because the goal of mathematical “hardness” is to reason about the limitations of algorithms, being able to solve Super Mario in full generality implies you can solve any hard subproblem, no matter how ridiculous the level design.

The P != NP conjecture says that there’s no polynomial time algorithm to determine whether boolean logic formulas are satisfiable, and so as a consequence Super Mario (in full generality) also has no polynomial time algorithm.

That being said, in reality Super Mario levels do not encode logical formulas! If you use the knowledge that real-world Super Mario levels are designed in the way they are (to be solvable, fun), then you can solve Super Mario with algorithms. There are many examples.

In general, the difficulty of a problem for humans is unrelated to the difficulty for algorithms. Consider multiplication of integers. This is a trivial problem for computers to solve, but humans tend to struggle with it. It’s an amazing feat to be able to multiply two 7-digit numbers in less than 5 seconds, whereas computers can multiply two thousand-digit numbers in milliseconds.
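
For a rough sense of scale, here’s a quick, unscientific timing check in Python (exact numbers will vary by machine):

import random
import time

a = random.getrandbits(3400)  # roughly a thousand decimal digits
b = random.getrandbits(3400)

start = time.time()
product = a * b
print(time.time() - start)  # typically far less than a millisecond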

Meanwhile, protein folding is known to be an NP-hard problem, but it’s been turned into a game sufficiently easy for humans to solve that players have contributed to scientific research. Indeed, even some of the most typically cited NP-hard problems, like traveling salesman, have heuristic, practical algorithmic solutions that allow one to solve them (very close to optimally) in hours on inputs as large as every city on earth.

So the mathematical notions of hardness are quite disconnected from practical notions of hardness. This is not even to mention that some NP-hard problems can be efficiently approximated to within any desired accuracy.

Let’s dig into the math a bit more. “Hardness” is a family of ideas about comparisons between problems based on reusability of algorithmic solutions. Loosely speaking, a problem $ R$ is hard with respect to a class of problems $ C$ if an algorithm solving $ R$ can be easily transformed into an algorithm solving any problem in $ C$. You have to say what kinds of transformations are allowed, and the transformation can be different for different target problems in $ C$, but that’s the basic idea.

In the Super Mario example, if you want to solve logical formulas, you can transform a hypothetically perfect mario-level-playing algorithm into a logic solver by encoding the formula as a level and running the mario-level-playing algorithm on it as a black box. Add an if statement at the end to translate “level can/can’t be finished” into “formula can/can’t be satisfied,” and the transformation is complete. It’s important for NP-hardness that the transformation only takes polynomial time. Other kinds of hardness might allow the transformation more resources, or restrict it to fewer.
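
In code, such a reduction is just a thin wrapper. Here’s a sketch in Python, where encodeFormulaAsLevel and marioLevelSolver are hypothetical stand-ins for the gadget construction and the perfect level-playing algorithm (neither exists in any library, of course):

def satSolverFromMario(formula):
    # Hypothetical: build a level whose completion encodes satisfying the formula.
    level = encodeFormulaAsLevel(formula)

    # Hypothetical black box deciding whether the level can be finished.
    levelCanBeFinished = marioLevelSolver(level)

    # Translate back: the level can be finished iff the formula is satisfiable.
    return levelCanBeFinished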

This is what makes Mario NP-hard: boolean satisfiability is NP-hard, so any problem in NP can be solved by a boolean satisfiability solver, and hence also by a mario-level-player. The fact that boolean satisfiability is NP-hard is a difficult theorem to prove (the Cook-Levin theorem). But if we assume it’s true, you can compose the transformations to get from any NP problem to Super Mario.

As a simple example of a different kind of hardness, you can let $ C$ be the class of problems solvable using only a finite amount of memory (independent of the input). You have probably heard of this class of problems by another name, but I’ll keep you guessing until the end of the post. A $ C$-hard problem $ R$ is one for which an algorithmic solution can be repurposed to solve any finite-memory-solvable problem.

We have to be careful: if the transformation between solutions allows us polynomial time (in the size of the input) like it did for NP-hardness, then we might have enough time in the transformation alone to solve the entire problem, removing the need for a solution to $ R$ in the first place! For this reason, we have to limit the amount of work that can be done in the transformation. We get a choice here that influences how interesting or useful the definition of hardness is, but let’s just pick one and say that the transformation can only use finite time (independent of the input).

To be fair, I actually don’t know if there are any hard problems with respect to this definition. There probably are, but chances are good that they are not members of $ C$, and that’s where the definition of hardness gets really interesting. If you have a problem in $ C$ which is also $ C$-hard, it’s called complete for $ C$. And once you’ve found a complete problem, from a theoretical perspective you’re a winner. You’ve found a problem which epitomizes the difficulty of solving problems in $ C$. And so it’s a central aim of researchers studying a complexity class to find complete problems. As they say in the business, “ABC: always be completing.”

As a more concrete and interesting example, the class $ P$ of all polynomial-time solvable problems has a complete problem. Here the transformations are a bit up in the air. They could either be logarithmic-space computations, or what’s called NC, which can be thought of as poly-logarithmic time (very fast) parallel computations. I only mention NC because it allows you to say “P-complete problems are hard to parallelize.”

Regardless of the choice, there are a number of very useful problems known to be P-complete. The first is the Circuit Value Problem: given a circuit (described by its gates and wires using any reasonable encoding) and an input to the circuit, what is the output?
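
To make the problem statement concrete, here’s a minimal evaluator sketch in Python, assuming the circuit is encoded as a topologically ordered list of gates (just one of many “reasonable” encodings):

def evaluateCircuit(inputs, gates):
    '''
        inputs: a list of booleans; input wire i has value inputs[i].
        gates: a list of (op, in1, in2) triples in topological order, where op
        is 'AND', 'OR', or 'NOT', and in1/in2 index earlier wires (in2 is
        ignored for 'NOT'). Each gate's output becomes the next wire.
        Returns the value on the final wire.
    '''
    wires = list(inputs)
    for (op, in1, in2) in gates:
        if op == 'AND':
            wires.append(wires[in1] and wires[in2])
        elif op == 'OR':
            wires.append(wires[in1] or wires[in2])
        elif op == 'NOT':
            wires.append(not wires[in1])
    return wires[-1]

# Example: (x0 AND x1) OR (NOT x2)
gates = [('AND', 0, 1), ('NOT', 2, None), ('OR', 3, 4)]
print(evaluateCircuit([True, False, False], gates))  # True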

Others include linear programming (optimize this linear function with respect to linear constraints), data compression (does the compressed version of a string $ s$ using Lempel–Ziv–Welch contain a string $ t$?), and type inference for partial types. There are many more in this compendium of Greenlaw et al. Each one is expressive enough to encode any instance of the others, and any instance of any problem in P. It’s quite curious to think that gzip can solve linear programs, but that’s surely no curiouser than Super Mario levels encoding boolean logic.

Just as with NP-hardness, when a problem is P-hard that doesn’t automatically mean it’s easy or hard for humans, or that typical instances can’t be easily parallelized. P-hardness is also a worst case guarantee.

Studying P-completeness is helpful in the same way NP-completeness is helpful. Completeness informs you about whether you should hope to find a perfect solution or be content with approximations and heuristics (or incorporate problem context to make it easier). Knowing a problem is P-complete means you should not expect perfect efficient parallel algorithms, or perfect efficient algorithms that use severely limited space. Knowing a problem is NP-hard means you should not expect a perfect polynomial time solution. In other words, if you are forced to work with those restrictions, the game becomes one of tradeoffs. Hardness and completeness focus and expedite your work, and clarify a principled decision making process.

Until next time!

P.S. The class of problems solvable in a finite amount of memory is just the class of regular languages. The “finite memory” is the finite state machine used to solve them.

Boolean Logic in Polynomials

Problem: Express a boolean logic formula using polynomials. I.e., if an input variable $ x$ is set to $ 0$, that is interpreted as false, while $ x=1$ is interpreted as true. The output of the polynomial should be 1 or 0 according to whether the formula is true or false as a whole.

Solution: You can do this using a single polynomial.

Illustrating with an example: the formula is $ \neg[(a \vee b) \wedge (\neg c \vee d)]$, also known as

not((a or b) and (not c or d))

The trick is to use multiplication for “and” and $ 1-x$ for “not.” So $ a \wedge b$ would be $ ab$, and $ \neg z$ would be $ 1-z$. Indeed, if you have two binary variables $ x$ and $ y$ then $ xy$ is 1 precisely when both are 1, and zero when either variable is zero. Likewise, $ 1-x = 1$ if $ x$ is zero and zero if $ x$ is one.

Combine this with De Morgan’s law to get any formula. $ a \vee b = \neg(\neg a \wedge \neg b)$ translates to $ 1 - (1-a)(1-b)$. For our example above,

$ \displaystyle f(a, b, c, d) = 1 - (1 - (1-a)(1-b))(1 - c(1-d))$

Which expands to

$ \displaystyle 1 - a - b + ab + (1-d)(ac + bc - abc)$

If you plug in $ a = 1, b = 0, c = 1, d = 0$ you get True in the original formula (because “not c or d” is False), and likewise the polynomial is

$ \displaystyle 1 - 1 - 0 + 0 + (1-0)(1 + 0 - 0) = 1$

You can verify the rest yourself, using the following table (each row lists $ a, b, c, d$ and the resulting value) as a guide:

0, 0, 0, 0 -> 1
0, 0, 0, 1 -> 1
0, 0, 1, 0 -> 1
0, 0, 1, 1 -> 1
0, 1, 0, 0 -> 0
0, 1, 0, 1 -> 0
0, 1, 1, 0 -> 1
0, 1, 1, 1 -> 0
1, 0, 0, 0 -> 0
1, 0, 0, 1 -> 0
1, 0, 1, 0 -> 1
1, 0, 1, 1 -> 0
1, 1, 0, 0 -> 0
1, 1, 0, 1 -> 0
1, 1, 1, 0 -> 1
1, 1, 1, 1 -> 0
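
If you’d rather not check all sixteen rows by hand, here’s a short verification snippet (not from the original post) that compares the formula and the polynomial on every assignment:

from itertools import product

def formula(a, b, c, d):
    return not ((a or b) and (not c or d))

def polynomial(a, b, c, d):
    return 1 - a - b + a*b + (1 - d)*(a*c + b*c - a*b*c)

for a, b, c, d in product([0, 1], repeat=4):
    assert polynomial(a, b, c, d) == int(formula(a, b, c, d))
print("all 16 assignments agree")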

Discussion: This trick is used all over CS theory to embed boolean logic within polynomials, and it makes the name “boolean algebra” obvious, because it’s just a subset of normal algebra.

Moreover, since boolean satisfiability—the problem of algorithmically determining if a boolean formula has a satisfying assignment (a choice of variables evaluating to true)—is NP-hard, this can be used to show that certain problems relating to multivariable polynomials are also hard. For example, finding roots of multivariable polynomials (even if you knew nothing about algebraic geometry) is hard, because you’d run into NP-hardness by simply considering the subset of polynomials coming from boolean formulas.

Here’s a more interesting example, related to the kinds of optimization problems that show up in modern machine learning. Say you want to optimize a polynomial $ f(x)$ subject to a set of quadratic equality constraints. This is NP-hard. Here’s why.

Let $ \varphi$ be a boolean formula, and $ f_\varphi$ its corresponding polynomial. First, each variable $ x_i$ used in the polynomial can be restricted to binary values via the quadratic constraint $ x_i(x_i - 1) = 0$. Then maximizing $ f_\varphi$ subject to those constraints yields an optimum of 1 if and only if $ \varphi$ is satisfiable, so an efficient optimizer would give an efficient satisfiability algorithm.

You can even show NP-hardness if the target function to optimize is only quadratic. As an exercise, one can express the subset sum problem as a quadratic programming problem using similar choices for the constraints. According to this writeup you can even express subset sum as a quadratic program with linear constraints.
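
For concreteness, here is one standard encoding along those lines (a sketch of my own; the linked writeup may use a different formulation). Given positive integers $ a_1, \dots, a_n$ and a target $ T$, minimize

$ \displaystyle \left( \sum_{i=1}^n a_i x_i - T \right)^2$

subject to the quadratic constraints $ x_i(x_i - 1) = 0$ for each $ i$. The constraints force each $ x_i$ to be 0 or 1, so the minimum value is 0 if and only if some subset of the $ a_i$ sums to exactly $ T$.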

The moral of the story is simply that multivariable polynomials can encode arbitrary boolean logic.

The Blum-Blum-Shub Pseudorandom Generator

Problem: Design a random number generator that is computationally indistinguishable from a truly random number generator.

Solution (in Python): Note that this solution uses the Miller-Rabin primality tester, though any primality test will do. See the GitHub repository for the referenced implementation.

from randomized.primality import probablyPrime
import random

def goodPrime(p):
    # A suitable prime for Blum-Blum-Shub is a (probable) prime congruent to 3 mod 4.
    return p % 4 == 3 and probablyPrime(p, accuracy=100)

def findGoodPrime(numBits=512):
    candidate = 1
    while not goodPrime(candidate):
        candidate = random.getrandbits(numBits)
    return candidate

def makeModulus():
    # The modulus is the product of two large primes, each congruent to 3 mod 4.
    return findGoodPrime() * findGoodPrime()

def parity(n):
    # The output bit for each round: the parity of the bits of the state.
    return sum(int(x) for x in bin(n)[2:]) % 2

class BlumBlumShub(object):
    def __init__(self, seed=None):
        self.modulus = makeModulus()
        self.state = seed if seed is not None else random.randint(2, self.modulus - 1)
        self.state = self.state % self.modulus

    def seed(self, seed):
        self.state = seed

    def bitstream(self):
        while True:
            yield parity(self.state)
            # Advance the state by squaring modulo N.
            self.state = pow(self.state, 2, self.modulus)

    def bits(self, n=20):
        outputBits = ''
        for bit in self.bitstream():
            outputBits += str(bit)
            if len(outputBits) == n:
                break

        return outputBits

Discussion:

An integer $ x$ is called a quadratic residue of another integer $ N$ if it can be written as $ x = a^2 \mod N$ for some $ a$. That is, if it’s the remainder when dividing a perfect square by $ N$. Some numbers, like $ N=8$, have very special patterns in their quadratic residues: only 0, 1, and 4 can occur as quadratic residues mod 8.
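
You can check that claim in a single line of Python:

print(sorted({a * a % 8 for a in range(8)}))  # [0, 1, 4]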

The core idea behind this random number generator is that, for a specially chosen modulus $ N$, telling whether a number $ x$ is a quadratic residue mod $ N$ is hard. In fact, one can directly convert an algorithm that can predict the next bit of this random number generator (by even a slight edge) into an arbitrarily accurate quadratic-residue-decider. So if computing quadratic residues is even mildly hard, then predicting the next bit in this random number generator is very hard.

More specifically, the conjectured guarantee about this random number generator is the following: if you present a polynomial time adversary with two sequences:

  1. A truly random sequence of bits of length $ k$,
  2. $ k$ bits from the output of the pseudorandom generator when seeded with a starting state shorter than $ k$ bits.

Then the adversary can’t distinguish between the two sequences with probability “significantly” more than 1/2, where by “significantly” I mean $ 1/k^c$ for any $ c>0$ (i.e., the edge over randomness vanishes faster than any inverse polynomial). It turns out, due to a theorem of Yao, that this is equivalent to not being able to guess the next bit in a pseudorandom sequence with a significant edge over a random guess, even when given the previous $ \log(N)^{10}$ bits in the sequence (or any $ \textup{poly}(\log N)$ bits in the sequence).

This emphasizes a deep philosophical viewpoint in theoretical computer science, that whether some object has a property (randomness) really only depends on the power of a computationally limited observer to identify that property. If nobody can tell the difference between fake randomness and real randomness, then the fake randomness is random. Offhand I wonder whether you can meaningfully apply this view to less mathematical concepts like happiness and status.

Anyway, the modulus $ N$ is chosen in such a way that every quadratic residue of $ N$ has a unique square root which is also a quadratic residue. This makes the squaring function a bijection on quadratic residues. In other words, with a suitably chosen $ N$, there’s no chance that we’ll end up with $ N=8$ where there are very few quadratic residues and the numbers output by the Blum-Blum-Shub generator have a short cycle. Moreover, the assumption that detecting quadratic residues mod $ N$ is hard makes the squaring function a one-way permutation.
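
You can see this bijection directly for a tiny modulus of the right shape, say $ N = 21 = 3 \cdot 7$, a product of two primes that are each 3 mod 4 (far too small to be secure, but enough to illustrate; this example is mine, not from the original post):

from math import gcd

N = 3 * 7  # both factors are 3 mod 4, like the output of makeModulus, but tiny
units = [a for a in range(1, N) if gcd(a, N) == 1]
residues = sorted({a * a % N for a in units})

print(residues)                               # [1, 4, 16]
print(sorted({x * x % N for x in residues}))  # [1, 4, 16]: squaring permutes them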

Here’s an example of how this generator might be used:

generator = BlumBlumShub()

hist = [0] * 2**6
for i in range(10000):
    value = int(generator.bits(6), 2)
    hist[value] += 1

print(hist)

This produces random integers between 0 and 63, with the following histogram:

[Figure: histogram of the 10,000 generated 6-bit values.]

See these notes of Junod for a detailed exposition of the number theory behind this random number generator, with full definitions and proofs.

Zero Knowledge Proofs — A Primer

In this post we’ll get a strong taste for zero knowledge proofs by exploring the graph isomorphism problem in detail. In the next post, we’ll see how this relates to cryptography and the bigger picture. The goal of this post is to get a strong understanding of the terms “prover,” “verifier,” “simulator,” and “zero knowledge” in the context of a specific zero-knowledge proof. Then next time we’ll see how the same concepts (though not the same proof) generalize to a cryptographically interesting setting.

Graph isomorphism

Let’s start with an extended example. We are given two graphs $ G_1, G_2$, and we’d like to know whether they’re isomorphic, meaning they’re the same graph, but “drawn” different ways.

The problem of telling if two graphs are isomorphic seems hard. The pictures above, which are all different drawings of the same graph (or are they?), should give you pause if you thought it was easy.

To add a tiny bit of formalism, a graph $ G$ is a list of edges, and each edge $ (u,v)$ is a pair of integers between 1 and the total number of vertices of the graph, say $ n$. Using this representation, an isomorphism between $ G_1$ and $ G_2$ is a permutation $ \pi$ of the numbers $ \{1, 2, \dots, n \}$ with the property that $ (i,j)$ is an edge in $ G_1$ if and only if $ (\pi(i), \pi(j))$ is an edge of $ G_2$. You swap around the labels on the vertices, and that’s how you get from one graph to another isomorphic one.

Given two arbitrary graphs as input on a large number of vertices $ n$, nobody knows of an efficient—i.e., polynomial time in $ n$—algorithm that can always decide whether the input graphs are isomorphic. Even if you promise me that the inputs are isomorphic, nobody knows of an algorithm that could construct an isomorphism. (If you think about it, such an algorithm could be used to solve the decision problem!)

A game

Now let’s play a game. In this game, we’re given two enormous graphs on a billion nodes. I claim they’re isomorphic, and I want to prove it to you. However, my life’s fortune is locked behind these particular graphs (somehow), and if you actually had an isomorphism between these two graphs you could use it to steal all my money. But I still want to convince you that I do, in fact, own all of this money, because we’re about to start a business and you need to know I’m not broke.

Is there a way for me to convince you beyond a reasonable doubt that these two graphs are indeed isomorphic? And moreover, could I do so without you gaining access to my secret isomorphism? It would be even better if I could guarantee you learn nothing about my isomorphism or any isomorphism, because even the slightest chance that you can steal my money is out of the question.

Zero knowledge proofs have exactly those properties, and here’s a zero knowledge proof for graph isomorphism. For the record, $ G_1$ and $ G_2$ are public knowledge (common inputs to our protocol for the sake of tracking runtime), and the protocol itself is common knowledge. However, I have an isomorphism $ f: G_1 \to G_2$ that you don’t know.

Step 1: I will start by picking one of my two graphs, say $ G_1$, mixing up the vertices, and sending you the resulting graph. In other words, I send you a graph $ H$ which is chosen uniformly at random from all isomorphic copies of $ G_1$. I will save the permutation $ \pi$ that I used to generate $ H$ for later use.

Step 2: You receive a graph $ H$ which you save for later, and then you randomly pick an integer $ t$ which is either 1 or 2, with equal probability on each. The number $ t$ corresponds to your challenge for me to prove $ H$ is isomorphic to $ G_1$ or $ G_2$. You send me back $ t$, with the expectation that I will provide you with an isomorphism between $ H$ and $ G_t$.

Step 3: Indeed, I faithfully provide you such an isomorphism. If you send me $ t=1$, I’ll give you back $ \pi^{-1} : H \to G_1$, and otherwise I’ll give you back $ f \circ \pi^{-1}: H \to G_2$. Because composing a fixed permutation with a uniformly random permutation is again a uniformly random permutation, in either case I’m sending you a uniformly random permutation.

Step 4: You receive a permutation $ g$, and you can use it to verify that $ H$ is isomorphic to $ G_t$. If the permutation I sent you doesn’t work, you’ll reject my claim, and if it does, you’ll accept my claim.

Before we analyze, here’s some Python code that implements the above scheme. You can find the full, working example in a repository on this blog’s Github page.

First, a few helper functions for generating random permutations (and turning their list-of-zero-based-indices form into a function-of-positive-integers form), along with a small numVertices helper used by the classes below:

import random

def randomPermutation(n):
    L = list(range(n))
    random.shuffle(L)
    return L

def makePermutationFunction(L):
    return lambda i: L[i - 1] + 1

def makeInversePermutationFunction(L):
    return lambda i: 1 + L.index(i - 1)

def applyIsomorphism(G, f):
    return [(f(i), f(j)) for (i, j) in G]

def numVertices(G):
    # Assumes vertices are labeled 1..n and every vertex appears in some edge.
    return max(max(u, v) for (u, v) in G)

Here’s a class for the Prover, the one who knows the isomorphism and wants to prove it while keeping the isomorphism secret:

class Prover(object):
    def __init__(self, G1, G2, isomorphism):
        '''
            isomorphism is a list of integers representing
            an isomorphism from G1 to G2.
        '''
        self.G1 = G1
        self.G2 = G2
        self.n = numVertices(G1)
        assert self.n == numVertices(G2)

        self.isomorphism = isomorphism
        self.state = None

    def sendIsomorphicCopy(self):
        isomorphism = randomPermutation(self.n)
        pi = makePermutationFunction(isomorphism)

        H = applyIsomorphism(self.G1, pi)

        self.state = isomorphism
        return H

    def proveIsomorphicTo(self, graphChoice):
        randomIsomorphism = self.state
        piInverse = makeInversePermutationFunction(randomIsomorphism)

        if graphChoice == 1:
            return piInverse
        else:
            f = makePermutationFunction(self.isomorphism)
            return lambda i: f(piInverse(i))

The prover has two methods, one for each round of the protocol. The first creates an isomorphic copy of $ G_1$, and the second receives the challenge and produces the requested isomorphism.

And here’s the corresponding class for the verifier:

class Verifier(object):
    def __init__(self, G1, G2):
        self.G1 = G1
        self.G2 = G2
        self.n = numVertices(G1)
        assert self.n == numVertices(G2)

    def chooseGraph(self, H):
        choice = random.choice([1, 2])
        self.state = H, choice
        return choice

    def accepts(self, isomorphism):
        '''
            Return True if and only if the given isomorphism
            is a valid isomorphism between the randomly
            chosen graph in the first step, and the H presented
            by the Prover.
        '''
        H, choice = self.state
        graphToCheck = [self.G1, self.G2][choice - 1]
        f = isomorphism

        # Note: this compares edge lists directly, so it assumes corresponding
        # edges appear in the same order and orientation (a set comparison
        # would be more forgiving).
        isValidIsomorphism = (graphToCheck == applyIsomorphism(H, f))
        return isValidIsomorphism

Then the protocol is as follows:

def runProtocol(G1, G2, isomorphism):
    p = Prover(G1, G2, isomorphism)
    v = Verifier(G1, G2)

    H = p.sendIsomorphicCopy()
    choice = v.chooseGraph(H)
    witnessIsomorphism = p.proveIsomorphicTo(choice)

    return v.accepts(witnessIsomorphism)
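
Here’s a toy run of the protocol (my own example, not from the original post): a 4-cycle and a relabeled copy of it.

G1 = [(1, 2), (2, 3), (3, 4), (4, 1)]   # a 4-cycle
isomorphism = [1, 0, 3, 2]              # swap labels 1 <-> 2 and 3 <-> 4
f = makePermutationFunction(isomorphism)
G2 = applyIsomorphism(G1, f)

print(runProtocol(G1, G2, isomorphism))  # True: the honest prover convinces the verifier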

Analysis: Let’s suppose for a moment that everyone is honestly following the rules, and that $ G_1, G_2$ are truly isomorphic. Then you’ll always accept my claim, because I can always provide you with an isomorphism. Now let’s suppose I’m actually lying: the two graphs aren’t isomorphic, and I’m trying to fool you into thinking they are. What’s the probability that you’ll rightfully reject my claim?

Well, regardless of what I do, I’m sending you a graph $ H$ and you get to make a random choice of $ t = 1, 2$ that I can’t control. If $ H$ is actually isomorphic to one of $ G_1, G_2$ but not the other, then so long as you make your choice uniformly at random, half of the time I won’t be able to produce a valid isomorphism and you’ll reject. And unless you can actually tell which of the two graphs $ H$ is isomorphic to (nobody knows how to do that efficiently, but let’s say you can’t), then a rejection probability of 1/2 is the best you can do.

Maybe the probability 1/2 is a bit unsatisfying, but remember that we can amplify it by repeating the protocol over and over again: each round is independent, so after $ k$ rounds a cheating prover survives with probability at most $ 1/2^k$. So if you want to be sure I didn’t cheat and get lucky to within a probability of one-in-one-trillion, you only need to repeat the protocol 40 times. To be surer than the chance of picking a specific atom at random from all atoms in the universe, only about 270 times.
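
The round counts are just logarithms, e.g.:

import math

print(math.ceil(-math.log2(1e-12)))   # 40 rounds for a one-in-a-trillion error
print(math.ceil(80 * math.log2(10)))  # 266 rounds to beat one chance in 10^80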

If you want to feel small, think of the number of atoms in the universe. If you want to feel big, think of its logarithm.

Here’s the code that repeats the protocol for assurance.

def convinceBeyondDoubt(G1, G2, isomorphism, errorTolerance=1e-20):
    probabilityFooled = 1

    while probabilityFooled > errorTolerance:
        result = runProtocol(G1, G2, isomorphism)
        assert result
        probabilityFooled *= 0.5
        print(probabilityFooled)

Running it, we see that it succeeds:

$ python graph-isomorphism.py
0.5
0.25
0.125
0.0625
0.03125
 ...
<SNIP>
 ...
1.3552527156068805e-20
6.776263578034403e-21

So it’s clear that this protocol is convincing.

But how can we be sure that there’s no leakage of knowledge in the protocol? What does “leakage” even mean? That’s where this topic is the most difficult to nail down rigorously, in part because there are at least three a priori different definitions! The idea we want to capture is that anything that you can efficiently compute after the protocol finishes (i.e., you have the content of the messages sent to you by the prover) you could have computed efficiently given only the two graphs $ G_1, G_2$, and the claim that they are isomorphic.

Another way to say it is that you may go through the verification process and feel happy and confident that the two graphs are isomorphic. But because it’s a zero-knowledge proof, you can’t do anything with that information more than you could have done if you just took the assertion on blind faith. I’m confident there’s a joke about religion lurking here somewhere, but I’ll just trust it’s funny and move on.

In the next post we’ll expand on this “leakage” notion, but before we get there it should be clear that the graph isomorphism protocol will have the strongest possible “no-leakage” property we can come up with. Indeed, in the first round the prover sends a uniform random isomorphic copy of $ G_1$ to the verifier, but the verifier can compute such an isomorphism already without the help of the prover. The verifier can’t necessarily find the isomorphism that the prover used in retrospect, because the verifier can’t solve graph isomorphism. Instead, the point is that the probability space of “$ G_1$ paired with an $ H$ made by the prover” and the probability space of “$ G_1$ paired with $ H$ as made by the verifier” are equal. No information was leaked by the prover.

For the second round, again the permutation $ \pi$ used by the prover to generate $ H$ is uniformly random. Since composing a fixed permutation with a uniform random permutation also results in a uniform random permutation, the second message sent by the prover is uniformly random, and so again the verifier could have constructed a similarly random permutation alone.

Let’s make this explicit with a small program. We have the honest protocol from before, but now I’m returning the list of messages produced during the protocol, which the verifier can use for additional computation.

def messagesFromProtocol(G1, G2, isomorphism):
    p = Prover(G1, G2, isomorphism)
    v = Verifier(G1, G2)

    H = p.sendIsomorphicCopy()
    choice = v.chooseGraph(H)
    witnessIsomorphism = p.proveIsomorphicTo(choice)

    return [H, choice, witnessIsomorphism]

To say that the protocol is zero-knowledge (again, this is still colloquial) is to say that anything that the verifier could compute, given as input the return value of this function along with $ G_1, G_2$ and the claim that they’re isomorphic, the verifier could also compute given only $ G_1, G_2$ and the claim that $ G_1, G_2$ are isomorphic.

It’s easy to prove this, and we’ll do so with a Python function called simulateProtocol.

def simulateProtocol(G1, G2):
    # Construct data drawn from the same distribution as what is
    # returned by messagesFromProtocol
    choice = random.choice([1, 2])
    G = [G1, G2][choice - 1]
    n = numVertices(G)

    isomorphism = randomPermutation(n)
    pi = makePermutationFunction(isomorphism)
    H = applyIsomorphism(G, pi)

    # Return the map from H back to the chosen graph (the inverse of pi), so
    # the witness points in the same direction as in the real protocol.
    return H, choice, makeInversePermutationFunction(isomorphism)

The claim is that the distribution of outputs to messagesFromProtocol and simulateProtocol are equal. But simulateProtocol will work regardless of whether $ G_1, G_2$ are isomorphic. Of course, it’s not convincing to the verifier because the simulating function made the choices in the wrong order, choosing the graph index before making $ H$. But the distribution that results is the same either way.

So if you were to use the actual Prover/Verifier protocol outputs as input to another algorithm (say, one which tries to compute an isomorphism of $ G_1 \to G_2$), you might as well use the output of your simulator instead. You’d have no information beyond hard-coding the assumption that $ G_1, G_2$ are isomorphic into your program. Which, as I mentioned earlier, is no help at all.

In this post we covered one detailed example of a zero-knowledge proof. Next time we’ll broaden our view and see the more general power of zero-knowledge (that it captures all of NP), and see some specific cryptographic applications. Keep in mind the preceding discussion, because we’re going to re-use the terms “prover,” “verifier,” and “simulator” to mean roughly the same things as the classes Prover, Verifier and the function simulateProtocol.

Until then!