Searching for RH Counterexamples — Search Strategies

We’re glibly searching for counterexamples to the Riemann Hypothesis, to trick you into learning about software engineering principles. In the first two articles we configured a testing framework and showed how to hide implementation choices behind an interface. Next, we’ll improve the algorithm’s core routine. As before, I’ll link to specific git commits in the final code repository to show how the project evolves.

Superabundant numbers

A superabundant number n is one which has “maximal relative divisor sums” in the following sense: for all m < n,

\displaystyle \frac{\sigma(m)}{m} < \frac{\sigma(n)}{n}

where \sigma(n) is the sum of the divisors of n.

Erdős and Alaoglu proved in 1944 (“On highly composite and similar numbers“) that superabundant numbers have a specific prime decomposition, in which all initial primes occur with non-increasing exponents

\displaystyle n = \prod_{i=1}^k (p_i)^{a_i},

where p_i is the i-th prime, and a_1 \geq a_2 \geq \dots \geq a_k \geq 1. With two exceptions (n=4, 36), a_k  = 1.

Here’s a rough justification for why superabundant numbers should have a decomposition like this. If you want a number with many divisors (compared to the size of the number), you want to pack as many combinations of small primes into the decomposition of your number as possible. Using all 2’s leads to not enough combinations—only m+1 divisors for 2^m—but using 2′ and 3’s you get (r+1)(s+1) for 2^r3^s. Using more 3’s trades off a larger number n for the benefit of a larger \sigma(n) (up to r=s). The balance between getting more distinct factor combinations and a larger n favors packing the primes in there.

Though numbers of this form are not necessarily superabundant, this gives us an enumeration strategy better than trying all numbers. Enumerate over tuples corresponding to the exponents of the prime decomposition (non-increasing lists of integers), and save those primes to make it easier to compute the divisor sum.

Non-increasing lists of integers can be enumerated in the order of their sum, and for each sum N, the set of non-increasing lists of integers summing to N is called the partitions of N. There is a simple algorithm to compute them, implemented in this commit. Note this does not enumerate them in order of the magnitude of the number \prod_{i=1}^k (p_i)^{a_i}.

The implementation for the prime-factorization-based divisor sum computation is in this commit. In addition, to show some alternative methods of testing, we used the hypothesis library to autogenerate tests. It chooses a random (limited size) prime factorization, and compares the prime-factorization-based algorithm to the naive algorithm. There’s a bit of setup code involved, but as a result we get dozens of tests and more confidence it’s right.

Search Strategies

We now have two search strategies over the space of natural numbers, though one is obviously better. We may come up with a third, so it makes sense to separate the search strategy from the main application by an interface. Generally, if you have a hard-coded implementation, and you realize that you need to change it in a significant way, that’s a good opportunity to extract it and hide it behind an interface.

A good interface choice is a bit tricky here, however. In the original implementation, we could say, “process the batch of numbers (search for counterexamples) between 1 and 2 million.” When that batch is saved to the database, we would start on the next batch, and all the batches would be the same size, so (ignoring that computing \sigma(n) the old way takes longer as n grows) each batch required roughly the same time to run.

The new search strategy doesn’t have a sensible way to do this. You can’t say “start processing from K” because we don’t know how to easily get from K to the parameter of the enumeration corresponding to K (if one exists). This is partly because our enumeration isn’t monotonic increasing (2^1 3^1 5^1 = 30 comes before 2^4 = 16). And partly because even if we did have a scheme, it would almost certainly require us to compute a prime factorization, which is slow. It would be better if we could save the data from the latest step of the enumeration, and load it up when starting the next batch of the search.

This scheme suggests a nicely generic interface for stopping and restarting a search from a particular spot. The definition of a “spot,” and how to start searching from that spot, are what’s hidden by the interface. Here’s a first pass.

SearchState = TypeVar('SearchState')

class SearchStrategy(ABC):
    @abstractmethod
    def starting_from(self, search_state: SearchState) -> SearchStrategy:
        '''Reset the search strategy to search from a given state.'''
        pass

    @abstractmethod
    def search_state(self) -> SearchState:
        '''Get an object describing the current state of the enumeration.'''
        pass

    @abstractmethod
    def next_batch(self, batch_size: int) -> List[RiemannDivisorSum]:
        '''Process the next batch of Riemann Divisor Sums'''
        pass

Note that SearchState is defined as a generic type variable because we cannot say anything about its structure yet. The implementation class is responsible for defining what constitutes a search state, and getting the search strategy back to the correct step of the enumeration given the search state as input. Later I realized we do need some structure on the SearchState—the ability to serialize it for storage in the database—so we elevated it to an interface later.

Also note that we are making SearchStrategy own the job of computing the Riemann divisor sums. This is because the enumeration details and the algorithm to compute the divisor sums are now coupled. For the exhaustive search strategy it was “integers n, naively loop over smaller divisors.” In the new strategy it’s “prime factorizations, prime-factorization-based divisor sum.” We could decouple this, but there is little reason to now because the implementations are still in 1-1 correspondence.

This commit implements the old search strategy in terms of this interface, and this commit implements the new search strategy. In the latter, I use pytest.parameterize to test against the interface and parameterize over the implementations.

The last needed bit is the ability to store and recover the search state in between executions of the main program. This requires a second database table. The minimal thing we could do is just store and update a single row for each search strategy, providing the search state as of the last time the program was run and stopped. This would do, but in my opinion an append-only log is a better design for such a table. That is, each batch computed will have a record containing the timestamp the batch started and finished, along with the starting and ending search state. We can use the largest timestamp for a given search strategy to pick up where we left off across program runs.

One can imagine this being the basis for an application like folding@home or the BOINC family of projects, where a database stores chunks of a larger computation (ranges of a search space), clients can request chunks to complete, and they are assembled into a complete database. In this case we might want to associate the chunk metadata with the computed results (say, via a foreign key). That would require a bit of work from what we have now, but note that the interfaces would remain reusable for this. For now, we will just incorporate the basic table approach. It is completed in this pull request, and tying it into the main search routine is done in this commit.

However, when running it with the superabundant search strategy, we immediately run into a problem. Superabundant numbers grow too fast, and within a few small batches of size 100 we quickly exceed the 64 bits available to numba and sqlite to store the relevant data.

>>> fac = partition_to_prime_factorization(partitions_of_n(16)[167])
>>> fac2 = [p**d for (p, d) in fac]
>>> fac2
[16, 81, 625, 2401, 11, 13, 17, 19, 23, 29, 31, 37]
>>> math.log2(reduce(lambda x,y: x*y, fac2))
65.89743638933722

Running populate_database.py results in the error

$ python -m riemann.populate_database db.sqlite3 SuperabundantSearchStrategy 100
Searching with strategy SuperabundantSearchStrategy
Starting from search state SuperabundantEnumerationIndex(level=1, index_in_level=0)
Computed [1,0, 10,4] in 0:00:03.618798
Computed [10,4, 12,6] in 0:00:00.031451
Computed [12,6, 13,29] in 0:00:00.031518
Computed [13,29, 14,28] in 0:00:00.041464
Computed [14,28, 14,128] in 0:00:00.041674
Computed [14,128, 15,93] in 0:00:00.034419
...
OverflowError: Python int too large to convert to SQLite INTEGER

We’ll see what we can do about this in a future article, but meanwhile we do get some additional divisor sums for these large numbers, and 10080 is still the best.

sqlite> select n, witness_value 
from RiemannDivisorSums 
where witness_value > 1.7 and n > 5040 
order by witness_value desc 
limit 10;

10080|1.7558143389253
55440|1.75124651488749
27720|1.74253672381383
7560|1.73991651920276
15120|1.73855867428903
160626866400|1.73744669257158
321253732800|1.73706925385011
110880|1.73484901030336
6983776800|1.73417642212953
720720|1.73306535623807

Zero Knowledge Proofs for NP

Last time, we saw a specific zero-knowledge proof for graph isomorphism. This introduced us to the concept of an interactive proof, where you have a prover and a verifier sending messages back and forth, and the prover is trying to prove a specific claim to the verifier.

A zero-knowledge proof is a special kind of interactive proof in which the prover has some secret piece of knowledge that makes it very easy to verify a disputed claim is true. The prover’s goal, then, is to convince the verifier (a polynomial-time algorithm) that the claim is true without revealing any knowledge at all about the secret.

In this post we’ll see that, using a bit of cryptography, zero-knowledge proofs capture a much wider class of problems than graph isomorphism. Basically, if you believe that cryptography exists, every problem whose answers can be easily verified have zero-knowledge proofs (i.e., all of the class NP). Here are a bunch of examples. For each I’ll phrase the problem as a question, and then say what sort of data the prover’s secret could be.

  • Given a boolean formula, is there an assignment of variables making it true? Secret: a satisfying assignment to the variables.
  • Given a set of integers, is there a subset whose sum is zero? Secret: such a subset.
  • Given a graph, does it have a 3-coloring? Secret: a valid 3-coloring.
  • Given a boolean circuit, can it produce a specific output? Secret: a choice of inputs that produces the output.

The common link among all of these problems is that they are NP-hard (graph isomorphism isn’t known to be NP-hard). For us this means two things: (1) we think these problems are actually hard, so the verifier can’t solve them, and (2) if you show that one of them has a zero-knowledge proof, then they all have zero-knowledge proofs.

We’re going to describe and implement a zero-knowledge proof for graph 3-colorability, and in the next post we’ll dive into the theoretical definitions and talk about the proof that the scheme we present is zero-knowledge. As usual, all of the code used in making this post is available in a repository on this blog’s Github page. In the follow up to this post, we’ll dive into more nitty gritty details about the proof that this works, and study different kinds of zero-knowledge.

One-way permutations

In a recent program gallery post we introduced the Blum-Blum-Shub pseudorandom generator. A pseudorandom generator is simply an algorithm that takes as input a short random string of length s and produces as output a longer string, say, of length 3s. This output string should not be random, but rather “indistinguishable” from random in a sense we’ll make clear next time. The underlying function for this generator is the “modular squaring” function x \mapsto x^2 \mod M, for some cleverly chosen M. The M is chosen in such a way that makes this mapping a permutation. So this function is more than just a pseudorandom generator, it’s a one-way permutation.

If you have a primality-checking algorithm on hand (we do), then preparing the Blum-Blum-Shub algorithm is only about 15 lines of code.

def goodPrime(p):
    return p % 4 == 3 and probablyPrime(p, accuracy=100)

def findGoodPrime(numBits=512):
    candidate = 1

    while not goodPrime(candidate):
        candidate = random.getrandbits(numBits)

    return candidate

def makeModulus(numBits=512):
    return findGoodPrime(numBits) * findGoodPrime(numBits)

def blum_blum_shub(modulusLength=512):
    modulus = makeModulus(numBits=modulusLength)

    def f(inputInt):
        return pow(inputInt, 2, modulus)

    return f

The interested reader should check out the proof gallery post for more details about this generator. For us, having a one-way permutation is the important part (and we’re going to defer the formal definition of “one-way” until next time, just think “hard to get inputs from outputs”).

The other concept we need, which is related to a one-way permutation, is the notion of a hardcore predicate. Let G(x) be a one-way permutation, and let f(x) = b be a function that produces a single bit from a string. We say that f is a hardcore predicate for G if you can’t reliably compute f(x) when given only G(x).

Hardcore predicates are important because there are many one-way functions for which, when given the output, you can guess part of the input very reliably, but not the rest (e.g., if g is a one-way function, (x, y) \mapsto (x, g(y)) is also one-way, but the x part is trivially guessable). So a hardcore predicate formally measures, when given the output of a one-way function, what information derived from the input is hard to compute.

In the case of Blum-Blum-Shub, one hardcore predicate is simply the parity of the input bits.

def parity(n):
    return sum(int(x) for x in bin(n)[2:]) % 2

Bit Commitment Schemes

A core idea that will makes zero-knowledge proofs work for NP is the ability for the prover to publicly “commit” to a choice, and later reveal that choice in a way that makes it infeasible to fake their commitment. This will involve not just the commitment to a single bit of information, but also the transmission of auxiliary data that is provably infeasible to fake.

Our pair of one-way permutation G and hardcore predicate f comes in very handy. Let’s say I want to commit to a bit b \in \{ 0,1 \}. Let’s fix a security parameter that will measure how hard it is to change my commitment post-hoc, say n = 512. My process for committing is to draw a random string x of length n, and send you the pair (G(x), f(x) \oplus b), where \oplus is the XOR operator on two bits.

The guarantee of a one-way permutation with a hardcore predicate is that if you only see G(x), you can’t guess f(x) with any reasonable edge over random guessing. Moreover, if you fix a bit b, and take an unpredictably random bit y, the XOR b \oplus y is also unpredictably random. In other words, if f(x) is hardcore, then so is x \mapsto f(x) \oplus b for a fixed bit b. Finally, to reveal my commitment, I just send the string x and let you independently compute (G(x), f(x) \oplus b). Since G is a permutation, that x is the only x that could have produced the commitment I sent you earlier.

Here’s a Python implementation of this scheme. We start with a generic base class for a commitment scheme.

class CommitmentScheme(object):
    def __init__(self, oneWayPermutation, hardcorePredicate, securityParameter):
        '''
            oneWayPermutation: int -> int
            hardcorePredicate: int -> {0, 1}
        '''
        self.oneWayPermutation = oneWayPermutation
        self.hardcorePredicate = hardcorePredicate
        self.securityParameter = securityParameter

        # a random string of length `self.securityParameter` used only once per commitment
        self.secret = self.generateSecret()

    def generateSecret(self):
        raise NotImplemented

    def commit(self, x):
        raise NotImplemented

    def reveal(self):
        return self.secret

Note that the “reveal” step is always simply to reveal the secret. Here’s the implementation subclass. We should also note that the security string should be chosen at random anew for every bit you wish to commit to. In this post we won’t reuse CommitmentScheme objects anyway.

class BBSBitCommitmentScheme(CommitmentScheme):
    def generateSecret(self):
        # the secret is a random quadratic residue
        self.secret = self.oneWayPermutation(random.getrandbits(self.securityParameter))
        return self.secret

    def commit(self, bit):
        unguessableBit = self.hardcorePredicate(self.secret)
        return (
            self.oneWayPermutation(self.secret),
            unguessableBit ^ bit,  # python xor
        )

One important detail is that the Blum-Blum-Shub one-way permutation is only a permutation when restricted to quadratic residues. As such, we generate our secret by shooting a random string through the one-way permutation to get a random residue. In fact this produces a uniform random residue, since the Blum-Blum-Shub modulus is chosen in such a way that ensures every residue has exactly four square roots.

Here’s code to check the verification is correct.

class BBSBitCommitmentVerifier(object):
    def __init__(self, oneWayPermutation, hardcorePredicate):
        self.oneWayPermutation = oneWayPermutation
        self.hardcorePredicate = hardcorePredicate

    def verify(self, securityString, claimedCommitment):
        trueBit = self.decode(securityString, claimedCommitment)
        unguessableBit = self.hardcorePredicate(securityString)  # wasteful, whatever
        return claimedCommitment == (
            self.oneWayPermutation(securityString),
            unguessableBit ^ trueBit,  # python xor
        )

    def decode(self, securityString, claimedCommitment):
        unguessableBit = self.hardcorePredicate(securityString)
        return claimedCommitment[1] ^ unguessableBit

and an example of using it

if __name__ == "__main__":
    import blum_blum_shub
    securityParameter = 10
    oneWayPerm = blum_blum_shub.blum_blum_shub(securityParameter)
    hardcorePred = blum_blum_shub.parity

    print('Bit commitment')
    scheme = BBSBitCommitmentScheme(oneWayPerm, hardcorePred, securityParameter)
    verifier = BBSBitCommitmentVerifier(oneWayPerm, hardcorePred)

    for _ in range(10):
        bit = random.choice([0, 1])
        commitment = scheme.commit(bit)
        secret = scheme.reveal()
        trueBit = verifier.decode(secret, commitment)
        valid = verifier.verify(secret, commitment)

        print('{} == {}? {}; {} {}'.format(bit, trueBit, valid, secret, commitment))

Example output:

1 == 1? True; 524 (5685, 0)
1 == 1? True; 149 (22201, 1)
1 == 1? True; 476 (34511, 1)
1 == 1? True; 927 (14243, 1)
1 == 1? True; 608 (23947, 0)
0 == 0? True; 964 (7384, 1)
0 == 0? True; 373 (23890, 0)
0 == 0? True; 620 (270, 1)
1 == 1? True; 926 (12390, 0)
0 == 0? True; 708 (1895, 0)

As an exercise, write a program to verify that no other input to the Blum-Blum-Shub one-way permutation gives a valid verification. Test it on a small security parameter like n=10.

It’s also important to point out that the verifier needs to do some additional validation that we left out. For example, how does the verifier know that the revealed secret actually is a quadratic residue? In fact, detecting quadratic residues is believed to be hard! To get around this, we could change the commitment scheme reveal step to reveal the random string that was used as input to the permutation to get the residue (cf. BBSCommitmentScheme.generateSecret for the random string that needs to be saved/revealed). Then the verifier could generate the residue in the same way. As an exercise, upgrade the bit commitment an verifier classes to reflect this.

In order to get a zero-knowledge proof for 3-coloring, we need to be able to commit to one of three colors, which requires two bits. So let’s go overkill and write a generic integer commitment scheme. It’s simple enough: specify a bound on the size of the integers, and then do an independent bit commitment for every bit.

class BBSIntCommitmentScheme(CommitmentScheme):
    def __init__(self, numBits, oneWayPermutation, hardcorePredicate, securityParameter=512):
        '''
            A commitment scheme for integers of a prespecified length `numBits`. Applies the
            Blum-Blum-Shub bit commitment scheme to each bit independently.
        '''
        self.schemes = [BBSBitCommitmentScheme(oneWayPermutation, hardcorePredicate, securityParameter)
                        for _ in range(numBits)]
        super().__init__(oneWayPermutation, hardcorePredicate, securityParameter)

    def generateSecret(self):
        self.secret = [x.secret for x in self.schemes]
        return self.secret

    def commit(self, integer):
        # first pad bits to desired length
        integer = bin(integer)[2:].zfill(len(self.schemes))
        bits = [int(bit) for bit in integer]
        return [scheme.commit(bit) for scheme, bit in zip(self.schemes, bits)]

And the corresponding verifier

class BBSIntCommitmentVerifier(object):
    def __init__(self, numBits, oneWayPermutation, hardcorePredicate):
        self.verifiers = [BBSBitCommitmentVerifier(oneWayPermutation, hardcorePredicate)
                          for _ in range(numBits)]

    def decodeBits(self, secrets, bitCommitments):
        return [v.decode(secret, commitment) for (v, secret, commitment) in
                zip(self.verifiers, secrets, bitCommitments)]

    def verify(self, secrets, bitCommitments):
        return all(
            bitVerifier.verify(secret, commitment)
            for (bitVerifier, secret, commitment) in
            zip(self.verifiers, secrets, bitCommitments)
        )

    def decode(self, secrets, bitCommitments):
        decodedBits = self.decodeBits(secrets, bitCommitments)
        return int(''.join(str(bit) for bit in decodedBits))

A sample usage:

if __name__ == "__main__":
    import blum_blum_shub
    securityParameter = 10
    oneWayPerm = blum_blum_shub.blum_blum_shub(securityParameter)
    hardcorePred = blum_blum_shub.parity

    print('Int commitment')
    scheme = BBSIntCommitmentScheme(10, oneWayPerm, hardcorePred)
    verifier = BBSIntCommitmentVerifier(10, oneWayPerm, hardcorePred)
    choices = list(range(1024))
    for _ in range(10):
        theInt = random.choice(choices)
        commitments = scheme.commit(theInt)
        secrets = scheme.reveal()
        trueInt = verifier.decode(secrets, commitments)
        valid = verifier.verify(secrets, commitments)

        print('{} == {}? {}; {} {}'.format(theInt, trueInt, valid, secrets, commitments))

And a sample output:

527 == 527? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 0), (342, 1), (54363, 1), (63975, 0), (5426, 0), (9124, 1), (23973, 0), (44832, 0), (33044, 0), (68501, 0)]
67 == 67? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 1), (342, 1), (54363, 1), (63975, 1), (5426, 0), (9124, 1), (23973, 1), (44832, 1), (33044, 0), (68501, 0)]
729 == 729? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 0), (342, 1), (54363, 0), (63975, 1), (5426, 0), (9124, 0), (23973, 0), (44832, 1), (33044, 1), (68501, 0)]
441 == 441? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 1), (342, 0), (54363, 0), (63975, 0), (5426, 1), (9124, 0), (23973, 0), (44832, 1), (33044, 1), (68501, 0)]
614 == 614? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 0), (342, 1), (54363, 1), (63975, 1), (5426, 1), (9124, 1), (23973, 1), (44832, 0), (33044, 0), (68501, 1)]
696 == 696? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 0), (342, 1), (54363, 0), (63975, 0), (5426, 1), (9124, 0), (23973, 0), (44832, 1), (33044, 1), (68501, 1)]
974 == 974? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 0), (342, 0), (54363, 0), (63975, 1), (5426, 0), (9124, 1), (23973, 0), (44832, 0), (33044, 0), (68501, 1)]
184 == 184? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 1), (342, 1), (54363, 0), (63975, 0), (5426, 1), (9124, 0), (23973, 0), (44832, 1), (33044, 1), (68501, 1)]
136 == 136? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 1), (342, 1), (54363, 0), (63975, 0), (5426, 0), (9124, 1), (23973, 0), (44832, 1), (33044, 1), (68501, 1)]
632 == 632? True; [25461, 56722, 25739, 2268, 1185, 18226, 46375, 8907, 54979, 23095] [(29616, 0), (342, 1), (54363, 1), (63975, 1), (5426, 1), (9124, 0), (23973, 0), (44832, 1), (33044, 1), (68501, 1)]

Before we move on, we should note that this integer commitment scheme “blows up” the secret by quite a bit. If you have a security parameter s and an integer with n bits, then the commitment uses roughly sn bits. A more efficient method would be to simply use a good public-key encryption scheme, and then reveal the secret key used to encrypt the message. While we implemented such schemes previously on this blog, I thought it would be more fun to do something new.

A zero-knowledge proof for 3-coloring

First, a high-level description of the protocol. The setup: the prover has a graph G with n vertices V and m edges E, and also has a secret 3-coloring of the vertices \varphi: V \to \{ 0, 1, 2 \}. Recall, a 3-coloring is just an assignment of colors to vertices (in this case the colors are 0,1,2) so that no two adjacent vertices have the same color.

So the prover has a coloring \varphi to be kept secret, but wants to prove that G is 3-colorable. The idea is for the verifier to pick a random edge (u,v), and have the prover reveal the colors of u and v. However, if we run this protocol only once, there’s nothing to stop the prover from just lying and picking two distinct colors. If we allow the verifier to run the protocol many times, and the prover actually reveals the colors from their secret coloring, then after roughly |V| rounds the verifier will know the entire coloring. Each step reveals more knowledge.

We can fix this with two modifications.

  1. The prover first publicly commits to the coloring using a commitment scheme. Then when the verifier asks for the colors of the two vertices of a random edge, he can rest assured that the prover fixed a coloring that does not depend on the verifier’s choice of edge.
  2. The prover doesn’t reveal colors from their secret coloring, but rather from a random permutation of the secret coloring. This way, when the verifier sees colors, they’re equally likely to see any two colors, and all the verifier will know is that those two colors are different.

So the scheme is: prover commits to a random permutation of the true coloring and sends it to the verifier; the verifier asks for the true colors of a given edge; the prover provides those colors and the secrets to their commitment scheme so the verifier can check.

The key point is that now the verifier has to commit to a coloring, and if the coloring isn’t a proper 3-coloring the verifier has a reasonable chance of picking an improperly colored edge (a one-in-|E| chance, which is at least 1/|V|^2). On the other hand, if the coloring is proper, then the verifier will always query a properly colored edge, and it’s zero-knowledge because the verifier is equally likely to see every pair of colors. So the verifier will always accept, but won’t know anything more than that the edge it chose is properly colored. Repeating this |V|^2-ish times, with high probability it’ll have queried every edge and be certain the coloring is legitimate.

Let’s implement this scheme. First the data types. As in the previous post, graphs are represented by edge lists, and a coloring is represented by a dictionary mapping a vertex to 0, 1, or 2 (the “colors”).

# a graph is a list of edges, and for simplicity we'll say
# every vertex shows up in some edge
exampleGraph = [
    (1, 2),
    (1, 4),
    (1, 3),
    (2, 5),
    (2, 5),
    (3, 6),
    (5, 6)
]

exampleColoring = {
    1: 0,
    2: 1,
    3: 2,
    4: 1,
    5: 2,
    6: 0,
}

Next, the Prover class that implements that half of the protocol. We store a list of integer commitment schemes for each vertex whose color we need to commit to, and send out those commitments.

class Prover(object):
    def __init__(self, graph, coloring, oneWayPermutation=ONE_WAY_PERMUTATION, hardcorePredicate=HARDCORE_PREDICATE):
        self.graph = [tuple(sorted(e)) for e in graph]
        self.coloring = coloring
        self.vertices = list(range(1, numVertices(graph) + 1))
        self.oneWayPermutation = oneWayPermutation
        self.hardcorePredicate = hardcorePredicate
        self.vertexToScheme = None

    def commitToColoring(self):
        self.vertexToScheme = {
            v: commitment.BBSIntCommitmentScheme(
                2, self.oneWayPermutation, self.hardcorePredicate
            ) for v in self.vertices
        }

        permutation = randomPermutation(3)
        permutedColoring = {
            v: permutation[self.coloring[v]] for v in self.vertices
        }

        return {v: s.commit(permutedColoring[v])
                for (v, s) in self.vertexToScheme.items()}

    def revealColors(self, u, v):
        u, v = min(u, v), max(u, v)
        if not (u, v) in self.graph:
            raise Exception('Must query an edge!')

        return (
            self.vertexToScheme[u].reveal(),
            self.vertexToScheme[v].reveal(),
        )

In commitToColoring we randomly permute the underlying colors, and then compose that permutation with the secret coloring, committing to each resulting color independently. In revealColors we reveal only those colors for a queried edge. Note that we don’t actually need to store the permuted coloring, because it’s implicitly stored in the commitments.

It’s crucial that we reject any query that doesn’t correspond to an edge. If we don’t reject such queries then the verifier can break the protocol! In particular, by querying non-edges you can determine which pairs of nodes have the same color in the secret coloring. You can then chain these together to partition the nodes into color classes, and so color the graph. (After seeing the Verifier class below, implement this attack as an exercise).

Here’s the corresponding Verifier:

class Verifier(object):
    def __init__(self, graph, oneWayPermutation, hardcorePredicate):
        self.graph = [tuple(sorted(e)) for e in graph]
        self.oneWayPermutation = oneWayPermutation
        self.hardcorePredicate = hardcorePredicate
        self.committedColoring = None
        self.verifier = commitment.BBSIntCommitmentVerifier(2, oneWayPermutation, hardcorePredicate)

    def chooseEdge(self, committedColoring):
        self.committedColoring = committedColoring
        self.chosenEdge = random.choice(self.graph)
        return self.chosenEdge

    def accepts(self, revealed):
        revealedColors = []

        for (w, bitSecrets) in zip(self.chosenEdge, revealed):
            trueColor = self.verifier.decode(bitSecrets, self.committedColoring[w])
            revealedColors.append(trueColor)
            if not self.verifier.verify(bitSecrets, self.committedColoring[w]):
                return False

        return revealedColors[0] != revealedColors[1]

As expected, in the acceptance step the verifier decodes the true color of the edge it queried, and accepts if and only if the commitment was valid and the edge is properly colored.

Here’s the whole protocol, which is syntactically very similar to the one for graph isomorphism.

def runProtocol(G, coloring, securityParameter=512):
    oneWayPermutation = blum_blum_shub.blum_blum_shub(securityParameter)
    hardcorePredicate = blum_blum_shub.parity

    prover = Prover(G, coloring, oneWayPermutation, hardcorePredicate)
    verifier = Verifier(G, oneWayPermutation, hardcorePredicate)

    committedColoring = prover.commitToColoring()
    chosenEdge = verifier.chooseEdge(committedColoring)

    revealed = prover.revealColors(*chosenEdge)
    revealedColors = (
        verifier.verifier.decode(revealed[0], committedColoring[chosenEdge[0]]),
        verifier.verifier.decode(revealed[1], committedColoring[chosenEdge[1]]),
    )
    isValid = verifier.accepts(revealed)

    print("{} != {} and commitment is valid? {}".format(
        revealedColors[0], revealedColors[1], isValid
    ))

    return isValid

And an example of running it

if __name__ == "__main__":
    for _ in range(30):
        runProtocol(exampleGraph, exampleColoring, securityParameter=10)

Here’s the output

0 != 2 and commitment is valid? True
1 != 0 and commitment is valid? True
1 != 2 and commitment is valid? True
2 != 0 and commitment is valid? True
1 != 2 and commitment is valid? True
2 != 0 and commitment is valid? True
0 != 2 and commitment is valid? True
0 != 2 and commitment is valid? True
0 != 1 and commitment is valid? True
0 != 1 and commitment is valid? True
2 != 1 and commitment is valid? True
0 != 2 and commitment is valid? True
2 != 0 and commitment is valid? True
2 != 0 and commitment is valid? True
1 != 0 and commitment is valid? True
1 != 0 and commitment is valid? True
0 != 2 and commitment is valid? True
2 != 1 and commitment is valid? True
0 != 2 and commitment is valid? True
0 != 2 and commitment is valid? True
2 != 1 and commitment is valid? True
1 != 0 and commitment is valid? True
1 != 0 and commitment is valid? True
2 != 1 and commitment is valid? True
2 != 1 and commitment is valid? True
1 != 0 and commitment is valid? True
0 != 2 and commitment is valid? True
1 != 2 and commitment is valid? True
1 != 2 and commitment is valid? True
0 != 1 and commitment is valid? True

So while we haven’t proved it rigorously, we’ve seen the zero-knowledge proof for graph 3-coloring. This automatically gives us a zero-knowledge proof for all of NP, because given any NP problem you can just convert it to the equivalent 3-coloring problem and solve that. Of course, the blowup required to convert a random NP problem to 3-coloring can be polynomially large, which makes it unsuitable for practice. But the point is that this gives us a theoretical justification for which problems have zero-knowledge proofs in principle. Now that we’ve established that you can go about trying to find the most efficient protocol for your favorite problem.

Anticipatory notes

When we covered graph isomorphism last time, we said that a simulator could, without participating in the zero-knowledge protocol or knowing the secret isomorphism, produce a transcript that was drawn from the same distribution of messages as the protocol produced. That was all that it needed to be “zero-knowledge,” because anything the verifier could do with its protocol transcript, the simulator could do too.

We can do exactly the same thing for 3-coloring, exploiting the same “reverse order” trick where the simulator picks the random edge first, then chooses the color commitment post-hoc.

Unfortunately, both there and here I’m short-changing you, dear reader. The elephant in the room is that our naive simulator assumes the verifier is playing by the rules! If you want to define security, you have to define it against a verifier who breaks the protocol in an arbitrary way. For example, the simulator should be able to produce an equivalent transcript even if the verifier deterministically picks an edge, or tries to pick a non-edge, or tries to send gibberish. It takes a lot more work to prove security against an arbitrary verifier, but the basic setup is that the simulator can no longer make choices for the verifier, but rather has to invoke the verifier subroutine as a black box. (To compensate, the requirements on the simulator are relaxed quite a bit; more on that next time)

Because an implementation of such a scheme would involve a lot of validation, we’re going to defer the discussion to next time. We also need to be more specific about the different kinds of zero-knowledge, since we won’t be able to achieve perfect zero-knowledge with the simulator drawing from an identical distribution, but rather a computationally indistinguishable distribution.

We’ll define all this rigorously next time, and discuss the known theoretical implications and limitations. Next time will be cuffs-off theory, baby!

Until then!

The Blum-Blum-Shub Pseudorandom Generator

Problem: Design a random number generator that is computationally indistinguishable from a truly random number generator.

Solution (in Python): note this solution uses the Miller-Rabin primality tester, though any primality test will do. See the github repository for the referenced implementation.

from randomized.primality import probablyPrime
import random

def goodPrime(p):
    return p % 4 == 3 and probablyPrime(p, accuracy=100)

def findGoodPrime(numBits=512):
    candidate = 1
    while not goodPrime(candidate):
        candidate = random.getrandbits(numBits)
    return candidate

def makeModulus():
    return findGoodPrime() * findGoodPrime()

def parity(n):
    return sum(int(x) for x in bin(n)[2:]) % 2

class BlumBlumShub(object):
    def __init__(self, seed=None):
        self.modulus = makeModulus()
        self.state = seed if seed is not None else random.randint(2, self.modulus - 1)
        self.state = self.state % self.modulus

    def seed(self, seed):
        self.state = seed

    def bitstream(self):
        while True:
            yield parity(self.state)
            self.state = pow(self.state, 2, self.modulus)

    def bits(self, n=20):
        outputBits = ''
        for bit in self.bitstream():
            outputBits += str(bit)
            if len(outputBits) == n:
                break

        return outputBits

Discussion:

An integer x is called a quadratic residue of another integer N if it can be written as x = a^2 \mod N for some a. That is, if it’s the remainder when dividing a perfect square by N. Some numbers, like N=8, have very special patterns in their quadratic residues, only 0, 1, and 4 can occur as quadratic residues.

The core idea behind this random number generator is that, for a specially chosen modulus N, telling whether a number x is a quadratic residue mod N is hard. In fact, one can directly convert an algorithm that can predict the next bit of this random number generator (by even a slight edge) into an arbitrarily accurate quadratic-residue-decider. So if computing quadratic residues is even mildly hard, then predicting the next bit in this random number generator is very hard.

More specifically, the conjectured guarantee about this random number generator is the following: if you present a polynomial time adversary with two sequences:

  1. A truly random sequence of bits of length k,
  2. k bits from the output of the pseudorandom generator when seeded with a starting state shorter than k bits.

Then the adversary can’t distinguish between the two sequences with probability “significantly” more than 1/2, where by “significantly” I mean 1/k^c for any c>0 (i.e., the edge over randomness vanishes faster than any inverse polynomial). It turns out, due to a theorem of Yao, that this is equivalent to not being able to guess the next bit in a pseudorandom sequence with a significant edge over a random guess, even when given the previous \log(N)^{10} bits in the sequence (or any \textup{poly}(\log N) bits in the sequence).

This emphasizes a deep philosophical viewpoint in theoretical computer science, that whether some object has a property (randomness) really only depends on the power of a computationally limited observer to identify that property. If nobody can tell the difference between fake randomness and real randomness, then the fake randomness is random. Offhand I wonder whether you can meaningfully apply this view to less mathematical concepts like happiness and status.

Anyway, the modulus N is chosen in such a way that every quadratic residue of N has a unique square root which is also a quadratic residue. This makes the squaring function a bijection on quadratic residues. In other words, with a suitably chosen N, there’s no chance that we’ll end up with N=8 where there are very few quadratic residues and the numbers output by the Blum-Blum-Shub generator have a short cycle. Moreover, the assumption that detecting quadratic residues mod N is hard makes the squaring function a one-way permutation.

Here’s an example of how this generator might be used:

generator = BlumBlumShub()

hist = [0] * 2**6
for i in range(10000):
    value = int(generator.bits(6), 2)
    hist[value] += 1

print(hist)

This produces random integers between 0 and 64, with the following histogram:

bbs-hist

See these notes of Junod for a detailed exposition of the number theory behind this random number generator, with full definitions and proofs.

Learning to Love Complex Numbers

This post is intended for people with a little bit of programming experience and no prior mathematical background.

So let’s talk about numbers.

Numbers are curious things. On one hand, they represent one of the most natural things known to humans, which is quantity. It’s so natural to humans that even newborn babies are in tune with the difference between quantities of objects between 1 and 3, in that they notice when quantity changes much more vividly than other features like color or shape.

But our familiarity with quantity doesn’t change the fact that numbers themselves (as an idea) are a human invention. And they’re not like most human inventions, the kinds where you have to tinker with gears or circuits to get a machine that makes your cappuccino. No, these are mathematical inventions. These inventions exist only in our minds.

Numbers didn’t always exist. A long time ago, back when the Greeks philosophers were doing their philosophizing, negative numbers didn’t exist! In fact, it wasn’t until 1200 AD that the number zero was first considered in Europe. Zero, along with negative numbers and fractions and square roots and all the rest, were invented primarily to help people solve more problems than they could with the numbers they had available. That is, numbers were invented primarily as a way for people to describe their ideas in a useful way. People simply  wondered “is there a number whose square gives you 2?” And after a while they just decided there was and called it \sqrt{2} because they didn’t have a better name for it. 

But with these new solutions came a host of new problems. You see, although I said mathematical inventions only exist in our minds, once they’re invented they gain a life of their own. You start to notice patterns in your mathematical objects and you have to figure out why they do the things they do. And numbers are a perfectly good example of this: once I notice that I can multiply a number by itself, I can ask how often these “perfect squares” occur. That is, what’s the pattern in the numbers 1^2, 2^2, 3^2, 4^2, \dots? If you think about it for a while, you’ll find that square numbers have a very special relationship with odd numbers.

Other times, however, the things you invent turn out to make no sense at all, and you can prove they never existed in the first place! It’s an odd state of affairs, but we’re going to approach the subject of complex numbers from this mindset. We’re going to come up with a simple idea, the idea that negative numbers can be perfect squares, and explore the world of patterns it opens up. Along the way we’ll do a little bit of programming to help explore, give some simple proofs to solidify our intuition, and by the end we’ll see how these ideas can cause wonderful patterns like this one:

mandelbrot

The number i

Let’s bring the story back around to squares. One fact we all remember about numbers is that squaring a number gives you something non-negative. 7^2 = 49, (-2)^2 = 4, 0^2 = 0, and so on. But it certainly doesn’t have to be this way. What if we got sick of that stupid fact and decided to invent a new number whose square was negative? Which negative, you ask? Well it doesn’t really matter, because I can always stretch it larger or smaller so that it’s square is -1.

Let’s see how: if you say that your made-up number x makes x^2 = -7, then I can just use \frac{x}{\sqrt{7}} to get a number whose square is -1. If you’re going to invent a number that’s supposed to interact with our usual numbers, then you have to be allowed to add, subtract, and multiply x with regular old real numbers, and the usual properties would have to still work. So it would have to be true that (x / \sqrt{7})^2 = x^2 / \sqrt{7}^2 = -7/7 = -1.

So because it makes no difference (this is what mathematicians mean by, “without loss of generality”) we can assume that the number we’re inventing will have a square of negative one. Just to line up with history, let’s call the new number i. So there it is: i exists and i^2 = -1. And now that we are “asserting” that i plays nicely with real numbers, we get these natural rules for adding and subtracting and multiplying and dividing. For example

  • 1 + i is a new number, which we’ll just call 1+i. And if we added two of these together, (1+ i) + (1+i), we can combine the real parts and the i parts to get 2 + 2i. Same goes for subtraction. In general a complex number looks like a + bi, because as we’ll see in the other points you can simplify every simple arithmetic expression down to just one “real number” part and one “real number times i” part.
  • We can multiply 3 \cdot i, and we’ll just call it 3i, and we require that multiplication distributes across addition (that the FOIL rule works). So that, for example, (2 - i)(1 + 3i) = (2 + 6i - i - 3i^2) = (2 + 3) + (6i - i) = (5 + 5i).
  • Dividing is a significantly more annoying. Say we want to figure out what 1 / (1+i) is (in fact, it’s not even obvious that this should look like a regular number! But it does). The 1 / a notation just means we’re looking for a number which, when we multiply by the denominator a, we get back to 1. So we’re looking to find out when (a + bi)(1 + i) = 1 + 0i where a and b are variables we’re trying to solve for. If we multiply it out we get (a-b) + (a + b)i = 1 + 0i, and since the real part and the i part have to match up, we know that a - b = 1 and a + b = 0. If we solve these two equations, we find that a = 1/2, b = -1/2 works great. If we want to figure out something like (2 + 3i) / (1 - i), we just find out what 1 / (1- i) is first, and then multiply the result by (2+3i).

So that was tedious and extremely boring, and we imagine you didn’t even read it (that’s okay, it really is boring!). All we’re doing is establishing ground rules for the game, so if you come across some arithmetic that doesn’t make sense, you can refer back to this list to see what’s going on. And once again, for the purpose of this post, we’re asserting that all these laws hold. Maybe some laws follow from others, but as long as we don’t come up with any nasty self-contradictions we’ll be fine.

And now we turn to the real questions: is i the only square root of -1? Does i itself have a square root? If it didn’t, we’d be back to where we started, with some numbers (the non-i numbers) having square roots while others don’t. And so we’d feel the need to make all the i numbers happy by making up more numbers to be their square roots, and then worrying what if these new numbers don’t have square roots and…gah!

I’ll just let you in on the secret to save us from this crisis. It turns out that i does have a square root in terms of other i numbers, but in order to find it we’ll need to understand i from a different angle, and that angle turns out to be geometry.

Geometry? How is geometry going to help me understand numbers!?

It’s a valid question and part of why complex numbers are so fascinating. And I don’t mean geometry like triangles and circles and parallel lines (though there will be much talk of angles), I mean transformations in the sense that we’ll be “stretching,” “squishing,” and “rotating” numbers. Maybe another time I can tell you why for me “geometry” means stretching and rotating; it’s a long but very fun story.

The clever insight is that you can represent complex numbers as geometric objects in the first place. To do it, you just think of a + bi as a pair of numbers (a,b), (the pair of real part and i part), and then plot that point on a plane. For us, the x-axis will be the “real” axis, and the y-axis will be the i-axis. So the number (3 - 4i) is plotted 3 units in the positive x direction and 4 units in the negative y direction. Like this:

single-complex-number

The “j” instead of “i” is not a typo, but a disappointing fact about the programming language we used to make this image. We’ll talk more about why later.

We draw it as an arrow for a good reason. Stretching, squishing, rotating, and reflecting will all be applied to the arrow, keeping its tail fixed at the center of the axes. Sometimes the arrow is called a “vector,” but we won’t use that word because here it’s synonymous with “complex number.”

So let’s get started squishing stuff.

Stretching, Squishing, Rotating

Before we continue I should clear up some names. We call a number that has an i in it a complex number, and we call the part without the i the real part (like 2 in 2-i) and the part with i the complex part.

Python is going to be a great asset for us in exploring complex numbers, so let’s jump right into it. It turns out that Python natively supports complex numbers, and I wrote a program for drawing complex numbers. I used it to make the plot above. The program depends on a library I hate called matplotlib, and so the point of the program is to shield you from as much pain as possible and focus on complex numbers. You can use the program by downloading it from this blog’s Github page, along with everything else I made in writing this post. All you need to know how to do is call a function, and I’ve done a bit of window dressing removal to simplify things (I really hate matplotlib).

Here’s the function header:

# plotComplexNumbers : [complex] -> None
# display a plot of the given list of complex numbers
def plotComplexNumbers(numbers):
   ...

Before we show some examples of how to use it, we have to understand how to use complex numbers in Python. It’s pretty simple, except that Python was written by people who hate math, and so they decided the complex number would be represented by j instead of i (people who hate math are sometimes called “engineers,” and they use j out of spite. Not really, though).

So in Python it’s just like any other computation. For example:

>>> (1 + 1j)*(4 - 2j) == (6+2j)
True
>>> 1 / (1+1j)
(0.5-0.5j)

And so calling the plotting function with a given list of complex numbers is as simple as importing the module and calling the function

from plotcomplex import plot
plot.plotComplexNumbers([(-1+1j), (1+2j), (-1.5 - 0.5j), (.6 - 1.8j)])

Here’s the result

example-complex-plot

So let’s use plots like this one to explore what “multiplication by i” does to a complex number. It might not seem exciting at first, but I promise there’s a neat punchline.

Even without plotting it’s pretty easy to tell what multiplying by i does to some numbers. It takes 1 to i, moves i to i^2 = -1, it takes -1 to -i, and -i to -i \cdot i = 1.

What’s the pattern in these? well if we plot all these numbers, they’re all at right angles in counter-clockwise order. So this might suggest that multiplication by i does some kind of rotation. Is that always the case? Well lets try it with some other more complicated numbers. Click the plots below to enlarge.

Well, it looks close but it’s hard to tell. Some of the axes are squished and stretched, so it might be that our images don’t accurately represent the numbers (the real world can be such a pain). Well when visual techniques fail, we can attempt to prove it.

Clearly multiplying by i does some kind of rotation, maybe with other stuff too, and it shouldn’t be so hard to see that multiplying by i does the same thing no matter which number you use (okay, the skeptical readers will say that’s totally hard to see, but we’ll prove it super rigorously in a minute). So if we take any number and multiply it by i once, then twice, then three times, then four, and if we only get back to where we started at four multiplications, then each rotation had to be a quarter turn.

Indeed,

\displaystyle (a + bi) i^4 = (ai - b) i^3 = (-a - bi) i^2 = (-ai + b) i = a + bi

This still isn’t all that convincing, and we want to be 100% sure we’re right. What we really need is a way to arithmetically compute the angle between two complex numbers in their plotted forms. What we’ll do is find a way to measure the angle of one complex number with the x-axis, and then by subtraction we can get angles between arbitrary points. For example, in the figure below \theta = \theta_1 - \theta_2.

angle-example

One way to do this is with trigonometry: the geometric drawing of a + bi is the hypotenuse of a right triangle with the x-axis.

triangle-example

And so if r is the length of the arrow, then by the definition of sine and cosine, \cos(\theta) = a/r, \sin(\theta) = b/r. If we have r, \theta, and r > 0, we can solve for a unique a and b, so instead of representing a complex number in terms of the pair of numbers (a,b), we can represent it with the pair of numbers (r, \theta). And the conversion between the two is just

a + bi = r \cos(\theta) + (r \sin(\theta)) i

The (r, \theta) representation is called the polar representation, while the (a,b) representation is called the rectangular representation or the Cartesian representation. Converting between polar and Cartesian coordinates fills the pages of many awful pre-calculus textbooks (despite the fact that complex numbers don’t exist in classical calculus). Luckily for us Python has built-in functions to convert between the two representations for us.

>>> import cmath
>>> cmath.polar(1 + 1j)
(1.4142135623730951, 0.7853981633974483)
>>> z = cmath.polar(1 + 1j)
>>> cmath.rect(z[0], z[1])
(1.0000000000000002+1j)

It’s a little bit inaccurate on the rounding, but it’s fine for our purposes.

So how do we compute the angle between two complex numbers? Just convert each to the polar form, and subtract the second coordinates. So if we get back to our true goal, to figure out what multiplication by i does, we can just do everything in polar form. Here’s a program that computes the angle between two complex numbers.

def angleBetween(z, w):
   zPolar, wPolar = cmath.polar(z), cmath.polar(w)
   return wPolar[1] - zPolar[1]

print(angleBetween(1 + 1j, (1 + 1j) * 1j))
print(angleBetween(2 - 3j, (2 - 3j) * 1j))
print(angleBetween(-0.5 + 7j, (-0.5 + 7j) * 1j))

Running it gives

1.5707963267948966
1.5707963267948966
-4.71238898038469

Note that the decimal form of \pi/2 is 1.57079…, and that the negative angle is equivalent to \pi/2 if you add a full turn of 2\pi to it. So programmatically we can see that for every input we try multiplying by i rotates 90 degrees.

But we still haven’t proved it works. So let’s do that now. To say what the angle is between r \cos (\theta) + ri \sin (\theta) and i \cdot [r \cos (\theta) + ri \sin(\theta)] = -r \sin (\theta) + ri \cos(\theta), we need to transform the second number into the usual polar form (where the i is on the sine part and not the cosine part). But we know, or I’m telling you now, this nice fact about sine and cosine:

\displaystyle \sin(\theta + \pi/2) = cos(\theta)
\displaystyle \cos(\theta + \pi / 2) = -\sin(\theta)

This fact is maybe awkward to write out algebraically, but it’s just saying that if you shift the whole sine curve a little bit you get the cosine curve, and if you keep shifting it you get the opposite of the sine curve (and if you kept shifting it even more you’d eventually get back to the sine curve; they’re called periodic for this reason).

So immediately we can rewrite the second number as r \cos(\theta + \pi/2) + i r \sin (\theta + \pi/2). The angle is the same as the original angle plus a right angle of \pi/2. Neat!

Applying this same idea to (a + bi) \cdot (c + di), it’s not much harder to prove that multiplying two complex numbers in general multiplies their lengths and adds their angles. So if a complex number z has its magnitude r smaller than 1, multiplying by z squishes and rotates whatever is being multiplied. And if the magnitude is greater than 1, it stretches and rotates. So we have a super simple geometric understanding of how arithmetic with complex numbers works. And as we’re about to see, all this stretching and rotating results in some really weird (and beautifully mysterious!) mathematics and programs.

But before we do that we still have one question to address, the question that started this whole geometric train of thought: does i have a square root? Indeed, I’m just looking for a number such that, when I square its length and double its angle, I get i = \cos(\pi/2) + i \sin(\pi/2). Indeed, the angle we want is \pi/4, and the length we want is r = 1, which means \sqrt{i} = \cos(\pi/4) + i \sin(\pi/4). Sweet! There is another root if you play with the signs, see if you can figure it out.

In fact it’s a very deeper and more beautiful theorem (“theorem” means “really important fact”) called the fundamental theorem of algebra. And essentially it says that the complex numbers are complete. That is, we can always find square roots, cube roots, or anything roots of numbers involving i. It actually says a lot more, but it’s easier to appreciate the rest of it after you do more math than we’re going to do in this post.

On to pretty patterns!

The Fractal

So here’s a little experiment. Since every point in the plane is the end of some arrow representing a complex number, we can imagine transforming the entire complex plane by transforming each number by the same rule. The most interesting simple rule we can think of: squaring! So though it might strain your capacity for imagination, try to visualize the idea like this. Squaring a complex number is the same as squaring it’s length and doubling its angle. So imagine: any numbers whose arrows are longer than 1 will grow much bigger, arrows shorter than 1 will shrink, and arrows of length exactly one will stay the same length (arrows close to length 1 will grow/shrink much more slowly than those far away from 1). And complex numbers with small positive angles will increase their angle, but only a bit, while larger angles will grow faster.

Here’s an animation made by Douglas Arnold showing what happens to the set of complex numbers a + bi with 0 \leq a, b \leq 1 or -1 < a,b < 0. Again, imagine every point is the end of a different arrow for the corresponding complex number. The animation is for a single squaring, and the points move along the arc they would travel if one rotated/stretched them smoothly.

complex-squaring

So that’s pretty, but this is by all accounts a well-behaved transformation. It’s “predictable,” because for example we can always tell which complex numbers will get bigger and bigger (in length) and which will get smaller.

What if, just for the sake of tinkering, we changed the transformation a little bit? That is, instead of sending z = a+bi to z^2 (I’ll often write this z \mapsto z^2), what if we sent

\displaystyle z \mapsto z^2 + 1

Now it’s not so obvious: which numbers will grow and which will shrink? Notice that it’s odd because adding 1 only changes the real part of the number. So a number whose length is greater than 1 can become small under this transformation. For example, i is sent to 0, so something slightly larger would also be close to zero. Indeed, 5i/4 \mapsto -9/16.

So here’s an interesting question: are there any complex numbers that will stay small even if I keep transforming like this forever? Specifically, if I call f(z) = z^2, and I call f^2(z) = f(f(z)), and likewise call f^k(z) for k repeated transformations of z, is there a number z so that for every k, the value f^k(z) < 2? “Obvious” choices like z=0 don’t work, and neither do random guesses like z=i or z=1. So should we guess the answer is no?

Before we jump to conclusions let’s write a program to see what happens for more than our random guesses. The program is simple: we’ll define the “square plus one” function, and then repeatedly apply that function to a number for some long number of times (say, 250 times). If the length of the number stays under 2 after so many tries, we’ll call it “small forever,” and otherwise we’ll call it “not small forever.”

def squarePlusOne(z):
   return z*z + 1

def isSmallForever(z, f):
   k = 0

   while abs(z) < 2: z = f(z) k += 1 if k > 250:
         return True

   return False

This isSmallForever function is generic: you can give it any function f and it will repeatedly call f on z until the result grows bigger than 2 in length. Note that the abs function is a built-in Python function for computing the length of a complex number.

Then I wrote a classify function, which you can give a window and a small increment, and it will produce a grid of zeros and ones marking the results of isSmallForever. The details of the function are not that important. I also wrote a function that turns the grid into a picture. So here’s an example of how we’d use it:

from plotcomplex.plot import gridToImage

def classifySquarePlusOne(z):
   return isSmallForever(z, squarePlusOne)

grid = classify(classifySquarePlusOne) # the other arguments are defaulted to [-2,2], [-2,2], 0.1
gridToImage(grid)

And here’s the result. Points colored black grow beyond 2, and white points stay small for the whole test.

Looks like they'll always grow big.

Looks like they’ll always grow big.

So it looks like repeated squaring plus one will always make complex numbers grow big. That’s not too exciting, but we can always make it more exciting. What happens if we replace the 1 in z^2 + 1 with a different complex number? For example, if we do z^2 - 1 then will things always grow big?

You can randomly guess and see that 0 will never grow big, because 0^2 - 1 = -1 and (-1)^2 - 1 = 0. It will just oscillate forever. So with -1 some numbers will grow and some will not! Let’s use the same routine above to see which:

def classifySquareMinusOne(z):
      return isSmallForever(z, squareMinusOne)

grid = classify(classifySquareMinusOne)
gridToImage(grid)

And the result:

second-attempt

Now that’s a more interesting picture! Let’s ramp up the resolution

grid = classify(classifySquareMinusOne, step=0.001)
gridToImage(grid)

second-attempt-zoomed

Gorgeous. If you try this at home you’ll notice, however, that this took a hell of a long time to run. Speeding up our programs is very possible, but it’s a long story for another time. For now we can just be patient.

Indeed, this image has a ton of interesting details! It looks almost circular in the middle, but if we zoom in we can see that it’s more like a rippling wave

second-attempt-zoomed2

It’s pretty incredible, and a huge question is jumping out at me: what the heck is causing this pattern to occur? What secret does -1 know that +1 doesn’t that makes the resulting pattern so intricate?

But an even bigger question is this. We just discovered that some values of c make z \mapsto z^2 + c result in interesting patterns, and some do not! So the question is which ones make interesting patterns? Even if we just, say, fix the starting point to zero: what is the pattern in the complex numbers that would tell me when this transformation makes zero blow up, and when it keeps zero small?

Sounds like a job for another program. This time we’ll use a nice little Python feature called a closure, which we define a function that saves the information that exists when it’s created for later. It will let us write a function that takes in c and produces a function that transforms according to z \mapsto z^2+c.

def squarePlusC(c):
   def f(z):
      return z*z + c

   return f

And we can use the very same classification/graphing function from before to do this.

def classifySquarePlusC(c):
   return isSmallForever(0, squarePlusC(c))

grid = classify(classifySquarePlusC, xRange=(-2, 1), yRange=(-1, 1), step=0.005)
gridToImage(grid)

And the result:

mandelbrot

Stunning. This wonderful pattern, which is still largely not understood today, is known as the Mandelbrot set. That is, the white points are the points in the Mandlebrot set, and the black points are not in it. The detail on the border of this thing is infinitely intricate. For example, we can change the window in our little program to zoom in on a particular region.

mandelbrot-zoomed

And if you keep zooming in you keep getting more and more detail. This was true of the specific case of z^2 - 1, but somehow the patterns in the Mandelbrot set are much more varied and interesting. And if you keep going down eventually you’ll see patterns that look like the original Mandelbrot set. We can already kind of see that happening above. The name for this idea is a fractal, and the z^2 - 1 image has it too. Fractals are a fascinating and mysterious subject studied in a field called discrete dynamical systems. Many people dedicate their entire lives to studying these things, and it’s for good reason. There’s a lot to learn and even more that’s unknown!

So this is the end of our journey for now. I’ve posted all of the code we used in the making of this post so you can continue to play, but here are some interesting ideas.

  • The Mandelbrot set (and most fractals) are usually colored. The way they’re colored is as follows. Rather than just say true or false when zero blows up beyond 2 in length, you return the number of iterations k that happened. Then you pick a color based on how big k is. There’s a link below that lets you play with this. In fact, adding colors shows that there is even more intricate detail happening outside the Mandelbrot set that’s too faint to see in our pictures above. Such as this.
  • Some very simple questions about fractals are very hard to answer. For example, is the Mandelbrot set connected? That is, is it possible to “walk” from every point in the Mandelbrot set to every other point without leaving the set? Despite the scattering of points in the zoomed in picture above that suggest the answer is no, the answer is actually yes! This is a really difficult thing to prove, however.
  • The patterns in many fractals are often used to generate realistic looking landscapes and generate pseudo randomness. So fractals are not just mathematical curiosities.
  • You should definitely be experimenting with this stuff! What happens if you change the length threshold from 2 to some bigger number? What about a smaller number? What if you do powers different than 2? There’s so much to explore!
  • The big picture thing to take away from this is that it’s not the numbers themselves that are particularly interesting, it’s the transformations of the numbers that generate these patterns! The interesting questions are what kinds of things are the same under these transformations, and what things are different. This is a very general idea in mathematics, and the more math you do the more you’ll find yourself wondering about useful and bizarre transformations.

For the chance to keep playing with the Mandelbrot set, check out this Mandelbrot grapher that works in your browser. It lets you drag rectangles to zoom further in on regions of interest. It’s really fun.

Until next time!