Sending and Authenticating Messages with Elliptic Curves

Last time we saw the Diffie-Hellman key exchange protocol, and discussed the discrete logarithm problem and the related Diffie-Hellman problem, which form the foundation for the security of most protocols that use elliptic curves. Let’s continue our journey to investigate some more protocols.

Just as a reminder, the Python implementations of these protocols are not at all meant for practical use, but for learning purposes. We provide the code on this blog’s Github page, but for the love of security don’t actually use them.

Shamir-Massey-Omura

Recall that there are lots of ways to send encrypted messages if you and your recipient share some piece of secret information, and the Diffie-Hellman scheme allows one to securely generate a piece of shared secret information. Now we’ll shift gears and assume you don’t have a shared secret, nor any way to acquire one. The first cryptosystem in that vein is called the Shamir-Massey-Omura protocol. It’s only slightly more complicated to understand than Diffie-Hellman, and it turns out to be equivalently difficult to break.

The idea is best explained by metaphor. Alice wants to send a message to Bob, but all she has is a box and a lock for which she has the only key. She puts the message in the box and locks it with her lock, and sends it to Bob. Bob can’t open the box, but he can send it back with a second lock on it for which Bob has the only key. Upon receiving it, Alice unlocks her lock, sends the box back to Bob, and Bob can now open the box and retrieve the message.

To celebrate the return of Game of Thrones, we’ll demonstrate this protocol with an original Lannister Infographic™.

Assuming the box and locks are made of magical unbreakable Valyrian steel, nobody but Jamie will be able to read the message.

Assuming the box and locks are made of magically unbreakable Valyrian steel, nobody but Bob (also known as Jamie) will be able to read the message.

Now fast forward through the enlightenment, industrial revolution, and into the age of information. The same idea works, and it’s significantly faster over long distances. Let $ C$ be an elliptic curve over a finite field $ k$ (we’ll fix $ k = \mathbb{Z}/p$ for some prime $ p$, though it works for general fields too). Let $ n$ be the number of points on $ C$.

Alice’s message is going to be in the form of a point $ M$ on $ C$. She’ll then choose her secret integer $ 0 < s_A < p$ and compute $ s_AM$ (locking the secret in the box), sending the result to Bob. Bob will likewise pick a secret integer $ s_B$, and send $ s_Bs_AM$ back to Alice.

Now the unlocking part: since $ s_A \in \mathbb{Z}/p$ is a field, Alice can “unlock the box” by computing the inverse $ s_A^{-1}$ and computing $ s_BM = s_A^{-1}s_Bs_AM$. Now the “box” just has Bob’s lock on it. So Alice sends $ s_BM$ back to Bob, and Bob performs the same process to evaluate $ s_B^{-1}s_BM = M$, thus receiving the message.

Like we said earlier, the security of this protocol is equivalent to the security of the Diffie-Hellman problem. In this case, if we call $ z = s_A^{-1}$ and $ y = s_B^{-1}$, and $ P = s_As_BM$, then it’s clear that any eavesdropper would have access to $ P, zP$, and $ yP$, and they would be tasked with determining $ zyP$, which is exactly the Diffie-Hellman problem.

Now Alice’s secret message comes in the form of a point on an elliptic curve, so how might one translate part of a message (which is usually represented as an integer) into a point? This problem seems to be difficult in general, and there’s no easy answer. Here’s one method originally proposed by Neal Koblitz that uses a bit of number theory trickery.

Let $ C$ be given by the equation $ y^2 = x^3 + ax + b$, again over $ \mathbb{Z}/p$. Suppose $ 0 \leq m < p/100$ is our message. Define for any $ 0 \leq j < 100$ the candidate $ x$-points $ x_j = 100m + j$. Then call our candidate $ y^2$-values $ s_j = x_j^3 + ax_j + b$. Now for each $ j$ we can compute $ x_j, s_j$, and so we’ll pick the first one for which $ s_j$ is a square in $ \mathbb{Z}/p$ and we’ll get a point on the curve. How can we tell if $ s_j$ is a square? One condition is that $ s_j^{(p-1)/2} \equiv 1 \mod p$. This is a basic fact about quadratic residues modulo primes; see these notes for an introduction and this Wikipedia section for a dense summary.

Once we know it’s a square, we can compute the square root depending on whether $ p \equiv 1 \mod 4$ or $ p \equiv 3 \mod 4$. In the latter case, it’s just $ s_j^{(p+1)/4} \mod p$. Unfortunately the former case is more difficult (really, the difficult part is $ p \equiv 1 \mod 8$). You can see Section 1.5 of this textbook for more details and three algorithms, or you could just pick primes congruent to 3 mod 4.

I have struggled to find information about the history of the Shamir-Massey-Omura protocol; every author claims it’s not widely used in practice, and the only reason seems to be that this protocol doesn’t include a suitable method for authenticating the validity of a message. In other words, some “man in the middle” could be intercepting messages and tricking you into thinking he is your intended recipient. Coupling this with the difficulty of encoding a message as a point seems to be enough to make cryptographers look for other methods. Another reason could be that the system was patented in 1982 and is currently held by SafeNet, one of the US’s largest security providers. All of their products have generic names so it’s impossible to tell if they’re actually using Shamir-Massey-Omura. I’m no patent lawyer, but it could simply be that nobody else is allowed to implement the scheme.

Digital Signatures

Indeed, the discussion above raises the question: how does one authenticate a message? The standard technique is called a digital signature, and we can implement those using elliptic curve techniques as well. To debunk the naive idea, one cannot simply attach some static piece of extra information to the message. An attacker could just copy that information and replicate it to forge your signature on another, potentially malicious document. In other words, a signature should only work for the message it was used to sign. The technique we’ll implement was originally proposed by Taher Elgamal, and is called the ElGamal signature algorithm. We’re going to look at a special case of it.

So Alice wants to send a message $ m$ with some extra information that is unique to the message and that can be used to verify that it was sent by Alice. She picks an elliptic curve $ E$ over $ \mathbb{F}_q$ in such a way that the number of points on $ E$ is $ br$, where $ b$ is a small integer and $ r$ is a large prime.

Then, as in Diffie-Hellman, she picks a base point $ Q$ that has order $ r$ and a secret integer $ s$ (which is permanent), and computes $ P = sQ$. Alice publishes everything except $ s$:

Public information: $ \mathbb{F}_q, E, b, r, Q, P &fg=000000$

Let Alice’s message $ m$ be represented as an integer at most $ r$ (there are a few ways to get around this if your message is too long). Now to sign $ m$ Alice picks a message specific $ k < r$ and computes what I’ll call the auxiliary point $ A = kQ$. Let $ A = (x, y)$. Alice then computes the signature $ g = k^{-1}(m + s x) \mod r$. The signed message is then $ (m, A, g)$, which Alice can safely send to Bob.

Before we see how Bob verifies the message, notice that the signature integer involves everything: Alice’s secret key, the message-specific secret integer $ k$, and most importantly the message. Remember that this is crucial: we want the signature to work only for the message that it was used to sign. If the same $ k$ is used for multiple messages then the attacker can find out your secret key! (And this has happened in practice; see the end of the post.)

So Bob receives $ (m, A, g)$, and also has access to all of the public information listed above. Bob authenticates the message by computing the auxiliary point via a different route. First, he computes $ c = g^{-1} m \mod r$ and $ d = g^{-1}x \mod r$, and then $ A’ = cQ + dP$. If the message was signed by Alice then $ A’ = A$, since we can just write out the definition of everything:

authentication-formula

Now to analyze the security. The attacker wants to be able to take any message $ m’$ and produce a signature $ A’, g’$ that will pass validation with Alice’s public information. If the attacker knew how to solve the discrete logarithm problem efficiently this would be trivial: compute $ s$ and then just sign like Alice does. Without that power there are still a few options. If the attacker can figure out the message-specific integer $ k$, then she can compute Alice’s secret key $ s$ as follows.

Given $ g = k^{-1}(m + sx) \mod r$, compute $ kg \equiv (m + sx) \mod r$. Compute $ d = gcd(x, r)$, and you know that this congruence has only $ d$ possible solutions modulo $ r$. Since $ s$ is less than $ r$, the attacker can just try all options until they find $ P = sQ$. So that’s bad, but in a properly implemented signature algorithm finding $ k$ is equivalently hard to solving the discrete logarithm problem, so we can assume we’re relatively safe from that.

On the other hand one could imagine being able to conjure the pieces of the signature $ A’, g’$ by some method that doesn’t involve directly finding Alice’s secret key. Indeed, this problem is less well-studied than the Diffie-Hellman problem, but most cryptographers believe it’s just as hard. For more information, this paper surveys the known attacks against this signature algorithm, including a successful attack for fields of characteristic two.

Signature Implementation

We can go ahead and implement the signature algorithm once we’ve picked a suitable elliptic curve. For the purpose of demonstration we’ll use a small curve, $ E: y^2 = x^3 + 3x + 181$ over $ F = \mathbb{Z}/1061$, whose number of points happens to have the a suitable prime factorization ($ 1047 = 3 \cdot 349$). If you’re interested in counting the number of points on an elliptic curve, there are many theorems and efficient algorithms to do this, and if you’ve been reading this whole series something then an algorithm based on the Baby-Step Giant-Step idea would be easy to implement. For the sake of brevity, we leave it as an exercise to the reader.

Note that the code we present is based on the elliptic curve and finite field code we’re been implementing as part of this series. All of the code used in this post is available on this blog’s Github page.

The basepoint we’ll pick has to have order 349, and $ E$ has plenty of candidates. We’ll use $ (2, 81)$, and we’ll randomly generate a secret key that’s less than $ 349$ (eight bits will do). So our setup looks like this:

if __name__ == &quot;__main__&quot;:
   F = FiniteField(1061, 1)

   # y^2 = x^3 + 3x + 181
   curve = EllipticCurve(a=F(3), b=F(181))
   basePoint = Point(curve, F(2), F(81))
   basePointOrder = 349
   secretKey = generateSecretKey(8)
   publicKey = secretKey * basePoint

Then so sign a message we generate a random key, construct the auxiliary point and the signature, and return:

def sign(message, basePoint, basePointOrder, secretKey):
   modR = FiniteField(basePointOrder, 1)
   oneTimeSecret = generateSecretKey(len(bin(basePointOrder)) - 3) # numbits(order) - 1

   auxiliaryPoint = oneTimeSecret * basePoint
   signature = modR(oneTimeSecret).inverse() *
         (modR(message) + modR(secretKey) * modR(auxiliaryPoint[0]))

   return (message, auxiliaryPoint, signature)

So far so good. Note that we generate the message-specific $ k$ at random, and this implies we need a high-quality source of randomness (what’s called a cryptographically-secure pseudorandom number generator). In absence of that there are proposed deterministic methods for doing it. See this draft proposal of Thomas Pornin, and this paper of Daniel Bernstein for another.

Now to authenticate, we follow the procedure from earlier.

def authentic(signedMessage, basePoint, basePointOrder, publicKey):
   modR = FiniteField(basePointOrder, 1)
   (message, auxiliary, signature) = signedMessage

   sigInverse = modR(signature).inverse() # sig can be an int or a modR already
   c, d = sigInverse * modR(message), sigInverse * modR(auxiliary[0])

   auxiliaryChecker = int(c) * basePoint + int(d) * publicKey
   return auxiliaryChecker == auxiliary

Continuing with our example, we pick a message represented as an integer smaller than $ r$, sign it, and validate it.

&gt;&gt;&gt; message = 123
&gt;&gt;&gt; signedMessage = sign(message, basePoint, basePointOrder, secretKey)
&gt;&gt;&gt; signedMessage
(123, (220 (mod 1061), 234 (mod 1061)), 88 (mod 349))
&gt;&gt;&gt; authentic(signedMessage, basePoint, basePointOrder, publicKey)
True

So there we have it, a nice implementation of the digital signature algorithm.

When Digital Signatures Fail

As we mentioned, it’s extremely important to avoid using the same $ k$ for two different messages. If you do, then you’ll get two signed messages $ (m_1, A_1, g_1), (m_2, A_2, g_2)$, but by definition the two $ g$’s have a ton of information in common! An attacker can recognize this immediately because $ A_1 = A_2$, and figure out the secret key $ s$ as follows. First write

$ \displaystyle g_1 – g_2 \equiv k^{-1}(m_1 + sx) – k^{-1}(m_2 + sx) \equiv k^{-1}(m_1 – m_2) \mod r$.

Now we have something of the form $ \text{known}_1 \equiv (k^{-1}) \text{known}_2 \mod r$, and similarly to the attack described earlier we can try all possibilities until we find a number that satisfies $ A = kQ$. Then once we have $ k$ we have already seen how to find $ s$. Indeed, it would be a good exercise for the reader to implement this attack.

The attack we just described it not an idle threat. Indeed, the Sony corporation, producers of the popular Playstation video game console, made this mistake in signing software for Playstation 3. A digital signature algorithm makes sense to validate software, because Sony wants to ensure that only Sony has the power to publish games. So Sony developers act as one party signing the data on a disc, and the console will only play a game with a valid signature. Note that the asymmetric setup is necessary because if the console had shared a secret with Sony (say, stored as plaintext within the hardware of the console), anyone with physical access to the machine could discover it.

Now here come the cringing part. Sony made the mistake of using the same $ k$ to sign every game! Their mistake was discovered in 2010 and made public at a cryptography conference. This video of the humorous talk includes a description of the variant Sony used and the attacker describe how the mistake should have been corrected. Without a firmware update (I believe Sony’s public key information was stored locally so that one could authenticate games without an internet connection), anyone could sign a piece of software and create games that are indistinguishable from something produced by Sony. That includes malicious content that, say, installs software that sends credit card information to the attacker.

So here we have a tidy story: a widely used cryptosystem with a scare story of what will go wrong when you misuse it. In the future of this series, we’ll look at other things you can do with elliptic curves, including factoring integers and testing for primality. We’ll also see some normal forms of elliptic curves that are used in place of the Weierstrass normal form for various reasons.

Until next time!

Encryption & RSA

This post assumes working knowledge of elementary number theory. Luckily for the non-mathematicians, we cover all required knowledge and notation in our number theory primer.

So Three Thousand Years of Number Theory Wasn’t Pointless

It’s often tough to come up with concrete applications of pure mathematics. In fact, before computers came along mathematics was used mostly for navigation, astronomy, and war. In the real world it almost always coincided with the physical sciences. Certainly the esoteric field of number theory didn’t help to track planets or guide ships. It was just for the amusement and artistic expression of mathematicians.

Despite number theory’s apparent uselessness, mathematicians invested a huge amount of work in it, searching for distributions of primes and inventing ring theory in the pursuit of algebraic identities. Indeed some of the greatest open problems in mathematics today are still number theoretical: the infamous Goldbach Conjecture, the Twin Prime Conjecture, and the Collatz Conjecture all have simple statements, but their proofs or counterexamples have eluded mathematicians for hundreds of years. Solutions to these problems, which are generally deemed beyond the grasp of an average mathematician, would certainly bring with them large prizes and international fame.

Putting aside its inherent beauty, until recently there was no use for number theory at all. But nowadays we have complex computer simulated models, statistical analysis, graphics, computing theory, signal processing, and optimization problems. So even very complex mathematics finds its way into most of what we do on a daily basis.

And, of course, number theory also has its place: in cryptography.

The history of cryptography is long and fascinating. The interested reader will find a wealth of information through the article and subsequent links on Wikipedia. We focus on one current method whose security is mathematically sound.

The Advent of Public Keys

Until 1976 (two years before the RSA method was born), all encryption methods followed the same pattern:

  1. At an earlier date, the sender and recipient agree on a secret parameter called a key, which is used both to encrypt and decrypt the message.
  2. The message is encrypted by the sender, sent to the recipient, and then decrypted in privacy.

This way, any interceptor could not read the message without knowing the key and the encryption method. Of course, there were various methods of attacking the ciphers, but for the most part this was a safe method.

The problem is protecting the key. Since the two communicating parties had to agree on a key that nobody else could know, they either had to meet in person or trust an aide to communicate the key separately. Risky business for leaders of distant allied nations.

Then, in 1976, two researchers announced a breakthrough: the sender and recipient need not share the same key! Instead, everybody who wanted private communication has two keys: one private, and one public. The public key is published in a directory, while the private key is kept secret, so that only the recipient need know it.

Anyone wishing to send a secure message would then encrypt the message with the recipient’s public key. The message could only be decrypted with the recipient’s private key. Even the sender couldn’t decrypt his own message!

The astute reader might question whether such an encryption method is possible: certainly every deterministic computation is reversible. Indeed, in theory it is possible to reverse the encryption method. However, as we will see it is computationally unfeasible. With the method we are about to investigate (disregarding any future mathematical or quantum breakthroughs), it would take a mind-bogglingly long time to do so. And, of course, the method works through the magic of number theory.

RSA

Rivest, Shamir, and Adleman. They look like pretty nice guys.

RSA, an acronym which stands for the algorithm’s inventors, Rivest, Shamir, and Adleman, is such a public-key encryption system. It is one of the most widely-used ciphers, and it depends heavily on the computational intractability of two problems in number theory: namely factoring integers and taking modular roots.

But before we get there, let us develop the method. Recall Euler’s totient function, $ \varphi(n)$.

Definition: Let $ n$ be a positive integer. $ \varphi(n)$ is the number of integers between 1 and $ n$ relatively prime to $ n$.

There is a famous theorem due to Euler that states if $ a, n$ are relatively prime integers, then

$ \displaystyle a^{\varphi(n)} \equiv 1 \mod{n}$

In other words, if we raise $ a$ to that power, its remainder after dividing by $ n$ is 1. Group theorists will recognize this immediately from Lagrange’s Theorem. While it is possible to prove it with elementary tools, we will not do so here. We cover the full proof of this theorem in our number theory primer.

In particular, we notice the natural next result that $ a^{k \varphi(n) + 1} \equiv a \mod{n}$ for any $ k$, since this is just

$ \displaystyle (a^{\varphi(n)})^k \cdot a \equiv 1^ka \equiv a \mod{n}$.

If we could break up $ k \varphi(n) + 1$ into two smaller numbers, say $ e,d$, then we could use exponentiation as our encryption and decryption method. While that is the entire idea of RSA in short, it requires a bit more detail:

Let $ M$ be our message, encoded as a single number less than $ n$. We call $ n$ the modulus, and for the sake of argument let us say $ M$ and $ n$ are relatively prime. Then by Euler’s theorem, $ M^{\varphi(n)} \equiv 1 \mod{n}$. In particular, let us choose a public key $ e$ (for encryption), and raise $ M^e \mod{n}$. This is the encrypted message. Note that both $ e$ and $ n$ are known to the encryptor, and hence the general public. Upon receiving such a message, the recipient may use his private key $ d = e^{-1} \mod{\varphi(n)}$ to decrypt the message. We may pick $ e$ to be relatively prime to $ \varphi(n)$, to ensure that such a $ d$ exists. Then $ ed \equiv 1 \mod{\varphi(n)}$, and so by Euler’s theorem

$ \displaystyle (M^e)^d = M^{ed} = M^{k \varphi(n) + 1} \equiv M \mod{n}$

By exponentiating the encrypted text with the right private key, we recover the original message, and our secrets are safe from prying eyes.

Now for the messy detail: Where did $ n$ come from? And how we can actually compute all this junk?

First, in order to ensure $ M < n$ for a reasonably encoded message $ M$, we require that $ n$ is large. Furthermore, since we make both $ n$ and $ e$ public, we have to ensure that $ \varphi(n)$ is hard to compute, for if an attacker could determine $ \varphi(n)$ from $ n$, then $ e^{-1} \mod{\varphi(n)}$ would be trivial to compute. In addition, one could theoretically compute all the $ e$th roots of $ M^e$ modulo $ n$.

We solve these problems by exploiting their computational intractability. We find two enormous primes $ p,q$, and set $ n = pq$. First, recall that the best known way to compute $ \varphi(n)$ is by the following theorem:

Theorem: For $ p,q$ primes, $ \varphi(p^k) = p^k – p^{k-1}$, and $ \varphi(p^j q^k) = \varphi(p^j)\varphi(q^k)$.

In this way, we can compute $ \varphi(n)$ easily if we know it’s prime factorization. Therein lies the problem and the solution: factorizing large numbers is hard. Indeed, it is an unsolved problem in computer science as to whether integers can be factored by a polynomial-time algorithm. Quickly finding arbitrary roots mod $ n$ is a similarly hard problem.

To impress the difficulty of integer factorization, we visit its world record. In 2009, a team of researchers successfully factored a 678-bit (232-digit) integer, and it required a network of hundreds of computers and two years to do. The algorithms were quite sophisticated and at some times fickle, failing when one node in the network went down. On the other hand, our $ p,q$ will each be 2048-bit numbers, and so their product is astronomical in comparison. In fact, even 1024-bit numbers are thousands of times harder to factor than 678-bit numbers, meaning that with the hugest of networks, it would take far longer than our lifespans just to factor a “weak” RSA modulus with the methods known today. In this respect, and for the foreseeable future, RSA is watertight.

Since we constructed $ n$ as the product of two primes, we know

$ \varphi(n) = \varphi(p)\varphi(q) = (p-1)(q-1)$,

so we can compute $ \varphi(n)$ trivially. Then, if we pick any $ e < \varphi(n)$ which is relatively prime to $ \varphi(n)$ (for instance, $ e$ itself could be prime), then we may compute the public key $ d$ via the extended Euclidean algorithm.

For a clean-cut worked example of RSA key generation and encryption, see the subsection on Wikipedia. We admit that an example couldn’t be done much better than theirs, and we use the same notation here as the writers do there.

Big Random Primes

There is one remaining problem that requires our attention if we wish to implement an RSA encryption scheme. We have to generate huge primes.

To do so, we note that we don’t actually care what the primes are, only how big they are. Generating large random odd numbers is easy: we can simply randomly generate each of its 2,048 bits, ensuring the smallest bit is a 1. Since we recall that primes are distributed roughly according to $ x / \log(x)$, we see that the chance of getting a prime at random is roughly $ 2 / \log(2^{2048})$, which is about $ 1 / 710$. Thus, on average we can expect to generate 710 random numbers before we get a prime.

Now that we know we’ll probably find a prime number fast, we just have to determine which is prime. There is essentially only one sure-fire primality test: the Sieve of Eratosthenes, in which we simply test all the primes from 2 to the square root of $ n$. If none divide $ n$, then $ n$ is prime.

Unfortunately, this is far too slow, and would require us to generate a list of primes that is unreasonably large (indeed, if we already had that list of primes we wouldn’t need to generate any more!). So we turn to probabilistic tests. In other words, there are many algorithms which determine the likelihood of a candidate being composite (not prime), and then repeat the test until that likelihood is sufficiently close to $ 0$, and hence a certainty. Generally this bound is $ 2^{-100}$, and the existing algorithms achieve it in polynomial time.

Unfortunately an in-depth treatment of one such primality test is beyond the scope of this post. In addition, most contemporary programming languages come equipped with one such primality test, so we put their implementations aside for a later date. To read more about probabilistic primality tests, see the list of them on Wikipedia. They are all based on special cases of Euler’s theorem, and the distribution of multiplicative inverses modulo $ n$.

Implementation

In a wild and unprecedented turn of events, we did not use Mathematica to implement RSA! The reason for this is so that anyone (especially the author’s paranoid father) can run it. So we implemented it in Java. As always, the entire source code (and this time, an executable jar file) is available on this blog’s Github page.

Despite its relative verbosity, Java has a few advantages. The first of these is the author’s familiarity with its GUI (graphical user interface) libraries. The second benefit is that all of the important functions we need are part of the BigInteger class. BigInteger is a built-in Java class that allows us to work with numbers of unbounded size. Recall that in Mathematica unbounded arithmetic is built into the language, but older and more general-purpose languages like Java and C adhere to fixed-length arithmetic. Disregarding the debates over which is better, we notice that BigInteger has the functions:

static BigInteger probablePrime(int bitLength, Random rnd)
BigInteger modPow(BigInteger exponent, BigInteger m)
BigInteger modInverse(BigInteger m)

For clarity, the first function generates numbers which are not prime with probability at most $ 2^{-100}$, the second computes exponents modulo “m”, and the third computes the multiplicative inverse modulo “m”. The “modPow” and “modInverse” functions operate on the context object, or the implicit “this” argument (recall Java is object-oriented [see upcoming primer on object-oriented programming]).

Indeed, this is all we need to write our program! But there are a few more specifics:

First, we need a good random number generator to feed BigInteger’s “probablePrime” function. It turns out that Java’s built-in random number generator is just not secure enough. To remedy this, we could use the “java.security.secureRandom” class, part of Java’s cryptography package; but for the sake of brevity, we instead import an implementation of the Mersenne Twister, a fast prime number generator which is not secure.

Second, there are known factoring methods for $ n=pq$ if $ p \pm 1$ or $ q \pm 1$ has only small prime factors. These are due to Pollard and Williams. So we include a special method called “isDivisibleByLargePrime”, and screen our candidate prime numbers against its negation. The largest prime we test for is 65537, the 6543rd prime. The details of this function are in the source code, which again is available on this blog’s Github page. It is not very interesting.

Third, we notice that the choice of public key is arbitrary. Since everyone is allowed to know it, it shouldn’t matter what we pick. Of course, this is a bit too naive, and it has been proven that if the public key $ e$ is small (say, $ 3$), then the RSA encryption is less secure. After twenty years of attacks trying to break RSA, it has been generally accepted that public keys with moderate bit-length and small Hamming weight (few 1’s in their binary expansion) are secure. The most commonly used public key is 65537, which is the prime $ 2^{16} +1 = \textup{0x10001}$. So in our implementation, we fix the public key at 65537.

Finally, in order to make our String representations of moduli, public keys, and private keys slightly shorter, we use alphadecimal notation (base 36) for inputting and outputting our numbers. This has the advantage that it uses all numerals and characters, thus maximizing expression without getting funky symbols involved.

Results

Here is a snapshot of the resulting Java application:


As you can see, the encrypted messages are quite long, but very copy-pasteable. Also, generating the keys can take up to a minute, so have patience when pressing the “Generate Keys” button under the tab of the same name.

There you have it! Enjoy the applet; it’s for everyone to use, but despite all my due diligence in writing the software, I wouldn’t recommend anyone to rely on it for national security.

Feel free to leave me a comment with a super-secret RSA-encoded message! Here is my encryption modulus and my public key:

public key: 1ekh

encryption modulus:
5a1msbciqctepss5dfpp76rdb5selhybzhzerklbomyvwlohasuy81r9rbd3scvujpqn9e7m5hp52qv4fli9f4bupg23060wmaf94zq94s4j22hwyi764kk0k6w8is05tyg7029ub6ux4vb4bq9xxzw9nkfs0pfteq7wnm6hya7b2l2i1w8jh25913qye67gz8z4xrep1dcj8qpyxyi56ltukqyv366hei4nme6h9x00z16cbf8g76me1keccicsgd268u3iocvb6c5lw00j234270608f24qelu8xfjcddc7n9u0w2tf46bl1yzsjb8kb5ys9gh51l0ryetge1lwh1ontenraq35wv5f4ea57zae1ojcsxp3cxpx84mbg0duoln2vny7uixl3ti4n2flvfats4wz0h1c34cgxdyixb7ylci6t4dk8raqcbdi3yxclktvb7yv1sb61nxk1kylfp0h7wqtrogf8c039mc6bqe8b7eixb72pfz4sajw1rf170ck2vysy1z6bgyngrhyn8kpepd0btcyuyj0scdshlexlg4uolls8p6frxj8dt4ykcnddeuvf7dfz1qyqpjk7ljwr1hdgaecyqa6gsxryodrup1qpwgieot6v8c5bbizxm45qj4nez5cpe9q12m0pyeszic5dtb1yc0rm7lzzddg254uf390rk4irecfxibpv2lnk9zper3kg729w32n7u0qn8mamtmh80fo6hd0n5a50d9kzft1g3bky2t2d2f1a574ukigx1dfgqrtehnmx5nd

Until next time!