This primer exists for the background necessary to read our post on RSA encryption, but it also serves as a general primer to number theory.
Oh, Numbers, Numbers, Numbers
We start with some easy definitions.
Definition: The set of integers, denoted , is the set
.
Definition: Let be integers, then
divides
, denoted
, if there exists an integer
such that
. We also less commonly say
is divisible by
. A composite number has a divisor greater than 1, and hence two strictly smaller divisors.
This definition of “dividing” allows us to bypass the more complicated world of fractions and rational numbers, and keep most of what we do in the integers. The novice reader is encouraged to prove the following propositions about divisibility:
- If
then
for any
.
- If
and
then
.
- If
and
then
.
- Find a counterexample where
but neither
nor
holds.
Definition: A positive integer is prime if it has exactly two distinct divisors:
and
. For the remainder of this post, we will always use
to denote primes.
It follows immediately from the definition that if are positive integers and
is prime, then
if and only if
or
.
We could work toward the fundamental theorem of arithmetic, that every positive integer factors uniquely as a product of primes, but this would lead us down the road of proving Bezout’s theorem, and we want to work quickly toward other things. Instead, we simply prove the existence of such a factoring, and take its uniqueness for granted:
Theorem: Every positive integer factors as a product of primes.
Proof. Let be the set of positive integers which do not have a factoring as a product of primes. In particular,
since 1 is the product of zero primes, and no prime is in
. So
is entirely composed of composite numbers. Take the smallest element
, and since it is composite it may be written as
, where
are both strictly smaller than
. But since each of
are not in
, they factor into products of primes. By combining these factorings, we achieve a factoring of
, a contradiction.
Definition: The greatest common divisor of two numbers , abbreviated gcd and denoted
or less commonly
, is the largest divisor of both
and
. We say two numbers are relatively prime if
.
Note that two relatively prime numbers might not be prime. In fact, , but neither 8 nor 9 are prime. We also sometimes say (though it is probably grammatically incorrect)
is relatively prime to
.
We may prove some interesting facts about greatest common divisors, which we leave as exercises to the ambitious reader:
for any
.
is the smallest positive linear combination of
with integer coefficients.
- (Bezout’s theorem, a corollary to the above two statements) There is a linear combination of
with integer coefficients that equals
.
The last theorem shows up in group theory as a statement about generators of additive integer groups .
Congruences and Modular Arithmetic
Definition: is congruent to
modulo
, denoted
, if
.
This definition has a more familiar form for computer scientists, namely the line of code:
a % n == b % n
In plain language, two numbers are congruent modulo if they have the same remainder when divided by
. Usually, however, we make
, so that
is the remainder of
when divided by
.
It is a cheap fact that the relation is an equivalence relation on integers for any fixed
. In other words:
- If
then
- If
and
then
.
This allows us to partition the integers into their congruence classes. In other words, the fact that this is an equivalence relation allows us to identify a number with its remainder modulo . For the sake of consistency, when stating the set of all congruence classes mod
, we stick to the classes represented by the positive integers
.
There is a vast amount of work on solving systems of congruences. The interested reader should investigate the Chinese Remainder Theorem and Hensel’s Lemma.
Definition: A number is an inverse to a number
modulo
if
.
Inverses help to reduce computations by pairing a number with its inverse modulo . Therefore, it is interesting to know when numbers have inverses, and whether one is unique up to congruence.
Proposition: An inverse for exists if and only if
(
and
are relatively prime).
Proof. Suppose such an inverse exists. Then by the definition of congruence,
, and hence
, so
. In particular, 1 is a linear combination of
and
, and hence it must be the smallest positive linear combination. This is equivalent to
, which is proved by the exercise above.
On the other hand, suppose . We may reverse the computation above to find
. Specifically,
is the coefficient of
in the linear combination of
that makes 1. This proves the theorem
.
It would be great if we could determine how many numbers have inverses mod . In fact, we will, but first we’d like to investigate a few interesting theorems.
Theorem: (Wilson) for any prime
.
Proof. We immediately see that two of these factors are easy: and
. We claim that the product of the remaining numbers is always 1.
Since each number is relatively prime to
(indeed, all numbers are relatively prime to a prime), each number
has an inverse modulo
. As long as this inverse is not
itself, we may pair off each number with its inverse, and see that the entire product is just 1.
To prove that , suppose
. Then
. But since
, we must have two numbers
whose product is divisible by
. But neither of
has
as a factor. So we conclude that either
or
, giving
, a contradiction.
So we may indeed pair the numbers off as we described above, and see that
Here is another fundamental result that uses Wilson’s theorem as a stepping stone:
Theorem: (Fermat’s Little Theorem) If is prime and
, then
.
Proof. The set are the nonzero remainders mod
. If we multiply each by
, we get another complete set of nonzero residues mod
, namely
.
Since both of these sets are all the nonzero residues mod , their products are congruent. In particular,
Since we may factor out the from each term, and there are
terms, we arrive at
. But Wilson’s theorem proved that
is nonzero mod
, and hence has a multiplicative inverse. Multiplying both sides of the equation by its inverse (which is obviously -1), we get
, as desired.
Fermat’s Little Theorem allows us to quickly compute large powers of integers modulo any prime. Specifically, we may break into
By Fermat’s Little Theorem, this is just
, and from here we may compute
. Certainly this is much faster than incrementally multiplying 3 to itself 143 times and every so often dividing by 11. We subtly allude to the usefulness of number theory for computing large exponents, which is an important theme in our post on RSA encryption.
Euler’s Phi Function
Our final topic for this primer is Euler’s phi function (also called the totient function), which counts the number of positive integers which are relatively prime to .
Definition: is the number of positive integers between 1 and
which are relatively prime to
.
Here we have another, more general version of Fermat’s Little Theorem, which uses an almost identical proof:
Theorem: For any positive integer , if
then
.
Proof. Again we use the argument from Fermat’s Little Theorem, except here the sets are not all integers from to
, but only those which are relatively prime. Then we notice that if
are relatively prime to
, then so is their product.
Using the product trick once more, we label the relatively prime integers , and see that
Since the product of all relatively prime integers is again relatively prime, it has a multiplicative inverse mod . While we might not know what it is, it certainly exists, so we may cancel both sides of the congruence to get
. Therefore, we win.
Now we would like to have a good way to compute for a general
. First, we see that for powers of primes this is easy:
Proposition: for any prime
and positive integer
. In particular,
.
Proof. The only numbers which are not relatively prime to are all the multiples of
. Since every
th number is a multiple of
, there are hence
numbers which are not relatively prime to
. Subtracting this from
gives our desired formula.
Finally, we have a theorem that lets us compute for an arbitrary
, from
of its prime power factors.
Proposition: if
and
are relatively prime.
Proof. To each number relatively prime to
, and each number
relatively prime to
, we see that
is relatively prime to
. Supposing to the contrary that some prime
divides
, we see that it must divide one of
but not both, since
. Suppose without loss of generality that
. Then since
, we may see that
, a contradiction.
In other words, we have shown that is a function from the set of pairs of relatively prime numbers of
to the set of relatively prime numbers to
. It suffices to show this map is injective and surjective.
For injectivity, we require that no two distinct are congruent. Supposing we have two distinct
and two
, with
. Then rearranging terms we get
latex m$ divides both
and
under the modulus, we see that
divides
. But since
, we get that
, so by definition
, contradicting our assumption that the
‘s were distinct. We get a similar result for the
‘s, and this proves injectivity.
For surjectivity, let be a positive integer relatively prime to
. Since
, we may write
for some
. This is achieved by finding a linear combination of
equal to 1 (their gcd), and then multiplying through by
. We claim that
. This is true since if some prime divided both of
, then it would also divide
, and also
. But
was assumed to be relatively prime to
, so this cannot happen. Hence,
is relatively prime to
. An identical argument gives that
is relatively prime to
, so the pair
, proving surjectivity.
This, we may compute multiplicatively, by first finding its prime factorization, and then computing
for each prime factor, which is easy.
Unfortunately, finding prime factorizations quickly is very hard. Unless we know the factorization of a large ahead of time (large as in hundreds-of-digits long), computing
is effectively impossible. We cover the implications of this in more detail in our post on RSA encryption.
Until next time!
When you say – “We also sometimes say (though it is probably grammatically incorrect) a is relatively prime to b.” a and b are actually called “Coprime” numbers.
LikeLike
Both are used interchangeably. The possible grammatical incorrectness is that you would say “a is prime relative to b,” but people say “relatively prime to b” more often.
LikeLike
Thanks ! Can you please elaborate on – “The set { 1, 2, ……, p-1 } are the nonzero remainders mod p. If we multiply each by a, we get another complete set of nonzero residues mod p, namely { a, 2a, ……, (p-1)a }.”
How can we prove that the second set {a, 2a , …. (p-1)a} is another complete set of nonzero residues mod p ?
LikeLike
This depends heavily on the fact that
is prime. Indeed, if you missed something, then you would get some product
, both less than
, for which
, which is impossible. Another way to say it is that multiplication by a number less than
is an injective function.
LikeLike
To each number a relatively prime to n, and each number b relatively prime to m, we see that am+bn is relatively prime to mn. Supposing to the contrary that some prime p divides (an+bm, mn), we see that it must divide one of m, n but not both, since (m,n) = 1.Suppose without loss of generality that p \mid n. Then since p \mid an+bm, we may see that p \mid m, a contradiction.
You seemed to confuse the m and n.(In the second and third formulas) It’s am+bn,not an+bm.
LikeLike
For surjectivity, let k be a positive integer relatively prime to nm. Since (m,n) = 1, we may write am+bn = k for some a,b \in \mathbb{Z}. This is achieved by finding a linear combination of m,n equal to 1 (their gcd), and then multiplying through by k. We claim that (a,n) = 1. This is true since if some prime divided both of a,n, then it would also divide am+bn = k, and also nm. But k was assumed to be relatively prime to nm, so this cannot happen. Hence, a is relatively prime to m(!!WRONG.It should be n). An identical argument gives that b is relatively prime to n(!! It should be m), so the pair (a,b) \mapsto am+bn, proving surjectivity.
Thank you for your article.It’s so clear.I have learnt a lot.
LikeLike