So here we are. We have finally made it to a place where we can transition with confidence from the classical continuous Fourier transform to the discrete version, which is the foundation for applications of Fourier analysis to programming. Indeed, we are quite close to unfurling the might of the Fast Fourier Transform algorithm, which efficiently computes the discrete Fourier transform. But because of its focus on algorithmic techniques, we will save it for a main content post and instead focus here on the intuitive connections between the discrete and continuous realms.
The goal has roughly three parts:
- Find a reasonable discrete approximation to a continuous function.
- Find a reasonable discrete approximation to the Fourier transform of a continuous function.
- Find a way to transition between the two discrete representations.
We should also note that there will be some notational clashes in the sequel. Rigorously, item 3 will result in an operation which we will denote by , but will be distinct from the classical Fourier transform on continuous functions. Indeed, they will have algebraic similarities, but one operates on generalized functions, and the other on finite sequences. We will keep the distinction clear from context to avoid adding superfluous notation. Moreover, we will avoid the rigor from the last post on tempered distributions. Instead we simply assume all functions are understood to be distributions and use the classical notation. Again, this is safe because our dabbling with the classical transform will be solely for intuition.
Of course, the point of this entire post is that all of the facts we proved about the continuous Fourier transform have direct translations into the discrete case. Up to a constant factor (sometimes) and up to notational conventions, the two theories will be identical. This is part of the beauty of the subject; sit back and enjoy it.
There is a very nice theorem about classical Fourier transforms that has to do with reconstructing a function from a discrete sample of its points. Since we do not actually need this theorem for anything other than motivation, we will not prove it here. Instead, we’ll introduce the definitions needed to state it, and see why it motivates a good definition for the discrete “approximation” to a function. For a much more thorough treatment of the sampling theorem and the other issues we glaze over in this post, see these lecture notes.
Definition: A function is time limited if it has bounded support. A function is bandlimited if its Fourier transform has bounded support. That is, if there is some so that the Fourier transform of is only nonzero when . We call the bandwidth of .
To be honest, before seeing a mathematical treatment of signal processing, this author had no idea what bandwidth actually referred to. It’s nice to finally solve those life mysteries.
In any case, it often occurs that one has a signal for which one can only measure values, but one doesn’t have access to an exact description of . The sampling theorem allows us to reconstruct exactly by choosing certain sample points. In a simplified form, the theorem is as follows:
Theorem: Suppose is a function of bandwidth . Then one can reconstruct exactly from the set of points as ranges over (that is, the sampling rate is at least ).
Unsurprisingly, the proof uses the Fourier transform in a nontrivial way. Moreover, there is a similar theorem for the Fourier transform , which can be reconstructed exactly from its samples if the sampling rate is at least , where bounds the support of . Note that the sampling rate in one domain is determined by the limitations on the other domain.
What’s more, if we are daring enough to claim is both time limited and bandlimited (and we sample as little as possible in each domain), then the number of sample points is the same for both domains. In particular, suppose is only nonzero on and its Fourier transform on . Note that since is both timelimited and bandlimited, there is no fault in shifting both so their smallest value is at the origin. Then, if are the respective numbers of sampled points in the time and frequency domains, then , and so .
Now this gives us a good reason to define the discrete approximation to a signal as
where . This accomplishes our first task. Now, in order to determine what the discrete Fourier transform should look like, we can represent this discrete approximation as a distribution using shifted deltas:
Here the deltas ensure the function is zero at points not among the , capturing the essence of the finite sequence above. Taking the Fourier transform of this (recalling that the Fourier transform of the shifted delta is a complex exponential and using linearity), we get
Denote the above function by . Now is unfortunately still continuous, so we take the same approach we did for and sample it via the samples in the frequency domain at . This gives the following list of values for the discrete approximation to the transformed version of :
And now the rest falls together. We can denote by the tuple of values , and the list of formulas above gives a way to transition from one domain to the other.
A Discrete Life
Now we move completely away from the continuous realm. The above discussion was nice in giving us an intuition for why the following definitions are reasonable. However, in order to actually use these ideas on discrete sequences, we can’t be restricted to sampling continuous functions. In fact, most of our planned work on this blog will not use discrete approximations to continuous functions, but just plain old sequences. So we just need to redefine the above ideas in a purely discrete realm.
The first step is to get rid of any idea of the sampling rate, and instead write everything in terms of the indices. This boils down to the a convenient coincidence. If , then by our earlier remarks that (the number of sample points taken), then . This allows us to rewrite the complex exponentials as , which is entirely in terms of the number of points used and the indices of the summation.
In other words, we can finally boil all of this discussion down to the following definition. Before we get to it, we should note that we use the square bracket notation for lists. That is, if is a list, then is the -th element of that list. Usually one would use subscripts to denote this, but we’d like to stay close to our notation for the continuous case, at least to make the parallels between the two theories more obvious.
Definition: Let be a vector in . Then the discrete Fourier transform of is defined by the vector , where
The phrase “discrete Fourier transform” is often abbreviated to DFT.
Now although we want to showcase the connections between the discrete and continuous Fourier transforms, we should note that they are completely disjoint. If one so desired (and many books follow this route) one could start with the discrete Fourier transform and later introduce the continuous version. We have the advantage of brevity here, in that we know what the theorems should say before we actually prove them. The point is that while the two transforms have connections and similarities, their theory is largely independent, and moreover the definition of the Fourier transform can be given without any preface. Much as we claimed with the continuous Fourier transform, the discrete Fourier transform is simply an operation on lists of data (i.e., on vectors).
Pushing the discrete to the extreme, we can represent the list of complex exponentials as a discrete signal too.
Definition: The discrete complex exponential of period is the list
We will omit the subscript when it is understood (at least, for the rest of this primer). And hence and moreover . If we denote by the vector of entry-wise powers of , then we can write the discrete Fourier transform in its most succinct form as:
or, recognizing everything without indexing in “operator notation,”
There are other ways to represent the discrete Fourier transform as an operation. In fact, since we know the classical Fourier transform is a linear operator, we should be able to come up with a matrix representation of the discrete transform. We will do just this, and as a result we will find the inverse discrete Fourier transform, but first we should look at some examples.
The Transforms of Discrete Signals
Perhaps the simplest possible signal one could think of is the delta function, whose discretized form is
with the corresponding shifted form
where the 1 appears in the -th spot. The reader should be convinced that this is indeed the right definition because it’s continuous cousin’s defining properties translate nicely. Specifically, for all , and .
Now to find its Fourier transform, we simply use the definition:
The only time that is nonzero is for , so this is simply the vector , which is the vector consisting entirely of 1′s. Hence, as in the continuous case (or at least, for the continuous definition of the 1 distribution),
In an identical manner, one can prove that , just as it was in for the continuous transform.
Now let’s do an example which deviates slightly from the continuous case. Recall that the continuous Fourier transform of the complex exponential was a delta function (to convince the reader, simply see that a single complex exponential can only have a single angular variable, and hence a singular frequency). In the discrete case, we get something similar.
so looking at the -th entry of this vector, we get
but because , we see that the sum is just the usual complex inner product . Further, as the differing powers of are orthogonal (hint: their complex inner product forms a geometric series), we see that they’re only nonzero when . In this case, it is easy to show that the inner product is precisely . Summarizing,
This is naturally , so our final statement is just . Note that here we have the extra factor of floating around. The reason for this boils down to the fact that the norm of the complex exponential is , and not 1. That is, the powers of do not form an orthonormal basis of . There are various alternative definitions that try to compensate for this, but to the best of this author’s knowledge a factor of always shows up in some of the resulting formulas. In other words, we should just accept this deviation from the continuous theory as collateral damage.
The mischievous presence of also shows up in an interesting way in the inverse discrete Fourier transform.
The DFT and its Inverse, as a Matrix
It’s a trivial exercise to check by hand that the discrete Fourier transform is a linear operation on vectors. i.e., for all vectors and all . As we know from our primer on linear algebra, all such mappings are expressible as matrix multiplication by a fixed matrix.
The “compact” form we represented the discrete Fourier transform above sheds light onto this question. Specifically, the formula
Is basically just the definition of a matrix multiplication. Viewing as the column vector
It is easy to see that the correct matrix to act on this vector is
A perhaps easier way to digest this matrix is by viewing each row of the matrix as the vector .
Now let’s just admire this matrix for a moment. What started many primers ago as a calculus computation requiring infinite integrals and sometimes divergent answers is now trivial to compute. This is our first peek into how to compute discrete Fourier transforms in a program, but unfortunately we are well aware of the fact that computing this naively requires computations. Our future post on the Fast Fourier Transform algorithm will take advantage of the structure of this matrix to improve this by an order or magnitude to .
But for now, let’s find the inverse Fourier transform as a matrix. From the first of the two matrices above, it is easy to see that the matrix for is symmetric. Indeed, the roles of are identical in . Furthermore, looking at the second matrix, we see that is almost unitary. Recall that a matrix is unitary if , where is the identity matrix and is the conjugate transpose of . For the case of , we have that . We encourage the reader to work this out by hand, noting that each entry in the matrix resulting from is an inner product of powers of .
In other words, we have , so that . Since is symmetric, this simplifies to .
Expanding this out to a summation, we get what we might have guessed was the inverse transform:
The only difference between this formula and our intuition is the factor of .
The last thing we’d like to mention in this primer is that the wonderful results on duality for the continuous Fourier transform translate to the discrete (again, with a factor of ). While we leave the details as an exercise to the reader,
In order to do this, we need some additional notation. We can think of as an infinite sequence which would repeat itself every entries (by Euler’s identity). Then we can “index” higher than , and get the identity . Similarly, we could “periodize” a discrete signal so that is defined by and we allow .
This periodization allows us to define naturally as . Then it becomes straightforward to check that , and as a corollary . This recovers some of our most powerful tools from the classical case for computing Fourier transforms of specific functions.
Next time (and this has been a long time coming) we’ll finally get to the computational meat of Fourier analysis. We’ll derive and implement the Fast Fourier Transform algorithm, which computes the discrete Fourier transform efficiently. Then we’ll take our first foray into playing with real signals, and move on to higher-dimensional transforms for image analysis.