Convolution theorem

from Wikipedia
In mathematics, the convolution theorem states that under suitable conditions the Fourier transform of a convolution of two functions (or signals) is the product of their Fourier transforms. More generally, convolution in one domain (e.g., time domain) equals point-wise multiplication in the other domain (e.g., frequency domain). Other versions of the convolution theorem are applicable to various Fourier-related transforms.

Functions of a continuous variable

Consider two functions $g(x)$ and $h(x)$ with Fourier transforms $G(f)$ and $H(f)$:

G(f) \triangleq \mathcal{F}\{g\}(f) = \int_{-\infty}^{\infty} g(x) e^{-i 2\pi f x} \, dx
H(f) \triangleq \mathcal{F}\{h\}(f) = \int_{-\infty}^{\infty} h(x) e^{-i 2\pi f x} \, dx

where $\mathcal{F}$ denotes the Fourier transform operator. The transform may be normalized in other ways, in which case constant scaling factors (typically $2\pi$ or $\sqrt{2\pi}$) will appear in the convolution theorem below. The convolution of $g$ and $h$ is defined by:

r(x) = (g * h)(x) \triangleq \int_{-\infty}^{\infty} g(\tau)\, h(x - \tau) \, d\tau

In this context the asterisk denotes convolution, instead of standard multiplication. The tensor product symbol $\otimes$ is sometimes used instead.

The convolution theorem states that:[1][2]: eq.8

R(f) \triangleq \mathcal{F}\{g * h\}(f) = G(f)\, H(f) \quad \text{(Eq.1)}

Applying the inverse Fourier transform $\mathcal{F}^{-1}$ produces the corollary:[2]: eqs.7, 10

(g * h)(x) = \mathcal{F}^{-1}\{G \cdot H\}(x)

The theorem also generally applies to multi-dimensional functions.

This theorem also holds for the Laplace transform, the two-sided Laplace transform and, when suitably modified, for the Mellin transform and Hartley transform (see Mellin inversion theorem). It can be extended to the Fourier transform of abstract harmonic analysis defined over locally compact abelian groups.

Periodic convolution (Fourier series coefficients)

Consider $P$-periodic functions $g_P(x)$ and $h_P(x)$, which can be expressed as periodic summations:

g_P(x) \triangleq \sum_{m=-\infty}^{\infty} g(x - mP) \quad \text{and} \quad h_P(x) \triangleq \sum_{m=-\infty}^{\infty} h(x - mP)

In practice the non-zero portion of components $g$ and $h$ is often limited to duration $P$, but nothing in the theorem requires that.

The Fourier series coefficients are:

G[k] \triangleq \mathcal{F}\{g_P\}[k] = \frac{1}{P} \int_P g_P(x)\, e^{-i 2\pi k x / P} \, dx
H[k] \triangleq \mathcal{F}\{h_P\}[k] = \frac{1}{P} \int_P h_P(x)\, e^{-i 2\pi k x / P} \, dx

where $\mathcal{F}$ denotes the Fourier series integral.

  • The product $g_P(x) \cdot h_P(x)$ is also $P$-periodic, and its Fourier series coefficients are given by the discrete convolution of the $G$ and $H$ sequences: $\mathcal{F}\{g_P \cdot h_P\}[k] = (G * H)[k]$.
  • The convolution:

(g_P * h)(x) \triangleq \int_{-\infty}^{\infty} g_P(\tau)\, h(x - \tau) \, d\tau = \int_P g_P(\tau)\, h_P(x - \tau) \, d\tau

is also $P$-periodic, and is called a periodic convolution.

The corresponding convolution theorem is:

\mathcal{F}\{g_P * h\}[k] = P \cdot G[k]\, H[k]

Functions of a discrete variable (sequences)

By a derivation similar to Eq.1, there is an analogous theorem for sequences, such as samples of two continuous functions, where now $\mathcal{F}$ denotes the discrete-time Fourier transform (DTFT) operator. Consider two sequences $g[n]$ and $h[n]$ with transforms $G(f)$ and $H(f)$:

G(f) \triangleq \mathcal{F}\{g\}(f) = \sum_{n=-\infty}^{\infty} g[n]\, e^{-i 2\pi f n}
H(f) \triangleq \mathcal{F}\{h\}(f) = \sum_{n=-\infty}^{\infty} h[n]\, e^{-i 2\pi f n}

The § Discrete convolution of $g$ and $h$ is defined by:

(g * h)[n] \triangleq \sum_{m=-\infty}^{\infty} g[m]\, h[n - m]

The convolution theorem for discrete sequences is:[3][4]: p.60 (2.169)

\mathcal{F}\{g * h\}(f) = G(f)\, H(f)

Periodic convolution

$G(f)$ and $H(f)$, as defined above, are periodic, with a period of 1. Consider $N$-periodic sequences $g_N$ and $h_N$:

g_N[n] \triangleq \sum_{m=-\infty}^{\infty} g[n - mN] \quad \text{and} \quad h_N[n] \triangleq \sum_{m=-\infty}^{\infty} h[n - mN]

These functions occur as the result of sampling $G$ and $H$ at intervals of $1/N$ and performing an inverse discrete Fourier transform (DFT) on $N$ samples (see § Sampling the DTFT). The discrete convolution:

(g_N * h)[n] \triangleq \sum_{m=-\infty}^{\infty} g_N[m]\, h[n - m] = \sum_{m=0}^{N-1} g_N[m]\, h_N[n - m]

is also $N$-periodic, and is called a periodic convolution. Redefining the $\mathcal{F}$ operator as the $N$-length DFT, the corresponding theorem is:[5][4]: p. 548

\mathcal{F}\{g_N * h\}[k] = \mathcal{F}\{g_N\}[k] \cdot \mathcal{F}\{h_N\}[k]

And therefore:

(g_N * h)[n] = \mathcal{F}^{-1}\{\mathcal{F}\{g_N\} \cdot \mathcal{F}\{h_N\}\}[n]

Under the right conditions, it is possible for this $N$-length sequence to contain a distortion-free segment of a $g * h$ convolution. But when the non-zero portion of the $g$ or $h$ sequence is equal or longer than $N$, some distortion is inevitable. Such is the case when the $h_N$ sequence is obtained by directly sampling the DTFT of the infinitely long § Discrete Hilbert transform impulse response.[A]

For $g$ and $h$ sequences whose non-zero duration is less than or equal to $N$, a final simplification is:

Circular convolution:

(g_N * h)[n] = \mathcal{F}^{-1}\{\mathcal{F}\{g\} \cdot \mathcal{F}\{h\}\}[n]

This form is often used to efficiently implement numerical convolution by computer (see § Fast convolution algorithms and § Example).

As a partial reciprocal, it has been shown[6] that any linear transform that turns convolution into a product is the DFT (up to a permutation of coefficients).
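The fast-convolution idea above can be sketched in a few lines of NumPy (an illustrative choice, not part of the article): zero-pad both sequences to the full output length so that the circular convolution implied by the DFT coincides with the linear one.

```python
import numpy as np

def fast_convolve(g, h):
    """Linear convolution via the DFT: pad to len(g)+len(h)-1 so the
    circular convolution computed by the DFT equals the linear one."""
    n = len(g) + len(h) - 1
    G = np.fft.fft(g, n)                 # DFT of zero-padded g
    H = np.fft.fft(h, n)                 # DFT of zero-padded h
    return np.real(np.fft.ifft(G * H))   # inverse DFT of the product

g = [1.0, 2.0, 3.0]
h = [0.5, -1.0]
print(fast_convolve(g, h))               # matches np.convolve(g, h)
```

For long sequences this route costs $O(N \log N)$ instead of the $O(N^2)$ of the direct sum.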

Convolution theorem for inverse Fourier transform

There is also a convolution theorem for the inverse Fourier transform:

\mathcal{F}\{f * g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}
\mathcal{F}\{f \cdot g\} = \mathcal{F}\{f\} * \mathcal{F}\{g\}

Here, "$\cdot$" represents the Hadamard product, and "$*$" represents a convolution between the two matrices.

so that

f * g = \mathcal{F}^{-1}\{\mathcal{F}\{f\} \cdot \mathcal{F}\{g\}\}
f \cdot g = \mathcal{F}^{-1}\{\mathcal{F}\{f\} * \mathcal{F}\{g\}\}

Convolution theorem for tempered distributions

The convolution theorem extends to tempered distributions. Here, $u$ is an arbitrary tempered distribution:

\mathcal{F}\{u * v\} = \mathcal{F}\{u\} \cdot \mathcal{F}\{v\}
\mathcal{F}\{\alpha \cdot u\} = \mathcal{F}\{\alpha\} * \mathcal{F}\{u\}

But $v$ must be "rapidly decreasing" towards $+\infty$ and $-\infty$ in order to guarantee the existence of both, convolution and multiplication product. Equivalently, if $\alpha = \mathcal{F}\{v\}$ is a smooth "slowly growing" ordinary function, it guarantees the existence of both, multiplication and convolution product.[7][8][9]

In particular, every compactly supported tempered distribution, such as the Dirac delta, is "rapidly decreasing". Equivalently, bandlimited functions, such as the function that is constantly $1$, are smooth "slowly growing" ordinary functions. If, for example, $v$ is the Dirac comb, both equations yield the Poisson summation formula, and if, furthermore, $u$ is the Dirac delta, then $\alpha$ is constantly one and these equations yield the Dirac comb identity.
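The Poisson summation instance mentioned above can be checked numerically. A minimal sketch, assuming a dilated Gaussian $f(t) = e^{-\pi (t/a)^2}$ whose transform under the $e^{-i 2\pi f t}$ convention is $a\, e^{-\pi (a f)^2}$ (the parameter $a$ is an arbitrary illustrative choice):

```python
import math

a = 2.0
# Left side of Poisson summation: sum of f over the integers.
lhs = sum(math.exp(-math.pi * (n / a) ** 2) for n in range(-50, 51))
# Right side: sum of the Fourier transform of f over the integers.
rhs = sum(a * math.exp(-math.pi * (a * k) ** 2) for k in range(-50, 51))
print(lhs, rhs)  # the two sums agree, as the formula requires
```

The truncation at $\pm 50$ is harmless here because both Gaussians decay far below machine precision well before that.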

from Grokipedia
The convolution theorem is a fundamental principle in Fourier analysis that states the Fourier transform of the convolution of two functions equals the pointwise product of their individual Fourier transforms, and conversely, the inverse Fourier transform of the product of two Fourier transforms equals the convolution of the original functions.[1] This relationship, often expressed mathematically as $\mathcal{F}\{f * g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}$ where $(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) \, d\tau$, enables efficient computation of convolutions in the frequency domain, particularly using fast Fourier transform algorithms.[2] The theorem holds for functions in various domains, including time and spatial signals, and convolution is commutative, meaning $f * g = g * f$.[2] It extends to multidimensional cases and discrete signals, where the discrete Fourier transform (DFT) replaces the continuous version, facilitating numerical implementations.[3] Originally derived in the context of integral transforms, the theorem's proof relies on the linearity and properties of the Fourier transform, such as the transform of a shifted function.[1] In applications, the convolution theorem is pivotal in signal processing for tasks like filtering, where convolving a signal with an impulse response in the time domain corresponds to multiplying their spectra in the frequency domain, simplifying deconvolution and system analysis.[2] It also underpins image processing techniques, such as blurring or template matching, by converting computationally intensive spatial convolutions into efficient frequency-domain multiplications.[3] More broadly, in machine learning, the theorem supports convolutional neural networks (CNNs) by enabling fast computation of large-kernel convolutions via the fast Fourier transform (FFT), reducing complexity from $O(N^2 M^2)$ to $O(N^2 \log N)$ for kernel size $M$ and input size $N$.[3] These uses highlight the theorem's role in transforming complex integral operations into algebraic ones, advancing fields from physics to computer vision.[1]

Core statements

Continuous aperiodic case

In the continuous aperiodic case, the convolution of two functions $f, g \in L^1(\mathbb{R})$ (the space of absolutely integrable functions on the real line) is defined as

(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) \, d\tau.

This operation measures the overlap between $f$ and a reflected, shifted version of $g$, and it extends to $L^2(\mathbb{R})$ (square-integrable functions) via density arguments and Plancherel's theorem.[4][5]
The convolution theorem states that the Fourier transform of the convolution equals the pointwise product of the individual Fourier transforms:

\mathcal{F}\{f * g\}(\omega) = \mathcal{F}\{f\}(\omega) \cdot \mathcal{F}\{g\}(\omega),

where the Fourier transform is defined as

\mathcal{F}\{f\}(\omega) = \int_{-\infty}^{\infty} f(t) e^{-2\pi i \omega t} \, dt.

This holds for functions in $L^1(\mathbb{R})$ by direct computation using Fubini's theorem for interchanging integrals, and for $L^2(\mathbb{R})$ by continuity of the Fourier transform as a unitary operator on that space.[4][1]
The inverse form of the theorem asserts the duality: the inverse Fourier transform of the product of the transforms yields the convolution,
fg=F1{F{f}F{g}}. f * g = \mathcal{F}^{-1} \left\{ \mathcal{F}\{f\} \cdot \mathcal{F}\{g\} \right\}.
Here, the inverse transform is
F1{f^}(t)=f^(ω)e2πiωtdω, \mathcal{F}^{-1}\{\hat{f}\}(t) = \int_{-\infty}^{\infty} \hat{f}(\omega) e^{2\pi i \omega t} \, d\omega,
ensuring the operation is reversible under suitable integrability conditions, such as when F{f}\mathcal{F}\{f\} and F{g}\mathcal{F}\{g\} are in L1(R)L^1(\mathbb{R}). This duality simplifies computations in signal processing and partial differential equations by converting integration in the time domain to multiplication in the frequency domain.[4][1]
A illustrative example is the convolution of two Gaussian functions, f(t)=eπt2f(t) = e^{-\pi t^2} and g(t)=eπt2g(t) = e^{-\pi t^2}, each of whose Fourier transform is itself, F{f}(ω)=eπω2\mathcal{F}\{f\}(\omega) = e^{-\pi \omega^2} and similarly for gg. By the theorem, F{fg}(ω)=e2πω2\mathcal{F}\{f * g\}(\omega) = e^{-2\pi \omega^2}, so the inverse transform yields fg(t)=12eπt22f * g (t) = \frac{1}{\sqrt{2}} e^{-\frac{\pi t^2}{2}}, another Gaussian with variance scaled by the sum of the originals. More generally, for f(t)=eat2f(t) = e^{-a t^2} and g(t)=ebt2g(t) = e^{-b t^2} with a,b>0a, b > 0, the result is (fg)(t)=πa+beabt2a+b(f * g)(t) = \sqrt{\frac{\pi}{a+b}} e^{-\frac{a b t^2}{a+b}}, demonstrating how the product of exponential transforms produces a convolved Gaussian.[2][4] The convolution theorem for the Fourier transform was first linked to the transform by Émile Borel in 1899 through work on divergent series and was fully established by Percy Daniell in 1920 via generalizations of Stieltjes-Volterra products.[6]
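The Gaussian example admits a quick numerical sanity check. A minimal sketch using NumPy (an illustrative tool, not part of the source): a Riemann-sum convolution of $e^{-\pi t^2}$ with itself should reproduce $\frac{1}{\sqrt{2}} e^{-\pi t^2/2}$.

```python
import numpy as np

dt = 0.01
t = np.arange(-10, 10 + dt / 2, dt)       # symmetric grid, odd length
f = np.exp(-np.pi * t**2)

# Riemann-sum approximation of the continuous convolution (f * f)(t):
# the discrete sum times dt approximates the integral over tau.
conv = np.convolve(f, f, mode="same") * dt

expected = np.exp(-np.pi * t**2 / 2) / np.sqrt(2)
print(np.max(np.abs(conv - expected)))    # tiny: the curves coincide
```

Because the integrand is smooth and rapidly decaying, the Riemann sum is accurate far beyond what the plot-level comparison requires.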

Discrete aperiodic case

In the discrete aperiodic case, the convolution of two sequences $f$ and $g$, defined over the integers $\mathbb{Z}$, is given by

(f * g)[n] = \sum_{k=-\infty}^{\infty} f[k] g[n - k]

for all $n \in \mathbb{Z}$, where the sequences belong to the space $\ell^1(\mathbb{Z})$ (absolutely summable) or $\ell^2(\mathbb{Z})$ (square-summable).[7] This operation models the output of a linear time-invariant system in discrete-time signal processing when one sequence acts as the input and the other as the impulse response.[8] The discrete-time Fourier transform (DTFT) provides the frequency-domain representation for such aperiodic sequences, defined as

\mathcal{F}\{f\}(\omega) = F(\omega) = \sum_{n=-\infty}^{\infty} f[n] e^{-j \omega n},

where $\omega \in [-\pi, \pi]$ is the normalized angular frequency.[9] Similarly for $g$, yielding $G(\omega)$. The convolution theorem states that the DTFT of the convolved sequence is the pointwise product of the individual DTFTs:

\mathcal{F}\{f * g\}(\omega) = F(\omega) \cdot G(\omega).

[7][8] This property simplifies computations by transforming time-domain convolution into frequency-domain multiplication, which is particularly efficient for long sequences in digital signal processing applications. The inverse relation follows from the inverse DTFT, which reconstructs the time-domain sequence via

f * g = \mathcal{F}^{-1}\{F \cdot G\},

where the inverse transform is

(f * g)[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(\omega) G(\omega) e^{j \omega n} \, d\omega.

[9] This bidirectional equivalence holds under the specified sequence spaces, enabling seamless switching between domains. For convergence, the DTFT of sequences in $\ell^1(\mathbb{Z})$ is guaranteed to exist and be continuous for all $\omega$, as the absolute summability ensures the infinite sum converges uniformly.[10] For $\ell^2(\mathbb{Z})$ sequences, the DTFT exists in the mean-square sense but may exhibit discontinuities, though the convolution theorem still applies when both inputs are in these spaces.[10]

A representative example illustrates the theorem: consider the convolution of two identical rectangular sequences, each of length $M+1$, defined as $x[n] = 1$ for $0 \leq n \leq M$ and $0$ otherwise. The time-domain convolution yields a triangular sequence of length $2M+1$, peaking at $M+1$ and linearly tapering to the sides. In the frequency domain, the DTFT of each rectangular sequence is the Dirichlet kernel $X(\omega) = \frac{\sin(\omega (M+1)/2)}{\sin(\omega /2)} e^{-j \omega M /2}$, so the DTFT of the convolution is $X(\omega)^2$, demonstrating how multiplication produces the frequency response of the triangular output.[11][7]
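The rectangle-to-triangle example can be verified directly. A minimal sketch (NumPy is an illustrative choice): convolve the rectangle with itself, then check at a handful of frequencies that the DTFT of the result equals the square of the rectangle's DTFT.

```python
import numpy as np

M = 4
x = np.ones(M + 1)                  # rectangular sequence of length M+1

tri = np.convolve(x, x)             # triangular sequence of length 2M+1
print(tri)                          # peaks at M+1 = 5 in the middle

# Convolution theorem check at sample frequencies: DTFT(x*x) == DTFT(x)^2.
w = np.linspace(-np.pi, np.pi, 7)
n1, n2 = np.arange(len(x)), np.arange(len(tri))
X = (x * np.exp(-1j * np.outer(w, n1))).sum(axis=1)
T = (tri * np.exp(-1j * np.outer(w, n2))).sum(axis=1)
print(np.allclose(T, X**2))         # True
```

Both sums are finite, so the identity holds exactly up to floating-point rounding.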

Periodic variants

Continuous periodic functions

For $P$-periodic continuous functions $f$ and $g$, the periodic convolution is defined by

(f \ast g)(t) = \frac{1}{P} \int_{0}^{P} f(\tau) g(t - \tau) \, d\tau.

This operation produces another $P$-periodic function, capturing the "overlap" of $f$ and $g$ over one period while ensuring consistency with Fourier analysis on the circle. The normalization factor $1/P$ aligns the integral with the measure of the fundamental domain $[0, P]$, preventing scaling issues that would arise in the frequency domain.[12] The Fourier series of a $P$-periodic function $h$ is expressed as $h(t) = \sum_{n=-\infty}^{\infty} c_n(h) e^{2\pi i n t / P}$, where the complex coefficients are

c_n(h) = \frac{1}{P} \int_{0}^{P} h(t) e^{-2\pi i n t / P} \, dt.

The convolution theorem adapts to this setting by stating that the Fourier coefficients of the periodic convolution satisfy

c_n(f \ast g) = c_n(f) \, c_n(g)

for every integer $n$. This pointwise multiplication in the frequency domain simplifies computations for periodic signals, such as filtering or system responses in engineering contexts. The theorem holds under mild conditions, such as square-integrability over the period, ensuring the integrals converge.[12][13]

This periodic formulation relates to the aperiodic Fourier transform through the Dirac comb, or Shah function, $Ш_P(t) = \sum_{k=-\infty}^{\infty} \delta(t - kP)$. A $P$-periodic function can be viewed as the convolution of an aperiodic function (supported on $[0, P]$) with $Ш_P$, yielding the infinite periodic extension. The Fourier transform of $Ш_P$ is $(2\pi / P)\, Ш_{2\pi / P}(\omega)$, another comb, so the transform of the periodic function becomes the aperiodic transform modulated by this comb. Applying the convolution theorem in the transform domain then recovers the coefficient multiplication after sampling at discrete frequencies $2\pi n / P$.[14]

A concrete example illustrates the theorem: the periodic convolution of two $P$-periodic square waves, each alternating between $+1/2$ and $-1/2$ with period $P$ (a standard odd square wave), produces a $P$-periodic triangular wave ranging from $-1/4$ to $+1/4$. The square wave has Fourier coefficients $c_n = -i/(n\pi)$ for odd $n$ and $0$ otherwise, decaying as $1/|n|$; their product yields $c_n(f \ast g) \propto 1/n^2$ for odd $n$, matching the known series for the triangular wave and highlighting the smoothing effect of convolution.[15][16] The $1/P$ factor in the convolution definition is essential for the theorem's clean form, as it mirrors the normalization in the coefficient integral. Omitting it would introduce an extraneous factor of $P$ in the product, giving $c_n(f \ast g) = P \, c_n(f) \, c_n(g)$, complicating reconstructions and applications like signal processing. This choice ensures the periodic case parallels the aperiodic transform limit as $P \to \infty$.[12]
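The coefficient identity $c_n(f \ast g) = c_n(f)\, c_n(g)$ has an exact discrete counterpart that is easy to verify: on a grid of $N$ samples per period, Fourier coefficients become FFT values divided by $N$, and the $(1/N)$-normalized circular convolution multiplies them. A minimal sketch (NumPy as an illustrative tool, random data standing in for the periodic functions):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16
f = rng.standard_normal(N)        # one period of f, sampled N times
g = rng.standard_normal(N)

# Normalized periodic convolution: (1/P) * integral over one period
# becomes (1/N) * circular sum on the sample grid.
pc = np.array([sum(f[k] * g[(n - k) % N] for k in range(N))
               for n in range(N)]) / N

# Discrete Fourier coefficients: c_n(h) corresponds to FFT(h)/N.
cf, cg, cpc = np.fft.fft(f) / N, np.fft.fft(g) / N, np.fft.fft(pc) / N
print(np.allclose(cpc, cf * cg))  # True: coefficients multiply
```

In the sampled setting the identity is exact, not merely an approximation, which mirrors the role of the $1/P$ normalization discussed above.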

Discrete circular convolution

In the discrete circular convolution, two finite-length sequences $f[n]$ and $g[n]$, each of length $N$, are convolved using periodic extension with wrap-around, defined as

(f \circledast g)[n] = \sum_{k=0}^{N-1} f[k] \, g[(n - k) \bmod N], \quad 0 \leq n < N.

This operation treats the sequences as periodic with period $N$, incorporating aliasing effects from the modulo arithmetic that differ from linear convolution.[17] The convolution theorem for the discrete Fourier transform (DFT) states that the DFT of the circular convolution equals the pointwise product of the individual DFTs:

\text{DFT}(f \circledast g)[k] = \text{DFT}(f)[k] \cdot \text{DFT}(g)[k], \quad 0 \leq k < N,

where the DFT is given by

X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-2\pi i k n / N}.

This equivalence holds because the DFT basis functions are eigenfunctions of the circulant matrix representing circular convolution.[17][18] Conversely, the circular convolution can be recovered via the inverse DFT (IDFT):

f \circledast g = \text{IDFT} \bigl\{ \text{DFT}(f) \cdot \text{DFT}(g) \bigr\},

with the IDFT defined as

x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] \, e^{2\pi i k n / N}.

This relation enables efficient computation using fast Fourier transform (FFT) algorithms, reducing complexity from $O(N^2)$ for direct summation to $O(N \log N)$.[17][18]

A representative example involves the circular convolution of two 4-point finite impulse response sequences: $h[n] = \{-2, -1, 3, 1\}$ and $x[n] = \{-1, 0, 2, 1\}$. Their DFTs are $H[k] = \{1, -5 + 2i, 1, -5 - 2i\}$ and $X[k] = \{2, -3 + i, 0, -3 - i\}$, respectively. The product is $Y[k] = H[k] X[k] = \{2, 13 - 11i, 0, 13 + 11i\}$, and the IDFT yields $y[n] = \{7, 6, -6, -5\}$, matching direct circular summation. This demonstrates the utility of FFT-based filtering for periodic or block-processed signals.[19] To approximate linear convolution without aliasing using circular convolution, sequences are zero-padded to length at least $N_1 + N_2 - 1$, where $N_1$ and $N_2$ are the original lengths, ensuring the wrap-around does not overlap valid terms.[17][20]
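The 4-point worked example above can be reproduced in a few lines (NumPy as an illustrative tool): compute the circular convolution once by the defining sum and once through the DFT route, and confirm they match.

```python
import numpy as np

h = np.array([-2., -1., 3., 1.])
x = np.array([-1., 0., 2., 1.])
N = len(h)

# Direct circular convolution by the defining sum with wrap-around.
y_direct = np.array([sum(h[k] * x[(n - k) % N] for k in range(N))
                     for n in range(N)])

# Same result via the DFT convolution theorem: IDFT(DFT(h) * DFT(x)).
y_fft = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)))

print(y_direct)                      # [ 7.  6. -6. -5.]
print(np.allclose(y_direct, y_fft))  # True
```

The printed values match the $y[n] = \{7, 6, -6, -5\}$ quoted in the example.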

Proofs

Continuous case derivation

The derivation for the continuous aperiodic case assumes that $f, g \in L^1(\mathbb{R})$, the space of Lebesgue integrable functions over the real line, which guarantees the absolute convergence of the relevant integrals.[21] Under this assumption, the convolution $f * g$ is well-defined and also belongs to $L^1(\mathbb{R})$, with $\|f * g\|_1 \leq \|f\|_1 \|g\|_1$.[21] Consider the Fourier transform of the convolution, defined as

\mathcal{F}\{f * g\}(\omega) = \int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(\tau) g(t - \tau) \, d\tau \right) e^{-2\pi i \omega t} \, dt.

[21] The integrand satisfies $|f(\tau) g(t - \tau) e^{-2\pi i \omega t}| = |f(\tau) g(t - \tau)|$, and its double integral over $\mathbb{R}^2$ equals $\|f\|_1 \|g\|_1 < \infty$, so Fubini's theorem permits interchanging the order of integration:[21]

\mathcal{F}\{f * g\}(\omega) = \int_{\mathbb{R}} f(\tau) \left( \int_{\mathbb{R}} g(t - \tau) e^{-2\pi i \omega t} \, dt \right) d\tau.

[21] For the inner integral, perform the change of variables $s = t - \tau$ (so $dt = ds$ and $t = s + \tau$), yielding

\int_{\mathbb{R}} g(s) e^{-2\pi i \omega (s + \tau)} \, ds = e^{-2\pi i \omega \tau} \int_{\mathbb{R}} g(s) e^{-2\pi i \omega s} \, ds = \mathcal{F}\{g\}(\omega) \, e^{-2\pi i \omega \tau}.

[21] Substituting back gives

\mathcal{F}\{f * g\}(\omega) = \mathcal{F}\{g\}(\omega) \int_{\mathbb{R}} f(\tau) e^{-2\pi i \omega \tau} \, d\tau = \mathcal{F}\{f\}(\omega) \, \mathcal{F}\{g\}(\omega),

[21] establishing the theorem.

The result extends to functions in $L^2(\mathbb{R})$ via the Plancherel theorem, which shows that the Fourier transform is a unitary operator (isometry) on $L^2(\mathbb{R})$.[22] Specifically, the Schwartz functions (smooth with rapid decay) are dense in $L^2(\mathbb{R})$, the theorem holds on this dense subspace by the $L^1$ case (as Schwartz functions are in $L^1 \cap L^2$), and continuity of the Fourier transform on $L^2$ yields the extension.[22] An alternative derivation uses the Fourier inversion formula, which recovers $g(t - \tau)$ as the inverse transform of $\mathcal{F}\{g\}$.[23] Substituting into the convolution integral and interchanging orders (justified by Tonelli's theorem for nonnegativity or Fubini for integrability) with a change of variables $\theta = t - \tau$ leads to the inverse transform of the product $\mathcal{F}\{f\} \cdot \mathcal{F}\{g\}$, confirming the theorem.[23] For functions not in $L^1(\mathbb{R})$ but in $L^2(\mathbb{R})$, the Fourier transforms are defined only in the $L^2$ sense (not pointwise), and convolutions may require approximation by mollifiers; the theorem holds with equality in the $L^2$ norm, addressing convergence via the Plancherel isometry.[22]

Discrete case derivation

The discrete convolution theorem for aperiodic signals states that the discrete-time Fourier transform (DTFT) of the convolution of two sequences equals the pointwise product of their individual DTFTs, under suitable convergence conditions.[7] Consider two discrete-time signals $f[n]$ and $g[n]$ with DTFTs $\mathcal{F}\{f\}(\omega) = F(\omega)$ and $\mathcal{F}\{g\}(\omega) = G(\omega)$, where the convolution is defined as

(f * g)[n] = \sum_{k=-\infty}^{\infty} f[k] g[n - k].

The DTFT of the convolution is

\mathcal{F}\{f * g\}(\omega) = \sum_{n=-\infty}^{\infty} (f * g)[n] e^{-i \omega n} = \sum_{n=-\infty}^{\infty} \left( \sum_{k=-\infty}^{\infty} f[k] g[n - k] \right) e^{-i \omega n}.

Assuming absolute convergence of the sequences, i.e., $\sum |f[k]| < \infty$ and $\sum |g[k]| < \infty$, the double summation allows reordering by Fubini's theorem for sums.[7] This yields

\mathcal{F}\{f * g\}(\omega) = \sum_{k=-\infty}^{\infty} f[k] \sum_{n=-\infty}^{\infty} g[n - k] e^{-i \omega n}.

For the inner sum, substitute $m = n - k$, so $n = m + k$ and

\sum_{n=-\infty}^{\infty} g[n - k] e^{-i \omega n} = e^{-i \omega k} \sum_{m=-\infty}^{\infty} g[m] e^{-i \omega m} = e^{-i \omega k} G(\omega).

Thus,

\mathcal{F}\{f * g\}(\omega) = G(\omega) \sum_{k=-\infty}^{\infty} f[k] e^{-i \omega k} = F(\omega) G(\omega).

For periodic variants, the discrete Fourier transform (DFT) applies to finite-length sequences of length $N$, where convolution becomes circular. The circular convolution is

(f \circledast g)[n] = \sum_{k=0}^{N-1} f[k] g[(n - k) \bmod N].

The DFT of this circular convolution equals the product of the individual DFTs, leveraging the orthogonality of the complex exponentials.[24] Specifically, the DFT basis functions satisfy

\sum_{n=0}^{N-1} e^{2\pi i (k - l) n / N} = N \, \delta_{k \equiv l \pmod{N}},

which follows from the sum of a finite geometric series: for $k \not\equiv l \pmod{N}$, the sum is $\frac{1 - e^{2\pi i (k-l)}}{1 - e^{2\pi i (k-l)/N}} = 0$, and for $k \equiv l \pmod{N}$, it is $N$. To derive the theorem, compute the DFT of $f \circledast g$:

\mathcal{F}\{f \circledast g\}[m] = \sum_{n=0}^{N-1} \left( \sum_{k=0}^{N-1} f[k] g[(n - k) \bmod N] \right) e^{-2\pi i m n / N}.

Reordering gives

\sum_{k=0}^{N-1} f[k] \sum_{n=0}^{N-1} g[(n - k) \bmod N] e^{-2\pi i m n / N} = \sum_{k=0}^{N-1} f[k] e^{-2\pi i m k / N} \sum_{l=0}^{N-1} g[l] e^{-2\pi i m l / N},

using the shift property and periodicity, resulting in $\mathcal{F}\{f\}[m] \cdot \mathcal{F}\{g\}[m]$.[24] The theorem extends to square-summable sequences ($\ell^2$) via density arguments and an analog of Parseval's theorem, which equates the energy in the time and frequency domains:

\sum_{n=-\infty}^{\infty} |f[n]|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(\omega)|^2 \, d\omega.

[25] Finite-support sequences (in $\ell^1 \cap \ell^2$) satisfy the theorem directly, and $\ell^2$ sequences approximate these densely, allowing the result by continuity of the DTFT on $\ell^2$.[25]
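The orthogonality relation at the heart of the DFT derivation, $\sum_{n} e^{2\pi i (k-l) n / N} = N\delta$, can be confirmed directly. A minimal sketch (NumPy as an illustrative tool):

```python
import numpy as np

N = 8
n = np.arange(N)
# E[k, n] = e^{2*pi*i*k*n/N}: the k-th DFT basis vector in row k.
E = np.exp(2j * np.pi * np.outer(n, n) / N)

# Gram matrix: entry (k, l) is sum_n e^{2*pi*i*(k-l)*n/N}.
gram = E @ E.conj().T
print(np.allclose(gram, N * np.eye(N)))  # True: N on the diagonal, 0 elsewhere
```

The geometric-series argument in the text is exactly what forces the off-diagonal entries to vanish.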

Generalizations

Inverse Fourier transform

The inverse Fourier transform of the pointwise product of two Fourier transforms equals the convolution of the original functions, providing the dual form of the convolution theorem. Specifically, if $F = \mathcal{F}\{f\}$ and $G = \mathcal{F}\{g\}$, then

\mathcal{F}^{-1}\{ F \cdot G \} = f * g,

where the convolution is defined as $(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) \, d\tau$, assuming $f$ and $g$ are sufficiently integrable functions such as those in the Schwartz class or $L^1(\mathbb{R})$.[1][4] This inverse form follows directly from the forward convolution theorem and the Fourier inversion theorem, which ensures that applying the forward and inverse transforms recovers the original function under appropriate conditions. By taking the inverse Fourier transform of both sides of the forward theorem $\mathcal{F}\{f * g\} = F \cdot G$, and invoking the linearity and invertibility of the transform, the equivalence is established, highlighting the symmetry between the time and frequency domains.[1][26]

In the discrete setting, the analogous result holds for finite sequences using the discrete Fourier transform (DFT) and its inverse (IDFT). For sequences $x[n]$ and $y[n]$ of length $N$, the circular convolution is given by

\text{IDFT}\{ \text{DFT}\{x\} \cdot \text{DFT}\{y\} \} = x \circledast y,

where $(x \circledast y)[n] = \sum_{m=0}^{N-1} x[m] y[(n - m) \bmod N]$, with zero-padding applied if necessary to avoid wrap-around effects in linear convolution approximations.[27][28] A practical illustration arises in signal smoothing, where multiplying the Fourier transform of a signal by a low-pass filter's transfer function (e.g., a rectangular window in frequency) attenuates high frequencies, and the inverse transform yields a convolved time-domain signal that smooths the original via the filter's impulse response, such as a sinc function.[4] This duality is particularly advantageous in filter design, as pointwise multiplication in the frequency domain is computationally simpler than direct time-domain convolution, enabling efficient implementation via fast Fourier transform algorithms.[27]
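The role of zero-padding mentioned above can be demonstrated concretely. A minimal sketch (NumPy as an illustrative tool): padding to $N_1 + N_2 - 1$ makes the DFT route reproduce linear convolution, while skipping the padding aliases the tail onto the head.

```python
import numpy as np

x = np.array([1., 2., 3., 4.])     # length N1 = 4
y = np.array([1., -1., 2.])        # length N2 = 3
L = len(x) + len(y) - 1            # pad length N1 + N2 - 1 = 6

# Circular convolution at the padded length == linear convolution.
circ = np.real(np.fft.ifft(np.fft.fft(x, L) * np.fft.fft(y, L)))
print(np.allclose(circ, np.convolve(x, y)))        # True

# Without padding (length 4), wrap-around corrupts the leading samples.
alias = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y, 4)))
print(np.allclose(alias, np.convolve(x, y)[:4]))   # False
```

The aliased result is the linear convolution folded modulo 4, which is exactly the wrap-around effect the padding rule prevents.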

Tempered distributions

Tempered distributions are continuous linear functionals on the Schwartz space $\mathcal{S}(\mathbb{R})$, the space of smooth functions that decay rapidly at infinity along with all their derivatives.[29] These distributions, denoted $\mathcal{S}'(\mathbb{R})$, extend the notion of functions to include generalized objects like the Dirac delta, enabling the Fourier transform to be defined via $\langle \hat{f}, \phi \rangle = \langle f, \hat{\phi} \rangle$ for $\phi \in \mathcal{S}(\mathbb{R})$.[30] The convolution theorem generalizes to tempered distributions as follows: if $f \in \mathcal{S}'(\mathbb{R})$ is a tempered distribution and $g \in C_c^\infty(\mathbb{R})$ is a smooth function with compact support, then the convolution $f * g$ is a smooth function defined by $(f * g)(x) = \langle f, \tau_x \tilde{g} \rangle$, where $\tilde{g}(y) = g(-y)$ and $\tau_x \tilde{g}(y) = g(x - y)$, and the Fourier transform satisfies $\mathcal{F}\{f * g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}$ in the distributional sense.[29] This product of distributions is well-defined because $\mathcal{F}\{g\}$ is a Schwartz function, which multiplies smoothly with any tempered distribution.[30] Convolution between two tempered distributions is defined provided at least one has compact support, ensuring the result is again a tempered distribution; for instance, if $f \in \mathcal{E}'(\mathbb{R})$ (compactly supported distributions) and $g \in \mathcal{S}'(\mathbb{R})$, then $f * g \in \mathcal{S}'(\mathbb{R})$.[31] A key property is that the Dirac delta distribution $\delta$, defined by $\langle \delta, \phi \rangle = \phi(0)$, acts as the identity under convolution: $\delta * f = f$ for any tempered distribution $f$.[32] As an illustrative example, since the Fourier transform of $\delta$ is the constant function $1$ (with appropriate normalization), the theorem yields $\mathcal{F}\{\delta * f\} = \mathcal{F}\{\delta\} \cdot \mathcal{F}\{f\} = 1 \cdot \mathcal{F}\{f\} = \mathcal{F}\{f\}$, confirming consistency with $\delta * f = f$.[32] A proof sketch proceeds by leveraging the density of Schwartz functions in suitable topologies and the pairing definition: for a test function $\phi \in \mathcal{S}(\mathbb{R})$, $\langle f * g, \phi \rangle = \langle f, \tilde{g} * \phi \rangle$, where $\tilde{g} * \phi$ is a Schwartz function; applying the Fourier transform to both sides and using the known theorem for Schwartz functions gives $\langle \mathcal{F}\{f * g\}, \hat{\phi} \rangle = \langle \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}, \hat{\phi} \rangle$, establishing the equality in the distributional sense.[29]

Multidimensional extensions

The convolution theorem extends naturally to functions defined on Rd\mathbb{R}^d or Zd\mathbb{Z}^d, generalizing the one-dimensional case where d=1d=1. In the continuous setting, the multidimensional convolution of two integrable functions ff and gg is defined as
(fg)(x)=Rdf(y)g(xy)dy, (f * g)(\mathbf{x}) = \int_{\mathbb{R}^d} f(\mathbf{y}) g(\mathbf{x} - \mathbf{y}) \, d\mathbf{y},
where x,yRd\mathbf{x}, \mathbf{y} \in \mathbb{R}^d.[33] The multidimensional Fourier transform of a function f:RdCf: \mathbb{R}^d \to \mathbb{C} is given by
F{f}(ω)=Rdf(x)e2πiωxdx, \mathcal{F}\{f\}(\boldsymbol{\omega}) = \int_{\mathbb{R}^d} f(\mathbf{x}) e^{-2\pi i \boldsymbol{\omega} \cdot \mathbf{x}} \, d\mathbf{x},
with ωRd\boldsymbol{\omega} \in \mathbb{R}^d and \cdot denoting the dot product; the inverse transform follows analogously with the positive exponent.[33] Under suitable integrability conditions (e.g., f,gL1(Rd)f, g \in L^1(\mathbb{R}^d)), the convolution theorem states that the Fourier transform of the convolution equals the pointwise product of the individual transforms:
F{fg}(ω)=F{f}(ω)F{g}(ω).[](https://see.stanford.edu/materials/lsoftaee261/chap8.pdf)[](https://qiml.radiology.wisc.edu/wpcontent/uploads/sites/760/2020/10/notes007FourierPropND.pdf) \mathcal{F}\{f * g\}(\boldsymbol{\omega}) = \mathcal{F}\{f\}(\boldsymbol{\omega}) \cdot \mathcal{F}\{g\}(\boldsymbol{\omega}).[](https://see.stanford.edu/materials/lsoftaee261/chap8.pdf)[](https://qiml.radiology.wisc.edu/wp-content/uploads/sites/760/2020/10/notes_007_FourierPropND.pdf)
This property holds componentwise in the frequency domain, facilitating efficient computation for higher-dimensional signals such as those in imaging or physics simulations.[34] In the discrete case, particularly for multidimensional arrays such as digital images, the theorem applies via the multidimensional discrete Fourier transform (DFT). For two $M \times N$ arrays $f(x, y)$ and $h(x, y)$ in two dimensions ($d = 2$), the 2D circular convolution is computed efficiently in the frequency domain: the 2D DFT of the convolution equals the product of the 2D DFTs, $F(u, v) \cdot H(u, v)$, followed by an inverse 2D DFT to recover the spatial result.[35] To avoid wraparound artifacts from periodicity, zero-padding is typically applied to extend the arrays to at least $(2M - 1) \times (2N - 1)$.[35] This is widely used in image processing for filtering operations, where the fast Fourier transform (FFT) implementation reduces the complexity from $O(M^2 N^2)$ to $O(MN \log(MN))$.[35] A representative application is 2D Gaussian convolution for image blurring, where the kernel is the radially symmetric Gaussian $G(\mathbf{x}) = \frac{1}{2\pi\sigma^2} e^{-\|\mathbf{x}\|^2 / (2\sigma^2)}$.

The 2D Fourier transform of this kernel is also a Gaussian, $\mathcal{F}\{G\}(\boldsymbol{\omega}) = e^{-2\pi^2 \sigma^2 \|\boldsymbol{\omega}\|^2}$, which is radial (depending only on $\|\boldsymbol{\omega}\|$) and acts as a low-pass filter by attenuating high frequencies.[36] Convolving an image $I$ with $G$ thus multiplies $\mathcal{F}\{I\}$ by $\mathcal{F}\{G\}$ in the frequency domain, yielding a blurred image upon inversion; a larger $\sigma$ widens the Gaussian in space and narrows it in frequency, removing finer details.[36] For separable functions, where $f(\mathbf{x}) = \prod_{k=1}^d f_k(x_k)$ over Cartesian coordinates, the multidimensional Fourier transform factors as $\mathcal{F}\{f\}(\boldsymbol{\omega}) = \prod_{k=1}^d \mathcal{F}\{f_k\}(\omega_k)$, and the convolution theorem applies componentwise across dimensions.[33] This separability simplifies computations in applications such as multidimensional filtering, reducing higher-dimensional operations to iterated one-dimensional transforms.[33]
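The zero-padded frequency-domain procedure described above can be sketched in a few lines. The following is a minimal illustration assuming NumPy is available; `fft_convolve2d` and `direct_convolve2d` are illustrative names, not library functions:

```python
import numpy as np

def fft_convolve2d(f, h):
    """Linear 2D convolution via the convolution theorem.

    Both arrays are zero-padded to (M1+M2-1, N1+N2-1) so that the
    circular convolution implied by the DFT equals the linear one.
    """
    out_shape = (f.shape[0] + h.shape[0] - 1, f.shape[1] + h.shape[1] - 1)
    F = np.fft.fft2(f, out_shape)        # 2D DFT of zero-padded f
    H = np.fft.fft2(h, out_shape)        # 2D DFT of zero-padded h
    return np.real(np.fft.ifft2(F * H))  # pointwise product, then inverse

def direct_convolve2d(f, h):
    """Slow spatial-domain reference: shifted, weighted copies of f."""
    M1, N1 = f.shape
    M2, N2 = h.shape
    out = np.zeros((M1 + M2 - 1, N1 + N2 - 1))
    for i in range(M2):
        for j in range(N2):
            out[i:i + M1, j:j + N1] += h[i, j] * f
    return out

# Demo: blur a random test image with a normalized 2D Gaussian kernel.
rng = np.random.default_rng(0)
img = rng.random((32, 32))

x = np.arange(-3, 4)
X, Y = np.meshgrid(x, x)
sigma = 1.5
g = np.exp(-(X**2 + Y**2) / (2 * sigma**2))
g /= g.sum()                             # kernel sums to 1

blurred = fft_convolve2d(img, g)
assert np.allclose(blurred, direct_convolve2d(img, g))
```

The frequency-domain path does the same work as the nested spatial loop but in $O(MN \log(MN))$ time, which is the practical payoff of the theorem for large images.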

Other integral transforms

The convolution theorem extends to other integral transforms, such as the Laplace transform and the Z-transform, which share the property that the transform of a convolution equals the product of the individual transforms, albeit with adaptations for their domains.[37] For the Laplace transform, the theorem applies to causal functions, where the convolution is defined as the one-sided integral $(f * g)(t) = \int_0^t f(\tau) g(t - \tau) \, d\tau$ for $t \geq 0$, and the transform satisfies $\mathcal{L}\{f * g\}(s) = \mathcal{L}\{f\}(s) \cdot \mathcal{L}\{g\}(s)$.[38] This formulation is particularly useful in analyzing linear time-invariant systems, where the output response is the convolution of an input with the system's impulse response, transforming to the product of their Laplace transforms (the transfer function).[39] The Z-transform, the discrete analog of the Laplace transform, exhibits a similar property for sequences: the Z-transform of the one-sided convolution $(f * g)[n] = \sum_{k=0}^n f[k] g[n - k]$ is $Z\{f * g\}(z) = Z\{f\}(z) \cdot Z\{g\}(z)$.[40][41] Unlike the bilateral Fourier transform, both the Laplace and Z-transforms are typically unilateral (one-sided), integrating or summing from zero onward, which suits them for stability analysis of causal systems and for initial value problems but lacks the full inverse duality of the Fourier case.[42] These theorems were developed in the context of control theory following the Fourier transform's establishment, with the unilateral Laplace transform gaining prominence through applications in electrical engineering and system dynamics during the early 20th century.[43][44]
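For finite causal sequences, the Z-transform is a polynomial in $z^{-1}$ whose coefficients are the sequence values, so the Z-domain product of two transforms reduces to polynomial multiplication of their coefficient arrays. A minimal numerical check, assuming NumPy:

```python
import numpy as np

# Two causal finite sequences (coefficients of z^-n in their Z-transforms).
f = np.array([1.0, 2.0, 3.0])
g = np.array([4.0, 5.0])

# Time domain: one-sided (linear) convolution of the sequences.
conv = np.convolve(f, g)

# Z domain: multiplying the transforms is polynomial multiplication
# of the coefficient arrays, which yields the same coefficients.
prod = np.polymul(f, g)

assert np.allclose(conv, prod)  # convolution theorem for the Z-transform
```

The agreement of `conv` and `prod` is exactly the statement $Z\{f * g\}(z) = Z\{f\}(z) \cdot Z\{g\}(z)$ restricted to finite sequences.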

Applications

Signal processing

In signal processing, the convolution theorem plays a central role in the analysis and implementation of linear time-invariant (LTI) systems, where the output signal is obtained by convolving the input signal with the system's impulse response. This operation, expressed as $ y[n] = \sum_{k=-\infty}^{\infty} x[k] h[n - k] $ for discrete-time signals, corresponds in the frequency domain to multiplication of their discrete-time Fourier transforms: $ Y(\omega) = X(\omega) H(\omega) $. This duality enables filter design by specifying the desired frequency response $ H(\omega) $ and deriving the impulse response $ h[n] $ via the inverse transform, facilitating efficient implementation of filters such as equalizers and noise reducers.[45]

A key practical advantage comes from accelerating convolution with the fast Fourier transform (FFT), which reduces the complexity from $ O(N^2) $ for direct time-domain convolution of length-$ N $ signals to $ O(N \log N) $ by transforming to the frequency domain, multiplying the spectra, and applying the inverse FFT. This FFT-based approach, leveraging the convolution theorem, is essential for processing long signals in real-time applications such as echo cancellation and spectral analysis.[46][47] For instance, an ideal low-pass filter multiplies the input's frequency spectrum by a rectangular window that suppresses frequencies above the cutoff; the equivalent time-domain operation is convolution with the sinc impulse response $ h[n] = \frac{\sin(\omega_c n)}{\pi n} $ (where $ \omega_c $ is the cutoff frequency), smoothing the signal while preserving low-frequency components. This method is widely used in audio equalization to remove unwanted high-frequency noise.[48][49] To handle long signals and mitigate artifacts from the circular convolution inherent in finite DFTs, techniques such as overlap-add and overlap-save are employed.
In the overlap-add method, the input is segmented into overlapping blocks, each convolved via FFT, and the results are added after aligning overlaps, yielding linear convolution without edge effects. Similarly, overlap-save discards corrupted overlap portions post-convolution, ensuring efficient streaming processing for extended data streams.[46][50] In modern audio processing, the convolution theorem underpins real-time systems enhanced by GPU-accelerated FFT libraries, enabling low-latency convolution reverbs and spatial audio rendering since the early 2000s, with implementations like NVIDIA's cuSignal achieving up to 10x speedups over CPU methods for multichannel processing.[51][52]
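The overlap-add procedure described above can be sketched compactly. The following is a minimal illustration assuming NumPy (`overlap_add` is an illustrative name, not a library function), verified against direct time-domain convolution:

```python
import numpy as np

def overlap_add(x, h, block_len=128):
    """Linear convolution of a long signal x with FIR filter h,
    computed block-by-block in the frequency domain (overlap-add)."""
    L = block_len                  # input samples processed per block
    M = len(h)
    nfft = L + M - 1               # FFT size large enough for linear conv
    H = np.fft.rfft(h, nfft)       # transform the filter once, reuse per block
    y = np.zeros(len(x) + M - 1)
    for start in range(0, len(x), L):
        block = x[start:start + L]
        # Zero-padded FFT, spectral multiply, inverse FFT: one block's
        # linear convolution with h, by the convolution theorem.
        yb = np.fft.irfft(np.fft.rfft(block, nfft) * H, nfft)
        # Add the block's tail into the overlap region of the output.
        n = min(nfft, len(y) - start)
        y[start:start + n] += yb[:n]
    return y

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)      # long input signal
h = rng.standard_normal(31)        # short FIR impulse response

assert np.allclose(overlap_add(x, h), np.convolve(x, h))
```

Because each block's FFT size `nfft` is at least `L + M - 1`, the circular convolution computed per block coincides with linear convolution, so summing the overlapped tails reconstructs the full-length result without edge artifacts.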

Probability and statistics

In probability theory, the convolution of two probability density functions is the density of the sum of two independent continuous random variables. If $X$ and $Y$ are independent random variables with densities $f_X$ and $f_Y$, the density $f_Z$ of $Z = X + Y$ is given by the convolution

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x) f_Y(z - x) \, dx.$$[53]

This operation arises naturally from the joint density under independence, $f_{X,Y}(x, y) = f_X(x) f_Y(y)$, by marginalizing over the sum.[54] The convolution theorem simplifies the analysis via characteristic functions, which are the Fourier transforms of the densities. For independent $X$ and $Y$, the characteristic function of $Z = X + Y$ is the product

$$\phi_Z(t) = \mathbb{E}[e^{itZ}] = \phi_X(t)\,\phi_Y(t),$$

where $\phi_X(t) = \mathbb{E}[e^{itX}]$.[55] This multiplicative property holds because the expectation factors under independence, providing a direct application of the theorem without computing the convolution integral explicitly.[56] The density $f_Z$ can then be recovered as the inverse Fourier transform of $\phi_Z$.[54] A representative example is the Irwin–Hall distribution, which describes the sum of $n$ independent uniform random variables on $[0, 1]$. Its density is the $n$-fold convolution of the uniform density $f_U(u) = 1$ for $u \in [0, 1]$, and its characteristic function is the product $\phi(t) = \left[\frac{e^{it} - 1}{it}\right]^n$, equivalent to a product of sinc functions since $\frac{e^{it} - 1}{it} = e^{it/2} \cdot \frac{\sin(t/2)}{t/2}$.[57] This product form highlights how the theorem captures the smoothing effect of repeated convolutions, leading to a piecewise polynomial density.[53]

In statistics, the convolution theorem enables deconvolution to correct for measurement errors. Consider observations $Y_j = X_j + \epsilon_j$, where $\{\epsilon_j\}$ are i.i.d. errors independent of the signal $\{X_j\}$ with known error density $f_\epsilon$. The observed density is $f_Y = f_\epsilon * f_X$, so the characteristic functions satisfy $\phi_Y(t) = \phi_\epsilon(t)\,\phi_X(t)$. Solving for the signal gives $\phi_X(t) = \phi_Y(t) / \phi_\epsilon(t)$, with $f_X$ recovered via the inverse Fourier transform.[58] This approach is widely used in density estimation under measurement error, though it requires regularization where $\phi_\epsilon(t)$ vanishes.[59]

The theorem extends to multivariate settings via multivariate characteristic functions. For independent random vectors $\mathbf{X}$ and $\mathbf{Y}$ in $\mathbb{R}^d$, the characteristic function of $\mathbf{Z} = \mathbf{X} + \mathbf{Y}$ is $\phi_\mathbf{Z}(\mathbf{t}) = \phi_\mathbf{X}(\mathbf{t})\,\phi_\mathbf{Y}(\mathbf{t})$, where $\mathbf{t} \in \mathbb{R}^d$ and $\phi_\mathbf{X}(\mathbf{t}) = \mathbb{E}[e^{i\mathbf{t}^\top \mathbf{X}}]$.[54] This supports extensions to multinomial distributions, where the sum of independent multinomials follows a multinomial via convolution of probability mass functions, with the product of multivariate characteristic functions confirming the result.[60]
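The multiplicative property of characteristic functions can be checked numerically for discrete distributions, where densities become probability mass functions and the convolution is a finite sum. A minimal sketch assuming NumPy (`phi` is an illustrative helper, not a library function):

```python
import numpy as np

# pmfs of two independent integer-valued random variables on {0, 1, 2}.
pX = np.array([0.2, 0.5, 0.3])
pY = np.array([0.1, 0.6, 0.3])

# pmf of the sum Z = X + Y: discrete convolution of the two pmfs.
pZ = np.convolve(pX, pY)

def phi(p, t):
    """Characteristic function phi(t) = E[exp(i t N)] of an
    integer-valued variable with pmf p on {0, 1, ..., len(p)-1}."""
    n = np.arange(len(p))
    return np.sum(p[None, :] * np.exp(1j * t[:, None] * n[None, :]), axis=1)

t = np.linspace(-np.pi, np.pi, 7)

# The theorem: the characteristic function of the sum is the
# pointwise product of the individual characteristic functions.
assert np.allclose(phi(pZ, t), phi(pX, t) * phi(pY, t))
assert np.isclose(pZ.sum(), 1.0)   # pZ is a valid pmf
```

The same identity, with integrals in place of sums, underlies the continuous and multivariate statements above.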

References
