Beyond Coincidence – Mathematical Truths That Transcend All Possible Universes
posted: 02-Feb-2025 & updated: 26-Aug-2025
\[\newcommand{\reals}{\mathbb{R}} \newcommand{\preals}{\reals_{+}} \newcommand{\ppreals}{\reals_{++}} \newcommand{\complexes}{\mathbb{C}} \newcommand{\integers}{\mathbb{Z}} \newcommand{\kclosure}{\overline{K}} \newcommand{\Prob}{\mathop{\bf Prob}} \newcommand{\Expect}{\mathop{\bf E{}}} \newcommand{\Var}{\mathop{\bf Var{}}} \newcommand{\sign}{\mathop{\bf sign}} \newcommand{\innerp}[2]{\langle{#1},{#2}\rangle} % inner product\]
NotebookLM Podcasts
- Beyond Coincidence - Unpacking the Inevitable Mathematical Patterns of Our Universe (11:51)
- Beyond Coincidence - Mathematical Truths That Shape All Possible Universes (12:04)
- Unavoidable Patterns - How Math Shapes Any Universe (13:26)
From Contingent Physics to Absolute Mathematical Truth
In my previous exploration of Arbitrariness vs Inevitability, I examined the curious boundary between physical laws that might be different in alternate universes—like Newton’s gravitational constant or the specific exponent in inverse square laws—and those that seem to emerge from deeper geometric necessities. That investigation revealed how some aspects of our physical reality might be contingent accidents of cosmic history, while others appear rooted in mathematical truths as fundamental as the relationship between a circle’s circumference and its diameter. But that analysis only scratched the surface of a far more profound question – what lies beyond even the deepest physical laws?
Here I venture into an entirely different realm—one that transcends not just the specific features of our universe, but the very concept of physical existence itself. The mathematical truths I’ll explore don’t merely hold across all possible universes; they would remain true even if no universes existed at all, even if space, time, matter, and energy were merely figments of imagination. These are truths that exist in what we might call the space of pure logical necessity—patterns so fundamental that their denial would lead to logical contradiction, relationships so inevitable that they would emerge in any conceivable framework sophisticated enough to handle concepts like counting, randomness, or spatial relationships. Unlike the physical constants that govern electromagnetic forces or gravitational attraction, these mathematical inevitabilities require no physical substrate whatsoever to manifest their elegant, inescapable logic.
The Gaussian Distribution – Why “Normal” is Actually Inevitable
Walk into any statistics classroom, flip open any data science textbook, or peek under the hood of most machine learning algorithms, and you’ll find the same ubiquitous bell-shaped curve staring back at you. The Gaussian distribution—also known as the normal distribution—appears so frequently in nature and human affairs that we’ve literally named it “normal.” But is this dominance merely a convenient mathematical fiction, or does it reflect something deeper about the structure of reality itself?
The answer lies in one of the most remarkable theorems in all of mathematics: the Central Limit Theorem. Far from being an arbitrary choice or a quirk of our mathematical toolkit, the Gaussian distribution emerges as an inevitable consequence of how randomness behaves when it accumulates. This isn’t just mathematical elegance—it’s a fundamental truth that transcends our universe.
Unlike the physical constants I explored in my previous post on Arbitrariness vs Inevitability—where Newton’s gravitational constant $G$ or the specific exponent in inverse square laws might be contingent features of our particular universe—the normal distribution represents pure mathematical inevitability. The Gaussian curve would emerge in any possible universe where randomness exists, even in alternate realities with completely different physical laws. In fact, we don’t need universes at all for this truth to hold; it flows directly from the logical structure of probability itself, as universal and inevitable as prime numbers.
The Central Limit Theorem – Where Chaos Becomes Order
In probability theory, the Central Limit Theorem states that, under appropriate conditions, the distribution of a normalized version of the sample mean converges to a standard normal distribution. This holds even if the original variables themselves are not normally distributed.
The Central Limit Theorem states the following:
Let $\{X_i\}_{i=1}^\infty$ be independent, identically distributed (i.i.d.) random variables with $\Expect X_i = \mu$ and $\Var X_i =\sigma^2$. Then \begin{equation} \label{eq:clt} Z_n = \sum_{i=1}^n \frac{X_i-\mu}{\sqrt{n}\sigma} \end{equation} converges in distribution to $\mathcal{N}(0,1)$ as $n$ goes to $\infty$.
Let that sink in for a moment. You can start with any collection of random variables—whether they follow uniform distributions, exponential distributions, or even bizarre custom distributions with multiple peaks and valleys. Add enough of them together, and their sum will inexorably march toward the familiar bell curve. It’s as if mathematics itself has a built-in organizing principle that transforms chaos into predictable patterns.
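If you want to see this convergence with your own eyes, here is a minimal simulation sketch in Python with NumPy (the exponential distribution, the sample sizes, and the seed are all my illustrative choices, nothing mandated by the theorem). It builds $Z_n$ from \eqref{eq:clt} for a heavily skewed starting distribution and checks that it behaves like $\mathcal{N}(0,1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1_000, 10_000

# Exponential(scale=2) is skewed and decidedly non-Gaussian;
# its mean and standard deviation are both 2.
mu, sigma = 2.0, 2.0
X = rng.exponential(scale=2.0, size=(trials, n))

# Z_n from eq. (clt): the normalized sum of the centered variables
Z = (X - mu).sum(axis=1) / (np.sqrt(n) * sigma)

print(Z.mean(), Z.std())          # ~0 and ~1
print(np.mean(np.abs(Z) < 1.96))  # ~0.95, just as for N(0,1)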
This is profoundly different from the physical laws we encounter in nature. While we might ask whether Newton’s law of universal gravitation could have an exponent of 3 instead of 2 (and I argued that the exponent 2 might itself have geometric inevitability), or whether Einstein’s \(E=mc^2\) could have a different proportionality constant, no such questions arise for the Central Limit Theorem. The bell curve doesn’t depend on the specific physical constants of our universe, the peculiarities of biological evolution, or the accidents of cosmic history.
The theorem serves as a cornerstone of probability theory because it implies that probabilistic and statistical methods designed for normal distributions can be applied to countless problems involving other types of distributions. But beyond its practical utility lies something more profound: the Central Limit Theorem reveals that the Gaussian distribution isn’t just mathematically convenient—it’s mathematically inevitable.
Sometimes the theorem feels almost too good to be true, like discovering that the messy complexity of the real world secretly follows elegant mathematical laws. However, unlike the curious coincidences between geometry and physics that I explored in my discussion of inverse square laws in From Prime Numbers to Physical Laws - Arbitrariness or Inevitability?, the Central Limit Theorem represents unqualified inevitability. It emerges from the fundamental logic of probability itself, requiring no physical substrate whatsoever.
As fantastic as this mathematical inevitability appears, it raises an even deeper question: why should adding random variables together produce such orderly results? The answer reveals something profound about the nature of information, entropy, and the hidden mathematical structures that govern randomness itself—structures as universal as the geometric facts that make π appear in the area of circles, yet somehow even more fundamental.
Proof of the Central Limit Theorem boils down to three simple facts
Here we will first clearly see why the Central Limit Theorem has to hold; it really boils down to surprisingly simple facts!1
- A smooth function can be well approximated by a quadratic polynomial (backed by Taylor’s theorem), i.e., if $f:\reals\to\reals$ is twice-differentiable at $a\in\reals$, there exists $h:\reals\to\reals$ such that
\[f(x) = f(a) + f'(a)(x-a) + \frac{1}{2} f''(a)(x-a)^2 + h(x)(x-a)^2\]and
\[\lim_{x\to a} h(x) = 0.\]- The definition of $e$, i.e.,
\begin{equation} \label{eq:def-napier-constant} e = \lim_{n\to\infty} \left(1+\frac{1}{n}\right)^n. \end{equation}
- The characteristic function of Gaussian $\mathcal{N}(0, 1)$ is
\begin{equation} \label{eq:char-fcn-gaussian} e^{- \frac{1}{2}t^2}. \end{equation}
(Refer to the list of characteristic functions of some distributions).
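As a quick sanity check on the third fact, one can estimate the characteristic function of $\mathcal{N}(0,1)$ by Monte Carlo and compare it with $e^{-t^2/2}$. A small sketch (the sample size and test points are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal(1_000_000)  # samples from N(0, 1)

for t in (0.5, 1.0, 2.0):
    estimate = np.mean(np.exp(1j * t * X))  # Monte Carlo E[e^{itX}]
    print(t, estimate.real, np.exp(-t**2 / 2))  # real parts agree; imag ~ 0
```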
Now let’s see why this is the case!
Characteristic function
The characteristic function of a random variable $X\in\reals$ is denoted by $\varphi_X:\reals\to\complexes$ and defined by
\[\varphi_X(t) = \Expect e^{itX}.\]Note that
\[\varphi_X'(t) = i\Expect X e^{itX} \; \& \; \varphi_X''(t) = -\Expect X^2 e^{itX}\]thus
\[\Expect X = -i \varphi_X'(0) \; \& \; \Expect X^2 = - \varphi_X''(0).\]Now let $X,Y\in\reals$ be two (statistically) independent random variables. Then the characteristic function of $X+Y$ is
\[\begin{eqnarray*} \varphi_{X+Y}(t) &=& \Expect e^{it(X+Y)} \\ &=& \int_{-\infty}^\infty \int_{-\infty}^\infty f_{X,Y}(x,y) e^{it(x+y)} dx dy \\ &=& \int_{-\infty}^\infty \int_{-\infty}^\infty f_X(x) f_Y(y) e^{it(x+y)} dx dy \\ &=& \left( \int_{-\infty}^\infty f_X(x) e^{itx} dx \right) \left( \int_{-\infty}^\infty f_Y(y) e^{ity} dy \right) \\ &=& \varphi_X(t) \varphi_Y(t) \end{eqnarray*}\]that is, the characteristic function of $X+Y$ is the product of the two individual characteristic functions of $X$ and $Y$ respectively. Repeatedly applying this, we can easily reach the conclusion that for $n$ independent random variables $X_1,\ldots,X_n$, the characteristic function of $Z = \sum_{i=1}^n X_i$ is
\[\varphi_Z(t) = \prod_{i=1}^n \varphi_{X_i}(t).\]Now let $Y_i^{(n)} = (X_i-\mu)/\sqrt{n}\sigma$ in \eqref{eq:clt}. Then
\[Z_n = Y_1^{(n)} + \cdots + Y_n^{(n)}.\]Since $Y_i^{(n)}$ for $1\leq i\leq n$ are i.i.d., $\varphi_{Y_i^{(n)}}:\reals\to\complexes$ are the same for all $1\leq i\leq n$. If we let $\varphi_{n}:\reals\to\complexes$ denote this quantity, the characteristic function of $Z_n$ is
\begin{equation} \label{eq:char-fcn-zn} \varphi_{Z_n}(t) = \prod_{i=1}^n \varphi_{Y_i^{(n)}}(t) = \left(\varphi_{n}(t)\right)^n. \end{equation}
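The multiplicativity of characteristic functions under independence is also easy to verify numerically. Below is a minimal sketch (the two distributions and the evaluation point $t$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
N, t = 2_000_000, 0.7
X = rng.uniform(-1.0, 1.0, N)  # two independent samples drawn from
Y = rng.exponential(1.0, N)    # completely different distributions

phi = lambda s: np.mean(np.exp(1j * t * s))  # Monte Carlo E[e^{its}]
print(phi(X + Y))       # characteristic function of the sum ...
print(phi(X) * phi(Y))  # ... agrees with the product, up to sampling noise
```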
Taylor’s theorem
Let $\tilde{Y_i} = (X_i-\mu)/\sigma$. Then Taylor’s theorem implies
\[\varphi_\tilde{Y_i}(t) = \varphi_\tilde{Y_i}(0) + \varphi_\tilde{Y_i}'(0)t + \frac{1}{2} \varphi_\tilde{Y_i}^{\prime\prime}(0) t^2 + o(t^2) = 1 - t^2/2 + o(t^2)\]since $\Expect \tilde{Y_i} = 0$ and $\Expect \tilde{Y_i}^2 = 1$ where $o(\cdot)$ denotes the little-o notation. Thus
\begin{equation} \label{eq:taylor-polynomial-char-fcn} \varphi_n(t) = \Expect e^{it(X_i-\mu)/\sqrt{n}\sigma} = \varphi_\tilde{Y_i}(t/\sqrt{n}) = 1 - t^2/2n + o(t^2/n) \end{equation}
Definition of $e$
The definition of $e$ \eqref{eq:def-napier-constant}, \eqref{eq:char-fcn-zn}, and \eqref{eq:taylor-polynomial-char-fcn} imply that
\[\lim_{n\to\infty} \varphi_{Z_n}(t) = \lim_{n\to\infty} \left( 1 - t^2/2n + o(t^2/n) \right)^n = e^{-t^2/2}\]Thus Lévy’s continuity theorem and \eqref{eq:char-fcn-gaussian} imply that
\[Z_n \Rightarrow \mathcal{N}(0,1)\]i.e., $Z_n$ converges in distribution to $\mathcal{N}(0,1)$, which completes the proof!
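The definition-of-$e$ step is also easy to watch converge numerically. A tiny sketch (the value of $t$ is arbitrary):

```python
import numpy as np

t = 1.5
for n in (10, 1_000, 100_000):
    print(n, (1 - t**2 / (2 * n)) ** n)  # (1 - t^2/2n)^n for growing n
print("limit:", np.exp(-t**2 / 2))       # e^{-t^2/2}
```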
The profound implications of mathematical inevitability
What makes this result so philosophically striking is its universality. The Central Limit Theorem doesn’t depend on the specific physical constants of our universe, the particular chemical composition of our planet, or even the existence of matter itself. It emerges purely from the logical structure of probability and addition—operations so fundamental that any conceivable intelligence capable of counting would eventually discover them.
Consider this thought experiment: imagine a civilization of pure information beings existing in a digital realm with no physical substrate whatsoever. Even these entities, if they developed concepts of randomness and aggregation, would inevitably arrive at the same bell-shaped curve. The Gaussian distribution transcends not just our universe but the very concept of physical reality itself.
This inevitability reveals something profound about the relationship between mathematics and existence. While the physical laws I explored in my previous post on Arbitrariness vs Inevitability might be contingent features of our particular cosmic circumstances, the Central Limit Theorem represents a deeper tier of truth—one that exists in the realm of pure logical necessity.
Hence in a sense, this mathematical truth doesn’t even require any universe to exist, or any intelligent beings to discover it. The Central Limit Theorem represents a timeless logical necessity—it would have been true even before the Big Bang (if such a concept as “before” has meaning), and it remains true independent of whether time, space, or physical reality exist at all. Unlike physical constants that might depend on the specific configuration of our cosmos, this theorem exists in the realm of pure logical structure, as inevitable as the fact that 2+2=4 or that prime numbers have no divisors other than 1 and themselves.
This places the Gaussian distribution in a remarkable category: truths that are not just universal across all possible universes, but truths that transcend the very concept of universe entirely. They exist in what we might call the space of logical necessity—a realm where mathematical relationships hold not because of any physical substrate, but because their denial would lead to logical contradiction.
The mathematical proof we’ve just examined demonstrates why this inevitability holds with such iron-clad certainty. The emergence of the Gaussian distribution from the Central Limit Theorem isn’t a happy accident or a convenient approximation—it’s the inevitable consequence of three fundamental mathematical facts: the nature of smooth functions (Taylor’s theorem), the definition of the mathematical constant $e$, and the characteristic function of Gaussian random variables.2 These building blocks are so basic to mathematics that any mathematical framework sophisticated enough to handle probability would necessarily contain them.
From chaos to cosmos - the universal organizing principle
Perhaps most remarkably, the Central Limit Theorem reveals that randomness itself contains a hidden organizing principle. No matter how chaotic, unpredictable, or wildly distributed your initial random variables might be, their collective behavior inexorably marches toward the same elegant bell curve. It’s as if mathematics itself has a built-in tendency toward order—a cosmic preference for the Gaussian distribution that transcends any particular physical implementation.
This stands in fascinating contrast to the apparent arbitrariness of many physical phenomena. While we might wonder why Newton’s gravitational constant has the specific value it does, or why electromagnetic forces follow inverse square laws rather than inverse cube laws, no such questions arise for the Central Limit Theorem. There’s no “Gaussian constant” that could have been different, no alternative bell-shaped curve that some alternate universe might have discovered instead.
The ubiquity of the normal distribution in nature—from the heights of human populations to the thermal motion of gas molecules, from measurement errors in scientific instruments to the fluctuations of financial markets—now reveals itself not as a series of coincidences, but as manifestations of this deeper mathematical truth. We see Gaussian distributions everywhere not because our universe happens to be constructed that way, but because any universe with randomness and aggregation would necessarily exhibit the same patterns.
The bridge between mathematical and physical reality
This inevitability of the Gaussian distribution also illuminates the mysterious effectiveness of mathematics in describing the natural world. When physicists model complex systems using normal distributions, they’re not imposing an arbitrary mathematical framework onto reluctant physical phenomena. Instead, they’re recognizing that physical processes involving the aggregation of many random effects must conform to this mathematical truth, regardless of the underlying physical details.
The Central Limit Theorem thus serves as a bridge between the realm of pure mathematical necessity and the contingent world of physical reality. It shows us that some aspects of our universe’s behavior are inevitable not because of the specific physical laws that govern matter and energy, but because of deeper logical structures that would constrain any possible universe where counting, adding, and randomness exist.
In this sense, the normal distribution represents perhaps the purest example of what I call mathematical inevitability—truths so fundamental that they transcend not just the specific features of our universe, but the very concept of physical existence itself. Understanding this inevitability doesn’t just help us appreciate the elegant mathematics underlying probability theory; it offers us a glimpse into the deepest logical structures that shape reality at the most fundamental level.
Sine Waves - The Mathematical Language of Linear Reality
(Hence, you really should study trigonometric functions hard in high school! ★^^★)
Turn on a radio, observe light propagating through space, or watch ripples spread across a pond—in each case, you’re witnessing the same fundamental mathematical truth – nature speaks in sine waves. But why? Why should the universe choose this particular mathematical function to carry energy, information, and disturbances across space and time?
The answer reveals another profound inevitability, one that emerges not from the specific equations of electromagnetism or acoustics, but from something far more fundamental – the mathematical structure of reality itself. Just as the Gaussian distribution emerges inevitably from the logic of probability and aggregation, sinusoidal waves emerge inevitably from the most basic properties we could expect any uniform space to possess.
The key insight lies in recognizing that space-time itself acts as what mathematicians call a Linear Time-Invariant (LTI) system—and as we’ll see, sinusoidal functions are the only possible eigenfunctions of any such system. This isn’t just a mathematical curiosity; it’s a window into why the universe had no choice but to encode wave phenomena in the language of trigonometry.
Linear time-invariant systems - the mathematical framework of uniform space
A linear time-invariant (LTI) system is any system \(H\) that satisfies two fundamental properties that we might naturally expect from uniform, homogeneous space.
- Linearity – If you combine two inputs linearly, you get the linear combination of their outputs.
- Translation Invariance – The laws remain the same everywhere—if you delay an input by some time \(\tau\), the output is delayed by exactly the same amount.
These properties might seem abstract, but they encode something profound about the nature of space itself. Linearity captures the principle of superposition—that disturbances can be added together without interfering with each other’s individual behavior. Translation invariance captures the homogeneity of space—that the laws of physics work the same way here as they do there, now as they did then.
Any medium that possesses these properties—whether it’s electromagnetic fields propagating through the vacuum, sound waves traveling through air, or quantum probability amplitudes evolving through space-time—must necessarily behave as an LTI system.
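These two properties are concrete enough to check in code. The sketch below builds one particular discrete-time LTI system (convolution with a random impulse response; convolution itself is formalized just below) and verifies both properties. All sizes and signals are illustrative choices of mine:

```python
import numpy as np

rng = np.random.default_rng(3)
h = rng.normal(size=7)           # an arbitrary impulse response
H = lambda x: np.convolve(x, h)  # a concrete LTI system

x1, x2 = rng.normal(size=50), rng.normal(size=50)
a, b = 2.0, -3.0

# Linearity: H(a x1 + b x2) == a H(x1) + b H(x2)
print(np.allclose(H(a * x1 + b * x2), a * H(x1) + b * H(x2)))

# Translation invariance: delaying the input delays the output identically
delay = lambda x, k: np.concatenate([np.zeros(k), x])
print(np.allclose(H(delay(x1, 5)), delay(H(x1), 5)))
```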
Discrete-time LTI systems
First suppose that the system being considered is a discrete-time system, that is, we assume that \(f:\integers \to K\) where \(K\) is some field (e.g., \(\reals\) or \(\complexes\)). Then any such \(f\) can be expressed as
\[f(t) = \sum_{k=-\infty}^\infty f(k) \delta(t-k)\]where \(\delta: \integers \to \{0,1\}\) is the Kronecker delta function, i.e.,
\[\delta(t) = \left\{\begin{array}{ll} 1 &\mbox{if } t=0 \\ 0 &\mbox{otherwise} \end{array}\right.\]Then the time-invariance and linearity imply that for an LTI system \(H\)
\begin{equation} \label{eq:lti-op} H(f(t)) = \sum_{k=-\infty}^\infty f(k) H(\delta(t-k)) = \sum_{k=-\infty}^\infty f(k) h(t-k) \end{equation}
where \(h = H(\delta)\) is the output of the system \(H\) when the input is the Kronecker delta function \(\delta\), which is called the impulse response of \(H\).
This means for any (discrete-time) LTI system, you can fully characterize the output of any input to the system if you know the impulse response, that is, the output of the system when the input is the Kronecker delta function!
The operation on the right-hand-side (RHS) of \eqref{eq:lti-op} is called (as a binary operation of two functions) convolution (of \(f\) and \(h\)) and typically the symbol \(\star\) is used, i.e., for two functions \(f,g:\integers \to K\), the convolution of \(f\) and \(g\) is denoted by \(f\star g: \integers \to K\) and defined by3
\begin{equation} \label{eq:conv} (f\star g)(t) = \sum_{k=-\infty}^\infty f(k) g(t-k). \end{equation}
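For concreteness, here is a direct implementation sketch of \eqref{eq:conv} for finitely supported sequences, checked against NumPy’s built-in convolution (the example signals are arbitrary):

```python
import numpy as np

def conv(f, g):
    """The convolution sum of eq. (conv), with f supported on
    0..len(f)-1 and g supported on 0..len(g)-1."""
    out = np.zeros(len(f) + len(g) - 1)
    for k, fk in enumerate(f):
        out[k:k + len(g)] += fk * np.asarray(g)  # f(k) g(t-k), all t at once
    return out

f = [1.0, 2.0, 3.0]
h = [0.5, 0.5]            # impulse response of a 2-tap moving average
print(conv(f, h))         # [0.5, 1.5, 2.5, 1.5]
print(np.convolve(f, h))  # matches
```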
Therefore we can write the output of a discrete-time LTI system when the input is \(f\) as
\[H(f) = f \star h.\]
Continuous-time LTI systems
Now suppose that the system being considered is a continuous-time system, that is, we assume that \(f:\reals \to K\) where \(K\) is some field.
(Here I choose the most intuitive way to explain the concepts, sacrificing a bit of mathematical rigor, though not much at all.) Note that for some small $\Delta>0$, we can approximate \(f:\reals \to K\) by
\[\begin{align} \nonumber f(t) &\approx \sum_{k=-\infty}^\infty f(k\Delta) I_{[k\Delta-\Delta/2, k\Delta+\Delta/2]}(t) \\ \label{eq:fcn-approx} &= \sum_{k=-\infty}^\infty f(k\Delta) I_{[-\Delta/2, \Delta/2]}(t-k\Delta) \end{align}\]where \(I_A:\reals\to K\) for \(A\subset \reals\) is the indicator function defined by
\[I_A(t) = \left\{\begin{array}{ll} 1 &\mbox{if } t \in A \\ 0 &\mbox{otherwise} \end{array}\right.\]As \(\Delta\) goes to \(0\), (with proper smoothness assumption) the RHS of \eqref{eq:fcn-approx} converges to \(f\).
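One can watch the approximation in \eqref{eq:fcn-approx} converge as \(\Delta\) shrinks. A small numerical sketch (the test function $\sin$ is an arbitrary smooth choice):

```python
import numpy as np

def step_approx(f, t, delta):
    """RHS of eq. (fcn-approx): f sampled on the grid and held constant
    on each interval [k*delta - delta/2, k*delta + delta/2]."""
    k = np.round(t / delta)  # index of the interval containing t
    return f(k * delta)

t = np.linspace(-3.0, 3.0, 1001)
for delta in (0.5, 0.1, 0.01):
    err = np.max(np.abs(step_approx(np.sin, t, delta) - np.sin(t)))
    print(delta, err)  # the max error shrinks roughly like delta
```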
As before, the linearity of the system implies
\[H(f(t)) \approx \sum_{k=-\infty}^\infty f(k\Delta) H\left(I_{[-\Delta/2, \Delta/2]}(t-k\Delta)\right)\]Now we really should note the critical difference between continuous-time and discrete-time systems here. That is, as \(\Delta\) goes to zero, the indicator function \(I_{[-\Delta/2, \Delta/2]}(t-k\Delta)\) vanishes. So we need some way to preserve the energy, or equivalently the area, of the function, hence we use the following trick!
\[H(f(t)) \approx \sum_{k=-\infty}^\infty \Delta f(k\Delta) H\left(\frac{1}{\Delta} I_{[-\Delta/2, \Delta/2]}(t-k\Delta)\right)\]Here the energy, i.e., the area of (or rather under) the function \(\frac{1}{\Delta} I_{[-\Delta/2, \Delta/2]}(t-k\Delta)\) is preserved to be 1 as shown in Figure 1!
Figure 1: The graph of $\frac{1}{\Delta} I_{[-\Delta/2, \Delta/2]}(t)$ for small $\Delta>0$.
Now let
\[h_\Delta(t) = H\left(\frac{1}{\Delta} I_{[-\Delta/2, \Delta/2]}(t)\right).\]Then the time-invariance of the system implies
\[H(f(t)) \approx \sum_{k=-\infty}^\infty \Delta f(k\Delta) h_\Delta(t-k\Delta)\]Now we note that the RHS is nothing but a Riemann sum, hence under proper smoothness assumption, we have
\[H(f(t)) = \lim_{\Delta\to0} \sum_{k=-\infty}^\infty \Delta f(k\Delta) h_\Delta(t-k\Delta) = \int_{-\infty}^\infty f(\tau) h(t-\tau) d \tau\]where \(h = \lim_{\Delta\to0} h_\Delta\).
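The Riemann-sum limit can be sanity-checked numerically too. A convenient test case (my choice, not something forced by the derivation above) is the convolution of two standard normal densities, whose exact value is the $\mathcal{N}(0,2)$ density:

```python
import numpy as np

# Gaussian density with variance s2
g = lambda x, s2: np.exp(-x**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)

t = 1.0
for delta in (0.5, 0.1, 0.001):
    tau = np.arange(-10.0, 10.0, delta)
    riemann = np.sum(delta * g(tau, 1.0) * g(t - tau, 1.0))
    print(delta, riemann)   # Riemann sum of the convolution integral
print("exact:", g(t, 2.0))  # N(0,1) * N(0,1) = N(0,2)
```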
As in the discrete-time case, \(h\) is called the impulse response of the (continuous-time) LTI system, and the output of a continuous-time LTI system can also be fully characterized by this impulse response!
Also, similarly as for the discrete-time case, the RHS of the equation is called the convolution of \(f\) and \(h\), and the same symbol \(\star\) as for the discrete-time case is used, i.e., for any two functions $f,g: \reals \to K$, we define $f\star g:\reals \to K$ by
\[(f \star g)(t) = \int_{-\infty}^\infty f(\tau) g(t-\tau) d\tau.\]Again, using this notation, we can express the output of a continuous-time LTI system as
\[H(f) = f \star h.\]
The inevitable eigenfunctions
Here’s where the mathematical inevitability becomes crystal clear. For any LTI system, we can ask: what functions pass through the system unchanged except for scaling? These are the system’s eigenfunctions—inputs that emerge as perfect copies of themselves, multiplied only by some complex number.
In a very similar way to how the eigenvalues and the associated eigenvectors are defined in linear algebra, that is, for a matrix \(A\in K^{n\times n}\) (for a field \(K\)), if there exist \(\lambda \in \kclosure\) and \(v\in \kclosure^n\) (where \(\kclosure\) is the algebraic closure of \(K\)) satisfying
\begin{equation} \label{eq:la-eigen} A v = \lambda v \end{equation}
\(\lambda\) is called the eigenvalue of \(A\) and \(v\) the associated eigenvector, we can define eigenfunctions (and the associated eigenvalues) similarly for LTI systems, i.e., for an LTI system \(H\), if there exist \(v:T \to \kclosure\) and \(\lambda\in\kclosure\) such that
\begin{equation} \label{eq:lti-eigen} H(v) = \lambda v \end{equation}
\(v\) is called the eigenfunction of \(H\) and \(\lambda\) the associated eigenvalue, where \(T=\integers\) in the discrete-time case and \(T=\reals\) in the continuous-time case.
Note the similarity or resemblance between \eqref{eq:la-eigen} and \eqref{eq:lti-eigen}. We can simply remove the parentheses in \eqref{eq:lti-eigen} to make it look just like \eqref{eq:la-eigen}! So essentially they mean exactly the same thing, hence the names.
For both discrete-time and continuous-time LTI systems, functions of the following form are always eigenfunctions.
\[v_f(t) = e^{i 2\pi f t} = \cos(2\pi ft) + i\sin(2\pi ft)\]This is a pure sinusoidal wave with frequency \(f\).
Below we will show this.
Discrete-time LTI systems
For discrete-time LTI systems, we assume $f$ is in $[0,1)$ since $v_f(t)$ is a periodic function of $f$ with period 1.
Note
\[\begin{align} \nonumber H(v_f)(t) = \sum_{k=-\infty}^\infty h(k) v_f(t-k) &= \sum_{k=-\infty}^\infty h(k) e^{-i 2\pi f k } e^{i2\pi f t} \\ \nonumber &= \left( \sum_{k=-\infty}^\infty h(k) e^{-i 2\pi f k } \right) v_f(t) \end{align}\]thus, \(v_f\) is the eigenfunction of \(H\) with \(\lambda_f = \sum_{k=-\infty}^\infty h(k) e^{-i 2\pi f k }\) as the associated eigenvalue, which (it turns out) is the Fourier Transform of the (discrete-time) impulse response \(h:\integers \to K\).4
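This eigenfunction property is easy to verify numerically for a concrete discrete-time LTI system. In the sketch below the impulse response and frequency are arbitrary choices of mine, and the output is evaluated via the convolution sum directly:

```python
import numpy as np

rng = np.random.default_rng(4)
h = rng.normal(size=5)  # an arbitrary (finitely supported) impulse response
f = 0.1                 # frequency in [0, 1)
t = np.arange(-20, 21)
v = np.exp(1j * 2 * np.pi * f * t)  # v_f(t) = e^{i 2 pi f t}

# H(v_f)(t) = sum_k h(k) v_f(t - k), and the predicted eigenvalue lambda_f
y = sum(h[k] * np.exp(1j * 2 * np.pi * f * (t - k)) for k in range(len(h)))
lam = sum(h[k] * np.exp(-1j * 2 * np.pi * f * k) for k in range(len(h)))

print(np.allclose(y, lam * v))  # True: v_f comes out scaled by lambda_f
```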
Continuous-time LTI systems
Here we assume $f\in\reals$. We will show that the functions $v_f$ are eigenfunctions of any (continuous-time) LTI system. Note
\[\begin{align} \nonumber H(v_f)(t) = \int_{-\infty}^\infty h(\tau) v_f(t-\tau) d\tau &= \int_{-\infty}^\infty h(\tau) e^{-i 2\pi f \tau } e^{i 2\pi f t} d\tau \\ \nonumber &= \left(\int_{-\infty}^\infty h(\tau) e^{-i 2\pi f \tau } d\tau \right) v_f(t) \end{align}\]thus, \(v_f\) is the eigenfunction of \(H\) with \(\lambda_f = \int_{-\infty}^\infty h(\tau) e^{-i 2\pi f \tau} d\tau\) as the associated eigenvalue, which is the Fourier Transform of the (continuous-time) impulse response \(h:\reals\to K\).5
The proof is startlingly simple! In both cases, we have
\[H(v_f) = \lambda_f v_f\]i.e., the sinusoidal function emerges from the system as a perfect copy of itself, scaled by the eigenvalue $\lambda_f$.
This holds for any LTI system whatsoever. The mathematics gives us no other choice.
From abstract mathematics to physical reality
Now comes the profound connection – electromagnetic wave propagation through free space is necessarily an LTI process.
The linearity is built into the very fabric of electromagnetism through Maxwell’s equations—electric and magnetic fields superpose linearly. If you have two electromagnetic disturbances, the total field is simply their sum.
The (time-)translation invariance emerges from the homogeneity of free space itself. The laws of electromagnetism work the same way in every region of empty space, at every moment in time. There are no preferred locations, special directions, or favored moments in the vacuum.
Therefore, when electromagnetic energy propagates through space, it must behave as an LTI system—and this means its natural modes of propagation must be the eigenfunctions of this system: sinusoidal waves.
The deep mathematical truth
This reveals something remarkable – Maxwell’s equations don’t arbitrarily impose sinusoidal solutions on electromagnetic phenomena. Rather, they emerge from a deeper mathematical necessity.
The wave equation that follows from Maxwell’s equations, i.e.,
\[\frac{\partial^2 \psi}{\partial t^2} = c^2 \nabla^2 \psi\]which both the electric field $E$ and the magnetic field $B$ should satisfy, isn’t the source of sinusoidal behavior; it’s the mathematical expression of the fact that electromagnetic propagation through homogeneous space must be an LTI process.
The sinusoidal solutions aren’t just one family of solutions to this equation; they’re the inevitable solutions because they’re the only functions that can propagate through any linear, translation-invariant medium without changing their fundamental character.
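As a small symbolic check (a sketch using SymPy; the plane-wave form is the standard one, and the symbol names are mine), one can confirm that a sinusoidal travelling wave satisfies the one-dimensional form of the wave equation above:

```python
import sympy as sp

t, x, c, f = sp.symbols("t x c f", positive=True)
psi = sp.sin(2 * sp.pi * f * (x - c * t))  # sinusoidal wave moving at speed c

lhs = sp.diff(psi, t, 2)         # d^2 psi / dt^2
rhs = c**2 * sp.diff(psi, x, 2)  # c^2 * d^2 psi / dx^2
print(sp.simplify(lhs - rhs))    # 0, so the wave equation holds
```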
Beyond electromagnetism – a universal principle
This mathematical inevitability extends far beyond electromagnetic waves. Any phenomenon that propagates through a linear, homogeneous medium—sound waves in uniform air, water waves on a calm surface, quantum probability amplitudes in free space, even vibrations through crystalline solids—must necessarily decompose into sinusoidal components.
This is why Fourier analysis works so universally across physics. It’s not that we’ve chosen a convenient mathematical tool; we’ve discovered that sinusoidal decomposition is the natural language that linear, translation-invariant reality uses to encode information.
Just as any statistical phenomenon involving aggregation must eventually yield Gaussian distributions, any wave phenomenon in uniform space must eventually express itself through sinusoidal functions. The mathematics offers no alternatives.
The inevitability of trigonometry
So when your high school math teacher insisted you learn trigonometric functions, they were introducing you to something far more profound than arbitrary mathematical machinery. They were teaching you the fundamental language that any universe with linear, homogeneous space would necessarily discover.
Sine and cosine aren’t just useful mathematical tools—they’re the inevitable mathematical expressions of how disturbances propagate through uniform reality. In any conceivable universe where space is homogeneous and physical processes are linear, intelligent beings would eventually discover these same trigonometric relationships, not because of the specific physics of their world, but because of the deeper mathematical structures that govern any possible world with these basic properties.
The ubiquity of sinusoidal waves in nature now reveals itself not as a curious coincidence, but as a manifestation of mathematical inevitability as fundamental as the appearance of $\pi$ in geometry or the emergence of Gaussian distributions in probability. We see sine waves everywhere because any linear, translation-invariant reality must speak in this mathematical language—and remarkably, that appears to include our own.
- Well, actually, they are two simple facts and one definition! ↩
- The second and the third are obviously very closely related to each other. ↩
- Note that the convolution is a commutative binary operator, i.e., $$ (f\star g)(t) = \sum_{k=-\infty}^\infty f(k) g(t-k) = \sum_{k'=-\infty}^\infty f(t-k') g(k') = (g\star f)(t). $$ ↩
- This holds for any LTI system. The converse is not true, though, that is, for specific LTI systems, there can exist eigenfunctions not of the form $e^{i 2\pi f t }$. ↩
- Again, note that this holds for any LTI system. The converse is not true, though, that is, for specific LTI systems, there can exist eigenfunctions not of the form $e^{i 2\pi f t }$. ↩