
Oscillations, Symmetry and Diagonalization

From a single mass on a spring to a vibrating string, oscillatory systems are often taught in the typical classroom via a sequence of seemingly ad-hoc guesses: exponentials in time, normal modes in space, and Fourier series in the continuum limit. In this essay, we look at oscillations through a linear-operator lens, and use the symmetries with which those operators commute to show how diagonalization naturally decouples the dynamics. We show how those familiar solutions arise in a principled fashion. The focus throughout is on symmetry, linear algebra and operator structure, making the discussion applicable well beyond mechanics: to spectral methods, linear dynamical systems and kernel methods, all common in statistics and machine learning.

1: Introduction and Motivation

Anyone who has taken physics in high school or university has likely seen the following equation many times:

$$\ddot{x}(t) = -\omega^2 x(t)$$

This is the equation of motion of a Simple Harmonic Oscillator (SHO): the acceleration of the particle as a function of time, $\ddot{x}(t)$, is proportional to its displacement from the mean position, $x(t)$, and directed towards the mean position. If you were taught the way I was, the typical treatment is to introduce a "guess" solution:

$$x(t) = A \cos(\omega t+\phi)$$

We check, and indeed, it satisfies the given differential equation. Then the physics teacher would claim "it is the most general solution for an SHO", and the class would proceed to discuss various properties of this solution.

If you are anything like I was in high school, you would be dissatisfied with this business. I was used to solving equations "in a complete manner". When solving a quadratic equation, we derive the roots. We complete the square, do some algebra and the two roots emerge inevitably. We don't "guess" the solution... then why should the situation be any different here?

In this blog, I show how linear algebra and symmetry provide a principled framework for understanding oscillatory systems. We first reformulate the SHO as a linear operator equation, and using time-translation invariance we see how exponential solutions emerge as irreducible temporal behaviors of the particle. Next, we extend this same operator-theoretic viewpoint to systems of finitely many coupled oscillators, where the coupling structure leads to normal modes via diagonalization. Lastly, we take the continuum limit of a system of infinitely coupled oscillators that obey spatial translation symmetry, and show that waves on a string arise as simultaneous eigenfunctions of the spatial symmetry generator and the underlying oscillatory operator.

I write this blog to share a few key insights that emerged for me only after revisiting ideas from several courses at MIT. These were concepts that were not immediately obvious or enlightening to me the first time I encountered them, but ones I have come to appreciate in retrospect, viewed through a unifying lens. I think I would have enjoyed reading such a blog the first time I encountered 8.03, 18.03 or 18.06, simply because I find these abstractions very beautiful and helpful for appreciating the core ideas underlying the subject.

I have tried to keep this blog technically accessible to a first-year undergraduate or advanced high school student. Towards that end, I emphasize the intuition and ideas behind the concepts rather than a fully formal, mathematical treatment, and I leave references to resources where the math is discussed more concretely.

2: 1D SHO via linear operators

2.1 The SHO operator

As a first step towards a more principled framework for understanding oscillations, we look at a single particle oscillating in one dimension from an operator viewpoint.

Informally, an operator is a function of functions: it takes a function as input and produces another function as output. Define the differential operator

$$D := \frac{d}{dt}$$

so that our SHO equation $\ddot{x}+\omega^2 x=0$ can be written compactly as

$$(D^{2} + \omega^2 I)x(t) = 0$$

where I write $I$ to denote the identity operator (multiplication by $1$), for notational convenience later on. If we define $A := D^2 + \omega^2 I$, then our SHO equation can be written simply as

$$\boxed{Ax=0}$$

where we view $x$ as a function of time.

The operator $A$ itself is linear, meaning that it acts linearly on functions:

$$A(\alpha x_1 + \beta x_2) = \alpha A x_1 + \beta A x_2$$

for all functions $x_1, x_2$ and scalars $\alpha, \beta$.

Note that if $x_1(t)$ and $x_2(t)$ both represent valid motions of the particle, then any linear combination $\alpha x_1(t) + \beta x_2(t)$, with constants $\alpha, \beta$, is also a valid solution. Thus, the set of solutions is closed under linear combinations, and therefore forms a vector space!

This is the first perspective shift I would like to emphasize. Rather than thinking of $x(t)$ as a literal distance, we instead treat it as an abstract function that takes a time $t$ as input and returns a real number. Further, these functions inhabit a vector space and behave just like vectors. From this viewpoint, solving the simple harmonic oscillator amounts to finding all functions that lie in the null space (or kernel) of the linear operator $A$. This sounds much more like a linear algebra problem!

Recall that in solving a matrix nullspace problem $Mv=0$, we would typically do Gaussian elimination or row reduction to find a set of linearly independent vectors $\{ v_{i} \}$ that span the nullspace of the matrix $M$. Then, the most general solution is some linear combination:

$$v = \sum_{i}c_{i}v_{i}$$

where the $c_{i}$ are constants.
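To make this concrete, here is a minimal NumPy/SciPy sketch of the finite-dimensional recipe we are about to mimic (the matrix $M$ is an arbitrary example of my own choosing, not anything special):

```python
import numpy as np
from scipy.linalg import null_space

# An example matrix whose rows are proportional: rank 1, so a 2D nullspace.
M = np.array([[1., 2., 3.],
              [2., 4., 6.]])

V = null_space(M)             # columns form a basis {v_i} of the nullspace
print(V.shape)                # (3, 2): two independent nullspace vectors

c = np.array([2.0, -1.0])     # arbitrary constants c_i
v = V @ c                     # general solution: v = sum_i c_i v_i
print(np.allclose(M @ v, 0))  # True: any linear combination solves M v = 0
```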

To solve our SHO equation, we will try doing something similar, but here we have two important differences:

  1. We are working in the vector space of arbitrary functions $x(t)$, which is an infinite dimensional vector space!
  2. Our linear operator $A$ isn't a matrix with concrete numbers. It has the $D^2$ operator within it.

Regardless, the core idea remains the same: finding a set of independent vectors $\{ x_{i} \}$ that span the nullspace of the linear operator $A$ will give us a general solution to the SHO equation. We deal with these differences one-by-one.

2.2 Dimension of nullspace

First, we show that the dimension of the nullspace of $A$ is $2$ via a bijection to $\mathbb{R}^2$. So, we need not be concerned with the entire infinite-dimensional space of functions; we are only interested in a $2$-dimensional subspace of this vector space of all functions.

We know that two initial conditions are required to fully specify the motion of the particle in SHO, typically the initial displacement $x(0)$ and the initial velocity $\dot{x}(0)$. One way of seeing why this fully constrains the particle's motion for all time is the following: if we fix $x(0)$ and $\dot{x}(0)$, then all higher order derivatives at $t=0$ are fixed too: the second derivative is $\ddot{x}(0)=-\omega^2 x(0)$, the third derivative is $\dot{(\ddot{x})}(0) = -\omega^2\dot{x}(0)$, and so on. Then, by Taylor's expansion around $t=0$:

$$x(t) = x(0) + \dot{x}(0) t + \ddot{x}(0) \frac{t^2}{2} + \dots$$

As all the higher-order derivatives at $t=0$ are fixed, $x(t)$ is fixed: fixing all orders of derivatives of a function at a particular time fully determines the function (at least for the well-behaved, analytic functions relevant here).
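As a quick numerical sanity check (with an arbitrary choice of $\omega$ and initial conditions), we can build the Taylor series directly from the recursion $x^{(n+2)}(0) = -\omega^2 x^{(n)}(0)$ and watch it reproduce the closed-form solution:

```python
import math

omega, x0, v0 = 2.0, 1.0, 0.5    # arbitrary frequency and initial conditions
t = 0.3

# The SHO equation fixes every higher derivative at t = 0:
# x^(n+2)(0) = -omega^2 * x^(n)(0)
derivs = [x0, v0]
for n in range(40):
    derivs.append(-omega**2 * derivs[n])

# Taylor series: x(t) = sum_n x^(n)(0) * t^n / n!
taylor = sum(d * t**n / math.factorial(n) for n, d in enumerate(derivs))

# Closed-form solution with the same initial conditions
closed = x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)
print(abs(taylor - closed) < 1e-12)  # True: (x(0), xdot(0)) pin down the motion
```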

One obvious fact that needs to be stated for completeness: given any $x(t)$, there is a unique pair $(x(0), \dot{x}(0))$.

Thus, we showed that given a particular $(x(0), \dot{x}(0))$ we obtain a unique $x(t)$, and given a particular $x(t)$ we get a unique $(x(0), \dot{x}(0))$. In other words, there is a bijection: $\text{solution to SHO} \leftrightarrow (x(0), \dot{x}(0))$. Further, this map is linear:

$$x_{1}(t) + x_{2}(t) \leftrightarrow (x_{1}(0) + x_{2}(0), \dot{x}_{1}(0) + \dot{x}_{2}(0))$$

A bijective, linear map is an isomorphism, implying that the dimensions of the two spaces related by this map are the same (isomorphism literally means same-shape). So, the dimension of the nullspace of $A$ is the dimension of the space inhabited by all such pairs $(x(0), \dot{x}(0))$. Clearly, any two real numbers can be valid initial conditions, so this space is $\mathbb{R}^2$, which has dimension $2$. So, the dimension of the nullspace of $A$ is $2$.

This yields another beautiful picture that I wish to emphasize: each solution $x(t)$ to the SHO is indexed by a unique vector in the real plane $\mathbb{R}^2$ that specifies its initial conditions $(x(0), \dot{x}(0))$. Further, choosing $t=0$ to establish this bijection was arbitrary; by a similar analysis, any $(x(t'), \dot{x}(t'))$ indexes the path $x(t)$ just as well. The set of all such points $(x(t'), \dot{x}(t'))$ thus corresponds to the same trajectory $x(t)$, and as you slide through all values of $t'$, they trace out a curve in $\mathbb{R}^2$. You might have already seen this as the phase-space picture of the SHO.
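Here is a short sketch of this picture, using the closed-form solution we derive below (the frequency and initial conditions are arbitrary choices): every point $(x(t'), \dot{x}(t'))$ of a given trajectory lies on the same ellipse in $\mathbb{R}^2$.

```python
import numpy as np

omega, x0, v0 = 1.5, 1.0, 0.4                 # arbitrary choices
t = np.linspace(0.0, 2 * np.pi / omega, 200)  # one full period

x  = x0 * np.cos(omega * t) + (v0 / omega) * np.sin(omega * t)
xd = -x0 * omega * np.sin(omega * t) + v0 * np.cos(omega * t)

# All points (x(t'), xdot(t')) of one trajectory satisfy the same
# ellipse equation x^2 + (xdot/omega)^2 = const: the phase-space orbit.
print(np.allclose(x**2 + (xd / omega)**2, x0**2 + (v0 / omega)**2))  # True
```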

2.3 Finding two basis vectors

Having proved that the nullspace of $A$ is $2$-dimensional, we thus need to find only two independent functions $x_{1}(t), x_{2}(t)$ to write down a general solution: any $x(t)$ in the nullspace would be some linear combination of these two.

In linear algebra, it often helps to reorient ourselves to a convenient basis. The same problem that is very difficult in the default basis can become very simple if you exploit the structure of the matrix in question to perform a basis change.

So let's take a closer look at our linear operator $A$:

$$A = D^2 + \omega^2 I$$

and specifically, let's focus on the operator $D$. One key property of $D$ is that it commutes with time translation. This is just a formal way of saying:

$$\underbrace{\dot{x}(t+\Delta t)}_{\text{evaluate the derivative of } x \text{ at shifted time}} = \left( \frac{d}{dt}x \right)(t+\Delta t) = \frac{d}{dt}\big(x(t+\Delta t)\big) = \underbrace{\dot{(x(t+\Delta t))}}_{\text{shift } x(t)\text{, then take derivative}}$$

but understanding this symmetry better will be very helpful later on.

Symmetry is described mathematically by members of some symmetry group. When I say that $D$ obeys a symmetry in time, I mean that $D$ interacts in a certain fashion with elements of the symmetry group of time translations. In general, we can label a time translation by the amount of shift $\Delta t$ it causes:

$$\text{time shift of } \Delta t \leftrightarrow T_{\Delta t}$$

These $T_{\Delta t}$ can also be represented as operators on our vector space of functions (i.e. this is one of the representations of our symmetry group). Their action on any $x(t)$ is:

$$T_{\Delta t}x(t) = x(t+\Delta t)$$

i.e. a mapping from the function $x(t)$ to $x(t+\Delta t)$. Note that $T_{\Delta t}$ is also a linear operator.

Crucially, $D$ commutes with any such $T_{\Delta t}$:

$$DT_{\Delta t} = T_{\Delta t}D$$

which we compactly write as $[D, T_{\Delta t}]=0$. This can be seen explicitly by the action on any $x(t)$:

$$\begin{aligned} D(T_{\Delta t} x(t)) = D(x(t+\Delta t)) &= \dot{(x(t+\Delta t))} \\ T_{\Delta t}(Dx(t)) = T_{\Delta t}(\dot{x}(t)) &= \dot{x}(t+\Delta t) \end{aligned}$$

Similarly, $D^2$ also commutes with time translations $T_{\Delta t}$: $[D^2, T_{\Delta t}] = 0$. Consequently, as $A = D^2 + \omega^2 I$:

$$[A, T_{\Delta t}] = 0$$

because $\omega^2 I$ commutes with any operator (it's just a scalar times the identity operator).
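We can see this commutation concretely in a finite-dimensional stand-in (a discretization of my own, not anything from the derivation above): on a periodic grid, a circular shift matrix plays the role of $T_{\Delta t}$ and a central-difference matrix plays the role of $D$, and the two commute exactly.

```python
import numpy as np

N = 64
h = 2 * np.pi / N                 # grid spacing on a periodic domain

# Discrete time shift: (S x)[n] = x[(n+1) % N], a stand-in for T_dt
S = np.roll(np.eye(N), 1, axis=1)

# Periodic central-difference derivative, a stand-in for D
D = (S - S.T) / (2 * h)

print(np.allclose(D @ S, S @ D))  # True: [D, S] = 0
```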

Now, the key step that will justify all this hard work:

$$\text{linear operators that commute have the same eigenfunctions}$$

That sounds like a mouthful, but it is a key result taught in Quantum Mechanics and Linear Algebra courses. To break it down, I discuss the underlying intuitive picture below.

In this context, eigenfunctions are simply the solutions to operator equations, akin to eigenvectors in finite-dimensional spaces. If a linear operator $M$ has an eigenfunction $v$ with eigenvalue $\lambda$, then:

$$Mv = \lambda v$$

So, applying the operator $M$ to the eigenfunction $v$ results in the function being scaled by the eigenvalue $\lambda$. Importantly, eigenfunctions are directions (in the function space) that remain unchanged except for scaling when acted upon by $M$. This is not true for a general function. For example, if $M$ has eigenfunctions $v_{1}$ and $v_{2}$ with distinct eigenvalues $\lambda_{1}$ and $\lambda_{2}$, then:

$$M(c_{1}v_{1} + c_{2}v_{2}) = \lambda_{1}c_{1}v_{1} + \lambda_{2}c_{2}v_{2} \not\propto (c_{1}v_{1} + c_{2}v_{2})$$

unless one of $c_{1}$ or $c_{2}$ is zero. Eigenfunctions are special directions where the action of the operator is simple scaling.

Our claim is that if two operators commute, then these directions align. The full proof is found in the references, but I prove it here for the case where the operators have no degenerate eigenvalues (i.e. for each eigenvalue, there is a unique associated direction). Let $M_{1}$ and $M_{2}$ commute, and take any eigenfunction $v_{1}$ of $M_{1}$ with eigenvalue $\lambda_{1}$; by non-degeneracy, $v_{1}$ is the only eigenfunction of $M_{1}$ with eigenvalue $\lambda_{1}$, up to scaling. Then:

$$M_{2}M_{1}v_{1} = M_{1}M_{2}v_{1}$$

as $[M_{1},M_{2}]=0$. But then:

$$M_{2}(M_{1}v_{1}) = M_{2}(\lambda_{1}v_{1}) = \lambda_{1} (M_{2}v_{1}) = M_{1} (M_{2}v_{1})$$

so, $M_{2}v_{1}$ is also an eigenfunction of $M_{1}$ with eigenvalue $\lambda_{1}$. But as these are non-degenerate operators, $M_{2}v_{1}$ must be in the same direction as $v_{1}$, and so for some constant $\lambda_{2}$:

$$M_{2}v_{1} = \lambda_{2}v_{1}$$

so, $v_{1}$ is also an eigenfunction of $M_{2}$.

The underlying picture is this: because $M_{1}$ and $M_{2}$ commute, the order in which you apply them to an eigenfunction of either operator does not matter, as they both simply scale that eigenfunction. Their actions on these common directions do not interfere.
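Here is a small numerical illustration of the shared-eigenvector claim (the matrices are arbitrary constructions for demonstration): $M_2$ is built as a polynomial in $M_1$, so the two commute, and the eigenvectors of $M_1$ diagonalize $M_2$ as well.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal eigenbasis

M1 = Q @ np.diag(np.arange(1.0, 6.0)) @ Q.T   # distinct eigenvalues: non-degenerate
M2 = M1 @ M1 + 3 * M1                         # a polynomial in M1, so [M1, M2] = 0
print(np.allclose(M1 @ M2, M2 @ M1))          # True

_, V = np.linalg.eigh(M1)                     # eigenvectors of M1
B = V.T @ M2 @ V                              # M2 expressed in M1's eigenbasis
print(np.allclose(B, np.diag(np.diag(B))))    # True: diagonal, so shared eigenvectors
```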

Going back to our SHO problem: as $A$ commutes with $T_{\Delta t}$, any eigenfunction of $A$ is also an eigenfunction of $T_{\Delta t}$. Recall that we are interested in finding $2$ distinct functions in the nullspace of $A$:

$$Ax=0$$

This is equivalent to finding two distinct eigenfunctions of $A$ with eigenvalue $0$. And by the result on commuting operators, all eigenfunctions of $A$ are eigenfunctions of $T_{\Delta t}$!

The eigenfunctions of the time-translation operator $T_{\Delta t}$ are much more fundamental objects, and deriving them yields a much more satisfying answer for why exponentials appear in the first place, as we see below.

We seek functions $f(t)$ such that $T_{\Delta t}f = \lambda f$ for some $\lambda \in \mathbb{C}$, the set of complex numbers. We use complex numbers instead of real numbers because they are algebraically closed, meaning we need not worry about missing roots when we solve the eigenvalue problem (which, in general, yields a polynomial in $\lambda$):

$$T_{\Delta t}f = f(t+\Delta t) = \lambda f(t)$$

This already makes the answer fairly clear, but I finish the derivation for completeness. Taking $\Delta t \to dt$ infinitesimal, the eigenvalue must approach $1$ (since $T_{0}$ is the identity), so we can write $\lambda = 1 + s\,dt$ for some constant $s \in \mathbb{C}$. This gives us a differential equation for $f(t)$:

$$f(t+dt) = (1 + s\,dt)f(t) \implies \frac{df(t)}{f(t)} = s\,dt \implies f(t) = f(0)e^{st}$$

Thus, the eigenfunctions of the time-translation operator are exponentials!

It is very satisfying to see this result: exponentials are exactly the functions that, when you shift their argument by $\Delta t$, get scaled by $e^{s\Delta t}$ for some constant $s \in \mathbb{C}$. Exponentials are the unique class of functions that survive (i.e. are scaled and don't change direction) under translation!
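The discrete analogue makes this tangible: for the circular shift matrix from before (again my own finite stand-in for $T_{\Delta t}$), the eigenvectors are exactly the discrete exponentials $e^{2\pi i k n/N}$, i.e. the Fourier modes.

```python
import numpy as np

N = 8
S = np.roll(np.eye(N), 1, axis=1)  # discrete shift: (S f)[n] = f[(n+1) % N]

k = 3                              # any mode index 0..N-1 works
f = np.exp(2j * np.pi * k * np.arange(N) / N)  # discrete exponential (Fourier mode)
lam = np.exp(2j * np.pi * k / N)   # shifting multiplies the mode by this phase

print(np.allclose(S @ f, lam * f))  # True: exponentials are shift eigenvectors
```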

Now that we know that all eigenfunctions of $T_{\Delta t}$ are exponentials, and that all eigenfunctions of $A$ are eigenfunctions of $T_{\Delta t}$, all that remains is to find two distinct exponentials that lie in the nullspace of $A$. This brings us to the place where most college courses start: plug in $e^{st}$ as a guess solution, where $s \in \mathbb{C}$:

$$Ae^{st}=0 \implies (D^2+\omega^2 I)e^{st}=0 \implies (s^2+\omega^2)e^{st} = 0 \implies \boxed{s = \pm i\omega}$$
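If you prefer to let the computer do the algebra (with an arbitrary choice of $\omega$), the characteristic polynomial $s^2 + \omega^2$ hands back exactly these two roots:

```python
import numpy as np

omega = 2.0
# coefficients of s^2 + 0*s + omega^2
print(np.roots([1.0, 0.0, omega**2]))  # the pair ±2j, i.e. s = ±i*omega
```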

Thus, any general solution $x(t)$ to the SHO equation can be written as a linear combination of these two nullspace-spanning eigenfunctions:

$$x(t) = C_{1} e^{i\omega t} + C_{2}e^{-i\omega t}$$

where $C_{1},C_{2} \in \mathbb{C}$. But as we are dealing with functions that describe the motion of our particle, we require that $x(t)$ and $\dot{x}(t)$ are real valued for all $t$. Requiring the imaginary parts of these functions to vanish forces $C_{2} = \overline{C_{1}}$, and after some algebra we end up with the general solution to the SHO:

$$x(t) = D_{1}\cos (\omega t) + D_{2}\sin(\omega t) \qquad \text{where} \; D_{1},D_{2} \in \mathbb{R}$$
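As a closing sanity check (constants chosen arbitrarily), the complex and real forms agree once we set $C_{2} = \overline{C_{1}}$ with $C_{1} = (D_{1} - iD_{2})/2$, and the result indeed solves $\ddot{x} = -\omega^2 x$:

```python
import numpy as np

omega, D1, D2 = 2.0, 0.7, -0.3          # arbitrary choices
t = np.linspace(0.0, 10.0, 1000)

x_real = D1 * np.cos(omega * t) + D2 * np.sin(omega * t)

C1 = (D1 - 1j * D2) / 2                 # with C2 = conj(C1), x becomes real
x_cplx = C1 * np.exp(1j * omega * t) + np.conj(C1) * np.exp(-1j * omega * t)

print(np.allclose(x_cplx.imag, 0))       # True: purely real
print(np.allclose(x_real, x_cplx.real))  # True: the same function

# Finite-difference check that xdd ≈ -omega^2 x in the interior
dt = t[1] - t[0]
xdd = (x_real[2:] - 2 * x_real[1:-1] + x_real[:-2]) / dt**2
print(np.allclose(xdd, -omega**2 * x_real[1:-1], atol=1e-3))  # True (to grid accuracy)
```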