PHAS0067: Advanced Physical Cosmology

11 Principles of Perturbation Theory


This self-study section is non-examinable and is included just to fill in the details for those who want it. I recommend at least reading the summary (Section 11.5) to understand the implications for the rest of the course.

Before diving into relativistic perturbation theory, which has a bunch of complications, it’s worth talking about what we mean by perturbations in the first place.

Figure 13: Results of applying perturbation theory to a pendulum (illustrated on the right). Here I’ve plotted θ against time t for a numerical solution to the equations of motion (solid line) and the perturbation theory result (orange dashed line) for four initial “knocks” of different amplitudes. For small amplitude oscillations the perturbation theory remains accurate.

Often the full behaviour of physical systems is insanely complicated. Even if one can write down the fundamental equations governing the system, it might not be possible to solve them. As an example, think about a pendulum (Figure 13, right panel). The Lagrangian is

$$\mathcal{L} = T - V \tag{296}$$
$$\mathcal{L} = \tfrac{1}{2} m r^2 \dot\theta^2 + m g r \cos\theta. \tag{297}$$

Applying the Euler-Lagrange equations (54), we obtain the equation of motion for θ:

$$\ddot\theta = -\frac{g}{r}\sin\theta. \tag{298}$$

This is a well-defined differential equation but it has no closed-form solution in terms of elementary functions – a situation that is very familiar in physics. Typically there are two options in these scenarios: either calculate a solution numerically, or perform an analytic approximation.

Given the exponential increase in the power of computers, one might imagine that analytic approximations are only of historical interest. But this overlooks some important points. First, the results given by computers will always be approximate because numbers are stored with imperfect precision. Keeping the resulting inaccuracies under control can be extremely challenging – and requires deep insight into the nature of the physics being simulated.

Second, simply plugging equations into a computer often gives very little insight or understanding. Typically in science we want some hint of why a given system behaves as it does and a computer solution to an equation rarely gives much of an answer to that (at least, not on its own).

Finally, solving the full equations can be essentially pointless if extremely accurate approximations exist. If you’re building a house, you need to understand gravity – but it’d be a very strange engineer that chooses to do so with Einstein’s general relativity rather than Newtonian equations. Choosing the right tool for the job is the mark of a true expert.

For all these reasons, it’s normal to solve difficult equations through some combination of numerical and analytic approaches. The typical analytic approach uses perturbation theory – inspecting how the system behaves for small departures from an equilibrium or other limit in which the equations can be solved.

11.1 Perturbation theory of the pendulum

You have almost certainly seen perturbation theory applied to the pendulum before. The trick is to realise that for sufficiently small angles, $\theta \ll 1$, we have $\sin\theta \approx \theta$ and so $\ddot\theta \approx -(g/r)\theta$. We can immediately recognise this as the equation for a harmonic oscillator with angular frequency $\omega = (g/r)^{1/2}$.

More technically, we have actually identified a trivial background solution to the exact equations: $\theta = \theta_{\rm background} = 0$. Note that if we substitute this into the equations of motion we end up with $0 = 0$, i.e. they are correctly satisfied. We then write $\theta = \theta_{\rm background} + \delta\theta$, and Taylor expand the equation of motion to first order in $\delta\theta$. The resulting equation is $\delta\ddot\theta \approx -(g/r)\,\delta\theta$. This reframing should make clear that $\theta_{\rm background}$ need not have been zero – if we can identify any solution, then we are able to consider small perturbations around that solution.

Figure 13 shows a comparison between the numerical result for the oscillating pendulum and the perturbation theory solution, the latter of which is proportional to $\sin\omega t$. From top left to bottom right, the amplitude of the initial oscillation is increased. Note how, for small oscillations, the perturbation approximation is excellent; as the amplitude increases, the match breaks down at progressively earlier times. For initial knocks even greater than those illustrated, the pendulum will actually spin around: the perturbation theory approach has completely broken down and does not predict remotely the correct behaviour. So this example illustrates that perturbation theory works really well if you use it in the correct limits; you just have to be very careful not to believe its predictions outside those limits.
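The comparison described above can be reproduced with a short script. This is only a minimal sketch, not the code used to make Figure 13: it assumes units where $g/r = 1$, starts the pendulum at $\theta = 0$ with an initial angular velocity `knock`, and uses a hand-written fourth-order Runge-Kutta integrator. The function names and the particular knock values are illustrative choices.

```python
import math

def integrate_pendulum(knock, g_over_r=1.0, t_end=10.0, dt=1e-3):
    """RK4 integration of theta'' = -(g/r) sin(theta), starting from theta = 0
    with initial angular velocity `knock`. Returns lists (times, thetas)."""
    def rhs(theta, vel):
        return vel, -g_over_r * math.sin(theta)

    t, theta, vel = 0.0, 0.0, knock
    times, thetas = [t], [theta]
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(theta, vel)
        k2 = rhs(theta + 0.5 * dt * k1[0], vel + 0.5 * dt * k1[1])
        k3 = rhs(theta + 0.5 * dt * k2[0], vel + 0.5 * dt * k2[1])
        k4 = rhs(theta + dt * k3[0], vel + dt * k3[1])
        theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        vel += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        t += dt
        times.append(t)
        thetas.append(theta)
    return times, thetas

def linearised_theta(t, knock, g_over_r=1.0):
    """Perturbation-theory solution: (knock / omega) sin(omega t), omega = sqrt(g/r)."""
    omega = math.sqrt(g_over_r)
    return (knock / omega) * math.sin(omega * t)

def max_deviation(knock, g_over_r=1.0, t_end=10.0):
    """Largest |numerical - linearised| angle over the integration interval."""
    times, thetas = integrate_pendulum(knock, g_over_r, t_end)
    return max(abs(th - linearised_theta(t, knock, g_over_r))
               for t, th in zip(times, thetas))

if __name__ == "__main__":
    for knock in (0.01, 0.5, 1.5, 2.5):
        print(f"knock = {knock}: max deviation = {max_deviation(knock):.3g} rad")
```

With $g/r = 1$, any knock larger than 2 gives the pendulum enough energy to go over the top, so the last case in the loop is in the regime where the linearised solution fails completely.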

11.2 Orbits in a spherical potential

In cosmology we can make a lot of progress by looking at small amplitude, early time behaviour which can correctly be regarded as a perturbation around a homogeneous background. However, the background solution is no longer as simple as θ=0. It is worth taking a look at another, slightly more complicated example to see how this works before delving into full cosmological perturbation theory.

Consider the problem of a body orbiting in the midplane of a spherically-symmetric potential; this corresponds to the Lagrangian

$$\mathcal{L} = \tfrac{1}{2} r^2 \dot\phi^2 + \tfrac{1}{2}\dot r^2 - V(r). \tag{299}$$

The corresponding equations of motion, from Euler-Lagrange, are:

$$r^2\dot\phi = \text{constant, aka the specific angular momentum } j,$$
$$\ddot r = \frac{j^2}{r^3} - V'(r). \tag{300}$$

The simplest stable solution is that of a circular orbit described by $r_c$ and $\phi_c$, where $\dot r_c = 0$ and consequently

$$\phi_c = \frac{j_c}{r_c^2}\,t, \tag{301}$$

where without loss of generality I have taken $\phi_c = 0$ at $t = 0$. This will be our background solution around which to perturb. We write $r = r_c + \delta r$, $\phi = \phi_c + \delta\phi$ and $j = j_c + \delta j$; the linearised equations of motion are then given by Taylor expanding (300) to first order in the perturbations:

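$$[j_c] + \delta j = [r_c^2\dot\phi_c] + r_c^2\,\delta\dot\phi + 2 r_c\dot\phi_c\,\delta r + \dots, \text{ and} \tag{302}$$
$$[\ddot r_c] + \delta\ddot r = \left[\frac{j_c^2}{r_c^3}\right] - [V'(r_c)] + \frac{2 j_c\,\delta j}{r_c^3} - \frac{3 j_c^2}{r_c^4}\,\delta r - V''(r_c)\,\delta r + \dots. \tag{303}$$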

The dots refer to higher order terms (e.g. $(\delta r)^2$) which we ignore because by assumption $\delta r$ is small and so $(\delta r)^2$ is tiny.

Note that the terms in square brackets in the above expressions cancel because $r_c$ and $\phi_c$ satisfy the original equations of motion. This is a generic feature of perturbation theory – even if we do not perturb around zero as in the pendulum example, the terms with no $\delta$'s should all cancel.

Consider the simplest perturbation where δj=0. Then equation (303) becomes to leading order:

$$\delta\ddot r + \left(\frac{3 j_c^2}{r_c^4} + V''(r_c)\right)\delta r = 0, \tag{304}$$

yielding the solution $\delta r = A e^{i\omega t}$ with angular frequency $\omega$, where $\omega^2 = 3 j_c^2/r_c^4 + V''(r_c)$. In other words, $r$ oscillates around $r_c$ at a particular frequency. This is the famous result that mildly non-circular orbits can be thought of as “epicycles” around the circular orbit. Similarly (302) implies that $\delta\phi = B e^{i\omega t}$, meaning $\phi$ also oscillates around $\phi_c$ at the epicyclic frequency.
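As a numerical sanity check of the epicyclic frequency, one can integrate the full radial equation (300) for a slightly non-circular orbit and time the radial oscillation. The sketch below is illustrative only: the logarithmic potential $V(r) = v_0^2 \ln r$ is my own choice of test case (it is not discussed in these notes), and the function names, step size and perturbation size `eps` are assumptions.

```python
import math

def epicyclic_frequency(r_c, j_c, d2V):
    """Linear-theory prediction: omega^2 = 3 j_c^2 / r_c^4 + V''(r_c)."""
    return math.sqrt(3.0 * j_c**2 / r_c**4 + d2V(r_c))

def radial_accel(r, j, dV):
    """Right-hand side of the full radial equation: j^2 / r^3 - V'(r)."""
    return j**2 / r**3 - dV(r)

def measured_radial_frequency(r_c, j_c, dV, eps=1e-3, dt=1e-3, t_max=100.0):
    """Integrate the full radial equation with RK4 at fixed j = j_c (delta j = 0),
    starting at rest at r = r_c (1 + eps), and return 2*pi / (radial period)."""
    r, v, t = r_c * (1.0 + eps), 0.0, 0.0
    seen_minimum = False
    while t < t_max:
        k1r, k1v = v, radial_accel(r, j_c, dV)
        k2r, k2v = v + 0.5 * dt * k1v, radial_accel(r + 0.5 * dt * k1r, j_c, dV)
        k3r, k3v = v + 0.5 * dt * k2v, radial_accel(r + 0.5 * dt * k2r, j_c, dV)
        k4r, k4v = v + dt * k3v, radial_accel(r + dt * k3r, j_c, dV)
        r_new = r + dt * (k1r + 2 * k2r + 2 * k3r + k4r) / 6.0
        v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        if v < 0.0 <= v_new:
            seen_minimum = True            # passed the minimum radius
        elif seen_minimum and v > 0.0 >= v_new:
            # back at maximum radius: interpolate the zero of the radial velocity
            t_cross = t + dt * v / (v - v_new)
            return 2.0 * math.pi / t_cross
        r, v, t = r_new, v_new, t + dt
    raise RuntimeError("no complete radial oscillation found")

if __name__ == "__main__":
    v0, r_c = 1.0, 2.0                     # illustrative values
    dV = lambda r: v0**2 / r               # V'(r) for V = v0^2 ln r
    d2V = lambda r: -v0**2 / r**2          # V''(r)
    j_c = v0 * r_c                         # from j_c^2 / r_c^3 = V'(r_c)
    print("linear theory:", epicyclic_frequency(r_c, j_c, d2V))
    print("measured:     ", measured_radial_frequency(r_c, j_c, dV))
```

For this test potential the prediction reduces to $\omega = \sqrt{2}\,v_0/r_c$, and the measured value should agree up to corrections of order `eps` squared.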

😇 Exercise 11A Ensure you can reproduce these results. What is special about the Keplerian case (i.e. where the potential corresponds to a point mass) to give rise to closed, elliptical orbits for planets etc?

11.3 Deriving perturbation equations from a Lagrangian

Typically one is told never to back-substitute solutions into a Lagrangian because it messes up functional dependencies which are necessary for the operation of the Euler-Lagrange equations. However, in spite of this, it is in fact valid to substitute perturbations around a particular background solution back into the Lagrangian (technically speaking this all turns out OK because the background solution has no functional dependence on the perturbation terms), provided you are only using it to derive the perturbation equations, i.e. you cannot then derive information about the original background solution from the new Lagrangian.

Let’s see this in practice for the orbital example just given. The new Lagrangian is

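$$\mathcal{L} = \tfrac{1}{2}(r_c + \delta r)^2(\dot\phi_c + \delta\dot\phi)^2 + \tfrac{1}{2}(\dot r_c + \delta\dot r)^2 - V(r_c + \delta r). \tag{305}$$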

We can expand this order-by-order into $\mathcal{L}_0$, which contains no perturbation terms; $\mathcal{L}_1$, which contains only linear-order terms in the perturbation variables; $\mathcal{L}_2$, containing quadratic perturbations; and so on. By design, $\mathcal{L}_0$ just regenerates the original Lagrangian (299) with the terms renamed to $r_c$ and $\phi_c$. Then we have

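$$\mathcal{L}_1 = r_c\dot\phi_c^2\,\delta r + r_c^2\dot\phi_c\,\delta\dot\phi + \dot r_c\,\delta\dot r - V'(r_c)\,\delta r = \frac{d}{dt}\left(j_c\,\delta\phi\right), \tag{306}$$

where the second equality follows by substituting the background solution $\dot r_c = 0$, $r_c^2\dot\phi_c = j_c$ and $j_c^2/r_c^3 = V'(r_c)$. Thus $\mathcal{L}_1$ is a total derivative and makes no contribution to the action, i.e. if you apply the Euler-Lagrange equations to $\mathcal{L}_1$ you’ll find the result boils down to $0 = 0$!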

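This is no coincidence. To get linear perturbation theory we need the quadratic perturbation Lagrangian $\mathcal{L}_2$. With suitable simplifications this turns out to be:

$$\mathcal{L}_2 = \frac{1}{2}\left[\frac{j_c^2}{r_c^4} - V''(r_c)\right]\delta r^2 + \frac{1}{2} r_c^2\,\delta\dot\phi^2 + \frac{1}{2}\delta\dot r^2 + \frac{2 j_c}{r_c}\,\delta r\,\delta\dot\phi. \tag{307}$$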
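The “suitable simplifications” can be sketched as follows. This is my own expansion of (305), keeping only terms quadratic in the perturbations and assuming the circular background ($\dot r_c = 0$, $\dot\phi_c = j_c/r_c^2$); the notation $|_2$ for “the second-order part of” is introduced here purely for illustration.

```latex
\begin{align*}
\tfrac{1}{2}(r_c+\delta r)^2(\dot\phi_c+\delta\dot\phi)^2\Big|_2
  &= \tfrac{1}{2}\dot\phi_c^2\,\delta r^2
   + 2 r_c\dot\phi_c\,\delta r\,\delta\dot\phi
   + \tfrac{1}{2} r_c^2\,\delta\dot\phi^2 \\
  &= \tfrac{1}{2}\frac{j_c^2}{r_c^4}\,\delta r^2
   + \frac{2 j_c}{r_c}\,\delta r\,\delta\dot\phi
   + \tfrac{1}{2} r_c^2\,\delta\dot\phi^2 , \\
\tfrac{1}{2}(\dot r_c+\delta\dot r)^2\Big|_2
  &= \tfrac{1}{2}\,\delta\dot r^2 , \\
-V(r_c+\delta r)\Big|_2
  &= -\tfrac{1}{2} V''(r_c)\,\delta r^2 .
\end{align*}
```

Adding the three contributions gives the quadratic Lagrangian, with the cross term $\delta r\,\delta\dot\phi$ coming entirely from the angular kinetic energy.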

😇 Exercise 11B

Show that applying the Euler-Lagrange equations to $\mathcal{L}_2$ generates the correct perturbed equations of motion (302) and (303).

The strategy when dealing with perturbation theory is therefore to obtain the second-order expansion of the Lagrangian and simplify it using the background equations of motion. If desired, one can obtain the perturbation equations of motion in this way – but even more importantly, we can calculate the conjugate momentum to a given perturbation variable which is a fundamental quantity when we move from classical to quantum theories.

11.4 Perturbation theory for fields

The basic ideas behind perturbation theory are no different for fields. Given an action for a field $\varphi(\boldsymbol{x},t)$, one can find a background solution $\varphi_0(\boldsymbol{x},t)$ then expand the equations of motion to first order in perturbations $\delta\varphi(\boldsymbol{x},t)$. Alternatively – or necessarily if we wish to quantize the fields – one can find a perturbation Lagrangian too. Just as above, that is obtained by expanding the original Lagrangian to second order in $\delta\varphi$; once the background equations are substituted, we will be left with $\mathcal{L}_2$ alone.

One significant point to be aware of is that, in the cases we will study, the background solution is normally homogeneous (in space, but not in time). For example, we will consider the inflationary field expanding around the homogeneous solutions we have already studied (Chapter 9). In such a case, it is highly advantageous to work in Fourier space. The equation of motion will be linear in $\delta\varphi$ but in general will have derivatives both in time and space. Consider, as a simplified example, the wave equation (with primes denoting derivatives with respect to $t$):

$$\delta\varphi''(\boldsymbol{x},t) - \nabla^2\delta\varphi(\boldsymbol{x},t) = 0. \tag{308}$$

The usual trick to solve such partial differential equations is to Fourier transform them in the spatial domain; that is, we multiply equation (308) by $e^{-i\boldsymbol{k}\cdot\boldsymbol{x}}$ and integrate over all $\boldsymbol{x}$. The Fourier transform of $\nabla^2$ was discussed under Eq. (271) where we noted that one picks up a $-k^2$ factor. So the Fourier-domain equivalent of (308) is:

$$\delta\varphi''(\boldsymbol{k},t) + k^2\,\delta\varphi(\boldsymbol{k},t) = 0. \tag{309}$$

This is now an ordinary differential equation in time $t$. Although the spatial dependence is still encoded in $\boldsymbol{k}$ there are no derivatives with respect to $\boldsymbol{k}$, making the solutions much easier to find. In fact, we can immediately read off that $\delta\varphi(\boldsymbol{k},t) \propto \exp(\pm i k t)$ because the equation just looks like an independent harmonic oscillator at each value of $\boldsymbol{k}$. This decoupling of a partial differential equation into a set of independent ordinary differential equations makes things much more tractable.
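The decoupling can be seen in action numerically. The sketch below is a simplified illustration under my own assumptions, not part of the course material: it works in one spatial dimension on a periodic box, starts the field at rest ($\delta\varphi' = 0$ at $t = 0$), and uses NumPy's discrete Fourier transform in place of the continuous one. Each mode is then evolved independently as a harmonic oscillator, and the result is compared with the exact d'Alembert solution $\tfrac{1}{2}[f(x-t) + f(x+t)]$.

```python
import numpy as np

def evolve_wave_fourier(f0, box_length, t):
    """Evolve the 1D wave equation on a periodic box from delta_phi(x, 0) = f0
    with zero initial time derivative, treating every Fourier mode as an
    independent harmonic oscillator of angular frequency |k|."""
    n = f0.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)  # wavenumber of each mode
    f0_k = np.fft.fft(f0)
    ft_k = f0_k * np.cos(k * t)   # solution of mode equation with zero initial velocity
    return np.fft.ifft(ft_k).real

if __name__ == "__main__":
    box_length, n, t = 2.0 * np.pi, 128, 0.7          # illustrative values
    x = np.linspace(0.0, box_length, n, endpoint=False)
    f = lambda x: np.exp(np.cos(x))                    # smooth periodic test profile
    numerical = evolve_wave_fourier(f(x), box_length, t)
    exact = 0.5 * (f(x - t) + f(x + t))                # d'Alembert solution
    print("max abs difference:", np.max(np.abs(numerical - exact)))
```

No time-stepping is needed at all: because the modes do not talk to each other, the solution at any time follows from one multiplication per mode.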

A possible perturbation action giving rise to the equation of motion (308) reads:

$$S = \frac{1}{2}\int d^4x\,\left(|\delta\varphi'|^2 - |\nabla\delta\varphi|^2\right). \tag{310}$$

This action can be re-expressed in Fourier space; with care you can show that

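$$S = \frac{1}{2}\int dt\,d^3\boldsymbol{k}\,\left(|\delta\varphi'(\boldsymbol{k},t)|^2 - k^2|\delta\varphi(\boldsymbol{k},t)|^2\right) \tag{311}$$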

and one may verify this gives rise to the correct Fourier-domain equation of motion (309).

One can thus think of the Lagrangian separately for each wavemode 𝒌 as

$$\mathcal{L}(\delta\varphi,\delta\varphi') = \frac{1}{2}\left(|\delta\varphi'(\boldsymbol{k},t)|^2 - k^2|\delta\varphi(\boldsymbol{k},t)|^2\right), \tag{312}$$

at which point we have decoupled the infinite degrees of freedom in the field – there is nothing in the action coupling $\delta\varphi(\boldsymbol{k})$ to $\delta\varphi(\boldsymbol{k}')$ whatsoever. Fields start to look very much like a set (admittedly infinite in size) of non-interacting variables. This will be immensely helpful in deriving some basic quantum field theory results for inflation (Chapter 13).

What, really, is equation (312)?

Despite the neatness of decoupling degrees of freedom in the field like this, do be aware there is the potential for confusion. For quantising true single-degree-of-freedom systems (like a point particle) the action is an integral over time alone. But our action is an integral over 𝒌 too. Thus (312) is a Lagrangian density in Fourier space – not a true Lagrangian.

We will get away in this course without worrying about that distinction too carefully but to get the normalization of the cosmological power spectrum correct, we need to derive the relationship between the Lagrangian and its density more carefully.

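We start by considering a single degree of freedom at a time. Instead of formally Fourier transforming $\delta\varphi$, consider isolating the wavemode $\boldsymbol{k}_0$ by setting

$$\delta\varphi(\boldsymbol{x},t) = \frac{1}{\sqrt{2N}}\left(A(t)\,e^{i\boldsymbol{k}_0\cdot\boldsymbol{x}} + A(t)^*\,e^{-i\boldsymbol{k}_0\cdot\boldsymbol{x}}\right), \tag{313}$$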

where A(t) is the complex wave amplitude, N is a normalization that we will determine, and we include the second term to ensure the wave in δφ is real.

By substituting this ansatz for δφ into the real-space action (310) we obtain the Lagrangian

$$L(A,A') = \left(\frac{\int d^3x}{N}\right)\times\frac{1}{2}\left(|A'|^2 - k_0^2|A|^2\right). \tag{314}$$

This looks extremely similar to (312) with $A = \delta\varphi(\boldsymbol{k}_0,t)$; in fact, it allows us to pretend that (312) is a Lagrangian if we just set $N = \int d^3x = (2\pi)^3\delta(0)$. (Technically this makes $N$ infinite, but if you want to make it completely rigorous you can take limits as the volume reaches infinity.)
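The step from the ansatz (313) to the Lagrangian (314) can be sketched as follows. This is my own working, assuming the prefactor in (313) is $1/\sqrt{2N}$ and that $\boldsymbol{k}_0 \neq 0$, so that terms oscillating as $e^{\pm 2i\boldsymbol{k}_0\cdot\boldsymbol{x}}$ vanish when integrated over all space.

```latex
\begin{align*}
(\delta\varphi')^2
  &= \frac{1}{2N}\left(2|A'|^2 + A'^2 e^{2i\boldsymbol{k}_0\cdot\boldsymbol{x}}
     + (A'^*)^2 e^{-2i\boldsymbol{k}_0\cdot\boldsymbol{x}}\right), \\
|\nabla\delta\varphi|^2
  &= \frac{k_0^2}{2N}\left(2|A|^2 - A^2 e^{2i\boldsymbol{k}_0\cdot\boldsymbol{x}}
     - (A^*)^2 e^{-2i\boldsymbol{k}_0\cdot\boldsymbol{x}}\right), \\
S &= \frac{1}{2}\int dt \int d^3x \left((\delta\varphi')^2 - |\nabla\delta\varphi|^2\right)
   = \int dt \left(\frac{\int d^3x}{N}\right)\frac{1}{2}\left(|A'|^2 - k_0^2|A|^2\right).
\end{align*}
```

The factor of 2 multiplying $|A'|^2$ and $|A|^2$ cancels the 2 in the prefactor $1/(2N)$, which is why the normalization of the ansatz carries a $\sqrt{2}$.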

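We will be able to calculate the minimum amplitude of fluctuations, $|A|^2$, and will want to use this to calculate the power spectrum. Starting from (313), we find the following expression for $\delta\varphi(\boldsymbol{k},t)$:

$$\delta\varphi(\boldsymbol{k},t) = \frac{(2\pi)^{3/2}}{\sqrt{2N}}\,A\,\delta(\boldsymbol{k}_0 + \boldsymbol{k}) + \frac{(2\pi)^{3/2}}{\sqrt{2N}}\,A^*\,\delta(\boldsymbol{k}_0 - \boldsymbol{k}). \tag{315}$$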

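Substituting this into the definition of the power spectrum (278) and integrating over $\boldsymbol{k}$ and $\boldsymbol{k}'$ produces the expression

$$\frac{2\pi^2}{k_0^3}\,P_{\delta\varphi}(k_0) = \frac{|A|^2}{2N}\,\delta(0)$$
$$P_{\delta\varphi}(k_0) = \frac{k_0^3}{4\pi^2}\,|A|^2, \tag{316}$$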

which allows us to skip straight from fluctuations in the single mode with amplitude A to the power spectrum.

For conceptual simplicity we will think of equations like (312) as Lagrangians in themselves and not dwell on the points above; we will just need the result (316) which anyway doesn’t look particularly different from the definition (278) – it differs just by a delta function and a factor 2.

11.5 Summary

  • Perturbation theory is important as a means to understand how equations behave when we cannot solve them exactly.

  • It can be extremely accurate in regimes where the perturbations are sufficiently small, e.g. for the CMB and early Universe.

  • Perturbation equations can be derived from Taylor-expanding equations of motion around a known “background” solution.

  • In the case of cosmology, the known background solution is the homogeneous FRW universe.

  • Similarly the Lagrangian can also be expanded around such a solution and used to derive the perturbation equations and conjugate momenta (this is important for our later quantum fluctuation analyses).

  • When perturbed fields are involved, it is often helpful to expand them in a Fourier basis because the perturbation equations then typically become ordinary differential equations in time (instead of partial differential equations in time and space).