PHAS0067: Advanced Physical Cosmology

10 Characterising Random Fields


The real universe is far from homogeneous and isotropic except on the largest scales. Figure 12 (left panel) shows slices through the 3D distribution of galaxy positions from the 2dF galaxy redshift survey out to a comoving distance of 600 Mpc. The distribution of galaxies is clearly not random; instead they are arranged into a delicate cosmic web with galaxies strung out along dense filaments and clustering at their intersections, leaving huge empty voids. However, if we smooth the picture on large scales (of order 100 Mpc) it starts to look much more homogeneous. Furthermore, we know from the CMB that the universe was smooth to around 1 part in $10^5$ at the time of recombination; see right panel of Fig. 12. The aim of the rest of the course is to study the growth of large-scale structure in an expanding universe through gravitational instability acting on small initial perturbations. We will then learn how these initial perturbations were likely produced by quantum effects during cosmological inflation.

Figure 12: (Left panel) Slices through the 3D map of galaxy positions from the 2dF galaxy redshift survey. Figure credit: 2dF. (Right panel) Fluctuations in the CMB temperature, as determined from Planck data, about the average temperature of 2.725 K. The fluctuations are at the level of only a few parts in $10^5$. Credit: Planck science team.

In this Chapter, we will explore the basic statistical properties enforced on such fields by assuming the physics that generates the initial fluctuations, and subsequently processes them, respects the symmetries of the background cosmology, i.e. isotropy and homogeneity. It seems inconsistent at first sight to talk about inhomogeneities/anisotropies being generated by a mechanism that is itself homogeneous/isotropic! But there is a key distinction: it is homogeneous and isotropic on average over all possible universes but not for a specific universe such as our own.

Throughout, we denote expectation values with angle brackets, $\langle \cdots \rangle$. For some quantity $X$, we imagine measuring it in $N$ different universes and getting different results $X_1, X_2, \ldots, X_N$. Then the expectation value $\langle X \rangle$ is defined by

$$\langle X \rangle \equiv \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} X_i. \tag{265}$$

To be clear, one will never be able to actually measure $\langle X \rangle$ (irrespective of what $X$ represents) because it requires access to an infinity of different universes. But it is still a useful theoretical construction, as we shall see.
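As a toy illustration of definition (265), the sketch below replaces "universes" with independent pseudo-random draws and shows the sample mean approaching the expectation value as $N$ grows. The choice $X = g^2$ with $g$ a unit Gaussian (so that $\langle X \rangle = 1$), the seed and the sample sizes are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def ensemble_mean(n_universes):
    """Average X = g**2 over n_universes independent draws of a unit Gaussian g."""
    g = rng.standard_normal(n_universes)
    return np.mean(g**2)

# The exact expectation value is <X> = 1; a finite N only approximates it.
estimates = {n: ensemble_mean(n) for n in (10, 1_000, 100_000)}
for n, value in estimates.items():
    print(f"N = {n:>6d}: sample mean = {value:.4f}")
```

The scatter about 1 shrinks roughly as $1/\sqrt{N}$, which is why the limit $N \to \infty$ appears in the definition.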

10.1 The two-point correlation function in 3D Euclidean space

Consider a random field $f(\mathbf{x})$ (i.e. at each point $f(\mathbf{x})$ is some random number) with zero mean, $\langle f(\mathbf{x}) \rangle = 0$. Correlators of fields are expectation values of products of fields at different spatial points (and, possibly, different times). They are important because typically they can be calculated theoretically, and can be estimated observationally; they are the essential link between observation and theory (see Section 10.6 below).

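The two-point correlator $\xi_f$ of the field $f$ is defined by

$$\xi_f(\mathbf{x},\mathbf{y}) \equiv \langle f(\mathbf{x}) f(\mathbf{y}) \rangle \equiv \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} f_i(\mathbf{x}) f_i(\mathbf{y}), \tag{266}$$

where again $f_i(\mathbf{x})$ is the field $f$ evaluated at $\mathbf{x}$ in the $i$th notional universe.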

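We can now explore some general properties of $\xi_f(\mathbf{x},\mathbf{y})$. Statistical homogeneity means that the statistical properties of the translated field are the same as the original field. For the two-point correlation, this implies

$$\begin{aligned}
\xi_f(\mathbf{x},\mathbf{y}) &= \xi_f(\mathbf{x}-\mathbf{a},\mathbf{y}-\mathbf{a}) \quad \text{for any displacement } \mathbf{a} \\
\Rightarrow \quad \xi_f(\mathbf{x},\mathbf{y}) &= \xi_f(\mathbf{x}-\mathbf{y},\mathbf{0}) \quad \text{(taking } \mathbf{a}=\mathbf{y}\text{)},
\end{aligned} \tag{267}$$

so the two-point correlator only depends on the separation of the two points. As a shorthand, we therefore can write $\xi_f(\mathbf{x},\mathbf{y}) = \xi_f(\mathbf{x}-\mathbf{y})$, discarding the trailing $\mathbf{0}$.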

Statistical isotropy means that the statistical properties of the rotated field are the same as the original field. For the two-point correlator, we must have

$$\xi_f(\mathbf{x},\mathbf{y}) = \xi_f(\mathsf{R}\mathbf{x},\mathsf{R}\mathbf{y}) \quad \text{for any rotation matrix } \mathsf{R}. \tag{268}$$

Combining statistical homogeneity and isotropy gives

$$\xi_f(\mathbf{x}-\mathbf{y}) = \xi_f(\mathsf{R}(\mathbf{x}-\mathbf{y})) \quad \text{for any rotation matrix } \mathsf{R}, \tag{269}$$

so the two-point correlator depends only on the distance between the two points. We therefore write $\xi_f(\mathbf{x},\mathbf{y}) = \xi_f(|\mathbf{x}-\mathbf{y}|)$, both discarding the trailing $\mathbf{0}$ (as before) and making clear that $\xi_f$ is a function only of the scalar separation of $\mathbf{x}$ and $\mathbf{y}$, not of orientation. You can verify that this holds even if correlating different fields (or the same field at different times).
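The consequence of statistical homogeneity can be checked numerically with a small ensemble. The sketch below is a one-dimensional, periodic toy model (white noise smoothed with a Gaussian kernel); the grid size, kernel width, number of simulated universes and the helper name `xi` are all assumptions chosen for illustration, not part of the course material. It estimates $\langle f(x) f(y) \rangle$ for two pairs of points with the same separation and one pair with a different separation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_points, n_universes = 64, 20_000

# White noise smoothed with a periodic Gaussian kernel: a statistically homogeneous 1D field.
x = np.arange(n_points)
dist = np.minimum(x, n_points - x)            # periodic distance from grid point 0
kernel = np.exp(-0.5 * (dist / 3.0) ** 2)
noise = rng.standard_normal((n_universes, n_points))
fields = np.fft.ifft(np.fft.fft(noise, axis=1) * np.fft.fft(kernel), axis=1).real

def xi(i, j):
    """Ensemble estimate of <f(x_i) f(x_j)> over the simulated universes."""
    return np.mean(fields[:, i] * fields[:, j])

# Same separation (5 grid cells) at two different positions, plus a different separation.
xi_a, xi_b, xi_c = xi(10, 15), xi(40, 45), xi(10, 25)
print(f"xi(10,15) = {xi_a:.3f}, xi(40,45) = {xi_b:.3f}, xi(10,25) = {xi_c:.3f}")
```

The first two estimates agree up to sampling noise because they share a separation, while the third differs; no single realisation would show this, only the ensemble average.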

10.2 Random fields in Fourier space

We can repeat these arguments to constrain the form of the correlators in Fourier space, which we will use in preference to the real-space results above. We adopt the symmetric Fourier convention, so that

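$$f(\mathbf{k}) = \int \frac{\mathrm{d}^3\mathbf{x}}{(2\pi)^{3/2}}\, f(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}} \quad \text{and} \quad f(\mathbf{x}) = \int \frac{\mathrm{d}^3\mathbf{k}}{(2\pi)^{3/2}}\, f(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x}}. \tag{270}$$

Note that we don’t use a different notation for the function $f(\mathbf{k})$; whether we are talking about the function evaluated in the Fourier or real-space domain is normally very clear from the context. While some textbooks use a different symbol such as $\tilde{f}(\mathbf{k})$, this becomes extremely fussy for complicated expressions and so we will not bother.

For real fields $f(\mathbf{x})$ it follows immediately from the definitions above that $f(\mathbf{k}) = f^*(-\mathbf{k})$. Also recall that spatial derivatives turn into factors of $i\mathbf{k}$ in the Fourier transform, e.g. integrating by parts, the Fourier transform of $\nabla f(\mathbf{x})$ is

$$\begin{aligned}
\int \frac{\mathrm{d}^3\mathbf{x}}{(2\pi)^{3/2}} \left(\nabla f(\mathbf{x})\right) e^{-i\mathbf{k}\cdot\mathbf{x}} &= -\int \frac{\mathrm{d}^3\mathbf{x}}{(2\pi)^{3/2}}\, f(\mathbf{x})\, \nabla e^{-i\mathbf{k}\cdot\mathbf{x}} \\
&= i\mathbf{k} \int \frac{\mathrm{d}^3\mathbf{x}}{(2\pi)^{3/2}}\, f(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}} \\
&= i\mathbf{k}\, f(\mathbf{k}).
\end{aligned} \tag{271}$$

Similarly the Fourier transform of $\nabla^2 f(\mathbf{x})$ is $-\mathbf{k}^2 f(\mathbf{k})$. These properties will be important to us when we develop perturbation theory in Sections 11 and 13, allowing us to replace differential equations in real space $\mathbf{x}$ with algebraic ones in Fourier space $\mathbf{k}$.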
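The derivative rule is easy to verify on a grid. The following sketch uses NumPy's discrete FFT in one dimension, which differs from the symmetric continuous convention in (270) only by normalisation factors that cancel in the round trip. The test function $e^{\sin x}$ and the grid size are arbitrary assumptions.

```python
import numpy as np

n, length = 256, 2 * np.pi
x = np.arange(n) * length / n
f = np.exp(np.sin(x))                             # smooth periodic test function (arbitrary choice)

k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)   # angular wavenumbers of the FFT grid
f_k = np.fft.fft(f)

df_spectral = np.fft.ifft(1j * k * f_k).real      # first derivative: multiply by i k
d2f_spectral = np.fft.ifft(-(k**2) * f_k).real    # second derivative: multiply by -k^2

df_exact = np.cos(x) * f
d2f_exact = (np.cos(x) ** 2 - np.sin(x)) * f
err_first = np.max(np.abs(df_spectral - df_exact))
err_second = np.max(np.abs(d2f_spectral - d2f_exact))
print(f"max error: first derivative {err_first:.2e}, second derivative {err_second:.2e}")
```

Both errors sit near machine precision, which is the practical payoff of trading differential operators for algebraic factors.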

For now, we need to focus on a different property; namely, under translations, the Fourier transform acquires a phase factor:

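$$\begin{aligned}
f_{\rm translated}(\mathbf{k}) &= \int \frac{\mathrm{d}^3\mathbf{x}}{(2\pi)^{3/2}}\, f(\mathbf{x}-\mathbf{a})\, e^{-i\mathbf{k}\cdot\mathbf{x}} \\
&= \int \frac{\mathrm{d}^3\mathbf{x}'}{(2\pi)^{3/2}}\, f(\mathbf{x}')\, e^{-i\mathbf{k}\cdot\mathbf{x}'} e^{-i\mathbf{k}\cdot\mathbf{a}} \\
&= f(\mathbf{k})\, e^{-i\mathbf{k}\cdot\mathbf{a}},
\end{aligned} \tag{272}$$

where $\mathbf{x}' = \mathbf{x} - \mathbf{a}$.

Let’s consider the two-point correlator in Fourier space. We could write this as $\langle f(\mathbf{k}) f(\mathbf{k}') \rangle$, but by convention we instead consider $\langle f(\mathbf{k}) f^*(\mathbf{k}') \rangle$. (The two options contain equivalent information, but the second turns out to lead to simpler mathematical expressions.) Invariance of the two-point correlator in Fourier space under translations implies

$$\begin{aligned}
\langle f(\mathbf{k}) f^*(\mathbf{k}') \rangle &= \langle f(\mathbf{k}) f^*(\mathbf{k}') \rangle\, e^{-i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{a}} \quad \text{for any displacement } \mathbf{a} \\
\Rightarrow \quad \langle f(\mathbf{k}) f^*(\mathbf{k}') \rangle &= F(\mathbf{k})\, \delta(\mathbf{k}-\mathbf{k}'),
\end{aligned} \tag{273}$$

for some function $F(\mathbf{k})$, where $\delta(\mathbf{k}-\mathbf{k}')$ indicates the Dirac delta function, which is zero when $\mathbf{k} \neq \mathbf{k}'$. The upshot is that different Fourier modes are necessarily uncorrelated due to statistical homogeneity. Under rotations,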
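Both statements, the translation phase factor (272) and the vanishing correlation between different modes (273), can be checked on a discrete periodic grid. The sketch below is a 1D toy model using NumPy's FFT; the Gaussian filter, grid size, mode indices and ensemble size are assumptions chosen only to make the example small.

```python
import numpy as np

rng = np.random.default_rng(2)
n_points, n_universes = 32, 20_000

# Homogeneous 1D periodic fields: white noise filtered in Fourier space (arbitrary filter).
k = 2 * np.pi * np.fft.fftfreq(n_points)
noise_k = np.fft.fft(rng.standard_normal((n_universes, n_points)), axis=1)
f_k = noise_k * np.exp(-0.5 * (2.0 * k) ** 2)
fields = np.fft.ifft(f_k, axis=1).real

# (i) Translating by `shift` grid cells multiplies each mode by exp(-i k shift), as in (272).
shift = 5
shifted_k = np.fft.fft(np.roll(fields, shift, axis=1), axis=1)
phase_error = np.max(np.abs(shifted_k - f_k * np.exp(-1j * k * shift)))

# (ii) Ensemble average <f(k) f*(k')>: large for k = k', consistent with zero otherwise.
same_mode = np.mean(f_k[:, 3] * np.conj(f_k[:, 3])).real
different_modes = abs(np.mean(f_k[:, 3] * np.conj(f_k[:, 4])))
print(f"phase error = {phase_error:.2e}")
print(f"<|f(k3)|^2> = {same_mode:.3f}, |<f(k3) f*(k4)>| = {different_modes:.3f}")
```

On a finite grid the Dirac delta becomes a Kronecker delta, so the "uncorrelated" statement shows up as an off-diagonal average that shrinks towards zero as the ensemble grows.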

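$$\begin{aligned}
f_{\rm rotated}(\mathbf{k}) &= \int \frac{\mathrm{d}^3\mathbf{x}}{(2\pi)^{3/2}}\, f(\mathsf{R}\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}} \\
&= \int \frac{\mathrm{d}^3\mathbf{x}'}{(2\pi)^{3/2}}\, f(\mathbf{x}')\, e^{-i\mathbf{k}\cdot(\mathsf{R}^{-1}\mathbf{x}')},
\end{aligned} \tag{274}$$

where $\mathbf{x}' = \mathsf{R}\mathbf{x}$. Now, for any rotation matrix $\mathsf{R}$, we have $\mathsf{R}^{-1} = \mathsf{R}^{\top}$. So $-i\mathbf{k}\cdot\mathsf{R}^{-1}\mathbf{x}' = -i(\mathsf{R}\mathbf{k})\cdot\mathbf{x}'$, and

$$\begin{aligned}
f_{\rm rotated}(\mathbf{k}) &= \int \frac{\mathrm{d}^3\mathbf{x}'}{(2\pi)^{3/2}}\, f(\mathbf{x}')\, e^{-i(\mathsf{R}\mathbf{k})\cdot\mathbf{x}'} && (275) \\
&= f(\mathsf{R}\mathbf{k}). && (276)
\end{aligned}$$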

So demanding invariance of the two-point correlator under rotations implies

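$$F(\mathsf{R}\mathbf{k})\, \delta(\mathbf{k}-\mathbf{k}') = F(\mathbf{k})\, \delta(\mathbf{k}-\mathbf{k}') \quad \text{for any rotation matrix } \mathsf{R}. \tag{277}$$

(To get this result you need to use the property of the Dirac delta function $\delta(\mathsf{R}\mathbf{k}) = \delta(\mathbf{k})/|\det\mathsf{R}| = \delta(\mathbf{k})$.) Equation (277) can only hold if $F(\mathbf{k}) = F(k)$ where $k \equiv |\mathbf{k}|$. We can therefore define the power spectrum, $P_f(k)$, of a homogeneous and isotropic field, $f(\mathbf{x})$, by

$$\langle f(\mathbf{k}) f^*(\mathbf{k}') \rangle = \frac{2\pi^2}{k^3}\, P_f(k)\, \delta(\mathbf{k}-\mathbf{k}'). \tag{278}$$

The normalisation factor $2\pi^2/k^3$ in the definition of the power spectrum is conventional and has the virtue of making $P_f(k)$ dimensionless if $f(\mathbf{x})$ is.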

10.3 Relationship between the correlation function and the power spectrum

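The correlation function $\xi_f(r)$ and the power spectrum $P_f(k)$ carry the same physical information, just in a different form. To get the detailed relationship requires a bit of algebra. First, we write down the correlation function and substitute the inverse Fourier transforms (270):

$$\xi_f(\mathbf{x},\mathbf{y}) = \langle f(\mathbf{x}) f(\mathbf{y}) \rangle = \langle f(\mathbf{x}) f^*(\mathbf{y}) \rangle = \int \frac{\mathrm{d}^3\mathbf{k}}{(2\pi)^{3/2}} \int \frac{\mathrm{d}^3\mathbf{k}'}{(2\pi)^{3/2}}\, \langle f(\mathbf{k}) f^*(\mathbf{k}') \rangle\, e^{i\mathbf{k}\cdot\mathbf{x}} e^{-i\mathbf{k}'\cdot\mathbf{y}}, \tag{279}$$

where we made use of the fact that $f(\mathbf{y})$ is real to replace it with its own complex conjugate. (This is just a minor trick to simplify the algebra a little.) The term $\langle f(\mathbf{k}) f^*(\mathbf{k}') \rangle$ can be replaced by the power spectrum using equation (278), after which the $\mathbf{k}'$ integral can be performed. We also write $\mathbf{k} = k\hat{\mathbf{n}}$, where $k = |\mathbf{k}|$ and $\hat{\mathbf{n}}$ is a unit vector containing the direction information in $\mathbf{k}$, so that $\mathrm{d}^3\mathbf{k} = k^2\,\mathrm{d}k\,\mathrm{d}^2\hat{\mathbf{n}}$. This gets us to the following expression:

$$\xi_f(\mathbf{x},\mathbf{y}) = \frac{1}{4\pi} \int_0^\infty \frac{\mathrm{d}k}{k}\, P_f(k) \int \mathrm{d}^2\hat{\mathbf{n}}\; e^{ik\hat{\mathbf{n}}\cdot(\mathbf{x}-\mathbf{y})}. \tag{280}$$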

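We can evaluate the integral over $\hat{\mathbf{n}}$ by setting $\hat{\mathbf{n}}\cdot(\mathbf{x}-\mathbf{y}) = |\mathbf{x}-\mathbf{y}|\,\mu$; the $\hat{\mathbf{n}}$ integral then reduces to

$$2\pi \int_{-1}^{1} \mathrm{d}\mu\; e^{ik|\mathbf{x}-\mathbf{y}|\mu} = 4\pi\, \frac{\sin(k|\mathbf{x}-\mathbf{y}|)}{k|\mathbf{x}-\mathbf{y}|}. \tag{281}$$

It follows that the sought-after relationship between $P_f(k)$ and $\xi_f(r)$ is:

$$\xi_f(r) = \int_0^\infty \frac{\mathrm{d}k}{k}\, P_f(k)\, \frac{\sin(kr)}{kr}. \tag{282}$$

Note that this only depends on $r = |\mathbf{x}-\mathbf{y}|$, as required by the translational and rotational invariance that we discussed earlier.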
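Equation (282) is straightforward to evaluate numerically. The sketch below uses a hypothetical toy spectrum $P_f(k) = k\,e^{-ak}$, chosen only because the integral then has the closed form $\xi_f(r) = \arctan(r/a)/r$; it is not a physical model. The function name `xi_from_power`, the cutoff `k_max` and the grid resolution are likewise assumptions.

```python
import numpy as np

def xi_from_power(r, power, k_max=200.0, n_k=400_001):
    """Evaluate xi(r) = int_0^inf dk/k P(k) sin(kr)/(kr) with the trapezoidal rule."""
    k = np.linspace(1e-8, k_max, n_k)
    # np.sinc(x) = sin(pi x)/(pi x), so sinc(k r / pi) = sin(k r)/(k r).
    integrand = power(k) / k * np.sinc(k * r / np.pi)
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(k))

a = 1.0  # toy scale, an arbitrary choice

def toy_power(k):
    """Hypothetical dimensionless spectrum used only to test the integral."""
    return k * np.exp(-a * k)

checks = {}
for r in (0.5, 1.0, 3.0):
    numerical = xi_from_power(r, toy_power)
    analytic = np.arctan(r / a) / r   # closed form for this particular toy spectrum
    checks[r] = (numerical, analytic)
    print(f"r = {r}: numerical = {numerical:.6f}, analytic = {analytic:.6f}")
```

For realistic spectra the integrand oscillates and decays slowly, so a plain trapezoidal rule with a hard cutoff may need replacing by a more careful quadrature.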

10.4 Gaussianity

Inflation predicts fluctuations that are very nearly Gaussian and this property is preserved by linear evolution. The cosmic microwave background probes fluctuations mostly in this linear regime (see right panel of Fig. 12).

When we say that the field is Gaussian, we mean first that the probability of finding a particular value at a given location can be written as a Gaussian:

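$$P(f(\mathbf{x})) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{f(\mathbf{x})^2}{2\sigma^2}\right). \tag{283}$$

Since, from the definitions above, we must have $\langle f(\mathbf{x})^2 \rangle = \xi_f(0)$, the value of $\sigma$ is just given by $\sigma^2 = \xi_f(0)$. By saying the field is Gaussian we also mean that the probability density for the field taking on specified values at multiple points, say $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$, is given by a multivariate Gaussian, with covariance given by:

$$\mathsf{C} = \begin{pmatrix}
\xi_f(0) & \xi_f(|\mathbf{x}_2-\mathbf{x}_1|) & \cdots & \xi_f(|\mathbf{x}_N-\mathbf{x}_1|) \\
\xi_f(|\mathbf{x}_2-\mathbf{x}_1|) & \xi_f(0) & \cdots & \xi_f(|\mathbf{x}_N-\mathbf{x}_2|) \\
\vdots & \vdots & \ddots & \vdots \\
\xi_f(|\mathbf{x}_N-\mathbf{x}_1|) & \xi_f(|\mathbf{x}_N-\mathbf{x}_2|) & \cdots & \xi_f(0)
\end{pmatrix}, \tag{284}$$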

i.e. fully specified just by the correlation function. So the real point of a Gaussian field is that these 2-point correlations capture all the information. There is no further information in higher order correlators.
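To make the role of the covariance matrix (284) concrete, the sketch below builds it for three points and draws field values from the corresponding multivariate Gaussian. The correlation function $\xi_f(r) = e^{-r^2/2}$ is a hypothetical choice (picked because it yields a valid, positive-definite covariance), and the point positions and sample count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)

def xi_toy(r):
    """Hypothetical correlation function, chosen only because it gives a valid covariance."""
    return np.exp(-0.5 * r**2)

# Three sample points in 3D (arbitrary positions).
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
separations = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
covariance = xi_toy(separations)           # matrix (284): entries xi(|x_i - x_j|)

# Each row is the field sampled at the three points in one notional universe.
samples = rng.multivariate_normal(np.zeros(len(points)), covariance, size=200_000)
sample_covariance = samples.T @ samples / len(samples)
print(np.round(covariance, 3))
print(np.round(sample_covariance, 3))
```

The sample covariance recovers the input matrix up to sampling noise, illustrating that for a Gaussian field the correlation function is all that is needed to generate realisations.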

Why should the fluctuations emerging from inflation be nearly Gaussian? If you have studied quantum field theory (QFT), the answer is pretty clear: when we calculate the 2-point correlator of a quantum field, $\langle \phi(\mathbf{x}_1)\phi(\mathbf{x}_2) \rangle$, the result is given to us by the QFT propagator. But when we calculate higher-order correlations such as $\langle \phi(\mathbf{x}_1)\phi(\mathbf{x}_2)\phi(\mathbf{x}_3) \rangle$, we turn to interacting field theory where there are not just propagators but also interaction vertices; these carry at least one factor of the interaction strength, which is typically small.

Non-linear structure formation at late times destroys Gaussianity and gives the filamentary cosmic web (see left panel of Fig. 12). So the late time universe is certainly non-Gaussian. But searching for primordial non-Gaussianity to probe departures from simple inflation is a hot topic; no convincing evidence for primordial non-Gaussianity has yet been found.

10.5 Random fields on the sphere

Physical fields (in a flat universe) can be represented in Fourier or real space as described above. We’ve seen that an ability to work in Fourier space is essential because it allows us to encode notions like homogeneity and isotropy of the field statistics in a succinct way – equation (278).

However observations at their root consist of a field associated with position on the sky – we can think of observations as measured fields on the surface of a sphere. For that reason we need to be able to express fields on the surface of a sphere and manipulate them with the same sophistication as we can for fields in flat space.

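Spherical harmonics are basis functions on the sphere that are the closest possible equivalent to Fourier modes. Functions $f$ of position on the sphere $\hat{\mathbf{n}}$ can be expanded as

$$f(\hat{\mathbf{n}}) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} f_{\ell m}\, Y_{\ell m}(\hat{\mathbf{n}}). \tag{285}$$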

In this expression and others involving spherical harmonics we normally don’t use the summation convention, i.e. repeated indices do not implicitly mean a sum. We instead write sums explicitly where required, as in Eq. (285). The $Y_{\ell m}$ are familiar from quantum mechanics as the position-space representation of the eigenstates of $\hat{L}^2 = -\nabla^2$ and $\hat{L}_z = -i\partial_\phi$:

$$\begin{aligned}
\nabla^2 Y_{\ell m} &= -\ell(\ell+1)\, Y_{\ell m} \\
\partial_\phi Y_{\ell m} &= i m\, Y_{\ell m},
\end{aligned} \tag{286}$$

(Footnote 23: Just to ward off potential confusion, there are various phase conventions for the $Y_{\ell m}$; here we adopt $Y_{\ell m}^* = (-1)^m Y_{\ell,-m}$ so that $f_{\ell m}^* = (-1)^m f_{\ell,-m}$ for a real field.)

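with $\ell$ an integer $\geq 0$ and $m$ an integer with $|m| \leq \ell$. This implies a typical wavelength of fluctuations for a given $\ell$: because $\nabla^2 Y_{\ell m} \sim -\ell^2\, Y_{\ell m}$ for large $\ell$, we are talking about fluctuations with an angular scale of

$$\delta\theta \sim \frac{2\pi}{\ell}. \tag{287}$$

Note the close analogy with Fourier modes in a flat space, where $\nabla^2 e^{i\mathbf{k}\cdot\mathbf{x}} = -\mathbf{k}^2 e^{i\mathbf{k}\cdot\mathbf{x}}$ and $\partial_x e^{i\mathbf{k}\cdot\mathbf{x}} = i k_x e^{i\mathbf{k}\cdot\mathbf{x}}$. In the Fourier case, the scale of fluctuations is $2\pi/|\mathbf{k}|$. So $\ell$ acts a bit like the wavenumber $|\mathbf{k}|$, whereas $m$ is a bit like $\hat{\mathbf{k}} \equiv \mathbf{k}/|\mathbf{k}|$ (i.e. $m$ contains information about directionality).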

Just like the Fourier modes, the spherical harmonics have a lot of special properties that help us manipulate them. They are orthonormal over the sphere,

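$$\int \mathrm{d}\hat{\mathbf{n}}\; Y_{\ell m}(\hat{\mathbf{n}})\, Y^*_{\ell' m'}(\hat{\mathbf{n}}) = \delta_{\ell\ell'}\, \delta_{mm'}, \tag{288}$$

so that the spherical multipole coefficients of $f(\hat{\mathbf{n}})$ are

$$f_{\ell m} = \int \mathrm{d}\hat{\mathbf{n}}\; f(\hat{\mathbf{n}})\, Y^*_{\ell m}(\hat{\mathbf{n}}). \tag{289}$$

In turn, this implies a completeness relationship

$$\sum_{\ell m} Y_{\ell m}(\hat{\mathbf{n}})\, Y^*_{\ell m}(\hat{\mathbf{n}}') = \delta(\hat{\mathbf{n}} - \hat{\mathbf{n}}'), \tag{290}$$

where $\delta(\hat{\mathbf{n}} - \hat{\mathbf{n}}')$ is the Dirac delta function on the sphere.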

There is a generalisation of equation (290) which sums only over the $m$s, not the $\ell$s. It is known as the spherical harmonic addition theorem and states that

$$\sum_{m} Y_{\ell m}(\hat{\mathbf{n}})\, Y^*_{\ell m}(\hat{\mathbf{n}}') = \frac{2\ell+1}{4\pi}\, P_\ell(\hat{\mathbf{n}}\cdot\hat{\mathbf{n}}'), \tag{291}$$

where $P_\ell(x)$ is the $\ell$th Legendre polynomial. Many different proofs of this statement are available, e.g. in quantum mechanics textbooks. They’re a bit tedious so here we’ll just take the result on trust (we’ll only need it once anyway).
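Although the proof is skipped, the addition theorem is easy to spot-check numerically in the simplest non-trivial case, $\ell = 1$, where $P_1(x) = x$. The sketch below writes out the three $\ell = 1$ harmonics explicitly in the Condon-Shortley phase convention (consistent with $Y_{\ell m}^* = (-1)^m Y_{\ell,-m}$) and takes the complex conjugate on the second factor of the sum. The two directions and the helper names `y1m` and `unit_vector` are arbitrary assumptions.

```python
import numpy as np

def y1m(theta, phi):
    """The three l = 1 spherical harmonics (m = -1, 0, +1), Condon-Shortley convention."""
    return np.array([
        np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(-1j * phi),   # m = -1
        np.sqrt(3 / (4 * np.pi)) * np.cos(theta),                        # m = 0
        -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi),   # m = +1
    ])

def unit_vector(theta, phi):
    """Cartesian unit vector for polar angle theta and azimuth phi."""
    return np.array([np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)])

# Two arbitrary directions on the sphere, given as (theta, phi).
dir_a, dir_b = (0.7, 1.9), (2.1, 0.4)

lhs = np.sum(y1m(*dir_a) * np.conj(y1m(*dir_b)))    # sum over m of Y_1m(n) Y*_1m(n')
cos_gamma = unit_vector(*dir_a) @ unit_vector(*dir_b)
rhs = 3 / (4 * np.pi) * cos_gamma                    # (2l+1)/(4 pi) P_1(n . n')
print(lhs, rhs)
```

The sum over $m$ comes out real and depends only on the angle between the two directions, which is the rotational invariance that the theorem encodes.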

What is the implication of statistical isotropy for the correlators of $f_{\ell m}$? We first need to know that, under a rotation, $Y_{\ell m} \to \sum_{m'} R^{\ell}_{mm'} Y_{\ell m'}$ for some rotation coefficients $R^{\ell}_{mm'}$; any given rotation of a general function $f(\hat{\mathbf{n}})$ can therefore be effected by transforming its spherical harmonic coefficients $f_{\ell m} \to \sum_{m'} R^{\ell}_{mm'} f_{\ell m'}$. There are a few helpful properties of the $R^{\ell}_{mm'}$:

  • They do not mix different $\ell$s (this is already built into the notation). If you’re curious to see why, you can show it from the commutation of $\hat{L}^2$ with the individual rotation generators.

  • It’s relatively straightforward to show that $\sum_{m} |R^{\ell}_{mm'}|^2 = 1$; this can be derived from the orthonormalisation of the $Y_{\ell m}$s themselves.

  • There is an extension of the previous statement that is even more powerful: namely, the $R$’s are orthonormalised such that $\sum_{m} R^{\ell}_{mm'} R^{\ell *}_{mm''} = \delta_{m'm''}$. Demonstrating this requires application of the completeness relation (290) and is recommended for enthusiasts only.

  • For a rotation of angle $\phi$ around the $z$ axis, $R^{\ell}_{mm'} = \delta_{mm'}\, e^{im\phi}$. This follows directly from equation (286).

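😇 Exercise 10A Use the above properties to show that, for $\langle f_{\ell m} \rangle$ and $\langle f_{\ell m} f^*_{\ell' m'} \rangle$ to be invariant under rotations, it is sufficient that

$$\langle f_{\ell m} \rangle = 0 \quad (\ell \neq 0) \tag{292}$$

$$\langle f_{\ell m} f^*_{\ell' m'} \rangle = C_\ell\, \delta_{\ell\ell'}\, \delta_{mm'} \tag{293}$$

for some $C_\ell$s, which we call the angular power spectrum of $f$.

Exercise 10B Show that the above conditions are not just sufficient but also necessary for isotropy.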

Equation (293) is the equivalent of equation (278); these are the two most useful and important results in this Section.

10.6 Estimating the properties of random fields

Random fields are defined by their ensemble average properties. If some element of randomness is involved in physics (e.g. because of quantum mechanics) we would ideally want to measure ensemble average outcomes rather than an individual outcome.

However in cosmology, this isn’t an option – we have just one universe and we’re stuck with it. Yet, if one makes sufficiently strong assumptions (normally homogeneity and isotropy), it’s still possible to make inferences about the random process. In effect, one splits up the data into lots of chunks and regards them as independent “trials” of the random process. This introduces inherent limits to what can be achieved – especially when we’re talking about the very largest scales of the universe. If and when we find something that seems strange or doesn’t fit the model, we must bear in mind that we might just be looking at an “unlucky” draw from the random process.

Later in the course we will especially look at the cosmic microwave background, which is an example of a random field on a notional sphere (the “surface of last scattering”). We want to be able to measure $C_\ell$, but in practice we can’t because $C_\ell \equiv \langle |f_{\ell m}|^2 \rangle$; we’re unable to take the required ensemble average from within our own universe.

Instead we construct an “estimator for $C_\ell$” which does what the name suggests: it estimates $C_\ell$ from a single observation. Such estimators are normally written $\hat{C}_\ell$; the simplest (and, in ideal circumstances, the best) example is:

$$\hat{C}_\ell = \frac{1}{2\ell+1} \sum_{m=-\ell}^{\ell} |f_{\ell m}|^2. \tag{294}$$

Exercise 10C

Show that the estimator $\hat{C}_\ell$ has expectation and variance:

$$\langle \hat{C}_\ell \rangle = C_\ell, \qquad \langle (\hat{C}_\ell - C_\ell)^2 \rangle = \frac{2C_\ell^2}{2\ell+1}. \tag{295}$$

You may want to use Wick’s theorem, which states that for zero-mean Gaussian random numbers $a$, $b$, $c$ and $d$, $\langle abcd \rangle = \langle ab \rangle\langle cd \rangle + \langle ac \rangle\langle bd \rangle + \langle ad \rangle\langle bc \rangle$.

The exercise above shows that on average, $\hat{C}_\ell$ will be close to $C_\ell$, especially as $\ell$ becomes large (this corresponds to small scales where there are lots of independent modes each sampling $C_\ell$). But for small $\ell$, the largest accessible angular scales, the standard deviation becomes comparable to $C_\ell$ itself. This inherent limitation in our ability to estimate $C_\ell$ (or other statistical properties of fields we observe) is known as “cosmic variance”.
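The scaling of cosmic variance with $\ell$ can be seen in a quick Monte Carlo experiment. The sketch below assumes a real Gaussian field, so that $f_{\ell 0}$ is real with variance $C_\ell$, while for $m > 0$ the real and imaginary parts of $f_{\ell m}$ each have variance $C_\ell/2$ and $f_{\ell,-m}$ is fixed by the reality condition. The function name `simulate_c_hat`, the value $C_\ell = 1$ and the number of simulated universes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_c_hat(ell, c_ell, n_universes):
    """Draw Gaussian f_lm for a real field and return the estimator (294) in each universe."""
    f_l0 = rng.normal(0.0, np.sqrt(c_ell), size=n_universes)            # m = 0 is real
    re = rng.normal(0.0, np.sqrt(c_ell / 2), size=(n_universes, ell))   # m = 1..ell
    im = rng.normal(0.0, np.sqrt(c_ell / 2), size=(n_universes, ell))
    # For a real field |f_{l,-m}| = |f_{lm}|, so the m > 0 terms are counted twice.
    total = f_l0**2 + 2 * np.sum(re**2 + im**2, axis=1)
    return total / (2 * ell + 1)

c_true, n_universes = 1.0, 50_000
results = {}
for ell in (2, 10, 100):
    c_hat = simulate_c_hat(ell, c_true, n_universes)
    predicted = 2 * c_true**2 / (2 * ell + 1)
    results[ell] = (c_hat.mean(), c_hat.var(), predicted)
    print(f"ell = {ell:3d}: mean = {c_hat.mean():.4f}, "
          f"variance = {c_hat.var():.5f}, predicted = {predicted:.5f}")
```

At $\ell = 2$ the scatter between universes is a sizeable fraction of $C_\ell$ itself, whereas by $\ell = 100$ a single sky already pins the estimate down to within about ten per cent.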

10.7 Summary

  • Many of the predictions made by modern cosmological theories are statistical in nature;

  • Quantum fields have near-Gaussian random fluctuations which, in the inflationary picture, seed structure in our Universe;

  • Gaussian fluctuations can be characterised by two-point correlations alone;

  • Statistical homogeneity and isotropy put stringent limitations on the form that two-point correlations can take;

  • The predictions then take the form of a power spectrum $P(k)$, measuring fluctuations on comoving scales $2\pi/k$, or, equivalently, a correlation function $\xi(r)$;

  • When measured on a sphere, the power spectrum is discretised into $C_\ell$’s measuring fluctuations on angular scales $2\pi/\ell$;

  • Power spectra can be calculated from theories but cannot be measured directly from observations because they involve an average over an infinite series of universes…

  • …however we can estimate power spectra observationally by averaging over a number of observed modes. The theory/observation comparison will then have an irreducible level of uncertainty known as “cosmic variance”. This is most problematic when probing large scales due to the small number of accessible large-scale modes.