PHAS0067: Advanced Physical Cosmology

4 The Einstein Equation


The strength of gravity depends on how mass is distributed through the Universe. That’s missing from our discussion so far: we’ve specified a particular spacetime metric and derived its effect on particles. So we need to close the loop and derive the spacetime metric from a description of the particles within the Universe. Only then will we have a full theory of gravity.

For PHAS0067 purposes the mathematical derivations in Section 4.1 are non-examinable – but we will need the results, and the physical underpinnings are examinable. If you’re encountering GR for the first time, the end of Section 4.1 has a recap.

4.1 Characterising relativistic gravity: the Riemann curvature

How can we start quantifying the strength of gravity in a way that can be connected back to the distribution of matter? It’s not at all obvious: it took Einstein 10 years or so (during which time he was in close correspondence with others, most famously the mathematician David Hilbert). Luckily, now that we know the answer, it’s not quite so difficult. The key is to focus on relationships between particles in a way that is independent of the particular choice of coordinates. Working without a fixed coordinate system can feel a little like trying to build elaborate sand-castles… but it is possible with sufficient patience.

Figure 6: In general relativity, there are no preferred coordinates so it can be quite tricky to specify unambiguously what the effects of gravity really are. We need a definition that is independent of coordinate choice. One way to do this is to think about how nearby particles behave in a gravitational field – they can accelerate relative to each other. Without gravity (assuming the absence of other forces for now), such an acceleration is impossible. Mathematically, the idea of “nearby particles accelerating relative to each other” gets turned into the curvature of spacetime, as described in the text.

Figure 6 shows the construction we’ll use to characterise the presence of gravity in a coordinate-independent way. Imagine two nearby particles. If there’s no gravity, each of those particles moves at a constant velocity – they might approach or recede from each other, but they can’t accelerate relative to each other (assuming all other forces are negligible). Conversely, once gravity is present, relative acceleration becomes not just possible but inevitable.

To turn this idea of the relative acceleration of two particles into a mathematical statement, we imagine there is a vector $X^{\mu}$ that connects their spatial locations. (Let’s postpone coming up with a formal definition of $X^{\mu}$ for now.) Roughly speaking, acceleration is then present if ${\rm d}^{2}X^{\mu}/{\rm d}\tau^{2}$ is non-zero, where $\tau$ is the time measured by a clock carried by one of the particles.

The covariant derivative

We first need to figure out what ${\rm d}/{\rm d}\tau$ really means. With the equivalence principle in mind, the safest thing to do is to start with a definition for a locally Minkowski set of coordinates (i.e. free-falling, inertial, Cartesian coordinates), just as we did in Section 2. Let’s call those coordinates $x_{M}^{\mu}$ and the components of the vector $X_{M}^{\mu}$; then we can straightforwardly apply the chain rule:

\frac{{\rm d}X_{M}^{\mu}}{{\rm d}\tau}\equiv\frac{{\rm d}x_{M}^{\nu}}{{\rm d}\tau}\frac{\partial X_{M}^{\mu}}{\partial x_{M}^{\nu}}\qquad\textrm{(locally Minkowski coordinate system).}

(62)

From a physical perspective that’s all there is to it – we could work directly with definition (62). But it’s a nuisance to be forever finding transformations into a specific coordinate system. We want instead to write a version of equation (62) that is applicable in any coordinates. Such an equation is called covariant.

To find a covariant expression, recall that equation (12) tells us how $X^{\mu}$ transforms between coordinate systems. To define ${\rm d}X^{\mu}/{\rm d}\tau$ in coordinate systems other than the free-falling one employed in equation (62), the best option is to insist that it, too, transforms as a vector. This provides the definition for the derivative of a vector in non-inertial coordinate systems in a way that is consistent with the equivalence principle. To put it another way, it ensures that any equation involving vectors and derivatives of vectors will have the same physical meaning regardless of which coordinate system it’s expressed relative to.

But can we do this without forever transforming back and forth to locally free-falling coordinate systems? Thankfully the answer is yes, and the logic is very similar to that from Section 2. Let’s keep $x_{M}^{\mu}$ as locally inertial coordinates and imagine a new, general coordinate system $x^{\mu}$ . Starting from the definition of ${\rm d}X_{M}^{\mu}/{\rm d}\tau$ , we now have:

\frac{{\rm d}X_{M}^{\mu}}{{\rm d}\tau}\equiv\frac{{\rm d}x_{M}^{\nu}}{{\rm d}\tau}\frac{\partial X_{M}^{\mu}}{\partial x_{M}^{\nu}}=\frac{{\rm d}x^{\nu}}{{\rm d}\tau}\frac{\partial x_{M}^{\mu}}{\partial x^{\alpha}}\frac{\partial X^{\alpha}}{\partial x^{\nu}}+\frac{{\rm d}x^{\nu}}{{\rm d}\tau}\frac{\partial^{2}x^{\mu}_{M}}{\partial x^{\nu}\partial x^{\beta}}X^{\beta}

(63)

😇 Exercise 4A Verify equation (63), starting from the definition (62) and using the appropriate transformation laws such as (12) and (11). Hint: First argue that

\frac{{\rm d}x^{\nu}_{M}}{{\rm d}\tau}\frac{\partial X^{\mu}_{M}}{\partial x^{\nu}_{M}}=\frac{{\rm d}x^{\nu}}{{\rm d}\tau}\frac{\partial X^{\mu}_{M}}{\partial x^{\nu}}\textrm{.}

(64)

Comparing with equation (12), the first term in equation (63) is precisely the transformation that I just argued we want. What of the second term? We can rewrite it in terms of our old friends, the Christoffel symbols (37), as follows:

\frac{{\rm d}x^{\nu}}{{\rm d}\tau}\frac{\partial^{2}x_{M}^{\mu}}{\partial x^{\nu}\partial x^{\beta}}X^{\beta}=\frac{{\rm d}x^{\nu}}{{\rm d}\tau}\frac{\partial x_{M}^{\mu}}{\partial x^{\alpha}}\Gamma_{\nu\beta}^{\alpha}X^{\beta}.

(65)

Putting everything together we now have:

\frac{{\rm d}X^{\alpha}}{{\rm d}\tau}=\frac{\partial x^{\alpha}}{\partial x_{M}^{\mu}}\frac{{\rm d}X^{\mu}_{M}}{{\rm d}\tau}\qquad\textrm{(stating the equivalence principle)}

=\frac{{\rm d}x^{\nu}}{{\rm d}\tau}\left(\frac{\partial X^{\alpha}}{\partial x^{\nu}}+\Gamma_{\nu\beta}^{\alpha}X^{\beta}\right)\qquad\textrm{(the final result),}

(66)

where the second line follows from substituting (63) and (65) for ${\rm d}X_{M}^{\mu}/{\rm d}\tau$. Note that in the result (66) all explicit reference to a special Minkowski system of coordinates is gone. It has been fully absorbed into the Christoffel symbols $\Gamma_{\nu\beta}^{\alpha}$, which can be computed directly from the metric in any coordinate system, equation (39).

We normally abbreviate the combination in brackets as the covariant derivative $\nabla$ ; for any vector $V^{\beta}$ we can write

\nabla_{\alpha}V^{\beta}\equiv\frac{\partial V^{\beta}}{\partial x^{\alpha}}+\Gamma^{\beta}_{\alpha\gamma}V^{\gamma}\ .

(67)

Then we have a neat way to express equation (66):

\frac{{\rm d}X^{\alpha}}{{\rm d}\tau}\equiv\frac{{\rm d}x^{\nu}}{{\rm d}\tau}\nabla_{\nu}X^{\alpha}\ .

(68)

We also sometimes use a semi-colon to indicate a covariant derivative, and a comma to indicate a partial derivative with respect to the coordinates $x$ — in summary:

V^{\alpha}_{,\beta}\equiv\frac{\partial V^{\alpha}}{\partial x^{\beta}}\qquad\textrm{and}\qquad V^{\alpha}_{;\,\beta}\equiv\nabla_{\beta}V^{\alpha}\equiv V^{\alpha}_{,\beta}+\Gamma^{\alpha}_{\beta\gamma}V^{\gamma}\textrm{.}

(69)
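As a concrete check of definition (69), here is a minimal Python sketch. It is not part of the course derivation: it assumes the flat two-dimensional plane in polar coordinates $(r,\theta)$, whose non-zero Christoffel symbols are $\Gamma^{r}_{\theta\theta}=-r$ and $\Gamma^{\theta}_{r\theta}=\Gamma^{\theta}_{\theta r}=1/r$, and takes the vector field to be the constant Cartesian unit vector along $x$. All function names are illustrative. The partial derivatives of the polar components are non-zero, but the covariant derivative should vanish, because the vector is genuinely constant.

```python
import math


def christoffel(r, theta):
    """Gamma[a][b][c] = Gamma^a_{bc} for the flat plane in polar
    coordinates (index 0 = r, index 1 = theta)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r         # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r    # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r    # Gamma^theta_{theta r}
    return gamma


def vector_field(r, theta):
    """Polar components (V^r, V^theta) of the constant Cartesian vector e_x."""
    return [math.cos(theta), -math.sin(theta) / r]


def partial_derivative(r, theta, h=1e-6):
    """dv[b][a] = dV^a/dx^b, estimated by central finite differences."""
    d_r = [(p - m) / (2 * h) for p, m in
           zip(vector_field(r + h, theta), vector_field(r - h, theta))]
    d_theta = [(p - m) / (2 * h) for p, m in
               zip(vector_field(r, theta + h), vector_field(r, theta - h))]
    return [d_r, d_theta]


def covariant_derivative(r, theta):
    """nabla[b][a] = V^a_{;b} = V^a_{,b} + Gamma^a_{bc} V^c, as in (69)."""
    gamma = christoffel(r, theta)
    v = vector_field(r, theta)
    dv = partial_derivative(r, theta)
    return [[dv[b][a] + sum(gamma[a][b][c] * v[c] for c in range(2))
             for a in range(2)] for b in range(2)]


if __name__ == "__main__":
    print("partial derivatives:  ", partial_derivative(2.0, 0.7))
    print("covariant derivatives:", covariant_derivative(2.0, 0.7))
```

The Christoffel terms cancel the coordinate-induced variation of the components, which is the role they play in equation (66).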

Our derivation shows, in a very general way, that the equivalence principle allows us to simply write what we would have wanted in flat Minkowski space then replace any partial derivatives $\partial/\partial x_{M}^{\alpha}$ with the covariant derivative $\nabla_{\alpha}$ . That’s a very powerful statement: for example, it allows us to re-derive the properties of geodesics almost instantaneously; we can conveniently re-write equation (38) as

\frac{{\rm d}x^{\alpha}}{{\rm d}\lambda}\nabla_{\alpha}\frac{{\rm d}x^{\mu}}{{\rm d}\lambda}=0.

(70)

☞ Exercise 4B Explain why equation (70) must be a valid description of particle motion, using the equivalence principle in the form of replacing partial by covariant derivatives. Then explicitly expand equation (70) using the definition (67) and show it is indeed the same as equation (38).

A few important properties of the covariant derivative

In a locally free-falling coordinate system, we have $\nabla_{\nu}=\partial/\partial x^{\nu}$ (or, using the other notation, $V^{\mu}_{;\nu}=V^{\mu}_{,\nu}$) and everything boils down to our first definition, equation (62).

We can work out a covariant derivative for any type of tensor with any number of indices. For example, given a covector $W_{\beta}$ we have

\nabla_{\alpha}W_{\beta}=\frac{\partial W_{\beta}}{\partial x^{\alpha}}-\Gamma^{\lambda}_{\alpha\beta}W_{\lambda}\textrm{,}

(71)

which can be obtained in a number of ways – most obviously, by following the logic for vectors $V^{\mu}$ above, but with the appropriate transformation $W_{\mu}=(\partial x_{M}^{\nu}/\partial x^{\mu})W^{M}_{\nu}$ .

😇 Exercise 4C Perform this derivation.

Each Christoffel $\Gamma$ term appears because of the transformation associated with an index. Consequently, as indices get added, the additional Christoffel terms associated with each index simply add up; for example, for a rank-2 mixed tensor $T^{\alpha}_{\beta}$ we have

\nabla_{\gamma}T^{\alpha}_{\beta}=\frac{\partial T^{\alpha}_{\beta}}{\partial x^{\gamma}}+\Gamma^{\alpha}_{\gamma\lambda}T^{\lambda}_{\beta}-\Gamma^{\lambda}_{\gamma\beta}T^{\alpha}_{\lambda}\textrm{.}

(72)

For convenience, we define the covariant derivative of a scalar quantity to be just

\nabla_{\alpha}q\equiv\frac{\partial q}{\partial x^{\alpha}}\textrm{,}

(73)

which is neatly self-consistent: when there are zero indices, there will be zero $\Gamma$ terms.

A nice property of the covariant derivative is that it satisfies the product rule, just like the partial derivative:

\nabla_{\alpha}\left(S^{\beta\gamma\cdots}_{\delta\epsilon\cdots}T^{\mu\nu\cdots}_{\rho\sigma\cdots}\right)=\left(\nabla_{\alpha}S^{\beta\gamma\cdots}_{\delta\epsilon\cdots}\right)T^{\mu\nu\cdots}_{\rho\sigma\cdots}+S^{\beta\gamma\cdots}_{\delta\epsilon\cdots}\left(\nabla_{\alpha}T^{\mu\nu\cdots}_{\rho\sigma\cdots}\right)\textrm{,}

(74)

for tensors $T$ and $S$ which can have any number of indices (including zero). You can straightforwardly verify this property for yourself if interested; it basically follows because the covariant derivative is the combination of a standard partial derivative (which follows the product rule) and an extra term for each index. We will make use of the product rule several times in derivations. The covariant derivative is also said to be metric-compatible:

\nabla_{\alpha}g_{\beta\gamma}=0=\nabla_{\alpha}g^{\beta\gamma}\textrm{,}

(75)

a property that you can again verify for yourself if keen. This is again extremely helpful in practical derivations and we will take it for granted.
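To see metric compatibility at work in a simple case, here is a short worked check. It is an illustration rather than a proof, and it assumes the flat plane in polar coordinates, with $g_{rr}=1$, $g_{\theta\theta}=r^{2}$ and non-zero Christoffel symbols $\Gamma^{r}_{\theta\theta}=-r$, $\Gamma^{\theta}_{r\theta}=\Gamma^{\theta}_{\theta r}=1/r$. Following the pattern of equations (71) and (72), a tensor with two lower indices picks up one negative Christoffel term per index:

```latex
\nabla_{\alpha}g_{\beta\gamma}=\frac{\partial g_{\beta\gamma}}{\partial x^{\alpha}}-\Gamma^{\lambda}_{\alpha\beta}g_{\lambda\gamma}-\Gamma^{\lambda}_{\alpha\gamma}g_{\beta\lambda}

\nabla_{r}g_{\theta\theta}=\frac{\partial (r^{2})}{\partial r}-\Gamma^{\theta}_{r\theta}g_{\theta\theta}-\Gamma^{\theta}_{r\theta}g_{\theta\theta}=2r-\frac{1}{r}r^{2}-\frac{1}{r}r^{2}=0

\nabla_{\theta}g_{r\theta}=0-\Gamma^{\theta}_{\theta r}g_{\theta\theta}-\Gamma^{r}_{\theta\theta}g_{rr}=-\frac{1}{r}r^{2}+r=0
```

The remaining components vanish in the same way, consistent with equation (75).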

Back to the relative acceleration of neighbouring particles

Figure 7: A more refined setup for testing the acceleration of neighbouring geodesics. At some initial time, we take a series of nearby geodesics and label them by a continuous parameter $\sigma$. The separation vector $X^{\mu}$ is then defined as $\partial x^{\mu}/\partial\sigma$; we can inspect how it changes over time (parametrised by $\tau$) to establish whether the freely-falling particles are accelerating relative to each other.

Now we need one last mathematical ingredient to characterise gravity, which was our original aim. The missing insight is about the vector $X^{\mu}$ . I was a bit slapdash in saying $X^{\mu}$ connects two nearby worldlines – what does that really mean?

Figure 7 shows how to define $X^{\mu}$ more carefully. We imagine setting up a series of test geodesics labelled by a continuous parameter $\sigma$ . Then we can define $X^{\mu}\equiv\partial x^{\mu}/\partial\sigma$ , where the partial derivative is taken at fixed $\tau$ . Where we had derivatives with respect to $\tau$ we must now take them at fixed $\sigma$ (to unambiguously take a derivative down a particular particle’s worldline). With this in place, we can expand $\partial^{2}X^{\mu}/\partial\tau^{2}$ in terms of covariant derivatives:

\frac{\partial^{2}X^{\mu}}{\partial\tau^{2}}\equiv\frac{\partial^{3}x^{\mu}}{\partial\tau^{2}\partial\sigma}=\frac{\partial x^{\alpha}}{\partial\tau}\nabla_{\alpha}\left(\frac{\partial x^{\beta}}{\partial\sigma}\nabla_{\beta}\frac{\partial x^{\mu}}{\partial\tau}\right)

(76)

Here I have very consciously chosen to expand a $\tau$ derivative, then a $\sigma$ derivative, and then the second $\tau$ derivative using equation (68). Obviously I could have chosen to expand the derivative in a number of other ways, but the result below will explain why this order is particularly helpful. We can continue simplifying using the product rule:

\frac{\partial^{2}X^{\mu}}{\partial\tau^{2}}=\frac{\partial x^{\alpha}}{\partial\tau}\frac{\partial x^{\beta}}{\partial\sigma}\nabla_{\alpha}\nabla_{\beta}\frac{\partial x^{\mu}}{\partial\tau}+\frac{\partial x^{\alpha}}{\partial\tau}\left(\nabla_{\alpha}\frac{\partial x^{\beta}}{\partial\sigma}\right)\left(\nabla_{\beta}\frac{\partial x^{\mu}}{\partial\tau}\right).

(77)

Next, we identify that

\frac{\partial x^{\alpha}}{\partial\tau}\left(\nabla_{\alpha}\frac{\partial x^{\beta}}{\partial\sigma}\right)\equiv\frac{\partial^{2}x^{\beta}}{\partial\tau\partial\sigma}=\frac{\partial^{2}x^{\beta}}{\partial\sigma\partial\tau}=\frac{\partial x^{\alpha}}{\partial\sigma}\left(\nabla_{\alpha}\frac{\partial x^{\beta}}{\partial\tau}\right)

(78)

and furthermore that

\left(\nabla_{\alpha}\frac{\partial x^{\beta}}{\partial\tau}\right)\left(\nabla_{\beta}\frac{\partial x^{\mu}}{\partial\tau}\right)=-\frac{\partial x^{\beta}}{\partial\tau}\nabla_{\alpha}\nabla_{\beta}\frac{\partial x^{\mu}}{\partial\tau}

(79)

which follows from the geodesic property in the form of equation (70). Putting it all together (and relabelling some dummy indices to simplify the expression) we have the result:

\frac{\partial^{2}X^{\mu}}{\partial\tau^{2}}=\frac{\partial x^{\alpha}}{\partial\tau}\frac{\partial x^{\beta}}{\partial\sigma}(\nabla_{\alpha}\nabla_{\beta}-\nabla_{\beta}\nabla_{\alpha})\frac{\partial x^{\mu}}{\partial\tau}.

(80)

😇 Exercise 4D

Check you can obtain equation (80) starting from equation (76) by filling in the details of the derivation given above.

We choose to expand the result in this particular way because equation (80) has a very clear physical interpretation. In flat spacetime, $\nabla_{\alpha}$ reduces to the partial derivative with respect to $x^{\alpha}$. As a result,

\nabla_{\alpha}\nabla_{\beta}=\frac{\partial^{2}}{\partial x^{\alpha}\partial x^{\beta}}=\frac{\partial^{2}}{\partial x^{\beta}\partial x^{\alpha}}=\nabla_{\beta}\nabla_{\alpha}\quad\textrm{(flat spacetime)}

(81)

and the acceleration as expressed by equation (80) is obviously zero – this is exactly the intuitive result we were wanting!

In non-flat space, we instead expand (80) in terms of Christoffel symbols, to find that

\frac{\partial^{2}X^{\mu}}{\partial\tau^{2}}=\frac{\partial x^{\alpha}}{\partial\tau}\frac{\partial x^{\beta}}{\partial\sigma}\frac{\partial x^{\nu}}{\partial\tau}R^{\mu}_{\nu\alpha\beta}

(82)

\textrm{where }R^{\mu}_{\nu\alpha\beta}\equiv\frac{\partial\Gamma^{\mu}_{\beta\nu}}{\partial x^{\alpha}}-\frac{\partial\Gamma^{\mu}_{\alpha\nu}}{\partial x^{\beta}}+\Gamma^{\mu}_{\alpha\lambda}\Gamma^{\lambda}_{\beta\nu}-\Gamma^{\mu}_{\beta\lambda}\Gamma^{\lambda}_{\alpha\nu}

(83)

and $R^{\mu}_{\nu\alpha\beta}$ is known as the Riemann curvature tensor of the spacetime. Note that $R^{\mu}_{\nu\alpha\beta}$ has no dependence on the particular choice of geodesics $x^{\mu}(\tau,\sigma)$ – it is purely constructed from the Christoffel symbols $\Gamma$ that are intrinsic to the spacetime. The Riemann tensor was a widely-studied property of curved spaces long before Einstein started investigating it as a description of gravity.
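The following Python sketch evaluates definition (83) numerically for two simple two-dimensional spaces. Neither example comes from the notes, and all function names are illustrative. The assumptions are: the unit 2-sphere with coordinates $(\theta,\phi)$ and non-zero Christoffel symbols $\Gamma^{\theta}_{\phi\phi}=-\sin\theta\cos\theta$, $\Gamma^{\phi}_{\theta\phi}=\Gamma^{\phi}_{\phi\theta}=\cot\theta$; and the flat plane in polar coordinates. Derivatives of $\Gamma$ are taken by finite differences, so the results are approximate. For the sphere one expects $R^{\theta}_{\phi\theta\phi}=\sin^{2}\theta$, while for the plane every component should vanish even though its Christoffel symbols do not.

```python
import math


def sphere_christoffel(x):
    """Gamma[m][a][b] = Gamma^m_{ab} on the unit 2-sphere, x = (theta, phi)."""
    theta = x[0]
    g = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    g[0][1][1] = -math.sin(theta) * math.cos(theta)               # Gamma^theta_{phi phi}
    g[1][0][1] = g[1][1][0] = math.cos(theta) / math.sin(theta)   # Gamma^phi_{theta phi}
    return g


def plane_christoffel(x):
    """Same layout for the flat plane in polar coordinates, x = (r, theta)."""
    r = x[0]
    g = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    g[0][1][1] = -r                        # Gamma^r_{theta theta}
    g[1][0][1] = g[1][1][0] = 1.0 / r      # Gamma^theta_{r theta}
    return g


def riemann(christoffel, x, h=1e-5):
    """R[m][n][a][b] = R^m_{n a b} from equation (83), using central
    finite differences for the partial derivatives of Gamma."""
    dim = len(x)
    idx = range(dim)
    gam = christoffel(x)
    # dgam[a][m][b][n] = d Gamma^m_{bn} / d x^a
    dgam = []
    for a in idx:
        x_plus, x_minus = list(x), list(x)
        x_plus[a] += h
        x_minus[a] -= h
        g_plus, g_minus = christoffel(x_plus), christoffel(x_minus)
        dgam.append([[[(g_plus[m][b][n] - g_minus[m][b][n]) / (2 * h)
                       for n in idx] for b in idx] for m in idx])
    return [[[[dgam[a][m][b][n] - dgam[b][m][a][n]
               + sum(gam[m][a][l] * gam[l][b][n] - gam[m][b][l] * gam[l][a][n]
                     for l in idx)
               for b in idx] for a in idx] for n in idx] for m in idx]


if __name__ == "__main__":
    theta = 1.0
    r_sphere = riemann(sphere_christoffel, [theta, 0.3])
    print("sphere R^theta_{phi theta phi}:", r_sphere[0][1][0][1])
    print("sin^2(theta):                  ", math.sin(theta) ** 2)
    r_plane = riemann(plane_christoffel, [2.0, 0.5])
    print("largest |R| on the plane:", max(abs(c) for m in r_plane
                                           for n in m for a in n for c in a))
```

The contrast between the two cases illustrates the point made above: non-zero Christoffel symbols can be a coordinate effect, whereas a non-zero Riemann tensor signals real curvature.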

😇 Exercise 4E Check that expanding the covariant derivatives in equation (80) does indeed result in expression (83).

The physical connection we have made is essential: the curvature of spacetime describes the way that nearby particles accelerate towards or away from each other. Therefore, to complete the general relativistic theory of gravity, we want to connect the curvature back to the contents of spacetime.

A quick recap – what just happened?

The derivations above won’t appear in the examination for this course – but applying the results or understanding the physical content might, so let’s summarise it for convenience:

1. We realised we need to take the derivative of a vector over time, to start defining what gravity does;

2. We defined this derivative in local inertial coordinates, equation (62);

3. We invoked, as always, the equivalence principle to work out how to calculate the same derivative in a general coordinate system, equation (66). This can be written in compact notation as equation (68). By construction we can now write a general relativistic equation of motion just by taking the special relativistic equations and replacing partial derivatives $\partial/\partial x^{\alpha}$ with covariant derivatives $\nabla_{\alpha}$.

4. Finally, using these results, we worked out how to calculate the relative acceleration of two particles, $\partial^{2}X^{\mu}/\partial\tau^{2}$. This required a careful definition of $X^{\mu}$, the vector separating the two particles, and some further cunning manipulations to reach equations (82) and (83). Equation (82) shows that relative acceleration of particles is generated by curvature of spacetime, as parametrised by the Riemann tensor $R^{\mu}_{\nu\alpha\beta}$.

4.2 Energy-momentum tensor

We’ve so far identified the property of the spacetime – Riemann curvature – that describes the physical effects of gravity in a coordinate-independent way. Now we have to work out what to equate to curvature to get the desired link from the mass distribution in the universe to the dynamics.

In Newtonian theory the gravitational potential $\Phi$ is generated by the density of matter, $\rho$, through Poisson’s equation $\nabla^{2}\Phi=4\pi G\rho$. We want an equivalent relativistic link between curvature and density.

Let’s return to the 4-momentum of a particle as considered at the end of the previous chapter. For a massive particle, special relativity says that $p^{\alpha}=mU^{\alpha}$ , where $m$ is the “rest mass” of the particle, and the 4-velocity $U^{\alpha}$ is normalised such that $U^{\alpha}U_{\alpha}=1$ .

In an inertial frame the energy of the particle is $E=p^{0}$. Note that $E$ is not invariant between different inertial frames (i.e. under Lorentz transformations) – this is a familiar idea even in Newtonian physics where $E=mv^{2}/2$ is not the same in different frames. In the particle’s rest frame, $p^{0}=m$, which (recalling that we set $c=1$ from the start) is just the famous result $E=mc^{2}$. The generalization to any inertial frame is $p_{\mu}p^{\mu}=m^{2}$, or $E^{2}=m^{2}+|\boldsymbol{p}|^{2}$, where $|\boldsymbol{p}|^{2}=\delta_{ij}p^{i}p^{j}$.
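A quick numerical illustration of this invariance, as a sketch only: it assumes a boost along the $x$ axis with $c=1$ and metric signature $(+,-,-,-)$ (inferred from $U^{\alpha}U_{\alpha}=1$ above). The function names and the values of $m$ and $v$ are arbitrary choices for the example.

```python
import math


def boost_x(p, v):
    """Lorentz-boost the 4-vector p = (p0, p1, p2, p3) so that a particle
    at rest acquires velocity +v along x (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (p[0] + v * p[1]), gamma * (p[1] + v * p[0]), p[2], p[3])


def invariant_mass_squared(p):
    """p_mu p^mu with signature (+,-,-,-)."""
    return p[0] ** 2 - (p[1] ** 2 + p[2] ** 2 + p[3] ** 2)


if __name__ == "__main__":
    m = 2.0
    p_rest = (m, 0.0, 0.0, 0.0)
    p_moving = boost_x(p_rest, 0.6)
    print("energy in rest frame:  ", p_rest[0])
    print("energy in moving frame:", p_moving[0])
    print("p.p in both frames:    ", invariant_mass_squared(p_rest),
          invariant_mass_squared(p_moving))
```

The energy changes between frames while $p_{\mu}p^{\mu}$ does not, which is the distinction the paragraph above draws.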

So $p^{\mu}$ provides a complete description of both the energy and the momentum of a particle, and the normalization of the vector describes the relationship between those two quantities. In relativity, energy and momentum are intertwined precisely because time and space are intertwined.

But we now want to relate curvature to overall density, not a single point mass. The energy density $\rho$ of matter is an even more slippery concept. To see why, suppose you care about the total density of particles inside a measuring jug – it’s the total energy $E=\sum_{i}E_{i}$ summed over all particles $i$ divided by the volume of the jug $V$ . Not only do the energies of particles change if we change frame, but also remember Lorentz contraction means the volume of the jug changes as well; in a frame travelling at speed $v$ the volume becomes

V^{\prime}=V\sqrt{1-v^{2}}\textrm{.}

(84)

This transformation is the inverse of the energy transformation

E^{\prime}=\frac{E}{\sqrt{1-v^{2}}}

(85)

and so the density transforms as

\rho^{\prime}=\frac{E^{\prime}}{V^{\prime}}=\frac{E}{V(1-v^{2})}=\frac{\rho}{1-v^{2}}\textrm{.}

(86)

Overall, the transformation of $\rho$ picks up two factors of $(1-v^{2})^{-1/2}$, compared to $E$ which picks up just one. If $E$ is the time component of a rank-1 tensor $p^{\mu}$, then it looks like $\rho$ must be the time-time component of a rank-2 tensor $T^{\mu\nu}$:

☞ Exercise 4F Check this explicitly, i.e. show that $p^{\prime 0}=\gamma E$ implies that $T^{\prime 00}=\gamma^{2}\rho$ (assuming that $p^{\mu}=(E,0,0,0)$ and $T^{00}=\rho$, $T^{0i}=0$, $T^{ij}=0$).

Our conclusion from this approximate argument turns out to be accurate. Density is the time-time component of the energy-momentum tensor (also called the stress-energy tensor) $T^{\mu\nu}$ , which is a symmetric tensor describing the flux of $4$ -momentum $p^{\mu}$ across a surface of constant $x^{\nu}$ . In this sense, “density” is defined in relativity as the flux of particles across a spacelike surface. However, if these words don’t seem helpful, in practice there are just a few consequences we need to take away. The meaning should become clearer as we now look at constructing energy-momentum tensors for specific matter sources.

Dust

Start with “dust”, by which cosmologists mean massive particles with negligible velocities or pressure. This is the case that we have already been considering in Exercise 2. So we need $T^{00}=\rho_{\mathrm{rest}}$ where $\rho_{\mathrm{rest}}$ is the rest-frame density, and all other components must vanish. The covariant way to write this requirement is

T^{\mu\nu}_{\mathrm{dust}}=\rho_{\mathrm{rest}}U^{\mu}U^{\nu}\,\textrm{,}

(87)

where $U^{\mu}$ is the velocity 4-vector of the particles discussed previously (it’s $(1,0,0,0)$ in the rest-frame, but different in other frames).

We can now examine what the other components mean; suppose we switch to a frame moving at speed $v$ along the $x$-axis relative to the original. Then $U^{\prime\mu}=(1/\sqrt{1-v^{2}},v/\sqrt{1-v^{2}},0,0)$, so $\rho\equiv T^{\prime 00}=\rho_{\mathrm{rest}}/(1-v^{2})$ (we discussed how this is required above) and, more interestingly, $T^{\prime 0i}=\rho_{\mathrm{rest}}v^{i}/(1-v^{2})=\rho v^{i}$. We can recognize this quantity as a momentum density, i.e. it is the net momentum per unit volume that the fluid possesses.

Since $T$ is symmetric, $T^{i0}=T^{0i}$, so the only parts left to understand are the $T^{ij}=T^{ji}$ space-space components. In the example above, where we have particles in uniform motion along the $x$ axis, $T^{\prime 11}=\rho_{\mathrm{rest}}v^{2}/(1-v^{2})=\rho v^{2}$. If you’ve taken a fluid dynamics course you’ll know that $\rho v^{2}$ is the ram pressure – it’s the pressure felt as particles flow past an object at speed $v$.
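The components quoted above can be reproduced numerically. The sketch below is illustrative: it builds the rest-frame dust tensor (87), applies the rank-2 transformation law $T^{\prime\mu\nu}=\Lambda^{\mu}_{\ \alpha}\Lambda^{\nu}_{\ \beta}T^{\alpha\beta}$ with the boost $\Lambda$ that takes $U^{\mu}=(1,0,0,0)$ to $(\gamma,\gamma v,0,0)$, and prints the resulting density, momentum density and ram pressure. The function names and the values $\rho_{\mathrm{rest}}=1$, $v=0.5$ are assumptions made for the example.

```python
import math


def boost_matrix(v):
    """Boost along x that maps the rest-frame 4-velocity (1, 0, 0, 0)
    to (gamma, gamma * v, 0, 0), in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, gamma * v, 0.0, 0.0],
            [gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def dust_tensor(rho_rest, u):
    """T^{mu nu} = rho_rest U^mu U^nu, equation (87)."""
    return [[rho_rest * u[m] * u[n] for n in range(4)] for m in range(4)]


def transform_rank2(lam, t):
    """T'^{mu nu} = Lambda^mu_alpha Lambda^nu_beta T^{alpha beta}."""
    idx = range(4)
    return [[sum(lam[m][a] * lam[n][b] * t[a][b] for a in idx for b in idx)
             for n in idx] for m in idx]


if __name__ == "__main__":
    rho_rest, v = 1.0, 0.5
    t_rest = dust_tensor(rho_rest, (1.0, 0.0, 0.0, 0.0))
    t_moving = transform_rank2(boost_matrix(v), t_rest)
    print("density          T'^00:", t_moving[0][0])
    print("momentum density T'^01:", t_moving[0][1])
    print("ram pressure     T'^11:", t_moving[1][1])
```

Each index contributes one factor of $\gamma$, which is why the density transforms with $\gamma^{2}$ while the energy of a single particle transforms with $\gamma$.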

Fluids

Imagine that we now add up the effects of many different particles, all moving in different directions with a speed $v$, so that on average they’re at rest but any given particle is moving. (This is just the microscopic description of a gas, although for simplicity let’s assume that the speed is always $v$; to be more precise, we’d need to use the Maxwell-Boltzmann distribution, which doesn’t change anything fundamentally but adds a layer of complexity that we can do without at the moment.) Now we’ll get

T^{ii}=\frac{\rho_{\mathrm{rest}}v^{2}}{3(1-v^{2})}=\frac{\rho v^{2}}{3}\quad\textrm{ (no sum over $i$)}

(88)

for $i=1,2,3$. In other words, $T^{ii}=P$, the pressure of the overall distribution of particles. But the $T^{ij}$ are still zero for any $i\neq j$…
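The factor of $1/3$ and the vanishing off-diagonal terms can be seen in a small numerical sketch. As a stand-in for a truly isotropic distribution, it assumes eight equal groups of particles moving at speed $v$ along the eight cube-diagonal directions $(\pm1,\pm1,\pm1)/\sqrt{3}$; each group contributes $(\rho/8)\,v^{i}v^{j}$, where $\rho$ is the total density in the fluid rest frame. The choice of directions and the function name are assumptions made for the example.

```python
import itertools
import math


def pressure_tensor(rho, v):
    """Spatial components T^{ij} = rho * v^2 * <n^i n^j>, averaging the unit
    direction n over the eight cube-diagonal directions."""
    directions = [tuple(s / math.sqrt(3.0) for s in signs)
                  for signs in itertools.product((1.0, -1.0), repeat=3)]
    count = len(directions)
    return [[rho * v * v * sum(n[i] * n[j] for n in directions) / count
             for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    for row in pressure_tensor(rho=1.0, v=0.3):
        print(row)
```

The diagonal entries come out equal to $\rho v^{2}/3$ because every direction contributes the same positive amount, while the off-diagonal entries cancel between directions with opposite relative signs.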

☞ Exercise 4G Why must $T^{ij}=0$ for $i\neq j$ in the example given above? Consider two sets of particles, each with density $\rho$. Let one set move along the direction at $45^{\circ}$ between the $x$ and $y$ axes, and the other moving at the same speed along the direction at $-45^{\circ}$. Show that, while the contribution to the diagonal part of $T^{ij}$ adds, the contribution to the off-diagonal part cancels. By pairing all contributions in this way, we can show the off-diagonal contributions are always zero. Are there arguments you can think of that show this vanishing without even having to make any calculations?

So, by thinking about a series of particles moving in different directions, we have motivated the idea that – in the rest frame – we have

• $T^{00}=\rho$ is the density (which equals $\rho_{\mathrm{rest}}$ for dust with zero velocity, $v=0$);

• $T^{0i}$ is the momentum density which is zero in the rest frame;

• $T^{ij}$ is the pressure $P$ for $i=j$, and zero for $i\neq j$.

Such gases are known in relativity as perfect fluids and can be completely defined by an energy density $\rho$ and an isotropic rest frame pressure $P$ . Note that this $\rho$ should now be thought of as the “rest frame density of the fluid” – although this rest frame is not the same as the rest frame of individual particles, unless they are all at rest ( $P=0$ ). The rest frame of the fluid is the frame in which there is no net motion. To be able to leave the rest frame of the perfect fluid, we need a covariant expression which turns out to be:

T^{\mu\nu}=(\rho+P)U^{\mu}U^{\nu}-Pg^{\mu\nu}\ .

(89)

☞ Exercise 4H Check this is the correct expression by writing out the terms in an inertial frame where $g^{\mu\nu}=\eta^{\mu\nu}$. Having checked it for just one frame, why must expression (89) be valid in any frame?

A perfect fluid is general enough to describe a wide variety of cosmological fluids, given their equation of state,

w=\frac{P}{\rho}\ .

(90)

Dust has $P=0,\ w=0$. Non-relativistic sources like baryonic and dark matter are often well-approximated as dust since $|P|\ll\rho$ ($|w|\ll 1$). Radiation has $P=\rho/3,\ w=1/3$. Vacuum energy (which we’ll discuss in more detail later) is proportional to the metric: setting $P_{\mathrm{vac}}=-\rho_{\mathrm{vac}}$ in equation (89) removes the $U^{\mu}U^{\nu}$ term and leaves $T^{\mu\nu}=\rho_{\mathrm{vac}}g^{\mu\nu}$, with $w=-1$.

Evolution of energy

The energy-momentum tensor of a perfect fluid at rest can also be written with one index lowered in the following metric-independent form:

T^{\mu}_{\ \nu}=\left(\begin{array}[]{cccc}\rho&0&0&0\\ 0&-P&0&0\\ 0&0&-P&0\\ 0&0&0&-P\end{array}\right)\ .

(91)

Because of its relative simplicity and transparent physical interpretation of $\rho$ and $P$ this is often the most useful way to express the energy-momentum tensor.
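As a small consistency check between equations (89) and (91), the sketch below evaluates the perfect-fluid tensor in the fluid rest frame and lowers one index. It assumes the Minkowski metric $\eta_{\mu\nu}=\mathrm{diag}(1,-1,-1,-1)$ (the signature implied by $U^{\alpha}U_{\alpha}=1$), which is numerically equal to its own inverse; the function names and the values of $\rho$ and $P$ are illustrative.

```python
def perfect_fluid(rho, pressure, u, g_inv):
    """T^{mu nu} = (rho + P) U^mu U^nu - P g^{mu nu}, equation (89)."""
    return [[(rho + pressure) * u[m] * u[n] - pressure * g_inv[m][n]
             for n in range(4)] for m in range(4)]


def lower_second_index(t, g):
    """T^mu_nu = T^{mu alpha} g_{alpha nu}."""
    return [[sum(t[m][a] * g[a][n] for a in range(4)) for n in range(4)]
            for m in range(4)]


# Minkowski metric with signature (+,-,-,-); its inverse has the same entries.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

if __name__ == "__main__":
    u_rest = (1.0, 0.0, 0.0, 0.0)
    t_upper = perfect_fluid(rho=1.0, pressure=0.25, u=u_rest, g_inv=ETA)
    t_mixed = lower_second_index(t_upper, ETA)
    print("diagonal of T^{mu nu}:", [t_upper[i][i] for i in range(4)])
    print("diagonal of T^mu_nu:  ", [t_mixed[i][i] for i in range(4)])
```

With both indices raised the spatial diagonal is $+P$; lowering one index with $\eta_{\mu\nu}$ flips its sign, giving the $\mathrm{diag}(\rho,-P,-P,-P)$ form of equation (91).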

😇 Exercise 4I

Explain why this mixed form of $T^{\mu}_{\nu}$ is coordinate system independent provided the fluid is at rest and the metric is diagonal.

How do the components of $T^{\mu}_{\ \nu}$ evolve with time? Consider the case where there is no gravity. Then, from thermodynamical arguments we know that the pressure and energy evolve as:

\textrm{Continuity equation:}\qquad\frac{\partial\rho}{\partial t}+\nabla\cdot(\rho\boldsymbol{u})=0\ ,

(92)

\textrm{Euler equation:}\qquad\rho\frac{\partial\boldsymbol{u}}{\partial t}+\left(\rho\boldsymbol{u}\cdot\nabla\right)\boldsymbol{u}+\nabla P=0\ ,

(93)

where $\boldsymbol{u}$ is the Newtonian velocity vector of the fluid. In special relativity, we turn this into a $4$-component conservation equation for the energy-momentum tensor:

\frac{\partial T^{\mu}_{\ \nu}}{\partial x^{\mu}}=0.

(94)

😇 Exercise 4J

Show that, in the non-relativistic limit where $P\ll\rho$ , equation (94) for a perfect fluid specified by (89) regenerates the continuity and Euler equations (92) and (93).

In general relativity, the conservation criterion must be modified because the expression above is not covariant. The discussion of the equivalence principle in Section 4.1 implies that we just need to “promote” the partial derivative to a covariant derivative to get the general relativistic conservation law:

\framebox{$\displaystyle\nabla_{\mu}T^{\mu}_{\ \nu}=0$}\ .

(95)

For our perfect fluid stress-energy tensor there are four separate equations to be considered here. Take first the $\nu=0$ component for the metric (23), generating the continuity equation for an expanding universe:

\frac{\partial\rho}{\partial t}+3\frac{\dot{a}}{a}\left(\rho+P\right)=0\ .

(96)

📝 Exercise 4K Recover this expression starting from $\nabla_{\mu}T^{\mu}_{0}=0$. The covariant derivative for rank-2 tensors is given by equation (72) and the appropriate Christoffel symbols by equation (5).

Rearranging this equation,

a^{-3}\frac{\partial}{\partial t}\left(\rho a^{3}\right)=-3\left(\frac{\dot{a}}{a}\right)P\ .

(97)

Immediately this gives us some information about how the density of the universe changes as it expands. Pressureless matter has $P=0$ by definition, so

\frac{\partial}{\partial t}\left(\rho_{\mathrm{m}}a^{3}\right)=0\Rightarrow\rho_{\mathrm{m}}\propto a^{-3}\quad\textrm{(for matter).}

(98)

This is expected if you consider that mass remains constant while number density scales as inverse volume.

Equation (88) shows that radiation ( $v=1$ ) has a pressure $P=\frac{\rho}{3}$ . In this case we obtain instead

\frac{\partial\rho_{\mathrm{r}}}{\partial t}+4\left(\frac{\dot{a}}{a}\right)\rho_{\mathrm{r}}=a^{-4}\frac{\partial}{\partial t}\left(\rho_{\mathrm{r}}a^{4}\right)=0\ .

(99)

From equation (99) radiation density changes with scale factor according to

\rho_{\mathrm{r}}\propto a^{-4}\quad\textrm{(for radiation).}

(100)
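These scalings can be checked by integrating the continuity equation (96) directly. The sketch below assumes a constant equation of state $P=w\rho$ and an expanding universe ($\dot{a}\neq 0$), so that dividing (96) by $\dot{a}$ gives ${\rm d}\rho/{\rm d}a=-3(1+w)\rho/a$; the corresponding analytic solution $\rho\propto a^{-3(1+w)}$ is a standard result that reduces to (98) and (100) for $w=0$ and $w=1/3$. The fourth-order Runge-Kutta integrator, the function name and the step count are illustrative choices.

```python
def density_ratio(w, a_end, steps=1000):
    """Integrate d(rho)/da = -3 (1 + w) rho / a from a = 1 to a = a_end with
    a fixed-step fourth-order Runge-Kutta scheme; returns rho(a_end)/rho(1)."""
    def rhs(a, rho):
        return -3.0 * (1.0 + w) * rho / a

    a, rho = 1.0, 1.0
    h = (a_end - 1.0) / steps
    for _ in range(steps):
        k1 = rhs(a, rho)
        k2 = rhs(a + h / 2, rho + h * k1 / 2)
        k3 = rhs(a + h / 2, rho + h * k2 / 2)
        k4 = rhs(a + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        a += h
    return rho


if __name__ == "__main__":
    for label, w in (("matter", 0.0), ("radiation", 1.0 / 3.0), ("vacuum", -1.0)):
        print(label, "numerical:", density_ratio(w, 2.0),
              "analytic:", 2.0 ** (-3.0 * (1.0 + w)))
```

Doubling the scale factor dilutes matter by a factor of 8 and radiation by a factor of 16, while a $w=-1$ component keeps a constant density.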

By considering the energy-momentum tensor and the equivalence principle we are obtaining highly non-trivial results already – and we haven’t even written down the Einstein equation yet!

4.3 At last: the Einstein equation

Let’s look again at where we are in the big picture. So far, we have worked out that:

• the Riemann curvature $R^{\mu}_{\nu\alpha\beta}$ is the property of spacetime that, a little like the Newtonian shear $\nabla_{i}\nabla_{j}\Phi$, describes the strength of gravity (through its effect on the trajectories of neighbouring particles);

• the energy-momentum tensor $T^{\mu}_{\nu}$ is the property of matter that generalises density $\rho$ to relativistic settings; the conservation equation $T^{\mu}_{\nu;\mu}=0$ correctly describes how the density is affected by the metric.

Even though $\nabla^{2}\Phi=4\pi G\rho$ would suggest a close correspondence between these two tensors $R^{\mu}_{\nu\alpha\beta}$ and $T^{\mu}_{\nu}$ , we can’t just equate them – apart from anything, they don’t have the same number of indices. This mismatch shouldn’t be a huge surprise because the curvature couldn’t possibly be determined locally by the energy-momentum; we need some sort of “action at a distance” to recover the familiar physical effects of gravity.

Rather than explicitly try to construct action at a distance, it’s easier to ask: how can we boil down the Riemann curvature to a two-index tensor in a way that is covariant, i.e. independent of coordinate system? Despite seeming like a very different question, this will actually lead us to the right physical solution.

We need to contract two of the Riemann tensor’s indices – e.g. to form

R_{\alpha\beta}\equiv R^{\mu}_{\alpha\mu\beta}\textrm{,}

(101)

a construction known as the Ricci tensor. In fact, by using the symmetries of the Riemann tensor (e.g. the obvious one that $R^{\mu}_{\nu\alpha\beta}=-R^{\mu}_{\nu\beta\alpha}$ , but there are more subtle ones too) one can show that this is the only such construction. These same symmetries imply that the Ricci tensor is symmetric, $R_{\alpha\beta}=R_{\beta\alpha}$ . Finally, one can also show reasonably straightforwardly that

R^{\alpha}_{\beta;\alpha}=R_{,\beta}/2\textrm{,}

(102)

where the Ricci scalar $R=g^{\mu\nu}R_{\mu\nu}$ is the contraction of the Ricci tensor. Take a look in Carroll’s excellent textbook, Sections 4.1 and 4.2, for guidance on deriving these points if you want to follow the details.

Following this strategy of finding-something-with-the-right-number-of-indices, Einstein first tried directly equating the Ricci tensor and the energy-momentum tensor. But that failed because the energy-momentum tensor satisfies the conservation constraint $T^{\alpha}_{\beta;\alpha}=0$ whereas, as just stated, $R^{\alpha}_{\beta;\alpha}\neq 0$ . Luckily he was later able to construct a “trace-reversed” version of the Ricci tensor that does satisfy the correct conservation law, known as the Einstein tensor $G_{\mu\nu}$ :

G_{\mu\nu}\equiv R_{\mu\nu}-\frac{1}{2}g_{\mu\nu}R\ .

(103)

This is the only quantity that is linearly dependent on the Riemann curvature and has all the right symmetries and conservation laws to be equated to $T_{\mu\nu}$ .
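
The name “trace-reversed” can be checked in one line. The sketch below contracts definition (103) with $g^{\mu\nu}$, using $g^{\mu\nu}g_{\mu\nu}=4$ in four spacetime dimensions; the symbol $G\equiv g^{\mu\nu}G_{\mu\nu}$ for the trace is introduced here only for illustration and should not be confused with the Newtonian constant.

```latex
g^{\mu\nu}G_{\mu\nu}
  = g^{\mu\nu}R_{\mu\nu}-\frac{1}{2}\,g^{\mu\nu}g_{\mu\nu}\,R
  = R-2R
  = -R
\quad\Longrightarrow\quad
R_{\mu\nu}=G_{\mu\nu}-\frac{1}{2}g_{\mu\nu}\left(g^{\alpha\beta}G_{\alpha\beta}\right)\ .
```

So the Einstein tensor has the opposite trace to the Ricci tensor, and the same operation applied to $G_{\mu\nu}$ returns $R_{\mu\nu}$.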

😇 Exercise 4L

Show that $G^{\alpha}_{\beta;\alpha}=0$ , starting from equation (102).

Just like in the Newtonian theory, the strength of gravity can be adjusted by a constant of proportionality; to make relativistic results compatible with Newtonian results in the appropriate limits (we’ll look at this in the context of cosmology later on), this constant for GR turns out to be $8\pi G$ where $G$ is the Newtonian gravitational constant. That is:

G_{\mu\nu}=8\pi GT_{\mu\nu}\ .

(104)

Be very clear: the $G_{\mu\nu}$ on the LHS is the Einstein curvature tensor (103) whereas on the RHS the $G$ is just the familiar old Newtonian gravitational constant, $G\simeq 6.67\times 10^{-11}\mathrm{m^{3}\,s^{-2}\,kg^{-1}}$ . This double-use of $G$ is a bit weird but we’re stuck with it. For completeness, the Ricci tensor definition (101) can be expanded directly in terms of the Christoffel symbols as:

R_{\mu\nu}=\Gamma^{\alpha}_{\mu\nu,\alpha}-\Gamma^{\alpha}_{\mu\alpha,\nu}+\Gamma^{\alpha}_{\beta\alpha}\Gamma^{\beta}_{\mu\nu}-\Gamma^{\alpha}_{\beta\nu}\Gamma^{\beta}_{\mu\alpha}.

(105)

Exercise 4M Verify equation (105) starting from the expression (83) for the Riemann curvature and the definition (101). Then use equation (105) to show that the Ricci tensor is symmetric, $R_{\mu\nu}=R_{\nu\mu}$. Hint: Three of four terms can be shown to be symmetric without too much trouble. For the fourth term, it may be helpful to use the matrix identity $\mathrm{Tr}\left(\mathsf{M}^{-1}\partial\mathsf{M}/\partial x\right)=\partial\ln|\mathsf{M}|/\partial x$, which is true for any invertible matrix $\mathsf{M}$ that depends on any parameter $x$.
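
The matrix identity in the hint, with $|\mathsf{M}|$ read as the determinant, can be sanity-checked numerically before it is used. The sketch below does this for a single, arbitrarily chosen $2\times 2$ matrix using central finite differences; the matrix entries, the evaluation point and the helper names are illustrative assumptions, and one example is of course not a proof.

```python
import math

def M(x):
    # Hypothetical smooth, invertible 2x2 matrix, chosen only for illustration
    return [[2.0 + x * x, x],
            [math.sin(x), 3.0 + math.exp(x)]]

def det(m):
    # Determinant of a 2x2 matrix
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse(m):
    # Explicit inverse of a 2x2 matrix
    d = det(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(p, q):
    # Tr(P Q) = sum over i, j of P_ij Q_ji
    return sum(p[i][j] * q[j][i] for i in range(2) for j in range(2))

x0, h = 0.7, 1e-5

# Central finite differences for dM/dx and for d ln|det M| / dx
Mp, Mm = M(x0 + h), M(x0 - h)
dM = [[(Mp[i][j] - Mm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

lhs = trace_of_product(inverse(M(x0)), dM)
rhs = (math.log(abs(det(Mp))) - math.log(abs(det(Mm)))) / (2 * h)
print(lhs, rhs)
```

The two printed values should agree up to finite-difference error.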

Reflecting on the Einstein equation

In some ways, the Einstein equation looks a bit peculiar as a description of gravity. For example:

•

$T_{\mu\nu}$ specifies the Ricci tensor $R_{\mu\nu}$ ; in turn $R_{\mu\nu}$ only partially captures the information in the Riemann tensor $R^{\mu}_{\nu\alpha\beta}$ . Where does the rest of the information in $R^{\mu}_{\nu\alpha\beta}$ come from?
•

In Newtonian physics, $\nabla^{2}\Phi=4\pi G\rho$ generates action at a distance because $\nabla^{2}$ is a differential operator. So in general relativity, how does the local value of $\rho$ affect geodesic deviation of particles at a distance – when there are no differential operators in the Einstein equation?

Think about the above, as we’ll discuss it in class.

4.4 Practical consequences: the Friedmann equations

We’re finished with the very fast recap of GR, leaving just the task of calculating the Einstein tensor for cosmological metrics. It looks like hard work – and it is – but we have already done much of it by computing $\Gamma^{\alpha}_{\mu\nu}$ in a flat FRW universe. Using those results (and noting that $\delta^{i}_{\ i}=3$ ), we can show that the Ricci tensor has only two sets of non-vanishing components, $\mu=\nu=0$ and $\mu,\nu=i,j$ :

R_{00}=-3\left(\frac{\ddot{a}}{a}\right),\quad R_{ij}=\delta_{ij}\left[2\dot{a}^{2}+\ddot{a}a\right]\ .

(106)

Contracting the Ricci tensor, we then obtain the Ricci scalar for the flat, homogeneous FRW universe as:

R\equiv g^{\mu\nu}R_{\mu\nu}=R_{00}-\frac{1}{a^{2}}\delta^{ij}R_{ij}=-6\left[\frac{\ddot{a}}{a}+\left(\frac{\dot{a}}{a}\right)^{2}\right]\ .

(107)

☞ Exercise 4N Verify the components of the Ricci tensor and Ricci scalar given above for the flat, homogeneous FRW universe.
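
A hand calculation of equations (106) and (107) can be cross-checked symbolically. The sketch below assumes the SymPy library is available, takes the flat FRW metric in the form $g_{00}=1$, $g_{ij}=-a^{2}\delta_{ij}$, builds the Christoffel symbols from the standard metric formula, and then evaluates equation (105) term by term. The variable and function names are illustrative only.

```python
import sympy as sp

t, x, y, z = sp.symbols("t x y z")
a = sp.Function("a")(t)  # scale factor a(t), left unspecified
coords = [t, x, y, z]

# Flat FRW metric with signature (+,-,-,-), i.e. g_ij = -a^2 delta_ij
g = sp.diag(1, -a**2, -a**2, -a**2)
g_inv = g.inv()

def christoffel(al, mu, nu):
    # Gamma^al_{mu nu} = (1/2) g^{al be} (g_{be mu,nu} + g_{be nu,mu} - g_{mu nu,be})
    return sum(
        g_inv[al, be] * (sp.diff(g[be, mu], coords[nu])
                         + sp.diff(g[be, nu], coords[mu])
                         - sp.diff(g[mu, nu], coords[be]))
        for be in range(4)
    ) / 2

Gamma = [[[sp.simplify(christoffel(al, mu, nu)) for nu in range(4)]
          for mu in range(4)] for al in range(4)]

def ricci(mu, nu):
    # Equation (105), summed over the repeated indices alpha and beta
    expr = 0
    for al in range(4):
        expr += sp.diff(Gamma[al][mu][nu], coords[al])
        expr -= sp.diff(Gamma[al][mu][al], coords[nu])
        for be in range(4):
            expr += Gamma[al][be][al] * Gamma[be][mu][nu]
            expr -= Gamma[al][be][nu] * Gamma[be][mu][al]
    return sp.simplify(expr)

Ric = sp.Matrix([[ricci(mu, nu) for nu in range(4)] for mu in range(4)])
R_scalar = sp.simplify(sum(g_inv[mu, nu] * Ric[mu, nu]
                           for mu in range(4) for nu in range(4)))

print(Ric[0, 0])   # compare with R_00 in equation (106)
print(Ric[1, 1])   # compare with the coefficient of delta_ij in equation (106)
print(R_scalar)    # compare with equation (107)
```

SymPy may print the results in a different but equivalent arrangement from equations (106) and (107), so it is safest to compare by simplifying the difference.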

To understand the evolution of the scale factor in a homogeneous expanding universe, we only need to consider the time-time component of the Einstein equation:

R_{00}-\frac{1}{2}g_{00}R=8\pi GT_{00}\ ,

(108)

leading to the first Friedmann equation for a flat universe:

\framebox{$\displaystyle\left(\frac{\dot{a}}{a}\right)^{2}=\frac{8\pi G}{3}\rho$}\quad\textrm{(flat Universe)}\ .

(109)

📝 Exercise 4O Check that equation (108) implies equation (109).

There is a second Friedmann equation. Consider the space-space component of Einstein’s equation:

R_{ij}-\frac{1}{2}g_{ij}R=8\pi GT_{ij}\ .

(110)

Using the flat FRW terms we worked out previously in Eqs. (5, 106, 107), with $g_{ij}=-\delta_{ij}a^{2}$ we find:

\mathrm{LHS:}\quad\delta_{ij}\left[2\dot{a}^{2}+\ddot{a}a\right]-\delta_{ij}\frac{a^{2}}{2}\,6\left[\frac{\ddot{a}}{a}+\left(\frac{\dot{a}}{a}\right)^{2}\right]\ .

(111)

Noting the mixed form for the perfect fluid energy-momentum tensor, (91), we see that,

\mathrm{RHS:}\quad 8\pi GT_{ij}=8\pi Gg_{ik}T^{k}_{\ j}=8\pi Ga^{2}\delta_{ij}P\ .

(112)

Equating these terms, we obtain

\frac{\ddot{a}}{a}+\frac{1}{2}\left(\frac{\dot{a}}{a}\right)^{2}=-4\pi GP\quad\textrm{(flat Universe)}.

(113)
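
For readers who want the intermediate algebra between (111), (112) and (113), one way to lay it out is sketched here. It expands the second bracket of (111), collects terms, equates the result to (112), and finally divides through by $-2a^{2}\delta_{ij}$.

```latex
R_{ij}-\frac{1}{2}g_{ij}R
  =\delta_{ij}\left[2\dot{a}^{2}+\ddot{a}a\right]-3\,\delta_{ij}\left[\ddot{a}a+\dot{a}^{2}\right]
  =-\delta_{ij}\left[2\ddot{a}a+\dot{a}^{2}\right]
  =8\pi Ga^{2}\delta_{ij}P
\quad\Longrightarrow\quad
\frac{\ddot{a}}{a}+\frac{1}{2}\left(\frac{\dot{a}}{a}\right)^{2}=-4\pi GP\ .
```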

Combining with the first Friedmann equation (109), this leads us to the second Friedmann equation:

\framebox{$\displaystyle\frac{\ddot{a}}{a}=-\frac{4\pi G}{3}\left(\rho+3P\right)$}\quad\textrm{(flat Universe)}\ .

(114)

☞ Exercise 4P Derive Eq. (114) via another route, by differentiating the first Friedmann equation (109) with respect to time $t$, and combining with the continuity equation (96).

Exercise 4P illustrates redundancy, which is a striking feature of the Einstein equations: we find many routes to deriving each result. The reason is that, by construction, both the Einstein tensor $G_{\mu\nu}$ and the energy-momentum tensor $T_{\mu\nu}$ obey the conservation constraint $G^{\mu}_{\nu;\mu}=0=T^{\mu}_{\nu;\mu}$ . Consequently many results can be derived either by considering a little geometry and lots of physics or by considering a little physics and lots of geometry. This can feel quite miraculous, but it’s really a consequence of sneaking thermodynamics into the energy-momentum tensor conservation equation… and then constructing the Einstein equation to match this conservation requirement. If we had done anything else (and Einstein did try many other approaches first), we would end up with an inconsistent theory (which is why he discarded those earlier attempts).
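
This redundancy can be seen concretely for a single fluid. The sketch below assumes the SymPy library, a flat universe filled only with radiation, the equation of state $P=\rho/3$, and the trial solution $a(t)=(2H_{0}t)^{1/2}$; the constant $H_{0}$ and this normalisation are illustrative assumptions, not results quoted from the notes. With $\rho\propto a^{-4}$ from equation (100), the same $a(t)$ should satisfy equations (109), (113) and (114) at once.

```python
import sympy as sp

t, G, H0 = sp.symbols("t G H_0", positive=True)

# Assumed radiation-dominated trial solution, normalised so that a = 1 at t = 1/(2 H0)
a = sp.sqrt(2 * H0 * t)

# Radiation density from equation (100); the prefactor is chosen so that
# equation (109) holds at a = 1. P = rho/3 is the assumed equation of state.
rho0 = 3 * H0**2 / (8 * sp.pi * G)
rho = rho0 * a**-4
P = rho / 3

H = sp.diff(a, t) / a       # expansion rate, a_dot / a
acc = sp.diff(a, t, 2) / a  # a_ddot / a

# Residuals (left-hand side minus right-hand side) of each equation
res_109 = sp.simplify(H**2 - 8 * sp.pi * G * rho / 3)
res_113 = sp.simplify(acc + H**2 / 2 + 4 * sp.pi * G * P)
res_114 = sp.simplify(acc + 4 * sp.pi * G * (rho + 3 * P) / 3)

print(res_109, res_113, res_114)
```

All three residuals should vanish identically in $t$, even though only equation (109) was used to fix the normalisation of the density.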

Still, it’s neat that such a weak statement of thermodynamics as equations (92) and (93), coupled to the equivalence principle, can give rise to powerful results like (96). Moreover, it pretty much is miraculous that any geometric construction that so beautifully mirrors reality is constructible at all. Make no mistake: Einstein’s GR does describe gravity exceptionally well according to experiments. It has been thoroughly tested on scales up to the size of the solar system. For a recent review of work in this area, see https://arxiv.org/abs/1403.7377.

4.5 Summary

•

The presence of gravity is characterised by the acceleration of nearby free-falling particles. By systematising this idea (see Section 1 for a summary of how), we showed that the Riemann curvature tensor $R^{\mu}_{\nu\alpha\beta}$ contains the information about the gravitational field. Something about this curvature tensor must be generated by mass in order to recover gravity as we know it.
•

In Newtonian theory we think of mass or mass density as being the source of gravity. But neither mass nor density is Lorentz-invariant, so neither can on its own be the source of gravity in relativity.
•

The simplest relativistic generalisation of mass is the energy-momentum tensor $T_{\mu\nu}$ , a symmetric tensor which contains density, pressure, and momentum information in $T_{00}$ (time), $T_{ij}$ (spatial), and $T_{0i}$ (time-space) components respectively.
•

This tensor obeys the conservation property $\nabla_{\mu}T^{\mu}_{\nu}=0$ , which is the relativistic generalisation of conservation of energy. This conservation equation packs a surprising amount of information, e.g. the fluid dynamics equations (92) and (93).
•

By process of elimination Einstein realised the theory of gravity is completed by equating $T_{\mu\nu}$ to what is now known as the Einstein tensor $G_{\mu\nu}$ , defined by Eq. (103). The Einstein tensor contains some of the information in the Riemann tensor, but as the former is rank-2 and the latter rank-4, there must be some extra information in the latter.
•

The remaining information in the gravitational field is generated by geometrical constraints that relate the different components of $R^{\mu}_{\nu\alpha\beta}$ , but interpreted physically these look like evolution equations – they constitute the free field of gravitation that gives rise e.g. to gravitational waves.
•

To calculate the dynamics of a given universe one uses Einstein’s field equations, often in combination with the conservation law for $T^{\mu}_{\nu}$ as already summarised in a bullet point above.
•

In an expanding Universe, the conservation equation tells us how density evolves with the scale factor, Eq. (96). The Einstein equations turn into the Friedmann equations, Eqs. (109) and (114), although these will be generalised in the next chapter to include the possibility of spatial curvature.