Notes gr20
Mehrdad Mirbabayi
Abstract
Contents
7 Dilaton Waves
8 Spin-2 Gravity
12 Manifolds
13 Curvature
1 Poincaré Transformations, Causal Structure, Group Theory
One can ask which transformations of the 2d plane leave this distance invariant. Global
translations clearly satisfy this property:
$$ \mathbf{r}' = \mathbf{r} + \mathbf{a}. \tag{1.2} $$
The only other transformations that preserve the Euclidean distance are rotations:
where θ = constant.
It is very convenient during the whole course to adopt matrix notation. So the Euclidean
vectors (and tensors) carry indices $i, j, \cdots$ which run over $1, 2, \cdots, d$, where $d$ is the dimension. For
instance, in a plane we denote
$$ r^1 = x, \qquad r^2 = y, \tag{1.4} $$
and the Kronecker delta is defined as $\delta_{ij} = 1$ if $i = j$ and $0$ otherwise. Repeated indices are
summed over: in two dimensional Euclidean space
with components $R^i{}_j$. Now we see that the characteristic property of rotations can be stated in
terms of rotation matrices:
$$ \delta_{ij} R^i{}_m R^j{}_n = \delta_{mn}. \tag{1.8} $$
Rotations form a group: there is a multiplication rule that combines two rotations into another
rotation, $(R_3)^i{}_j = (R_1)^i{}_k (R_2)^k{}_j$. There is a unit element $R^i{}_j = \delta^i_j$. And for every rotation there is an inverse
rotation. The rotation group is a continuous group; we can continuously increase the rotation angle
from 0. Continuous groups are characterized by their dimension. The groups of 2d and 3d rotations
are respectively 1- and 3-dimensional: there is only one type of rotation on a 2d plane and there
are 3 in three dimensions.
1. How many independent rotations are there in d Euclidean dimensions? Hint: Consider an
infinitesimal rotation
$$ R^i{}_j = \delta^i_j + \epsilon^i{}_j, \qquad \epsilon^i{}_j \ll 1, \tag{1.9} $$
and determine what condition equation (1.8) implies on the components of $\epsilon^i{}_j$. Then count the
number of independent components.
Solution: Working to linear order in $\epsilon$, the condition (1.8) implies
$$ \epsilon_{mn} + \epsilon_{nm} = 0, \tag{1.10} $$
where the index of $\epsilon^i{}_j$ is lowered by the Kronecker delta (so $\epsilon_{ij}$ is actually the same matrix as
$\epsilon^i{}_j$). If there were no condition on the components of $\epsilon_{ij}$ it would have $d^2$ independent components.
The antisymmetry (1.10) means that there are as many independent components as the
number of pairs of distinct indices, which is
$$ \binom{d}{2} = \frac{d(d-1)}{2}. \tag{1.11} $$
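The counting above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names (`num_rotation_generators`, `random_antisymmetric`, `orthogonality_defect`) are hypothetical, and the check simply confirms that an antisymmetric $\epsilon$ violates condition (1.8) only at second order in $\epsilon$.

```python
import random

def num_rotation_generators(d):
    """Number of independent components of an antisymmetric d x d matrix."""
    return d * (d - 1) // 2

def random_antisymmetric(d, size, rng):
    """Random antisymmetric matrix with entries of magnitude at most `size`."""
    eps = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            x = size * rng.uniform(-1.0, 1.0)
            eps[i][j] = x
            eps[j][i] = -x
    return eps

def orthogonality_defect(eps):
    """Largest entry of |delta_ij R^i_m R^j_n - delta_mn| for R = 1 + eps."""
    d = len(eps)
    R = [[(1.0 if i == j else 0.0) + eps[i][j] for j in range(d)] for i in range(d)]
    worst = 0.0
    for m in range(d):
        for n in range(d):
            s = sum(R[i][m] * R[i][n] for i in range(d))
            worst = max(worst, abs(s - (1.0 if m == n else 0.0)))
    return worst

if __name__ == "__main__":
    rng = random.Random(0)
    for d in (2, 3, 4):
        eps = random_antisymmetric(d, 1e-3, rng)
        print(d, num_rotation_generators(d), orthogonality_defect(eps))
```

For an antisymmetric perturbation of size $10^{-3}$ the defect is of order $10^{-6}$, while a symmetric perturbation of the same size would already fail at order $10^{-3}$.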
Poincaré transformations. Special relativity puts time and space on a more equal footing.
In order to have an invariant velocity c (the speed of light) one can find a direct generalization of
Euclidean translations and rotations:
In 3+1 spacetime dimensions Greek indices run over 0, 1, 2, 3, and as before the repeated indices are
summed over. Translations are parametrized by the arbitrary constant vector $a^\mu$. The Lorentz
transformation matrix $\Lambda^\mu{}_\nu$ is the generalization of rotation such that the interval
between two spacetime events A and B is preserved. An event is a point in space-time, identified
by $x^\mu = (x^0, x^1, x^2, x^3)$. Setting $c = 1$, the Minkowski metric becomes $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$. Since
it is sign-indefinite, $I_{AB}$ can have either sign. We can introduce a causal structure: the two events
can be time-like separated ($I_{AB} > 0$), space-like separated ($I_{AB} < 0$), or null separated ($I_{AB} = 0$).
(Mark them on a space-time diagram.)
$$ \gamma = \frac{1}{\sqrt{1 - v^2}}. \tag{1.15} $$
gives a formula for the boost parameter $v^i$ in terms of $x^\mu_{AB}$. First generalize the formula (1.14)
to a boost in a generic direction $v^i$ by replacing $x$ with the component parallel to $v$ and $y, z$
with the perpendicular components. To satisfy (1.16) the boost should clearly be parallel to
$x^i_{AB}$. Therefore
$$ \gamma (x^i_{AB} - v^i t_{AB}) = 0 \quad \Rightarrow \quad v^i = \frac{x^i_{AB}}{t_{AB}}. \tag{1.17} $$
Note that for time-like separated events $|x_{AB}| = \sqrt{\delta_{ij} x^i_{AB} x^j_{AB}} < |t_{AB}|$. Therefore the boost
parameter $v^i$ is subluminal, $v = \sqrt{v^2} < 1$, as it should be.
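As a quick numerical illustration of (1.17), here is a minimal sketch that boosts a time-like separation to the frame where the two events happen at the same place. The function names and the explicit form of the boost in a generic direction (parallel component mixed with time, perpendicular components untouched) are my own spelling-out of the generalization described above, with $c = 1$.

```python
import math

def rest_frame_velocity(sep):
    """Boost velocity v^i = x^i / t for a time-like separation sep = (t, x, y, z)."""
    t, x = sep[0], sep[1:]
    if sum(c * c for c in x) >= t * t:
        raise ValueError("separation is not time-like")
    return tuple(c / t for c in x)

def boost(event, v):
    """Apply a boost with velocity v (3 components) to event = (t, x, y, z)."""
    t, x = event[0], event[1:]
    v2 = sum(c * c for c in v)
    if v2 == 0.0:
        return tuple(event)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    v_dot_x = sum(a * b for a, b in zip(v, x))
    t_new = gamma * (t - v_dot_x)
    # Parallel part transforms as gamma (x_par - v t); perpendicular part is unchanged.
    coeff = (gamma - 1.0) * v_dot_x / v2 - gamma * t
    x_new = tuple(xi + coeff * vi for xi, vi in zip(x, v))
    return (t_new,) + x_new

if __name__ == "__main__":
    sep = (5.0, 1.0, 2.0, 2.0)
    v = rest_frame_velocity(sep)
    print(v, boost(sep, v))
```

In the boosted frame the spatial separation vanishes and the time separation equals the proper time between the events, here $\sqrt{25 - 9} = 4$.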
Those events that are time-like separated are called causally connected since they can influence
one another in a Lorentz invariant theory. The space-like separated events are in turn causally
disconnected. Moreover, depending on the sign of $x^0_{AB}$, two causally connected events A and B
have a time-order. If $x^0_{AB} > 0$ ($< 0$), A is to the future (past) of B. This order is preserved by
proper Lorentz transformations. Therefore, in a Lorentz invariant theory boosted observers agree
on the initial value formulation of physics.
3. Alice, Bob and Charlie meet for breakfast at the bar. Afterwards, Bob sees Alice and Charlie
moving away from him along the same straight line in opposite directions, with constant
velocities $v_A$ and $v_C$ respectively.
(a) A worldline is the trajectory of a timelike observer on a spacetime diagram. It consists of
a continuum of events and can be drawn once $x^i(t)$ is known. Draw the worldlines of Alice, Bob,
and Charlie, assuming that they all meet at $t = 0$ according to the bar’s clock.
(b) Any continuum of points in spacetime is called a curve. The proper length of the curve
is defined by breaking it into infinitesimal straight intervals and summing up the Poincaré
invariant length $d\tau$ of the infinitesimal intervals, given by
This is positive for the infinitesimal elements of a timelike curve, or a worldline. The proper
length of a worldline is called proper time. Calculate $\tau_A, \tau_B, \tau_C$ as functions of $t$.
(c) Write the components of Alice’s 4-velocity in Bob’s frame, i.e. $u^\mu_A = dx^\mu_A/d\tau_A$. Draw $u^\mu_A$
on the spacetime diagram. What is $\eta_{\mu\nu} u^\mu_A u^\nu_A$? What is $\eta_{\mu\nu} u^\mu_A u^\nu_B$?
(d) Find Charlie’s velocity in Alice’s frame.
Solution: (a) Note the distinction between events, which are spacetime points, worldlines,
which are collections of events, and frames. When we say Alice’s frame we mean the inertial frame
in which Alice is at rest. For questions involving relative velocities $dx^\mu/d\tau$, it is not important how
worldlines are offset with respect to one another. The constant translation simply vanishes when
taking the derivative.
If this is part of the worldline of an observer with velocity $v$, then $dx = v\,dt$, therefore
By convention $\tau$ increases toward the future: $d\tau = \gamma^{-1} dt$. If the velocity is constant (as is the case
in our problem), we can integrate this equation to obtain
$$ \tau = \frac{t}{\gamma} + \text{constant}. \tag{1.22} $$
$$ u^\mu = \frac{1}{dt \sqrt{1 - v^2}} (dt, dx^1, dx^2, dx^3) = \gamma (1, \mathbf{v}). \tag{1.23} $$
The norm of $u^\mu$ is given by
(Note that the Lorentz indices are lowered by $\eta_{\mu\nu}$.) Therefore, $u^\mu_A$ is the unit tangent vector to
the worldline A. The inner product is given by
$$ u_{A\mu} u^\mu_B = -\frac{1}{\sqrt{1 - v_A^2}}. \tag{1.25} $$
(d) In Alice’s frame $u^{\prime\mu}_A = (1, 0)$. We can use the result of part (c)
$$ u'_{A\mu} u^{\prime\mu}_C = u_{A\mu} u^\mu_C = -\frac{1}{\sqrt{1 - v^{\prime 2}_C}}. \tag{1.26} $$
$$ u'_{A\mu} u^{\prime\mu}_C = u_{A\mu} u^\mu_C = -\gamma_A \gamma_C (1 - v_A v_C). \tag{1.27} $$
Equating the two expressions and solving for $v'_C$ gives
$$ v'_C = \frac{v_C - v_A}{1 - v_A v_C}. \tag{1.28} $$
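The route from (1.26) and (1.27) to (1.28) can be cross-checked numerically. This is a minimal sketch in 1+1 dimensions with $c = 1$. The function names are hypothetical, and $v_A$, $v_C$ are taken as signed velocities along the common line, so opposite directions means opposite signs.

```python
import math

def four_velocity(v):
    """u^mu = gamma (1, v) in 1+1 dimensions."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma, gamma * v)

def minkowski_dot(a, b):
    """eta_{mu nu} a^mu b^nu with eta = diag(-1, 1)."""
    return -a[0] * b[0] + a[1] * b[1]

def relative_speed_from_invariant(v_a, v_c):
    """|v'_C| obtained from the invariant -u_A . u_C = 1 / sqrt(1 - v'_C^2)."""
    gamma_rel = -minkowski_dot(four_velocity(v_a), four_velocity(v_c))
    return math.sqrt(1.0 - 1.0 / gamma_rel ** 2)

def relative_velocity_formula(v_a, v_c):
    """Equation (1.28)."""
    return (v_c - v_a) / (1.0 - v_a * v_c)

if __name__ == "__main__":
    v_a, v_c = 0.5, -0.5
    print(relative_speed_from_invariant(v_a, v_c), relative_velocity_formula(v_a, v_c))
```

The invariant only fixes the magnitude of $v'_C$. The sign in (1.28) follows from the direction in which Charlie recedes.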
where $R^0{}_0 = 1$, $R^i{}_0 = R^0{}_i = 0$ and $R^i{}_j$ is a 3d rotation. A pure boost $B$ that takes an object
from rest to velocity $v$ is given by
$$ B^0{}_0(v) = \gamma = \frac{1}{\sqrt{1 - v^2}}, $$
$$ B^i{}_j(v) = \delta_{ij} + \frac{\gamma - 1}{v^2} v_i v_j. $$
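A minimal sketch of the pure boost as an explicit $4 \times 4$ matrix may help here. The mixed components $B^0{}_i = B^i{}_0 = \gamma v_i$ are not printed above, so they are an assumption (the standard choice), flagged again in the code. The function names are hypothetical.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def pure_boost(v):
    """Matrix B^mu_nu of a pure boost taking an object at rest to velocity v (c = 1)."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    B = [[0.0] * 4 for _ in range(4)]
    B[0][0] = gamma
    for i in range(3):
        # Assumed mixed components (standard form, not shown in the text above).
        B[0][i + 1] = gamma * v[i]
        B[i + 1][0] = gamma * v[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            extra = (gamma - 1.0) * v[i] * v[j] / v2 if v2 > 0.0 else 0.0
            B[i + 1][j + 1] = delta + extra
    return B

def interval_defect(L):
    """Largest entry of |L^T eta L - eta|; it vanishes for a Lorentz transformation."""
    worst = 0.0
    for r in range(4):
        for s in range(4):
            val = sum(L[m][r] * ETA[m][n] * L[n][s] for m in range(4) for n in range(4))
            worst = max(worst, abs(val - ETA[r][s]))
    return worst

if __name__ == "__main__":
    B = pure_boost((0.3, 0.4, 0.0))
    print([row[0] for row in B], interval_defect(B))
```

The first column of $B$ is the image of the rest-frame 4-velocity $(1, 0, 0, 0)$, which should be $\gamma(1, v)$.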
6. ∗ Two observers move in opposite directions on a circle of radius R with constant angular
velocities ω1 and ω2 . When they first meet, they synchronize their clocks. When they meet
again, whose clock will be delayed, and by how much?
2 Relativistic Dynamics, Constant Acceleration
The main principle of special relativity is that the laws of physics have to look the same in
all inertial frames (or to all inertial observers), which are related by Poincaré transformations.
Therefore, they have to relate Lorentz scalars to Lorentz scalars, Lorentz vectors to Lorentz vectors
and so on. These relations are called covariant. Moreover, in the appropriate limit they have to
reproduce the laws of non-relativistic dynamics.
$$ p^\mu = m u^\mu = m \frac{dx^\mu}{d\tau}. \tag{2.2} $$
Since $dx^\mu$ is a four-vector and $d\tau$ is invariant under Lorentz transformations, (2.1) is covariant. Moreover,
in the non-relativistic limit
$$ p^i \to m v^i, \qquad p^0 \to m + \frac{1}{2} m v^2, \qquad \text{non-relativistic} \tag{2.3} $$
so (2.1) reduces to non-relativistic energy-momentum conservation. Note that the initial and final
particles are always the same in non-relativistic processes because the kinetic energies are too small
to have particle creation or annihilation.
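$$ m \frac{d^2 x^i}{dt^2} = F^i_{\rm Newton}, \qquad \text{Rest Frame}. \tag{2.5} $$
$$ \frac{d^2 x^i}{dt^2} = \gamma^{-1} \frac{d}{d\tau}\left( \gamma^{-1} \frac{dx^i}{d\tau} \right) = \gamma^{-2} \frac{d^2 x^i}{d\tau^2} - v^j \frac{dv^j}{d\tau} \frac{dx^i}{d\tau}. \tag{2.6} $$
In the rest frame $\gamma = 1$ and $v^j = 0$, therefore
$$ f^i_{RF} = F^i_{\rm Newton}. \tag{2.7} $$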
1. What is $f^0_{RF}$? Hint: use the fact that $u^\mu = dx^\mu/d\tau$ has unit norm $u_\mu u^\mu = -1$.
Solution: Take the derivative of $u_\mu u^\mu = -1$:
$$ 0 = \frac{d}{d\tau}(u_\mu u^\mu) = 2 u_\mu \frac{du^\mu}{d\tau} = 2 u_\mu f^\mu. \tag{2.8} $$
$$ f^0_{RF} = 0. \tag{2.9} $$
2. Rindler Observer Bob starts at rest at the origin in frame $F$. He then moves along the
x-axis with constant acceleration $a$. By “acceleration” in special relativity I mean: go to the
instantaneous rest frame of Bob, $F'$, in which Bob’s velocity is $v' = dx'/dt' = 0$, where $x'$ and
$t'$ are the coordinates in frame $F'$. Then in frame $F'$ Bob has acceleration $dv'/dt' = a$. Find
the equation of motion of Bob $x(t)$ in frame $F$.
Solution: Constant acceleration means that
$$ \ddot{t}' \equiv \frac{d^2 t'}{d\tau^2} = 0, \qquad \ddot{x}' \equiv \frac{d^2 x'}{d\tau^2} = a, \qquad \text{in the } F' \text{ frame}, \tag{2.10} $$
where the boost parameter is $v = \dot{x}/\dot{t}$, and $\gamma = (1 - v^2)^{-1/2} = \dot{t}$. Using this and (2.10) we
find
$$ \ddot{t} = a \dot{x}, \qquad \ddot{x} = a \dot{t}. \tag{2.12} $$
$$ x(t) = \frac{1}{a}\left( \sqrt{1 + a^2 t^2} - 1 \right). \tag{2.13} $$
$$ x(\tau) = \frac{1}{a}(\cosh a\tau - 1), \qquad t(\tau) = \frac{1}{a} \sinh a\tau. \tag{2.14} $$
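The three results (2.12), (2.13) and (2.14) can be checked against each other numerically. The following is a minimal sketch with hypothetical function names, using simple central finite differences for the proper-time derivatives.

```python
import math

def t_of_tau(a, tau):
    """t(tau) from (2.14)."""
    return math.sinh(a * tau) / a

def x_of_tau(a, tau):
    """x(tau) from (2.14)."""
    return (math.cosh(a * tau) - 1.0) / a

def x_of_t(a, t):
    """x(t) from (2.13)."""
    return (math.sqrt(1.0 + (a * t) ** 2) - 1.0) / a

def first_derivative(f, tau, h=1e-4):
    """Central finite-difference approximation to df/dtau."""
    return (f(tau + h) - f(tau - h)) / (2.0 * h)

def second_derivative(f, tau, h=1e-4):
    """Central finite-difference approximation to d^2 f / dtau^2."""
    return (f(tau + h) - 2.0 * f(tau) + f(tau - h)) / (h * h)

if __name__ == "__main__":
    a, tau = 2.0, 0.7
    t = lambda s: t_of_tau(a, s)
    x = lambda s: x_of_tau(a, s)
    print(second_derivative(t, tau), a * first_derivative(x, tau))
    print(second_derivative(x, tau), a * first_derivative(t, tau))
    print(x_of_t(a, t(tau)), x(tau))
```

Each printed pair should agree up to finite-difference error, and eliminating $\tau$ from (2.14) reproduces (2.13) exactly.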
3. Rindler Horizon Alice stays at the origin in frame F and sends text-messages to Bob. After
a time t the messages will never reach Bob. Calculate t.
FYI: Rindler horizon provides an extremely useful and simple setup to understand lots of
properties (both classical and quantum mechanical) of black hole horizons in general relativity.
From Bob’s point of view Alice falling behind the Rindler horizon is not much different from
Alice jumping inside a black hole of radius R = 1/(2a). Fortunately, unlike black holes there
is no singularity hiding behind the Rindler horizon.
Solution: The worldline of Bob is a hyperbola. It asymptotes to the null line
$$ t = x + \frac{1}{a}. \tag{2.15} $$
This is the last signal that can reach Bob. It intersects Alice’s worldline $x = 0$ at $t = 1/a$.
$$ x = t - t_A, \tag{2.16} $$
$$ t_A = \frac{1}{a}(\sinh a\tau - \cosh a\tau + 1). \tag{2.17} $$
Therefore,
$$ \omega_A^{-1} = dt_A = (\cosh a\tau - \sinh a\tau)\, d\tau = (\cosh a\tau - \sinh a\tau)\, \omega_B^{-1}. \tag{2.18} $$
$$ \omega_B \to e^{-a\tau} \omega_A. \tag{2.19} $$
A more systematic way is to consider the four-momentum $k^\mu$ of the photon that is sent from
Alice to Bob. By definition, the frequency of the photon measured by an observer with
4-velocity $u^\mu$ is
$$ \omega = -u_\mu k^\mu. \tag{2.20} $$
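Formula (2.20) reproduces the redshift (2.19) directly. Here is a minimal sketch in 1+1 dimensions. It assumes a right-moving photon with $k^\mu = \omega_A (1, 1)$ emitted by Alice at rest, and Bob's 4-velocity $u^\mu = (\cosh a\tau, \sinh a\tau)$ obtained by differentiating (2.14). The function names are hypothetical.

```python
import math

def minkowski_dot(a, b):
    """eta_{mu nu} a^mu b^nu with eta = diag(-1, 1)."""
    return -a[0] * b[0] + a[1] * b[1]

def observed_frequency(u, k):
    """omega = -u_mu k^mu, equation (2.20)."""
    return -minkowski_dot(u, k)

def photon_momentum(omega_emitted):
    """Right-moving photon: k^mu = omega (1, 1), assuming the emitter is at rest."""
    return (omega_emitted, omega_emitted)

def bob_four_velocity(a, tau):
    """u^mu = (dt/dtau, dx/dtau) for the Rindler observer of (2.14)."""
    return (math.cosh(a * tau), math.sinh(a * tau))

if __name__ == "__main__":
    a, tau, omega_a = 0.5, 2.0, 3.0
    k = photon_momentum(omega_a)
    print(observed_frequency((1.0, 0.0), k))
    print(observed_frequency(bob_four_velocity(a, tau), k), omega_a * math.exp(-a * tau))
```

Here $\tau$ is Bob's proper time when the photon arrives, so the received frequency decays exponentially in Bob's own time.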
5. ∗ Now suppose that Alice and Bob are initially at rest, respectively at x = XA and x = XB
with XA > XB > 0. At t = 0 they launch their spaceships with constant proper accelerations
1/XA and 1/XB . Bob is sending a message to Alice every one minute. What is the frequency
at which Alice receives the messages?
3 Electromagnetism, Variation Principle
More generally we can set the component of $A_\mu$ along any unit vector $n^\mu$ to zero by taking
$$ \alpha = -\int n_\mu dx^\mu \, n^\nu A_\nu. \tag{3.4} $$
2. Pure Gauge Configuration. If $F_{\mu\nu} = 0$ everywhere, then the gauge field is a pure gauge
$A_\mu = \partial_\mu \alpha$. Find $\alpha$.
Hint: It is useful to remember how in Newtonian physics potential energy is defined for a
curl-free static force field $F(x)$ (not to be confused with $F_{\mu\nu}$):
This definition only fixes the potential difference between two points:
$$ \phi(x) = \phi(0) - \int_{C,\,0}^{x} dx' \cdot F(x'). \tag{3.6} $$
The integral is defined along a path $C$ connecting the two points $0$ and $x$. For $\phi(x)$ to be
well-defined, the answer must be independent of the path. Equivalently, we must have
$$ \oint dx \cdot F(x) = 0, \qquad \text{for all closed loops}. \tag{3.7} $$
This is related to $\nabla \times F$ by Stokes' theorem and is guaranteed by the condition $\nabla \times F = 0$.
Solution:
$$ \alpha(x) = \int_{C,\,x_0}^{x} dx^\mu A_\mu. \tag{3.8} $$
This is well-defined because $\oint dx^\mu A_\mu = 0$ for any loop: Stokes' theorem relates this to
$\int d\Sigma^{\mu\nu} F_{\mu\nu}$ over the enclosed area, which vanishes by assumption. The reference point $x_0$
Aharonov-Bohm Effect. If the spacetime region with $F_{\mu\nu} = 0$ has a nontrivial topology the
above construction can fail. Imagine there is a solenoid with a magnetic flux $\Phi \neq 0$. Then
$$ \oint dx^\mu A_\mu = \Phi, \qquad \text{if the loop goes around the solenoid}. \tag{3.9} $$
Here $A_\mu$ is not a pure gauge configuration even outside the solenoid, even though $F_{\mu\nu} = 0$ outside.
Of course this is not a surprise since the condition for $A_\mu$ being a pure gauge was that $F_{\mu\nu} = 0$
everywhere.
A more interesting situation arises if we cut and throw away the part of the spacetime that is
occupied by the solenoid. Now we have a spacetime with a boundary. The original solution for
Aµ is still a perfectly valid solution of Maxwell equations, and now Fµν = 0 everywhere since the
solenoid is not part of the spacetime anymore. Yet clearly Aµ is not a pure gauge.
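3. Point-Particle Action Find the charged particle equation of motion by varying the following
action
$$ S[X, A] = m \int d\sigma \sqrt{-\eta_{\mu\nu} \frac{dX^\mu}{d\sigma} \frac{dX^\nu}{d\sigma}} - q \int d\sigma \frac{dX^\mu}{d\sigma} A_\mu(X). \tag{3.10} $$
Solution:
$$ m \frac{d}{d\sigma} \frac{dX_\mu/d\sigma}{\sqrt{-\eta_{\rho\sigma} \frac{dX^\rho}{d\sigma} \frac{dX^\sigma}{d\sigma}}} + q \frac{d}{d\sigma} A_\mu(X) - q \frac{dX^\nu}{d\sigma} \partial_\mu A_\nu = 0. \tag{3.11} $$
Using
$$ \frac{d}{d\tau} = \frac{1}{\sqrt{-\eta_{\mu\nu} \frac{dX^\mu}{d\sigma} \frac{dX^\nu}{d\sigma}}} \frac{d}{d\sigma} \tag{3.12} $$
and
$$ \frac{d}{d\sigma} A_\mu(X) = \frac{dX^\nu}{d\sigma} \partial_\nu A_\mu \tag{3.13} $$
we find
$$ m \ddot{X}_\mu = q \dot{X}^\nu F_{\mu\nu}. \tag{3.14} $$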
to $A_\mu$ gives the Maxwell equations. What is the electric current corresponding to a point
particle with charge $q$? Hint: To vary with respect to $A_\mu$, the coupling of $A_\mu$ to the worldline
in (3.10) has to be expressed in terms of an action defined as the integral of a Lagrangian
density over the spacetime $d^4x$. This can be done using a Dirac delta function $\delta^4(x^\mu - X^\mu(\sigma))$.
Use the delta function and the fact that for any $f$ and $g$
$$ \int d\sigma\, g(\sigma)\, \delta(f(\sigma)) = \sum_n \frac{g(\sigma_n)}{|df(\sigma_n)/d\sigma|}, \qquad f(\sigma_n) = 0. \tag{3.15} $$
$$ \int d^4x\, d\sigma\, \frac{dX^\mu}{d\sigma} A_\mu(x)\, \delta^4(x^\mu - X^\mu(\sigma)). \tag{3.16} $$
$$ J^\mu = q \int d\sigma\, \frac{dX^\mu}{d\sigma}\, \delta^4(x^\mu - X^\mu(\sigma)). \tag{3.17} $$
We can use the delta function on $x^0$ to perform the $\sigma$ integration. Note that for a timelike
geodesic there is one and only one solution to the equation $X^0(\sigma) = x^0$, as long as $\sigma$ is a
nonsingular parameter along the worldline.
$$ J^\mu = q \frac{\dot{X}^\mu}{\dot{X}^0}\, \delta^3(x^i - X^i(\tau)). \tag{3.18} $$
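The delta-function identity (3.15), which produces the factor $1/\dot{X}^0$ in (3.18), can be illustrated numerically. This is a minimal sketch that replaces the Dirac delta by a narrow Gaussian. The width, the grid, and the test functions $f(\sigma) = \sigma^2 - 1$, $g(\sigma) = \sigma + 2$ are arbitrary choices made for the illustration.

```python
import math

def nascent_delta(u, width):
    """Narrow normalized Gaussian standing in for the Dirac delta."""
    return math.exp(-0.5 * (u / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def lhs_integral(f, g, lo, hi, width=1e-2, steps=60000):
    """Midpoint-rule estimate of the integral of g(s) delta(f(s)) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for n in range(steps):
        s = lo + (n + 0.5) * h
        total += g(s) * nascent_delta(f(s), width)
    return total * h

def rhs_sum(g, dfds, roots):
    """Sum over the roots of f of g / |df/ds|, the right-hand side of (3.15)."""
    return sum(g(s) / abs(dfds(s)) for s in roots)

if __name__ == "__main__":
    f = lambda s: s * s - 1.0
    g = lambda s: s + 2.0
    dfds = lambda s: 2.0 * s
    print(lhs_integral(f, g, -3.0, 3.0), rhs_sum(g, dfds, [-1.0, 1.0]))
```

Both roots $\sigma = \pm 1$ contribute, each weighted by $1/|f'| = 1/2$. For a worldline with a single solution of $X^0(\sigma) = x^0$ only one term survives.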
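5. ∗ Worldline Reparametrization. Check that the point particle action (3.10) is invariant
under worldline reparametrization
$$ \sigma \to \sigma' = f(\sigma) \tag{3.19} $$
$$ S_{\rm Polyakov} = \frac{1}{2} \int d\sigma\, e^{-1} \left( \dot{X}^\mu \dot{X}_\mu - e^2 m^2 \right), \qquad \dot{X}^\mu = \frac{dX^\mu}{d\sigma}, $$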
where e (which is called the worldline metric) is an additional dynamical variable. Are
these equations equivalent to the equation of motion for a point-particle? Note that in this
action (unlike the Nambu-Goto action), one can set m = 0. Does it reproduce what you
expect for a massless particle? What is the transformation of e under reparametrizations of
σ?
7. ∗ In addition to the physical time translation invariance $X^0 \to X^0 + \epsilon$, the action for a point-like
particle is obviously invariant also under the worldline time translation (a special case of
worldline reparametrization),
$$ X^\mu(\sigma) \to X^\mu(\sigma + \epsilon). $$
Normally, by Noether's theorem one expects to obtain the conserved “worldline energy” corresponding
to this symmetry. Calculate this energy.
where [] denotes full antisymmetrization, e.g. $A_{[\mu\nu]} = (A_{\mu\nu} - A_{\nu\mu})$. $d\omega$ is called an exact
form. A form $\Omega$ is called a closed form if $d\Omega = 0$. Show that all exact forms are closed:
$dd\omega = 0$ for all $\omega$.
The Hodge dual of a rank-$r$ form is a rank-$(d - r)$ form:
Find $\star\star\omega$.
Show that the Maxwell equations can be written as
$$ d \star F = \star J, \qquad dF = 0. \tag{3.22} $$
9. ∗ The identity dF = 0 implies that (locally) there exists a one-form field A such that F = dA
— in components Fµν = ∂µ Aν − ∂ν Aµ . Prove this by giving an explicit integral formula for
Aµ in terms of Fµν . Is your answer unique? (Hint: Recall how potential energy is defined for
a curl-free static force field.)
10. ∗ Cluster Decomposition. The first term in the point particle action (3.10) is the Lorentzian
length of the particle trajectory $m\tau[X^\mu(\sigma)]$ (square brackets emphasize the fact that $\tau$ is a
function of the functions $X^\mu(\sigma)$, i.e. it is a functional). Suppose we replace this term by a nonlinear
function of the length, such as $(\tau[X^\mu(\sigma)])^2$. Show that for a free particle the new action
is extremized by a straight trajectory.
However, in the presence of a force (such as the EM force) the motion of the particle changes
in a funny way. The acceleration will depend on the full history. This theory violates the cluster
decomposition principle: the results of experiments performed at far separated points
depend on one another.
4 Noether Theorem, Symmetries and Conservation Laws
Consider an action
$$ S[\phi] = \int d^4x\, \mathcal{L} \tag{4.1} $$
where $\phi$ is a general notation for all dynamical variables. A symmetry transformation is a transformation
$$ \phi \to \phi + \delta_s \phi \tag{4.2} $$
2. Consider the worldline action $S_{PP} = \int d\sigma \sqrt{-\eta_{\mu\nu} \frac{dX^\mu}{d\sigma} \frac{dX^\nu}{d\sigma}}$. It is invariant under both translations
and Lorentz transformations
From the symmetries we can derive conservation laws as follows. Multiply the infinitesimal
symmetry transformation by a function $\epsilon(x)$ with finite support around a point $x_0$:
$$ \phi \to \phi + \epsilon(x) \delta_s \phi. \tag{4.8} $$
Because $\epsilon(x \to \infty) = 0$ the boundary term in the variation of the action vanishes. Moreover, the
variation would be zero if $\partial_\mu \epsilon = 0$, so one obtains
$$ \delta S[\phi + \epsilon(x) \delta_s \phi] = \int d^4x\, \partial_\mu \epsilon\, J^\mu = -\int d^4x\, \epsilon\, \partial_\mu J^\mu, \tag{4.9} $$
for some vector $J^\mu$ which depends on $\phi$ and its derivatives. In the last step I integrated by parts
and used the asymptotic behavior $\epsilon(x \to \infty) = 0$. On-shell, the action is stationary. In particular,
its variation under (4.8) must vanish. Since $\epsilon(x)$ is arbitrary, we must have
$$ \partial_\mu J^\mu |_{\text{on-shell}} = 0. \tag{4.10} $$
(a) Promoting the shift symmetry of the massless scalar field to a spacetime varying function$^1$
$$ \phi \to \phi + c(x) \tag{4.11} $$
leads to
$$ \delta S = -\int d^4x\, \partial_\mu c(x)\, \partial^\mu \phi. \tag{4.12} $$
(b) ∗ Find the conserved current $J^\mu_b$ corresponding to the linear shift, and verify its conservation.
(c) Stress-Energy Tensor. Consider spacetime dependent translations
$$ \phi \to \phi + a^\mu(x) \partial_\mu \phi, \tag{4.14} $$
where $a^\mu(x \to \infty) = 0$. The change of the action after some partial integrations can be
written as
$$ \delta S = -\int d^4x\, \partial_\mu a^\nu(x) \left[ \partial^\mu \phi\, \partial_\nu \phi - \frac{1}{2} \delta^\mu_\nu (\partial\phi)^2 \right]. \tag{4.15} $$
There is one conserved current for each component of $a^\nu$. They are collected in a single
rank-2 tensor called the stress-energy tensor $T^\mu{}_\nu$. $\partial_\mu T^\mu{}_\nu$ can be easily seen to be proportional to
$\Box\phi$.
(d) ∗ For a massive field show that
$$ T^\mu{}_\nu = \partial^\mu \phi\, \partial_\nu \phi + \delta^\mu_\nu \left( -\frac{1}{2} (\partial\phi)^2 - \frac{1}{2} m^2 \phi^2 \right) \tag{4.16} $$
1
To avoid clutter, I often combine the arbitrary function $\epsilon(x)$ with the parameter of transformation – the constant
$c$ in this example – to define a single spacetime dependent parameter $c(x) = \epsilon(x) c$.
and that it is conserved on-shell.
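For the massless case of part (c), the statement that the divergence of the stress-energy tensor is proportional to $\Box\phi$ can be spelled out in one line. The following is a short worked derivation, taking $T^\mu{}_\nu$ to be the bracket in (4.15) and writing $\Box \equiv \partial_\mu \partial^\mu$ (my notation).

```latex
\begin{aligned}
\partial_\mu T^\mu{}_\nu
  &= \partial_\mu\!\left(\partial^\mu\phi\,\partial_\nu\phi\right)
     - \tfrac{1}{2}\,\partial_\nu (\partial\phi)^2 \\
  &= \Box\phi\,\partial_\nu\phi
     + \partial^\mu\phi\,\partial_\mu\partial_\nu\phi
     - \partial^\mu\phi\,\partial_\nu\partial_\mu\phi \\
  &= \Box\phi\,\partial_\nu\phi .
\end{aligned}
```

The last two terms cancel because partial derivatives commute, so the divergence vanishes on the massless equation of motion $\Box\phi = 0$.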
$$ \delta S_{PP} = \int d\sigma\, \frac{da^\mu}{d\sigma} P_\mu, \qquad P_\mu = m \frac{dX_\mu}{d\tau}. \tag{4.18} $$
From a conserved current one can derive a conserved charge. Let’s define
$$ Q(t) \equiv \int d^3x\, J^0(t, x). \tag{4.19} $$
5. ∗ The above conservation of $Q$ is a special case of a more general property. Show that
$\partial_\mu J^\mu = 0$ is equivalent to
$$ d \star J = 0, \tag{4.21} $$
where $J$ stands for the 1-form $J_\mu$. Integrating this equation over any spacetime region $R$
gives
$$ 0 = \int_R d \star J = \int_{\partial R} \star J \tag{4.22} $$
6. ∗ Nambu-Goto Action. The analog of the worldline action for a string (or more generally an
n-dimensional brane, with point-particle and string corresponding, respectively, to n = 0 and
n = 1) is the Nambu-Goto action. For a string, n = 1, it is
$$ S = -\ell_s^{-2} \int d^2\sigma \sqrt{-\det h_{\alpha\beta}}, $$
where $\ell_s^{-2}$ is the string tension, $\alpha, \beta$ are worldsheet indices which run over 0, 1, and the induced
metric of the worldsheet is defined as
$$ h_{\alpha\beta} = \partial_\alpha X^\mu \partial_\beta X^\nu \eta_{\mu\nu}, \qquad \mu, \nu = 0, \cdots, d. $$
Check that the Nambu–Goto action is invariant under worldsheet reparametrizations,
$$ X^\mu(\sigma^\alpha) \to X^\mu(f^\alpha(\sigma)). $$
5 Representations of Poincaré Group and Spin
This implies that under composition of any two $g_1, g_2 \in G$, the symmetry group, $U$ satisfies
Hence the $U$’s form a representation of $G$. If the phase $\phi(g_1, g_2)$ cannot be set to zero by a redefinition
of $U$ this is called a projective representation of the group. The probabilities must remain invariant
if all state vectors are transformed together. This implies that $U$ is unitary and linear
or else antiunitary and antilinear. We don’t need to worry about the latter case; see Weinberg 2.2
for further discussion.
We are especially interested in continuous groups (or Lie groups), like rotations, where the
group elements can be parametrized by a set of real numbers $\{\theta^a\}$, with $\theta^a = 0$ corresponding to
the identity element. The unitary operators can be expanded around 1:
where $t_a$ are called the generators of the group.$^2$ For $U$ to be unitary, $t_a$ must be Hermitian: $t_a^\dagger = t_a$.
Moreover, the composition rules of the symmetry group imply certain commutation relations on
$t_a$. Perhaps the most familiar example is the commutation relations between generators of rotations,
i.e. the angular momentum algebra:
$-\theta_1$, followed by $-\theta_2$. Work at first order in $\theta_1$ and first order in $\theta_2$ (this includes $O(\theta_1 \theta_2)$).
Verify (5.5) by demanding that the combination of four $U$’s reproduces the net rotation.
Solution: The result of the combined rotation is
namely, a rotation by $\theta_1 \theta_2$ around $x^3$. Multiplying four $U$ operators and using (5.5) gives
as it should.
The commutation relation of the generators $\{t_a\}$ is called the Lie algebra of the group. Applying
a similar argument to the unitary representation of the Poincaré group,
$$ U(\omega, \epsilon) = 1 + \frac{1}{2} i \omega_{\mu\nu} J^{\mu\nu} - i \epsilon_\mu P^\mu + \cdots \tag{5.8} $$
$$ i[J^{\mu\nu}, J^{\rho\sigma}] = \eta^{\nu\rho} J^{\mu\sigma} - \eta^{\nu\sigma} J^{\mu\rho} + \eta^{\mu\sigma} J^{\nu\rho} - \eta^{\mu\rho} J^{\nu\sigma}, \tag{5.9} $$
$$ i[P^\mu, J^{\rho\sigma}] = \eta^{\mu\rho} P^\sigma - \eta^{\mu\sigma} P^\rho, \tag{5.10} $$
$$ i[P^\mu, P^\nu] = 0. \tag{5.11} $$
The generators of spacetime translations are momentum operators $P^\mu$. The generators of proper
Lorentz transformations $J^{\mu\nu}$ comprise boosts $K^i = J^{i0}$ and angular momenta $J_i = \epsilon_{ijk} J^{jk}$.
2. ∗ Why are the conserved charges associated to a symmetry identified with the generators
of those transformations? Hint: Repeat the derivation of the Noether theorem in the path
integral formalism and with the insertion of a local operator O(y). Derive the Ward identity:
where T denotes time-order product, and δO is the variation of O under the symmetry
transformation. Integrate this equation over a thin slab of spacetime that contains y and
extends to spatial infinity.
As in the case of rotations in non-relativistic quantum mechanics (QM), we need to find irre-
ducible unitary representations of the symmetry group and their dimensionality to identify different
particle species. In the same sense that there is no intrinsic difference between spin up and spin
down electrons (they can be transformed into one another by a rotation), the irreducible representa-
tion of Poincaré group is a set of states that can be transformed into one another by a combination
of translations, boosts, and rotations.
with $\sigma$ standing for all other quantum numbers, e.g. the analog of spin up and spin down states
of the electron in non-relativistic QM. If we have a state with multiple particles then $p^\mu$ is the center
of mass momentum, and $\sigma$ would characterize, among other things, the relative momenta of those
particles. Hence it would be a continuous label. Our goal here is to classify single particle states and
study their interactions perturbatively. So we ignore such a possibility, and take $\sigma$ to be discrete.
Having diagonalized the momentum operators, Lorentz transformations $\Lambda^\mu{}_\nu$ have to be taken care
of next. Apply a general Lorentz transformation $\Lambda^\mu{}_\nu$ to $\Psi_{p,\sigma}$. The new state $U(\Lambda)\Psi_{p,\sigma}$ is still an
eigenstate of $P^\mu$ with eigenvalue $p^{\prime\mu} = \Lambda^\mu{}_\nu p^\nu$, because using the Poincaré algebra (check)
The matrix $D_{\sigma\sigma'}$ has a block-diagonal structure with each block corresponding to an irreducible
representation. Note that in non-relativistic QM the problem of finding irreducible representations
of the group of Galilean transformations has a relatively simple answer. Since Galilean boosts
commute, any decomposition of $\Lambda$ into a boost and a rotation gives a unique answer $R^i{}_j$ for the
rotation component. The matrix $D_{\sigma\sigma'}$ will depend only on $R^i{}_j$, which is an element of SO(3) and
whose irreducible representations are particles of definite spin.
Nevertheless, I treat both cases. The trick is to choose a reference momentum $k^\mu$, and define
arbitrary momentum states by applying a standard Lorentz transformation $L^\mu{}_\nu(p)$ to the reference
state:
$$ \Psi_{p,\sigma} \equiv N(p)\, U(L(p))\, \Psi_{k,\sigma}, \tag{5.16} $$
where $N(p)$ is a normalization factor. The action of an arbitrary Lorentz transformation $U(\Lambda)$ on
$\Psi_{p,\sigma}$, which takes it to a superposition of states with momentum $\Lambda p = \Lambda L(p) k$, can be decomposed
into a Lorentz transformation $W^\mu{}_\nu$ which leaves $k^\mu$ invariant, followed by $L(\Lambda p)$. More explicitly,
using (5.2),
The transformations $W$ which leave $k^\mu$ invariant form a subgroup of Lorentz transformations which
is called the little group. Apparently
$$ U(W(\Lambda, p))\, \Psi_{k,\sigma} = \sum_{\sigma'} D_{\sigma\sigma'}(W)\, \Psi_{k,\sigma'} \tag{5.18} $$
and hence, finding representations of the Lorentz group reduces to finding the representations of the
little group. The action of $U(\Lambda)$ on any $\Psi_{p,\sigma}$ is then determined from its little group image $W$
$$ U(\Lambda)\, \Psi_{p,\sigma} = \frac{N(p)}{N(\Lambda p)} \sum_{\sigma'} D_{\sigma\sigma'}(W)\, \Psi_{\Lambda p,\sigma'}, \tag{5.19} $$
For massive particles $-p^2 = m^2 > 0$, it is natural to choose $k^\mu = (m, 0)$. $L(p)$ can be taken
to be a pure boost (defined in section 1) with parameter $v^i = p^i/p^0$. The little group is SO(3)
rotations whose irreducible representations are, as anticipated, states of integer and half integer
spin.
For massless particles $p^2 = 0$, choose $k^\mu = (\kappa, \kappa, 0, 0)$ where $\kappa = 1\,\mathrm{eV}$. $L(p)$ can be taken to
be a boost along $x^1$ that takes $k^\mu$ to $(p^0, p^0, 0, 0)$ (find the boost parameter), followed by a rotation
that takes $\hat{x}^1$ to $\hat{p}$. The little group is more interesting.
$$ J = J^{23}, \qquad A \equiv J^{02} - J^{12}, \qquad B \equiv J^{03} - J^{13}. \tag{5.20} $$
which is the algebra of the symmetry group of Euclidean plane (two translations and a rotation
J).
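That the combinations in (5.20) leave $k^\mu = (\kappa, \kappa, 0, 0)$ invariant and close into the algebra of the Euclidean plane can be checked with explicit $4 \times 4$ matrices. This is a minimal sketch in the vector representation. The normalization $(M^{\mu\nu})^\rho{}_\sigma = \eta^{\mu\rho}\delta^\nu_\sigma - \eta^{\nu\rho}\delta^\mu_\sigma$ is an assumption that differs from the Hermitian $J^{\mu\nu}$ by a convention-dependent factor, so only statements insensitive to that factor (annihilating $k$, vanishing or proportional commutators) are tested.

```python
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def generator(mu, nu):
    """(M^{mu nu})^rho_sigma = eta^{mu rho} delta^nu_sigma - eta^{nu rho} delta^mu_sigma."""
    return [[ETA[mu][rho] * (1 if nu == sigma else 0)
             - ETA[nu][rho] * (1 if mu == sigma else 0)
             for sigma in range(4)] for rho in range(4)]

def sub(X, Y):
    return [[X[r][s] - Y[r][s] for s in range(4)] for r in range(4)]

def matmul(X, Y):
    return [[sum(X[r][l] * Y[l][s] for l in range(4)) for s in range(4)] for r in range(4)]

def commutator(X, Y):
    return sub(matmul(X, Y), matmul(Y, X))

def apply(X, k):
    return [sum(X[r][s] * k[s] for s in range(4)) for r in range(4)]

J = generator(2, 3)
A = sub(generator(0, 2), generator(1, 2))
B = sub(generator(0, 3), generator(1, 3))

if __name__ == "__main__":
    k = [1, 1, 0, 0]  # reference momentum in units of kappa
    print(apply(J, k), apply(A, k), apply(B, k))
    print(commutator(A, B))
```

In this normalization one finds $[A, B] = 0$, $[J, A] = -B$ and $[J, B] = A$: two commuting "translations" rotated into each other by $J$.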
However, if $a$ or $b$ is nonzero they must be continuous labels because under a rotation by angle $\theta$
There is no compelling evidence for the existence of such continuous spin representations in nature.$^3$
We ignore them here and set $a = b = 0$. Thus we are left with the representations of SO(2):
rotations around the momentum vector, whose eigenvalues are called helicity. Since a rotation
by $2\pi$ has to give $\pm 1$, the helicities of massless representations are either integer or half integer.
The latter has to do with the fact that the Lorentz group is doubly connected (Weinberg 2.7) and
therefore it admits projective representations (with half-integer spin and helicity). A rotation by
$2\pi$ corresponds to a closed path in the group manifold that cannot be shrunk to zero, while a
rotation by $4\pi$ can always be shrunk to zero.
So far there is no reason for helicities to come in pairs $h = \pm\sigma$ when $\sigma \neq 0$. However, they
have to because, firstly, parity changes the sign of $h$, and secondly, even if parity is not respected
by interactions of a particle, causality of interactions that involve any massless particle requires
the existence of the opposite helicity state antiparticle (Weinberg 5.9).$^4$ The gauge symmetries of
the theories of photons and gravitons are also the consequence of the same requirement. A Lorentz
vector $A_\mu$ has too many components to describe two helicities of the photon. Gauge symmetry
is the result of the redundancy of this description. This is the general strategy to construct a
relativistic quantum theory: to package the creation and annihilation operators for the irreducible
representations of Poincaré into fields and to write local Lorentz-invariant Lagrangians for these
fields.
4. ∗ Find the number of degrees of freedom (polarizations) of dilaton (massless scalar), photon
(massless spin 1), and graviton (massless spin 2) in d = 5 spacetime dimensions. (Are there
photons or gravitons in d = 2 or d = 3?)
3
Nevertheless, they have been revisited recently by Schuster and Toro (1404.0675).
4
The argument goes as follows. To construct a causal interacting theory, we need to combine creation and annihilation
operators in local fields that transform under (non-unitary) representations of the Lorentz group. To construct
such fields, the annihilation operator for any massless state of helicity $\sigma \neq 0$ has to be combined with the creation
operator of a massless field with helicity $-\sigma$.
6 A Relativistic Theory of Gravity
Newtonian gravity is very successful in explaining a wide range of phenomena. But it cannot fit
in a relativistic framework since it involves action at a distance. There cannot be any instantaneous
interaction in relativity because simultaneity is not Lorentz invariant, even though it is a good
approximate notion for slowly moving objects. Incidentally, Newtonian gravity looks extremely
similar to electrostatics with mass playing an analogous role as the electric charge. The Newtonian
potential ϕ is related to mass density ρ via Poisson equation
∇2 ϕ = 4πGρ, (6.1)
and massive objects moving in this field experience a force $-m\nabla\varphi$. There is a gravitational analog
of the Coulomb $1/r^2$ force between massive objects.
By the end of the 19th century it was known that electrostatics is just a special limit of the Maxwell
theory, a relativistic theory which involves magnetic forces and electromagnetic waves. And today
we understand it as the unique theory of interacting massless spin-1 particles called photons. Today,
we also know that Newtonian gravity is a special limit of general relativity, the unique theory of
interacting massless spin-2 particles, or gravitons. In the next few lectures, I will take a non-
historical approach to reach this conclusion. Our guiding principles will be special relativity and
quantum mechanics. In this framework long-range interactions can only result from the exchange
of local degrees of freedom, i.e. particles.$^5$ However, unlike electromagnetism, gravity cannot be
described by the exchange of spin-1 photons. Like electric charges repel each other while all
(positive) masses attract gravitationally.
It is perhaps more natural to try a massless spin-0 field with a relativistic equation of motion
where the source T has a non-relativistic limit −T → ρ. At small velocities, we can neglect ∂t
compared to ∇ and recover (6.1).
So far we specified how matter sources ϕ. How does ϕ affect the motion of matter? What kind
5
To be more precise I should say small nonuniform motions of a particle can affect and be affected by another
particle via the exchange of local degrees of freedom.
of force does the exchange of ϕ induce among massive point sources? In field theory the answer is
unambiguous. One can use the perturbative QFT machinery to calculate the scattering amplitude
due to the exchange of ϕ. I leave this to the exercises, and take an alternative approach. Let’s try
to couple ϕ to the worldline of a point particle. To lowest order in ϕ and its derivatives (and of
course respecting worldline reparametrization invariance), this is
$$ S_{pp} = -m \int d\sigma \sqrt{-\eta_{\mu\nu} \frac{dX^\mu}{d\sigma} \frac{dX^\nu}{d\sigma}}\; (1 + \lambda\varphi(X)). \tag{6.4} $$
In the rest frame and to lowest order in ϕ, we obtain the following equation of motion
The worldline coupling (6.4) also leads to a source for ϕ. Introducing a kinetic term for ϕ and
extending the worldline coupling to a spacetime action gives
$$ \int d^4x \left[ -\frac{1}{2} \kappa (\partial\varphi)^2 - m\varphi \int d\tau\, \delta^4(x - X(\tau)) \right]. \tag{6.7} $$
2. The sign before the worldline action (6.4) is important for our result. Show that it must be
negative.
Solution: The easiest way to fix the sign is by comparison to the non-relativistic particle action.
Fix the worldline reparametrization by setting $\sigma = X^0 \equiv t$. This is called the static gauge.
$$ S_p = -m \int dt \sqrt{1 - \left( \frac{dX^i}{dt} \right)^2}\; (1 + \varphi). \tag{6.8} $$
Expanding for small velocities and small $\varphi$ and neglecting an $X^i$ independent term, we get
$$ S_p = \int dt \left[ \frac{1}{2} m \left( \frac{dX^i}{dt} \right)^2 - m\varphi \right] + O(v^4, v^2\varphi). \tag{6.9} $$
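The expansion from (6.8) to (6.9) is easy to check numerically. This is a minimal sketch with hypothetical function names, comparing the exact static-gauge Lagrangian with its non-relativistic approximation, where the constant $-m$ is the $X^i$ independent term dropped in (6.9).

```python
import math

def exact_lagrangian(m, v, phi):
    """Integrand of (6.8): -m sqrt(1 - v^2) (1 + phi)."""
    return -m * math.sqrt(1.0 - v * v) * (1.0 + phi)

def nonrelativistic_lagrangian(m, v, phi):
    """Integrand of (6.9) plus the dropped constant: -m + m v^2 / 2 - m phi."""
    return -m + 0.5 * m * v * v - m * phi

def expansion_error(m, v, phi):
    return abs(exact_lagrangian(m, v, phi) - nonrelativistic_lagrangian(m, v, phi))

if __name__ == "__main__":
    for v, phi in [(1e-1, 1e-2), (1e-2, 1e-4), (1e-3, 1e-6)]:
        print(v, phi, expansion_error(1.0, v, phi))
```

The error scales as $m(v^4/8 + v^2\varphi/2)$, consistent with the $O(v^4, v^2\varphi)$ remainder. Note that the kinetic term comes out with the usual positive sign only because of the overall minus sign in (6.4).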
$$ \kappa \Box \varphi = \frac{m}{\dot{X}^0}\, \delta^3(x^i - X^i(t)). \tag{6.10} $$
For a non-relativistic particle $\dot{X}^0 \simeq 1$ and we recover (6.1) by setting $\kappa = 1/4\pi G$. Note that the
r.h.s. is indeed minus the trace of the stress-energy tensor for a point-particle,
$$ T^{\mu\nu} = \frac{p^\mu p^\nu}{p^0}\, \delta^3(x^i - X^i(t)). \tag{6.11} $$
So we take this as the definition of relativistic scalar gravity at linear level. Namely, given a matter
action $S_m[\psi]$, with $\psi$ representing all matter fields, we introduce the coupling $S = \int \varphi T^\mu{}_\mu$, where
$T_{\mu\nu}$ is the total stress-energy tensor of $S_m[\psi]$.$^6$ Such a universally coupled scalar field is often
called a dilaton. And the theory we just developed is the linear version of the theory of gravity
that Nordström proposed in 1913 (https://en.wikipedia.org/wiki/Nordstrom’s_theory_of_gravitation).
Our scalar gravity model by construction passes all non-relativistic tests. Hence, to confirm or to
reject it we need to examine its intrinsically relativistic predictions. The most obvious examples are
gravitational red-shift, bending of light, precession of the perihelion of Mercury, and gravitational
wave emission from close binaries.
Gravitational Redshift. Good clocks are small and have rapid internal dynamics. What
happens if they are immersed in an external field that is almost constant compared to their
size and period? There is no universal answer to this question. The external field might change
some fundamental feature based on which the clock operates. (For instance, an external magnetic
field shifts the energy levels of an atomic clock.) However, we expect some universality to arise in
a gravitational field. If the dilaton ϕ is coupled, as above, by replacing $X^\mu \to X^\mu (1 + \phi)$ everywhere
in the matter action, then all that happens to good clocks is that their natural time-scales get
stretched, or their frequency redshifted:
$$ \omega = (1 + \phi)\, \omega_0, \tag{6.13} $$
where $\omega_0$ is the frequency away from gravitational sources, i.e. when ϕ = 0. This phenomenon
has been measured on Earth; precise atomic clocks at different altitudes run at different rates
(https://en.wikipedia.org/wiki/Pound-Rebka_experiment).
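To get a feeling for the size of the effect in (6.13), here is a small Python sketch that evaluates the difference in ϕ between two clocks separated by a height h near the Earth's surface, Δϕ = gh/c² (restoring the factors of c). The numbers g = 9.81 m/s² and h = 22.5 m are illustrative values of the kind relevant for a tower experiment such as Pound-Rebka; they are assumptions of this example rather than data quoted in these notes.

```python
def fractional_frequency_shift(g, height, c=299_792_458.0):
    """Return delta_phi = g * height / c^2, the fractional frequency difference
    between two clocks separated by `height` in a uniform field g (SI units)."""
    return g * height / c**2

# Illustrative (assumed) values: surface gravity and a 22.5 m tower.
shift = fractional_frequency_shift(g=9.81, height=22.5)
print(f"{shift:.2e}")
```

The result is a few parts in 10¹⁵, which is why such measurements need extremely narrow spectral lines or very stable atomic clocks.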
Light Unbent. Let me next show that ultra-relativistic particles decouple from scalar gravity.
Consider the motion of a particle in a static gravitational field (say of the Sun)
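$$ \frac{d}{d\tau} \left[ m (1 + \phi) \frac{dX_\mu}{d\tau} \right] = -m\, \partial_\mu \phi. \tag{6.14} $$
Footnote 6: Note that at lowest order in perturbation theory this is just a cubic coupling between matter fields and ϕ. For
instance, a massive scalar field ψ in d dimensions is coupled via the vertex
$$ S = \int d^dx\, \phi \left[ \frac{2 - d}{2} (\partial\psi)^2 - \frac{d}{2} m^2 \psi^2 \right]. \tag{6.12} $$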
The µ = 0 component of this equation gives us a constant of motion
$$ E = m (1 + \phi) \frac{dX^0}{d\tau}. \tag{6.15} $$
We can use this equation to express derivatives with respect to τ in terms of those with respect to
time t = X 0 :
$$ \frac{d^2 X^i}{dt^2} = -\frac{m^2}{E^2}\, \partial_i \phi + O(\phi^2). \tag{6.16} $$
Now taking the limit m → 0 while keeping the energy of the particle E fixed makes the r.h.s. negligible.
Therefore, light rays move on straight lines. Nowadays we have very accurate measurements
that light rays are indeed bent by about 1.7 arcseconds by the sun. So the simplest relativistic
model of gravity is experimentally ruled out. Nevertheless, it is very instructive to pursue other
predictions of scalar gravity, because it is a simple setup which shares lots of conceptual similarities
with spin-2 gravity. (Or imagine we couldn’t measure light.)
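The decoupling of fast particles can be made quantitative with a short numerical sketch. Assuming a point source with ϕ = −GM/r, a small deflection angle, and a particle that is free far from the source so that E = m/√(1 − v²) (these are assumptions of the example, as are all function names), integrating the transverse component of (6.16) along the undeflected path gives a deflection 2GM(1 − v²)/(b v²) for impact parameter b. It goes to zero as v → 1.

```python
import math

def scalar_deflection_angle(gm, b, v, steps=2000):
    """Small-angle deflection (radians) of a particle with speed v (c = 1).

    Integrates the transverse acceleration -(m/E)^2 * d(phi)/dy from (6.16)
    along the undeflected path x = v t, y = b, with phi = -gm / r.
    """
    suppression = 1.0 - v * v  # (m/E)^2, assuming E = m / sqrt(1 - v^2) far from the source
    # Substituting v t = b tan(u) turns the time integral of
    # gm * b / (b^2 + v^2 t^2)^(3/2) into the integral of gm * cos(u) / (b v).
    du = math.pi / steps
    kick = 0.0
    for n in range(steps):
        u = -0.5 * math.pi + (n + 0.5) * du  # midpoint rule on (-pi/2, pi/2)
        kick += gm * math.cos(u) / (b * v) * du
    return suppression * kick / v  # transverse velocity kick divided by v

def closed_form_angle(gm, b, v):
    """Analytic value of the same integral: 2 gm (1 - v^2) / (b v^2)."""
    return 2.0 * gm * (1.0 - v * v) / (b * v * v)

for v in (0.1, 0.9, 0.999, 1.0):
    print(v, scalar_deflection_angle(1.0, 1.0e5, v))
```

For v = 1 the suppression factor vanishes identically, in line with the statement that light rays are not bent in this model.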
3. ∗ Couple the Polyakov action to ϕ by changing ηµν → (1 + 2ϕ)ηµν . Show that the same
conclusion about massless particles can be directly derived from the Polyakov action with m
set to zero.
4. ∗ Calculate Tµµ for the Maxwell theory. Does it agree with your expectation? Repeat the
same exercise for a massless scalar field.
5. ∗ Calculate the differential cross-section for scattering in an attractive 1/r potential with strength
C. Use the Born approximation.
6. ∗ Calculate the differential cross-section for scattering of two non-relativistic particles of mass
m, M (m ≪ M), due to the tree-level exchange of a dilaton coupled to $T^\mu_\mu$. Use the rest frame
of M. Is there a choice of C in the previous problem that gives the same result?
7 Dilaton Waves
One of the most exciting features of the relativistic theory of gravity is the emergence of gravi-
tational waves, very much analogous to the emergence of electromagnetic waves once electrostatics
is completed into the relativistic Maxwell theory. The motions of matter sources cause the dilaton
field ϕ to locally fluctuate. These fluctuations are governed by a wave equation $(\partial_t^2 - \nabla^2)\phi = 0$,
so they propagate at the speed of light, carrying away energy and information. The goal of this
lecture is to learn how to calculate the dilaton field of moving masses, and the radiation of dilaton
waves by compact systems.
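Retarded Potential. The standard way to deal with the problem is to find the retarded
Green’s function $G_R(x, x')$, which is the solution to
$$ \Box_x G_R(x, x') = \delta^4(x - x') \tag{7.1} $$
that vanishes for $t < t'$. It is given by
$$ G_R(x, x') = -\frac{\delta(t - t' - |\mathbf{r} - \mathbf{r}'|)}{4\pi |\mathbf{r} - \mathbf{r}'|}\, \theta(t - t'). \tag{7.2} $$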
For aesthetic reasons I use $\mathbf{r}$ to denote spatial coordinates: $x^\mu = (t, \mathbf{r})$. The step function θ
[$\theta(t - t' \geq 0) = 1$ and $\theta(t - t' < 0) = 0$] ensures that $G_R$ vanishes for $t < t'$. Note, however, that
$G_R$ vanishes everywhere outside of the future light-cone of $x'$, as expected from Lorentz invariance
of the system. (In this particular case $G_R$ is non-vanishing just on the future light-cone.) Had it
been nonzero anywhere outside of the future light-cone, at that point $x$ and $x'$ would have been
spacelike separated and by a Lorentz transformation one could have made $G_R$ nonzero at some
$t < t'$.
1. Derive (7.2).
Solution: Since there is no explicit x dependence, $G_R$ is a function only of $x - x'$. It is
convenient to use the Fourier basis:
$$ G_R(x - x') = \int \frac{d^3k}{(2\pi)^3}\, G_k(t - t')\, e^{i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}')}. \tag{7.3} $$
We get
$$ (-\partial_t^2 - k^2)\, G_k(t - t') = \delta(t - t'), \qquad G_k(t - t' < 0) = 0, \tag{7.4} $$
where $k = \sqrt{\mathbf{k}^2}$. There is a unique solution:
$$ G_k(t - t') = -\frac{1}{k}\, \theta(t - t') \sin k(t - t'). \tag{7.5} $$
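Next, Fourier transform this expression back to real space (at $t' = 0$, $\mathbf{r}' = 0$, $t > 0$; $t'$, $\mathbf{r}'$ can
be easily restored at the end by using translational invariance):
$$
\begin{aligned}
G_R(t, \mathbf{r}) &= \int \frac{d^3k}{(2\pi)^3}\, G_k(t)\, e^{i\mathbf{k}\cdot\mathbf{r}} \\
&= -\frac{1}{2\pi^2 r} \int_0^\infty dk\, \sin kt\, \sin kr \\
&= -\frac{1}{8\pi^2 r} \int_{-\infty}^{\infty} dk\, [\cos k(t - r) - \cos k(t + r)] \\
&= -\frac{1}{8\pi^2 r}\, \mathrm{Re} \int_{-\infty}^{\infty} dk\, [e^{ik(t - r)} - e^{ik(t + r)}] \\
&= -\frac{1}{4\pi r}\, [\delta(t - r) - \delta(t + r)].
\end{aligned} \tag{7.6}
$$
The argument of the second delta function is positive definite, so it can be ignored. Restoring
$t'$ and $\mathbf{r}'$ gives (7.2).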
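The mode function (7.5) can be checked numerically. The sketch below (function names and step sizes are my own illustrative choices) verifies by finite differences that $G_k$ solves the homogeneous version of (7.4) for t > 0, and that its slope jumps by −1 across t = 0, which is what integrating (7.4) over an infinitesimal interval around t = 0 requires.

```python
import math

def g_k(t, k):
    """Retarded mode function (7.5) evaluated at t' = 0."""
    return -math.sin(k * t) / k if t >= 0.0 else 0.0

def residual_homogeneous(t, k, h=1e-4):
    """Finite-difference value of (-d^2/dt^2 - k^2) G_k at a time t > h."""
    second = (g_k(t + h, k) - 2.0 * g_k(t, k) + g_k(t - h, k)) / h**2
    return -second - k * k * g_k(t, k)

def slope_jump(k, h=1e-6):
    """Jump of dG_k/dt across t = 0; the delta function in (7.4) requires -1."""
    after = (g_k(h, k) - g_k(0.0, k)) / h
    before = (g_k(0.0, k) - g_k(-h, k)) / h
    return after - before

print(residual_homogeneous(0.7, 3.0))  # close to zero, up to discretization error
print(slope_jump(3.0))                 # close to -1
```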
For a generic source $T(x) \equiv T^\mu_\mu(x)$, the solution to the equation
$$ \Box\phi = -4\pi G\, T \tag{7.7} $$
with outgoing boundary condition (i.e. no incoming ϕ wave as t → −∞) can be easily checked to
be $-4\pi G \int d^4x'\, G_R(x, x')\, T(x')$. This is the retarded potential
$$ \phi(t, \mathbf{r}) = G \int d^3r'\, \frac{T(t - |\mathbf{r} - \mathbf{r}'|, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}. \tag{7.8} $$
The retardation $t - |\mathbf{r} - \mathbf{r}'|$ makes it manifest that information propagates at the speed of light.
However, in spite of being conceptually illuminating, this exact solution is too complicated to be
useful in practice. To proceed, imagine we have a compact system of masses with typical size R (by
which I mean either a bound system of size R or a system made of non-relativistic masses which do
not move significantly while being observed), and study the large $r \gg R$ behavior of the solution. I
don’t make any assumption about the time-dependence of the source. In general it can be Fourier
transformed in time
$$ T(t, \mathbf{r}') = \int \frac{d\omega}{2\pi}\, e^{-i\omega t}\, T(\omega, \mathbf{r}'), \tag{7.9} $$
where reality of $T(t, \mathbf{r}')$ implies $T(\omega, \mathbf{r}') = T^*(-\omega, \mathbf{r}')$. Since the equation is linear and there is
no explicit time-dependence, different frequencies are decoupled. Thus, let’s focus on one of the
frequencies
$$ T_\omega = T(\omega, \mathbf{r}')\, e^{-i\omega t} \tag{7.10} $$
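and find the corresponding $\phi_\omega$ solution (note that $T_\omega$ and $\phi_\omega$ have dimensions of $T/\omega$ and $\phi/\omega$,
respectively)
$$ \phi_\omega(t, \mathbf{r}) = G \int d^3r'\, \frac{T(\omega, \mathbf{r}')\, e^{-i\omega(t - |\mathbf{r} - \mathbf{r}'|)}}{|\mathbf{r} - \mathbf{r}'|}. \tag{7.11} $$
Because of the linearity of (7.8), the full solution will be the superposition
$$ \phi(t, \mathbf{r}) = \int \frac{d\omega}{2\pi}\, \phi_\omega(t, \mathbf{r}). \tag{7.12} $$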
can be truncated. In two extreme regimes, when $\omega r \ll 1$ and when $\omega r \gg 1$, further approximations
can be made. In the Near Zone $r \ll 1/\omega$, which also implies $\omega r' \ll r'/r$, the corrections to the
exponent in (7.11) can be neglected compared to the $r'/r$ corrections coming from the denominator.
The solution is then approximately
$$ \phi_\omega(t, \mathbf{r}) = G e^{-i\omega(t - r)} \int d^3r'\, \frac{T(\omega, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}. \tag{7.14} $$
Expanding the denominator in powers of $r'/r$ gives a multipole expansion very much analogous to
what is encountered in electrostatics. The lth multipole has the $\hat r$ dependence of the lth spherical
harmonic $Y_{l0}$ and is suppressed by $(r'/r)^l$ with respect to the monopole solution, which goes as
1/r. We recover the full non-relativistic answer if we can further neglect powers of ωr compared to
R/r. We then have an approximately instantaneous reaction of ϕ to the variations of the source.
The opposite regime $r \gg 1/\omega$ is called the Wave Zone. The solution is now approximately of
the form
$$ \phi_\omega(t, \mathbf{r}) \approx G\, \frac{A(\omega, \hat r)\, e^{-i\omega(t - r)}}{r}, \tag{7.15} $$
where
$$ A(\omega, \hat r) = \int d^3r'\, T(\omega, \mathbf{r}')\, e^{-i\omega \hat r \cdot \mathbf{r}'}. \tag{7.16} $$
Thus $\phi_\omega$ is an outgoing wave. We are not interested in $r'/r$ corrections in this regime because,
as we will see, the energy flux of the free waves is $O(\phi^2)$. Therefore, it is only the leading 1/r
term that is responsible for taking away energy to infinity. Imagine we are detecting the waves
coming from a faraway source, with characteristic frequency ω, and say at distance $r_0$ from the
Earth center so that $\omega r_0 \gg 1$. Near the Earth we can write $\mathbf{r} = \mathbf{r}_0 + \mathbf{x}$, and use $|\mathbf{x}| \sim R_E \ll r_0$ to
write the solution as
$$ \phi_\omega(t, \mathbf{x}) \approx N e^{-i\omega(t - \hat r_0 \cdot \mathbf{x})}, \tag{7.17} $$
with N a normalization constant (including the phase $e^{i\omega r_0}$). That is, the derivatives are dominated
by derivatives of the phase. This is a Plane Wave with wave-vector $\mathbf{k} = \omega \hat r_0$. Plane waves are
the simplest solutions of the free field equations, and are usually used as a basis for the asymptotic
states and to quantize the theory. For instance, deriving a formula for the energy flux of the
radiation becomes extremely easy since the problem has essentially become 1 + 1 dimensional.
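The wave-zone formulas rest on replacing the retarded distance by its far-field form, which I take to be $|\mathbf{r} - \mathbf{r}'| \approx r - \hat r \cdot \mathbf{r}'$ (the form implied by the phase in (7.16)). The following sketch, with illustrative function names and sample points of my own choosing, compares the two and shows that the neglected piece is of order $r'^2/r$.

```python
import math

def exact_distance(r_vec, rp_vec):
    """|r - r'| for two 3-vectors given as tuples."""
    return math.dist(r_vec, rp_vec)

def far_field_distance(r_vec, rp_vec):
    """Leading far-field approximation r - r_hat . r'."""
    r = math.hypot(*r_vec)
    r_hat_dot_rp = sum(a * b for a, b in zip(r_vec, rp_vec)) / r
    return r - r_hat_dot_rp

def far_field_error(r_vec, rp_vec):
    """Absolute difference between the exact and the approximate distance."""
    return abs(exact_distance(r_vec, rp_vec) - far_field_distance(r_vec, rp_vec))

# Observer at r = 1000, source point with |r'| = 0.5 (arbitrary units)
print(far_field_error((1000.0, 0.0, 0.0), (0.3, 0.4, 0.0)))
```

Multiplying this error by ω gives the phase error, so the approximation in the exponent is reliable when $\omega r'^2 / r \ll 1$.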
2. Derive a formula for the Energy Flux (erg/(sec·cm²)) in terms of the stress-energy tensor. What
does it give for a plane dilaton wave?
Solution: Recall that the zeroth component of the stress-energy tensor, $T_0{}^\mu$, is the Noether
current associated with time-translations. Therefore, the conserved charge $-\int d^3x\, T_0{}^0$ has to be
identified as the total energy and $T_0{}^0$ as minus the energy density. The conservation equation
then implies that $-T_0{}^i$ is the flux of energy (erg/(sec·cm²)). For a plane wave propagating
along the z-direction, $T_0{}^\mu$ depends only on t and z and the flux is $F = -T_t{}^z$. For a free massless
scalar field with action $S = -\frac{\kappa}{2} \int (\partial\phi)^2$, one finds
$$ -T_t{}^z = -\kappa\, \partial^z\phi\, \partial_t\phi = \frac{1}{4\pi G} (\partial_t \phi)^2, \tag{7.19} $$
where I used the fact that in our signature $\partial^z = \partial_z$, that for a right-moving wave $\phi(t, z) = \phi(t - z)$,
and plugged in κ = 1/4πG. Note that for an observer at $\mathbf{r}_0$ (measured from the source) the
wave-vector is $\omega \hat r_0$ and $T_t{}^z$ must be replaced by the component of $T_0{}^i$ along $\hat r_0$, which is $T_0{}^r$.
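As an illustration of (7.19), the sketch below evaluates the flux for a monochromatic plane wave $\phi = \phi_0 \cos(\omega(t - z))$ and averages it over one period. The expected average is $\phi_0^2 \omega^2 / (8\pi G)$. The amplitude, frequency and function names are illustrative assumptions, and G is set to 1 by default.

```python
import math

def instantaneous_flux(phi0, omega, t, z, G=1.0):
    """Flux (7.19) for phi = phi0 cos(omega (t - z)): (d phi/dt)^2 / (4 pi G)."""
    dphi_dt = -phi0 * omega * math.sin(omega * (t - z))
    return dphi_dt**2 / (4.0 * math.pi * G)

def mean_flux(phi0, omega, G=1.0, samples=1000):
    """Average of the flux over one period at z = 0, using uniform sampling."""
    period = 2.0 * math.pi / omega
    dt = period / samples
    total = sum(instantaneous_flux(phi0, omega, n * dt, 0.0, G) for n in range(samples))
    return total / samples

print(mean_flux(0.1, 2.0))                # numerical average
print(0.1**2 * 2.0**2 / (8.0 * math.pi))  # phi0^2 omega^2 / (8 pi G)
```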
3. ∗ Derive the Poynting flux for the Maxwell field, using $E_i = F_{0i}$ and $B_i = \epsilon_{ijk} F_{jk}$.
4. ∗ Transverse vibrations of a non-relativistic string with mass density ρ and tension T are
described by the action
$$ S = \int dt\, dx \left[ \frac{1}{2} \rho (\partial_t u)^2 - \frac{1}{2} T (\partial_x u)^2 \right]. \tag{7.20} $$
Derive the stress-energy tensor. Find the sound speed $c_s$. Find the energy flux for a right-moving
wave $u(t, x) = u_0 \cos(\omega(t - x/c_s))$.
Let’s now return to the original problem with a general time-dependent source (7.9). We are often
interested in one of the following cases.
• Either the system is (approximately) periodic with period ∆t, in which case we would like
to know the average Radiated Power (Luminosity) (erg/sec). In this case, the Fourier
integral in (7.9) becomes a discrete sum
$$ T(t, \mathbf{r}') = \sum_{n \in \mathbb{Z}} T(\omega_n, \mathbf{r}')\, e^{-i\omega_n t}, \qquad \omega_n \equiv 2\pi n/\Delta t, \tag{7.21} $$
with a similar discrete sum replacing (7.12):
$$ \phi(t, \mathbf{r}) = \sum_{n \in \mathbb{Z}} \phi_{\omega_n}(t, \mathbf{r}). \tag{7.22} $$
Given an expression for the energy flux, the average luminosity (erg/sec) is obtained by
integrating the flux (erg/(sec·cm²)) over a sphere of constant radius r, and averaging it over
one period
$$ \langle L \rangle = -\frac{1}{\Delta t} \int_{t_0}^{t_0 + \Delta t} dt \int d\hat r\, r^2\, T_0{}^r. \tag{7.23} $$
The integral over one period collapses the double sum over n into the diagonal because
$$ \int_{t_0}^{t_0 + \Delta t} dt\, e^{-i(\omega_n + \omega_{n'}) t} = \Delta t\, \delta_{n, -n'}. \tag{7.25} $$
So we finally get
$$ \langle L \rangle = \frac{G}{4\pi} \int d\hat r \sum_{n \in \mathbb{Z}} \omega_n^2\, |A(\omega_n, \hat r)|^2, \tag{7.26} $$
• Or else, there is a process like a scattering event or a merger with a finite temporal extent,
during which dilaton waves are emitted. Here one often asks what is the total amount of
energy released in gravitational radiation. In this case the Fourier transform is continuous
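and we have
$$ E_{tot} = -\int_{-\infty}^{\infty} dt \int d\hat r\, r^2\, T_0{}^r. \tag{7.27} $$
we obtain
$$ E_{tot} = \frac{G}{4\pi} \int d\hat r \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \omega^2\, |A(\omega, \hat r)|^2. \tag{7.30} $$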
source of characteristic frequency ω and characteristic size R, the solution looks like an outgoing
wave for $r \gg 1/\omega$. To study radiation (for instance the amount of energy that escapes from the
system to infinity) we only care about the leading 1/r behavior of the solution. Higher order corrections
in R/r can be ignored. So the solution is of the form
$$ \phi(t, \mathbf{r}) = \frac{\hat\phi(t - r, \hat r)}{r} + O(1/r^2), \tag{7.31} $$
So depending on the structure of the source, the radiation will be anisotropic (dependent on $\hat r$) because
of the $\hat r \cdot \mathbf{r}'$ term. It is convenient to expand the $\hat r$ dependence of $\hat\phi$ in terms of an orthogonal basis on
the sphere, i.e. spherical harmonics:
$$ \phi(t, \mathbf{r}) = \frac{G}{r} \sum_{l,m} a_{lm}(t - r)\, Y_{lm}(\hat r). \tag{7.33} $$
The terms with l = 0, 1, 2, . . . are called monopole, dipole, quadrupole and so on. Due to the orthonormality of
the $Y_{lm}$ basis, the total emitted power, which is obtained by integrating the flux over the sphere, is
$$ L = G \sum_{l,m} \dot a_{lm}^2(t). \tag{7.34} $$
The nth order term in the sum is suppressed by $v^n$, so for a given precision the sum can be truncated.
On the other hand the nth order term can only contribute to l ≤ n multipoles. Therefore, the
multipole expansion is an expansion in powers of velocity. To second order, we get (check)
$$ \phi(t, \mathbf{r}) = \frac{G}{r} \left[ m(t - r) + \hat r_i\, \dot d_i(t - r) + \frac{1}{18} (3 \hat r_i \hat r_j - \delta_{ij})\, \ddot q_{ij}(t - r) + \cdots \right] \tag{7.36} $$
where over-dot denotes d/dt and the monopole, dipole, and quadrupole are respectively defined as
$$
\begin{aligned}
m(t) &= \int d^3r'\, T(t, \mathbf{r}') + \cdots, \\
d_i(t) &= \int d^3r'\, r'_i\, T(t, \mathbf{r}') + \cdots, \\
q_{ij}(t) &= \int d^3r'\, (3 r'_i r'_j - r'^2 \delta_{ij})\, T(t, \mathbf{r}') + \cdots.
\end{aligned} \tag{7.37}
$$
Note that we have neglected relative corrections of order v 2 to each moment (this is the meaning
of ellipses). Given that multipole expansion is itself an expansion in powers of v, one can use, say,
the quadrupole term without worrying about the corrections to the monopole only if monopole
emission vanishes because of a symmetry reason.
Let us now consider a concrete example by calculating the power-loss of a binary system composed
of two neutron stars in a circular orbit with frequency ω. It is indeed the case that both
monopole and dipole radiation are zero by symmetry. The monopole term is spherically symmetric,
and the spherical average of the source term in the center of mass frame is time-independent. In
this frame no vector can be associated to the system either, hence di = 0.
6. ∗ Show that in the center of mass frame the monopole and dipole moment of any solid object
is time-independent. As a result the monopole of a binary system must be proportional to
the eccentricity of the orbit. Is the dipole moment always zero?
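Thus the leading emission is quadrupole emission. Integrating the expression for the flux over
the sphere and using
$$ \int d\hat r\, (3 \hat r_i \hat r_j - \delta_{ij}) (3 \hat r_k \hat r_l - \delta_{kl}) = \frac{4\pi}{5} (3 \delta_{ik} \delta_{jl} + 3 \delta_{il} \delta_{jk} - 2 \delta_{ij} \delta_{kl}) \tag{7.38} $$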
Assuming equal masses M at separation R in the x−y plane, the time-dependent part of $q_{ij}$ is (exercise)
$$ q_{xy} = \frac{3}{4} M R^2 \sin(2\omega t), \quad q_{xx} = \frac{3}{4} M R^2 (1 + \cos(2\omega t)), \quad q_{yy} = \frac{3}{4} M R^2 (1 - \cos(2\omega t)). \tag{7.40} $$
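The components in (7.40) can be checked directly by summing over the two bodies. The sketch below is a minimal illustration under my own assumptions: the two masses sit at opposite points of a circle of diameter R, the mass density is used as the weight in (7.37) (so any overall sign convention of T is ignored), and only the $3 r'_i r'_j$ part is kept, since the $r'^2 \delta_{ij}$ part is time-independent for a circular orbit.

```python
import math

def second_moment(M, R, omega, t):
    """3 * sum over bodies of M x_i x_j in the orbital plane (2x2 nested list)."""
    x, y = 0.5 * R * math.cos(omega * t), 0.5 * R * math.sin(omega * t)
    positions = [(x, y), (-x, -y)]  # equal masses at opposite points of the orbit
    q = [[0.0, 0.0], [0.0, 0.0]]
    for pos in positions:
        for i in range(2):
            for j in range(2):
                q[i][j] += 3.0 * M * pos[i] * pos[j]
    return q

def q_from_740(M, R, omega, t):
    """The components quoted in (7.40), arranged as [[q_xx, q_xy], [q_xy, q_yy]]."""
    a = 0.75 * M * R * R
    c, s = math.cos(2.0 * omega * t), math.sin(2.0 * omega * t)
    return [[a * (1.0 + c), a * s], [a * s, a * (1.0 - c)]]

print(second_moment(1.4, 2.0, 0.3, 1.7))
print(q_from_740(1.4, 2.0, 0.3, 1.7))
```

Both print the same matrix, and the time dependence is at twice the orbital frequency, as expected for two identical bodies.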
$$ P_{loss} = -\frac{r_g^5}{15 G R^5}. \tag{7.41} $$
This loss of power causes the orbit to shrink with a rate (check)
$$ \dot R = -\frac{8 r_g^3}{15 R^3}. \tag{7.42} $$
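The step from (7.41) to (7.42) is an energy-balance argument, which can be sketched numerically. The example below assumes that $r_g = 2GM$ and that the orbital energy of two equal masses M on a circular orbit of separation R is the Newtonian value $E = -GM^2/(2R)$. Both are assumptions of this sketch (the definition of $r_g$ is not repeated here), and the function names are hypothetical.

```python
import math

def orbit_decay_rate(G, M, R):
    """R_dot from dE/dt = P_loss, with E = -G M^2 / (2 R) and P_loss from (7.41)."""
    r_g = 2.0 * G * M                     # assumed definition of r_g
    p_loss = -r_g**5 / (15.0 * G * R**5)  # (7.41)
    dE_dR = G * M * M / (2.0 * R * R)     # derivative of the assumed orbital energy
    return p_loss / dE_dR

def decay_rate_742(G, M, R):
    """R_dot as quoted in (7.42), with the same assumed r_g."""
    r_g = 2.0 * G * M
    return -8.0 * r_g**3 / (15.0 * R**3)

print(orbit_decay_rate(1.0, 1.0, 10.0), decay_rate_742(1.0, 1.0, 10.0))
```

Under these assumptions the two expressions agree, which is a useful consistency check when doing the "(check)" above by hand.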
This concrete prediction of scalar gravity has actually been tested observationally, and has failed
miserably. Astronomers have measured the orbit evolution of a few such binaries in the Universe
extremely precisely and over many years (https://en.wikipedia.org/wiki/Hulse-Taylor_binary).
The result is in perfect agreement with general relativity, which predicts a power loss
that is larger than (7.39) by a factor of 6.
8 Spin-2 Gravity
Reading: https://video.ias.edu/pitp-2011-arkani-hamed1
We have seen that the scalar theory of gravity, though theoretically sound, is not phenomeno-
logically adequate. We argued before that gravitational attraction cannot be described by a spin-1
particle. So the next candidate for graviton is a massless spin-2 particle, which has two degrees of
freedom with helicity ±2. In a relativistic theory, such a particle is described by a symmetric tensor
hµν . Compare this with photons which are described by a vector field Aµ . In Maxwell theory, the
description in terms of Aµ is redundant since in d = 4 there are 4 components in Aµ while there
are only two photon polarizations. However, the theory has a gauge symmetry Aµ → Aµ + ∂µ α
and this eliminates the extra degrees of freedom. To preserve this symmetry the electric current
J µ which sources Aµ has to be conserved. The gauge symmetry and the need for such a conserved
source are both manifest in the Aµ equation of motion
$$ \partial_\nu F^{\nu\mu} = -J^\mu. \tag{8.1} $$
There are 10 independent components in hµν , exceeding the number of graviton polarizations
by 8. And there is a bigger gauge symmetry called Linearized Diffeomorphisms (or “linear
diffs” for short)
$$ h_{\mu\nu} \to \tilde h_{\mu\nu} = h_{\mu\nu} + \partial_\mu \xi_\nu + \partial_\nu \xi_\mu, \tag{8.2} $$
with the gauge parameter now being an arbitrary x-dependent 4-vector $\xi^\mu$. The equation of motion
for hµν has to be a tensor equation, and to respect the gauge symmetry, the source term must be a
conserved tensor. The natural candidate is the stress-energy tensor T µν . With these requirements
the hµν equation has to be
The proportionality coefficient c depends on the normalization of $h_{\mu\nu}$. We will fix it by demanding
that, first, the gravitational coupling is
$$ S_{coupl.} = \frac{1}{2} \int d^4x\, h^{\mu\nu}\, T^{(0)}_{\mu\nu}, \tag{8.4} $$
and second, Newtonian gravity is recovered in the non-relativistic limit (as we shall see). The
superscript (0) on $T_{\mu\nu}$ indicates that it is calculated to zeroth order in $h_{\mu\nu}$, as appropriate for a
linearized theory of gravity. To simplify the notation I will drop (0), but it is implicitly assumed
until we get to the discussion of nonlinear gravity.
1. Derive the homogeneous part of (8.3).
Solution: First, use (8.2) to set ∂ µ h̃µν = 0:
with a, b, c to be determined. Next, demand that this equation is invariant under the residual
diffs (8.6). This forces a = 1. Then, take the divergence ∂ µ of both sides. The l.h.s. vanishes
identically if and only if b = −1. We will fix c later.
Finally, we need to write a diff invariant equation that in the gauge $\partial^\mu \tilde h_{\mu\nu} = 0$ reduces to the above
equation. This means that the full equation contains $\partial^\mu h_{\mu\nu}$. With a = −b = 1 it is easy to
see that the l.h.s. of (8.3) is the only diff invariant generalization of (8.7).
As in the case of scalar gravity once the coupling of graviton to matter fields is determined, we
can read off the gravitational force either by looking at scattering of particles due to the exchange
of gravitons or by coupling hµν to the worldline action. The second route is more straightforward,
so let us follow it here. Recall that for a point-particle
$$ T^{\mu\nu} = m \int d\tau\, \dot X^\mu \dot X^\nu\, \delta^4(x^\mu - X^\mu(\tau)). \tag{8.8} $$
(As before over-dot means d/dτ.) Using (8.4), we obtain the following worldline action
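$$
\begin{aligned}
S_{pp} &= -m \int d\sigma \sqrt{-\eta_{\mu\nu} \frac{dX^\mu}{d\sigma} \frac{dX^\nu}{d\sigma}} \left( 1 - \frac{h_{\mu\nu}(X) \frac{dX^\mu}{d\sigma} \frac{dX^\nu}{d\sigma}}{2 (d\tau/d\sigma)^2} \right) \\
&= -m \int d\sigma \sqrt{-(\eta_{\mu\nu} + h_{\mu\nu}(X)) \frac{dX^\mu}{d\sigma} \frac{dX^\nu}{d\sigma}} + O(h_{\mu\nu}^2).
\end{aligned} \tag{8.9}
$$
Varying with respect to $X^\mu$ gives the particle equation of motion in the gravitational field as
$$ \frac{d}{d\tau} \left[ m\, \frac{\eta_{\mu\nu} + h_{\mu\nu}}{\sqrt{1 - h_{\alpha\beta} \dot X^\alpha \dot X^\beta}}\, \frac{dX^\nu}{d\tau} \right] = \frac{1}{2} m\, \partial_\mu h_{\alpha\beta}\, \dot X^\alpha \dot X^\beta. \tag{8.10} $$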
To simplify this equation, perform a spatial diff $\xi^i$ to set $h_{0i} = 0$ [footnote 7]. Then, in the rest frame where
$\dot X^i = 0$, we find $\ddot X^i = O(\partial_i h_{\mu\nu})$. Therefore for µ = i, we can ignore $h_{\mu\nu}$ on the l.h.s., to get
$$ m \ddot X^i = \frac{1}{2} m\, \partial_i h_{00} + O(h_{\mu\nu}^2). \tag{8.11} $$
This implies that in the non-relativistic regime $h_{00}$ is related to the Newtonian potential by
$$ h_{00} = -2\phi. \tag{8.12} $$
Footnote 7: This is a short way of saying: perform a diff with $\partial_0 \xi_i = h_{0i}$ so that $\tilde h_{0i} = 0$, and then forget about the old
2. Given the above relation, derive the proportionality coefficient of $T_{\mu\nu}$ in (8.3).
Solution: In the Newtonian limit the dominant component of the stress-energy tensor is
$T_{00} = \rho$, and the time-derivatives are small. Using these facts together with the trace of (8.3)
one can obtain an equation for $h_{00}$. The trace of (8.3) implies
$$ \Box h - \partial_\sigma \partial_\rho h^{\sigma\rho} = \frac{c}{2 - d}\, T^\mu_\mu. \tag{8.13} $$
Combining this with the 00 component of (8.3), and using the approximations $\partial_0 \approx 0$ and $T^\mu_\mu \approx -T_{00} = -\rho$, gives
$$ \nabla^2 h_{00} = c\, \frac{3 - d}{2 - d}\, \rho. \tag{8.14} $$
Substituting (8.12), taking d = 4, and comparing with the Poisson equation, we obtain
$$ c = -16\pi G. \tag{8.15} $$
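The arithmetic behind (8.15) is short enough to spell out in code. This sketch assumes the relation $h_{00} = -2\phi$ (the one implied by (8.11) and Newton's law) and the four-dimensional Poisson equation $\nabla^2 \phi = 4\pi G \rho$, so that $\nabla^2 h_{00} = -8\pi G \rho$. The function name is hypothetical, and for d ≠ 4 the output only illustrates the algebraic factor in (8.14), since the Poisson equation is kept in its d = 4 form.

```python
from fractions import Fraction

def coupling_coefficient_over_pi_G(d=4):
    """Return c / (pi G) obtained by matching (8.14) to laplacian(h_00) = -8 pi G rho.

    (8.14) reads laplacian(h_00) = c (3 - d) / (2 - d) rho, hence
    c = -8 pi G (2 - d) / (3 - d).
    """
    return Fraction(-8) * Fraction(2 - d, 3 - d)

print(coupling_coefficient_over_pi_G(4))  # c / (pi G) for d = 4
```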
The last thing to do in this lecture is to find an action for $h_{\mu\nu}$. This allows us to quantize gravity
and to determine the energy-momentum content in gravitational waves. Writing a quadratic action
that gives (8.3) is extremely simple. Introduce the second order differential operator $E^{\alpha\beta}_{\mu\nu}$ such that
$E^{\alpha\beta}_{\mu\nu} h_{\alpha\beta}$ gives the l.h.s. of (8.3). Explicitly, this is of the form
$$ E^{\alpha\beta}_{\mu\nu} = \delta^{\alpha\beta}_{\mu\nu}\, \Box - \delta^\alpha_\nu\, \partial_\mu \partial^\beta + \cdots \tag{8.16} $$
Finding the rest of the terms is left to you as an exercise. Then the quadratic action for $h_{\mu\nu}$ is
$$ S_g = \frac{1}{64\pi G} \int d^4x\, h^{\mu\nu} E^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} \tag{8.17} $$
3. Derive the equation of motion from the action $S = \int d^4x \left[ \frac{1}{2} \varphi \Box \varphi + \varphi J \right]$. Write the action in
the first order form (where there is at most one derivative per field).
Solution. The equation of motion is
$$ \Box\varphi = -J. \tag{8.18} $$
(Footnote 7, continued) $h_{\mu\nu}$ and drop the tilde from the new one. In the future I may just say “choose the gauge $h_{0i} = 0$”.
Integration by parts in the first term gives $S = \int d^4x \left[ -\frac{1}{2} (\partial\varphi)^2 + \varphi J \right]$.
4. ∗ Derive $E^{\alpha\beta}_{\mu\nu}$. Show that
$$ \partial^\mu E^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} \equiv 0, \tag{8.19} $$
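The gravitational action $S_g$ can be simplified a bit, and be written in the following first order
form
$$ S_g = -\frac{1}{64\pi G} \int d^4x \left[ (\partial_\alpha h_{\mu\nu})^2 - 2 (\partial_\mu h^{\mu\nu})^2 + 2\, \partial_\mu h^{\mu\nu}\, \partial_\nu h^\alpha_\alpha - (\partial_\alpha h^\mu_\mu)^2 \right]. \tag{8.20} $$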
and all other components of hµν equal to zero. The energy-momentum tensor of a freely
propagating wave is in its plane of motion. Derive an effective 1 + 1 dimensional action. Find
the stress-energy tensor, and using that the energy flux carried by plane gravitational waves.
Solution: The only term in (8.20) that survives is the first term. Let’s use initial Latin
indices for the two dimensional space (t, x), and suppose the transverse plane is compact
with area A:
$$ S_g = -\frac{A}{32\pi G} \int d^2x \left[ (\partial_a h_+)^2 + (\partial_a h_\times)^2 \right]. \tag{8.22} $$
This is identical to the action of a free scalar field in 1 + 1d. Dividing by A we get the 3 + 1d
stress-energy tensor
$$ t_{ab} = \frac{1}{16\pi G} \left[ \partial_a h_+ \partial_b h_+ - \frac{1}{2} \eta_{ab} (\partial_c h_+)^2 \right], \tag{8.23} $$
and a similar expression for $h_\times$. I use lower-case $t_{ab}$ for the stress-energy tensor of gravitons
to distinguish it from that of matter fields which appears in the source term (8.4). The energy
flux is
$$ -t_{10} = -\frac{1}{16\pi G}\, \dot h_+ h'_+ = \frac{1}{16\pi G}\, \dot h_+^2, \tag{8.24} $$
where I used $h'_+ \equiv \partial_x h_+ = -\dot h_+$.
6. ∗ By the Weinberg-Witten theorem a theory with a massless spin-2 particle cannot have a non-trivial
(ordinarily) conserved stress-energy tensor. Why does our derivation of the energy flux
in the gravitational waves make sense despite this fact?
9 Phenomenology of Spin-2 Gravity
We formulated spin-2 gravity in the last lecture. Now we should understand what it predicts.
Redshift. As discussed in the context of scalar gravity, to study the performance of good
clocks (that is small clocks with fast internal dynamics), the gravitational field can be treated as a
constant matrix hµν . As before we set h0i = 0 by an appropriate diff so as to make connection with
dilaton gravity and Newtonian gravity. Looking back at the worldline action (8.9), we see that the
only effect of such a constant field is to replace
with zero 0i components. Hence, the internal dynamics of clocks would be the same if we measured
time and distances differently:
$$ \tilde X^0 = \sqrt{1 - h_{00}}\, X^0, \qquad \tilde X^i = \sqrt{\lambda_i}\, R^i_j X^j \quad \text{(no sum over } i\text{)}, \tag{9.2} $$
where $R^i_j$ is a rotation matrix that diagonalizes $g_{ij}$, and $\lambda_i$ are its eigenvalues:
$$ g_{ij} = \sum_m \lambda_m R^m_i R^m_j. \tag{9.3} $$
$$ \omega = \left(1 - \frac{1}{2} h_{00}\right) \omega_0 = (1 + \phi)\, \omega_0, \tag{9.4} $$
which is the same as in dilaton gravity. In fact, the dilaton gravity could have been formulated in
a similar fashion by taking
$$ g_{\mu\nu} = (1 + 2\phi)\, \eta_{\mu\nu}. \tag{9.5} $$
Note that in the $\tilde X^\mu$ system of measurement the usual (special relativistic) rules for translating
the rates of moving clocks to one another apply. Thus, it is natural to modify the definition of the
proper time of an observer who moves inside a gravitational field to be what her clock measures:
Good clocks always work at the same rate in terms of $\tilde\tau$. Soon we will get used to the new definition
and drop the tilde. In terms of $\tilde\tau$ the particle equation of motion (8.10) takes the form
$$ \frac{d}{d\tilde\tau} \left[ m\, g_{\mu\nu} \frac{dX^\nu}{d\tilde\tau} \right] = \frac{1}{2} m\, \partial_\mu g_{\alpha\beta}\, \frac{dX^\alpha}{d\tilde\tau} \frac{dX^\beta}{d\tilde\tau}. \tag{9.7} $$
This equation is called the Geodesic Equation, describing the shortest lines that can be drawn
on a manifold with metric $g_{\mu\nu}$ [footnote 8]. Of course, so far we derived it assuming small $h_{\mu\nu}$, ignoring $O(h_{\mu\nu}^2)$
corrections, and continue to do so in the rest of this lecture. Later I will return to the nonlinear
version.
As we saw in the context of scalar gravity, the motion of test particles in the static field
of astrophysical objects is an important testing ground for the theory of gravity. The frequency
of signals propagating linearly on such a background remains conserved [footnote 9]. Hence we can set up
experiments to easily measure gravitational redshift. Moreover, as in the case of dilaton gravity
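there is a conserved energy for a particle propagating on this background. The zeroth component of
the geodesic equation implies that the following quantity is a constant of motion
$$ m\, g_{0\mu} \frac{dX^\mu}{d\tilde\tau} = E. \tag{9.8} $$
$$ \partial^\mu \psi_{\mu\nu} \equiv \partial^\mu \left( h_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} h^\alpha_\alpha \right) = 0. \tag{9.9} $$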
Everything we said about dilaton waves can now be repeated for $\psi_{\mu\nu}$ almost unchanged. Far from
time-dependent sources there will be gravitational radiation:
$$ \psi_{\mu\nu} = \frac{4G}{r} \int d^3r'\, T_{\mu\nu}(t - |\mathbf{r} - \mathbf{r}'|, \mathbf{r}'). \tag{9.11} $$
They look like plane waves with $\mathbf{k} = \omega \hat r$. They carry energy and momentum, and they can be
measured by monitoring the motion of test masses and laser beams.
However, there are still too many components in $\psi_{\mu\nu}$. We can further simplify it by using the
residual diffs that preserve (9.9), namely diffs that satisfy
$$ \Box \xi_\mu = 0. \tag{9.12} $$
Footnote 8: To be honest they are the longest for Lorentzian signature. Any time-like trajectory can be deformed into
a zigzagging one made of null segments which has zero proper time. The universal description is stationary, or
optimum.
Footnote 9: This is saying that we can do a Fourier transform in t and different Fourier components $e^{i\omega t}$ decouple.
In particular, for a plane wave of momentum $k^\mu$ (i.e. $h_{\mu\nu}(x) = \hat h_{\mu\nu} e^{i k_\mu x^\mu}$ with $\hat h_{\mu\nu}$ = const.) we
should use $\xi_\mu = \hat\xi_\mu e^{i k_\mu x^\mu}$, which automatically satisfies (9.12), because $k_\mu k^\mu = 0$. Take
$$ \hat\xi_\mu = a k_\mu + b \bar k_\mu + \sum_{s=1,2} c_s\, \epsilon^{(s)}_\mu, \tag{9.13} $$
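and $\epsilon^{(s)}_\mu$ two orthonormal polarization vectors that span the transverse plane:
$$ \epsilon^{(s)}_\mu \epsilon^{(s')\mu} = \delta^{ss'}, \qquad k^\mu \epsilon^{(s)}_\mu = \bar k^\mu \epsilon^{(s)}_\mu = 0. \tag{9.15} $$
We can make $\hat h_{\mu\nu}$ purely spatial ($h_{0\mu} = 0$), transverse ($k^\mu h_{\mu\nu} = 0$), and traceless ($h^\alpha_\alpha = 0$) by
choosing
$$ a = -\frac{\bar k^\mu \bar k^\nu \hat h_{\mu\nu}}{8 i \omega^4}, \qquad b = \frac{\hat h^\mu_\mu}{4 i \omega^2}, \qquad c_s = \frac{\bar k^\mu \epsilon^{(s)\nu} \hat h_{\mu\nu}}{2 i \omega^2}. \tag{9.17} $$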
Therefore, in this gauge $\psi_{\mu\nu} = h_{\mu\nu}$ and for a wave propagating along $x^1$, the nonzero components
are
$$ h_+ = \frac{1}{2} (h_{22} - h_{33}), \qquad h_\times = h_{23}. \tag{9.18} $$
We have already derived the stress-energy tensor of these modes. Solving them in terms of Tµν
in the non-relativistic limit and deriving the quadrupole formula is nicely explained in LL §110.
However, for completeness I’ll summarize it here.
Let’s fix some $\hat k = \hat r = \hat x_1$. Observe that the above gauge transformation to spacelike-transverse-traceless
$h_{\mu\nu}$ does not change $h_{23}$ and $h_{22} - h_{33}$. Moreover, it is always true that $\psi_{23} = h_{23}$, $\psi_{22} - \psi_{33} = h_{22} - h_{33}$.
Therefore, we can immediately extract $h_+$ and $h_\times$ from the spatial components
of (9.11), which have $T_{ij}$ as a source integrated over $d^3r'$. We can use stress-energy conservation to
simplify any localized source (to simplify the notation I temporarily drop the prime on $\mathbf{r}$):
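$$
\begin{aligned}
\int d^3r\, T_{ij} &= \int d^3r\, [\partial_k (r_j T_{ik}) + r_j \partial_0 T_{i0}] \\
&= -\int d^3r\, r_j \partial_0 [\partial_k (r_i T_{0k}) + r_i \partial_0 T_{00}] + \text{B.T.} \\
&= -\int d^3r\, [r_i \partial_0 T_{j0} + r_i r_j \partial_0^2 T_{00}] + \text{B.T.} \\
&= -\int d^3r\, [T_{ji} + r_i r_j \partial_0^2 T_{00}] + \text{B.T.}
\end{aligned} \tag{9.19}
$$
where B.T. stands for boundary terms, which vanish for localized sources. This equation implies
that
$$ \int d^3r\, T_{ji} = -\frac{1}{2} \partial_0^2 \int d^3r\, r_i r_j T_{00}, \tag{9.20} $$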
For non-relativistic sources, $T_{00} \approx \rho$ and we get from (9.11)
$$ h_{23} = -\frac{2G}{3r}\, \ddot q_{23}, \qquad h_{22} - h_{33} = -\frac{2G}{3r} (\ddot q_{22} - \ddot q_{33}). \tag{9.21} $$
I replaced the second inertial moment with the quadrupole moment (restoring the prime)
$$ q_{ij} = \int d^3r'\, (3 r'_i r'_j - \delta_{ij} r'^2)\, T_{00}, \tag{9.22} $$
by removing the trace. The additional term is proportional to $\delta_{ij}$ and doesn’t affect (9.21). In the
last lecture we derived the expression for the flux:
$$ F = \frac{1}{16\pi G} (\dot h_+^2 + \dot h_\times^2). \tag{9.23} $$
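To find the total power, we need to integrate it over a sphere of large radius r. For this purpose it
is useful to define the plus and cross polarization vectors for a general $\hat k = \hat r$:
In terms of these polarization vectors the expression for the flux in an arbitrary direction becomes
$$ F = \frac{G}{36\pi r^2} \sum_s \left( \epsilon^{(s)}_{ij} \frac{d^3}{dt^3} q_{ij} \right)^2. \tag{9.25} $$
Since the polarization vectors give a basis for transverse, traceless matrices, the above sum over s
can be expressed purely in terms of $\hat r_i$ and $\delta_{ij}$:
$$
\begin{aligned}
\sum_s \epsilon^{(s)}_{ij} \epsilon^{(s)}_{mn} = \frac{1}{4} \bigl[ & (\delta_{im} \delta_{jn} + \delta_{in} \delta_{jm}) - \delta_{ij} \delta_{mn} + (\hat r_i \hat r_j \delta_{mn} + \hat r_m \hat r_n \delta_{ij}) \\
& - (\hat r_i \hat r_m \delta_{jn} + \hat r_i \hat r_n \delta_{jm} + \hat r_j \hat r_m \delta_{in} + \hat r_j \hat r_n \delta_{im}) + \hat r_i \hat r_j \hat r_m \hat r_n \bigr].
\end{aligned} \tag{9.26}
$$
The integral of this expression over $\hat r$ can then be expressed just in terms of $\delta_{ij}$:
$$ \int d\hat r \sum_s \epsilon^{(s)}_{ij} \epsilon^{(s)}_{mn} = \frac{2\pi}{15} (3 \delta_{im} \delta_{jn} + 3 \delta_{in} \delta_{jm} - 2 \delta_{ij} \delta_{mn}). \tag{9.27} $$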
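The polarization sum (9.26) is easy to verify by brute force for one direction. The sketch below takes $\hat r = \hat x_1$ and assumes polarization matrices normalized so that $\epsilon^{+}_{ij} h_{ij} = \frac{1}{2}(h_{22} - h_{33})$ and $\epsilon^{\times}_{ij} h_{ij} = h_{23}$, which is my reading of (9.18); the explicit matrices and the function names are assumptions of this example.

```python
from itertools import product

def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0

# Assumed polarization matrices for propagation along x^1 (index 0 here),
# normalized to reproduce h_plus and h_cross of (9.18).
EPS_PLUS = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, -0.5]]
EPS_CROSS = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.5], [0.0, 0.5, 0.0]]
R_HAT = (1.0, 0.0, 0.0)

def polarization_sum(i, j, m, n):
    """Left-hand side of (9.26): sum over the two polarizations."""
    return EPS_PLUS[i][j] * EPS_PLUS[m][n] + EPS_CROSS[i][j] * EPS_CROSS[m][n]

def rhs_926(i, j, m, n, r=R_HAT):
    """Right-hand side of (9.26) for the unit vector r."""
    d = delta
    return 0.25 * (
        d(i, m) * d(j, n) + d(i, n) * d(j, m) - d(i, j) * d(m, n)
        + r[i] * r[j] * d(m, n) + r[m] * r[n] * d(i, j)
        - (r[i] * r[m] * d(j, n) + r[i] * r[n] * d(j, m)
           + r[j] * r[m] * d(i, n) + r[j] * r[n] * d(i, m))
        + r[i] * r[j] * r[m] * r[n]
    )

# Largest mismatch over all 81 index combinations
max_error = max(
    abs(polarization_sum(*idx) - rhs_926(*idx)) for idx in product(range(3), repeat=4)
)
print(max_error)
```

With this normalization the two sides agree for every index combination. A different normalization of the polarization matrices would rescale the left-hand side by a constant.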
Gravitational waves can be detected by monitoring the distance between two free flying masses.
If one of the masses is equipped with a laser and an accurate clock, and the other with a good mirror,
the distance between the masses can be measured by timing how long it takes for a pulse of laser light
to make the round-trip journey. This is essentially how the mainstream detectors such as LIGO work
today. In 2016, LIGO announced the first direct detection of gravitational waves due to the merger
of two black holes of roughly 30 solar masses each (https://www.ligo.caltech.edu/detection).
Such a merger has different phases. The early phase and the longest is called Inspiral, during
which the two objects are separate and they orbit one another. Our linearized theory gives a good
description for the emission during most of this period. Towards the end of the inspiral phase and
during the Merger and Ringdown phase the system is fully relativistic. During this short final
period the full nonlinear formulation of general relativity is necessary to make predictions.
3. ∗ The goal of this problem is to see why massless spin-2 particles are described by a symmetric
tensor $h_{\mu\nu}$. Consider a helicity +2 single-particle state with momentum $k^\mu = (\omega, \omega, 0, 0)$:
The field hµν (x) has to be constructed out of annihilation and creation operators multiplied by
appropriate polarization vectors and mode-functions. In particular, this expansion includes
and focus on (9.33), the term whose momentum remains invariant. This can be calculated in
two ways. One is to use (9.32). The other is to transform ε∗µν .
4. ∗ Two objects of mass M have a head-on collision at event $(0, \vec 0)$. In the distant past, t → −∞,
the masses started at x → ±∞ with zero velocity.
5. ∗ Consider a thin metal rod of mass M and length ℓ spinning at frequency ω around a
symmetrical perpendicular axis.
a Show that the time-dependent part of the reduced quadrupole moment,
$$ J_{ij} \equiv I_{ij} - \frac{1}{3} \delta_{ij} I, $$
is
$$ J_{xx} = \frac{M \ell^2}{12} \cos^2(\omega t), \qquad J_{yy} = \frac{M \ell^2}{12} \sin^2(\omega t), \qquad J_{xy} = \frac{M \ell^2}{12} \sin(\omega t) \cos(\omega t). \tag{9.35} $$
b Use this result to compute the gravitational radiation luminosity emitted by the rod. What
is its power assuming that $M = 10^3$ g, ℓ = 100 cm, and ω = 1 kHz? How long does it take
for the rod to lose a significant part of its kinetic energy in GWs?
c Estimate the amplitude of the gravitational waves h at a distance of 1 km.
10 Nonlinear Gravity: I. General Covariance
Let us momentarily return to scalar gravity, which at linear order was defined by
I explicitly added the label (0) to emphasize that the r.h.s. is evaluated to zeroth order in ϕ.
Clearly, the total stress-energy tensor of the theory contains ϕ. First, because there is energy
stored in the gravitational field of massive objects. For instance, in a gravitationally bound system
this energy is negative twice the kinetic energy. Secondly, in the relativistic theory there are dilaton
waves, which carry energy and momentum even in the absence of matter. So to have a nonlinear
theory of scalar gravity one has to specify how the r.h.s. of (10.1) is generalized. However, from a
fundamental viewpoint there is no unique answer. We could just postulate that (10.1) is the full
description, or we could try to add higher order terms by adding higher order interactions to the
action. Of course, different choices would lead to different predictions. Given that scalar gravity
fails phenomenologically already at linear order, we don’t need to pursue this further.
$$ E_{\mu\nu}^{\alpha\beta}\, h_{\alpha\beta} = -16\pi G\, T^{(0)}_{\mu\nu}, \tag{10.2} $$
because the l.h.s. vanishes identically when acted on by $\partial^\mu$. This implies that for the equation to
be consistent the divergence of the r.h.s. must vanish on-shell. This holds to zeroth order in $h_{\mu\nu}$ but
no further than that! Again there is energy in the gravitational field of massive objects, and more
importantly there are gravitational waves. Matter can source gravitational waves and lose energy,
and can absorb them. Therefore, at nonlinear order in $h_{\mu\nu}$ the r.h.s. must look like
where $T^{(n)}_{\mu\nu}$ incorporates the effect of coupling between matter and $h_{\mu\nu}$ at order $h_{\mu\nu}^{n+1}$, and $t^{(n)}_{\mu\nu}$
corresponds to purely gravitational contributions such as the stress-energy tensor of the GWs derived
in the last lecture. Solving this problem iteratively is nontrivial because once we add a term of
order $h_{\mu\nu}$ to the r.h.s., we have to modify the action at $O(h_{\mu\nu}^2)$. This in turn contributes to the
stress-energy tensor at $O(h_{\mu\nu}^2)$ and hence requires a new term of $O(h_{\mu\nu}^3)$ to be added to the action and
so on. There are several arguments at various levels of rigor showing that General Relativity is
the unique answer to this problem [10]. Below I will give only some hints of why this is the case.
[10] See e.g. “Self-Interaction and Gauge Invariance” by Deser, and “Spin-2 Fields and General Covariance” by Wald.
1. ∗ a) Consider a theory of two decoupled fields ψ and χ,
Show that there are two conserved stress-energy tensors T1µν and T2µν , one for each action. (As
a concrete example you can take ψ, χ to be free scalar fields and derive T1 and T2 explicitly.)
b) Add a small coupling between the two fields:
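$$ S \to S + S_{\rm int}, \qquad S_{\rm int} = \epsilon \int d^4x\, \chi^2 \psi^2. \tag{10.5} $$
Show that $T_1^{\mu\nu}$ and $T_2^{\mu\nu}$ are no longer conserved but $\partial_\mu T_1^{\mu\nu} = O(\epsilon)$ and $\partial_\mu T_2^{\mu\nu} = O(\epsilon)$. Rather
there is only one conserved tensor
where $T_{\psi\chi}^{\mu\nu}$ is the stress-energy tensor derived from $S_{\rm int}$.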
The situation with gravity is similar. As long as gravitons and matter fields are decoupled,
the stress-energy tensor of the matter fields, denoted by $T^{(0)}_{\mu\nu}$, is conserved. One can also
talk about the conserved energy and momentum of free gravitons obtained by integrating $t^{(2)}_{\mu 0}$
over space. Once we add the coupling $h^{\mu\nu} T^{(0)}_{\mu\nu}$, neither $T^{(0)}_{\mu\nu}$ nor $t^{(2)}_{\mu\nu}$ is conserved. In
particular $\partial^\mu T^{(0)}_{\mu\nu} = O(h_{\mu\nu})$.
Let’s look back at the Maxwell theory for some inspiration. Recall that the l.h.s. of the Maxwell
equation ∂ν F νµ = −J µ is also identically conserved, and hence the r.h.s. is the electric current
which is conserved on-shell. We saw that this follows naturally once we realize that the description
in terms of Aµ is redundant and the action must be invariant under U (1) gauge transformations
We learned from the Noether theorem that if the action of a set of fields collectively denoted by
ψ is invariant under a global symmetry with parameter a, when the parameter of the symmetry
transformation is promoted to a function of the spacetime the action changes as
$$ \delta S_\psi = -\int d^4x\, \partial_\mu a\, J^\mu. \tag{10.8} $$
Therefore, to ensure (10.7) we identified a(x) = α(x) and coupled Aµ to the Noether current
associated to the global symmetry of the charged fields ψ
$$ S_{A\psi} = \int d^4x\, A_\mu J^\mu. \tag{10.9} $$
This procedure of promoting a global symmetry to a local one (i.e. one with an arbitrary spacetime
dependent parameter a(x)) by coupling the fields to a gauge field is called Gauging the
Symmetry. The sourced Maxwell theory is so simple because photons and hence the Aµ field
are electrically neutral. Therefore, we do not need to modify the homogeneous part of the action.
But the conserved electric current J µ will generically change after coupling charged particles to
photons. For instance the source term in the Maxwell equation coupled to a charged scalar field is
where Dµ = ∂µ + iqAµ .
From this point of view, what we have done for spin-2 gravity is to gauge spacetime translations
xµ → xµ −aµ . When the parameters of translations become spacetime dependent and identified with
the gauge parameters that transform hµν (i.e. aµ (x) = ξ µ (x)) we obtain a gauge symmetry that has
several names: General Covariance, Reparametrization Invariance, Diffeomorphism, etc.
The difference from the Maxwell theory is that the graviton is charged under spacetime translations,
namely, it carries energy and momentum. From the perspective of the Noether theorem this is
saying that the hµν (x) field also transforms under translations. Therefore, the conserved Noether
current Tµν , whose zeroth component used to be the density of energy and momentum, has to be
modified once matter fields are coupled to hµν . As mentioned above one way to construct the full
theory is to iterate, starting from the linear theory.
Another way is to be smart like Einstein was and guess the full answer: Our goal is to write a
nonlinear action for hµν and matter fields ψ such that it is invariant under
and such that for $h_{\mu\nu} \ll 1$ it reproduces our linearized theory [11]. To do this first divide the most
general action for $h_{\mu\nu}$, ψ into two pieces:
The requirement of general covariance is quite strong. It almost uniquely fixes both actions. Let’s
focus on Sm first. Recall that at hµν = 0, Sm is the volume integral of a Lagrangian which is a local
function of fields and their derivatives and which is a scalar under Lorentz transformations. This
is achieved by forming invariant products of vectors and derivatives using the Minkowski metric.
For general coordinate transformations (10.11) this construction is insufficient:
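$$ \frac{\partial x^\alpha}{\partial \tilde x^\mu}\frac{\partial x^\beta}{\partial \tilde x^\nu}\,\eta_{\alpha\beta} \neq \eta_{\mu\nu}. \tag{10.13} $$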
[11] Note the distinction between the linearized diffs under which $\delta h_{\mu\nu} = \partial_{(\mu}\xi_{\nu)}$ and full diffeomorphisms. Once $\xi^\mu$
is identified as the change of $x^\mu$ we have to include $O(\xi h_{\mu\nu})$ terms such as $\xi^\lambda\partial_\lambda h_{\mu\nu}$ in the transformation law of $h_{\mu\nu}$.
We therefore replace all ηµν ’s with a new metric tensor gµν (x) (made of hµν ) such that
is invariant. (Note that the invariant interval $ds^2$ is commonly defined like this regardless of metric
signature, so in mostly plus signature $d\tau^2 = -ds^2$.)
At $O(h_{\mu\nu}^2)$ the relation depends on what our (so far unspecified) prescription is for the transformation
of $h_{\mu\nu}$ at $O(h_{\mu\nu}\xi^\alpha)$. For instance, we can take (10.15) as the full nonlinear definition of $h_{\mu\nu}$.
Note that in any case hµν is not a tensor,
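$$ \tilde h_{\mu\nu}(\tilde x) \neq \frac{\partial x^\alpha}{\partial \tilde x^\mu}\frac{\partial x^\beta}{\partial \tilde x^\nu}\, h_{\alpha\beta}(x) \tag{10.16} $$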
because ηµν is not. Indeed, if a tensor is zero at a point x in one coordinate system, it has to be
zero in all others. The inhomogeneous transformation δhµν = ∂µ ξν + ∂ν ξµ + · · · breaks this rule
because we can start from hµν = 0 and make it nonzero by almost any ξ µ (x). Therefore, hµν and
ηµν have no place in the covariant action unless they come together.
The simplest example of a covariant matter action is perhaps the worldline action
$$ S_{pp} = -m\int d\sigma\, \sqrt{-g_{\mu\nu}(X)\,\frac{dX^\mu}{d\sigma}\frac{dX^\nu}{d\sigma}}. \tag{10.18} $$
This is what we encountered at linear order, but now it is a fully nonlinear generally covariant
theory. More generally, we still need to make two additional changes to have a complete recipe for
Sm .
Covariant Derivative. In Cartesian coordinates, derivatives of tensors are higher rank tensors.
This is not the case in general coordinate systems; example:
The reason is clear. It is in general not meaningful to subtract components of vector fields at
different spacetime points. For instance, in spherical coordinates Aθ at two different values of θ point
in two different directions. However, one can introduce a covariant derivative ∇µ to construct proper
higher rank tensors from lower order ones. The trick is to combine (10.19) (and its analogs) with
another non-covariant object made of derivatives of gµν such that the last term in the transformation
cancels
$$ \nabla_\mu A_\nu = \partial_\mu A_\nu - \Gamma^\alpha_{\mu\nu} A_\alpha \tag{10.20} $$
where Γαµν are called Christoffel symbols. They don’t form a tensor:
More generally, the covariant derivative of a tensor contains one Christoffel symbol per index:
Notation. The ordinary derivative and covariant derivative are sometimes denoted by comma and
semicolon, respectively:
$$ A_{\mu,\nu} \equiv \partial_\nu A_\mu, \qquad A_{\mu;\nu} \equiv \nabla_\nu A_\mu. \tag{10.23} $$
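5. ∗ Show that
$$ \Gamma^\sigma_{\mu\nu} = \frac{g^{\sigma\rho}}{2}\left(\partial_\mu g_{\rho\nu} + \partial_\nu g_{\rho\mu} - \partial_\rho g_{\mu\nu}\right) \tag{10.25} $$
has the same transformation property as (10.21). This is the unique choice that is symmetric
in µ ↔ ν and ensures
$$ \nabla_\mu g_{\alpha\beta} = 0. \tag{10.26} $$
What is $\nabla_\mu g^{\alpha\beta}$?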
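To get a feel for (10.25), here is a minimal numerical sketch that evaluates the Christoffel symbols by central differences. The example metric (the unit 2-sphere, diag(1, sin²θ) in coordinates (θ, φ)), the step size, and all function names are illustrative assumptions.

```python
import math

def metric(x):
    """Unit 2-sphere metric in coordinates x = (theta, phi): diag(1, sin^2 theta)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def d_metric(x, mu, eps=1e-6):
    """Central-difference estimate of partial_mu g_{ab} at x."""
    xp, xm = list(x), list(x)
    xp[mu] += eps
    xm[mu] -= eps
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * eps) for b in range(2)] for a in range(2)]

def christoffel(x):
    """Return Gamma[s][m][n] = Gamma^s_{mn} computed from (10.25)."""
    d = 2
    ginv = inverse_2x2(metric(x))
    dg = [d_metric(x, mu) for mu in range(d)]   # dg[mu][a][b] = partial_mu g_{ab}
    return [[[0.5 * sum(ginv[s][r] * (dg[m][r][n] + dg[n][r][m] - dg[r][m][n])
                        for r in range(d))
              for n in range(d)] for m in range(d)] for s in range(d)]

theta = 1.0
Gamma = christoffel((theta, 0.3))
print(Gamma[0][1][1], -math.sin(theta) * math.cos(theta))  # Gamma^theta_{phi phi}
print(Gamma[1][0][1], 1.0 / math.tan(theta))               # Gamma^phi_{theta phi}
```

The two printed pairs should agree up to finite-difference error, reproducing the familiar sphere results $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$.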
From a purely geometrical point of view there is no unique covariant derivative. To any $\Gamma^\sigma_{\mu\nu}$
that satisfies (10.21) we can add an arbitrary tensor field $C^\sigma_{\mu\nu}$ and thereby get a new covariant
derivative. (It is also true that any two choices $\Gamma$ and $\Gamma'$ differ by a tensor field.) In particular, the
antisymmetric part
$$ T^\sigma_{\mu\nu} = \Gamma^\sigma_{\mu\nu} - \Gamma^\sigma_{\nu\mu} \tag{10.27} $$
is a tensor called the Torsion Tensor. Geometrically, the vanishing of the torsion tensor is motivated
because it guarantees that
$$ T^\sigma_{\mu\nu} = 0 \;\Rightarrow\; \nabla_\mu\nabla_\nu f(x) = \nabla_\nu\nabla_\mu f(x). \tag{10.28} $$
And the condition (10.26) ensures that (as we will see) the inner product is preserved under parallel
transport. Thus apart from being special, there is no strict mathematical rule to enforce (10.25).
From our physical point of view, the situation is completely different. We started from a theory of
matter coupled to gravitons, described by hµν . Our goal is to add coupling between matter and
hµν to construct a covariant theory. The only way to construct covariant derivatives using just hµν
(so that ∇µ → ∂µ when hµν = 0) is to choose Christoffels as in (10.25), so we should stick to it.
6. ∗ Rewrite the geodesic equation in terms of Christoffels. Note that they play the role of the
gravitational field.
7. ∗ Show that the variation of metric under linear diffs can be written as
$$ \delta_\xi g_{\mu\nu} = \nabla_\mu\xi_\nu + \nabla_\nu\xi_\mu. \tag{10.29} $$
So we replace ordinary derivatives with covariant derivatives. This procedure should be familiar
from gauge theories like electromagnetism. In order to gauge a global symmetry we have to replace
∂µ → Dµ = ∂µ + iqAµ . (10.30)
The gauge field Aµ and the Christoffel symbols are sometimes called Connections, since they
allow us to compare vectors and tensors at different spacetime points [12].
Volume Element. The last step to build a covariant action is to replace the measure
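$$ d^4x \to d^4x\,\sqrt{-g}, \qquad g \equiv \det g_{\mu\nu}. \tag{10.31} $$
Using the tensor transformation law of the metric,
$$ \tilde g_{\mu\nu}(\tilde x) = \frac{\partial x^\alpha}{\partial\tilde x^\mu}\frac{\partial x^\beta}{\partial\tilde x^\nu}\, g_{\alpha\beta}(x), \tag{10.32} $$
one easily sees that the new measure is invariant, and hence the action is invariant if a scalar is
integrated using this measure over the whole spacetime. It is convenient to absorb $\sqrt{-g}$ inside the
Lagrangian density to make it a scalar density (rather than a scalar).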
8. ∗ Is the Levi-Civita symbol (the fully antisymmetric rank 4 object with ε0123 = 1) a tensor?
If not, can you combine it with $\sqrt{-g}$ to construct a tensor?
[12] A more sophisticated way of saying this is the following. In gauge theories there is an independent copy of the
symmetry group at each spacetime point. In order to compare fields that transform nontrivially under the symmetry
group (e.g. vectors and tensors in GR) at two different points we need a connection that lives on the link between
those two points. It transforms under the symmetry group at the end of the link and inverse of the group at the
beginning of the link.
9. ∗ Show that for any vector V µ
$$ \sqrt{-g}\,\nabla_\mu V^\mu = \partial_\mu\left(\sqrt{-g}\,V^\mu\right). \tag{10.33} $$
Therefore, such a term in the Lagrangian reduces to a boundary contribution to the action.
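The identity (10.33) can be spot-checked numerically. The sketch below uses the unit 2-sphere (Euclidean signature, so $\sqrt{g} = \sin\theta$ replaces $\sqrt{-g}$), the standard contracted Christoffel symbols $\Gamma^\mu_{\mu\theta} = \cot\theta$, $\Gamma^\mu_{\mu\phi} = 0$, and an arbitrarily chosen vector field. These choices and the function names are assumptions for illustration only.

```python
import math

def V(theta, phi):
    """A hypothetical smooth vector field (V^theta, V^phi), chosen only for illustration."""
    return (math.sin(theta) * math.cos(phi), theta ** 2 + math.sin(phi))

def sqrt_g(theta, phi):
    """Square root of the metric determinant on the unit sphere."""
    return math.sin(theta)

def partial(f, x, mu, eps=1e-5):
    """Central-difference estimate of partial_mu f at x = (theta, phi)."""
    xp, xm = list(x), list(x)
    xp[mu] += eps
    xm[mu] -= eps
    return (f(*xp) - f(*xm)) / (2 * eps)

def lhs(theta, phi):
    """sqrt(g) * nabla_mu V^mu, with nabla_mu V^mu = partial_mu V^mu + Gamma^mu_{mu lam} V^lam."""
    x = (theta, phi)
    div = sum(partial(lambda t, p, m=mu: V(t, p)[m], x, mu) for mu in range(2))
    div += V(theta, phi)[0] / math.tan(theta)   # Gamma^mu_{mu theta} = cot(theta)
    return sqrt_g(theta, phi) * div

def rhs(theta, phi):
    """partial_mu (sqrt(g) V^mu)."""
    x = (theta, phi)
    return sum(partial(lambda t, p, m=mu: sqrt_g(t, p) * V(t, p)[m], x, mu)
               for mu in range(2))

left, right = lhs(0.8, 0.4), rhs(0.8, 0.4)
print(left, right)
```

The two numbers should coincide up to finite-difference error, whichever smooth field is used for `V`.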
To summarize, we learned that in order to couple matter to spin-2 gravity we have to write a
generally covariant theory, in which the matter fields no longer see the flat Minkowski metric but
rather gµν = ηµν + hµν , and the coupling to the spin-2 field hµν is only through this metric. The
theory is invariant under reparameterization of coordinates (10.11). For any point xµ0 there exists
a set of ξ µ such that in the new coordinate system
Moreover, different ξ µ in this set are related by Poincaré transformations (see Weinberg 3.2 for a
proof). In these coordinate systems the geodesic equation at X µ (τ0 ) = xµ0 becomes
$$ \left.\frac{d^2X^\mu}{d\tau^2}\right|_{\tau_0} = 0, \tag{10.35} $$
where τ denotes the proper time as measured by gµν (i.e. τ̃ of the previous lecture). For this reason
they are called Local Inertial Frames. The existence of local inertial frames for any x0 is
Einstein’s famous Equivalence Principle.
Even though GR textbooks often attribute a central role to the Equivalence Principle and
general covariance, they do not lead uniquely to Einstein’s theory of gravity. First of all, general
covariance is an almost empty statement. Any theory could be formulated in a covariant way by
following the above steps. It is only when we regard gµν (or equivalently hµν ) as a dynamical
variable and introduce the action Sg [hµν ] that we obtain a theory of spin-2 gravity. Equivalence
Principle, on the other hand, is not empty as it relates gravity to geometry. However, it doesn’t
single out spin-2 gravity. Scalar gravity can be completed at nonlinear order into a covariant theory,
with the metric given by
$$ g_{\mu\nu} = e^{2\phi}\,\eta_{\mu\nu}. \tag{10.36} $$
Then the Equivalence Principle holds in the sense that at any point x0 there is a set of inertial
frames in which the laws of physics are those of special relativity. I won’t spend much time talking
about these matters of principle, because they sound mysterious and deeper than what they actually
are. Almost any GR textbook devotes at least a section (often a chapter) to the Equivalence Principle,
which you can refer to for further discussion.
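$$ \xi^\mu = A^\mu{}_\nu x^\nu + B^\mu_{\alpha\beta}\, x^\alpha x^\beta + O(|x|^3). \tag{10.37} $$
Show that the condition $\tilde g_{\mu\nu}(0) = \eta_{\mu\nu}$ fixes $A^\mu{}_\nu$ up to a Lorentz transformation, and the
condition $\partial_\sigma \tilde g_{\mu\nu}(0) = 0$ uniquely fixes $B^\mu_{\alpha\beta}$. This proves the existence of local inertial frames.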
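The counting behind this problem can be sketched in a few lines. The code below tallies free coefficients in the expansion (10.37) against the conditions imposed at each order, for d spacetime dimensions. The cubic coefficient, called `C` here, is not named in the text and is introduced only to show what happens at the next order.

```python
from math import comb

def count(d):
    """Free parameters in (10.37) versus conditions on the metric at the point, in d dimensions."""
    sym2 = comb(d + 1, 2)   # components of an object symmetric in two indices
    sym3 = comb(d + 2, 3)   # components of an object symmetric in three indices
    return {
        "A": d * d,              # A^mu_nu
        "g(0)": sym2,            # conditions g~_{mu nu}(0) = eta_{mu nu}
        "B": d * sym2,           # B^mu_{alpha beta}, symmetric in alpha, beta
        "dg(0)": d * sym2,       # conditions partial_sigma g~_{mu nu}(0) = 0
        "C": d * sym3,           # hypothetical cubic coefficients C^mu_{alpha beta gamma}
        "ddg(0)": sym2 * sym2,   # second derivatives of the metric at the point
    }

c = count(4)
print("A left over after fixing g(0):", c["A"] - c["g(0)"])
print("B versus first-derivative conditions:", c["B"], c["dg(0)"])
print("second derivatives that cannot be removed:", c["ddg(0)"] - c["C"])
```

For d = 4 the leftover freedom in A matches the six parameters of the Lorentz group, B matches the first-derivative conditions exactly, and at the next order 20 second derivatives survive, which is the number of independent components of the Riemann tensor.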
Stress-Energy Tensor. Once we have a covariant formulation of the matter theory, we can
give an alternative and more efficient method of deriving the stress-energy tensor: By general
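covariance
$$ 0 = \delta_\xi S[\psi, g_{\mu\nu}] = \int d^4x \left( \delta_\xi g^{\mu\nu}\,\frac{\delta L}{\delta g^{\mu\nu}} + \delta_\xi\psi\,\frac{\delta L}{\delta\psi} \right). \tag{10.38} $$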
We can use (10.29) and the general relation
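$$ \delta g^{\mu\nu} = -g^{\mu\alpha} g^{\nu\beta}\,\delta g_{\alpha\beta} \tag{10.39} $$
which holds between any tensor field and its inverse (satisfying $g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho$) to write the first term
as
$$ \begin{aligned} \delta_\xi g^{\mu\nu}\,\frac{\delta L}{\delta g^{\mu\nu}} &= \left(-\nabla^\mu\xi^\nu - \nabla^\nu\xi^\mu\right)\frac{\delta L}{\delta g^{\mu\nu}} \\ &= \sqrt{-g}\,\xi^\mu\,\nabla^\nu\!\left(\frac{2}{\sqrt{-g}}\frac{\delta L}{\delta g^{\mu\nu}}\right) + \text{total derivative} \end{aligned} \tag{10.40} $$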
where in the second line we used the symmetry of g µν , the Leibniz rule and (10.33). The total
derivative term can be neglected because ξ µ (x) is an arbitrary function of x which we take to be
zero at infinity. So we have
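$$ \int d^4x \left[ \sqrt{-g}\,\xi^\mu\,\nabla^\nu\!\left(\frac{2}{\sqrt{-g}}\frac{\delta L}{\delta g^{\mu\nu}}\right) + \delta_\xi\psi\,\frac{\delta L}{\delta\psi} \right] = 0. \tag{10.41} $$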
The second term is proportional to the equations of motion for matter fields, so it
vanishes on-shell. Hence, since $\xi^\mu$ is arbitrary,
$$ \left.\nabla^\nu\!\left(\frac{2}{\sqrt{-g}}\frac{\delta L}{\delta g^{\mu\nu}}\right)\right|_{\text{on-shell}} = 0. \tag{10.42} $$
We can identify this covariantly conserved tensor as the covariant stress-energy tensor of the matter
action:
$$ T_{\mu\nu} = -\frac{2}{\sqrt{-g}}\frac{\delta L_m}{\delta g^{\mu\nu}}. \tag{10.43} $$
This gives a very efficient prescription to derive a symmetric stress-energy tensor even in the absence
of gravity. We can momentarily covariantize the theory by following the above steps, vary with
respect to $g^{\mu\nu}$ and then set it back to $\eta^{\mu\nu}$. The result is an ordinarily conserved tensor.
When $h_{\mu\nu}$ is finite, (10.43) has to be regarded as the sum $\sum_{n=0}^{\infty} T^{(n)}_{\mu\nu}$ in (10.3) of corrections at all
orders to the matter stress-energy tensor due to the gravitational interactions.
11. ∗ Show that for a symmetric rank-2 tensor
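$$ \nabla_\mu T^\mu_\nu = \frac{1}{\sqrt{-g}}\,\partial_\mu\!\left(\sqrt{-g}\,T^\mu_\nu\right) - \frac{1}{2}\,\partial_\nu g_{\alpha\beta}\, T^{\alpha\beta} \tag{10.44} $$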
11 Nonlinear Gravity: II. Einstein-Hilbert Action
Next we turn attention to Sg [hµν ]. We learned that the action must be generally covariant and
hence only a function of $g_{\mu\nu}$, its inverse $g^{\mu\nu}$ and covariant derivatives $\nabla_\mu$. Therefore, it is a fully
geometrical object. Moreover, at quadratic order in $h_{\mu\nu}$ it has to reproduce $S_g^{(2)}[h_{\mu\nu}]$ that was
constructed in the previous lecture. And it has to be a spacetime integral of a Lagrangian which is
a scalar density, perhaps up to total derivative terms. It is a simple exercise in differential geometry
to show that there is only one such object, the Einstein-Hilbert action
$$ S_{EH} = \frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,R \tag{11.1} $$
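where $R = g^{\mu\nu}R_{\mu\nu}$ is called the Ricci scalar, the trace of the Ricci tensor $R_{\mu\nu}$. The Ricci tensor itself is
defined as a trace of the Riemann tensor
$$ R_{\mu\nu} = R^\lambda{}_{\mu\lambda\nu} \tag{11.2} $$
which is given by the following expression in terms of the metric and its derivatives
$$ R^\mu{}_{\nu\alpha\beta} = \partial_\alpha\Gamma^\mu_{\nu\beta} - \partial_\beta\Gamma^\mu_{\nu\alpha} + \Gamma^\mu_{\alpha\lambda}\Gamma^\lambda_{\nu\beta} - \Gamma^\mu_{\beta\lambda}\Gamma^\lambda_{\nu\alpha}. \tag{11.3} $$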
For the moment, don’t worry about the geometric meaning of these quantities. We will explore
them in more detail later. Think of the Einstein-Hilbert action as just a messy nonlinear action
for hµν , with two derivatives in each term. It is unique because everything can be packed into a
covariant expression.
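To see the definitions (11.2) and (11.3) at work, here is a small numerical sketch for a 2-sphere of radius a (Euclidean signature). The Christoffel symbols of the sphere are typed in analytically (standard results, which can also be obtained from (10.25)), and their derivatives are taken by central differences. The radius value and the function names are illustrative assumptions.

```python
import math

def gamma(x):
    """Christoffel symbols G[m][a][b] = Gamma^m_{ab} of a 2-sphere in (theta, phi)."""
    theta, _ = x
    G = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)            # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)  # Gamma^phi_{theta phi}
    return G

def d_gamma(x, alpha, eps=1e-5):
    """Central-difference estimate of partial_alpha Gamma^m_{ab}."""
    xp, xm = list(x), list(x)
    xp[alpha] += eps
    xm[alpha] -= eps
    Gp, Gm = gamma(xp), gamma(xm)
    return [[[(Gp[m][a][b] - Gm[m][a][b]) / (2 * eps) for b in range(2)]
             for a in range(2)] for m in range(2)]

def riemann(x):
    """R[m][n][a][b] = R^m_{n a b} assembled term by term from (11.3)."""
    G = gamma(x)
    dG = [d_gamma(x, al) for al in range(2)]   # dG[al][m][n][b] = partial_al Gamma^m_{nb}
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for m in range(2):
        for n in range(2):
            for a in range(2):
                for b in range(2):
                    R[m][n][a][b] = (dG[a][m][n][b] - dG[b][m][n][a]
                                     + sum(G[m][a][l] * G[l][n][b] - G[m][b][l] * G[l][n][a]
                                           for l in range(2)))
    return R

def ricci_scalar(x, radius):
    """R = g^{mu nu} R_{mu nu} with R_{mu nu} = R^lam_{mu lam nu} as in (11.2)."""
    theta, _ = x
    R = riemann(x)
    ricci = [[sum(R[l][m][l][n] for l in range(2)) for n in range(2)] for m in range(2)]
    ginv = [1.0 / radius ** 2, 1.0 / (radius * math.sin(theta)) ** 2]  # diagonal inverse metric
    return sum(ginv[m] * ricci[m][m] for m in range(2))

radius = 2.0   # hypothetical radius a
R_scalar = ricci_scalar((1.1, 0.2), radius)
print(R_scalar, 2.0 / radius ** 2)
```

The result should match the constant curvature 2/a² of the sphere at any point away from the poles, where the (θ, φ) coordinates break down.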
To derive the equation of motion for hµν we can alternatively vary the action with respect to
gµν , or g µν . The variation of SEH gives (for a derivation see LL §95, or wait until I get a chance to
talk about first order formalism)
$$ \frac{2}{\sqrt{-g}}\frac{\delta L_{EH}}{\delta g^{\mu\nu}} = \frac{1}{8\pi G}\left(R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R\right). \tag{11.4} $$
The tensor on the r.h.s. is called the Einstein Tensor Gµν . At linear order in hµν it reproduces
the l.h.s. of the unique linear spin-2 field equation:
$$ G^{(1)}_{\mu\nu} = -\frac{1}{2}\,E_{\mu\nu}^{\alpha\beta}\,h_{\alpha\beta}. \tag{11.5} $$
The higher order terms should be thought of as the sum of purely gravitational contributions to
the stress-energy tensor on the r.h.s. of (10.3), taken to the l.h.s. of the equation. Namely
$$ G_{\mu\nu} - G^{(1)}_{\mu\nu} = -8\pi G \sum_{n=2}^{\infty} t^{(n)}_{\mu\nu}. \tag{11.6} $$
The same argument which led to covariant conservation of $T_{\mu\nu}$ defined as (10.43) leads to
$$ \nabla^\mu G_{\mu\nu} \equiv 0, \tag{11.7} $$
now as an identity (rather than an on-shell statement), because there is no field other than $g_{\mu\nu}$ in
$S_{EH}$. This is called the Bianchi Identity. Thus the full equation of motion for $g^{\mu\nu}$ is the
Einstein Equation
$$ G_{\mu\nu} = 8\pi G\, T_{\mu\nu}. \tag{11.8} $$
1. Show that the Einstein equation is a solution to the problem we encountered at the beginning
of the lecture. Namely, show that if we move the nonlinear pieces of Gµν to the r.h.s. the
result is ordinarily conserved
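$$ \partial_\mu \tau^\mu_\nu \equiv \partial_\mu\left[ T^\mu_\nu - \frac{1}{8\pi G}\left(G^\mu_\nu - G^{(1)\mu}{}_\nu\right) \right] = 0, \quad \text{on-shell}. \tag{11.9} $$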
Solution. This follows from the Bianchi identity and the Einstein plus matter equations of
motion:
where we used $\partial_\mu G^{(1)\mu}{}_\nu = 0$. Similarly,
where to get to the second line we used the Einstein equation. Combining the above two
expressions we get (11.9). Note that even though I used the Einstein equation itself to derive
this result it is a nontrivial check since I didn’t use its derivative. It tells us that unlike
linearized spin-2 gravity, the Einstein equation is consistent with matter equations of motion.
That is, if the Einstein equation holds at some moment of time and we solve matter equations,
the Einstein equation can hold at a later moment.
Conserved Energy and Momentum. We discovered that the stress-energy tensor is covariantly
(and not ordinarily) conserved. Can we define a conserved energy-momentum 4-vector $P_\mu$ by
integrating $T_\mu{}^0$ over a spatial surface, say
$$ \int d^3x\, T_\mu{}^0, \tag{11.12} $$
as we did in Cartesian coordinates? Not surprisingly, the answer is no. Technically because in non-
Cartesian coordinates the component µ at different spacetime points means different things, so we
cannot simply add them up. (Remember that we needed a connection to subtract them at different
points.) However, there is a more physical reason. The gravitational field also carries energy and
momentum, even in the absence of matter. Any expression for energy has to incorporate that. Tµν
does not, because it vanishes if we set all matter fields to zero:
ψ = 0 =⇒ Tµν = 0. (11.13)
Furthermore, unlike non-gravitational physics there cannot exist any covariant expression for the
local density of energy (or momentum). We learned that at any spacetime point we can set hµν and
its first derivative to zero by choosing an inertial frame (what we cannot do is to set them to zero
globally). So a would-be stress-energy tensor of the gravitational field must vanish at that point.
But if a tensor vanishes in some frame it vanishes in all other frames. By repeating this argument
at every spacetime point, one arrives at the conclusion that if an energy-momentum tensor exists
for hµν field, it has to be zero everywhere.
The best we can hope for is to find a non-covariant expression for energy-momentum density
such that once integrated over a spatial slice it gives a meaningful answer Pµ for the total energy-
momentum of the spacetime. This is a nontrivial problem. Mathematicians and Physicists have
been working on it up until today [13]. The answer exists provided that we can nail the asymptotics
(behavior at large values of r) to some fixed spacetime, such as Minkowski. That is, if hµν and
consequently matter fields which source hµν fall off sufficiently rapidly as we take r → ∞ with t
being kept fixed. Then there exist Pseudo Tensors of Energy and Momentum, one example
of them given by $\tau^\mu_\nu$ in (11.9). Although $\tau^\mu_\nu$ is not unique, the following integral gives a unique
total 4-momentum
$$ P_\mu = \int d^3x\, \tau^0_\mu \tag{11.14} $$
provided it is taken over an entire time-slice that as r → ∞ matches a constant t slice of the
fixed asymptotic Minkowski spacetime. Here, I am skipping lots of details about the fall-off rate
of deviations from Minkowski, as well as the proof of why under Lorentz transformations of the
asymptotic coordinates, Pµ behaves like a 4-vector. They can be found in LL §96. To learn even
more about them, you should wait until we acquire more geometric tools.
[13] For an example see https://arxiv.org/abs/hep-th/9902121.
However, the definition I just gave has a very nice property that everyone can appreciate
immediately. On the equations of motion, we have
$$ \tau^\mu_\nu\big|_{\text{on-shell}} = \frac{1}{8\pi G}\,G^{(1)\mu}{}_\nu = -\frac{1}{16\pi G}\,\partial_\alpha H_\nu{}^{\alpha\mu}, \tag{11.15} $$
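where I used the relation (11.5) between the linearized Einstein tensor and the second order differential
expression $E_{\mu\nu}^{\alpha\beta} h_{\alpha\beta}$, which can be written as the total derivative of an anti-symmetric rank-3
object:
$$ H_\nu{}^{\alpha\mu} = \partial^\alpha h^\mu_\nu + \delta^\mu_\nu\,\partial^\beta h^\alpha_\beta + \delta^\alpha_\nu\,\partial^\mu h^\beta_\beta - \{\alpha \leftrightarrow \mu\}. \tag{11.16} $$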
Inserting the µ = 0 component of (11.15) in (11.14), and using the antisymmetry of $H_\nu{}^{\alpha\mu}$, we obtain
a beautiful result:
$$ P_\mu = -\frac{1}{16\pi G}\int d^3x\, \partial_i H_\mu{}^{i0} = -\frac{1}{16\pi G}\oint_{r\to\infty} dS_i\, H_\mu{}^{i0}. \tag{11.17} $$
Just like Gauss’s law, which relates the total electric charge within a region of space to the flux
of the electric field through the boundary, the energy-momentum of the entire spacetime is related
to the flux of Hµi0 at infinity. There is no longer any reference to the interior of the spacetime.
This expression can be used even if there is a hole inside the spacetime (in fact there are holes in
GR, with a singularity inside them). Since asymptotically the spacetime is Minkowski, hµν has to
approach zero. Hence, we can return to the linearized theory and treat hµν as a tensor field which
lives on Minkowski spacetime. It is no longer a surprise that Pµ is a 4-vector in asymptotia.
12 Manifolds
In the last lectures, I argued that to study spin-2 gravity at nonlinear level one has to go beyond
the flat Minkowski spacetime and consider spacetime metric gµν as a dynamical degree of freedom
to be determined by the distribution of matter. In the next few lectures we develop some useful
mathematical tools that are relevant to this problem [14].
Manifolds. A manifold is a topological space covered by a collection of open subsets $O_\alpha$, each of which
is equipped with a one-to-one and onto map $\psi_\alpha : O_\alpha \to U_\alpha \subset \mathbb{R}^d$. When $O_\alpha \cap O_\beta \neq \emptyset$, the map
$\psi_\alpha \circ \psi_\beta^{-1} : \psi_\beta(O_\alpha \cap O_\beta) \to \mathbb{R}^d$ is a smooth ($C^\infty$) function.
If this is too abstract to digest, think of a manifold as the surface of an apple. There are little
ants who are doing geometry on this surface. They draw lines and circles and they measure angles.
Or imagine ancient civilizations trying to produce a map of their territories. Each map (or chart)
covers a finite patch of the Earth which includes one state and parts of the neighboring states just
to make clear where the borders are. No single one of these charts covers the entire Earth but their
collection does (or could). Moreover, when the charts of two neighboring states are compared they
should agree on common regions.
As a more concrete example consider a two dimensional sphere S 2 , embedded in three dimen-
sions:
X 2 + Y 2 + Z 2 = 1. (12.1)
Every point on $S^2$ is uniquely identified with a point in $\mathbb{R}^3$ satisfying the above condition. We can
cover $S^2$ with six charts $f_{i\pm}$, each of which covers one hemisphere:
Note that we followed the common practice of denoting embedding coordinates with upper-case
symbols and the internal coordinates with lower-case ones. The other five $f_{i\pm}$ are defined in an
analogous way to $f_{z+}$. One can easily check that $f_{i\pm} \circ f_{j\pm}^{-1}$ (where $\circ$ means composition of two
maps) are smooth functions when their domains overlap (check).
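The hemisphere charts and one of their transition maps can be sketched concretely. The code assumes that $f_{z+}$ sends a point with Z > 0 to $(x^1, x^2) = (X, Y)$, and that $f_{x+}$ sends a point with X > 0 to (Y, Z). The second choice of coordinate pair and all function names are assumptions made for illustration.

```python
import math

def f_z_plus(P):
    """Chart on the hemisphere Z > 0: (X, Y, Z) -> (x1, x2) = (X, Y)."""
    X, Y, Z = P
    if Z <= 0:
        raise ValueError("point is outside the Z > 0 hemisphere")
    return (X, Y)

def f_z_plus_inv(x):
    """Inverse chart: lift (x1, x2) in the open unit disk back to the sphere."""
    x1, x2 = x
    return (x1, x2, math.sqrt(1.0 - x1 * x1 - x2 * x2))

def f_x_plus(P):
    """Chart on the hemisphere X > 0: (X, Y, Z) -> (Y, Z) (assumed coordinate pair)."""
    X, Y, Z = P
    if X <= 0:
        raise ValueError("point is outside the X > 0 hemisphere")
    return (Y, Z)

def transition(x):
    """f_x+ composed with the inverse of f_z+: (x1, x2) -> (x2, sqrt(1 - x1^2 - x2^2))."""
    return f_x_plus(f_z_plus_inv(x))

y = transition((0.3, 0.4))
print(y)
```

On the overlap (x1 > 0 inside the unit disk) the transition map involves only a square root of a strictly positive quantity, which is why it is smooth there.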
[14] These are very standard topics, which can be found in many textbooks. Depending on your depth of interest in
math and your patience you can choose different paths. Wald is very mathematical and concise. Carroll is a more
pedagogical version of Wald, expanding it by a factor of almost two. Weinberg and Landau-Lifshitz are very minimal
on geometrical aspects. LL is as usual very compact but it is sufficient (Zel’dovich learned GR from LL). Nevertheless,
if you choose to read a more mathematical book be aware that sometimes they are too orthodox in avoiding coordinate
systems. Don’t be afraid of µ, ν, · · · , it is often extremely efficient and illuminating to use coordinate systems and
tensor components. Great physicists like Weinberg and Landau used them all the time so there is no reason to be
ashamed of doing so.
It is not possible to cover the entire $S^2$ with just one chart, but we can cover it with two
charts using Stereography. One chart is obtained by projecting every point $p \in S^2$ on the
plane at Z = −1 by extrapolating the line that connects p to (0, 0, 1). This chart doesn’t cover
$(0, 0, 1) \in S^2$. The other chart can be chosen to be projection on the Z = 1 plane from the bottom
point (0, 0, −1).
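The first stereographic chart described above can be written out explicitly. The sketch extends the line from (0, 0, 1) through p until it hits the plane Z = −1, and inverts the construction. The function names are illustrative.

```python
import math

def stereo_north(P):
    """Project P = (X, Y, Z) on the unit sphere from (0, 0, 1) onto the plane Z = -1.

    Points on the line are (0, 0, 1) + s * (X, Y, Z - 1); it reaches Z = -1 at s = 2 / (1 - Z).
    """
    X, Y, Z = P
    if Z >= 1.0:
        raise ValueError("the projection point (0, 0, 1) is not covered by this chart")
    s = 2.0 / (1.0 - Z)
    return (s * X, s * Y)

def stereo_north_inv(x):
    """Map chart coordinates (x1, x2) on the plane Z = -1 back to the sphere.

    Points on the line are (t x1, t x2, 1 - 2 t); unit norm requires t = 4 / (x1^2 + x2^2 + 4).
    """
    x1, x2 = x
    t = 4.0 / (x1 * x1 + x2 * x2 + 4.0)
    return (t * x1, t * x2, 1.0 - 2.0 * t)

p = (0.6, 0.0, 0.8)
x = stereo_north(p)
p_back = stereo_north_inv(x)
print(x, p_back)
```

Points near (0, 0, 1) are sent arbitrarily far out on the plane, which is the concrete reason a second chart is needed.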
Vectors. In flat space, like $\mathbb{R}^3$, once we choose an origin, any two points a and b define a
finite displacement vector
$$ \Delta x_{ab} = x_a - x_b \tag{12.3} $$
which corresponds to another point in $\mathbb{R}^3$. In general, this is impossible in curved space. For
instance, there is no sense in which two points on a sphere define a third point. However, the
notion of Infinitesimal Displacement continues to exist. So one defines a Tangent Space Vp
as a d-dimensional vector space at every point p ∈ M . Think of tangent planes on the surface of an
apple. At different points these planes are not parallel to one another. Hence, there is no simple
way of comparing vectors that belong to the tangent spaces of two different points, $V_p$ and $V_{p'}$.
Vectors can be defined as a list of d real numbers and a transformation law under coordinate
transformations. Imagine one of the ancient civilizations had systematically misaligned compasses.
Their description of the direction of rivers and passes near the border would differ from their
neighbors’, but at every point in the overlap of two charts there is a single rotation that relates all
directions.
However, there is a more abstract way of defining a vector v ∈ Vp . It is a linear map from the
set F to R that satisfies the Leibniz rule, where F is the set of all smooth functions f : M → R.
Namely, for f, g ∈ F and a, b ∈ R
So v(f ) = 0.
2. Show that partial derivatives form a basis for vectors. In a coordinate system, the partial derivative
at point p (with x(p) = a) with respect to the µ coordinate is defined as
$$ X_\mu : F \to \mathbb{R}, \qquad \forall f \in F \;\Rightarrow\; X_\mu(f) = \partial_\mu f(p) = \frac{\partial f}{\partial x^\mu}(p), \tag{12.7} $$
where
Hµ (p) = ∂µ f (p). (12.9)
where we used v(f(a)) = 0 because f(a) is a constant, and that $(x^\mu - a^\mu)|_p = 0$. Since f is
arbitrary, we conclude that
$$ v = \sum_\mu v^\mu \frac{\partial}{\partial x^\mu}, \qquad v^\mu \equiv v(x^\mu - a^\mu). \tag{12.11} $$
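where $\tilde x^\mu(x)$ is a smooth invertible map, follows from the transformation rules for partial derivatives:
$$ \frac{\partial}{\partial x^\mu} = \frac{\partial \tilde x^\nu}{\partial x^\mu}\frac{\partial}{\partial\tilde x^\nu}. \tag{12.13} $$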
Curves. A curve is a function from real numbers (or an interval I ⊂ R) to the manifold
C : R → M . At every point p that a curve passes through, it defines a Tangent Vector $T_p \in V_p$.
Formally, it can be defined by its action on smooth functions: For any smooth function on the
manifold f ∈ F, we can form the composition f ◦ C : R → R, a real function of t ∈ R (definition:
(f ◦ g)(t) ≡ f (g(t))). Then
$$ T_p(f) = \left.\frac{d}{dt}(f\circ C)(t)\right|_{t=t_p}, \qquad C(t_p) = p. \tag{12.14} $$
Less formally and using a coordinate system, a curve is a set of d functions $x^\mu(t)$. At any point
along the curve
$$ T_p^\mu = \frac{dx^\mu}{dt}(t_p). \tag{12.15} $$
3. Let’s go back to the $S^2$ example. We can consider a Longitude, which is much easier to
characterize in spherical coordinates (recall that $X^2 + Y^2 + Z^2 = 1$)
$$ \theta = \sin^{-1}\sqrt{X^2 + Y^2}, \qquad \phi = \tan^{-1}(Y/X). \tag{12.16} $$
$$ T^\mu(x(t)) = \frac{dx^\mu}{dt} = (\cos\phi\cos t,\ \sin\phi\cos t) = (\cos\phi\cos\theta,\ \sin\phi\cos\theta). \tag{12.18} $$
We could have considered a curve that runs twice as fast, $L'_\phi : [0, \pi/2] \to S^2$:
$$ x^1(t) = X(t) = \cos\phi\sin 2t, \qquad x^2(t) = Y(t) = \sin\phi\sin 2t. \tag{12.19} $$
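A short numerical sketch makes the dependence of the tangent vector on parametrization explicit. It assumes the longitude is given in the $f_{z+}$ chart by $x^1(t) = \cos\phi\sin t$, $x^2(t) = \sin\phi\sin t$ (so that t = θ, consistent with (12.18)), and compares it with the faster curve (12.19) at the same point. The value of φ and the function names are illustrative.

```python
import math

PHI = 0.7   # hypothetical fixed longitude angle

def longitude(t, phi=PHI):
    """Longitude in the f_z+ chart with t = theta (assumed form, consistent with (12.18))."""
    return (math.cos(phi) * math.sin(t), math.sin(phi) * math.sin(t))

def fast_longitude(t, phi=PHI):
    """The same longitude traversed twice as fast, as in (12.19)."""
    return (math.cos(phi) * math.sin(2 * t), math.sin(phi) * math.sin(2 * t))

def tangent(curve, t, eps=1e-6):
    """Components dx^mu/dt of the tangent vector, by central differences."""
    p, m = curve(t + eps), curve(t - eps)
    return tuple((a - b) / (2 * eps) for a, b in zip(p, m))

theta = 0.5
expected = (math.cos(PHI) * math.cos(theta), math.sin(PHI) * math.cos(theta))
T_slow = tangent(longitude, theta)            # the point theta is reached at t = theta
T_fast = tangent(fast_longitude, theta / 2)   # the same point is reached at t = theta / 2
print(T_slow, T_fast, expected)
```

At the same point of the sphere the faster curve has a tangent vector twice as large, even though both curves trace the same set of points.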
Vector Fields. A vector field (or a tangent field) is an assignment of a tangent vector vp ∈ Vp
to every point p ∈ M . Acting on a smooth function from the manifold to real numbers, f : M → R,
a vector field produces another function v(f ) : M → R. The field is called smooth, if for any f ,
v(f ) is a smooth function. A vector field generates a set of Integral Curves on the manifold by
setting
T (p) = vp , (12.21)
4. As an example consider the field that generates Latitudes on the sphere, in the $f_{z+}$ chart
$$ S^\mu_{\theta,\phi} = (-\sin\theta\sin\phi,\ \sin\theta\cos\phi). \tag{12.22} $$
The integral curves are lines of constant θ, i.e. latitudes $l_\theta : [0, 2\pi) \to S^2$ defined as
Commutator, Lie Derivative. Given two vector fields v, w one can define another field
This is called the commutator of v and w, or the Lie derivative of w with respect to v, Lv w. In
components
$$ [v, w]^\mu = v^\nu\partial_\nu w^\mu - w^\nu\partial_\nu v^\mu. \tag{12.25} $$
A set of d mutually commuting and linearly independent vector fields forms a coordinate basis. Their integral
curves can be used to chart the manifold.
5. Take vector fields T, S corresponding to the tangent vectors of longitudes and latitudes with
uniform parametrization. We have
$$ T = T^\mu\partial_\mu = \partial_\theta, \qquad S = S^\mu\partial_\mu = \partial_\phi, \tag{12.26} $$
and as a result
$$ T^\nu\partial_\nu S^\mu = S^\nu\partial_\nu T^\mu = \cos\theta\,(-\sin\phi,\ \cos\phi). \tag{12.27} $$
and then move along S for curve parameter s, we end up at the same point as if we first
moved along S for s and then along T for t. Hence, we can unambiguously label points on
the manifold (at least in the vicinity of p) by the pair (t, s).
For an example of non-commuting fields consider again longitudes and latitudes, but this
time parameterize latitudes non-uniformly. For instance:
$$ x^1(s) = X(s) = \sin\theta\cos\big(s(1+\cos\theta)\big), \qquad x^2(s) = Y(s) = \sin\theta\sin\big(s(1+\cos\theta)\big). \tag{12.29} $$
It is easy to see that the assignment of (t, s) would be path dependent and hence T, S′ do not
form a coordinate basis.
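The commuting and non-commuting cases of this problem can be checked numerically with the component formula (12.25). In the sketch below the fields are written in the $f_{z+}$ chart coordinates $(x^1, x^2) = (X, Y)$, using sin θ = r and cos θ = √(1 − r²) on the upper hemisphere. The non-uniform field is taken to be S′ = (1 + cos θ) S, as read off from (12.29). The sample point and function names are illustrative.

```python
import math

def T(x):
    """Longitude tangent d/dtheta in the f_z+ chart: cos(theta) * (cos phi, sin phi)."""
    x1, x2 = x
    r = math.hypot(x1, x2)          # r = sin(theta)
    c = math.sqrt(1.0 - r * r)      # cos(theta) on the upper hemisphere
    return (c * x1 / r, c * x2 / r)

def S(x):
    """Uniformly parametrized latitude tangent d/dphi: (-x2, x1)."""
    x1, x2 = x
    return (-x2, x1)

def S_prime(x):
    """Non-uniform parametrization (12.29): S' = (1 + cos theta) * S."""
    x1, x2 = x
    c = math.sqrt(1.0 - x1 * x1 - x2 * x2)
    return (-(1.0 + c) * x2, (1.0 + c) * x1)

def commutator(v, w, x, eps=1e-6):
    """[v, w]^mu = v^nu d_nu w^mu - w^nu d_nu v^mu, with central-difference derivatives."""
    def directional(u, f):
        # u^nu partial_nu f^mu evaluated at x
        ux = u(x)
        out = [0.0, 0.0]
        for nu in range(2):
            xp, xm = list(x), list(x)
            xp[nu] += eps
            xm[nu] -= eps
            fp, fm = f(xp), f(xm)
            for mu in range(2):
                out[mu] += ux[nu] * (fp[mu] - fm[mu]) / (2 * eps)
        return out
    a, b = directional(v, w), directional(w, v)
    return (a[0] - b[0], a[1] - b[1])

point = (0.3, 0.4)                   # sin(theta) = 0.5 at this point
c_TS = commutator(T, S, point)
c_TSp = commutator(T, S_prime, point)
print(c_TS, c_TSp)
```

The first commutator vanishes up to numerical error, while the second equals −sin θ · S, so the pair (t, s) cannot label points consistently when the latitudes are parametrized non-uniformly.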
Tensors. Once the tangent space is defined the construction of tensors is an easy task. Of course
as before we could have defined them in terms of their transformation properties. Alternatively,
we can define the Cotangent Space Vp∗ at any point p ∈ M as the collection of Covectors. A
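covector $\omega \in V_p^*$ is a linear map from the tangent space at p to real numbers: $\omega : V_p \to \mathbb{R}$. We can
choose a basis for them:
$$ dx^\mu(\partial_\nu) = \delta^\mu_\nu \tag{12.31} $$
and expand
$$ \omega = \omega_\mu\, dx^\mu \;\Longrightarrow\; \omega(v) = \omega_\mu v^\mu. \tag{12.32} $$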
The standard transformation property of covariant vectors for ωµ immediately follows from the
transformation property of vector components.
∗
A tensor of rank (k, q) is then defined as a multilinear map T : |V × ·{z
· · × V} × V · · × V }∗ →
| × ·{z
q times k times
R, were multilinear means separately linear in every argument. Tensors can be expanded in the
coordinate basis ∂µ , dxµ and the components satisfy the well-known tensor transformation rules.
For instance for a rank (2, 1) tensor
$T = T^{\mu\nu}{}_{\sigma}\, \partial_\mu\, \partial_\nu\, dx^\sigma .$ (12.33)
The upper indices that are summed together with ∂µ basis are called Contravariant, and those
that sum with dxµ are called Covariant. Tensor Fields are defined analogously to vector fields.
Metric Tensor. In order to talk about almost any local geometric concept on manifolds,
such as straight lines, angles, area one needs to have a notion of distance. That is, we need to
associate a length to an infinitesimal displacement (a vector) and an angle between two different
displacements. This is a symmetric map from two vectors in Vp (for all p ∈ M ) to real numbers,
which is a symmetric covariant rank-2 tensor field, called the metric g. It has to be non-singular,
i.e. if g(v, w) = 0 for all w ∈ Vp then v = 0. Therefore, the metric is invertible. The components
of the inverse metric $g^{-1}$ are denoted by the same symbol g:
$g^{\mu\nu} g_{\nu\sigma} = \delta^{\mu}_{\sigma} .$ (12.34)
Metric gives a natural one-to-one map between elements of tangent space and cotangent space: For
every v ∈ Vp fill one of the slots of the metric with v, i.e. g(v, .). This is an element of Vp∗ , with
components given by the familiar formula
$v_\mu = g_{\mu\nu} v^\nu .$ (12.35)
Embedded Manifolds. Let’s have a closer look at embedding of manifolds in higher dimensional
spaces. In particular, let’s focus on embedding in flat Euclidean (or Minkowski) space of
dimension d + 1. Embedding is equivalent to a condition
f (X A ) = 0. (12.36)
for some functions F A . Conversely, given the set of F A one can derive the embedding function.
One solution is
$f(X^A) = X^1 - F^1\big(x(\{X^2, X^3, \cdots, X^{d+1}\})\big)$ (12.38)
where we used the fact that in a non-degenerate situation we can use d of d + 1 equations (12.37) to
solve for d xµ in terms of {X 2 , · · · , X d+1 }. Unless it causes ambiguity, I will denote the functions
F A (x) as X A (x) in the future.
The tangent spaces of M at different points are just d dimensional flat sections of the embedding
space (they are called Codimension-1 hypersurfaces). Therefore, any vector v in the tangent space
of M is naturally associated to a vector V in the embedding space. In coordinates:
$V^A = \dfrac{\partial X^A(x)}{\partial x^\mu}\, v^\mu .$ (12.39)
On the other hand, any covector in the embedding space naturally gives rise to a covector on M :
$\omega_\mu = \Omega_A\, \dfrac{\partial X^A}{\partial x^\mu} .$ (12.40)
where V, W are the same as v, w viewed as vectors in the embedding space. In any coordinate
system xµ the components of the induced metric can be easily related to the components of the
embedding space metric in coordinates X A in the following way. One considers a generic infinitesi-
mal displacement dxµ on the manifold. This results in a displacement dX A as a function of {dxµ }.
By definition of the induced metric
Of course, this is nothing but a special example of the above general statement about the mapping
of cotangent spaces.
$dZ = -\dfrac{x^1 dx^1 + x^2 dx^2}{\sqrt{1 - (x^1)^2 - (x^2)^2}}$ (12.44)
and
$ds^2_{S^2} = \dfrac{1}{1 - (x^1)^2 - (x^2)^2} \left[ (1 - (x^2)^2)(dx^1)^2 + (1 - (x^1)^2)(dx^2)^2 + 2 x^1 x^2\, dx^1 dx^2 \right] .$ (12.45)
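The sphere metric (12.45) can be cross-checked numerically. The sketch below assumes the upper hemisphere of the unit sphere is embedded as $Z = \sqrt{1 - (x^1)^2 - (x^2)^2}$ and takes the induced metric to be $\delta_{AB}\, \partial_\mu X^A\, \partial_\nu X^B$, with the Jacobian estimated by central differences. The function names and the sample point are illustrative.

```python
import math


def embedding(x):
    """Upper hemisphere of the unit sphere: (X, Y, Z) as functions of (x1, x2)."""
    x1, x2 = x
    return [x1, x2, math.sqrt(1.0 - x1 * x1 - x2 * x2)]


def induced_metric(embed, x, h=1e-6):
    """g_mu_nu = sum_A dX^A/dx^mu dX^A/dx^nu (flat Euclidean embedding space)."""
    d = len(x)
    jac = []  # jac[mu][A] = dX^A/dx^mu
    for mu in range(d):
        xp = list(x)
        xm = list(x)
        xp[mu] += h
        xm[mu] -= h
        Xp, Xm = embed(xp), embed(xm)
        jac.append([(a - b) / (2 * h) for a, b in zip(Xp, Xm)])
    n = len(jac[0])
    return [[sum(jac[mu][A] * jac[nu][A] for A in range(n))
             for nu in range(d)] for mu in range(d)]


def metric_12_45(x):
    """Components read off from (12.45); the cross term 2 x1 x2 dx1 dx2 gives g_12 = x1 x2 / (1 - r^2)."""
    x1, x2 = x
    f = 1.0 / (1.0 - x1 * x1 - x2 * x2)
    return [[f * (1 - x2 * x2), f * x1 * x2],
            [f * x1 * x2, f * (1 - x1 * x1)]]


point = [0.3, 0.4]
g_numeric = induced_metric(embedding, point)
g_closed = metric_12_45(point)
```

The two matrices should agree up to finite-difference error at any point inside the unit disk.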
13 Curvature
In the last lecture, we introduced manifolds and metric. Equipped with the metric, we study
the local structure of manifolds.
Parallel Transport. Imagine we want to compare two vectors belonging, respectively, to the
tangent spaces of two different points: v ∈ Vp , w ∈ Vq . Since Vp ≠ Vq , such a comparison requires
associating an image ṽ ∈ Vq to v. For instance, we could fix a chart and take
$\tilde{v} = v^\mu \left. \dfrac{\partial}{\partial x^\mu} \right|_q .$ (13.1)
This is not very satisfactory, since depending on the choice of the chart we get different answers for
ṽ. We would like to find a more intrinsic way of doing so: The little ants on the apple can follow
the following procedure. First, draw a curve that connects p and q. For each curve, one can find an
image of v in Vq by following a rule that determines how to transport vectors infinitesimally along
the curve. Whatever the rule is, it should be linear in v, so that it fixes the failure of v µ (x) to be a
vector at x + dx, and linear in the displacement dx. The most general rule for such an infinitesimal
transformation in a fixed chart is therefore
$\tilde{v}^\mu = v^\mu - \Gamma^\mu_{\alpha\beta}\, dx^\alpha v^\beta$ (13.2)
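for some matrix $\Gamma^\mu_{\alpha\beta}$. I don’t call it a tensor because $v^\mu$ is not a vector at $x^\mu_p + dx^\mu$, but ṽ is, so the
second term must compensate. $\Gamma^\mu_{\alpha\beta}$ is called an Affine Connection. Note that it is not unique.
If there is a $\Gamma^\mu_{\alpha\beta}$ such that the above sum is a vector, then it is so for any
$\Gamma'{}^{\mu}{}_{\alpha\beta} = \Gamma^\mu_{\alpha\beta} + C^\mu_{\alpha\beta}$ (13.3)
where $C^\mu_{\alpha\beta}$ is a tensor. Conversely, the difference of any two affine connections $\Gamma^\mu_{\alpha\beta}$ and $\Gamma'{}^{\mu}{}_{\alpha\beta}$ is a
tensor. This is because $\tilde{v} - \tilde{v}'$ is a vector and so is
$(\Gamma^\mu_{\alpha\beta} - \Gamma'{}^{\mu}{}_{\alpha\beta})\, dx^\alpha v^\beta .$ (13.4)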
Since there is already a factor of dx, the difference between transformation laws of vectors at x and
x + dx gives an $O(dx^2)$ contribution, which must be discarded.
We can also transport 1-forms (covectors) or higher rank tensors belonging to the point p.
Recall that a 1-form ω is a map from vectors to real numbers. Thus we can impose a further
requirement on the rules of transportation, namely
$\tilde{\omega}(\tilde{v}) = \omega(v) .$ (13.5)
This fixes
$\tilde{\omega}_\mu = \omega_\mu + \Gamma^\beta_{\alpha\mu}\, dx^\alpha \omega_\beta ,$ (13.6)
Therefore, given a transportation rule, we can start from an arbitrary vector (tensor) at a given
point p and an arbitrary curve with tangent T µ (t) = dxµ /dt that passes through this point and
associate a vector (tensor) at all points along the curve. This is called parallel transport. Suppose
C(t = 0) = p then the parallel transport v(t) of a vector v ∈ Vp is the solution to
$\dfrac{d}{dt} v^\mu(t) = -\Gamma^\mu_{\alpha\beta}\, T^\alpha(t)\, v^\beta(t) , \qquad v(0) = v .$ (13.7)
Unlike the ordinary derivative, the covariant derivative of a vector is a rank (1, 1) tensor. Covariant
derivative of other tensors is defined in a similar fashion. The compatibility of parallel transport
with the action of vectors, covectors, and tensors (equation (13.6)) ensures that covariant derivative
satisfies the Leibniz rule.
In terms of the covariant derivative the parallel transportation of a vector v along a curve (i.e.
v(t) with v(0) = v) with tangent T µ = dxµ /dt can be formulated as the condition
T µ ∇µ v ν = 0. (13.9)
$T^\mu_{\alpha\beta} \equiv \Gamma^\mu_{[\alpha\beta]} = \Gamma^\mu_{\alpha\beta} - \Gamma^\mu_{\beta\alpha} .$ (13.10)
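Torsion is a tensor field. This follows from the fact that for any two vector fields
$v^\mu \nabla_\mu w^\nu - w^\mu \nabla_\mu v^\nu = [v, w]^\nu + T^\nu_{\alpha\beta}\, v^\alpha w^\beta .$ (13.11)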
The l.h.s. as well as the first term on the r.h.s. (Lie derivative defined in the previous lecture) are
vectors, so the last term should also be a vector. Since v, w are arbitrary, $T^\nu_{\alpha\beta}$ must be a tensor.
It is natural (though not necessary) to set this tensor to zero. This is equivalent to requiring the
covariant derivatives commute on scalar fields:
∇µ ∇ν f = ∇ν ∇µ f. (13.12)
Metric Compatibility. As mentioned above there is no unique choice for the affine connection.
We can always add a (1, 2) tensor field to it. However, there is a natural choice when there is a
metric. One asks for the inner product of the vectors to be preserved under parallel transport:
The Leibniz rule and the definition of parallel transport (13.9) imply that this equation is satisfied
for all T, v, w if and only if
∇µ gαβ = 0. (13.14)
Assuming zero torsion this can be solved for the connection coefficients:
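$\Gamma^\mu_{\alpha\beta} = \dfrac{g^{\mu\sigma}}{2} \left( \partial_\alpha g_{\sigma\beta} + \partial_\beta g_{\sigma\alpha} - \partial_\sigma g_{\alpha\beta} \right) .$ (13.15)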
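Formula (13.15) is straightforward to evaluate numerically. The sketch below is a minimal illustration for two-dimensional metrics only (the inverse is hard-coded for 2×2 matrices); the metric derivatives are central differences. The round unit sphere in coordinates (θ, φ), with metric dθ² + sin²θ dφ², is an assumed test case, for which the nonzero coefficients should come out as Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cot θ.

```python
import math


def sphere_metric(x):
    """Round unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Gamma[mu][a][b] from (13.15) for a 2d metric."""
    d = len(x)
    dg = []  # dg[s][a][b] = d_s g_ab, by central differences
    for s in range(d):
        xp = list(x)
        xm = list(x)
        xp[s] += h
        xm[s] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(d)]
                   for a in range(d)])
    ginv = inverse_2x2(metric(x))
    return [[[0.5 * sum(ginv[mu][s] * (dg[a][s][b] + dg[b][s][a] - dg[s][a][b])
                        for s in range(d))
              for b in range(d)] for a in range(d)] for mu in range(d)]


theta0 = 1.0
gamma = christoffel(sphere_metric, [theta0, 0.5])
gamma_theta_phiphi = gamma[0][1][1]
gamma_phi_thetaphi = gamma[1][0][1]
```

Because (13.15) is symmetric in its lower indices, `gamma[1][0][1]` and `gamma[1][1][0]` agree, as expected for a torsion-free connection.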
Geodesics. The generalization of the notion of a straight line to curved manifolds is a curve
whose tangent is parallel transported along itself and it is called an affinely parametrized geodesic.
This means that
$\dfrac{d^2 x^\mu}{d\tau^2} = -\Gamma^\mu_{\alpha\beta}\, \dfrac{dx^\alpha}{d\tau} \dfrac{dx^\beta}{d\tau} .$ (13.16)
This is the familiar equation of motion for a free particle we encountered before, parametrized
in terms of the proper time. Note however that in Lorentzian manifolds there are null geodesics
which satisfy the same equation but the affine parameter τ no longer measures the invariant distance
(which is zero). For a different parametrization of the curve σ(τ ), we get the condition that
$T^\mu \nabla_\mu T^\nu = c\, T^\nu , \qquad c = -\left( \dfrac{d\tau}{d\sigma} \right)^2 \dfrac{d^2 \sigma}{d\tau^2} .$ (13.17)
This is the equation that we obtained by extremizing the worldline action Spp , with an arbitrary
parametrization σ. Therefore, geodesics are curves with (locally) extremal length between two points
on the manifold. Null, spacelike, or timelike, it is always possible to choose affine parametrization
for geodesics.
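A simple numerical consequence of affine parametrization is that the squared speed $g_{\mu\nu} T^\mu T^\nu$ stays constant along the solution of (13.16), since the tangent is parallel transported by a metric-compatible connection. The sketch below integrates (13.16) on the unit sphere with a hand-written fourth-order Runge-Kutta step. The connection coefficients in coordinates (θ, φ) (Γ^θ_{φφ} = −sin θ cos θ, Γ^φ_{θφ} = cot θ) are standard values assumed here, and the initial data and step size are arbitrary illustrative choices.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of (13.16) on the unit sphere; state = (theta, phi, dtheta, dphi)."""
    theta, phi, dtheta, dphi = state
    # theta'' = -Gamma^theta_{phi phi} phi'^2, phi'' = -2 Gamma^phi_{theta phi} theta' phi'
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return [dtheta, dphi, ddtheta, ddphi]


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def speed_squared(state):
    """g(T, T) with g = diag(1, sin^2 theta)."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2


state = [1.0, 0.0, 0.3, 0.7]
initial_speed_sq = speed_squared(state)
for _ in range(2000):
    state = rk4_step(geodesic_rhs, state, 1e-3)
final_speed_sq = speed_squared(state)
```

With a non-affine parameter the right-hand side of (13.17) is nonzero and this quantity would drift instead.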
1. ∗ Show that the geodesic equation in the affine parametrization can alternatively be obtained
from the action
$S = \displaystyle\int d\tau\, g_{\mu\nu}\, \dfrac{dx^\mu}{d\tau} \dfrac{dx^\nu}{d\tau} .$ (13.18)
The Lagrangian is $L_{pp}^2$. Why is it possible to replace L → f (L) (where f is a smooth
function)?
2. Find geodesics on a 2d sphere.
One can show that this only depends on the value of ω at p. To see this multiply ω by a smooth
function f . All derivatives of f cancel in the commutator and we get f [∇µ , ∇ν ]ωα . So the commu-
tator is a map from a 1-form to a (0, 3) tensor (since there are three lower indices and the object
is clearly a tensor). This is a rank (1, 3) tensor called the Riemann tensor
There is a direct connection between the Riemann tensor and the path dependence of parallel
transport. Recall the original question we started with: How can one map vectors in Vp to those
in Vq , where p, q are finitely separated points? We introduced parallel transport to allow little ants
move vectors from p to q. However, the result can definitely depend on the path. Equivalently,
parallel transportation along a closed loop might not bring the vector back to itself.
The Riemann tensor gives a local measure of the path dependence of parallel transport. To see
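this consider two infinitesimal displacements $dx_1$, $dx_2$, and apply our parallel transport rule in two
different orders: first $dx_1$ followed by $dx_2$ and then $dx_2$ followed by $dx_1$. The difference between
the final vectors is
$\delta v^\mu = \left[ \partial_\alpha \Gamma^\mu_{\nu\beta} - \partial_\beta \Gamma^\mu_{\nu\alpha} + \Gamma^\mu_{\alpha\lambda} \Gamma^\lambda_{\nu\beta} - \Gamma^\mu_{\beta\lambda} \Gamma^\lambda_{\nu\alpha} \right] dx_1^\alpha\, dx_2^\beta\, v^\nu .$ (13.21)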
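As an illustration, the bracket in (13.21) can be evaluated for the unit sphere. The sketch below assumes the standard connection coefficients in coordinates (θ, φ) (indices 0 = θ, 1 = φ) and differentiates them by central differences; the function names are illustrative. For µ = θ, ν = φ, α = θ, β = φ the bracket should equal sin²θ, which is the determinant of the metric diag(1, sin²θ), and it should flip sign under α ↔ β.

```python
import math


def gamma(x):
    """Gamma[mu][a][b] for the unit sphere at x = (theta, phi)."""
    theta, _ = x
    s, c = math.sin(theta), math.cos(theta)
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -s * c              # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = c / s  # Gamma^phi_{theta phi} = Gamma^phi_{phi theta}
    return G


def bracket(x, mu, nu, a, b, h=1e-5):
    """The square bracket of (13.21) with free indices (mu, nu, a, b)."""
    def d(direction, i, j, k):
        # Central-difference derivative of Gamma^i_{jk} along the given coordinate.
        xp = list(x)
        xm = list(x)
        xp[direction] += h
        xm[direction] -= h
        return (gamma(xp)[i][j][k] - gamma(xm)[i][j][k]) / (2 * h)

    G = gamma(x)
    quadratic = sum(G[mu][a][l] * G[l][nu][b] - G[mu][b][l] * G[l][nu][a]
                    for l in range(2))
    return d(a, mu, nu, b) - d(b, mu, nu, a) + quadratic


theta0 = 0.9
value = bracket([theta0, 0.2], 0, 1, 0, 1)
swapped = bracket([theta0, 0.2], 0, 1, 1, 0)
```

The antisymmetry under exchange of the two displacements reflects the fact that reversing the order of transport reverses the sign of the mismatch.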
The tensor in square brackets is the same as $-R_{\alpha\beta\nu}{}^{\mu}$ defined above. Alternatively, we could use an
arbitrary 1-form field ω and ask how ω(v) changes. This is
Another property of the Riemann tensor follows from the fact that exact forms are closed, $d^2 \omega = 0$. First
note that
$(d\omega)_{\mu\nu} = \nabla_{[\mu} \omega_{\nu]} ,$ (13.24)
where square brackets on indices mean antisymmetrization, e.g. $A_{[\mu\nu]} = A_{\mu\nu} - A_{\nu\mu}$ for any matrix
$A_{\mu\nu}$. Taking another derivative and antisymmetrizing, we get
It follows that
$R_{[\mu\nu\rho]}{}^{\sigma} = 0 .$ (13.26)
5. ∗ Prove that in two dimensions $R_{\mu\nu} = \frac{1}{2} R\, g_{\mu\nu}$. The Ricci tensor and Ricci scalar are defined
as
$R = g^{\mu\nu} R_{\mu\nu} , \qquad R_{\mu\nu} = R_{\mu\rho\nu}{}^{\rho} .$ (13.28)
6. Calculate the Riemann tensor on a sphere in terms of the metric. Hint: use symmetries.
Solution: We can use the fact that the sphere is perfectly symmetric (Maximally Sym-
metric in technical words), so we can calculate at one point where the metric is diag(1, 1)
(e.g. the north pole in fz+ coordinates). We can relate the components of the Riemann tensor
to the metric at this point, to obtain a covariant expression that is valid everywhere. Based
on symmetries, the only nonzero component of Riemann is R1212 and those related to this by
symmetries of Riemann. And it must look like
$R_{\mu\nu\rho\sigma} = c\, (g_{\mu\rho} g_{\nu\sigma} - g_{\mu\sigma} g_{\nu\rho}) ,$ (13.29)
where c is the value of R1212 in the coordinates where g = diag(1, 1). The meaning of this
element is that if we parallel transport a vector around a small square with sides dx1 , dx2 ,
the vector rotates by an angle
δθ = R1212 dx1 dx2 . (13.30)
The total rotation of the vector after parallel transport along a finite loop can be obtained by
dividing it into infinitesimal squares and integrating over the area. However, the symmetry of
sphere implies that the rotation angle must always be the same for a square of sides dx1 , dx2 ,
regardless of where it is. Hence, we can determine R1212 by using the result of the previous
exercise. There was a rotation of π/2 when the vector was parallel transported along a loop
that surrounded 1/8 of the sphere, which has area π/2. Therefore, c = R1212 = 1.
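The relation between rotation angle and enclosed area can be tested on a different loop. The sketch below parallel transports a vector once around the latitude θ = θ0 of the unit sphere by integrating (13.7) with the curve parameter taken to be φ, so that T = (0, 1). The connection coefficients in (θ, φ) coordinates are the standard ones, assumed here, and the Runge-Kutta integrator and step count are illustrative. Choosing cos θ0 = 3/4 makes the polar cap area 2π(1 − cos θ0) equal to π/2, the same area as the octant above, so the transported vector should again come back rotated by a right angle.

```python
import math


def transport_rhs(theta0, v):
    """dv/dphi from (13.7) along the latitude theta = theta0 with tangent T = (0, 1)."""
    v_theta, v_phi = v
    s, c = math.sin(theta0), math.cos(theta0)
    # -Gamma^theta_{phi phi} v^phi  and  -Gamma^phi_{phi theta} v^theta
    return [s * c * v_phi, -(c / s) * v_theta]


def transport_around_latitude(theta0, v, steps=4000):
    """Classical RK4 integration of parallel transport for phi from 0 to 2*pi."""
    dphi = 2.0 * math.pi / steps
    f = lambda u: transport_rhs(theta0, u)
    for _ in range(steps):
        k1 = f(v)
        k2 = f([a + 0.5 * dphi * k for a, k in zip(v, k1)])
        k3 = f([a + 0.5 * dphi * k for a, k in zip(v, k2)])
        k4 = f([a + dphi * k for a, k in zip(v, k3)])
        v = [a + dphi * (p + 2 * q + 2 * r + w) / 6.0
             for a, p, q, r, w in zip(v, k1, k2, k3, k4)]
    return v


def inner(theta0, u, v):
    """Inner product with g = diag(1, sin^2 theta0)."""
    return u[0] * v[0] + math.sin(theta0) ** 2 * u[1] * v[1]


theta0 = math.acos(0.75)
area = 2.0 * math.pi * (1.0 - math.cos(theta0))  # polar cap area, equals pi/2
v0 = [1.0, 0.0]
v1 = transport_around_latitude(theta0, v0)
cos_rotation = inner(theta0, v0, v1) / math.sqrt(
    inner(theta0, v0, v0) * inner(theta0, v1, v1))
```

Comparing cosines avoids committing to an orientation convention for the rotation; the norm of the vector is preserved because the connection is metric compatible.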
Theorem: Knowledge of the Riemann tensor and its derivatives at any given point on a smooth
manifold is sufficient to uniquely reconstruct the metric in a finite vicinity of that point, up to
freedom in the choice of coordinates.
The theorem can be proven by giving an operational way of constructing the metric, for instance
by using Gaussian Normal Coordinates.
$\chi = \dfrac{1}{4\pi} \displaystyle\int d^2x\, \sqrt{g}\, R$ (13.31)
8. ∗ Recalling the symmetries of Riemann tensor (antisymmetric in first and second pair of
indices, symmetric for exchange of the pairs, cyclic sum of any three indices is zero), find the
number of independent components of Riemann in d dimensions.
volume integral with measure $d^4x \sqrt{-g}$ of some scalar quantity. But according to the above theorem,
apart from the trivial case of a constant Λ, the only scalar quantities made just out of the metric
field have to be expressible in terms of products of Riemann and its derivatives. The Riemann tensor is itself
second order in derivatives. Therefore, at zeroth order in derivatives there is only the cosmological
constant:
$S_{cc} = \displaystyle\int d^4x\, \sqrt{-g}\, \Lambda .$ (13.32)
At second order in derivatives, there has to be only a single Riemann tensor with no derivatives.
Due to the symmetries of Riemann tensor there is a unique scalar that can be obtained by taking
traces of a single Riemann, namely the Ricci scalar:
$R \equiv R^\mu{}_\mu , \qquad R_{\mu\nu} \equiv R_{\mu\beta\nu}{}^{\beta} .$ (13.33)
$S_{EH} = \kappa \displaystyle\int d^4x\, \sqrt{-g}\, R .$ (13.34)
There is no nontrivial invariant that is made only of first derivatives of the metric, because these
vanish in a local inertial frame, which forces the invariant to be zero. R is linear in second derivatives of $g_{\mu\nu}$, but one
can decompose it into a total derivative term and a term that only contains first derivatives (see
LL §93 for the derivation). This First Order form is useful to set up the canonical formalism.
Higher order terms in derivatives are often irrelevant phenomenologically.
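$\dfrac{\delta \det g}{\det g} = \mathrm{tr} \log(1 + g^{-1} \delta g) = -\mathrm{tr}\, g\, \delta g^{-1} .$ (13.36)
In components, we get
$\delta \sqrt{-g} = -\dfrac{1}{2} \sqrt{-g}\, g_{\mu\nu}\, \delta g^{\mu\nu} .$ (13.37)
The variational derivative of the action is therefore
$\dfrac{2}{\sqrt{-g}} \dfrac{\delta S_{cc}}{\delta g^{\mu\nu}} = -g_{\mu\nu} .$ (13.38)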
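The identity (13.37) is easy to test numerically. The sketch below uses an arbitrary 2×2 symmetric matrix with negative determinant as a stand-in for a Lorentzian metric (the identity itself does not depend on the dimension), perturbs it by a small symmetric δg, and compares the actual change of $\sqrt{-g}$ with the right-hand side of (13.37), where $\delta g^{\mu\nu}$ is the change of the inverse metric. All numerical values are illustrative.

```python
import math


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


g = [[-1.3, 0.2], [0.2, 0.8]]  # arbitrary symmetric matrix with det < 0
eps = 1e-6
dg = [[0.5 * eps, -0.3 * eps], [-0.3 * eps, 0.7 * eps]]
g_new = [[g[i][j] + dg[i][j] for j in range(2)] for i in range(2)]

# Direct change of sqrt(-det g).
actual = math.sqrt(-det2(g_new)) - math.sqrt(-det2(g))

# Right-hand side of (13.37); dginv is the change of the inverse metric.
ginv, ginv_new = inv2(g), inv2(g_new)
dginv = [[ginv_new[i][j] - ginv[i][j] for j in range(2)] for i in range(2)]
predicted = -0.5 * math.sqrt(-det2(g)) * sum(
    g[i][j] * dginv[i][j] for i in range(2) for j in range(2))
```

The two numbers agree up to terms of second order in the perturbation, which is the accuracy one expects of a first-order variation.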
10. Derive the Einstein tensor by varying SEH with respect to gµν .
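Solution: The action can be written as
$\displaystyle\int \sqrt{-g}\, g^{\mu\nu} R_{\mu\nu} ,$ (13.39)
whose variation is
$\displaystyle\int \left[ \sqrt{-g}\, \delta g^{\mu\nu} \left( R_{\mu\nu} - \dfrac{1}{2} g_{\mu\nu} R \right) + \sqrt{-g}\, g^{\mu\nu}\, \delta R_{\mu\nu} \right] .$ (13.40)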
The expression in parentheses is the Einstein tensor. The last term is a total derivative (LL
§95) and therefore doesn’t contribute to the equation of motion. The reason is one can go to
a local inertial frame in which the first derivative of metric (and hence Γµαβ ) vanishes. So we
have
$R_{\mu\nu} = \partial_\alpha \Gamma^\alpha_{\mu\nu} - \partial_\nu \Gamma^\alpha_{\alpha\mu} ,$ locally inertial frame. (13.41)
where
$w^\mu \equiv g^{\rho\sigma}\, \delta\Gamma^\mu_{\rho\sigma} - g^{\mu\rho}\, \delta\Gamma^\sigma_{\rho\sigma} .$ (13.43)
Now we can use the fact that even though Γµαβ is not a tensor, its variation is, because it’s
the difference between two affine connections. Thus, wµ is a vector and we can write in any
frame
$g^{\mu\nu}\, \delta R_{\mu\nu} = \dfrac{1}{\sqrt{-g}}\, \partial_\mu (\sqrt{-g}\, w^\mu)$ (13.44)
where we replaced the ordinary divergence with covariant divergence and used the identity
$\nabla_\mu w^\mu = \partial_\mu (\sqrt{-g}\, w^\mu) / \sqrt{-g}$. In summary, in (13.39) variation with respect to $R_{\mu\nu}$ gives a
boundary term.
I should again emphasize that spin-2 gravity is not the unique answer to this problem. One
could follow the same approach but assume that the determinant of the metric is the only dynamical
variable by writing
$g_{\mu\nu} = e^{2\varphi}\, \eta_{\mu\nu} .$ (13.45)
For an appropriate choice of κ this would give a nonlinear completion of spin-0 gravity.
11. ∗ Calculate the Ricci scalar for the metric (13.45). What is κ in scalar gravity?
12. ∗ Express variation with respect to ϕ, i.e. δS/δϕ in terms of variation with respect to metric
δS/δg µν .
14 Two-Dimensional De Sitter and Anti-De-Sitter
1. De Sitter$_2$ is the 1 + 1-dimensional spacetime of constant positive curvature. The metric may
be written
$ds^2 = L^2 (-dt^2 + \cosh^2 t\, d\phi^2) ,$ (14.1)
where φ has period 2π. In general dS$_d$ can be embedded as a hyperboloid in (d + 1)-dimensional Minkowski space.
[Commentary: the t > 0 patch of de Sitter$_4$ is a good model for the exponential expansion of
the universe during cosmic inflation (with L of perhaps $10^{-30}$ or $10^{-20}$ meters), and also for
the new inflationary phase we are believed to be entering now due to Dark Energy (with L
of about 10 billion light-years).]
(a) Two nearby observers follow comoving geodesics φ = 0 and φ = ε. What is the rate of
change of their distance s at t = 0? What is $d^2 s/d\tau^2$ to leading order in ε? [Commentary:
positive curvature makes timelike geodesics diverge.]
(b) A timelike geodesic passes φ = 0 at t = 0 with initial velocity $\frac{d\phi}{d\tau}(0) = \beta$. Calculate $\frac{d\phi}{d\tau}(t)$.
[Commentary: expanding universes have “friction”.]
(c) A null geodesic passes φ = 0 at t = 0. How many times does it circumnavigate the
circle by t = ∞? [Commentary: in a rapidly expanding universe, observers become causally
disconnected so that it eventually becomes impossible to send a signal from one to the other.]
2. Anti-de Sitter$_2$ is the 1 + 1-dimensional spacetime of constant negative curvature. The metric
may be written
$ds^2 = L^2 \left[ -(1 + \chi^2)\, dt^2 + \dfrac{d\chi^2}{1 + \chi^2} \right] .$ (14.2)
Negative curvature makes spacelike geodesics diverge and timelike geodesics converge. A
timelike geodesic leaving χ = 0 will eventually return to χ = 0; calculate how long this takes
in
(a) proper time τ along the geodesic
(b) coordinate time t
as a function of the geodesic’s initial rapidity $\frac{d\chi}{d\tau}(0) = \beta$.
(c) What is the maximum value of χ reached as a function of β? [Commentary: anti-de Sitter
is a perfect lens.]
3. What is the proper acceleration required to stay at fixed χ in AdS2 ? In what direction do
you need to accelerate? In what direction is the “gravitational field” pointing?
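The remark in problem 1 that de Sitter space can be embedded as a hyperboloid can be checked numerically for dS$_2$. The sketch below assumes the embedding $X^0 = L \sinh t$, $X^1 = L \cosh t \cos\phi$, $X^2 = L \cosh t \sin\phi$ in 2 + 1 Minkowski space with signature (−, +, +). This particular parametrization is a standard choice and is not spelled out in the notes, and the value of L and the sample point are arbitrary. The induced metric should reproduce (14.1).

```python
import math

L = 2.0  # curvature radius; arbitrary value chosen only for the check
ETA = [-1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, signature (-, +, +)


def hyperboloid(x):
    """Assumed embedding of dS_2 in 2+1 Minkowski space, x = (t, phi)."""
    t, phi = x
    return [L * math.sinh(t),
            L * math.cosh(t) * math.cos(phi),
            L * math.cosh(t) * math.sin(phi)]


def induced_metric(x, h=1e-5):
    """g_ab = eta_AB dX^A/dx^a dX^B/dx^b, with derivatives by central differences."""
    jac = []  # jac[a][A] = dX^A/dx^a
    for a in range(2):
        xp = list(x)
        xm = list(x)
        xp[a] += h
        xm[a] -= h
        Xp, Xm = hyperboloid(xp), hyperboloid(xm)
        jac.append([(p - m) / (2 * h) for p, m in zip(Xp, Xm)])
    return [[sum(ETA[A] * jac[a][A] * jac[b][A] for A in range(3))
             for b in range(2)] for a in range(2)]


t, phi = 0.7, 1.1
g = induced_metric([t, phi])
X = hyperboloid([t, phi])
constraint = -X[0] ** 2 + X[1] ** 2 + X[2] ** 2  # hyperboloid condition, equals L**2
```

The components come out as g_tt = −L², g_φφ = L² cosh² t and g_tφ = 0 up to finite-difference error, matching (14.1).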