
Lecture notes on Electrodynamics–118120

J.E. Avron¹

November 8, 2024

¹ Comments and typos welcome. Send to [email protected].

I thank my TAs Dr. Dana Levanony, Mr. Yaroslav Pollak and Barak Katzir,
and Prof. Amos Ori and Dr. Oded Kenneth, for all they taught me. The
students of the 2012 class pruned many typos, and Daniel Klein showed me how
to query Wolfram Alpha.
Contents

1 Tensor calculus 9
1.1 The geometry of space time . . . . . . . . . . . . . . . . . . . . . 9
1.1.1 The metric tensor . . . . . . . . . . . . . . . . . . . . . . 10
1.1.2 Einstein summation convention . . . . . . . . . . . . . . . 12
1.1.3 Coordinate transformations . . . . . . . . . . . . . . . . . 12
1.1.4 Curvature: you may skip this . . . . . . . . . . . . . . . . 13
1.2 Vectors: Contravariant components . . . . . . . . . . . . . . . . . 14
1.2.1 Covariant components . . . . . . . . . . . . . . . . . . . 15
1.2.2 Contraction makes scalars . . . . . . . . . . . . . . . . . . 18
1.2.3 Orthogonal coordinates . . . . . . . . . . . . . . . . . . . 18
1.3 Scalars, vectors, tensors . . . . . . . . . . . . . . . . . . . . . . . 18
1.3.1 Symmetric and anti-symmetric tensors . . . . . . . . . . . 19
1.3.2 Densities and Weights . . . . . . . . . . . . . . . . . . . . 19
1.3.3 Volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3.4 Levi-Civita tensor and symbol . . . . . . . . . . . . . . . 20
1.4 Tensors and pseudo-tensors . . . . . . . . . . . . . . . . . . . . . 22
1.5 Isometries of Euclidean space . . . . . . . . . . . . . . . . . . . . 22
1.6 Tensorial equations are coordinate free . . . . . . . . . . . . . . . 23
1.7 Differential operators . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.7.1 Grad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.7.2 Div . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.7.3 Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.7.4 Laplacian . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.8 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2 Review of special relativity: Minkowski space-time 27


2.1 The principle of relativity . . . . . . . . . . . . . . . . . . . . . . 27
2.2 Space-time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.1 Events, world line and proper-time . . . . . . . . . . . . . 29
2.3 Simultaneity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.1 Time dilation and length contraction . . . . . . . . . . . . 32
2.3.2 Light-cone coordinates . . . . . . . . . . . . . . . . . . . . 33
2.4 4-Velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.4.1 4 Acceleration: . . . . . . . . . . . . . . . . . . . . . . . . 35


2.4.2 Linear acceleration . . . . . . . . . . . . . . . . . . . . . . 35


2.4.3 Space travel . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.5 Cyclotron motion . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.6 Lorentz transformations . . . . . . . . . . . . . . . . . . . . . . . 37
2.6.1 Space-time translations . . . . . . . . . . . . . . . . . . . 37
2.6.2 Generators of Lorentz transformations . . . . . . . . . . . 38
2.6.3 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.6.4 Boosts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.6.5 Commutators . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.7 Rotating frames in Minkowski space . . . . . . . . . . . . . . . . 40
2.7.1 Rindler coordinates and horizons . . . . . . . . . . . . . . 42
2.8 GPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.8.1 Two dimensional space time . . . . . . . . . . . . . . . . . 44
2.9 Space-time near a point mass . . . . . . . . . . . . . . . . . . . . 45

3 The electromagnetic fields 47


3.1 Electromagnetic fields in Minkowski space . . . . . . . . . . . . 47
3.1.1 Anti-symmetric tensors describe a pair of vectors . . . . . 49
3.2 The field of a uniformly moving charge . . . . . . . . . . . . . . . 51
3.3 Lorentz scalars . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3.1 Duality and Levi-Civita . . . . . . . . . . . . . . . . . . . 53
3.3.2 Second Lorentz scalar . . . . . . . . . . . . . . . . . . . . 54
3.4 The homogeneous Maxwell equations . . . . . . . . . . . . . . . . 54
3.4.1 No monopoles . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.2 Faraday law . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.3 Amalgamating the homogeneous Maxwell equations . . . 55
3.5 Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.5.1 The 4-potential . . . . . . . . . . . . . . . . . . . . . . . . 56
3.6 Gauge transformations . . . . . . . . . . . . . . . . . . . . . . . . 57
3.6.1 Non-local gauge invariants Lorentz scalars . . . . . . . . . 57
3.7 Electromagnetic fields in curvilinear coordinates . . . . . . . . . 58

4 Variational principle 61
4.1 Physics is where the action is . . . . . . . . . . . . . . . . . . . . 61
4.1.1 Action for a free massive particle . . . . . . . . . . . . . . 62
4.1.2 Interaction with the electromagnetic field . . . . . . . . . 63
4.1.3 Gauge invariance . . . . . . . . . . . . . . . . . . . . . . . 64
4.1.4 Euler-Lagrange . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2 Variation of the action . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2.1 Variation of the action of a free particle . . . . . . . . . . 65
4.2.2 Variation of the action associated to interaction . . . . . . 66
4.2.3 Euler-Lagrange equation . . . . . . . . . . . . . . . . . . . 66
4.2.4 The non-relativistic limit . . . . . . . . . . . . . . . . . . 66
4.2.5 The minimiser of the action . . . . . . . . . . . . . . . . . 67
4.3 Geodesics in Curved space-time. (You may want to skip this) . 68
4.3.1 Relativistic Kepler law . . . . . . . . . . . . . . . . . . . . 70

4.4 Supplement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.4.1 Fermat principle . . . . . . . . . . . . . . . . . . . . . . . 72
4.4.2 Rainbow . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5 Currents 75
5.1 Charge densities and currents . . . . . . . . . . . . . . . . . . . . 75
5.1.1 4-current-density . . . . . . . . . . . . . . . . . . . . . . . 75
5.1.2 Charge conservation . . . . . . . . . . . . . . . . . . . . . 77
5.1.3 Current conservation and gauge invariance . . . . . . . . 78
5.1.4 Gauge invariance and the continuity equation . . . . . . . 79
5.1.5 The continuity equation in curvilinear coordinates . . . . 79

6 The inhomogeneous Maxwell’s equations 81


6.1 Lagrangian field theory . . . . . . . . . . . . . . . . . . . . . . . 81
6.1.1 The Lagrangian of the electromagnetic field . . . . . . . . 81
6.2 Variation of the field: Rules of the game . . . . . . . . . . . . . . 83
6.2.1 Variation of the field: Calculations . . . . . . . . . . . . . 83
6.2.2 Variation of the interaction . . . . . . . . . . . . . . . . . 84
6.2.3 The inhomogeneous Maxwell equations . . . . . . . . . . 84
6.2.4 Current conservation . . . . . . . . . . . . . . . . . . . . . 84
6.2.5 3-D form . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.6 Time reversal . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.7 Maxwell equations: Evolution equations and constraints . 85
6.3 New Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.4 Electrodynamics in 1+1 dimensions . . . . . . . . . . . . . . . . 87
6.4.1 Axion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.5 The quantum Hall effect . . . . . . . . . . . . . . . . . . . . . . . 88
6.5.1 The Chern-Simons action . . . . . . . . . . . . . . . . . . 89
6.6 Supplement: Axion electrodynamics . . . . . . . . . . . . . . . . 93
6.6.1 Quantum interface . . . . . . . . . . . . . . . . . . . . . . 93
6.6.2 Magnetic response to an electric field . . . . . . . . . . . . 94
6.6.3 Phantom monopoles . . . . . . . . . . . . . . . . . . . . . 95

7 Magnetic fields and magnetic induction 99


7.1 Constitutive relations . . . . . . . . . . . . . . . . . . . . . . . . 101
7.2 Polarization and Magnetization . . . . . . . . . . . . . . . . . . 102

8 Cloaking 105
8.1 Dielectric media . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.2 Invisible dielectrics . . . . . . . . . . . . . . . . . . . . . . . . . . 106
8.3 Cloaking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

9 The Stress-Energy tensor 111


9.1 Maxwell stress energy tensor . . . . . . . . . . . . . . . . . . . . 111
9.1.1 The stress-energy and conservation laws . . . . . . . . . . 112
9.2 Conservation laws . . . . . . . . . . . . . . . . . . . . . . . . . . 114

9.3 Stress tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115


9.3.1 Case study: Capacitor plates . . . . . . . . . . . . . . . . 115
9.4 Field lines as rubber bands . . . . . . . . . . . . . . . . . . . . . 116
9.5 The stress tensor as variation of the metric . . . . . . . . . . . . 117
9.5.1 Variation of the metric in mechanics . . . . . . . . . . . . 118
9.5.2 Variation of the metric in electrodynamics . . . . . . . . . 118
9.5.3 Matrix calculus . . . . . . . . . . . . . . . . . . . . . . . . 119
9.5.4 The stress tensor . . . . . . . . . . . . . . . . . . . . . . . 119
9.6 Nöther: Symmetries . . . . . . . . . . . . . . . . . . . . . . . . . 120
9.6.1 Shifting the field . . . . . . . . . . . . . . . . . . . . . . . 120
9.6.2 Shifting the box . . . . . . . . . . . . . . . . . . . . . . . 121
9.6.3 Joint box and field shift . . . . . . . . . . . . . . . . . . . 122
9.6.4 Symmetry and traceless . . . . . . . . . . . . . . . . . . . 122
9.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
9.7.1 Radiation pressure . . . . . . . . . . . . . . . . . . . . . . 122
9.7.2 Solar sails . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
9.7.3 Halbach array . . . . . . . . . . . . . . . . . . . . . . . . . 123

10 Electrostatics and magnetostatics 125


10.1 Static electric fields: . . . . . . . . . . . . . . . . . . . . . . . . . 125
10.2 Harmonic functions . . . . . . . . . . . . . . . . . . . . . . . . . . 126
10.2.1 Beating Earnshaw: Magnetic and electric traps . . . . . . 126
10.3 Laplace equation in two dimensions . . . . . . . . . . . . . . . . . 127
10.3.1 Harmonic functions and Riemann mapping . . . . . . . . 128
10.4 Harmonic polynomials and spherical harmonics . . . . . . . . . . 128
10.4.1 Harmonic functions and multipoles . . . . . . . . . . . . . 131
10.5 Poisson’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . 131
10.6 Green’s function . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
10.6.1 Green function in arbitrary dimensions . . . . . . . . . . . 132
10.6.2 Volumes of d balls and spheres . . . . . . . . . . . . . . . 133
10.7 Proof of the fundamental property of Harmonic functions . . . . 134
10.8 Stationary magnetic fields . . . . . . . . . . . . . . . . . . . . . . 135
10.8.1 Biot-Savart law . . . . . . . . . . . . . . . . . . . . . . . . 135
10.8.2 Magnetic dipole . . . . . . . . . . . . . . . . . . . . . . . 136
10.9 Dirac monopoles . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
10.10 Application to geometry . . . . . . . . . . . . . . . . . . . . . 141
10.10.1 Vector fields in 3D: Source and vorticity . . . . . . . . . . 141
10.10.2 Linking number . . . . . . . . . . . . . . . . . . . . . . . 141

11 Electromagnetic waves 143


11.1 Maxwell’s equations in the Lorenz gauge . . . . . . . . . . . . . . 143
11.1.1 Ambiguity of the Lorenz gauge . . . . . . . . . . . . . . . 143
11.2 Electromagnetic waves . . . . . . . . . . . . . . . . . . . . . . . . 144
11.2.1 Electric and Magnetic fields . . . . . . . . . . . . . . . . . 144
11.3 Plane waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
11.3.1 Electric and magnetic fields . . . . . . . . . . . . . . . . . 146

11.3.2 Doppler . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146


11.4 Polarization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
11.4.1 Amplitude and phase . . . . . . . . . . . . . . . . . . . . 147
11.4.2 Polarization . . . . . . . . . . . . . . . . . . . . . . . . . . 148
11.4.3 Poincare sphere . . . . . . . . . . . . . . . . . . . . . . . . 149
11.4.4 Circular polarization . . . . . . . . . . . . . . . . . . . . . 149
11.4.5 Linear polarization . . . . . . . . . . . . . . . . . . . . . . 150
11.4.6 Stokes parameters . . . . . . . . . . . . . . . . . . . . . . 150
11.4.7 Partially polarized light . . . . . . . . . . . . . . . . . . . 150
11.5 The wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . 151
11.5.1 The wave equation in one dimension . . . . . . . . . . . . 151
11.5.2 Waves with Gaussian waists . . . . . . . . . . . . . . . . . 151
11.6 Green’s function for the wave equation . . . . . . . . . . . . . . . 153
11.6.1 Conservation law . . . . . . . . . . . . . . . . . . . . . . . 153
11.6.2 Recursion relation for the Green function . . . . . . . . . 154
11.6.3 Even space dimensions . . . . . . . . . . . . . . . . . . . . 155
11.6.4 Odd space dimensions . . . . . . . . . . . . . . . . . . . . 156
11.7 Coulomb gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
11.8 Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
11.8.1 Cosmic rays: GZK limit . . . . . . . . . . . . . . . . . . . 158
11.8.2 Laser cooling and optical molasses . . . . . . . . . . . . . 158
11.8.3 Covariant superposition . . . . . . . . . . . . . . . . . . . 159
11.8.4 Monochromatic waves . . . . . . . . . . . . . . . . . . . . 159
11.8.5 Evanescent waves . . . . . . . . . . . . . . . . . . . . . . . 159
11.8.6 Waves in dielectric media: Birefringence: . . . . . . . . . 160
11.8.7 3D glasses . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

12 Radiation 163
12.1 Wave equation with arbitrary source term . . . . . . . . . . . . . 163
12.1.1 Scalar wave generated by a moving point source . . . . . 163
12.2 Maxwell equation in the Lorenz gauge . . . . . . . . . . . . . . . 165
12.3 Lienard-Wiechert: Retarded potentials . . . . . . . . . . . . . . 166
12.3.1 The Lorenz Gauge condition . . . . . . . . . . . . . . . . 167
12.4 Lienard Wiechert formula for retarded field . . . . . . . . . . . . 168
12.4.1 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . 169
12.5 Accelerating particle in its rest frame . . . . . . . . . . . . . . . . 170
12.5.1 The Magnetic field: . . . . . . . . . . . . . . . . . . . . . 170
12.5.2 The electric field . . . . . . . . . . . . . . . . . . . . . . . 171
12.5.3 Magnetic field in the far field region . . . . . . . . . . . . 172
12.6 Retardation from a distant source . . . . . . . . . . . . . . . . . 172
12.6.1 The dipole approximation . . . . . . . . . . . . . . . . . 173
12.6.2 Dipole approximation: Successive approximations . . . . 174
12.6.3 Radiation from a charge in Harmonic motion . . . . . . . 175
12.6.4 Many particles . . . . . . . . . . . . . . . . . . . . . . . . 176
12.7 Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
12.8 Classical instability of atoms . . . . . . . . . . . . . . . . . . . . 178

13 Radiation reaction 181


13.1 Is electrodynamics a consistent theory? . . . . . . . . . . . . . . . 181
13.1.1 Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
13.1.2 Mathematics . . . . . . . . . . . . . . . . . . . . . . . . . 182
13.2 Non-relativistic interacting particles . . . . . . . . . . . . . . . . 182
13.3 Radiation reaction: The Abraham-Lorentz force . . . . . . . . . . 183
13.3.1 When is radiation reaction important? . . . . . . . . . . . 184
13.3.2 Friction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
13.3.3 The Dumbbell . . . . . . . . . . . . . . . . . . . . . . . . 186
13.4 Conceptual difficulties . . . . . . . . . . . . . . . . . . . . . . . . 189
Chapter 1

Tensor calculus
Tensor calculus allows us to write equations without committing to
a coordinate system.

1.1 The geometry of space time


Minkowski space-time is the stage on which classical electrodynamics takes
place. Before describing the geometry of space-time, let us collect the tools
we need from geometry.
In constructing physical theories we need to know some facts about the
system we consider: for example, how many particles of each species there
are in a given region of space-time. These things are scalars. They only
depend on our ability to count. We also need various physical constants such
as e, ħ, and the masses of various particles. All of these are God given
scalars. We also need to know something about the space we live in and how to
measure distances between events in space-time. For the purpose of this course,
space-time is Minkowski and space is Euclidean. We take this to be another
God given fact.
In Minkowski space, we can distribute copies of the same meter stick and
the same clock everywhere. (This depends on the fact that Minkowski space
is homogeneous and isotropic.) It is then an experimental fact that with such
sticks and clocks, the velocity of light c is a universal constant. In this sense c
can be viewed as a scalar.
We are still free to choose any (curvilinear in general) coordinate systems
to describe the points of Minkowski space. The coordinate system need not be
inertial. The price we pay is that meter sticks and clocks that are stationary in
such a general coordinate system may not remain synchronized.
Remark 1.1 (Euclid, Gauss and Einstein). Euclid took it for granted that
physical space is Euclidean. In a Euclidean world the angles of triangles sum
up to π. The first to seriously entertain the possibility that space need not be
Euclidean was Gauss. When one says that the world is to a good approximation
Euclidean one means that the deviations from π are small. Gauss, who essentially
invented land surveying, tested this. Later Einstein taught us that space-time is
actually curved and there are many physical tests of this. This is another
story.

1.1.1 The metric tensor


In the Euclidean plane consider a Cartesian coordinate system (x, y), and a
polar coordinate system (r, θ).
Figure 1.1: Cartesian coordinates for the Euclidean plane

The distance between two neighboring points using a standard meter is
denoted $d\ell$. It does not depend on the choice of coordinates:
\[
(d\ell)^2 = (dx)^2 + (dy)^2 = (dr)^2 + r^2 (d\theta)^2 = \sum_{ij} g_{ij}\, dx^i dx^j
\]

g is called the metric tensor, also known as the Riemann metric tensor. It is a
second rank tensor: It has two indices and is symmetric, $g_{ij} = g_{ji}$. When g is
diagonal, the coordinates are called orthogonal.
Note that the coordinates have upstairs indices while the metric has them
downstairs.
A one-to-one mapping
\[
\{x^1, x^2\} \mapsto \{\xi^1, \xi^2\}
\]
is a coordinate transformation. In Euclidean space one can pick coordinates $\xi^j$
that are curvilinear. For example, polar coordinates in the plane
\[
x = \rho\cos\theta, \qquad y = \rho\sin\theta,
\]
Figure 1.2: Polar coordinates for the Euclidean plane

where the domain is

−∞ < x, y < ∞, 0 ≤ ρ < ∞, −π < θ ≤ π

The mapping is one-to-one except at the origin, where the point (x = 0, y = 0) maps
to a line (ρ = 0, θ). You see this in the pictures, where the θ coordinate lines
collapse at the origin. This is a coordinate singularity; it does not reflect any bad
physical feature, and the space is still nice and smooth there.
The metric in the ξ coordinates, which we denote by γ, is, by the chain rule
(and Pythagoras),
\[
(d\ell)^2 = \sum_i (dx^i)^2 = \sum_{jk} \gamma_{jk}\, d\xi^j d\xi^k, \qquad
\gamma = \Lambda^t \Lambda, \qquad
\Lambda^i{}_j = \frac{\partial x^i}{\partial \xi^j} \tag{1.1}
\]
where i is the row index and j the column index of the matrix Λ. For polar
coordinates we get
\[
\Lambda = \begin{pmatrix} \cos\theta & -\rho\sin\theta \\ \sin\theta & \rho\cos\theta \end{pmatrix},
\qquad
\gamma = \begin{pmatrix} 1 & 0 \\ 0 & \rho^2 \end{pmatrix}
\]
Note that
\[
\det\Lambda = \rho \ge 0
\]
so the map is invertible (and orientation preserving) except for ρ = 0. The
vanishing of det γ = ρ² at ρ = 0 is a reflection of a coordinate singularity.
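The relation γ = ΛᵗΛ can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`polar_jacobian`, `induced_metric`) and the sample point are illustrative choices, not part of the notes.

```python
import math

def polar_jacobian(rho, theta):
    """Lambda^i_j = dx^i/dxi^j for x = rho cos(theta), y = rho sin(theta)."""
    return [[math.cos(theta), -rho * math.sin(theta)],
            [math.sin(theta),  rho * math.cos(theta)]]

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def induced_metric(jac):
    """gamma = Lambda^t Lambda (the Cartesian metric is the identity)."""
    return matmul(transpose(jac), jac)

rho, theta = 2.0, 0.7          # arbitrary sample point (assumption)
gamma = induced_metric(polar_jacobian(rho, theta))
print(gamma)                   # close to [[1, 0], [0, rho**2]]
```

Up to rounding, the off-diagonal entries vanish and the θθ entry equals ρ², in agreement with the matrix γ given above.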

1.1.2 Einstein summation convention


This is a shorthand which says: Sum over pairs of up-down indices. For example
\[
\sum_{ij} g_{ij}\, dx^i dx^j = g_{ij}\, dx^i dx^j
\]
It is also called contraction of indices.
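In code, the summation convention is just a pair of nested sums. A small sketch, where `contract_metric` is a hypothetical helper name and the polar metric at ρ = 2 is used as sample data:

```python
def contract_metric(g, dx):
    """Return g_ij dx^i dx^j, summing over both repeated indices."""
    n = len(dx)
    return sum(g[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))

rho = 2.0
g_polar = [[1.0, 0.0], [0.0, rho ** 2]]   # polar metric at radius rho
d_xi = [0.1, 0.05]                        # sample increments (d rho, d theta)
print(contract_metric(g_polar, d_xi))     # (d rho)^2 + rho^2 (d theta)^2
```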


Remark 1.2 (Dummy indices). Summation indices are sometimes called running
and sometimes dummy. They can be relabeled freely
\[
g_{ja} v^a = g_{jb} v^b
\]
Remark 1.3 (Warning). If you get an equation where the indices are not nicely
paired, such as
\[
v^a u^a, \qquad v_a u^a w^a
\]
it is a good idea to search for a typo.

1.1.3 Coordinate transformations


It is intuitively clear that the sphere is intrinsically different from the plane.
For example, two points on the Euclidean plane can be arbitrarily far apart, but on
a sphere the maximal distance between two points is πR (half a great circle). You
can figure out the metric on the sphere by embedding it in 3 dimensions:
\[
z = R\cos\theta, \qquad x = R\sin\theta\cos\phi, \qquad y = R\sin\theta\sin\phi,
\]
with
\[
0 \le \theta \le \pi, \qquad -\pi < \phi \le \pi
\]
The metric is, by Pythagoras in Euclidean 3-space:
\[
(d\ell)^2 = (dx)^2 + (dy)^2 + (dz)^2 = R^2 (d\theta)^2 + R^2 \sin^2\theta\, (d\phi)^2
\]
The metric is close to Cartesian near the equator, θ = π/2. But, since
\[
\det g = R^4 \sin^2\theta
\]
the metric has a coordinate singularity at θ = 0 and θ = π. Nothing bad
happens at the poles; the sphere is still smooth there. It is only the coordinates
which are messed up. This messing up is related to the fact that the poles
belong to all time zones on earth.
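The induced metric can also be obtained numerically from the embedding, without doing the algebra. The sketch below differentiates the embedding by central differences and forms the dot products of the tangent vectors; the function names and the step size `h` are assumptions made for illustration.

```python
import math

def embed(theta, phi, R=1.0):
    """Embedding of the sphere of radius R in Euclidean 3-space."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

def sphere_metric_numeric(theta, phi, R=1.0, h=1e-6):
    """Metric g_ab = (dX/dxi^a) . (dX/dxi^b) with xi = (theta, phi)."""
    def tangent(a):
        plus, minus = [theta, phi], [theta, phi]
        plus[a] += h
        minus[a] -= h
        p, m = embed(*plus, R=R), embed(*minus, R=R)
        return [(pc - mc) / (2 * h) for pc, mc in zip(p, m)]
    t = [tangent(0), tangent(1)]
    return [[sum(x * y for x, y in zip(t[a], t[b])) for b in range(2)]
            for a in range(2)]

g = sphere_metric_numeric(1.0, 0.3, R=2.0)
print(g)   # compare with [[R^2, 0], [0, R^2 sin^2(theta)]]
```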

Figure 1.3: Projective representation of the sphere on the plane

Remark 1.4. A choice of coordinates for the sphere which has no finite
coordinate singularity is the stereographic projection, shown in the figure. The
image of the south pole is at the origin, and the north pole at infinity.
The metric on the plane induced from the standard metric of the sphere is
\[
(ds)^2 = 4R^4\, \frac{(dx)^2 + (dy)^2}{(R^2 + x^2 + y^2)^2}
\]
The metric tensor is diagonal
\[
g_{jk} = \frac{4R^4}{(R^2 + x^2 + y^2)^2}\, \delta_{jk} \tag{1.2}
\]
Note that here the metric components are dimensionless, the coordinates (x, y)
carry dimensions of length, and det g = 0 at a single point: infinity.
If g is the metric tensor of some space (not necessarily Euclidean) in the
coordinates x, and ξ is a different coordinate system on the same space, then the
new metric γ is given by
\[
(d\ell)^2 = g_{ij}\, dx^i dx^j = \gamma_{ab}\, d\xi^a d\xi^b
\]
where
\[
\gamma = \Lambda^t g \Lambda, \qquad \Lambda^i{}_a = \frac{\partial x^i}{\partial \xi^a} \tag{1.3}
\]

1.1.4 Curvature: you may skip this


Given two metrics g and γ it is in general not a simple matter to decide if the two
represent different coordinates on one space or if the two spaces are intrinsically
different, like a sphere and a plane. This problem was addressed by Gauss in
the special case of two dimensional surfaces. Gauss showed that a necessary
condition for the two surfaces to be the same is that the (Gaussian) curvatures
coincide.
I shall not really give a proof of the fact that you can not get the metric of
the sphere by making a coordinate transformation of the Euclidean metric, but
instead hope that the following observation gives an idea how to do that: Since g
is a symmetric matrix, it can be diagonalized by an orthogonal transformation.
So, at any given, fixed point x there is a Λ that diagonalizes g: You can choose
coordinates so that at one point the metric looks like that of Cartesian coordinates
in Euclidean space. This is an expression of the fact that any (Riemannian)
manifold is locally Euclidean. You can not make g the identity everywhere
unless the space is the Euclidean space and you choose Cartesian coordinates.
You can see this by counting the number of free parameters and the number
of constraints in the Taylor expansion of a coordinate transformation near a
point. You can choose a coordinate transformation that makes g the identity and
makes all its first derivatives vanish, but you can not make all the second
derivatives zero. (The curvature is expressed in terms of second derivatives.)

1.2 Vectors: Contravariant components


The mother of all vectors is the velocity vector v. We can associate with the
velocity an infinitesimal displacement
\[
d\mathbf{x} = \mathbf{v}\, dt \tag{1.4}
\]

Consider such an infinitesimal displacement dx in the Euclidean plane. We want
to represent the components of the vector in the (non-orthogonal) coordinate
system shown in fig. 1.4:

Figure 1.4: The contravariant components are the increments $dx^j$ of the
coordinates.

\[
d\mathbf{x} = dx^1\, \mathbf{e}_1 + dx^2\, \mathbf{e}_2 \tag{1.5}
\]
The $\mathbf{e}_j$ are vectors pointing along the coordinate lines and the $dx^j$ are called
contravariant components. The vectors $\mathbf{e}_j$ give the directions; they are related
to the metric by
\[
d\mathbf{x} \cdot d\mathbf{x} = \mathbf{e}_i \cdot \mathbf{e}_j\, dx^i dx^j = g_{jk}\, dx^j dx^k \tag{1.6}
\]
g is diagonal in orthogonal coordinate systems. The covariant basis vectors $\mathbf{e}_j$
are, in general, not unit vectors.

Exercise 1.5. Show that in a polar coordinate system the covariant basis vectors
and the normalized unit vectors are related by
\[
\mathbf{e}_\rho = \hat{\rho}, \qquad \mathbf{e}_\theta = \rho\, \hat{\theta}
\]
Remark 1.6. Normalized unit vectors are defined only for orthogonal coordinate
systems.
Under a change of coordinates $x^j \mapsto \xi^j$, the contravariant components of a
vector,
\[
(dx^1, dx^2) \mapsto (d\xi^1, d\xi^2) \tag{1.7}
\]
transform like the differentials of the coordinates, $dx^j$, i.e.
\[
dx^j = \frac{\partial x^j}{\partial \xi^a}\, d\xi^a = \Lambda^j{}_a\, d\xi^a
\iff \underbrace{dx = \Lambda\, d\xi}_{\text{Contravariant}} \tag{1.8}
\]
Think of the contravariant components as a column vector and Λ as a matrix.
This rule applies to the contravariant components of any vector, not just the
infinitesimal displacement considered above.
Exercise 1.7. Show that the matrices
\[
\Lambda^j{}_a = \frac{\partial x^j}{\partial \xi^a}, \qquad
(\Lambda^{-1})^b{}_k = \frac{\partial \xi^b}{\partial x^k}
\]
are inverses.
Example 1.8. The matrix Λ that converts Cartesian components to Spherical
components is
\[
\Lambda(\text{C-to-S}) = \frac{1}{r}
\begin{pmatrix}
r\cos\phi\sin\theta & r\sin\theta\sin\phi & r\cos\theta \\
\cos\theta\cos\phi & \cos\theta\sin\phi & -\sin\theta \\
-\csc\theta\sin\phi & \cos\phi\csc\theta & 0
\end{pmatrix} \tag{1.9}
\]
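As a consistency check of Exercise 1.7 and Example 1.8, one can multiply the matrix (1.9) by the Jacobian ∂(x, y, z)/∂(r, θ, φ) of the standard spherical coordinates x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ and verify that the product is the identity. The sketch below does this at one sample point; the function names are illustrative.

```python
import math

def cart_to_sph_matrix(r, theta, phi):
    """Matrix (1.9): rows are d(r, theta, phi)/d(x, y, z)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, st * sp, ct],
            [ct * cp / r, ct * sp / r, -st / r],
            [-sp / (r * st), cp / (r * st), 0.0]]

def sph_to_cart_jacobian(r, theta, phi):
    """Rows are d(x, y, z)/d(r, theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, r * ct * cp, -r * st * sp],
            [st * sp, r * ct * sp,  r * st * cp],
            [ct,     -r * st,       0.0]]

def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

product = matmul(cart_to_sph_matrix(2.0, 0.9, 0.4),
                 sph_to_cart_jacobian(2.0, 0.9, 0.4))
print(product)   # close to the 3x3 identity
```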

1.2.1 Covariant components


The covariant components of the vector are given by drawing normals rather
than parallels, as shown in fig. 1.5. We can write the vector in terms of the dual
basis vectors $\mathbf{e}^j$ to the basis $\mathbf{e}_j$, defined by
\[
\mathbf{e}^i \cdot \mathbf{e}_j = \delta^i{}_j \tag{1.10}
\]
where δ is the Kronecker symbol. The vector dx can be represented in two different
ways
\[
d\mathbf{x} = dx^a\, \mathbf{e}_a = dx_a\, \mathbf{e}^a \tag{1.11}
\]
Clearly
\[
dx_j = d\mathbf{x} \cdot \mathbf{e}_j \tag{1.12}
\]
Figure 1.5: The covariant components are the increments $dx_j$ of the orthogonal
projections on the coordinates (schematic).

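From the definitions of the metric tensor, and the notion of duality we get
\[
dx_k = dx_a\, \mathbf{e}^a \cdot \mathbf{e}_k = dx^a\, \mathbf{e}_a \cdot \mathbf{e}_k = g_{ka}\, dx^a \tag{1.13}
\]
The metric tensor allows us to push indexes down.
Exercise 1.9. Show that
\[
\mathbf{e}^j = (\mathbf{e}^j \cdot \mathbf{e}^a)\, \mathbf{e}_a
\]
The squared length of the vector dx is given by
\[
d\mathbf{x} \cdot d\mathbf{x} = dx_a dx_b\, g^{ab} = dx^a dx^b\, g_{ab}, \qquad
g^{jk} = \mathbf{e}^j \cdot \mathbf{e}^k \tag{1.14}
\]
Taking the scalar product of exercise 1.9 with $\mathbf{e}_k$ we conclude that $g_{jk}$ and $g^{jk}$
are inversely related
\[
\delta^j{}_k \underbrace{=}_{\text{duality}} \mathbf{e}^j \cdot \mathbf{e}_k
\underbrace{=}_{\text{ex. 1.9}} (\mathbf{e}^j \cdot \mathbf{e}^a)(\mathbf{e}_a \cdot \mathbf{e}_k)
= g^{ja} g_{ak} \underbrace{=}_{\text{index gym}} g^j{}_k
\]
$g^{jk}$ raises indexes since
\[
g^{ja}\, dx_a = g^{ja} g_{ab}\, dx^b = \delta^j{}_b\, dx^b = dx^j \tag{1.15}
\]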
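The statements above can be tried out on a concrete non-orthogonal basis. In the following sketch the basis e₁ = (1, 0), e₂ = (1, 1) and the helper names are assumptions chosen for illustration; the code builds the dual basis, checks that the two metrics are inverse to each other, and lowers an index.

```python
def dot(u, v):
    """Euclidean scalar product."""
    return sum(a * b for a, b in zip(u, v))

def dual_basis_2d(e1, e2):
    """Dual vectors E^1, E^2 with E^i . e_j = delta^i_j (2D only)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    return (e2[1] / det, -e2[0] / det), (-e1[1] / det, e1[0] / det)

def gram(vectors):
    """Matrix of scalar products, i.e. the metric built from a basis."""
    return [[dot(a, b) for b in vectors] for a in vectors]

e1, e2 = (1.0, 0.0), (1.0, 1.0)          # sample non-orthogonal basis
E1, E2 = dual_basis_2d(e1, e2)
g_lower = gram([e1, e2])                  # g_jk = e_j . e_k
g_upper = gram([E1, E2])                  # g^jk = e^j . e^k

contra = [2.0, 3.0]                       # components dx^a
vec = [contra[0] * e1[i] + contra[1] * e2[i] for i in range(2)]
lowered = [sum(g_lower[k][a] * contra[a] for a in range(2)) for k in range(2)]
projected = [dot(vec, e1), dot(vec, e2)]  # dx_k = dx . e_k
print(g_lower, g_upper, lowered, projected)
```

For this basis the lowered components agree with the projections on e₁ and e₂, and the matrix product of `g_upper` and `g_lower` is the identity.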

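The transformation rules for the covariant components of the position vector
follow:
\begin{align*}
d\xi_a &= \gamma_{ab}\, d\xi^b \\
&= \gamma_{ab}\, (\Lambda^{-1} dx)^b \\
&= \Lambda^i{}_a \Lambda^j{}_b\, g_{ij}\, (\Lambda^{-1})^b{}_k\, dx^k \\
&= \Lambda^i{}_a \Lambda^j{}_b (\Lambda^{-1})^b{}_k\, g_{ij}\, dx^k \\
&= \Lambda^i{}_a (\Lambda\Lambda^{-1})^j{}_k\, g_{ij}\, dx^k \\
&= \Lambda^i{}_a\, g_{ij}\, dx^j \\
&= \Lambda^i{}_a\, dx_i \\
&= dx_i\, \Lambda^i{}_a
\end{align*}
Contrast this with the transformation rule of the contravariant components
\[
dx^a = \Lambda^a{}_i\, d\xi^i
\]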

Since all these indices can make one dizzy, let me write the two rules in a way
that makes the comparison simple. If you write the covariant components as a
row vector and the contravariant components as a column, the transformations
can be written in matrix and vector notation as
\[
\underbrace{d\xi^t = dx^t \Lambda}_{\text{covariant components}}, \qquad
\underbrace{dx = \Lambda\, d\xi}_{\text{contravariant components}} \tag{1.16}
\]
The comparison between the two transformation rules becomes even more
transparent when we write both as column vectors:
\[
\underbrace{d\xi = \Lambda^t dx}_{\text{covariant components}}, \qquad
\underbrace{d\xi = \Lambda^{-1} dx}_{\text{contravariant components}} \tag{1.17}
\]

The covariant components transform by $\Lambda^t$ and the contravariant by $\Lambda^{-1}$. In
general, these two rules are different, but they coincide in the special case that
Λ is an orthogonal transformation, where $\Lambda^t = \Lambda^{-1}$. This is the reason why we
need not distinguish covariant from contravariant components in (orthogonal)
Cartesian coordinates.
Now, by decree, the same rules of transformation hold for any vector, not
just the coordinates. Hence, if u is any vector in the x coordinates, and ν the
corresponding vector in the ξ coordinates, the components are related by
\[
\nu_k = u_j\, \Lambda^j{}_k, \qquad u^k = \Lambda^k{}_j\, \nu^j \tag{1.18}
\]
Note the interchange u ↔ ν. Since the summation indexes j are adjacent, $u_j$
should be considered as a row vector and $\nu^j$ as a column.

1.2.2 Contraction makes scalars


It follows from the first equation in 1.16 and the second equation in 1.17 that
the contraction $u_j v^j = u^t v$ is coordinate independent. Indeed, if we denote by $u'$
and $v'$ the corresponding transformed components then
\[
(u')_j (v')^j = u'^t v' = u^t \Lambda \Lambda^{-1} v = u^t v = u_j v^j \tag{1.19}
\]
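A quick numerical illustration of (1.19): transform a covariant vector with Λ (as a row vector) and a contravariant vector with Λ⁻¹, and compare the contraction before and after. The polar Jacobian is used as a sample Λ; the function names and the sample components are assumptions.

```python
import math

def polar_jacobian(rho, theta):
    """Sample Lambda: Jacobian of Cartesian with respect to polar coordinates."""
    return [[math.cos(theta), -rho * math.sin(theta)],
            [math.sin(theta),  rho * math.cos(theta)]]

def transform_covariant(lam, u):
    """nu_k = u_j Lambda^j_k (row vector times matrix)."""
    return [sum(u[j] * lam[j][k] for j in range(2)) for k in range(2)]

def transform_contravariant(lam, v):
    """nu^k = (Lambda^-1)^k_j v^j, using the explicit 2x2 inverse."""
    det = lam[0][0] * lam[1][1] - lam[0][1] * lam[1][0]
    inv = [[ lam[1][1] / det, -lam[0][1] / det],
           [-lam[1][0] / det,  lam[0][0] / det]]
    return [sum(inv[k][j] * v[j] for j in range(2)) for k in range(2)]

def contract(u, v):
    """u_j v^j."""
    return sum(a * b for a, b in zip(u, v))

lam = polar_jacobian(2.0, 0.7)
u, v = [1.0, -2.0], [0.5, 3.0]            # covariant u_j, contravariant v^j
before = contract(u, v)
after = contract(transform_covariant(lam, u), transform_contravariant(lam, v))
print(before, after)                      # equal up to rounding
```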

1.2.3 Orthogonal coordinates


Many of the standard curvilinear coordinate systems one encounters in practice
are orthogonal. In this case, the metric g is a diagonal matrix. Orthogonal
coordinates admit three types of components: The usual covariant and
contravariant components and the “normalized” components. All three are given
by
\[
v^j\, \mathbf{e}_j = v_j\, \mathbf{e}^j = v_{\hat{j}}\, \mathbf{n}_j
\]
with $\mathbf{e}_i \cdot \mathbf{e}_j = g_{ij} = g_i \delta_{ij}$, $\mathbf{e}^i \cdot \mathbf{e}^j = g^{ij} = (g_i)^{-1} \delta^{ij}$ (no sum over i) and $\mathbf{n}_i \cdot \mathbf{n}_j = \delta_{ij}$.

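Example 1.10 (Polar coordinates). In polar coordinates
\[
\mathbf{v} = \underbrace{v^\rho\, \mathbf{e}_\rho + v^\theta\, \mathbf{e}_\theta}_{\text{contravariant components}}
= \underbrace{v_\rho\, \mathbf{e}^\rho + v_\theta\, \mathbf{e}^\theta}_{\text{covariant components}}
= \underbrace{v_{\hat\rho}\, \hat{\rho} + v_{\hat\theta}\, \hat{\theta}}_{\text{normalized}} \tag{1.20}
\]
The local frames have basis vectors:
\begin{align*}
\mathbf{e}_\rho \cdot \mathbf{e}_\rho &= 1, & \mathbf{e}_\theta \cdot \mathbf{e}_\theta &= \rho^2, & \mathbf{e}_\rho \cdot \mathbf{e}_\theta &= 0 \\
\mathbf{e}^\rho \cdot \mathbf{e}^\rho &= 1, & \mathbf{e}^\theta \cdot \mathbf{e}^\theta &= \rho^{-2}, & \mathbf{e}^\rho \cdot \mathbf{e}^\theta &= 0 \\
\hat{\rho} \cdot \hat{\rho} &= 1, & \hat{\theta} \cdot \hat{\theta} &= 1, & \hat{\rho} \cdot \hat{\theta} &= 0
\end{align*}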

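Example 1.11 (Mechanical model). A particle of unit mass moves on a ring
of fixed radius r. Its orbit in polar coordinates is (r, θ(t)). Its velocity vector is:
\[
\mathbf{v} = \underbrace{(r\dot\theta)}_{\text{velocity}}\, \hat{\theta}
= \underbrace{\dot\theta}_{\text{angular velocity}}\, \mathbf{e}_\theta
= \underbrace{(r^2\dot\theta)}_{\text{“angular momentum”}}\, \mathbf{e}^\theta \tag{1.21}
\]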

1.3 Scalars, vectors, tensors


The charge of a particle, its mass, or the length of a vector are all scalars: You do
not need to decide on coordinates to give a numerical value to a scalar. If you do
use coordinates, the result should be independent of the choice of coordinates.
This is the defining property of scalars.
Vectors are geometric objects and as such do not rely on a coordinate system
either. But their representation by components, covariant or contravariant,
depends on the choice of coordinate system. Vectors are also called rank
one tensors because their components have a single index. The rule of
transformation under coordinate change, as we have seen, is:
\[
\nu_k = u_j\, \Lambda^j{}_k, \qquad u^k = \Lambda^k{}_j\, \nu^j \tag{1.22}
\]
Tensors are multi-index objects that transform as the product of vectors. The
number of indices of the tensor is called its rank. Each index transforms
according to whether it is up or down.
The metric tensor is an example of a symmetric second rank tensor. In
particular, the metric of a coordinate transformation of the Euclidean plane
discussed in section 1.1.1 follows this rule:
\[
g_{jk} = (\Lambda^t \Lambda)_{jk} = (\Lambda^t \mathbb{1} \Lambda)_{jk}
= (\Lambda^t)_j{}^a\, \delta_{ab}\, \Lambda^b{}_k = \delta_{ab}\, \Lambda^a{}_j \Lambda^b{}_k \tag{1.23}
\]
Exercise 1.12. Show that the identity, $\delta^j{}_k$, a mixed second rank tensor, is
invariant under coordinate transformations.

1.3.1 Symmetric and anti-symmetric tensors


Coordinate transformations respect the symmetry of tensors: If T is symmetric (anti-symmetric), i.e. $T_{jk} = \pm T_{kj}$, so is $T'$.
Exercise 1.13. Show that if $T_{jk}$ is symmetric (anti-symmetric) so is $T^{jk}$, but in general $T_j{}^k \neq T_k{}^j$. (One finds instead $T_j{}^k = T^k{}_j$.)

1.3.2 Densities and Weights


Scalars and tensors are basic objects in the equations of nature. But it turns out that there are subtle features that force us to distinguish tensors from objects that are almost tensors. For example, det g has no indices, but is not quite a scalar since by Eq. 1.3

$$ \det \gamma = (\det \Lambda)^2\,\det g $$

Indeed, det g = 1 in Cartesian coordinates, and it is invariant under orthogonal transformations. But it is not a scalar under a change of scale of the coordinates: One says that det g is a scalar with density −2.

1.3.3 Volume
det g enters into the volume element dV . For the sake of simplicity, let us
consider the case of two dimensions where volume is area. Recall that the
(signed) area of the parallelogram associated with the two vectors u and v is

$$ u \times v = u \wedge v $$

and ∧ is a generalization of × to any dimension. We have for the (signed) area

$$ dV = dx^1 e_1 \wedge dx^2 e_2 = e_1 \wedge e_2\; dx^1 dx^2 = z\,|e_1||e_2| \sin\theta\; dx^1 dx^2 $$


Figure 1.6: The area of a parallelogram spanned by $dx^1 e_1$ and $dx^2 e_2$, with θ the angle between $e_1$ and $e_2$.

where z denotes the oriented unit area. Observing that

$$ g = \begin{pmatrix} |e_1|^2 & |e_1||e_2| \cos\theta \\ |e_1||e_2| \cos\theta & |e_2|^2 \end{pmatrix} $$

we conclude that

$$ dV = z\,\sqrt{\det g}\; dx^1 dx^2 $$

Objects with such a rule of transformation are called weights. det g has weight −2.
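The relation between the signed area and $\sqrt{\det g}$ can be checked numerically. The following is a minimal sketch, not part of the original notes: the two basis vectors are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def gram_det(e1, e2):
    """Determinant of the 2x2 metric g_ij = e_i . e_j built from two plane vectors."""
    g11 = e1[0] * e1[0] + e1[1] * e1[1]
    g22 = e2[0] * e2[0] + e2[1] * e2[1]
    g12 = e1[0] * e2[0] + e1[1] * e2[1]
    return g11 * g22 - g12 * g12

def signed_area(e1, e2):
    """Signed area |e1||e2| sin(theta) of the parallelogram, via the 2D cross product."""
    return e1[0] * e2[1] - e1[1] * e2[0]

# Illustrative (assumed) basis vectors of a skewed coordinate system.
e1 = (2.0, 0.0)
e2 = (1.0, 3.0)

area = signed_area(e1, e2)
root_det_g = math.sqrt(gram_det(e1, e2))
print(area, root_det_g)
```

Both numbers agree in magnitude, which is the statement $|e_1||e_2|\sin\theta = \sqrt{\det g}$ used above.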

Exercise 1.14 (Spherical coordinates). Let (x, y, z) be Cartesian coordinates of Euclidean space with metric $d\ell^2 = (dx)^2 + (dy)^2 + (dz)^2$. Show that the standard spherical coordinates

$$ z = r \cos\theta, \qquad x = r \sin\theta \cos\phi, \qquad y = r \sin\theta \sin\phi $$

have the metric

$$ d\ell^2 = (dr)^2 + r^2 (d\theta)^2 + r^2 \sin^2\theta\, (d\phi)^2 $$

and that the volume element is

$$ dV = \sqrt{\det g}\; dr\, d\theta\, d\phi = r^2 \sin\theta\; dr\, d\theta\, d\phi = -r^2\, dr\; d(\cos\theta)\; d\phi $$

1.3.4 Levi-Civita tensor and symbol


We can always define a tensor by giving its numerical components in a particular coordinate system, and then declare that it is given in any other coordinate system by the rules of tensor transformations.
Pick Cartesian coordinates in two dimensional Euclidean space and define the Levi-Civita tensor by setting

$$ \varepsilon_{xx} = \varepsilon_{yy} = 0, \qquad \underbrace{\varepsilon_{xy}}_{\text{even}} = \underbrace{-\varepsilon_{yx}}_{\text{odd}} = 1 $$

Since the Cartesian metric is the identity we also have

$$ \varepsilon^{xx} = \varepsilon^{yy} = 0, \qquad \underbrace{\varepsilon^{xy}}_{\text{even}} = \underbrace{-\varepsilon^{yx}}_{\text{odd}} = 1 $$

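In other coordinates the transformation rule of tensors implies

$$ \varepsilon'_{12} = \varepsilon_{ab}\,\Lambda^a{}_1 \Lambda^b{}_2 = \det \Lambda, \qquad \varepsilon'_{11} = \varepsilon_{ab}\,\Lambda^a{}_1 \Lambda^b{}_1 = 0, \qquad a, b \in \{x, y\} $$

It is always anti-symmetric, but its numerical values are not, in general, ±1 unless Λ is a rotation, so that det Λ = 1. For example, in polar coordinates, the Levi-Civita tensor is

$$ \varepsilon_{\rho\theta} = -\varepsilon_{\theta\rho} = \rho, \qquad \varepsilon_{\rho\rho} = \varepsilon_{\theta\theta} = 0 $$

and

$$ \varepsilon^{\rho\rho} = \varepsilon^{\theta\theta} = 0, \qquad \varepsilon^{\rho\theta} = -\varepsilon^{\theta\rho} = \frac{1}{\rho} $$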
In any dimension, the Levi-Civita symbol is defined as the completely anti-symmetric object of highest rank, with components ±1 and 0. In particular, in three dimensions

$$ \underbrace{\varepsilon^{123} = \varepsilon^{231} = \varepsilon^{312}}_{\text{even permutations}} = \underbrace{-\varepsilon^{321} = -\varepsilon^{213} = -\varepsilon^{132}}_{\text{odd permutations}} = 1 \qquad (1.24) $$

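The Levi-Civita symbol is a tensor with density. The relation between the Levi-Civita tensor (bold) and symbol is

$$ \underbrace{\boldsymbol{\varepsilon}_{i \dots j}}_{\text{tensor}} = \sqrt{\det g}\; \underbrace{\varepsilon_{i \dots j}}_{\text{symbol}}, \qquad \underbrace{\boldsymbol{\varepsilon}^{i \dots j}}_{\text{tensor}} = \frac{1}{\sqrt{\det g}}\; \underbrace{\varepsilon^{i \dots j}}_{\text{symbol}} \qquad (1.25) $$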

Remark 1.15. An orthogonal coordinate system is right handed if

ε1...d = 1

Remark 1.16. In even dimensions cyclic permutations are odd, while in odd
dimensions cyclic permutations are even.

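Exercise 1.17. Show that in n dimensions

$$ \varepsilon^{ij\dots}\,\varepsilon_{ij\dots} = n! $$

Exercise 1.18. Show that (in 3 dimensions)

$$ \varepsilon^{ijk}\,\varepsilon_{iab} = \delta^j_a \delta^k_b - \delta^j_b \delta^k_a, \qquad \varepsilon^{ijk}\,\varepsilon_{ijb} = 2\,\delta^k_b $$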
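The identities in Exercises 1.17 and 1.18 can be checked by brute force in Cartesian coordinates, where upper and lower indices coincide. This is a sketch for sanity checking only (it does not replace the proof); all function names are hypothetical.

```python
from itertools import permutations, product

def levi_civita(n):
    """Map each permutation of (0, ..., n-1) to its sign; other index tuples are 0."""
    symbol = {}
    for perm in permutations(range(n)):
        inversions = sum(
            1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
        )
        symbol[perm] = -1 if inversions % 2 else 1
    return symbol

EPS3 = levi_civita(3)

def eps(i, j, k):
    """Three dimensional Levi-Civita symbol with indices 0, 1, 2."""
    return EPS3.get((i, j, k), 0)

def delta(a, b):
    return 1 if a == b else 0

def full_contraction(n):
    """Sum over all indices of eps * eps in n dimensions (Exercise 1.17)."""
    return sum(value * value for value in levi_civita(n).values())

def single_contraction_holds():
    """Check eps^{ijk} eps_{iab} = delta^j_a delta^k_b - delta^j_b delta^k_a."""
    return all(
        sum(eps(i, j, k) * eps(i, a, b) for i in range(3))
        == delta(j, a) * delta(k, b) - delta(j, b) * delta(k, a)
        for j, k, a, b in product(range(3), repeat=4)
    )

def double_contraction_holds():
    """Check eps^{ijk} eps_{ijb} = 2 delta^k_b."""
    return all(
        sum(eps(i, j, k) * eps(i, j, b) for i in range(3) for j in range(3))
        == 2 * delta(k, b)
        for k, b in product(range(3), repeat=2)
    )

print(full_contraction(3), single_contraction_holds(), double_contraction_holds())
```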



1.4 Tensors and pseudo-tensors


Inversion of Cartesian coordinates is associated with Λ = −1. The Cartesian components of ordinary vectors flip sign under inversion: $(v')^j = -v^j$. (The vector still points in the same direction.)
However, we sometimes encounter in physics vectors whose components do not flip sign. These are pseudo-vectors.
The angular momentum is an example:

$$ L_i = (x \times p)_i = \varepsilon_{ijk}\, x^j p^k \implies L'_j = L_j $$

with ε the Levi-Civita symbol.

1.5 Isometries of Euclidean space


Euclidean space looks the same no matter where you are or how you are ori-
ented: It is homogeneous and isotropic. These symmetries reflect invariance
properties of the metric tensor of Euclidean space under suitable coordinate
transformations. In Cartesian coordinates $x^j$ a shift

$$ (x')^j = x^j + a^j \implies \Lambda = \mathbb{1} \implies g' = g = \mathbb{1}, $$

leaves g invariant, reflecting the homogeneity of Euclidean space. The components of vectors are invariant under a coordinate shift: $(v')^j = v^j$. This is what is meant by saying that vectors in a Cartesian system do not have a location.
Rotation is a linear transformation of the Cartesian coordinates that keeps the origin fixed

$$ (x')^j = \Lambda^j{}_a\, x^a \qquad (1.26) $$

and leaves the metric invariant. By definition, coordinates are Cartesian if g = 1. And indeed, a rotation takes Cartesian coordinates to Cartesian coordinates:

$$ \mathbb{1} = g' = \Lambda^t g \Lambda = \Lambda^t\, \mathbb{1}\, \Lambda = \Lambda^t \Lambda $$

This says that $\Lambda^t$ is the inverse of Λ:

$$ \Lambda^t \Lambda = \mathbb{1} $$

which is the standard definition of an orthogonal transformation. It follows that

$$ (\det \Lambda)^2 = 1 \implies \det \Lambda = \pm 1 $$

Orthogonal transformations are associated with two types of symmetries of the


Euclidean space: When det Λ = 1 they represent rotations. When det Λ = −1
they represent reflections.

Example 1.19 (Rotations). In three dimensions, rotation by θ about the $x^3$ axis is given by

$$ R(\theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} $$

The inverse is the transpose

$$ R(-\theta) = R^t(\theta) \qquad (1.27) $$

Example 1.20. In a two dimensional Euclidean space there are two second rank tensors that are invariant under rotations (up to multiplication by a scalar): the identity and Levi-Civita

$$ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \qquad (1.28) $$

1.6 Tensorial equations are coordinate free


The nice thing about tensor equations is that once

$$ T^{jk\dots} = 0 $$

holds in one (fixed) coordinate system, it holds in any other coordinate system.
For example, Newton’s equation

$$ f^j = m a^j $$

is a tensor equation relating force and acceleration (m is a scalar). If it holds in one coordinate system it holds in any other.

1.7 Differential operators


The equations of motion of fields are partial differential equations. This forces us to mind how differential operators behave under change of coordinates. The general case is complicated and one needs to introduce the notion of covariant derivatives. A simplification, however, occurs for the three differential operators we need for Maxwell’s equations: grad, div and curl.

1.7.1 Grad
The chain rule

$$ \frac{\partial}{\partial \xi^j} = \frac{\partial x^k}{\partial \xi^j} \frac{\partial}{\partial x^k} = \Lambda^k{}_j\, \frac{\partial}{\partial x^k} $$

says that partial derivatives of scalar functions behave like covariant components of a vector. That is, if φ(x) is a scalar field then ∇φ gives the components of a covariant vector field.

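Exercise 1.21. Show that ∇φ in spherical coordinates is

$$ \nabla\phi = (\partial_r \phi)\, e^r + (\partial_\theta \phi)\, e^\theta + (\partial_\varphi \phi)\, e^\varphi = (\partial_r \phi)\, \hat r + \frac{\partial_\theta \phi}{r}\, \hat\theta + \frac{\partial_\varphi \phi}{r \sin\theta}\, \hat\varphi $$

The covariant components lead to simple formulas while the normalized components are a mess.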

1.7.2 Div
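In any coordinate system

$$ \nabla \cdot E = \frac{1}{\sqrt g}\, \partial_j \left( \sqrt g\, E^j \right) \qquad (1.29) $$

where, here and below, $\sqrt g$ is shorthand for $\sqrt{\det g}$. The formula clearly holds in Cartesian coordinates where g is the identity.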


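The defining property of the divergence is Gauss law

$$ \int_{\text{Volume}} dV\, (\nabla \cdot E) = \int_{\text{surface}} dS \cdot E $$

Consider a small cube in the coordinates $dx^j$. The putative expression for div indeed satisfies Gauss law

$$ \begin{aligned}
dV\, (\nabla \cdot E) &= \sqrt g\, dx^1 dx^2 dx^3\, (\nabla \cdot E) \\
&= dx^2 dx^3\, \sqrt g\, E^1 \Big|_i^f + \dots \\
&= dx^2 dx^3\, \sqrt g\; e^1 \cdot E^1 e_1 \Big|_i^f + \dots \\
&= \underbrace{\sqrt g\, dx^2 dx^3\, e^1}_{dS} \cdot E\, \Big|_i^f + \dots \\
&= dS \cdot E
\end{aligned} $$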

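Example 1.22. In spherical coordinates div is

$$ \begin{aligned}
\nabla \cdot E &= \frac{1}{r^2 \sin\theta} \left[ \partial_r (r^2 \sin\theta\, E^r) + \partial_\theta (r^2 \sin\theta\, E^\theta) + \partial_\phi (r^2 \sin\theta\, E^\phi) \right] \\
&= \frac{1}{r^2}\, \partial_r (r^2 E^r) + \frac{1}{\sin\theta}\, \partial_\theta (\sin\theta\, E^\theta) + \partial_\phi (E^\phi) \\
&= \frac{1}{r^2}\, \partial_r (r^2 E_{\hat r}) + \frac{1}{r \sin\theta} \left[ \partial_\theta (\sin\theta\, E_{\hat\theta}) + \partial_\phi (E_{\hat\phi}) \right]
\end{aligned} $$

The last line is in terms of the normalized components.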

1.7.3 Curl
An important differential operator we shall need to discuss is the curl:

$$ (\nabla \times E)^i = \frac{\varepsilon^{ijk}}{\sqrt g}\, \partial_j E_k = \frac{\varepsilon^{ijk}}{2 \sqrt g}\, \underbrace{(\partial_j E_k - \partial_k E_j)}_{\text{anti-symmetric tensor}} \qquad (1.30) $$

The term on the right is the contraction of the Levi-Civita tensor with an anti-symmetric tensor, so the object is a bona-fide vector.
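Eq. 1.30 is evidently the standard definition in Cartesian coordinates. To see why it is true in general we take Stokes law as the defining property of the curl:

$$ \int dS \cdot (\nabla \times E) = \oint d\ell \cdot E $$

The putative formula for curl indeed gives Stokes for dS a small square $dx^1 \times dx^2$:

$$ \begin{aligned}
dS \cdot (\nabla \times E) &= dS_3\, (\nabla \times E)^3 \\
&= \left( \sqrt g\, dx^1 dx^2 \right) \left( \frac{\varepsilon^{3ij}}{\sqrt g}\, \partial_i E_j \right) \\
&= dx^1 dx^2\, (\partial_1 E_2 - \partial_2 E_1) \\
&= dx^2\, E_2 \Big|_i^f - \dots \\
&= (dx^2 e_2) \cdot (E_2 e^2) \Big|_i^f - \dots \\
&= (dx^2 e_2) \cdot E\, \Big|_i^f - \dots \\
&= d\ell \cdot E
\end{aligned} $$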
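Example 1.23. The ϕ component of the curl in spherical coordinates (recall Remark 1.15)¹ is, with the coordinates ordered as (r, θ, ϕ) so that Eq. 1.30 gives $(\nabla \times E)^3 \propto \partial_1 E_2 - \partial_2 E_1$:

$$ (\nabla \times E)^\varphi = \frac{1}{r^2 \sin\theta}\, (\partial_r E_\theta - \partial_\theta E_r) $$

and in normalized components

$$ (\nabla \times E)_{\hat\varphi} = \frac{1}{r}\, (\partial_r E_\theta - \partial_\theta E_r) = \frac{1}{r} \left( \partial_r (r E_{\hat\theta}) - \partial_\theta E_{\hat r} \right) $$

Exercise 1.24. Compute $(\nabla \times E)^r$ and $(\nabla \times E)_{\hat r}$.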
Exercise 1.25 (Vector identities). Show the vector identities
∇ × (∇φ) = 0
∇ · (∇ × E) = 0

1 Note that Wolfram Mathematica notation compares with mine by interchanging ϕ ↔ θ



1.7.4 Laplacian
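The Laplacian of a scalar function is defined by

$$ \Delta\phi = \nabla \cdot \nabla\phi = \frac{1}{\sqrt g}\, \partial_j \left( \sqrt g\, g^{jk}\, \partial_k \phi \right) = \frac{1}{\sqrt g}\, \partial_j \left( \sqrt g\, \partial^j \phi \right) $$

and for a vector field by

$$ \nabla \times (\nabla \times E) = -\Delta E + \nabla (\nabla \cdot E) $$

Example 1.26. ∆φ in spherical coordinates:

$$ \begin{aligned}
\Delta\phi &= \frac{1}{r^2 \sin\theta} \left[ \partial_r (r^2 \sin\theta\, \partial_r \phi) + \partial_\theta \left( r^2 \sin\theta\, \frac{\partial_\theta \phi}{r^2} \right) + \partial_\varphi \left( r^2 \sin\theta\, \frac{1}{r^2 \sin^2\theta}\, \partial_\varphi \phi \right) \right] \\
&= \frac{1}{r^2}\, \partial_r (r^2\, \partial_r \phi) + \frac{1}{r^2 \sin\theta}\, \partial_\theta (\sin\theta\, \partial_\theta \phi) + \frac{1}{r^2 \sin^2\theta}\, \partial_{\varphi\varphi} \phi
\end{aligned} $$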

1.8 Bibliography
• S. Weinberg, Gravitation and Cosmology, gives all a physicist needs to
know about tensors.
• B. Schutz
• Flanders
Chapter 2

Review of special relativity: Minkowski space-time

Minkowski space-time is a good approximation of our physical space-time. (This


would not be the case if we lived near a black hole.) It gives the geometric setting
of special relativity and encapsulates the fundamental observations that: The
velocity of light c is finite and has the same value in all inertial frames.

2.1 The principle of relativity


Physicists, unlike, say, lawyers, do not need to replace their textbooks when they relocate. Your physics library would still be useful even if you relocated to a different earth-like planet in a different galaxy. You do not need to make an adjustment for the relative motion between earth and your new home, or for how the new planet is oriented relative to earth.
Empty space, far from any planet or star, has no distinguished inertial frame
and no distinguished origin or orientation. The speed of light c is a scalar, a
constant of nature, which takes the same value in all inertial frames. This
counter-intuitive property of light was established around 1887 in experiments
of Michelson and Morley. It conflicts with our common intuition about adding
velocities much smaller than c.

Remark 2.1 (c is large). Practical units, like MKS, are chosen to be of O(1) on human scale. The period of a pendulum of a meter length (about the length of a leg) is about 2 [sec] and the second is about a heart beat. However, a light-second is about the distance to the moon. On human scale $c \approx 3 \times 10^8$ [m/s] is essentially infinite. It is likely that ancient natural scientists entertained the thought that c is finite rather than infinite. (Maimonides cautions against logical pitfalls resulting from the use of infinities.) But as c is so large it is difficult to devise an elementary experiment to measure it. The first to estimate c, from irregularities in the motion of the moons of Jupiter, was the Danish astronomer Rømer (1644-1710). He essentially used the Doppler effect (two hundred years before Doppler) to measure the ratio v/c where v is the velocity of earth around the sun.

2.2 Space-time
Space-time is the stage on which events happen. An event, like my typing this text, is something that happens in space and time and is labeled by four coordinates (t, x). It is convenient to measure space and time in the same units, e.g. measure distances in light-seconds or, alternatively, measure time in light-meters. With the latter choice we write

$$ (ct, x^1, x^2, x^3) = \{x^\mu\}, \qquad \mu = 0, \dots, 3 $$

Index convention
Greek indices µ, ν run over 0, 1, 2, 3. Roman indices j, k run over 1, 2, 3.

Figure 2.1: Space-time, with axes x and t and the regions marked Future and Past. The red lines represent the light cones. The blue dot at the origin is the event where light was emitted (and absorbed).

The (Cartesian) metric of Minkowski space is commonly denoted by η:

$$ \eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (2.1) $$

It allows us to associate a scalar with a vector. In the case of a vector $dx^\mu$ relating two nearby events in space-time the scalar is called the interval

$$ (d\ell)^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu = -(c\, d\tau)^2 \qquad (2.2) $$

Note that $(d\ell)^2$ is indefinite. When $(d\ell)^2 > 0$ it has a space-like character and has a real root measured in [cm]. When $(d\ell)^2 < 0$ it has a time-like character and it is then convenient to consider instead the real root of $(d\tau)^2 > 0$, which is measured in [sec].
• A vector $v^\mu$ is called space-like if $v_\mu v^\mu > 0$;
• $v^\mu$ is called time-like if $v_\mu v^\mu < 0$;
• $v^\mu$ is called light-like if $v_\mu v^\mu = 0$.
• The 1-dimensional line x = 0 is the time axis, and the clock attached to the origin measures time τ related to the interval by $(c\tau)^2 = -(d\ell)^2 \geq 0$.
• The 3-dimensional portion of space-time $\eta_{\mu\nu}\, x^\mu x^\nu = 0$ is the light cone.
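The classification above can be written as a few lines of code. This is a minimal sketch assuming the signature (−, +, +, +) of Eq. 2.1 and components ordered as (ct, x, y, z); the function names and the tolerance are hypothetical choices.

```python
def minkowski_square(v):
    """Return v_mu v^mu for a 4-vector (ct, x, y, z) with signature (-, +, +, +)."""
    return -v[0] ** 2 + v[1] ** 2 + v[2] ** 2 + v[3] ** 2

def classify(v, tolerance=1e-12):
    """Label a 4-vector as space-like, time-like or light-like."""
    square = minkowski_square(v)
    if square > tolerance:
        return "space-like"
    if square < -tolerance:
        return "time-like"
    return "light-like"

print(classify((1.0, 0.0, 0.0, 0.0)))  # tangent to the time axis
print(classify((0.0, 1.0, 0.0, 0.0)))  # purely spatial displacement
print(classify((1.0, 1.0, 0.0, 0.0)))  # on the light cone
```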
Example 2.2. In spherical coordinates (ct, r, θ, φ) the Minkowski metric is

$$ \eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & r^2 & 0 \\ 0 & 0 & 0 & r^2 \sin^2\theta \end{pmatrix} \qquad (2.3) $$

2.2.1 Events, world line and proper-time


An event is an objective happening and does not depend on the choice of coordinates. However, its representation in terms of coordinates $x = (x^0, x^1, x^2, x^3)$ depends on the choice of coordinates.
A line in space-time is called a world line. For example, the collection of
events associated with a clock (moving at subluminal speed) gives the world line
of the clock.
Consider a curve in space-time parametrized by Lab-time

$$ X(t) = (ct, x(t)), \qquad -\infty < t < \infty $$

The velocity is a 4-vector which is tangent to the world line

$$ V = \dot X(t) = (c, v(t)), \qquad v = \dot x $$

A massive particle that does not move faster than c has time-like V at all times.
The Minkowski length of a world line is a scalar. Its physical meaning is the time measured by a standard clock moving along the path

$$ \tau = \frac{1}{c} \int \sqrt{-(d\ell)^2} = \int_0^t \frac{dt'}{\gamma(t')} < t, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \qquad (2.4) $$
It is a common, and depressing, fact that when you meet your high school buddies after a long time you note how everybody else aged. This is psychology, and I have nothing to say about it. There is an analogous objective property of time: Your clock is slower than the lab clock and traveling keeps you young.
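Eq. 2.4 can be evaluated numerically for any speed profile. The following sketch uses a simple midpoint rule; the speed profiles, the number of steps and the function name are illustrative assumptions, not part of the notes.

```python
import math

C = 299_792_458.0  # speed of light [m/s]

def proper_time(speed, t_end, steps=100_000):
    """Midpoint-rule estimate of tau = integral_0^t dt' / gamma(t') for a speed profile."""
    dt = t_end / steps
    tau = 0.0
    for k in range(steps):
        v = speed((k + 0.5) * dt)
        tau += math.sqrt(1.0 - (v / C) ** 2) * dt
    return tau

# Constant speed 0.6 c for 10 s of lab time: 1/gamma = 0.8.
tau_constant = proper_time(lambda t: 0.6 * C, 10.0)

# A speed that oscillates between 0 and 0.9 c (an arbitrary example).
tau_oscillating = proper_time(lambda t: 0.9 * C * abs(math.sin(t)), 10.0)

print(tau_constant, tau_oscillating)
```

In both cases the proper time comes out shorter than the 10 s of lab time, as Eq. 2.4 requires.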
Figure 2.2: The backward light cone of a given point is the collection of the events in your past that you can observe. An inertial observer that lives long enough eventually sees all the events in Minkowski space-time.

Exercise 2.3 (Measuring intervals with a clock. Skip on first reading). To measure space-like intervals you would normally use meter sticks. Wigner found a clever trick to measure space-like intervals using a clock and light signals. This is illustrated in Fig. 2.3 and is expressed by the formula

$$ (x_O - x_S)^\mu\, (x_O - x_S)_\mu = ab $$

Figure 2.3: Wigner method of measuring the space-like interval OS with a clock. The interval between the two space-like separated events O and S is related to the clock readings a, b. The red lines in the figure are light-like.

2.3 Simultaneity
Two events $X_1$ and $X_2$ occur simultaneously in the lab if

$$ \Delta X = X_1 - X_2 = (0, x_1 - x_2) $$

Then ∆X is Minkowski orthogonal to the lab-time vector $\hat T = (1, 0, 0, 0)$:

$$ (\Delta X)^\mu\, \hat T_\mu = 0 $$

Now consider the notion of simultaneity in an inertial frame moving at uniform velocity v with respect to the Lab. The time-like direction is

$$ (c, v) $$

and the corresponding unit vector (in Minkowski metric) is

$$ \hat T' = \gamma \left( 1, \frac{v}{c} \right) $$

Figure 2.4: The vector $\hat T$ is the time direction in the lab and the black dot is simultaneous with the red dot in the lab. The vector $\hat T'$ is the time direction in a moving inertial frame and the blue dot is simultaneous with the red dot in this frame.

Two events are simultaneous in the moving frame if the 4-vector of their difference ∆X is Minkowski orthogonal to $\hat T'$, i.e.

$$ (\Delta X)^\mu\, \hat T'_\mu = 0 $$

In particular, the events simultaneous with the event at the origin (0, 0, 0, 0) are given by the points $(x^0, x)$ that satisfy

$$ c\, x^0 - v \cdot x = 0 $$

The two straight lines marked x and x′ in Fig. 2.4 show the events simultaneous with the origin in the two frames.
The notion of now in the lab frame and the notion of now in the inertial frame of the clock are incompatible. Now is a well defined concept only at a point in space-time.

2.3.1 Time dilation and length contraction


Eq. 2.4 makes it clear that proper time is shorter than lab time. In contrast, proper length (measured in the rest frame of the moving object) is the longer length. This seems confusing. The figures below, and their captions, try to explain time dilation and length contraction with pictures.

Figure 2.5: (Panels: a space-time diagram with axes x and t, and the interval as a function of x.) The Minkowski “distance” $(d\ell)^2$ between the origin and an event at t = 1 in the lab, the blue line, takes negative values inside the light cone and positive values outside. The minimum (most negative value) is taken at the red dot. This corresponds to the proper time of a clock which is at rest in the lab. A moving clock, as shown by the green line, registers $c\, d\tau = |d\ell|$. Its intersection with the blue line gives a smaller value for the proper time of the moving clock.

Figure 2.6: (Panels: a space-time diagram with axes x, t and x′, t′, and the interval as a function of t.) The blue and red dots are the ends of a ruler of length 1 at rest in the lab. The straight blue line is the world line of the right end. The Minkowski distance between the origin and the blue line takes its maximum at the red dot. The green frame is a Lorentz frame where the rod is moving. x′ is the time 0 slice in the moving frame. It intersects the blue line at a point whose Minkowski distance is smaller than one. The proper length is largest in the rest frame.

Exercise 2.4. Suppose that you have a factory at the origin that makes identical clocks. Explain how you can distribute the clocks while keeping them (approximately) synchronized in the Lab. (Hint: What happens to the time delay if you halve the speed and double the travel time?)

Remark 2.5. In the 1970’s, when Paul Krugman, a Nobel laureate in economics, was a young assistant professor he wrote an amusing article about the consequences of time-dilation in economics. It can be found here and is worth reading.

2.3.2 Light-cone coordinates


Light cone coordinates in Minkowski space are

$$ \sqrt 2\, u = x - ct, \qquad \sqrt 2\, v = x + ct, \qquad y = y, \qquad z = z $$

Exercise 2.6 (Metric in light cone coordinates). Show that the Minkowski metric tensor in light-cone coordinates is

$$ \eta_{\mu\nu} = \eta^{\mu\nu} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} $$
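A quick numerical check of Exercise 2.6 (not a substitute for the derivation): the metric in the new coordinates is $J^t \eta J$, where J is the Jacobian of the old coordinates (ct, x, y, z) with respect to the new ones (u, v, y, z). The helper names below are hypothetical.

```python
import math

ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

S = 1.0 / math.sqrt(2.0)

# Rows: old coordinates (ct, x, y, z); columns: new coordinates (u, v, y, z).
# From sqrt(2) u = x - ct and sqrt(2) v = x + ct: ct = (v - u)/sqrt(2), x = (u + v)/sqrt(2).
JACOBIAN = [
    [-S, S, 0.0, 0.0],
    [S, S, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

light_cone_metric = matmul(transpose(JACOBIAN), matmul(ETA, JACOBIAN))
for row in light_cone_metric:
    print([round(entry, 12) for entry in row])
```

The off-diagonal entries in the (u, v) block come out equal to 1, in agreement with the matrix above.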
Figure 2.7: Light-cone coordinates u and v. This is not a Cartesian frame since u · v = 1.

2.4 4-Velocity
The proper time dτ is a Lorentz scalar. It is non-zero and real for a clock that travels subluminally. We can then define the velocity as a 4-vector

$$ u^\mu = \frac{dx^\mu}{d\tau} \qquad (2.5) $$

The squared Minkowski length of u is always $-c^2$, essentially by the definition of the interval, Eq. 2.2,

$$ u^\mu u_\mu = \frac{dx^\mu dx_\mu}{(d\tau)^2} = \frac{(d\ell)^2}{(d\tau)^2} = -c^2 \qquad (2.6) $$

The 4-velocity is therefore a time-like vector which lies in the forward light cone. It is related to the usual velocity by

$$ (c, v) = \frac{dx^\mu}{dt} = \frac{dx^\mu}{d\tau} \frac{d\tau}{dt} = \frac{u^\mu}{\gamma} \qquad (2.7) $$

Remark 2.7 (Newtonian velocities). The components of the 4-velocity trans-


form like a contravariant 4-vector under Lorentz transformations. In contrast,
the Newtonian velocity has complicated and ugly transformation properties. This
is the result of the fact that for the Newtonian velocity both the numerator and
the denominator are components of a vector.

Think of a world line $x^\mu(\tau)$ as parametrized by proper time. Since the 4-velocity is normalized and time-like, we can always write it as

$$ u = \dot x = c\, (\cosh\phi, n \sinh\phi), \qquad n \cdot n = n_j n^j = 1 \qquad (2.8) $$

The direction, n = n(τ), is a function of the proper time and so is the rapidity, φ = φ(τ). Note that, in spite of the notation, φ ∈ ℝ is not an angle. Comparison with Eq. 2.7 gives the relation between rapidity and the Newtonian velocity

$$ \gamma = \cosh\phi, \qquad |v| = c \tanh\phi \qquad (2.9) $$

The rapidity is often a convenient representation of the velocity.

vc

Rapidity

Figure 2.8: The velocity as function of the rapidity

2.4.1 4-Acceleration
The 4-acceleration can be similarly defined as

$$ a^\mu = \frac{du^\mu}{d\tau} \qquad (2.10) $$

It is always Minkowski orthogonal to the velocity

$$ u_\mu a^\mu = 0 \qquad (2.11) $$

(since the Minkowski length of the velocity is constant). It follows that the 4-acceleration is always a space-like vector.

2.4.2 Linear acceleration


In linear acceleration n is fixed. Differentiating Eq. 2.8 we find for the 4-acceleration

$$ a = \dot u = c\, \dot\phi\, (\sinh\phi, n \cosh\phi) $$

Evidently

$$ a^\mu a_\mu = c^2 \dot\phi^2 $$

In particular, constant acceleration, g, corresponds to linear dependence of φ on the proper-time τ

$$ \phi(\tau) = \frac{g \tau}{c} $$

In this case we can easily integrate $u = \dot x$ to find the world line

$$ x(\tau) - x(0) = \frac{c^2}{g}\, (\sinh\phi, n \cosh\phi) \qquad (2.12) $$

The world line is a hyperbola that is asymptotically tangent to the light cone, as shown in Fig. 2.9.
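The normalization of the 4-velocity and the properties of the 4-acceleration for uniformly accelerated motion can be checked numerically. This sketch works in 1+1 dimensions with n along x; the value of g and the sample proper times are assumptions for illustration.

```python
import math

C = 299_792_458.0  # speed of light [m/s]
G = 9.81           # assumed constant proper acceleration [m/s^2]

def minkowski_dot(a, b):
    """Minkowski product of (ct, x) vectors with signature (-, +)."""
    return -a[0] * b[0] + a[1] * b[1]

def four_velocity(tau):
    """u = c (cosh(phi), sinh(phi)) with rapidity phi = g tau / c."""
    phi = G * tau / C
    return (C * math.cosh(phi), C * math.sinh(phi))

def four_acceleration(tau):
    """a = du/dtau = g (sinh(phi), cosh(phi))."""
    phi = G * tau / C
    return (G * math.sinh(phi), G * math.cosh(phi))

for tau in (0.0, 1.0e7, 3.0e7):  # proper times in seconds
    u = four_velocity(tau)
    a = four_acceleration(tau)
    print(
        minkowski_dot(u, u) / C ** 2,   # expected: -1
        minkowski_dot(a, a) / G ** 2,   # expected: +1
        minkowski_dot(u, a) / (C * G),  # expected: 0
    )
```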

Remark 2.8. By the equivalence principle, you may think of an accelerating observer as an observer at rest in a constant gravitational field. Such an observer has a horizon, like the horizon of a black hole. This is illustrated in Fig. 2.9.

Figure 2.9: An accelerating observer who lives forever, will still see only half
the events in Minkowski space-time. He will never see the black dot on the left.
The red line is his horizon.

Remark 2.9 (Numerical coincidence). An amusing numerical coincidence is that the year, the gravitational acceleration on earth g, and c are simply related:

$$ g \times \text{year} / c = 1.03 $$

Exercise 2.10. What fraction of the velocity of light would you reach after accelerating at g for one year of proper time? Answer: tanh 1.03 = 0.77
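The arithmetic behind Remark 2.9 and Exercise 2.10 is easy to reproduce. The numerical values of g and of the length of the year below are standard values assumed here, not taken from the notes.

```python
import math

C = 299_792_458.0             # speed of light [m/s]
G = 9.81                      # gravitational acceleration on earth [m/s^2]
YEAR = 365.25 * 24 * 3600.0   # one (Julian) year [s]

# Rapidity reached after one year of proper time at constant acceleration g.
rapidity = G * YEAR / C

# Fraction of the speed of light, from |v| = c tanh(phi).
fraction_of_c = math.tanh(rapidity)

print(round(rapidity, 2), round(fraction_of_c, 2))
```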

2.4.3 Space travel


You may worry that since c is the ultimate speed, a human being, living for, say, 80 years, can explore at most a neighborhood of 80 light-years around earth. This is wrong. The rapidity increases linearly with the proper time T and, by Eq. 2.12, the distance is exponential in the rapidity. A space traveler who lives for T years in a space ship which accelerates at g will travel a distance of about cosh T, measured in light-years (by Remark 2.9, g ≈ 1 in units of light-years and years). The visible universe has a radius of about $10^{11}$ light-years. So you will get there in about 26 years.
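The estimate of 26 years follows from inverting the cosh. The sketch below uses units of years and light-years with g ≈ 1 light-year/year² (the approximation suggested by Remark 2.9); the function names are hypothetical.

```python
import math

def distance_light_years(proper_years):
    """Distance scale cosh(T) quoted in the text, for acceleration g ~ 1 ly/yr^2."""
    return math.cosh(proper_years)

def proper_years_to_reach(distance_ly):
    """Invert the relation above: T = arccosh(distance)."""
    return math.acosh(distance_ly)

years_needed = proper_years_to_reach(1.0e11)
print(round(years_needed, 1))
print(distance_light_years(80.0))  # a full 80 year lifetime reaches far beyond 80 light-years
```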

2.5 Cyclotron motion


Consider relativistic circular motion with fixed rapidity φ and constant acceleration, $a^\mu a_\mu = g^2$. The 4-velocity is

$$ u^\mu = c\, (\cosh\phi, \cos(\omega\tau) \sinh\phi, \sin(\omega\tau) \sinh\phi, 0), \qquad \omega c \sinh\phi = g $$

The acceleration is

$$ a^\mu = \omega c \sinh\phi\, (0, -\sin(\omega\tau), \cos(\omega\tau), 0) = g\, (0, -\sin(\omega\tau), \cos(\omega\tau), 0) $$

The orbit is a helix in space-time

$$ x(\tau) - x(0) = \frac{c}{\omega}\, (\omega\tau \cosh\phi, \sin(\omega\tau) \sinh\phi, -\cos(\omega\tau) \sinh\phi, 0) $$

2.6 Lorentz transformations


Definition 1 (Lorentz transformations as isometries of Minkowski space). The linear transformation Λ, a 4 × 4 matrix, is a Lorentz transformation if it leaves the Minkowski metric η invariant

$$ \Lambda^t \eta \Lambda = \eta \iff \Lambda^\alpha{}_\mu\, \Lambda^\beta{}_\nu\, \eta_{\alpha\beta} = \eta_{\mu\nu} \qquad (2.13) $$

It follows that

$$ \det \Lambda = \pm 1 $$

This divides Lorentz transformations into two classes: the proper Lorentz transformations, where det Λ = 1, which contain the identity, and the improper ones, where det Λ = −1, which contain the reflections.

Remark 2.11. Lorentz transformations are the analog of orthogonal transfor-


mations of Euclidean space.

2.6.1 Space-time translations


Space-time translations are trivial Lorentz transformations. They are given by

$$ (x')^\mu = x^\mu + a^\mu $$

This gives

$$ \Lambda^\mu{}_\nu = \frac{\partial (x')^\mu}{\partial x^\nu} = \delta^\mu{}_\nu \iff \Lambda = \mathbb{1} $$

Translations express the homogeneity of Minkowski space-time.

2.6.2 Generators of Lorentz transformations


The Lorentz group is made of rotations and boosts. It is convenient to describe these transformations in terms of their generators. A generator L is a 4 × 4 matrix that generates the one parameter family of transformations

$$ \Lambda(t) = e^{Lt} $$

If Λ(t) is to be a family of Lorentz transformations it must preserve the Minkowski metric:

$$ \eta = \Lambda(t)^t\, \eta\, \Lambda(t) $$

Differentiating with respect to t at t = 0 gives the condition that L generates Lorentz transformations

$$ \eta L + L^t \eta = 0 \qquad (2.14) $$

This relation makes it evident that the generators make a linear space which is spanned by a finite set of linearly independent generators. It turns out that the space has six independent generators: three rotations and three boosts.
Since L commutes with itself, the associated Lorentz transformations satisfy a simple addition rule

$$ e^{t_1 L}\, e^{t_2 L} = e^{(t_1 + t_2) L} \qquad (2.15) $$

This is what you expect for addition of rotations about one fixed axis. But, more surprisingly, this is also the rule for addition of boosts along a fixed axis, provided we identify t with the rapidity, as we shall see below.
Exercise 2.12. Show that:
1. If λ is an eigenvalue of Λ so is λ*.
2. If λ is an eigenvalue so is 1/λ. (Hint: Use $\Lambda^{-1} = \eta \Lambda^t \eta$ to show that $\det(\Lambda - \lambda) = \det(\Lambda^{-1} - \lambda)$.)

2.6.3 Rotations
Rotations about the x-axis are generated by

$$ L_{yz} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix} \qquad (2.16) $$

and a rotation by θ about the x axis is given by

$$ \Lambda_{yz}(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos\theta & \sin\theta \\ 0 & 0 & -\sin\theta & \cos\theta \end{pmatrix} \qquad (2.17) $$

$L_{yz}$ is closely related to the angular momentum about the x axis.
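The relation between the generator of Eq. 2.16 and the finite rotation of Eq. 2.17 can be checked by summing the exponential series numerically. This is a minimal sketch in plain Python; the truncation at 40 terms and the test angle are arbitrary assumptions, and the helper names are hypothetical.

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
L_YZ = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def expm(generator, t, terms=40):
    """Truncated Taylor series of exp(t * generator)."""
    result = identity()
    term = identity()
    for n in range(1, terms):
        scaled = [[t * entry / n for entry in row] for row in generator]
        term = matmul(term, scaled)
        result = [[result[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return result

def generator_condition(generator):
    """Return eta L + L^t eta, which must vanish for a Lorentz generator (Eq. 2.14)."""
    left = matmul(ETA, generator)
    right = matmul(transpose(generator), ETA)
    return [[left[i][j] + right[i][j] for j in range(4)] for i in range(4)]

theta = 0.7
rotation = expm(L_YZ, theta)
for row in rotation:
    print([round(entry, 6) for entry in row])
```

The lower right block reproduces cos θ and ± sin θ as in Eq. 2.17, and `generator_condition(L_YZ)` is the zero matrix.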



Exercise 2.13. Check that Lyz satisfies Eq. 2.14

Similarly for rotations about the y and z space axes. Rotation by θ about an arbitrary axis n is given by

$$ e^{\theta (n_x L_{yz} + n_y L_{zx} + n_z L_{xy})} \qquad (2.18) $$

The isometry under rotations expresses the isotropy of space.

Exercise 2.14. Show that

$$ [L_{yz}, L_{zx}] = -L_{xy} \qquad (2.19) $$

Up to a factor, $L_{ab} \mapsto -i L_{ab}$, these are the commutation relations of angular momentum.

Remark 2.15. An airplane has three rotation controls: Stick, for pitch, rudder
for yaw, and ailerons for roll. The three are linearly independent, but by the
commutation relation you can always generate the third from the first two.

Exercise 2.16. Calculate the residual rotation of (pairwise-cancelling) rotations

Λyz (π/2)Λzx (π/2)Λyz (−π/2)Λzx (−π/2)

Show that this is a rotation about the (−1, 1, 1) axis.

2.6.4 Boosts
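A boost in the x direction is generated by

$$ L_{tx} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \qquad (2.20) $$

The associated Lorentz transformation with rapidity φ is given by:

$$ \Lambda_{tx}(\phi) = \begin{pmatrix} \cosh\phi & \sinh\phi & 0 & 0 \\ \sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (2.21) $$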
Exercise 2.17. Check that Ltx satisfies Eq. 2.14.

To check that φ is indeed the rapidity, consider the world-line of the origin. Since

$$ \Lambda_{tx}(\phi)\, (ct, 0, 0, 0)^t = ct\, (\cosh\phi, \sinh\phi, 0, 0)^t $$

its velocity is c(cosh φ, sinh φ, 0, 0). Comparison with Eq. 2.8 shows that φ is the rapidity.

Exercise 2.18. Check Eq. 2.21 in 1+1 dimensional space-time. (Hint: use the formula for $e^{\phi \sigma_x}$ from quantum mechanics.)
One similarly defines the generators of the boosts along the y and z axes. A boost by φ in an arbitrary direction n is then given by

$$ e^{\phi (n_x L_{tx} + n_y L_{ty} + n_z L_{tz})} \qquad (2.22) $$

The addition law for velocities in special relativity is a mess. It is, however, simple for the rapidities: Let n be the direction of motion and $L = n_x L_{tx} + n_y L_{ty} + n_z L_{tz}$. Since L commutes with itself

$$ e^{\phi_1 L}\, e^{\phi_2 L} = e^{(\phi_1 + \phi_2) L} $$

It follows that rapidities (in the same direction) add.
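The additivity of rapidities can be compared with the standard collinear velocity addition formula of special relativity, $\beta = (\beta_1 + \beta_2)/(1 + \beta_1 \beta_2)$ with β = v/c. That formula is not derived in this section; it is quoted here as a known result for comparison. The function names are hypothetical.

```python
import math

def rapidity(beta):
    """Rapidity phi for a speed v = beta c, from |v| = c tanh(phi)."""
    return math.atanh(beta)

def add_velocities(beta1, beta2):
    """Standard relativistic addition of collinear velocities (in units of c)."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

def add_via_rapidity(beta1, beta2):
    """Add the rapidities and convert the sum back to a velocity."""
    return math.tanh(rapidity(beta1) + rapidity(beta2))

print(add_velocities(0.5, 0.5), add_via_rapidity(0.5, 0.5))
print(add_velocities(0.9, 0.9), add_via_rapidity(0.9, 0.9))
```

The two routes agree, and the result never exceeds 1 (that is, c).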


Exercise 2.19 (Galilean transformations). Show that for small rapidities Lorentz boosts reduce to Galilean transformations:

$$ t' = t + O(c^{-2}), \qquad x' = x - vt + O(c^{-2}) $$

2.6.5 Commutators
The commutators of the generators of rotations are given by cyclic permutations of Eq. 2.19. The commutators of the generators of rotations with the generators of boosts follow the same rules, since the three generators of boosts are naturally associated with a vector, i.e.

$$ [L_{yz}, L_{tx}] = 0, \qquad [L_{xy}, L_{tx}] = -L_{ty}, \qquad [L_{zx}, L_{tx}] = L_{tz} \qquad (2.23) $$

The most interesting and surprising commutator is between two generators of boosts. It turns out that the commutator is a rotation:

$$ [L_{tx}, L_{ty}] = L_{xy} $$

This is related to the physical phenomenon known as Thomas precession.
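Commutators of the generators can be verified by direct matrix multiplication. The sketch below builds the matrices from the patterns of Eq. 2.16 and Eq. 2.20. The explicit forms of $L_{zx}$, $L_{xy}$ and $L_{ty}$ are assumed by analogy with those two equations (they are not written out in the text), and the helper names are hypothetical.

```python
T, X, Y, Z = 0, 1, 2, 3

def unit(a, b):
    """4x4 matrix with a single 1 in row a, column b."""
    m = [[0] * 4 for _ in range(4)]
    m[a][b] = 1
    return m

def combine(p, q, sign):
    return [[p[i][j] + sign * q[i][j] for j in range(4)] for i in range(4)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def commutator(p, q):
    return combine(matmul(p, q), matmul(q, p), -1)

def rotation_generator(a, b):
    """Anti-symmetric generator with +1 in entry (a, b), as in Eq. 2.16."""
    return combine(unit(a, b), unit(b, a), -1)

def boost_generator(a):
    """Symmetric generator mixing time with axis a, as in Eq. 2.20."""
    return combine(unit(T, a), unit(a, T), +1)

def negate(p):
    return [[-entry for entry in row] for row in p]

L_YZ = rotation_generator(Y, Z)
L_ZX = rotation_generator(Z, X)   # assumed by analogy with Eq. 2.16
L_XY = rotation_generator(X, Y)   # assumed by analogy with Eq. 2.16
L_TX = boost_generator(X)
L_TY = boost_generator(Y)         # assumed by analogy with Eq. 2.20

print(commutator(L_YZ, L_ZX) == negate(L_XY))   # Eq. 2.19
print(commutator(L_TX, L_TY) == L_XY)           # two boosts give a rotation
```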

2.7 Rotating frames in Minkowski space


You can sometimes hear claims that special relativity cannot correctly describe the motion of rotating frames and that you need general relativity for this. This is wrong. There is no problem in describing rotating frames, provided they are rotating relative to Minkowski space-time.
The rotating earth is such a non-inertial frame. Let Ω be the angular frequency, (ct′, x′) the coordinates of an inertial frame and (ct, x) the coordinates in a rotating frame. A coordinate transformation between the two frames, represented in cylindrical coordinates, is

$$ t' = t, \qquad \rho' = \rho, \qquad z' = z, \qquad \phi' = \phi + \Omega t $$

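The Euclidean metrics are then related by

$$ \begin{aligned}
dx' \cdot dx' &= (d\rho')^2 + (dz')^2 + \rho'^2 (d\phi')^2 \\
&= (d\rho)^2 + (dz)^2 + \rho^2 (d\phi + \Omega\, dt)^2 \\
&= dx \cdot dx + 2 \rho^2 \Omega\, d\phi\, dt + \rho^2 \Omega^2 (dt)^2
\end{aligned} $$

It follows that the Minkowski metric is

$$ -(c\, dt')^2 + dx' \cdot dx' = -(c\, dt)^2 \underbrace{\left( 1 - \frac{\Omega^2 \rho^2}{c^2} \right)}_{\text{centrifugal}} + \underbrace{2 \Omega \rho^2\, dt\, d\phi}_{\text{Sagnac-Coriolis}} + dx \cdot dx $$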

Exercise 2.20. Show that Ωρ/c at the equator of earth is about $1.5 \times 10^{-6}$.

In the case of earth the centrifugal correction can normally be neglected, but the Sagnac term, which is first order in Ω, is important and to a good approximation

$$ \begin{aligned}
(d\ell)^2 &= -(c\, dt)^2 + 2 \Omega \rho^2\, dt\, d\phi + dx \cdot dx \\
&= -(c\, dt)^2 + 2 \Omega \rho^2\, dt\, d\phi + \rho^2 (d\phi)^2 + (d\rho)^2 + (dz)^2 \qquad (2.24)
\end{aligned} $$

In particular, if you consider light propagating along the equator, ρ = R, z = 0, it takes different times for the clockwise and counter-clockwise beams to complete a 2π rotation. The times are given by the two solutions of the quadratic equation for $T_\pm$:

$$ 0 = -(c T_\pm)^2 + 4 \pi \Omega R^2\, T_\pm + R^2 (2\pi)^2 \qquad (2.25) $$
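To get a feeling for the size of the effect, Eq. 2.25 can be solved numerically for earth. The values of Ω and R below are standard reference values assumed for illustration; they are not given in the notes. The two roots have opposite signs, corresponding to the two senses of circulation, and their sum is the difference between the two travel times.

```python
import math

C = 299_792_458.0     # speed of light [m/s]
OMEGA = 7.292115e-5   # assumed angular velocity of earth [rad/s]
R = 6.378137e6        # assumed equatorial radius of earth [m]

def round_trip_times():
    """Roots of c^2 T^2 - 4 pi Omega R^2 T - (2 pi R)^2 = 0, i.e. Eq. 2.25."""
    a = C ** 2
    b = -4.0 * math.pi * OMEGA * R ** 2
    c0 = -(2.0 * math.pi * R) ** 2
    root = math.sqrt(b * b - 4.0 * a * c0)
    return (-b + root) / (2.0 * a), (-b - root) / (2.0 * a)

t_plus, t_minus = round_trip_times()

# |T+| - |T-| = T+ + T-, since T- is negative.
time_difference = t_plus + t_minus
area = math.pi * R ** 2

print(t_plus, abs(t_minus))            # both about 2 pi R / c = 0.134 s
print(time_difference * 1e9)           # difference in nanoseconds
print(4.0 * area * OMEGA / C ** 2 * 1e9)  # closed form 4 A Omega / c^2
print(OMEGA * R / C)                   # the small parameter of Exercise 2.20
```

The difference is about 415 ns, twice the 207 ns quoted in Exercise 2.21 for a slowly transported clock.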

Exercise 2.21 (Coordinate times and clock times).
1. Compare the change in coordinate time dt in the rotating earth frame with the proper-time dτ measured by a clock at a fixed location in the rotating frame.
2. Compare the change in coordinate time dt′ in the inertial frame with the change in coordinate time dt in the rotating coordinates.
3. A clock is taken for a one year trip around the earth's equator. Show that the time lag relative to a clock that stayed is

$$ \Delta\tau = \pm \frac{2 A \Omega}{c^2}, \qquad A = \pi R_e^2 $$

where $R_e$ is the earth radius and the ± depends on whether the trip was towards the east or towards the west.
4. Compute ∆τ. (Answer: 207 [ns])
5. Explain why the result implies that one cannot synchronize clocks on earth.
6. Is it still true that $u^\mu u_\mu = -c^2$ in a rotating frame? (Yes)
7. What does get modified is γ. Show that

$$ \gamma^{-2} = 1 - (v/c)^2 + 2 \Omega \dot\phi \rho^2 / c^2 - \Omega^2 \rho^2 / c^2 $$

Figure 2.10: Minkowski space in rotating earth coordinates. It takes light dif-
ferent time, in the rotating frame, to complete a clockwise round-trip and coun-
terclockwise round-trip.

2.7.1 Rindler coordinates and horizons


Rindler coordinates are the analog of the polar coordinate system in a two dimensional Minkowski space-time

$$ x = \rho \cosh\tau, \qquad t = \rho \sinh\tau $$

Since cosh τ > |sinh τ|, i.e. x > |t| for ρ > 0, the Rindler coordinates cover 1/4 of space-time.

Exercise 2.22. Show that

$$ (dx)^2 - (dt)^2 = (d\rho)^2 - \rho^2 (d\tau)^2 $$

As we shall see, the world line ρ = const describes a uniformly accelerated observer. So the Rindler coordinates are useful in describing accelerated frames in Minkowski space.

Figure 2.11: The red lines show the ρ Rindler coordinate.

2.8 GPS
Every time you use your GPS and find, with relief, that the GPS really knows
where you are, you are testing special and general relativity. It took a century
to turn Einstein’s revolution into a useful gadget.
GPS works as follows. There are about 24 GPS satellites orbiting the earth at
a radius of about 26,000 [km] with a period of about half a day. Their orbits are
known (and monitored) with great precision (a few centimeters). On each satellite
there is an atomic clock that measures its proper time with great precision^1.
Each satellite radios, at specific intervals, a message that contains its identity and
the reading of its clock. Since the orbit of the satellite is known, the data specify
the transmission event $X_b^\mu$ of satellite b. The corresponding event received by
the user is $x^\mu$. Let us focus on the ideal case when both the transmitter and
receiver are in empty space, so that electromagnetic waves propagate at c. It
follows that
$$(X_b^\mu - x^\mu)\big((X_b)_\mu - x_\mu\big) = 0 \tag{2.26}$$
This is one equation for the 4 unknowns $x^\mu$. To determine the four unknown
coordinates $x^\mu$ of the reception event you need 4 equations: you need to receive,
simultaneously, at least 4 signals from 4 GPS satellites and record 4 transmission
events, all light-like separated from the reception event. One expects these
equations to be typically independent and to have a unique solution, which is
the position and time of the lost tourist.
1 To locate a point with a precision of 1 [m] you need to measure time to a precision of
O(10^{-9}) [sec].

Eq. 2.26 clearly incorporates special relativity, as it uses the fact that light
propagates at c irrespective of the motion of the satellite and the receiver. It
turns out that for GPS to be practically useful, special relativity is not enough.
One needs to take account of the slight deviations of space-time from Minkowski.
In particular, the self-time of atomic clocks depends not only on their velocity
but also on the gravitational field that they see. Moreover, one needs a
better approximation for the metric, one that takes into account the gravitational
field of the earth (see section 2.9). Yet another complication is that in practice one
wants to know the coordinates of the event in the non-inertial coordinate
system that rotates with the earth (as in section 2.7). There are many additional
complications that need to be taken care of, such as atmospheric effects on the
velocity of light.
Ignoring special or general relativity would degrade the accuracy of GPS
to about 10 km and make it useless. You can then satisfy yourself that special
and general relativity have been tested many billions of times.
If you want to know more about GPS, then the article of Neal Ashby in
Living Reviews of General Relativity is a good place to learn. Wikipedia is, as
usual, quite good as well.
Exercise 2.23 (Orders of magnitude).
• What is a typical velocity of a GPS satellite? (Answer: 3.8 [km/s])
• Compute the difference between the coordinate time and the self-time of a
GPS clock after one orbit, i.e. about half a day. (∆t ≈ πRv/c² ≈ 3.6 × 10⁻⁶ [s]; the lag after a full day is twice that.)
• What is the resulting positioning inaccuracy?
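The orders of magnitude in Exercise 2.23 can be checked with a few lines of arithmetic. The sketch below assumes a circular orbit with the radius quoted in the text and a period of exactly 12 hours. Both numbers and the function names are assumptions made for illustration. The estimate πRv/c² is the special-relativistic lag (v²/2c²)·T accumulated over one orbital period T = 2πR/v.

```python
import math

C = 2.998e8            # speed of light [m/s]
R_ORBIT = 2.6e7        # orbital radius, about 26,000 km as quoted in the text [m]
PERIOD = 12 * 3600.0   # "about half a day", assumed to be exactly 12 h [s]

def orbital_speed(radius=R_ORBIT, period=PERIOD):
    """Speed on a circular orbit: circumference divided by period."""
    return 2.0 * math.pi * radius / period

def clock_lag_per_orbit(radius=R_ORBIT, period=PERIOD, c=C):
    """Lowest-order time dilation over one orbit: (v^2 / 2c^2) T = pi R v / c^2."""
    v = orbital_speed(radius, period)
    return math.pi * radius * v / c**2

v = orbital_speed()
lag = clock_lag_per_orbit()
print(f"v ~ {v / 1e3:.1f} km/s")
print(f"lag per orbit ~ {lag * 1e6:.1f} microseconds")
print(f"ranging error c * lag ~ {C * lag:.0f} m")
```

The last line converts the uncorrected clock lag into a distance, which is one way to read the third item of the exercise.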

2.8.1 Two dimensional space time


Consider a toy GPS problem in 1+1 dimensions, shown in Fig. 2.12. If you
see 2 satellites in 1+1 dimensions, one must be on your right and the
other on your left. (Otherwise one would eclipse the other.)
The light cones of the two transmission events intersect.
Exercise 2.24 (GPS in 1+1 dimensions). Two satellites with known orbits
$(a^0_b(\tau), a^1_b(\tau))$, b = 1, 2, emit signals at proper times $\tau_1$ and $\tau_2$ respectively.
Assuming that the separation $a_1^\mu - a_2^\mu$ of the two emission events is space-like,
show that, in units where c = 1, the light cones intersect at
$$2x^1 = \pm(a^0_1 - a^0_2) + a^1_1 + a^1_2, \qquad 2x^0 = (a^0_1 + a^0_2) \pm (a^1_1 - a^1_2)$$
(One solution is in the past and the other in the future.)
This toy model tests:
1. Space-time is approximately Minkowski
2. Electromagnetic waves propagate at c
3. The velocity of the satellite at the transmission event is irrelevant
4. The velocity of the lost tourist at the reception event is irrelevant
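The closed-form answer of Exercise 2.24 is easy to test numerically. The sketch below, in units c = 1, evaluates both sign choices and checks that each candidate event is light-like separated from both emission events. The function names and the sample events are invented for the illustration.

```python
def lightcone_intersections(a, b):
    """Events x = (x0, x1) on the light cones of both a = (a0, a1) and b = (b0, b1).

    Units with c = 1. Returns the two solutions, upper sign first.
    """
    a0, a1 = a
    b0, b1 = b
    solutions = []
    for s in (+1, -1):
        x1 = 0.5 * (s * (a0 - b0) + a1 + b1)
        x0 = 0.5 * ((a0 + b0) + s * (a1 - b1))
        solutions.append((x0, x1))
    return solutions

def interval2(x, y):
    """Minkowski interval squared between two events, signature (-, +)."""
    return -(x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2

# Sample emission events (time, position); their separation is space-like.
a = (0.0, -3.0)
b = (1.0, 4.0)
events = lightcone_intersections(a, b)
# The reception event is the solution lying to the future of both emissions.
reception = max(events, key=lambda e: e[0])
print("candidates:", events)
print("reception event:", reception)
```

Only the emission events enter the formula. The velocities of the satellites and of the receiver never appear, in line with items 3 and 4 of the list above.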

Figure 2.12: The world lines of the two satellites are the blue lines. The
intersection of the light cones is the event whose coordinates we seek. Since the orbits
of the satellites are known, the events (a^0, a^1) and (b^0, b^1) are known given the
proper times τ_a and τ_b.

2.9 Space-time near a point mass


Minkowski space-time is a good local approximation of physical space-time. It
fails at cosmological distances, at distant times, and when great accuracy is
needed. In particular, Minkowski is not a sufficiently accurate approximation of
space-time in the theory of GPS (see Neal Ashby in Living Reviews of General
Relativity). A more accurate model of the metric in the vicinity of a point mass M
is
$$(d\ell)^2 = -(1+\Phi)\,(c\,dt)^2 + (1-\Phi)\,d\mathbf{x}\cdot d\mathbf{x}, \qquad \Phi(\mathbf{x}) = -\frac{2GM}{c^2|\mathbf{x}|} \tag{2.27}$$
G is Newton's constant and (t, x) are space-time coordinates. This metric is not
equivalent to Minkowski under a change of coordinates.
A clock at a fixed location ticks at the rate
$$d\tau = \sqrt{1+\Phi}\;dt$$
Far from the star, where Φ = 0, the clock rate coincides with the coordinate
time rate. However, close to the star dτ < dt: the clock ticks more slowly than
the coordinate time. This means that if you move near a very massive star,
you will outlive your friends who stayed far from it. Kip Thorne reinterprets
gravitational attraction as our wish to live longer. On earth, the difference between the
gravitational metric and Minkowski is very small:

Example 2.25. Φ at the surface of the earth is dominated by the pull of the
sun, Φ ≈ −2 × 10⁻⁸. The contribution of the earth's own pull is about an order
of magnitude smaller, Φ ≈ −1.4 × 10⁻⁹. This adds about 1 [sec] to our life
expectancy.
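The two values of Φ can be reproduced from Eq. (2.27). The sketch below uses standard round values for G, the masses and the distances. These constants and the function names are assumptions made for the illustration, not data from the notes.

```python
import math

G = 6.674e-11       # Newton's constant [m^3 kg^-1 s^-2]
C = 2.998e8         # speed of light [m/s]
M_SUN = 1.989e30    # mass of the sun [kg]
M_EARTH = 5.972e24  # mass of the earth [kg]
AU = 1.496e11       # earth-sun distance [m]
R_EARTH = 6.371e6   # mean radius of the earth [m]

def potential(mass, distance):
    """Phi = -2 G M / (c^2 |x|) as defined in Eq. (2.27)."""
    return -2.0 * G * mass / (C**2 * distance)

def clock_rate(phi):
    """d tau / dt = sqrt(1 + Phi) for a clock at a fixed location."""
    return math.sqrt(1.0 + phi)

phi_sun = potential(M_SUN, AU)
phi_earth = potential(M_EARTH, R_EARTH)
print(f"Phi (sun, at the earth's orbit)  = {phi_sun:.2e}")
print(f"Phi (earth, at its surface)      = {phi_earth:.2e}")
print(f"fractional slow-down, sun + earth = {1.0 - clock_rate(phi_sun + phi_earth):.2e}")
```

Since |Φ| is tiny, the clock rate is well approximated by 1 + Φ/2, so the fractional slow-down is about half the sum of the two potentials.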

Bibliography

1. S. Weinberg, Gravitation and Cosmology, Chapters 1 and 2.
2. B. F. Schutz, A first course in general relativity, geometric exposition of relativity.
3. Neal Ashby in Living Reviews of General Relativity.
Chapter 3

The electromagnetic fields

3.1 Electromagnetic fields in Minkowski space


The basic objects of mechanics, velocity and acceleration, can be viewed as the
3-dimensional shadows of 4-vectors in Minkowski space-time. In both cases the
spatial 3-vector gained a zero component. 4-vectors then behave much more
nicely under Lorentz transformations than their 3-dimensional shadows.
What about the electromagnetic fields E and B? What are they shadows
of? From a Euclidean perspective E and B are quite different: the force of
the electric field is independent of the velocity of the particle and the force of
the magnetic field is linear in the velocity^1. In the c.g.s. (Gaussian) units (which
we shall nevertheless use):
$$\mathbf{f} = \underbrace{e\,\mathbf{E}}_{\text{Coulomb}} + \underbrace{\frac{e}{c}\,\mathbf{v}\times\mathbf{B}}_{\text{Lorentz}} \;\Longrightarrow\; f_i = eE_i + \frac{e}{c}\,\varepsilon_{ijk}\,v^jB^k \tag{3.1}$$

Remark 3.1 (SI). In the (defunct) Gaussian (c.g.s.) units the unit of electric
field is rather large, 300 [V/cm], three orders of magnitude larger than the
field near an A battery, and the unit of magnetic field is the Gauss, which is rather
small, on the scale of the earth's magnetic field. In the more practical SI units, which
involve arbitrary constants such as the permittivity ε₀ = 8.85 × 10⁻¹² [F/m] and the
permeability µ₀ = 4π × 10⁻⁷ [H/m], the unit of electric field is almost five orders
of magnitude smaller, 1 [V/m], and the unit of magnetic field is four orders of
magnitude larger: 1 Tesla = 10⁴ Gauss, which is comparable to the field near a
strong toy magnet. In SI units^2 the Coulomb-Lorentz law is
$$\mathbf{f} = e\,(\mathbf{E} + \mathbf{v}\times\mathbf{B})$$
The force of an electric field of 1 [V/cm] and the force of a magnetic field of 1 [Gauss]
have comparable magnitudes at velocities of 1000 [km/sec].
1 Moreover, since f and v are vectors, so is E, and under inversion E ↦ −E. But B is a
pseudo-vector: under inversion B ↦ B.
2 And also if one takes units where c = 1.

The partition into electric and magnetic fields is, of course, different in
different inertial frames^3.
Exercise 3.2 (Galilei invariance). In Newtonian mechanics the force is Galilean
invariant: f = f′ (why?). Show that the transformation rules for the fields under
a Galilean transformation with relative velocity v are
$$\mathbf{E}' = \mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}, \qquad \mathbf{B}' = \mathbf{B} \tag{3.2}$$
The mixing of E and B under the change of inertial frames suggests that
they come from a single entity in space-time. This entity is not a 4-vector
since we need 6 slots and a 4-vector has too few. It is not a general second
rank tensor, since it has 16 components which are too many. It is not even a
symmetric second rank tensor since it has 10 components, still too many. An
anti-symmetric rank 2 tensor
Fµν = −Fνµ (3.3)
has 6 components, which is just right.
Exercise 3.3 (Symmetry is Lorentz invariant). Symmetry is a tensorial invariant,
e.g. if $F_{\mu\nu}$ is anti-symmetric so is $(F')_{\mu\nu} = \Lambda_\mu{}^\alpha\Lambda_\nu{}^\beta F_{\alpha\beta}$ under an arbitrary
change of coordinates (and Lorentz transformations in particular). As a consequence,
if F is anti-symmetric in Cartesian coordinates it is also anti-symmetric
in spherical coordinates.
This leaves us with the problem of how to put the two vector fields (E, B)
in $F_{\mu\nu}$. The quickest way to do this is to write the Coulomb-Lorentz law in a
way that is manifestly Lorentz invariant, i.e. as an identity between 4-vectors,
which reduces to its non-relativistic form in the approximation γ ≈ 1. Since the
Coulomb-Lorentz force is linear in the fields and in the velocity, we can get a
4-force vector by contracting the field tensor with the 4-velocity
$$f_\mu = \frac{e}{c}F_{\mu\nu}u^\nu \;\Longrightarrow\; f_1 = \frac{e}{c}\left(F_{10}u^0 + F_{12}u^2 + F_{13}u^3\right) \tag{3.4}$$
where the term $F_{11}u^1$ drops out since $F_{11} = 0$.
Since u = γ(c, v), in the non-relativistic approximation γ ≈ 1 we identify the
electric field with the first column of F
$$F_{10} = E_1 \;\Longrightarrow\; F_{j0} = E_j \tag{3.5}$$
and the magnetic field with
$$F_{12} = B^3 \;\Longrightarrow\; F_{jk} = \varepsilon_{jki}B^i \tag{3.6}$$
In conclusion the identification of F with E and B is given by
$$F_{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z\\ E_x & 0 & B_z & -B_y\\ E_y & -B_z & 0 & B_x\\ E_z & B_y & -B_x & 0 \end{pmatrix} \tag{3.7}$$
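The identification in Eq. (3.7) can be checked numerically: contracting the matrix with a 4-velocity u = γ(c, v) should give spatial components equal to γ times the Coulomb-Lorentz force of Eq. (3.1). The sketch below does this in Gaussian-like units with made-up field values. All function names and numbers are illustrative assumptions.

```python
import math

def field_tensor(E, B):
    """Covariant F_{mu nu} of Eq. (3.7); rows and columns ordered (0, 1, 2, 3)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]

def four_velocity(v, c):
    """Return gamma and u^mu = gamma (c, v)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / c**2)
    return gamma, [gamma * c] + [gamma * vi for vi in v]

def four_force(e, c, F, u):
    """f_mu = (e/c) F_{mu nu} u^nu, Eq. (3.4)."""
    return [(e / c) * sum(F[mu][nu] * u[nu] for nu in range(4)) for mu in range(4)]

def lorentz_force(e, c, E, B, v):
    """f = e E + (e/c) v x B, Eq. (3.1)."""
    vxB = (v[1] * B[2] - v[2] * B[1],
           v[2] * B[0] - v[0] * B[2],
           v[0] * B[1] - v[1] * B[0])
    return [e * E[i] + (e / c) * vxB[i] for i in range(3)]

# Made-up numbers in arbitrary units, with |v| < c.
e, c = 1.0, 10.0
E = (0.3, -1.2, 0.5)
B = (2.0, 0.1, -0.7)
v = (1.0, 2.0, -0.5)

gamma, u = four_velocity(v, c)
f4 = four_force(e, c, field_tensor(E, B), u)
f3 = lorentz_force(e, c, E, B, v)
print("spatial part of f_mu:", f4[1:])
print("gamma * Lorentz force:", [gamma * x for x in f3])
```

The time component f_0 comes out as −γ(e/c) E·v, which is the power delivered by the field up to sign and a factor of c.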
3 Einstein was led to the discovery of special relativity by considering the symmetry of
Maxwell's equations.

Exercise 3.4 (Mixed components). The matrix associated with the mixed tensor
F is neither symmetric nor anti-symmetric. Verify that
$$F^{\mu}{}_{\nu} = \begin{pmatrix} 0 & E_x & E_y & E_z\\ E_x & 0 & B_z & -B_y\\ E_y & -B_z & 0 & B_x\\ E_z & B_y & -B_x & 0 \end{pmatrix} \tag{3.8}$$

Remark 3.5 (Coordinate free form). In a coordinate free form

F = Fµν eµ ⊗ eν
Eq. 3.4 gives the relativistic Coulomb-Lorentz force 4-vector.

3.1.1 Anti-symmetric tensors describe a pair of vectors


More generally, given any anti-symmetric tensor
$$F = \begin{pmatrix} 0 & x_1 & x_2 & x_3\\ -x_1 & 0 & y^3 & -y^2\\ -x_2 & -y^3 & 0 & y^1\\ -x_3 & y^2 & -y^1 & 0 \end{pmatrix} \tag{3.9}$$
the triplets $(x_1, x_2, x_3)$ and $(y^1, y^2, y^3)$ transform like vectors under 3-dimensional
rotations considered as a subgroup of the Lorentz group.

Euclidean Rotations
Consider a 3×3 rotation matrix R of Euclidean space, so $R^{-1} = R^t$. In Euclidean
space contravariant components are the same as covariant components. The
components of the matrix R are
$$R_j{}^k = (R^t)^k{}_j$$
where in $R_j{}^k$ the index j labels the row and k the column, while in $(R^t)^k{}_j$ the
index k labels the row and j the column.

The lift of R to a rotation in Minkowski space is, in block form,
$$\Lambda_R = \begin{pmatrix} 1 & 0\\ 0 & R \end{pmatrix} \tag{3.10}$$

The (covariant) components $x_j$ transform by
$$(x')_j = F'_{0j} = \Lambda_0{}^\mu\Lambda_j{}^\nu F_{\mu\nu} = R_j{}^k F_{0k} = R_j{}^k x_k \tag{3.11}$$

which is the rule of transformation of Euclidean 3-vectors under rotations. Now
consider the triplet $2y^j = \varepsilon^{jmn}F_{mn}$. The Levi-Civita symbol is invariant under
(proper) rotations since det R = 1. Hence
$$
\begin{aligned}
(2y')^j &= (\varepsilon')^{jmn}(F')_{mn} \\
&= \varepsilon^{jmn}R_m{}^a R_n{}^b F_{ab} \\
&= \varepsilon^{jmn}\varepsilon_{abk}R_m{}^a R_n{}^b\,y^k \\
&= (R^{-1})_k{}^j\,(2y)^k
\end{aligned}
\tag{3.12}
$$
The last step uses the formula for the inverse of a 3 × 3 matrix R
$$(R^{-1})_k{}^j = \frac{1}{2\det R}\,\varepsilon_{kab}\,\varepsilon^{jmn}R_m{}^a R_n{}^b$$
and the fact that det R = 1 for a rotation. It remains to get rid of the inverse.
To do that write
$$(R^{-1})_k{}^j = (R^t)_k{}^j = R^j{}_k = R_j{}^k$$
(in Cartesian coordinates). This gives
$$(2y')^j = (R^t)_k{}^j\,(2y)^k = R^j{}_k\,(2y)^k \tag{3.13}$$

One sees that x and y both transform as vectors. More precisely, x is a vector
while y is a pseudo-vector as the transformation rule relied on the use of Levi-
Civita (and det R = 1).
Example 3.6. The covariant components of the field tensor in cylindrical
coordinates (ct, ρ, φ, z) are
$$F_{\text{cylind}} = \begin{pmatrix} 0 & -E_x c - E_y s & \rho(-E_y c + E_x s) & -E_z\\ \dots & 0 & \rho B_z & -B_y c + B_x s\\ \dots & \dots & 0 & \rho(B_x c + B_y s)\\ \dots & \dots & \dots & 0 \end{pmatrix}$$
where c = cos φ and s = sin φ. In normalized coordinates
$$\begin{pmatrix} 0 & -E_x c - E_y s & -E_y c + E_x s & -E_z\\ \dots & 0 & B_z & -B_y c + B_x s\\ \dots & \dots & 0 & B_x c + B_y s\\ \dots & \dots & \dots & 0 \end{pmatrix}$$

Exercise 3.7 (Magnetic field of a current line). A line of current I along the
z-axis creates a magnetic field $\mathbf{B} = \frac{2I}{c\rho}\,\hat{\boldsymbol\theta}$ in cylindrical coordinates. Show that
in cylindrical coordinates $F_{\rho z} = -F_{z\rho} = \frac{2I}{c\rho}$ and all other components vanish.
(Hint: It is simpler to use properties of the basis vectors $e_\rho$, $e_z$ rather than
compute the transformation matrix.)

3.2 The field of a uniformly moving charge


For a charge e at rest at the origin
$$E_j = e\,\frac{x_j}{r^3}, \qquad B_j = 0 \tag{3.14}$$
Everything is time independent so we can compute this at any time we want.
Consider the Lorentz boost
$$\Lambda = \begin{pmatrix} C & S & 0 & 0\\ S & C & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad C = \cosh\phi, \quad S = \sinh\phi \tag{3.15}$$

where φ is the rapidity connected with the usual γ and β by

γ = cosh φ, β = tanh φ (3.16)

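The two Lorentz scalars are
$$\mathbf{E}^2 - \mathbf{B}^2 = \frac{e^2}{r^4}, \qquad \mathbf{E}\cdot\mathbf{B} = 0 \tag{3.17}$$
The transformation rules are
$$F'_{\mu\nu}(x') = \Lambda_\mu{}^\alpha\Lambda_\nu{}^\beta F_{\alpha\beta}(x) = \Lambda_\mu{}^\alpha\Lambda_\nu{}^\beta F_{\alpha\beta}(\Lambda x') \tag{3.18}$$
In the frame where we see a moving charge, everything depends on time. So let
us compute everything at t′ = 0, when the charge is at the origin. We have
$$x = Cx', \qquad y = y', \qquad z = z' \tag{3.19}$$
In particular
$$
\begin{aligned}
r^2 &= C^2x'^2 + y'^2 + z'^2 = C^2x'^2 + (C^2 - S^2)(y'^2 + z'^2) \\
&= C^2r'^2 - S^2(y'^2 + z'^2)
\end{aligned}
\tag{3.20}
$$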

Let us turn to the fields. $E_x$ does not change:
$$E'_x = F'_{10} = \Lambda_1{}^\alpha\Lambda_0{}^\beta F_{\alpha\beta} = (\Lambda_1{}^1\Lambda_0{}^0 - \Lambda_1{}^0\Lambda_0{}^1)F_{10} = (C^2 - S^2)E_x = E_x \tag{3.21}$$
Hence, at t′ = 0
$$E_x(x) = e\,\frac{x}{r^3} = e\gamma\,\frac{x'}{r^3(r')} = E'_x(x') \tag{3.22}$$
where r(r′) is the ugly expression Eq. (3.20).
For the transverse directions, using B = 0 in the rest frame,
$$E'_y = F'_{20} = \Lambda_2{}^\alpha\Lambda_0{}^\beta F_{\alpha\beta} = \Lambda_0{}^\beta F_{2\beta} = CF_{20} = \gamma E_y \tag{3.23}$$

Figure 3.1: The vector field of a moving charge with rapidity φ = 1. The field
is manifestly radial but not spherically symmetric.

and so
$$E_y(x) = e\,\frac{y}{r^3}, \qquad E'_y(x') = \gamma e\,\frac{y}{r^3} = \gamma e\,\frac{y'}{r^3(r')} \tag{3.24}$$
The formula is the same but for different reasons. In one case γ came from the
field transformation and in the other from the coordinates.
It now follows that in both frames the field is radial, because
$$\frac{E_x(x)}{E_y(x)} = \frac{x}{y}, \qquad \frac{E'_x(x')}{E'_y(x')} = \frac{x'}{y'} \tag{3.25}$$

Remark 3.8. This is a bit surprising. One could have argued that since the
field is radial in the rest frame, you may expect it to point in the direction of
the particle at the retarded time, not now.

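The total strength of the field is
$$\mathbf{E}'^2 = e^2\gamma^2\,\frac{r'^2}{r^6} = E_x^2 + \gamma^2(E_y^2 + E_z^2) \ge \mathbf{E}^2 \tag{3.26}$$
In the plane transverse to the motion, x′ = 0, this reduces to E′² = e²γ²/r⁴ = γ²E².
The field is stronger in the frame where the charge is seen moving (computed for the
same event).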
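The statements of this section can be verified numerically by transforming the whole tensor rather than single components. The sketch below builds the rest-frame Coulomb field at the point given by Eq. (3.19), applies the boost of Eq. (3.15) as printed, and reads off E′ and B′. The helper names, the rapidity and the sample point are choices made for this illustration.

```python
import math

def field_tensor(E, B):
    """Covariant F_{mu nu} with the layout of Eq. (3.7)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, Bz, -By],
            [Ey, -Bz, 0.0, Bx],
            [Ez, By, -Bx, 0.0]]

def boost_x(phi):
    """Boost matrix of Eq. (3.15) with rapidity phi."""
    C, S = math.cosh(phi), math.sinh(phi)
    return [[C, S, 0.0, 0.0], [S, C, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def transform(L, F):
    """F'_{mu nu} = L_mu^alpha L_nu^beta F_{alpha beta}."""
    return [[sum(L[m][a] * L[n][b] * F[a][b] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def split(F):
    """Read E_j = F_{j0} and B from the spatial block, inverting Eq. (3.7)."""
    E = (F[1][0], F[2][0], F[3][0])
    B = (F[2][3], -F[1][3], F[1][2])
    return E, B

def coulomb(e, x):
    """Rest-frame field E_j = e x_j / r^3 of Eq. (3.14)."""
    r3 = sum(xi * xi for xi in x) ** 1.5
    return tuple(e * xi / r3 for xi in x)

def moving_charge_field(e, phi, x_prime):
    """Fields at t' = 0 at the point x_prime in the frame where the charge moves."""
    C = math.cosh(phi)
    x = (C * x_prime[0], x_prime[1], x_prime[2])   # Eq. (3.19)
    E_rest = coulomb(e, x)
    F_prime = transform(boost_x(phi), field_tensor(E_rest, (0.0, 0.0, 0.0)))
    return E_rest, split(F_prime)

e, phi, x_prime = 1.0, 1.0, (1.0, 2.0, 0.5)
E_rest, (E_prime, B_prime) = moving_charge_field(e, phi, x_prime)
print("E  (rest frame)  :", E_rest)
print("E' (moving frame):", E_prime)
print("B' (moving frame):", B_prime)
```

With these inputs E′ is parallel to x′, its x-component equals the rest-frame one, the transverse components are scaled by cosh φ, and E′² − B′² equals the rest-frame E².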

3.3 Lorentz scalars


One can construct interesting Lorentz scalars from the field tensor F. There
are no interesting scalars that are linear in the field at a given point since
$$F_{\mu\nu}\eta^{\mu\nu} = F_\mu{}^\mu = 0$$
We can however construct interesting quadratic scalars:
$$
\begin{aligned}
F_{\mu\nu}F^{\mu\nu} &= F_{0j}F^{0j} + F_{j0}F^{j0} + F_{jk}F^{jk} \\
&= -2E_jE_j + 2B_iB_i \\
&= -2(\mathbf{E}^2 - \mathbf{B}^2)
\end{aligned}
\tag{3.27}
$$

You might be worried that we have made a sign error: E² + B² is proportional to
the energy density of the field. Why the minus sign? Actually, it is fortunate
that we did not get the energy density, for, as we shall see, the energy density is
a component of a second rank tensor.
The minus sign is reminiscent of the minus sign you find in Lagrangian
mechanics: The Lagrangian is the difference of the kinetic and potential energies,
not their sum. As we shall see, this is not a coincidence.

3.3.1 Duality and Levi-Civita


Duality, denoted by ∗, is an operation whose square is the identity: ∗∗ = 1.
Taking a dual involves no loss of information. The Levi-Civita tensor allows us
to define such a duality between anti-symmetric tensors^4. In 2 dimensions, the
duality is between anti-symmetric tensors and scalars:
$$\omega^* = \frac{1}{2}\varepsilon^{jk}\omega_{jk} \iff (\omega^*)_{jk} = \varepsilon_{jk}\,\omega \tag{3.28}$$
and in 3 dimensions between vectors and anti-symmetric rank-2 tensors:
$$(\omega^*)^i = \frac{1}{2}\varepsilon^{ijk}\omega_{jk} \tag{3.29}$$
In 4 dimensions it is between anti-symmetric tensors and corresponds to
exchanging the two 3-vectors associated to the tensor:
$$(F^*)^{\mu\nu} = \frac{1}{2}\varepsilon^{\mu\nu\alpha\beta}F_{\alpha\beta} \tag{3.30}$$
Remark 3.9. In any dimension there is a duality between anti-symmetric tensors
of rank r and anti-symmetric tensors of rank n − r. Anti-symmetric tensors
of rank r make a linear space whose dimension is $\binom{n}{r}$, and since $\binom{n}{r} = \binom{n}{n-r}$ the
two linear spaces are isomorphic. The contraction of the Levi-Civita with an
anti-symmetric tensor of rank r gives an anti-symmetric tensor of rank n − r.

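Exercise 3.10. Show that

1.
$$\varepsilon^{\alpha\beta\gamma\delta}\varepsilon_{\gamma\delta\mu\nu} = 2\left(\delta^\alpha{}_\mu\,\delta^\beta{}_\nu - \delta^\alpha{}_\nu\,\delta^\beta{}_\mu\right)$$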
4 For the sake of simplicity, we shall take | det g| = 1 so that Levi-Civita tensor and symbol

coincide. In general, duality depends on the metric as it involves det g.



2. Using this show that


F ∗∗ = F

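Duality effectively interchanges E and −B.
$$(F^*)^{0i} = \tfrac{1}{2}\varepsilon^{0ijk}F_{jk} = \varepsilon^{ijk}F_{jk} = B^i, \qquad \{ijk\} \in \text{even permutation of } \{1, 2, 3\}.$$
Similarly,
$$(F^*)^{jk} = \tfrac{1}{2}\varepsilon^{jk\alpha\beta}F_{\alpha\beta} = \varepsilon^{jk0i}F_{0i} = \varepsilon^{0jki}F_{0i} = -\varepsilon^{jki}E_i$$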

Exercise 3.11. Write the formulas in this section without assuming | det g| = 1.

3.3.2 Second Lorentz scalar


We can construct a second Lorentz scalar by contracting F with its dual F ∗

(F ∗ )µν Fµν = 2(F ∗ )0j F0j + (F ∗ )jk Fjk = −4E · B (3.31)
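Both quadratic scalars, Eqs. (3.27) and (3.31), can be cross-checked by brute force. The sketch below assumes the conventions used in the text: the covariant F of Eq. (3.7), the metric diag(−1, 1, 1, 1) for raising indices, ε^{0123} = +1, and the dual of Eq. (3.30). The function names and the sample field values are illustrative.

```python
def levi_civita(*idx):
    """Sign of the permutation idx of (0, ..., n-1); 0 if an index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    perm = list(idx)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:          # sort by swaps, flipping the sign each time
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def field_tensor(E, B):
    """Covariant F_{mu nu} of Eq. (3.7)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, Bz, -By],
            [Ey, -Bz, 0.0, Bx],
            [Ez, By, -Bx, 0.0]]

ETA = (-1.0, 1.0, 1.0, 1.0)   # diagonal of the Minkowski metric
R4 = range(4)

def raise_indices(F):
    """F^{mu nu} = eta^{mu mu} eta^{nu nu} F_{mu nu} for a diagonal metric."""
    return [[ETA[m] * ETA[n] * F[m][n] for n in R4] for m in R4]

def dual(F):
    """(F*)^{mu nu} = (1/2) eps^{mu nu alpha beta} F_{alpha beta}, Eq. (3.30)."""
    return [[0.5 * sum(levi_civita(m, n, a, b) * F[a][b] for a in R4 for b in R4)
             for n in R4] for m in R4]

def contract(X_up, F_down):
    """Full contraction X^{mu nu} F_{mu nu}."""
    return sum(X_up[m][n] * F_down[m][n] for m in R4 for n in R4)

E = (1.0, 2.0, 3.0)
B = (4.0, 5.0, 6.0)
F = field_tensor(E, B)
first_scalar = contract(raise_indices(F), F)    # expected -2 (E^2 - B^2)
second_scalar = contract(dual(F), F)            # expected -4 E.B
print(first_scalar, second_scalar)
```

For the sample values E² − B² = 14 − 77 and E·B = 32, so the two contractions should come out as 126 and −128.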

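Electromagnetic fields, up to scaling, give nine equivalence classes:
$$\mathbf{E}^2 - \mathbf{B}^2 \;\begin{cases} > 0\\ = 0\\ < 0 \end{cases} \qquad\qquad \mathbf{E}\cdot\mathbf{B} \;\begin{cases} > 0\\ = 0\\ < 0 \end{cases}$$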

For example, if there is electric field and no magnetic field in one frame, then in
any other frame E2 − B2 > 0 and E · B = 0 etc. Similarly, the field of a plane
electromagnetic wave has E2 − B2 = E · B = 0, in any frame.

Exercise 3.12. Prove or give a counterexample to: E² − B² > 0 and E · B = 0
imply that there is a Lorentz frame where B = 0.

Exercise 3.13. How does the formula for E · B change when | det g| ≠ 1?

3.4 The homogeneous Maxwell equations


The two homogeneous Maxwell's equations are
$$\nabla\cdot\mathbf{B} = 0, \qquad \partial_0\mathbf{B} + \nabla\times\mathbf{E} = 0, \qquad \partial_0 = \frac{1}{c}\partial_t \tag{3.32}$$
The first says that there are no magnetic monopoles and the second is the
Faraday law of induction. Together, they give 4 equations.
The equations have a coordinate independent form but assume an inertial frame.
This means that the metric tensor in space-time has the form
$$g = \begin{pmatrix} -1 & 0\\ 0 & g_3 \end{pmatrix} \tag{3.33}$$
where $g_3$ is a time-independent 3 × 3 positive matrix.


Since I do not wish to enter into writing equations with covariant derivatives,
we shall proceed assuming Cartesian coordinates, i.e. assuming $g_3 = 1$.
The two equations 3.32 have different character. Faraday's law is an evolution
equation that dictates how B evolves in time, while the no-monopole condition
is not an evolution equation. It can be viewed as a constraint on
the admissible magnetic fields at any given time. The two equations are not
independent, and must be consistent in the sense that if B starts divergence-less,
it must evolve in a way that it stays divergence-less. This is indeed the
case:
$$\partial_0\left(\nabla\cdot\mathbf{B}\right) = -\nabla\cdot\left(\nabla\times\mathbf{E}\right) = 0 \tag{3.34}$$
The no-monopole condition may therefore be viewed as a constraint on the
initial data.

3.4.1 No monopoles
The absence of magnetic monopoles is expressed by

∇ · B = ∂j (B j ) = 0 (3.35)

This can be written in terms of F as

0 = ∂j (B j ) = ∂j (F ∗ )0j = ∂µ (F ∗ )0µ = ∂µ (F ∗ )µ0 (3.36)

Exercise 3.14. Generalize this to | det g| ≠ 1.

3.4.2 Faraday law


Faraday's law of induction, written in components, is
$$0 = \partial_0B^i + \varepsilon^{ijk}\partial_jE_k = \partial_\mu(F^*)^{\mu i} \tag{3.37}$$

3.4.3 Amalgamating the homogeneous Maxwell equations


The 4 homogeneous Maxwell equations can therefore be amalgamated into a
single equation for a vector field

∂µ (F ∗ )µν = 0 (3.38)

This can be phrased as a conservation law for F∗, or as the statement that F∗
is divergence free.

3.5 Potentials
The homogeneous Maxwell’s equations can be rephrased as the statement that
the fields are derived from potentials. B is derived from the vector potential A:

∇ · B = 0 ⇐⇒ B = ∇ × A (3.39)
Similarly, ∂0 A + E being derived from a scalar potential φ is a consequence of
combining Faraday with the no-monopole condition:

0 = ∂0 B + ∇ × E = ∇ × (∂0 A + E) ⇐⇒ ∂0 A + E = −∇φ (3.40)

You normally see this equation rearranged so the field is on one side and the
potentials on the other side.

3.5.1 The 4-potential


We want to amalgamate (φ, A) into a single 4-vector A. Since F is an anti-symmetric
tensor which is related to derivatives of the potential, a good starting
point is
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{3.41}$$
Comparing with Eq. 3.39 we see that $\mathbf{A} = (A_1, A_2, A_3)$, and comparing with
Eq. 3.40 we find $A_0 = -\phi$. In summary
$$A_\mu = (-\phi, \mathbf{A}) \tag{3.42}$$

We can now reformulate the 4 homogeneous Maxwell’s equations concisely as

∂α (F ∗ )αβ = 0 (3.43)

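Indeed, this follows from
$$2\partial_\alpha(F^*)^{\alpha\beta} = \varepsilon^{\alpha\beta\mu\nu}\partial_\alpha F_{\mu\nu} = \varepsilon^{\alpha\beta\mu\nu}\left(\partial_\alpha\partial_\mu A_\nu - \partial_\alpha\partial_\nu A_\mu\right) \tag{3.44}$$
and observing that the contraction of a symmetric object with an anti-symmetric
one vanishes:
$$\varepsilon^{\alpha\beta\mu\nu}\partial_\alpha\partial_\mu = 0 \tag{3.45}$$
Eq. 3.43 follows.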
Exercise 3.15 (Bianchi identity). Show that the 4 equations (3.43) can also be
written as
$$\partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha} + \partial_\gamma F_{\alpha\beta} = 0$$
In mechanics, the price for using a non-inertial coordinate system, such as one
attached to the earth, is the emergence of new forces: Coriolis and centrifugal.
You may wonder if there is an analog in electrodynamics. The next exercise
explains why there is none.
Exercise 3.16. Consider the coordinate system attached to the rotating earth
introduced in the previous chapter.
1. Compute det g for the earth rotating coordinate system. (Answer: det g =
−1)
2. What does this imply about the homogeneous Maxwell equations in the
earth frame?
Exercise 3.17 (Harmonic A). Given the vector potential
$$A_\mu(x) = A_\mu\,e^{ik_\nu x^\nu}$$
show that
$$F_{\mu\nu}F^{\mu\nu} = -2\left(k_\mu k^\mu\,A_\nu A^\nu - (k_\mu A^\mu)^2\right), \qquad F_{\mu\nu}(F^*)^{\mu\nu} = 0$$
It follows that if $k^\mu$ is light-like the first term in the brackets on the right vanishes.
The second term in the brackets vanishes if k and A are Lorentz orthogonal.

3.6 Gauge transformations


The vector potential
$$A_\mu = \partial_\mu\Lambda \tag{3.46}$$
for any (differentiable) function $\Lambda(x^0, \dots, x^3)$ is called a pure gauge. The
associated field F vanishes identically:
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \partial_\mu\partial_\nu\Lambda - \partial_\nu\partial_\mu\Lambda = 0 \tag{3.47}$$
Since the (linear) mapping A ↦ F has a large kernel, the pure gauge fields,
it has no inverse and F does not determine the 4-potential A uniquely. For any
(scalar) function Λ of space-time, the potentials
$$A_\mu \to A_\mu + \partial_\mu\Lambda \tag{3.48}$$
give the same field F. Fields have a direct physical meaning: they can be measured
as forces. Potentials are tools for computations and no physical instrument
measures potentials. In particular, a voltmeter does not measure φ.
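Gauge invariance of F can also be seen numerically. The sketch below computes F from a potential by central finite differences, once for an arbitrary smooth A and once for A shifted by the gradient of an arbitrary Λ, and compares the two. The particular A, Λ, step size and sample event are invented for the illustration. Agreement is only up to finite-difference error.

```python
import math

def partial(f, mu, x, h=1e-4):
    """Central finite difference of the scalar function f along x^mu."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def field(A, x):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu at the event x, Eq. (3.41)."""
    return [[partial(lambda y: A(y)[nu], mu, x) - partial(lambda y: A(y)[mu], nu, x)
             for nu in range(4)] for mu in range(4)]

def A(x):
    """An arbitrary smooth potential, chosen only for illustration."""
    return [-x[1] ** 2, x[0] * x[2], math.sin(x[3]), x[1] * x[2]]

def gauge(x):
    """An arbitrary gauge function Lambda(x)."""
    return x[0] * x[1] + math.cos(x[2]) * x[3]

def A_gauged(x):
    """A_mu + d_mu Lambda, Eq. (3.48), with a numerical gradient."""
    return [A(x)[mu] + partial(gauge, mu, x) for mu in range(4)]

x0 = [0.3, -1.1, 0.7, 2.0]
F1 = field(A, x0)
F2 = field(A_gauged, x0)
max_diff = max(abs(F1[m][n] - F2[m][n]) for m in range(4) for n in range(4))
print("largest change in F under the gauge transformation:", max_diff)
```

The potentials themselves differ by an amount of order one at this event, while the fields agree to within the numerical error of the differencing.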

3.6.1 Non-local gauge invariants Lorentz scalars


Figure 3.2: Loop and surface element for Stokes.
Although A is not gauge invariant, the line integral of A over a closed loop in
space-time is a gauge invariant scalar. By Stokes:
$$\oint A_\mu\,dx^\mu = \int F_{\mu\nu}\,dS^{\mu\nu}$$
where dS is the area element of a surface spanned by the loop. Both sides are manifestly
Lorentz scalars, and the right hand side is manifestly gauge invariant.
A familiar special case of the formula is a closed loop at fixed time, $dx^\mu = (0, dx^k)$,
where
$$\oint A_k\,dx^k = \oint \mathbf{A}\cdot d\boldsymbol{\ell} = \int \nabla\times\mathbf{A}\cdot d\mathbf{S} = \int \mathbf{B}\cdot d\mathbf{S} = \Phi$$
gives the magnetic flux through the loop.


Exercise 3.18. Suppose that γ is a curve in Euclidean space and consider the
surface S spanned by the curve for $t \in [0, t_0]$. Show that $\int F\,dS$ is the emf
action, i.e. $\int E_{emf}\,dt$.
Quantum mechanics gives a fundamental unit of magnetic flux, Φ0 = 2 ×
10⁻¹⁵ [Weber]. In SI, [Weber] = [ħ/e]. The earth's magnetic field, about 0.5 [Gauss],
threads one flux quantum through an area of roughly 40 [µm²]. So on the scale of a
bacterium a flux quantum is a natural unit of magnetic flux. It is interesting, and even
mysterious, that when quantum mechanics meets special relativity, the Lorentz
scalars give rise to quantized objects. Here are some examples where Φ0 shows
up: it is the quantized magnetic flux in vortices that thread a superconductor;
the charge of magnetic monopoles; and it is the emf action $\int E\,dt$ that shows
up in the quantum Hall effect.
Exercise 3.19. What is Φ0 = 2 × 10⁻¹⁵ [Weber] in c.g.s., in terms of ħ, e and c?

3.7 Electromagnetic fields in curvilinear coordinates
In the case that we want to describe the electromagnetic fields in non-Cartesian,
curvilinear coordinates we need to pay attention to the distinction between
covariant and contravariant components of E and B and between the Levi-Civita
tensor and symbol. In particular we need to adjust Eq. 3.1
$$f_i = eE_i + \frac{e}{c}\sqrt{|g|}\,\varepsilon_{ijk}v^jB^k \tag{3.49}$$
where g is as in Eq. 3.33. It follows that
$$F_{j0} = E_j, \qquad F_{jk} = \sqrt{|g|}\,\varepsilon_{ijk}B^i \tag{3.50}$$

We retain Eq. 3.41
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{3.51}$$
and define the dual so that F∗ is a tensor (not a density). This means that we
replace Eq. 3.30 by
$$(F^*)^{\mu\nu} = \frac{1}{2\sqrt{|g|}}\,\varepsilon^{\mu\nu\alpha\beta}F_{\alpha\beta} \tag{3.52}$$
From the rhs of Eq. 3.44 we have the Bianchi-like identity
$$0 = \partial_\mu\left(\sqrt{|g|}\,(F^*)^{\mu\nu}\right) \tag{3.53}$$
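The 0-component reads
$$
\begin{aligned}
0 &= \partial_\mu\left(\sqrt{|g|}\,(F^*)^{0\mu}\right) \\
&= \partial_i\left(\sqrt{|g|}\,(F^*)^{0i}\right) = \frac{1}{2}\partial_i\left(\varepsilon^{0ijk}F_{jk}\right) \\
&= \frac{1}{2}\partial_i\left(\varepsilon^{ijk}F_{jk}\right) \\
&= \partial_i\left(\sqrt{|g|}\,B^i\right) \\
&= \sqrt{|g|}\;\nabla\cdot\mathbf{B}
\end{aligned}
\tag{3.54}
$$
which is, as before, the no-monopole condition, now in curvilinear coordinates.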


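Similarly, for the i-th spatial component
$$
\begin{aligned}
0 &= \partial_\mu\left(\sqrt{|g|}\,(F^*)^{i\mu}\right) \\
&= \partial_0\left(\sqrt{|g|}\,(F^*)^{i0}\right) + \partial_j\left(\sqrt{|g|}\,(F^*)^{ij}\right) \\
&= \frac{1}{2}\partial_0\left(\varepsilon^{i0jk}F_{jk}\right) + \partial_j\left(\varepsilon^{ij0k}F_{0k}\right) \\
&= -\frac{1}{2}\partial_0\left(\varepsilon^{ijk}F_{jk}\right) + \varepsilon^{ijk}\partial_jF_{0k} \\
&= -\partial_0\left(\sqrt{|g|}\,B^i\right) - \varepsilon^{ijk}\partial_jE_k \\
&= -\sqrt{|g|}\,\left(\partial_0\mathbf{B} + \nabla\times\mathbf{E}\right)^i
\end{aligned}
\tag{3.55}
$$
This gives Faraday's law. In the last line we used $(\nabla\times\mathbf{E})^i = \varepsilon^{ijk}\partial_jE_k/\sqrt{|g|}$ and
the fact that curvilinear coordinates for Maxwell's equations in an inertial frame
have a g which is time-independent.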
It is quite remarkable that Electrodynamics in curvilinear coordinates can
be formulated with no reference to Christoffel symbols.
Bibliography
1. F. Hehl and Y. Obukhov, A gentle introduction into the foundations of
electrodynamics
Chapter 4

Variational principle

The equations of motion of a relativistic (charged) particle are formulated as a
variational principle.

4.1 Physics is where the action is


Lagrangian mechanics reformulates Newtonian mechanics: Newton's equations
are interpreted as the Euler-Lagrange equations, satisfied by the minimizers of
the action. The advantages of the Lagrangian formulation are

• It spares the trouble of finding the components of possibly complicated
vectors by giving the problem a scalar formulation involving the action.

• It is manifestly independent of the choice of coordinates, and allows for
using even non-inertial coordinate systems (and automatically takes care
of fictitious, d'Alembert, forces).

• It is convenient for handling symmetries and the associated conservation
laws.

• It can be generalized to the theory of Electromagnetism, to Gravitation
and Quantum Mechanics. Essentially, all known theories admit a Lagrangian
formulation.

• It is elegant.

The property of being a minimizer does not depend on the choice of coordinates,
and so the Lagrangian formulation guarantees the tensorial character
of the equations. Since Lorentz transformations are special coordinate
transformations of Minkowski space, a theory is guaranteed to be Lorentz invariant
once the action is a Lorentz scalar.
In this section, we shall see how one formulates relativistic mechanics using
Lagrangian formalism. The algorithm for doing that is:


Figure 4.1: The minimum at the point p does not depend on what coordinates
you choose for the x axis.

• Find a Lorentz scalar that could serve as an action

• Verify that in the non-relativistic limit the action reduces to its Newtonian
form

This does not fix the relativistic action uniquely. This is a feature that allows
for creativity to be involved.

Figure 4.2: The world line is required to have time-like tangents–for the action
to be real. The blue curve represents the variation of a world line with fixed end
points. The black line maximizes the proper-time τ . The red lines are light-like.
The red path between the end points has zero proper-time.

4.1.1 Action for a free massive particle


Massive particles move at subluminal speed. A natural Lorentz scalar associated
to a path in space-time of such a particle is the elapsed proper time τ. We could
have used the proper time as a candidate for the relativistic action. However,
we also want the relativistic action to reduce to the non-relativistic action for
slow particles. So, we need to massage the proper time a bit.
We start by fixing the dimensions. Traditionally, the action has units of
[p][x], so we append the scalars c and m to the proper time to fix the dimension:
$$S_p = -mc^2\int\underbrace{d\tau}_{\text{proper time}} = \int L\,\underbrace{dt}_{\text{coordinate time}} \tag{4.1}$$

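The Lagrangian
$$
\begin{aligned}
L &= -\frac{mc^2}{\gamma} \\
&= -mc^2\left(1 - \mathbf{v}^2/c^2\right)^{1/2} \\
&\approx \underbrace{-mc^2}_{\text{const}} + \underbrace{\frac{m}{2}\,\mathbf{v}\cdot\mathbf{v}}_{\text{kinetic energy}}
\end{aligned}
\tag{4.2}
$$
This is the same as the classical Lagrangian of a free particle, up to a (large)
negative constant −mc². Adding a constant to the action does not affect the
minimizing path, of course.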
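The quality of the non-relativistic approximation in Eq. (4.2) is easy to probe numerically. The sketch below compares −mc²/γ with −mc² + mv²/2 in units m = c = 1 and checks that the leading discrepancy is the next term of the Taylor series, mv⁴/(8c²). The function names and the sample speed are choices made for the illustration.

```python
import math

def lagrangian_relativistic(m, v, c):
    """L = -m c^2 / gamma = -m c^2 sqrt(1 - v^2/c^2)."""
    return -m * c**2 * math.sqrt(1.0 - (v / c) ** 2)

def lagrangian_newtonian(m, v, c):
    """Constant -m c^2 plus the kinetic energy m v^2 / 2."""
    return -m * c**2 + 0.5 * m * v**2

m, c, v = 1.0, 1.0, 0.01   # a slow particle, v/c = 1%
diff = lagrangian_relativistic(m, v, c) - lagrangian_newtonian(m, v, c)
print("relativistic L :", lagrangian_relativistic(m, v, c))
print("Newtonian L    :", lagrangian_newtonian(m, v, c))
print("difference     :", diff, " vs  m v^4 / (8 c^2) =", m * v**4 / (8 * c**2))
```

The difference scales like (v/c)⁴, which is why the Newtonian Lagrangian is so accurate for everyday speeds.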
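Exercise 4.1. In rotating (earth) coordinates, section 2.7,
$$
\begin{aligned}
\frac{1}{\gamma} &\approx 1 - \frac{1}{2c^2}\left(\mathbf{v}^2 + \Omega^2\rho^2 + 2\Omega\dot\phi\rho^2\right) \\
&= 1 - \frac{1}{2c^2}\Big(\underbrace{\mathbf{v}^2}_{\text{kinetic}} + \underbrace{(\boldsymbol\Omega\times\mathbf{x})^2}_{\text{centrifugal}} + \underbrace{2\,\boldsymbol\Omega\cdot\mathbf{x}\times\mathbf{v}}_{\text{Coriolis}}\Big)
\end{aligned}
$$
Use this to derive the equations of motion of a free particle in a rotating frame.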

4.1.2 Interaction with the electromagnetic field


Consider now a particle that interacts with an external electromagnetic field.
In classical mechanics we describe the Coulomb and Lorentz forces by adding
to the action two terms
$$S_{int} = -e\int\phi(x)\,dt + \frac{e}{c}\int\mathbf{A}(x)\cdot\mathbf{v}\,dt \tag{4.3}$$
where φ is the scalar potential and A the vector potential (and I have chosen
c.g.s. units). We can write the action as a Lorentz scalar by introducing the
Minkowski gauge field of Eq. 3.42
$$A_\mu = (-\phi(x), \mathbf{A}(x)) \tag{4.4}$$
With $dx^\mu = (c\,dt, d\mathbf{x})$ the interaction is the Lorentz scalar
$$S_{int} = \frac{e}{c}\int A_\mu\,dx^\mu \tag{4.5}$$

4.1.3 Gauge invariance


Since the action is constructed from the potentials, it is not manifestly gauge
invariant. One needs to worry about gauge invariance. Under a change of gauge
$$A'_\mu = A_\mu - \partial_\mu\Lambda \tag{4.6}$$
This leads to
$$S'_{int} = S_{int} - \frac{e}{c}\int(\partial_\mu\Lambda)\,dx^\mu = S_{int} - \frac{e}{c}\left(\Lambda(x_f) - \Lambda(x_i)\right) \tag{4.7}$$
Although $S_{int}$ changes under a gauge transformation, the change is fully determined
by the end points of the orbit. Variations of the orbit that keep the
endpoints fixed do not affect this gauge-dependent term. This guarantees that the
Euler-Lagrange equations are gauge invariant.

4.1.4 Euler-Lagrange
S is a Lorentz scalar, of course. If we want to use the ready-made Euler-Lagrange
equations, familiar from mechanics,
$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}} = \frac{\partial L}{\partial\mathbf{x}} \tag{4.8}$$
we need to write
$$S = \int L\,dt, \qquad L = L(\mathbf{x}, \mathbf{v}) \tag{4.9}$$
The relativistic Lagrangian L is given by
$$\gamma L = -mc^2 + \frac{e}{c}A_\mu u^\mu \tag{4.10}$$
When the Euler-Lagrange equations are applied to L one finds the correct relativistic
equations of motion. But the formulas above suffer from the deficiency that the
formulation is not covariant: γL is a scalar, but L is not. This reflects the choice
of lab-time in Eq. 4.9. For this reason we shall re-derive the variational equations
in a covariant fashion in the next section.

4.2 Variation of the action


The action is a function on paths: It associates a number with a given path
xµ (τ ). We may think of the path as parametrized by its proper time. The
action clearly the form Z xf
S= f (x, u)dτ (4.11)
xi

In fact
e
f (x, u) = −mc2 + Aµ (x)uµ
c
4.2. VARIATION OF THE ACTION 65

The end points (events) $x_i$ and $x_f$ are fixed.

We shall denote the variation of the path by

$$\delta x = \underbrace{\{\delta x^0(\tau), \dots, \delta x^3(\tau)\}}_{\text{infinitesimal functions}}$$

$\delta x$ vanishes at the end points. The strategy one uses to derive the
Euler-Lagrange equations is to use integration by parts to bring $\delta S$ to the form

$$\delta S = h_\mu(x, u)\,\delta x^\mu \Big|_{x_i}^{x_f} + \int_{x_i}^{x_f} g_\mu(x, u, \dot u)\,\delta x^\mu\, d\tau$$

Since $\delta x$ vanishes at the end points, the boundary term drops. And since the
$\delta x^\mu$ are arbitrary (see footnote 1), the action is extremal if

$$g_\mu(x, u, \dot u) = 0$$

These are Euler-Lagrange equations and the derivation is manifestly covariant.


This discussion masks the subtlety that a variation of the path entails a variation
of the proper-time, which is the integration variable.

4.2.1 Variation of the action of a free particle


When the path is varied, the proper time $d\tau$ is affected. The variation in the
proper time is given by

$$\delta(c\,d\tau)^2 = 2c^2\, d\tau\, \delta(d\tau) = \delta(-dx_\mu dx^\mu) = -2\,dx_\mu\, \delta(dx^\mu) \tag{4.12}$$

Dividing by $2\,d\tau$ we can write this in terms of the 4-velocity as

$$c^2\,\delta(d\tau) = -u_\mu\, \delta(dx^\mu) = -d\big(u_\mu\, \delta x^\mu\big) + du_\mu\, \delta x^\mu$$




The variation of $S_p$ for a free particle is then

$$-\delta S_p = -m\, u_\mu\, \delta x^\mu \Big|_{\text{end pts}} + m\int \dot u_\mu\, \delta x^\mu\, d\tau \tag{4.13}$$

The Euler-Lagrange equations follow by dropping the boundary term and requiring
that the variation vanish. This gives
$$m\dot u_\mu = 0 \tag{4.14}$$
Recall that the momentum in classical mechanics is defined as the rate of change
of the action due to a change of the end point of a classical path. This means that
we look at the boundary term when we set $m\dot u = 0$ in the integral in Eq. 4.13,
namely
$$\delta S_p = p_\mu\, \delta x^\mu = m\, u_\mu\, \delta x^\mu \tag{4.15}$$
The covariant components of the momentum are the gradient of the action with
respect to the end points.
1 Viewed as functions of τ the variations satisfy the constraint: (dδxµ )(dδxµ ) = −(cdτ )2

4.2.2 Variation of the action associated to interaction


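For $S_{int}$ write

$$\delta(A_\mu dx^\mu) = (\delta A)_\mu\, dx^\mu + A_\mu\, \delta(dx)^\mu = (\partial_\nu A_\mu)\,\delta x^\nu\, dx^\mu + A_\mu\, \delta(dx)^\mu$$

The basic idea in the calculus of variations is always to use integration by parts
to get rid of terms of the form $\delta\,dx$. Hence, rewrite the last term

$$A_\mu\, \delta(dx)^\mu = d\big(A_\mu \delta x^\mu\big) - (dA)_\mu\, \delta x^\mu = d\big(A_\mu \delta x^\mu\big) - (\partial_\nu A_\mu)\, dx^\nu\, \delta x^\mu$$

Combining the two expressions and changing summation indices where needed
we find

$$\begin{aligned}
\delta(A_\mu dx^\mu) &= d\big(A_\mu \delta x^\mu\big) + (\partial_\mu A_\nu)\,\delta x^\mu\, dx^\nu - (\partial_\nu A_\mu)\, dx^\nu\, \delta x^\mu \\
&= d\big(A_\mu \delta x^\mu\big) + \big(\partial_\mu A_\nu - \partial_\nu A_\mu\big)\, u^\nu\, d\tau\, \delta x^\mu \\
&= d\big(A_\mu \delta x^\mu\big) + F_{\mu\nu}\, u^\nu\, d\tau\, \delta x^\mu
\end{aligned}$$

Hence,
$$\delta S_{int} = \frac{e}{c}A_\mu\, \delta x^\mu \Big|_{\text{bdry}} + \frac{e}{c}\int F_{\mu\nu}\, u^\nu\, \delta x^\mu\, d\tau \tag{4.16}$$
where we introduced the second rank tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ to
describe the electromagnetic fields E and B.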

4.2.3 Euler-Lagrange equation


The variation of the total action vanishes for any $\delta x^\mu$ provided the integrands in
$\delta S_p$ and $\delta S_{int}$ add to zero. This gives the Euler-Lagrange equation
$$\dot p_\mu = m\,\dot u_\mu = \frac{e}{c}F_{\mu\nu}u^\nu \tag{4.17}$$
that we guessed in the previous section.
Exercise 4.2 (Charged particle in constant fields). Solve the equations of
motion of a charged particle in constant parallel electric and magnetic fields.
Exercise 4.3 (Charged particle in a radiation field). Show that the equations
of motion of a charged particle in the radiation field of a circularly polarized
plane wave admit solutions that are circular orbits in the plane orthogonal to the
direction of propagation of the light.

4.2.4 The non-relativistic limit


It is also easy to check that Eq. 4.17 reduces to the standard formulas of
non-relativistic classical mechanics. A slow particle has

$$u^\mu \approx (c, \mathbf{v}), \qquad \dot u^j = a^j$$

(recall that $\eta = \mathrm{diag}(-1, 1, 1, 1)$). The Euler-Lagrange equations reduce to
Newton's equations of motion
$$m\mathbf{a} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{B} \tag{4.18}$$
provided we identify
$$E_j = F_{j0}, \qquad B_i = \frac{1}{2}\varepsilon_{ijk}F_{jk} \tag{4.19}$$
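The identification of E and B with components of the field tensor can be checked numerically. The following is a minimal sketch, assuming the inverse relation $F_{jk} = \varepsilon_{jki}B_i$ (which follows from $B_i = \tfrac12\varepsilon_{ijk}F_{jk}$) and the slow-particle 4-velocity $u^\nu = (c, \mathbf{v})$. The function names and the sample numbers are illustrative only.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def field_tensor(E, B):
    """Covariant F_{mu nu} with F_{j0} = E_j and F_{jk} = eps_{jki} B_i."""
    F = [[0.0] * 4 for _ in range(4)]
    for j in range(3):
        F[j + 1][0] = E[j]
        F[0][j + 1] = -E[j]  # antisymmetry
        for k in range(3):
            F[j + 1][k + 1] = sum(levi_civita(j, k, i) * B[i] for i in range(3))
    return F


def force_from_tensor(e, c, E, B, v):
    """Spatial components of (e/c) F_{j nu} u^nu with u^nu = (c, v)."""
    F = field_tensor(E, B)
    u = [c, v[0], v[1], v[2]]
    return [(e / c) * sum(F[j + 1][nu] * u[nu] for nu in range(4)) for j in range(3)]


def lorentz_force(e, c, E, B, v):
    """Three-dimensional form e E + (e/c) v x B."""
    v_cross_b = [v[1] * B[2] - v[2] * B[1],
                 v[2] * B[0] - v[0] * B[2],
                 v[0] * B[1] - v[1] * B[0]]
    return [e * E[j] + (e / c) * v_cross_b[j] for j in range(3)]


# Arbitrary sample values, in arbitrary units.
sample = dict(e=2.0, c=10.0, E=[1.0, 2.0, 3.0], B=[0.5, -1.0, 2.0], v=[0.1, 0.2, -0.3])
covariant_result = force_from_tensor(**sample)
three_d_result = lorentz_force(**sample)
```

The two lists agree component by component, which is the content of the identification in Eq. 4.19.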

4.2.5 The minimiser of the action


In general the action may not have a reasonable minimizer and when it does,
the minimizer need not be unique. The (non-relativistic) harmonic oscillator is
an example. The Lagrangian is
$$2L = \dot x^2 - x^2 \tag{4.20}$$
Consider paths that start and terminate at the origin in half the period, t = π.
The Euler-Lagrange equation is satisfied by
$$x(t) = A \sin t$$
for arbitrary amplitude A. All solve the equation of motion ẍ = −x and so are
local minimizers. For all of these the action vanishes:
$$S = \int_0^\pi L\,dt = \frac{A^2}{2}\int_0^\pi (\cos^2 t - \sin^2 t)\,dt = 0$$

The action has a positive contribution from the kinetic energy and a negative
contribution from the potential energy. Each of the two can be arbitrarily large.
This suggests that S is actually unbounded below when considered as a function
of paths. This is indeed the case.

Figure 4.3: There are infinitely many paths connecting the origin when the time
difference is half the period. But there is no honest minimizer connecting the
origin to any other point on the red line at half the period. The minimizer ”goes
through infinity”.

To see this consider paths that connect x = 0 at time t = 0 with $x_0$ at time
t = π. A family of such paths is
$$x(t) = A\sin\omega t, \qquad \omega = 1 - \frac{1}{\pi}\arcsin(x_0/A)$$

The action is
$$S = \int_0^\pi L\,dt = \frac{A^2}{2}\int_0^\pi (\omega^2\cos^2\omega t - \sin^2\omega t)\,dt \;\xrightarrow[A\to\infty]{}\; -x_0 A$$
to leading order in A. We see that indeed S is unbounded below.
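The large-amplitude behaviour can be checked numerically. The sketch below assumes the Lagrangian $L = \tfrac12(\dot x^2 - x^2)$ of Eq. 4.20 and the family $x(t) = A\sin\omega t$ above; it evaluates the action by a midpoint rule. With these conventions the action grows like $-x_0 A$. The function name and step count are illustrative choices.

```python
import math


def action(A, x0, steps=20000):
    """Action S = int_0^pi L dt with L = (xdot^2 - x^2)/2 on x(t) = A sin(omega t)."""
    omega = 1.0 - math.asin(x0 / A) / math.pi  # chosen so that x(pi) = x0
    dt = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt  # midpoint of the k-th time step
        x = A * math.sin(omega * t)
        xdot = A * omega * math.cos(omega * t)
        total += 0.5 * (xdot ** 2 - x ** 2) * dt
    return total


x0 = 1.0
for amplitude in (10.0, 100.0, 1000.0):
    print(amplitude, action(amplitude, x0), -x0 * amplitude)
```

Each printed action is close to $-x_0 A$ and decreases without bound as the amplitude grows.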

Convexity and uniqueness


A function S is called convex if

$$S(\lambda x + \lambda' y) \le \lambda S(x) + \lambda' S(y), \qquad \lambda + \lambda' = 1, \quad 1 \ge \lambda, \lambda' \ge 0$$

For example, the function in Fig. 4.1 is convex. It is evident that if a function
is convex its minimum is unique. (It may, however, lie at infinity).
The notion of convexity extends to the case that x, the argument of S, is
itself a function–a path. γ is a convex function of v. It follows that the action
is a convex function of the path. This then implies that the minimizer for a free
relativistic particle is unique.

Figure 4.4: γ is a convex function of v. This implies that S is a convex
functional of the path.

4.3 Geodesics in Curved space-time. (You may want to skip this)
In a general space-time (such as in section 2.9) the notion of proper-time is
defined by
$$(c\,d\tau)^2 = -g_{\mu\nu}(x)\, dx^\mu dx^\nu \tag{4.21}$$
The 4-velocity $u^\mu$ is still normalized to $-c^2$ since

$$(c\,d\tau)^2 = -g_{\mu\nu}(x)\, u^\mu u^\nu\, (d\tau)^2 = -u_\mu u^\mu\, (d\tau)^2$$

Remark 4.4. The acceleration is orthogonal to the velocity in Minkowski space,
but not in the more general case when the metric is position dependent. In fact,

$$0 = \frac{d}{d\tau}\big(u^\mu g_{\mu\nu} u^\nu\big) = 2\dot u^\mu u_\mu + (\partial_\alpha g_{\mu\nu})\, u^\mu u^\nu u^\alpha \tag{4.22}$$

We want to find the path that minimizes the action (equivalently, maximizes
the proper time)
$$S = -mc^2\int d\tau$$

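The variation of $(c\,d\tau)^2$ is

$$\delta(c\,d\tau)^2 = 2c^2\,(d\tau)\,\delta(d\tau) = -(\delta g_{\mu\nu})\, dx^\mu dx^\nu - 2g_{\mu\nu}\,\delta(dx^\mu)\,dx^\nu$$

Hence
$$-2c^2\,\delta(d\tau) = (\delta g_{\mu\nu})\, u^\nu dx^\mu + 2g_{\mu\nu}\, u^\nu\,\delta(dx^\mu) \tag{4.23}$$
Rewrite the first term on the right as

$$(\delta g_{\mu\nu})\, u^\nu dx^\mu = (\partial_\alpha g_{\mu\nu})\, u^\nu dx^\mu\, \delta x^\alpha = (\partial_\mu g_{\alpha\nu})\, u^\nu dx^\alpha\, \delta x^\mu$$

The second term can be rewritten as

$$\begin{aligned}
g_{\mu\nu}\,\delta(dx^\mu)\,u^\nu &= d(g_{\mu\nu}\,\delta x^\mu u^\nu) - d(g_{\mu\nu} u^\nu)\,\delta x^\mu \\
&= d(g_{\mu\nu}\,\delta x^\mu u^\nu) - (\partial_\alpha g_{\mu\nu})\, dx^\alpha u^\nu\, \delta x^\mu - g_{\mu\nu}\, du^\nu\, \delta x^\mu
\end{aligned}$$

Collecting terms and dividing by 2,

$$-c^2\,\delta(d\tau) = d(g_{\mu\nu}\,\delta x^\mu u^\nu) + \Big[\tfrac12(\partial_\mu g_{\alpha\nu})\, u^\nu dx^\alpha - (\partial_\alpha g_{\mu\nu})\, u^\nu dx^\alpha - g_{\mu\nu}\, du^\nu\Big]\delta x^\mu$$

The first term is a boundary term which does not contribute to the variation of
the action. Dividing the brackets on the right by $d\tau$, and renaming the dummy
index ν to β, gives
$$g_{\mu\nu}\,\dot u^\nu + \big(\partial_\alpha g_{\mu\beta} - \tfrac12 \partial_\mu g_{\alpha\beta}\big)\, u^\beta u^\alpha = 0 \tag{4.24}$$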

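The equation has the form

$$\dot u^\mu + \Gamma^\mu_{\alpha\beta}\, u^\alpha u^\beta = 0 \tag{4.25}$$

where Γ is linear in the derivatives of g. Since only the symmetric part of $\Gamma^\mu_{\alpha\beta}$
contributes, we define $\Gamma^\mu_{\alpha\beta}$ so it is explicitly symmetric
$$\Gamma^\mu_{\alpha\beta} = \tfrac12\, g^{\mu\nu}\big(\partial_\alpha g_{\nu\beta} + \partial_\beta g_{\nu\alpha} - \partial_\nu g_{\alpha\beta}\big) \tag{4.26}$$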

This is the celebrated Christoffel symbol.
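As a concrete illustration of Eq. 4.26, the sketch below evaluates the Christoffel symbols of the unit 2-sphere by finite differences. It is restricted to two dimensions and assumes coordinates $(\theta, \varphi)$ with metric $\mathrm{diag}(1, \sin^2\theta)$; the function names are illustrative and not taken from the notes.

```python
import math


def sphere_metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def metric_derivative(metric, x, alpha, h=1e-5):
    """Central finite difference of g_{mu nu} with respect to x^alpha."""
    x_plus, x_minus = list(x), list(x)
    x_plus[alpha] += h
    x_minus[alpha] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[m][v] - g_minus[m][v]) / (2 * h) for v in range(n)] for m in range(n)]


def christoffel(metric, x):
    """Gamma[mu][a][b] = (1/2) g^{mu nu} (d_a g_{nu b} + d_b g_{nu a} - d_nu g_{a b})."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, a) for a in range(n)]  # dg[a][m][v] = d_a g_{m v}
    return [[[0.5 * sum(g_inv[mu][nu] * (dg[a][nu][b] + dg[b][nu][a] - dg[nu][a][b])
                        for nu in range(n))
              for b in range(n)]
             for a in range(n)]
            for mu in range(n)]


gamma = christoffel(sphere_metric, [1.0, 0.3])
gamma_equator = christoffel(sphere_metric, [math.pi / 2, 0.3])
```

At $\theta = 1$ the result reproduces $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$. On the equator $\Gamma^\theta_{\varphi\varphi}$ vanishes, so the curve $\theta = \pi/2$, $\varphi \propto \tau$ satisfies Eq. 4.25.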

Exercise 4.5. Show that great circles on the sphere are geodesics.

Exercise 4.6 (Geodesic equation in covariant components). Show that the
covariant components of the 4-velocity satisfy the geodesic equation
$$2\dot u_\mu = -\big(\partial_\mu g^{\alpha\beta}\big)\, u_\alpha u_\beta$$

4.3.1 Relativistic Kepler law


For a non-relativistic planet orbiting a sun in a circular orbit, equating the
centrifugal force with the gravitational attraction gives
$$\omega^2 R = \frac{GM}{R^2} \tag{4.27}$$
This is Kepler's third law, which relates the periods of all planets in the solar
system to their orbital radii
$$T^2 \propto R^3 \tag{4.28}$$
Let us now see how one can get a Kepler type relation for geodesics of a diagonal,
time-independent, metric in 2 + 1 dimensions with circular symmetry, i.e. g = g(ρ):
$$-(c\,d\tau)^2 = g_t\,(c\,dt)^2 + g_\rho\,(d\rho)^2 + \rho^2 (d\theta)^2 \tag{4.29}$$
For a circular, stationary geodesic

$$u^\rho = \dot\rho = 0, \qquad u^\theta = \dot\theta = \omega, \qquad u^t = c\,\dot t = c\gamma \tag{4.30}$$

with ω and γ constant in time. For a time-like u we have

$$-c^2 = u^\mu u_\mu = (c\gamma)^2 g_t + \omega^2\rho^2 \tag{4.31}$$

Now, since $\dot u^\mu = 0$ for a stationary geodesic, the geodesic (differential) equations
reduce to constraints relating ρ, ω, γ. As $u^\mu$ has only two non-zero components
the geodesic equations are

$$0 = c^2\,\Gamma^\mu_{tt}\,\gamma^2 + 2c\,\Gamma^\mu_{\theta t}\,\omega\gamma + \Gamma^\mu_{\theta\theta}\,\omega^2 \tag{4.32}$$

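As g is a function of ρ only, the formula for the Christoffel symbols simplifies (see footnote 2)

$$\Gamma^\mu_{\alpha\beta} = \tfrac12\, g^\mu\big(\cancel{\partial_\alpha g_{\mu\beta}} + \cancel{\partial_\beta g_{\mu\alpha}} - \partial_\mu g_{\alpha\beta}\big), \qquad \alpha, \beta \in \{t, \theta\} \tag{4.33}$$

The (non-zero) Christoffel symbols are $\Gamma^\rho_{tt}$ and $\Gamma^\rho_{\theta\theta}$ so the geodesic equation
gives a single constraint relating ω, γ and ρ

$$0 = \Gamma^\rho_{tt}\,(c\gamma)^2 + \Gamma^\rho_{\theta\theta}\,\omega^2 \tag{4.34}$$

Combined with Eq. 4.31 which relates γ and ω, we get a relation between ρ and
ω for circular (time-like) stationary orbits

$$0 = -\Gamma^\rho_{tt}(\rho)\,(c^2 + \omega^2\rho^2) + g_{tt}\,\Gamma^\rho_{\theta\theta}(\rho)\,\omega^2 \tag{4.35}$$

Solving for $\omega^2$ we get a Kepler type law (in proper time)

$$\omega_\tau^2 = \frac{c^2}{g_{tt}\,\Gamma^\rho_{\theta\theta}/\Gamma^\rho_{tt} - \rho^2} \tag{4.36}$$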
2 When g is diagonal, we denote its diagonal elements by gµ

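It is more physical to measure ω in coordinate time, i.e.

$$\omega_t^2 = \omega_\tau^2\left(\frac{d\tau}{dt}\right)^2 = \frac{-\big(c^2 g_t + \rho^2\omega_t^2\big)}{g_{tt}\,\Gamma^\rho_{\theta\theta}/\Gamma^\rho_{tt} - \rho^2} \tag{4.37}$$

where we used the metric and orbit to relate self-time and coordinate time:

$$-(c\,d\tau)^2 = g_t\,(c\,dt)^2 + \rho^2(\omega_\tau\, d\tau)^2 = g_t\,(c\,dt)^2 + \rho^2(\omega_t\, dt)^2 \tag{4.38}$$

Solving for $\omega_t$ we get:

$$\omega_t^2 = -c^2\,\frac{\Gamma^\rho_{tt}}{\Gamma^\rho_{\theta\theta}} \tag{4.39}$$

The ratio of the Christoffel symbols can be computed from

$$\Gamma^\rho_{tt} = -\tfrac12\, g^\rho\, \partial_\rho g_{tt}, \qquad \Gamma^\rho_{\theta\theta} = -\tfrac12\, g^\rho\, \partial_\rho g_{\theta\theta} = -\tfrac12\, g^\rho\, \partial_\rho \rho^2 = -\rho\, g^\rho \tag{4.40}$$

Hence
$$\frac{\Gamma^\rho_{\theta\theta}}{\Gamma^\rho_{tt}} = \frac{2\rho}{\partial_\rho g_t}$$

This gives for Kepler's law

$$\omega_t^2 = -c^2\,\frac{\partial_\rho g_t}{2\rho} \tag{4.41}$$

independent of $g_\rho$. In the case of the Schwarzschild metric

$$g_t = -(1 + \Phi), \qquad \Phi = -\frac{2GM}{c^2\rho} \tag{4.42}$$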

Kepler's law takes the familiar form

$$\omega_t^2 = \frac{GM}{\rho^3} \tag{4.43}$$

Interestingly, there appears to be no GR correction to Kepler's law. GR is hidden
in the relation between the self-time and coordinate time. Substituting Kepler's law
in Eq. 4.38 we get
$$(d\tau)^2 = \left(1 - \frac{3GM}{c^2\rho}\right)(dt)^2 \tag{4.44}$$

so the computation only makes sense for

$$\frac{3GM}{c^2\rho} < 1$$
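The chain of steps above can be cross-checked numerically. The sketch below is a minimal check with arbitrarily chosen units (GM = 1, c = 3, ρ = 10 are assumptions for illustration). Dividing Eq. 4.34 by $\gamma^2$, with $\gamma = dt/d\tau$, gives $\omega_t^2 = -c^2\,\Gamma^\rho_{tt}/\Gamma^\rho_{\theta\theta} = -c^2\,\partial_\rho g_t/(2\rho)$. The code evaluates this by a finite difference for the Schwarzschild $g_t$ and then reads $(d\tau/dt)^2$ off the metric of Eq. 4.38.

```python
def g_t(rho, GM, c):
    """Time component g_t = -(1 + Phi) with Phi = -2GM/(c^2 rho)."""
    return -(1.0 - 2.0 * GM / (c ** 2 * rho))


def omega_t_squared(rho, GM, c, h=1e-6):
    """Coordinate-time angular velocity squared, -c^2 (d g_t / d rho) / (2 rho)."""
    # Central finite difference of g_t with relative step h.
    dg = (g_t(rho * (1 + h), GM, c) - g_t(rho * (1 - h), GM, c)) / (2 * rho * h)
    return -c ** 2 * dg / (2.0 * rho)


def proper_time_rate_squared(rho, GM, c):
    """(d tau / dt)^2 from -(c dtau)^2 = g_t (c dt)^2 + rho^2 (omega_t dt)^2."""
    return -(g_t(rho, GM, c) + rho ** 2 * omega_t_squared(rho, GM, c) / c ** 2)


GM, c, rho = 1.0, 3.0, 10.0  # arbitrary illustrative units
kepler_value = GM / rho ** 3
numeric_value = omega_t_squared(rho, GM, c)
rate_squared = proper_time_rate_squared(rho, GM, c)
```

Here `numeric_value` matches `kepler_value`, and `rate_squared` matches $1 - 3GM/(c^2\rho)$.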

4.4 Supplement
4.4.1 Fermat principle
The mother of variational principles is Fermat's principle. It formulates geometric
optics as the minimizer of the time of propagation between two points. The
ray propagates in a medium with index of refraction n(x). The propagation
time dt is c dt = n(x)|dx|. We can think of n as inducing a metric in Euclidean
space, one that measures the propagation time:

$$(c\,dt)^2 = n^2(x)\, d\mathbf{x}\cdot d\mathbf{x} = n^2(x)\,(d\ell)^2$$

Such a metric is called conformal.

Consider a ray $\mathbf{x}(\ell)$ parametrized by its Euclidean length: $d\ell = \sqrt{d\mathbf{x}\cdot d\mathbf{x}}$.
The tangent to the ray
$$\mathbf{t} = \frac{d\mathbf{x}}{d\ell}$$
is a unit vector. For a variation of the path δx:
$$\delta(d\ell) = \frac{d\mathbf{x}\cdot d\,\delta\mathbf{x}}{d\ell} = \mathbf{t}\cdot d\,\delta\mathbf{x}, \qquad \delta n = (\delta\mathbf{x}\cdot\nabla)n$$
The corresponding time variation is (the integral of)

$$\delta(c\,dt) = (\delta\mathbf{x}\cdot\nabla)n\, d\ell + n\,\mathbf{t}\cdot d\,\delta\mathbf{x} = \delta\mathbf{x}\cdot\big(\nabla n\, d\ell - d(n\mathbf{t})\big) + d\big(n\,\mathbf{t}\cdot\delta\mathbf{x}\big)$$

The integral of the variation vanishes provided the brackets (and boundary terms)
vanish:
$$\frac{d(n\mathbf{t})}{d\ell} = \nabla n$$
In particular, in a region where the refraction index is a constant, ∇n = 0, the
ray keeps its direction of propagation: t is a constant.
Exercise 4.7. Show that the equation of motion is consistent with t being
a unit vector.

Exercise 4.8. Derive Snell’s law, fig.4.5,

n1 sin θ1 = n2 sin θ2

from Fermat principle.

4.4.2 Rainbow
The simplest features of the rainbow can be understood from Snell's law.
Exercise 4.9. Use Snell's law to show that a light ray in air ($n_a = 1$) hitting a
water droplet, $n_w > 1$, at latitude θ is reflected back at angle 2α(θ) = 4φ(θ) − 2θ,
see figure. The function φ(θ) is defined by Snell's law: $n_w \sin\varphi = \sin\theta$.
Figure 4.5: The change in direction of a ray when n jumps is determined by
Snell's law.

Figure 4.6: The blue line shows a ray undergoing one internal reflection in a
drop of water. The impact angle θ is defined in the figure, and α = 2φ − θ. The
outgoing ray is focused near the maximum of 2φ − θ. This partial focusing is called
a caustic. This gives the direction of the rainbow.

The intensity I(2α) of the light reflected at angle 2α is proportional to the
intensity of the incoming light:

$$d(\sin\theta) = I(2\alpha)\,|d\alpha|$$

A computation gives
$$\frac{d\alpha}{d\theta} = -1 + \frac{2\cos\theta}{\sqrt{n^2 - \sin^2\theta}}$$
The derivative vanishes for

$$3\cos^2\theta = n^2 - 1$$

which gives a real value for θ provided 1 < n < 2. This gives the maximal
value of 2α. Evidently, I(2α) = ∞ there. The divergence implies focusing of
the reflected light. This is called a caustic in geometrical optics.
Exercise 4.10. Show that for water (n = 1.33) the caustic occurs for 2α = 42◦ .
This is the main angle of the rainbow, first found by Bacon in 1268. (Different
colors have slightly different angles due to the slight frequency dependence of n).
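A quick numerical check of the caustic angle, as a sketch only: it takes α = 2φ − θ with φ given by Snell's law, and the stationarity condition $3\cos^2\theta = n^2 - 1$, as stated above. The function and variable names are illustrative.

```python
import math


def deflection_half_angle(theta, n):
    """alpha(theta) = 2 phi - theta, with phi from Snell's law n sin(phi) = sin(theta)."""
    phi = math.asin(math.sin(theta) / n)
    return 2.0 * phi - theta


def caustic_impact_angle(n):
    """Impact angle where d(alpha)/d(theta) = 0, i.e. 3 cos^2(theta) = n^2 - 1."""
    return math.acos(math.sqrt((n * n - 1.0) / 3.0))


n_water = 1.33
theta_c = caustic_impact_angle(n_water)
rainbow_angle_deg = math.degrees(2.0 * deflection_half_angle(theta_c, n_water))
print(round(rainbow_angle_deg, 1))
```

For n = 1.33 this gives an angle between 42° and 43°, and α is indeed maximal at `theta_c`. Repeating the computation with a slightly different n shows how the angle shifts with colour.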

Figure 4.7: The cyan arrows represent light rays from the sun. The two small
light-blue balls represent two water droplets. The red arrows are the reflected
light rays in the direction of the rainbow caustics. The green eye represents the
observer. Pilots sometimes see rainbows that are circular.
Chapter 5

Currents
The sources of electromagnetic fields are the charges and currents. To describe
the sources in a Lorentz invariant framework we need to amalgamate the
non-relativistic notions of charge and current into a single space-time notion:
the 4-current.

5.1 Charge densities and currents


For a point charge e moving along the trajectory ξ(t) in 3 dimensions, the
non-relativistic notions of charge density ρ and current j are defined by

$$\rho(\mathbf{x}, t) = e\,\delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(t)\big), \qquad \mathbf{j}(\mathbf{x}, t) = e\,\delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(t)\big)\,\mathbf{v}(t) \tag{5.1}$$

where $\mathbf{v} = \dot{\boldsymbol{\xi}}$ is the velocity. For several particles, of charges $e_a$, moving with
trajectories $\boldsymbol{\xi}_a$, the generalization is
$$\rho(\mathbf{x}, t) = \sum_a e_a\,\delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}_a(t)\big), \qquad \mathbf{j}(\mathbf{x}, t) = \sum_a e_a\,\delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}_a(t)\big)\,\mathbf{v}_a(t) \tag{5.2}$$

We would like to amalgamate ρ and j into a notion of a 4-current-density.
However, neither v nor $\delta^{(3)}(\mathbf{x})$ are natural objects in space-time and, a priori,
you may well worry that the non-relativistic notions of charge and current are,
at best, non-relativistic approximations, so that we will need to tinker with them
before being able to amalgamate them. It will turn out that these expressions
are fine as they stand. (We still need to adjust dimensions so we can fit both ρ
and j in a 4-vector with identical dimensions.)
It is clear that it is enough to formulate the notions of current and density
for a single particle. This simplifies the notation.

5.1.1 4-current-density
Consider a point particle whose trajectory is given by $\xi^\mu(\tau)$ as a function of its
proper time. For the sake of simplicity, we work in Minkowski Cartesian coordinates.
We can make a 4-current density using only scalars, 4-vectors and, in
general, objects that behave nicely under Lorentz transformations:

$$j^\mu(x) = ec\int d\tau\; \delta^{(4)}\big(x - \xi(\tau)\big)\, \dot\xi^\mu(\tau) \tag{5.3}$$

where dot is a derivative with respect to the proper time. The scalar factor ec
fixes the dimensions to the dimensions of current density.
Remark 5.1. dτ is a scalar, and $\dot\xi^\mu$ a 4-vector. The delta function is a density,
i.e. under a coordinate transformation it is multiplied by a power of det g.
Indeed, under scaling $x' = \lambda x$, the metric transforms as $\lambda^2 g'_{\mu\nu} = g_{\mu\nu}$, so in d
dimensions $\lambda^{2d}\det g' = \det g$. Since $\delta^d(x') = \delta^d(\lambda x) = \lambda^{-d}\,\delta^d(x)$ we see that δ
is a density as it transforms by $\frac{1}{\sqrt{|g'|}}\,\delta(x') = \frac{1}{\sqrt{|g|}}\,\delta(x)$. The current density is a
density, as its name suggests.

Figure 5.1: The parametrized orbit $\xi^\mu(\tau)$.

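To relate this expression to Eq. (5.2) integrate over τ. This gets rid of one of
the delta functions. Since ξ is a real orbit, there is a 1-1 correspondence between
coordinate time $\xi^0(\tau)/c$ and the proper time τ. Changing variables from $c\,d\tau$ to
$d\xi^0$,

$$\begin{aligned}
c\int d\tau\; \delta^{(4)}\big(x - \xi(\tau)\big)\,\dot\xi^\mu(\tau) &= c\int d\xi^0\; \delta^{(4)}\big(x - \xi\big)\,\dot\xi^\mu\,\frac{d\tau}{d\xi^0} \\
&= \int d\xi^0\; \delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(\xi^0)\big)\,\delta(ct - \xi^0)\, v^\mu(\xi^0) \\
&= \delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(t)\big)\, v^\mu(t)
\end{aligned}$$

where $v^\mu = (c, \mathbf{v}) = u^\mu/\gamma$. In conclusion, we get for the 4-vector of current
density
$$j^\mu = (c\rho, \mathbf{j})$$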
The result is a pleasant surprise because it essentially coincides with what we knew
from non-relativistic physics. $\delta^{(3)}$ is not a density in Minkowski space, and $v^\mu$
is not a 4-vector. However, together they conspire to give the 4-vector (density)
$j^\mu$. There are no relativistic corrections one needs to make to the classical
formulas for charge densities and currents.

5.1.2 Charge conservation


Charge conservation says that the total charge, at any given instant,
$$\int d^3x\; \rho(\mathbf{x}, t) = e\int d^3x\; \delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(t)\big) = e$$
is independent of the time slice t.


Remark 5.2. It is also independent of the Lorentz frame in which the slice is
taken.
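Charge conservation is expressed by the continuity equation. Consider the
rigid transport of a bump function: ρ(x − ξ(t)). Evidently, the total charge is
conserved
$$\int d^3x\; \rho\big(\mathbf{x} - \boldsymbol{\xi}(t)\big) = \int d^3x\; \rho(\mathbf{x}) \tag{5.4}$$

To derive the continuity equation write $\dot{\boldsymbol{\xi}} = \mathbf{v}$. Then

$$\partial_t\, \rho\big(\mathbf{x} - \boldsymbol{\xi}(t)\big) = -(\mathbf{v}\cdot\nabla_x)\,\rho\big(\mathbf{x} - \boldsymbol{\xi}(t)\big) \tag{5.5}$$

Since v(t) is not a function of x

$$\nabla_x\cdot\Big(\mathbf{v}(t)\,\rho\big(\mathbf{x} - \boldsymbol{\xi}(t)\big)\Big) = (\mathbf{v}\cdot\nabla_x)\,\rho\big(\mathbf{x} - \boldsymbol{\xi}(t)\big)$$

This, together with Eq. 5.5, gives the continuity equation

$$0 = \partial_t \underbrace{\rho}_{\text{density}} + \nabla\cdot\underbrace{(\mathbf{v}\rho)}_{\text{current}} = \partial_\mu j^\mu, \qquad j^\mu = (c, \mathbf{v})\,\rho \tag{5.6}$$

A point charge is the limit ρ → δ.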


Once this equation holds for one charge, it holds for any number. When
we consider huge numbers of charges with poor spatial resolution we may then
think of j µ (x) as a smooth function on space time, which satisfies the continuity
equation.
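The continuity equation for a rigidly transported bump can be checked by finite differences. The sketch below works in one space dimension; the Gaussian profile and the trajectory ξ(t) = sin t are hypothetical choices made only for illustration.

```python
import math


def trajectory(t):
    """Hypothetical 1D trajectory xi(t) of the bump centre."""
    return math.sin(t)


def velocity(t):
    """Velocity v(t) = d xi / dt of the trajectory above."""
    return math.cos(t)


def bump(x):
    """Smooth normalised charge profile (a Gaussian of unit width)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def rho(x, t):
    """Rigidly transported charge density rho(x - xi(t))."""
    return bump(x - trajectory(t))


def continuity_residual(x, t, h=1e-5):
    """Finite-difference estimate of d_t rho + d_x (v rho); v does not depend on x."""
    d_t_rho = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    d_x_current = velocity(t) * (rho(x + h, t) - rho(x - h, t)) / (2 * h)
    return d_t_rho + d_x_current


residuals = [continuity_residual(x, t) for x in (-1.0, 0.3, 2.0) for t in (0.0, 0.7, 2.5)]
```

All entries of `residuals` vanish up to discretization error, although the individual terms $\partial_t\rho$ and $\partial_x(v\rho)$ are not small.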
Example 5.3 (Orders of magnitudes).

1. A current of 1 Ampere transports $6 \times 10^{18}$ electrons per second.

2. A copper wire of cross section S = 1 [mm$^2$] has $8 \times 10^{20}$ [cm$^{-1}$] atoms per
unit length. Since copper has valence 2, the number of electrons per unit
length is ≈ $1.6 \times 10^{21}$ [cm$^{-1}$]. The classical formula for the current I = env
gives very small velocities, about 40 [µm/sec]. This classical computation
is misleading. In reality, only a small fraction of the electrons, near the
Fermi energy, participate in the conductance, and these electrons actually
move at the Fermi velocity, which is quite large (about c/137). To treat
the problem honestly we need quantum mechanics.

Figure 5.2: Charge conservation expresses the fact that the orbit is a continuous
curve which does not terminate and moves always into the future. If it enters
a box in space-time it also leaves it. If the orbit enters the box at the bottom and
leaves it at the top we say that the charge in the box is conserved. If it leaves
and enters on the sides we say that the incoming current balances the outgoing
current.
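The drift-velocity estimate in item 2 is a one-line computation. The sketch below simply redoes the arithmetic with the numbers quoted in the example; the variable names are illustrative.

```python
electrons_per_second = 6e18   # transported by a current of 1 Ampere (item 1)
electrons_per_cm = 1.6e21     # conduction electrons per unit length of the wire (item 2)

# I = e n v with n the number of electrons per unit length, so v = (I / e) / n.
drift_velocity_cm_per_s = electrons_per_second / electrons_per_cm
drift_velocity_um_per_s = drift_velocity_cm_per_s * 1e4  # 1 cm = 10^4 micrometres

print(drift_velocity_um_per_s)
```

This gives about 37.5 micrometres per second, consistent with the rounded value quoted in the example.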

5.1.3 Current conservation and gauge invariance


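We want to generalize the expression for $S_{int}$ from a finite collection of charges
to a continuous distribution. For a single charge the action representing the
interaction is
$$\begin{aligned}
S_{int} &= \frac{e}{c}\int A_\mu(\xi)\, u^\mu(\tau)\, d\tau \\
&= \frac{e}{c^2}\int A_\mu(\xi)\, u^\mu(\tau)\; \delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(\tau)\big)\, d^3x\, d(c\tau) \\
&= \frac{e}{c^2}\int A_\mu\big(\xi^0, \mathbf{x}\big)\, v^\mu(\xi^0)\; \delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(t)\big)\, d^3x\, d\xi^0 \\
&= \frac{e}{c^2}\int A_\mu(x)\, v^\mu(t)\; \delta^{(3)}\big(\mathbf{x} - \boldsymbol{\xi}(t)\big)\, d\Omega, \qquad d\Omega = d^4x
\end{aligned}$$
The middle two factors, together with e, are the 4-current, hence
$$S_{int} = \frac{1}{c^2}\int A_\mu(x)\, j^\mu(x)\, d\Omega \tag{5.7}$$
describing both smooth and discrete 4-current distributions.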

5.1.4 Gauge invariance and the continuity equation


We have seen that under a change of gauge $S_{int}$ for a single charge changes
by a boundary term. This then implied the gauge invariance of the
Euler-Lagrange equations. It is interesting to reconsider this issue for smooth current
distributions. We will learn something. Under a change of gauge
$$A_\mu \to A_\mu - \partial_\mu\Lambda$$
the interaction action changes by
$$S_{int} \to S_{int} - \frac{1}{c^2}\int (\partial_\mu\Lambda)\, j^\mu\, d\Omega \tag{5.8}$$
The integrand can be rearranged as
$$(\partial_\mu\Lambda)\, j^\mu = \partial_\mu(\Lambda\, j^\mu) - \Lambda\, \partial_\mu j^\mu$$
The first term gives a boundary term and vanishes if $\Lambda j^\mu \to 0$ at infinity. $S_{int}$
is therefore guaranteed to be gauge invariant provided the current satisfies the
continuity equation
$$\partial_\mu j^\mu = 0 \tag{5.9}$$
This expresses the intimate relation between gauge invariance and charge
conservation.

5.1.5 The continuity equation in curvilinear coordinates


It is instructive to generalize the derivation of the continuity equation to
curvilinear coordinates. The main change is that now
$$d\Omega = \sqrt{|g|}\, d^4x \tag{5.10}$$
Now the integrand in Eq. 5.8 can be rearranged as
$$\sqrt{|g|}\,(\partial_\mu\Lambda)\, j^\mu = \partial_\mu\big(\sqrt{|g|}\,\Lambda\, j^\mu\big) - \Lambda\, \partial_\mu\big(\sqrt{|g|}\, j^\mu\big)$$

As before, the first term gives a boundary term and vanishes if $\Lambda j^\mu \to 0$ at
infinity. $S_{int}$ is therefore guaranteed to be gauge invariant provided the current
satisfies the continuity equation
$$\frac{1}{\sqrt{|g|}}\,\partial_\mu\big(\sqrt{|g|}\, j^\mu\big) = 0 \tag{5.11}$$
This is the covariant form of the continuity equation in curvilinear coordinates,
taking into account that the current density is a density in the sense of tensors.
Chapter 6

The inhomogeneous Maxwell's equations

The inhomogeneous Maxwell equations are derived from a variational principle.

6.1 Lagrangian field theory


In Lagrangian mechanics the basic object is the Lagrangian, $L(q_j, \dot q_j, t)$, a function
of the "generalized coordinates" $q_j$ and their velocities $\dot q_j$, where j labels the
degrees of freedom. Lagrangian field theory can be viewed as a generalization of
Lagrangian mechanics to infinitely many degrees of freedom where the discrete
index j is replaced by x, a point in space. The Lagrangian is then of the form
$L = \int dx\, \mathcal{L}_F$ where $\mathcal{L}_F$ is a suitable Lagrangian density for the field: a function
of the fields and their time derivatives.

6.1.1 The Lagrangian of the electromagnetic field


Now we come to deciding what replaces the $q_j$ and $\dot q_j$ for the electromagnetic
field. The first natural choice appears to be $F_{\mu\nu}$. However, the components of F
cannot be viewed as independent generalized coordinates, since they are constrained by
the homogeneous Maxwell equations. Independent generalized coordinates are
$A_\mu$:
$$q_j \leftrightarrow A_\mu(x), \qquad \dot q_j \leftrightarrow \dot A_\mu(x)$$
This reproduces Maxwell's equations, as we shall see.
Lorentz invariance of the Euler-Lagrange equations is guaranteed if the action
is a Lorentz scalar. We have at our disposal two Lorentz scalars (see footnote 1) whose
dimensions are energy density:

$$F\cdot F = F_{\mu\nu}F^{\mu\nu}, \qquad F\cdot F^* = F_{\mu\nu}(F^*)^{\mu\nu}$$


1 For simplicity we assume | det g| = 1


Figure 6.1: The action associates a number with a given field configuration and
a box in space-time. We allow variation of $A_\mu$ inside the box: the variation
vanishes outside the box and on its boundary. This is the analog of what we do
when we vary the path.

Since the volume element in space-time $d\Omega = d^4x$ is a Lorentz scalar and since
the action must (see footnote 2) have dimension [Et], two candidates for the field action are
suitable numerical multiples of
$$S_F = -\frac{1}{16\pi c}\int \underbrace{d\Omega}_{\text{volume element}}\, F\cdot F, \qquad S_{cs} = \frac{1}{c}\int d\Omega\; F^*\cdot F$$

We can rule out the action $S_{cs}$ by the following observation: by the homogeneous
Maxwell equation, $\partial_\mu (F^*)^{\mu\nu} = 0$,

$$(F^*)\cdot F = 2(F^*)^{\mu\nu}(\partial_\mu A_\nu) = 2\,\partial_\mu\big((F^*)^{\mu\nu}A_\nu\big)$$

This means that the associated action is a boundary term. Since the rules of
variation keep the boundary terms fixed, the variation of Scs vanishes identically.
We are left with the first candidate. We need first to justify the sign chosen
so that the action will have a minimum rather than a maximum.
2 So we can add it to Sp and Sint

In Lagrangian mechanics the kinetic energy comes with a positive sign. Since
E is linear in $\dot{\mathbf{A}}$ while B = ∇ × A, it is $\mathbf{E}^2$ that plays the role of kinetic
energy and as
$$F\cdot F = -2(\mathbf{E}^2 - \mathbf{B}^2) \tag{6.1}$$
we must have
$$S_F = -\frac{1}{16\pi c}\int F\cdot F\, d\Omega \tag{6.2}$$
The 16π gives Maxwell's equations in c.g.s. units and in particular leads to the
Coulomb potential in the form (see footnote 3) e/r. In the MKS system, where Coulomb's law is
$e/4\pi\varepsilon_0 r$, one needs to replace 16π by $4/\varepsilon_0$.
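The invariant $F\cdot F = -2(\mathbf{E}^2 - \mathbf{B}^2)$ can be verified with a few lines of code. This is a sketch under the conventions used earlier in the notes: $\eta = \mathrm{diag}(-1,1,1,1)$, $E_j = F_{j0}$ and $F_{jk} = \varepsilon_{jki}B_i$. The function names and the sample field values are illustrative.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def field_tensor(E, B):
    """Covariant F_{mu nu} with F_{j0} = E_j and F_{jk} = eps_{jki} B_i."""
    F = [[0.0] * 4 for _ in range(4)]
    for j in range(3):
        F[j + 1][0] = E[j]
        F[0][j + 1] = -E[j]
        for k in range(3):
            F[j + 1][k + 1] = sum(levi_civita(j, k, i) * B[i] for i in range(3))
    return F


def f_dot_f(E, B):
    """F_{mu nu} F^{mu nu}; raising indices with a diagonal metric multiplies by eta factors."""
    F = field_tensor(E, B)
    return sum(F[m][n] * ETA[m] * ETA[n] * F[m][n] for m in range(4) for n in range(4))


E_field = [1.0, 2.0, 3.0]   # arbitrary sample values
B_field = [0.5, -1.0, 2.0]
invariant = f_dot_f(E_field, B_field)
expected = -2.0 * (sum(x * x for x in E_field) - sum(x * x for x in B_field))
```

For these sample fields both `invariant` and `expected` equal −17.5: the electric part enters with a negative sign and the magnetic part with a positive sign, which is what fixes the overall sign of $S_F$.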

6.2 Variation of the field: Rules of the game


The action $S_F$ assigns a number to any given field $A_\mu(x)$. It is a function
whose arguments are the functions $A_\mu(x)$. Such objects are sometimes called
functionals.
We consider the variation $\delta S_F$ due to variations $\delta A_\mu$. We shall consider local
variations only, namely, variations in a finite region of space-time: $\delta A_\mu = 0$
outside some large space-time box, so we do not need to worry about infinite
variations that can come with infinite boxes.

6.2.1 Variation of the field: Calculations


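The variation of A causes a variation of F · F

$$\delta(F_{\mu\nu}F^{\mu\nu}) = 2F^{\mu\nu}\,\delta(F_{\mu\nu})$$

where
$$\delta(F_{\mu\nu}) = \partial_\mu \delta A_\nu - \partial_\nu \delta A_\mu$$
By the anti-symmetry of F

$$\delta(F_{\mu\nu}F^{\mu\nu}) = 2F^{\mu\nu}(\partial_\mu \delta A_\nu - \partial_\nu \delta A_\mu) = 4F^{\mu\nu}(\partial_\mu \delta A_\nu) \tag{6.3}$$

In the calculus of variations one wants to end up with an expression proportional
to $\delta A_\mu$: we need to get rid of terms of the form δ∂A. This we can do by
integrating by parts. In the case of curvilinear coordinates,
$$d\Omega = \sqrt{|g|}\, d^4x \tag{6.4}$$

Since |det g| is a function of the coordinates, and not a function of A, it is not
affected by the variation. From Eq. 6.3
$$\sqrt{|g|}\,\delta(F_{\mu\nu}F^{\mu\nu}) = 4\,\partial_\mu\big(\sqrt{|g|}\,F^{\mu\nu}\,\delta A_\nu\big) - 4\,\partial_\mu\big(\sqrt{|g|}\,F^{\mu\nu}\big)\,\delta A_\nu$$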
3 Replacing 16π by 4 in the action, gives for the Coulomb potential e/4πr.
The first term is the divergence of the vector field $\sqrt{|g|}\,F^{\mu\nu}\,\delta A_\nu$ which can be
converted to a surface integral on the boundary of the box where δA = 0.
Hence,
$$\delta S_F = \frac{1}{4\pi c}\int d\Omega\; \underbrace{\frac{1}{\sqrt{|g|}}\,\partial_\mu\big(\sqrt{|g|}\,F^{\mu\nu}\big)}_{\text{divergence of anti-symmetric tensor}}\;\delta A_\nu \tag{6.5}$$

Maxwell's equations in free space follow from $\delta S_F = 0$ for any $\delta A_\nu$. This is the
case if
$$\frac{1}{\sqrt{|g|}}\,\partial_\mu\big(\sqrt{|g|}\,F^{\mu\nu}\big) = 0 \tag{6.6}$$

6.2.2 Variation of the interaction


We have already determined the action associated with the interaction when we
studied the dynamics of relativistic charged particles as
$$S_{int} = \frac{1}{c^2}\int A_\nu\, j^\nu\, d\Omega$$
To get the field equations we consider variations δA for a given source term j.
This variation is
$$\delta S_{int} = \frac{1}{c^2}\int \delta A_\nu\, j^\nu\, d\Omega \tag{6.7}$$

6.2.3 The inhomogeneous Maxwell equations


The Euler-Lagrange equations for the fields are those that minimize the action
$S_F + S_{int}$. The minimizer is a stationary point of the action. When |det g| = 1
we simply get
$$0 = \delta S_F + \delta S_{int} = \frac{1}{4\pi c}\int d\Omega\,\Big(\partial_\mu F^{\mu\nu} + \frac{4\pi}{c}\, j^\nu\Big)\,\delta A_\nu \tag{6.8}$$

The variation will vanish for arbitrary δA provided the brackets vanish. In the
case of curvilinear coordinates where |det g| is not necessarily 1 we have (see footnote 4)

$$\frac{1}{\sqrt{|g|}}\,\partial_\mu\big(\sqrt{|g|}\,F^{\nu\mu}\big) = \frac{4\pi}{c}\, j^\nu \tag{6.9}$$

These are the 4 inhomogeneous Maxwell equations in a neat and concise form.

6.2.4 Current conservation


We have derived Maxwell equations as the Euler-Lagrange equations for the
field Aµ for a given source term j µ . This derivation did not assume that the
source j µ is a reasonable physical current and did not explicitly require that it
4 Note the interchange of the indices of F relative to Eq. 6.8.
be a conserved current. However, a posteriori, Maxwell's equations enforce current
conservation on j as a direct consequence of the fact that F is an antisymmetric
tensor:
$$0 = \underbrace{\partial_\nu\partial_\mu\big(\sqrt{|g|}\,F^{\nu\mu}\big)}_{0\ \text{by symmetry}} = \frac{4\pi}{c}\,\partial_\nu\big(\sqrt{|g|}\, j^\nu\big)$$

in accordance with Eq. 5.11. If the source j were not conserved, Maxwell's
equations would not form a consistent set of equations.

6.2.5 3-D form


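To translate Maxwell's equations back from their covariant space-time form to
3-D form, consider first the ν = 0 equation. Since

$$E_j = F_{j0} = F^{0j}, \qquad j^0 = c\rho$$

we get the Gauss-Coulomb law

$$\nabla\cdot\mathbf{E} = 4\pi\rho \iff \partial_\mu F^{0\mu} = \frac{4\pi}{c}\, j^0 \tag{6.10}$$
The spatial components are:
$$\partial_\mu F^{j\mu} = \partial_0 F^{j0} + \partial_k F^{jk} = -\frac{1}{c}\,\partial_t E_j - \partial_k(\varepsilon_{ikj}B_i) = \frac{4\pi}{c}\, j^j$$
Using
$$\partial_k(\varepsilon_{ikj}B_i) = -\varepsilon_{jki}\,\partial_k B_i = -(\nabla\times\mathbf{B})_j$$
this gives the Ampere-Maxwell equation:
$$-\frac{1}{c}\,\partial_t\mathbf{E} + \nabla\times\mathbf{B} = \frac{4\pi}{c}\,\mathbf{J} \iff \partial_\mu F^{k\mu} = \frac{4\pi}{c}\, j^k \tag{6.11}$$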

6.2.6 Time reversal


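Time-reversal of the orbits of the sources, $\xi_a(t) \mapsto \xi_a(-t)$, sends $\rho(t, \mathbf{x}) \mapsto \rho(-t, \mathbf{x})$
but flips the currents $\mathbf{J}(t, \mathbf{x}) \mapsto -\mathbf{J}(-t, \mathbf{x})$. It follows that solutions
to Maxwell's equations transform under time-reversal as $\mathbf{E}(t, \mathbf{x}) \mapsto \mathbf{E}(-t, \mathbf{x})$
and $\mathbf{B}(t, \mathbf{x}) \mapsto -\mathbf{B}(-t, \mathbf{x})$. We say that E is even under time reversal and B is
odd.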

6.2.7 Maxwell equations: Evolution equations and constraints
Maxwell's equations for E and B partition into a set of two scalar equations and
two vector equations. The two scalar equations are the Gauss laws:

$$\underbrace{\nabla\cdot\mathbf{E} = 4\pi\rho, \qquad \nabla\cdot\mathbf{B} = 0}_{\text{Gauss}} \tag{6.12}$$

The two vector equations are the Faraday and Maxwell-Ampere laws

$$\underbrace{\dot{\mathbf{E}} = \nabla\times\mathbf{B} - \frac{4\pi}{c}\,\mathbf{J}}_{\text{Ampere}}, \qquad \underbrace{\dot{\mathbf{B}} = -\nabla\times\mathbf{E}}_{\text{Faraday}} \tag{6.13}$$

and dot denotes partial derivative with respect to ct. The vector equations are
evolution equations that allow one to propagate E and B in time, given their initial
values and the source J.
In total, there are 8 Maxwell equations for the 6 unknown fields. This looks
like an over-constrained system. It is better to view them as two vector evolution
equations for two vectors and to view the scalar equations as a constraint on
the initial data. This constraint is preserved by the evolution provided (ρ, J)
satisfy the continuity equation.

Exercise 6.1. Show that the evolution respects the constraint.

Example 6.2 (Current carrying wire). An electrically neutral, infinitely long,
metallic straight wire (along the z-axis) with circular cross section of radius a
carries a stationary current I. Suppose Ohm's law holds in the form J = σE with σ a
constant in the wire. Find the profiles of the electric and magnetic fields inside
and outside the wire. Assume cylindrical symmetry, translational symmetry in
the z direction and stationarity. Analyze the problem in cylindrical coordinates
(ρ, θ, z).
By Gauss and the symmetry $E_\rho = 0$. Since the magnetic field is assumed
stationary, by Faraday and the symmetry $E_\theta = 0$. You might then be tempted
to say that outside the wire, one should have $E_z = 0$. This, however, leads to a
contradiction: combining Gauss and Faraday
$$0 = \nabla\times\mathbf{E} \;\Rightarrow\; 0 = \nabla\times(\nabla\times\mathbf{E}) = -\Delta\mathbf{E} + \nabla(\nabla\cdot\mathbf{E}) \;\Rightarrow\; \Delta\mathbf{E} = 0$$

which says that E is harmonic everywhere. Hence, if it is zero outside the wire,
it is zero everywhere. This, together with Ohm's law, contradicts the assumption
that the wire carries current.
Let us then retreat to the next line of defense and take $\mathbf{E} = E_0\hat{\mathbf{z}}$ with $E_0$ a
constant. This is still harmonic. By the integral form of Ampere
$$\mathbf{B} = \frac{2I}{c\rho}\,\hat{\boldsymbol{\theta}} \times \begin{cases} 1 & \rho > a \\ \rho^2/a^2 & \rho < a \end{cases}$$

where I is the total current. We have used the fact that inside the wire, the
constancy of E implies the constancy of J.
It may be a little shocking at first that a neutral current carrying wire bundles
with it an electric field that does not decay as you get far from the wire. This is
a pathology due to the assumed infinite length of the wire.

6.3 New Physics


Lagrangian field theory allowed us to repackage Maxwell's electrodynamics in
an elegant formalism. But more importantly, it allows us to explore new models.
Most models are, in the end, just models, with applications mainly as questions
in homework sets and exams. But occasionally, some turn out to capture new
physics.

6.4 Electrodynamics in 1+1 dimensions


It is not possible to explicitly solve Maxwell's equations for the fields for general
motion of the sources. However, in 1+1 dimensions this is possible.
In 1+1 dimensions F is an anti-symmetric 2 × 2 matrix. Its single independent entry is
the electric field, which is a Lorentz scalar. Maxwell's equations are then

$$\partial_x E = 4\pi\rho, \qquad \partial_t E = -4\pi J \tag{6.14}$$

and they are consistent if the source satisfies the continuity equation.

Exercise 6.3 (Solution of Maxwell equations for arbitrary motion of a point
charge). Show that for a charged particle with a given, arbitrary, orbit, the
solution of Maxwell equations for E takes two constant values in the space-time
plane separated by the world line of the particle. Determine the jump across the
world line.

Figure 6.2: Maxwell's equation in 1+1 dimensions can be solved geometrically
for arbitrary motion of the source. The figure illustrates the solution in space-time
for two point sources undergoing constant acceleration. The field takes
constant values in the different regions delineated by the orbits of the charges.
88 CHAPTER 6. THE INHOMOGENEOUS MAXWELL’S EQUATIONS

6.4.1 Axion
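In Maxwell’s theory the source term is the vector field of currents $j^\mu$. In 1+1
dimensions there is a different option for a source term, namely a scalar field
φ(x)⁵:
\[
L = -\tfrac14 F_{\mu\nu}F^{\mu\nu} + \tfrac12\,\phi(x)\,\varepsilon^{\mu\nu}F_{\mu\nu}
\]
This looks at first like a different theory, but it is actually equivalent to Maxwell’s.
The variation of A gives, up to boundary terms,
\[
\delta L = (\partial_\mu F^{\mu\nu})\,\delta A_\nu - (\partial_\mu\phi)\,\varepsilon^{\mu\nu}\,\delta A_\nu
\]
The Euler-Lagrange equations for this model are then
\[
\partial_\mu F^{\mu\nu} = j^\nu, \qquad j^\nu = (\partial_\mu\phi)\,\varepsilon^{\mu\nu} \tag{6.15}
\]
Note that $\partial_\nu j^\nu = \varepsilon^{\mu\nu}\partial_\nu\partial_\mu\phi = 0$, because ε is anti-symmetric while the
second derivatives are symmetric, so the current is conserved. We have recovered Maxwell
theory except that the current is interpreted as the gradient of a scalar.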

6.5 The quantum Hall effect


The quantum Hall effect, discovered in the 1980s, ushered in a new era of
research now called the study of topological phases. These phases are in-
trinsically quantum and are topological in the sense that the quantum state of
the system has certain topological features that I shall not go into. The Hall
effect is a two dimensional phenomenon and the topological phases are labeled
by a quantized value of the Hall conductance. Let me explain.
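In 2 dimensions the conductance is a rank 2 tensor. There are two rank 2
tensors that are rotationally invariant: The identity and Levi-Civita. So the
most general isotropic conductance is
\[
\mathbf{J} = \begin{pmatrix} \sigma_\parallel & \sigma_H \\ -\sigma_H & \sigma_\parallel \end{pmatrix}\mathbf{E},
\qquad \sigma_\parallel > 0 \tag{6.16}
\]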
The diagonal part is the dissipative conductance, and this is why it is always
positive. The off-diagonal is the Hall conductance and can have either sign.
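
The statement that only the diagonal part dissipates can be illustrated numerically. The sketch below is not part of the notes: the function names and the sample numbers are illustrative assumptions. It builds the tensor of Eq. 6.16 and evaluates the Joule heating J · E.

```python
# Minimal sketch of Eq. 6.16; all names here are illustrative assumptions.

def isotropic_conductance(sigma_par, sigma_hall):
    """2x2 isotropic conductance: sigma_par * identity + sigma_hall * Levi-Civita."""
    return ((sigma_par, sigma_hall),
            (-sigma_hall, sigma_par))


def current(sigma, E):
    """J = sigma E for a two-component field E = (E1, E2)."""
    return tuple(sum(sigma[i][j] * E[j] for j in range(2)) for i in range(2))


def dissipated_power(sigma, E):
    """Joule heating density J . E."""
    J = current(sigma, E)
    return J[0] * E[0] + J[1] * E[1]


E = (3.0, 4.0)
# With a dissipative part, the heating is sigma_par * |E|^2.
print(dissipated_power(isotropic_conductance(0.5, 2.0), E))
# A pure Hall response drives a current perpendicular to E and does no work.
print(dissipated_power(isotropic_conductance(0.0, 2.0), E))
```

The Hall term drops out of J · E because the Levi-Civita part is anti-symmetric, which is why the sign of σH is unconstrained while the diagonal part must be positive.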
At sufficiently low temperatures, and sufficiently strong magnetic field, the
system is almost in the ground state. One then finds that σH as a function of
B displays the following features
• It is a step-like function of B
• In the steps, there is no dissipation and the Hall conductance is quantized
\[
\sigma_\parallel = 0, \qquad \sigma_H \in \frac{e^2}{h}\,\mathbb{Q} \tag{6.17}
\]
Q denotes the rationals. The appearance of Planck’s constant is an indication
that the phenomenon is quantum.
5 Frank Wilczek, who invented this field, called it Axion field.

• The magnetic field and the charge density are related by

∂B (cρ) = σH (6.18)

We can summarize Eqs. 6.16-6.18 in one space-time vector equation

j = σH F ∗ , j = (cρ, j 1 , j 2 ), F ∗ = (B, E2 , −E1 ) (6.19)

Figure 6.3: The phase diagram for the Integer quantum Hall effect for the Hofs-
tadter model on the triangular lattice at T = 0. The vertical axis is the magnetic
flux through the unit cell. The horizontal axis is the chemical potential. Figure
made by Gal Yehoshua for an undergrad project.

6.5.1 The Chern-Simons action


The Chern Simons field theory captures some of the basic features of the quan-
tum Hall effect:

• It is an intrinsically 2 space dimensional theory

• It has broken time reversal and space inversion, symmetries that are bro-
ken by the large external magnetic field

• It incorporates charge conservation.

• It has the right scaling dimension

Charge conservation in 2+1 dimensions is

0 = ∂µ j µ , j = (cρ, j 1 , j 2 ) (6.20)

If you think of current conservation as the statement that the divergence of a
vector field vanishes, then this vector field must be the curl of some vector
field. To make contact with the Hall effect we identify the vector field with the
electromagnetic gauge field A
\[
j^\alpha = k\,\varepsilon^{\alpha\beta\gamma}\,\partial_\beta A_\gamma \tag{6.21}
\]
k is an appropriate constant that we shall choose later⁶. It is easy to see that
Eq. 6.20 indeed holds, since ε is anti-symmetric while $\partial_\alpha\partial_\beta$ is symmetric.
Since we picked A to be the electromagnetic gauge field, by definition of the dual
\[
(F^*)^\alpha = \tfrac12\,\varepsilon^{\alpha\beta\gamma}F_{\beta\gamma} = \varepsilon^{\alpha\beta\gamma}\,\partial_\beta A_\gamma \tag{6.22}
\]
We can now ask what Lagrangian would give Eq. 6.19 as its Euler-Lagrange
equations.
Since Eq. 6.19 is Lorentz covariant, the Lagrangian must be a Lorentz scalar.
As the Hall effect breaks time and space reflections, we want the Lagrangian
to have this feature as well. The Levi-Civita tensor breaks these symmetries,
so the Lagrangian should be the contraction of Levi-Civita εαβγ with a third
rank tensor made from F and A. There is one way to do that, and this is the
Chern-Simons action
\[
L_{CS} = \frac{k}{4\pi c^2}\,(F^*)^\alpha A_\alpha = \frac{k}{8\pi c^2}\,\varepsilon^{\alpha\beta\gamma}F_{\beta\gamma}A_\alpha, \tag{6.23}
\]
α runs over 0, 1, 2. k is a constant that we shall adjust later.
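The Chern-Simons Lagrangian density is not gauge invariant. However, it
nevertheless leads to gauge invariant equations of motion. Under a change of
gauge Aµ → Aµ + ∂µ Λ, the Chern-Simons Lagrangian density LCS changes by
a boundary term.
\[
L_{CS} \mapsto L_{CS} + \frac{k}{8\pi c^2}\,\varepsilon^{\alpha\beta\gamma}F_{\beta\gamma}\,\partial_\alpha\Lambda
= L_{CS} + \frac{k}{4\pi c^2}\,\partial_\alpha\!\left(\varepsilon^{\alpha\beta\gamma}\,\partial_\beta A_\gamma\,\Lambda\right) \tag{6.24}
\]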

Exercise 6.4. Show that ∂µ (F ∗ )µ = 0 is a consequence of the fact that F is
defined by the potentials A. (This is known as the Bianchi identity.) Interpret
the Bianchi identity in terms of Faraday’s law.

The CS Lagrangian breaks both time reversal symmetry and parity.
You can see this either from the fact that it is first order in the derivatives, or
from the Levi-Civita tensor. This is a reflection of the symmetry of the Hall
effect where the external magnetic field breaks both symmetries.
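To find the equations of motion for CS consider first the variation of the
action
\begin{align*}
\delta(F^* A) &= \varepsilon^{\alpha\beta\gamma}(\delta\partial_\beta A_\gamma)A_\alpha + (F^*)^\alpha\,\delta A_\alpha \\
&= -\varepsilon^{\alpha\beta\gamma}(\delta A_\gamma)(\partial_\beta A_\alpha) + (F^*)^\alpha\,\delta A_\alpha + \partial_\beta\!\left(\varepsilon^{\alpha\beta\gamma}A_\alpha\,\delta A_\gamma\right) \\
&= 2(F^*)^\alpha\,\delta A_\alpha + \partial_\beta(\dots)
\end{align*}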
6 By dimensional analysis k has the dimensions of conductance.

The interaction term is the same as in Maxwell theory, and the full Lagrangian
is
\[
L_{CS} + \frac{1}{c^2}\,A_\alpha j^\alpha
\]
The Euler-Lagrange equations are
\[
\frac{k}{2\pi}\,F^* + j = 0 \tag{6.25}
\]
Unlike Maxwell’s equations, this is not a set of differential equations, but an
algebraic relation between the fields and sources. Comparing with Eq. 6.19
gives
k = −2πσH (6.26)

CS is the boundary of axion electrodynamics in the bulk


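CS can be viewed as the holographic shadow of the action associated with E · B
in the bulk:
\begin{align}
\tfrac14 F_{\mu\nu}(F^*)^{\mu\nu} &= \varepsilon^{\alpha\beta\gamma\delta}\,\partial_\alpha A_\beta\,\partial_\gamma A_\delta \notag \\
&= \partial_\alpha\!\left(\varepsilon^{\alpha\beta\gamma\delta}A_\beta\,\partial_\gamma A_\delta\right) \tag{6.27} \\
&= \tfrac12\,\partial_\alpha\!\left(\varepsilon^{\alpha\beta\gamma\delta}A_\beta F_{\gamma\delta}\right) \tag{6.28}
\end{align}
It follows that
\[
\frac12\int d^4x\,F_{\mu\nu}(F^*)^{\mu\nu} = \int dS_\alpha\,\varepsilon^{\alpha\beta\gamma\delta}A_\beta F_{\gamma\delta} \tag{6.29}
\]
CS is therefore related to the 3+1 axion electrodynamics associated with the
2+1 dimensional boundary.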

Quantization
To explain why k must be quantized one needs to input quantum mechanics and
also assume that the two dimensional space of the system is a closed manifold,
e.g. a two dimensional torus.
In quantum mechanics one allows the system to explore all configurations.
A configuration is weighted by a complex phase
\[
e^{iS/\hbar}
\]
The CS action changes by a boundary term under a gauge transformation. The
weight therefore is gauge invariant only if under a gauge transformation Λ
\[
\frac{k}{4\pi\hbar c^2}\int_{\mathrm{bdry}} F^*_\alpha\,\Lambda\,dS^\alpha = 0 \mod 2\pi \tag{6.30}
\]

To examine this condition, we need to discuss how Λ enters into quantum me-
chanics. Λ affects the phase of the wave function. Since [Λ] = [e] and a phase is
dimensionless, we first need to adjust the dimensions. As α = e²/ħc is dimen-
sionless, a dimensionless quantity is
\[
\frac{\alpha^n}{e}\,\Lambda
\]
The phase should have something to do with QM, so the simplest choice for a phase
change is (n = 1)
\[
e^{ie\Lambda/\hbar c} \tag{6.31}
\]
The next interesting thing to observe about this expression is that a constant
Λ that satisfies
\[
e^{ie\Lambda/\hbar c} = 1 \tag{6.32}
\]
is a trivial gauge transformation. Therefore, a gauge transformation
\[
\Lambda(t) = \frac{hc}{2e}\left(\tanh(t/T) + 1\right) \tag{6.33}
\]
is asymptotically trivial: It does not affect either Aµ or the phase of the wave
function in the distant past and the distant future. The condition that the
weight of a path is gauge invariant, Eq. 6.30, is then
\[
0 \mod 2\pi = \frac{k}{4\pi\hbar c^2}\int_{\mathrm{bdry}} F^*_\alpha\,\Lambda\,dS^\alpha \tag{6.34}
\]
Now suppose the physical system is a torus with −L < x, y < L and that the
time is −10T < t < 10T . The boundary of this 3-D box has 6 faces, so the
integral above can be broken into 6 integrals
\[
\int_{t=\pm 10T}\Lambda F^*_t\,dx\,dy
+ \underbrace{\int_{x=\pm L}\Lambda F^*_x\,dy\,dx^0}_{=0}
+ \underbrace{\int_{y=\pm L}\Lambda F^*_y\,dx^0\,dx}_{=0} \tag{6.35}
\]
The terms that vanish do so because ΛF* is a periodic function in x and y with
period 2L. Since Λ ≈ 0 for t = −10T and Λ ≈ hc/e for t = 10T we finally get
the quantization condition
\[
0 \mod 2\pi = \frac{k}{4\pi\hbar c^2}\int F^*_0\,\Lambda\,dx\,dy = \frac{k}{4\pi c e}\int B\,dx\,dy \tag{6.36}
\]
Now we need one more input from QM: the Dirac monopole condition. In the case
that the Hall system is a 2-D torus, the magnetic flux through the torus is
quantized
\[
\frac{e}{\hbar c}\int B\,dx\,dy = 2\pi m, \qquad m \in \mathbb{Z}
\]
This gives for the rhs of Eq. 6.34
\[
0 \mod 2\pi = \frac{k}{4\pi c e}\int B\,dx\,dy = \frac{m k \hbar}{2e^2} \tag{6.37}
\]
which quantizes mk to a multiple of e²/ħ. Since k = −2πσH we get that the
Hall conductance is a fractional multiple of the quantum unit of conductance
\[
\sigma_H = \frac{e^2}{h}\,\frac{2N}{m}, \qquad N, m \in \mathbb{Z} \tag{6.38}
\]

6.6 Supplement: Axion electrodynamics


In 3+1 dimensions F ∗ · F is a boundary term, and as such it does not affect
the equations of motion. However, this term can do something interesting if
its coupling constant is replaced by a function. The function is called the Axion
field φ(x):
\[
L = -\frac{1}{16\pi c}\,F\cdot F - \frac{\sigma_0}{8\pi}\,\phi(x)\,F^*\cdot F
\]
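Evidently, this Lagrangian is gauge invariant⁷. Since
\[
\phi\,F^*\cdot F = 2\phi\,(F^*)^{\mu\nu}\partial_\mu A_\nu
= 2\phi\,\partial_\mu\!\left((F^*)^{\mu\nu}A_\nu\right)
= -2(\partial_\mu\phi)\,(F^*)^{\mu\nu}A_\nu + \partial_\mu(\dots)
\]
we can then replace it by (a formally gauge dependent Lagrangian density)
\[
L = -\frac{1}{16\pi c}\,F\cdot F - \frac{\sigma_0}{4\pi}\,(\partial_\mu\phi)\,(F^*)^{\mu\nu}A_\nu
\]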
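Up to boundary terms, the variation of the action is
\[
4\pi c\,\delta L = (\partial_\mu F^{\mu\nu})\,\delta A_\nu + \alpha\,(\partial_\mu\phi)\,(F^*)^{\mu\nu}\,\delta A_\nu
\]
Exercise 6.5. Determine α (Answer: α = −3 e²/hc)
The Euler-Lagrange equations are
\[
\partial_\mu F^{\mu\nu} = -\alpha\,(\partial_\mu\phi)\,(F^*)^{\mu\nu}
\]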

It is instructive to write the equations in terms of E and B. Gauss law (without
external sources) now takes the form

∇ · E = α(∇φ) · B

Ampere law
∂0 E − ∇ × B = αφ̇B + α∇φ × E
Exercise 6.6. Verify.
When φ is a constant one recovers the sourceless Maxwell equations. In
general, ∂µ φ acts like a source term in Maxwell equations.

6.6.1 Quantum interface


Axion electrodynamics started as a speculative model of an elementary particle:
The Axion. A different perspective was taken by Qi et al., who proposed looking
at the interface between topologically distinct quantum phases. In the bulk of
the two insulators Maxwell theory applies. This says that φ is constant in
each. The constant is quantized to be 0 or π, by a gauge invariance argument
similar to the one in the CS theory of the quantum Hall effect. By definition, the
two insulators are topologically distinct if the constant is different in each. This
means that $\nabla\phi = \delta^{(2)}(x)\,\mathbf{n}$. Gauss law is replaced by
\[
\nabla\cdot\mathbf{E} = \alpha\,\delta^{(2)}(x)\,\mathbf{n}\cdot\mathbf{B}
\]
The magnetic field on the surface acts as if there was a charge on the interface.
This is something we have already encountered in the CS theory of the quantum
Hall effect.
Ampere law is replaced by
\[
\partial_0\mathbf{E} - \nabla\times\mathbf{B} = \alpha\,\delta^{(2)}(x)\,\mathbf{n}\times\mathbf{E}
\]
This means that the electric field on the surface acts as if there were currents at the
interface. In particular, in Axion electro-magneto-statics
\[
\nabla\cdot\mathbf{E} = \alpha\,\delta^{(2)}(x)\,\mathbf{n}\cdot\mathbf{B}, \qquad
-\nabla\times\mathbf{B} = \alpha\,\delta^{(2)}(x)\,\mathbf{n}\times\mathbf{E}
\]

Figure 6.4: The interface between two insulators that are topologically distinct
(φ = 0 and φ = π) gives rise to a singular Axion field.

7 Since E is even and B odd under time reversal the Lagrangian breaks time-reversal unless
φ is also odd under time reversal. The notion of time reversal in the quantum case is subtle.

6.6.2 Magnetic response to an electric field


Consider a non-trivial insulator in the form of an infinitely long cylinder of
radius a immersed in the trivial vacuum and z-oriented. Take uniform electric
field everywhere and uniform magnetic field inside the cylinder
E = E0 ẑ, B = B0 θ(a − ρ) ẑ
Clearly
∇ · E = 0, n·B=0
so Axion Gauss law is satisfied.
∇ × B vanishes inside the cylinder and outside the cylinder, but has a delta
jump on the boundary. The magnetic field in the wire is proportional to the
constant electric field:

Figure 6.5: A cylinder of a (non-trivial) topological insulator is immersed in
vacuum (trivial insulator). A uniform electric field E in the axial direction
leads to a response in the form of a magnetic field B inside the cylinder.

Exercise 6.7. Use Stokes theorem for the (blue) rectangle shown in the figure
to show that
B0 = αE0

6.6.3 Phantom monopoles


In electrostatics an electric charge near a (grounded) conductor has an oppo-
sitely charged image. I want to show that in Axion electrodynamics you can
create an image which is a magnetic monopole. We want to find a consistent solution
as if there was a magnetic monopole in the lower half space. Namely
\[
\mathbf{B}(\mathbf{x}) = g\,\frac{\mathbf{x} + d\hat{\mathbf{z}}}{|\mathbf{x} + d\hat{\mathbf{z}}|^3}\,\theta(z)
+ (\text{yet unknown function})\,\theta(-z)
\]

Exercise 6.8. Explain why ∇ · B = ∇ × B = 0 in the half space z > 0

The magnetic field provides a source term for the electric field. The source term
is precisely the same as the source term in the corresponding electrostatic image
charge problem provided
gα = 2e
Now, if we add to this field the electric field given by an electric monopole of charge
e above the x-y plane we obtain the same electric field configuration as in the
electrostatic image charge problem, everywhere, i.e.
\[
\mathbf{E} = e\left(\frac{\mathbf{x} - d\hat{\mathbf{z}}}{|\mathbf{x} - d\hat{\mathbf{z}}|^3}
- \frac{\mathbf{x} + d\hat{\mathbf{z}}}{|\mathbf{x} + d\hat{\mathbf{z}}|^3}\right)\theta(z)
\]

Figure 6.6: An electric charge (red dot) is placed near a different topological
insulator with zero fields. On the left, the physical setup. On the right, the
image method. (Labels in the figure: E, B; φ = 0; φ = π; E = 0.)

This describes the electric field everywhere. It remains to see what values B
takes in the lower half-space. Now E · n = 0 on the boundary and so we see
that B is the solution of
∇×B=∇·B=0
everywhere subject to the boundary condition that fixes B on the plane z = 0.
We introduce a scalar potential for B in the lower half space

B = ∇φ, ∆φ = 0

subject to the boundary condition ∇φ = B on the plane z = 0. Evidently


\[
\phi(x, y, z=0) = \frac{g}{\sqrt{x^2 + y^2 + d^2}}, \qquad
B_z = \partial_z\phi = \frac{g\,d}{|x^2 + y^2 + d^2|^{3/2}}
\]

The problem then reduces to solving Laplace equation with two types of bound-
ary conditions.
Bibliography: Xiao-Liang Qi et al., Inducing a Magnetic Monopole with
Topological Surface States, Science 323, 1184 (2009).

Figure 6.7: The red curve shows the surface charge density that allows the
field to terminate at the surface. On the right one sees the response in the form
of a magnetic field that seems to have a magnetic monopole at the image point.
There is no real magnetic monopole anywhere, of course.
Chapter 7

Magnetic fields and magnetic induction

So far we considered the electric field and the magnetic induction (E, B), and
derived Maxwell equations from Lorentz invariance. Let us now turn to (D, H)
known as the displacement field and magnetic field.
One good reason to introduce two different notions of electric fields, (E, D)
and two different notions of magnetic fields (B, H) is that they are associated
with a priori different measurements. (E, B) are defined by measuring the
force and then using the Coulomb-Lorentz law to determine (E, B). D is defined
by measuring the charge on a pair of metallic plates, as in Fig. 7.1 and H by
measuring the surface current on a thin superconductor, as in Fig. 7.2.
A second good reason is that they are associated with different equations:
(E, B) are associated with the homogeneous Maxwell equations

∇ · B = 0, Ḃ + ∇ × E = 0 (7.1)

Figure 7.1: Put two thin metallic plates in contact in the field. The surface
charge density on the plates is proportional to the field strength. Separate the
plates and measure the charge on the top plate. Define the field D as the
maximal charge per unit area over all initial orientations of the plates.

Figure 7.2: A thin superconducting cylinder expels the magnetic field by creat-
ing surface currents. Measuring the current gives H.

while (D, H) are associated with the inhomogeneous Maxwell equations
\[
\nabla\cdot\mathbf{D} = 4\pi\rho, \qquad -\dot{\mathbf{D}} + \nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{J} \tag{7.2}
\]
In a stationary case D = 0 inside a metal and has vanishing tangential com-
ponent, D∥ = 0, on the surface. Hence, by Eq. 7.2, on a metallic surface
D⊥ is, up to a factor 4π, the surface charge
density. This explains why measuring the charges on the plates in Fig. 7.1 is a
measure of D. Similarly, in a stationary case H = 0 inside a superconductor,
and H∥ is proportional to the surface current. This explains why measuring the
current in Fig. 7.2 is a measurement of H.
The in-homogeneous equations follow from charge conservation: Given ρ,
define D by the solution of the differential equation

∇ · D = 4πρ (7.3)

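The solution is explicitly given by an integral: By the superposition principle
\[
\mathbf{D}(\mathbf{x}, t) = \int d^3y\,\rho(\mathbf{y}, t)\,\frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^3} \tag{7.4}
\]
Now, use Eq. 7.3 in the equation for charge conservation
\[
0 = \partial_t\rho + \nabla\cdot\mathbf{J} = \frac{c}{4\pi}\,\nabla\cdot\left(\dot{\mathbf{D}} + \frac{4\pi}{c}\,\mathbf{J}\right) \tag{7.5}
\]
Since the expression in brackets has zero divergence, it is the curl of something. This
something is H
\[
\dot{\mathbf{D}} + \frac{4\pi}{c}\,\mathbf{J} = \nabla\times\mathbf{H} \tag{7.6}
\]
This is the inhomogeneous Maxwell equation.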

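We can amalgamate the two fields in tensors. E and B in F
\[
F_{\mu\nu} = \begin{pmatrix}
0 & -E_x & -E_y & -E_z \\
E_x & 0 & B_z & -B_y \\
E_y & -B_z & 0 & B_x \\
E_z & B_y & -B_x & 0
\end{pmatrix} \tag{7.7}
\]
and D and H in D
\[
D^{\mu\nu} = \begin{pmatrix}
0 & D_x & D_y & D_z \\
-D_x & 0 & H_z & -H_y \\
-D_y & -H_z & 0 & H_x \\
-D_z & H_y & -H_x & 0
\end{pmatrix} \tag{7.8}
\]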

7.1 Constitutive relations


Maxwell equations written in terms of F and D are
\[
\partial_\mu (F^*)^{\mu\nu} = 0, \qquad \partial_\mu D^{\mu\nu} = \frac{4\pi}{c}\,j^\nu \tag{7.9}
\]
To close the set we need a constitutive relation between F and D. The general
form of such a relation is
\[
D^{\mu\nu} = \varepsilon_0^{\mu\nu\alpha\beta}F_{\alpha\beta} \tag{7.10}
\]
$\varepsilon_0^{\mu\nu\alpha\beta}$ is the permittivity (not to be confused with Levi-Civita). It is a tensor
of rank 4, which is anti-symmetric in the first pair of indices and the last pair
and so has, in principle, 36 components. It is a property of the material. In the
rest frame of the material it is a function of position, reflecting the composition
of the material, but not a function of time if the material is in equilibrium. It
is a generalization of the familiar relation

Dj = (ε0 )jk E k , B j = µjk Hk (7.11)

describing the relation between the electric displacement field and the electric
field in a dielectric and between the magnetic induction and the magnetic field
in a magnetic material. Under Lorentz transformations ε0 transforms like a
4-th rank tensor. This reflects the fact that a constitutive relation normally
breaks Lorentz invariance since it is a property of a medium that has a rest
frame.
In the case of free space the constitutive relation is the same in all Lorentz
frames. Indeed, since in vacuum F µν = Dµν we have
\[
D^{\mu\nu} = \varepsilon_0^{\mu\nu\alpha\beta}F_{\alpha\beta} = \eta^{\mu\alpha}\eta^{\nu\beta}F_{\alpha\beta} \tag{7.12}
\]
It follows that
\[
\varepsilon_0^{\mu\nu\alpha\beta} = \frac12\left(\eta^{\mu\alpha}\eta^{\nu\beta} - \eta^{\nu\alpha}\eta^{\mu\beta}\right) \tag{7.13}
\]
and we wrote ε0 so that it is explicitly anti-symmetric in the first and last pair of
indices. The tensor ε0 plays the role of the identity for 4-th rank anti-symmetric
tensors. It is invariant under Lorentz transformations, since η is.
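
The claim that ε0 acts as the identity on anti-symmetric tensors can be checked by brute force. The sketch below is an illustration, not part of the notes: it assumes the signature (−,+,+,+), takes the component layout of F from Eq. 7.7, and uses helper names that are hypothetical.

```python
from itertools import product

# Assumed convention: signature (-,+,+,+), so eta^{mu nu} and eta_{mu nu}
# have the same numerical entries.
ETA = [[(-1 if m == 0 else 1) if m == n else 0 for n in range(4)] for m in range(4)]


def vacuum_permittivity():
    """eps0^{mu nu alpha beta} of Eq. 7.13, stored as a dict keyed by index tuples."""
    return {
        (m, n, a, b): 0.5 * (ETA[m][a] * ETA[n][b] - ETA[n][a] * ETA[m][b])
        for m, n, a, b in product(range(4), repeat=4)
    }


def field_tensor(E, B):
    """Covariant F_{mu nu} laid out as in Eq. 7.7."""
    (Ex, Ey, Ez), (Bx, By, Bz) = E, B
    return [[0, -Ex, -Ey, -Ez],
            [Ex, 0, Bz, -By],
            [Ey, -Bz, 0, Bx],
            [Ez, By, -Bx, 0]]


def constitutive(eps, F):
    """D^{mu nu} = eps^{mu nu alpha beta} F_{alpha beta} (Eq. 7.10)."""
    return [[sum(eps[m, n, a, b] * F[a][b] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]


def raise_indices(F):
    """F^{mu nu} = eta^{mu alpha} eta^{nu beta} F_{alpha beta}."""
    return [[sum(ETA[m][a] * ETA[n][b] * F[a][b] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]


eps0 = vacuum_permittivity()
F = field_tensor((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
D = constitutive(eps0, F)
# In vacuum D^{mu nu} should coincide with F^{mu nu}; print the comparison.
print(all(abs(D[m][n] - raise_indices(F)[m][n]) < 1e-12
          for m in range(4) for n in range(4)))
```

With these conventions the time-space components of D reproduce E and the space-space components reproduce B, which is the statement D = E, H = B in vacuum.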
This relation for E and D takes the form
E j = Dj = (ε0 )jk Ek = η jk Ek (7.14)
This gives the Minkowski metric the interpretation of a dielectric constant
of the vacuum. This observation is the starting point of the theory of cloaking.

7.2 Polarization and Magnetization


When we consider Maxwell’s equations near or inside material bodies, the
sources can be partitioned as
ρ = ρmicro + ρmacro , J = Jmicro + Jmacro (7.15)
We normally cannot describe the microscopic sources.
The homogeneous Maxwell equations are oblivious to the sources and retain their usual
form
∇ · B = 0, Ḃ + ∇ × E = 0 (7.16)
The inhomogeneous equations care about the sources, so if we want to get rid
of the microscopic sources we need to do something. We do that by replacing
the microscopic sources by fields that describe the medium: The polarization P
and the magnetization M. Both are a property of the microscopic sources and
so are confined to the material bodies containing them:
P(x) = M(x) = 0, x ∈ {outside body} (7.17)
The polarization and magnetization characterize the microscopic sources by
\[
\rho_{\mathrm{micro}} = -\nabla\cdot\mathbf{P}, \qquad
\frac{1}{c}\,\mathbf{J}_{\mathrm{micro}} = \dot{\mathbf{P}} + \nabla\times\mathbf{M} \tag{7.18}
\]
The inhomogeneous Maxwell equations involving all the sources are
\[
\nabla\cdot\mathbf{E} = 4\pi(\rho_{\mathrm{micro}} + \rho_{\mathrm{macro}}), \qquad
-\dot{\mathbf{E}} + \nabla\times\mathbf{B} = \frac{4\pi}{c}\,(\mathbf{J}_{\mathrm{micro}} + \mathbf{J}_{\mathrm{macro}}) \tag{7.19}
\]
If we define
D = E + 4πP, H = B − 4πM
we find differential equations that involve only the macroscopic sources:
\[
\nabla\cdot\mathbf{D} = 4\pi\rho_{\mathrm{macro}}, \qquad
-\dot{\mathbf{D}} + \nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{J}_{\mathrm{macro}} \tag{7.20}
\]
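
The step from Eq. 7.19 to Eq. 7.20 is a direct substitution. The following sketch assumes that Eq. 7.18 is read as ρ_micro = −∇·P and (1/c) J_micro = Ṗ + ∇×M, with the dot denoting the derivative with respect to x⁰, and that the source terms carry the factor 4π/c as in Eq. 7.2:

```latex
\begin{align*}
\nabla\cdot\mathbf{D}
  &= \nabla\cdot\mathbf{E} + 4\pi\,\nabla\cdot\mathbf{P}
   = 4\pi(\rho_{\mathrm{micro}} + \rho_{\mathrm{macro}}) - 4\pi\rho_{\mathrm{micro}}
   = 4\pi\rho_{\mathrm{macro}}, \\
-\dot{\mathbf{D}} + \nabla\times\mathbf{H}
  &= \bigl(-\dot{\mathbf{E}} + \nabla\times\mathbf{B}\bigr)
     - 4\pi\bigl(\dot{\mathbf{P}} + \nabla\times\mathbf{M}\bigr) \\
  &= \frac{4\pi}{c}\bigl(\mathbf{J}_{\mathrm{micro}} + \mathbf{J}_{\mathrm{macro}}\bigr)
     - \frac{4\pi}{c}\,\mathbf{J}_{\mathrm{micro}}
   = \frac{4\pi}{c}\,\mathbf{J}_{\mathrm{macro}} .
\end{align*}
```

The microscopic sources cancel identically, which is the whole point of the definitions of D and H.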
We succeeded in getting rid of the microscopic sources at the price of doubling
the number of unknown fields to four vector fields: (E, B, D, H). It is instructive
to go again through the counting of equations and unknown fields:

• The homogeneous Maxwell equations Eq. 7.16 for the six fields (E, B)
should be thought of as one vector valued evolution for Ḃ and a constraint
on the initial data.
• The inhomogeneous Maxwell equations Eq. 7.20 for the six fields (D, H)
should be thought of as one vector valued evolution for Ḋ and a constraint
on the source term: The continuity equation.
• The missing 6 equations are the constitutive relations
\[
D^{\mu\nu} = \varepsilon^{\mu\nu\alpha\beta}F_{\alpha\beta} \tag{7.21}
\]
where the tensor ε characterizes the material. For the vacuum ε is given
by Eq. 7.13.
• In general, ε must be viewed as a tensorial linear operator.
If the material is homogeneous, i.e. translation invariant, then ε is a transla-
tion invariant linear operator, which is a convolution in space-time. In the case
that the material is also memory-less and has zero-range correlations, ε
reduces to a constant. This is the case for the vacuum¹.
Since D, E are even under time reversal while B, H are odd, when time-reversal
is a symmetry the constitutive relation takes the form

Dj = εjk Ek , Bj = µjk H k (7.22)

In the case of isotropic media εjk = ε0 δ jk and µjk = µ0 δjk are proportional to
the identity.
Exercise 7.1. Show that the microscopic charge distribution of a homogeneously
polarized sphere of radius a is concentrated on the surface with surface density

P · x̂ δ(|x| − a)

Exercise 7.2. Show that the current distribution of a homogeneously magne-
tized sphere of radius a is concentrated on the surface with surface density

M × x̂ δ(|x| − a)

Exercise 7.3. Find E, D inside and outside a uniformly polarized sphere.

1 This has to be revisited in a quantum theory.

Figure 7.3: The intuitive notion of polarization of infinite macroscopic bodies
in terms of dipole moments is confusing: Is the polarization the short arrow
pointing to the right or the long arrow pointing to the left? Only by looking at
the boundary can one tell.
Chapter 8

Cloaking

8.1 Dielectric media


In a dielectric medium, the homogeneous equations, Faraday’s law and no-
monopoles, are the same as in free space¹. In curvilinear coordinates they take
the form
\[
\nabla\cdot\mathbf{B} = \frac{1}{\sqrt g}\,\partial_j(\sqrt g\,B^j) = 0, \qquad
\dot B^j + \frac{\varepsilon^{jk\ell}}{\sqrt g}\,\partial_k E_\ell = 0 \qquad \text{(homogeneous)}
\]
Dot stands for derivative with respect to x0 . In the absence of external sources,
the inhomogeneous Maxwell equations, Gauss and Ampere laws, in a dielectric
are
\[
\nabla\cdot\mathbf{D} = \frac{1}{\sqrt g}\,\partial_j(\sqrt g\,D^j) = 0, \qquad
\dot D^j - \frac{\varepsilon^{jk\ell}}{\sqrt g}\,\partial_k H_\ell = 0 \qquad \text{(inhomogeneous)}
\]
There are twelve unknown fields, and six evolution equations and two con-
straints. The missing equations are the constitutive relations²
\[
D^j = \varepsilon^{jk}E_k, \qquad B^j = \mu^{jk}H_k
\]
where ε and µ are tensors. We can then write Maxwell’s equations for di-
electrics, in the absence of sources, as equations for E, H
\[
\partial_j(\sqrt g\,\mu^{jk}H_k) = 0, \qquad
\mu^{jk}\dot H_k + \frac{\varepsilon^{jk\ell}}{\sqrt g}\,\partial_k E_\ell = 0 \tag{8.1}
\]
and
\[
\partial_j(\sqrt g\,\varepsilon^{jk}E_k) = 0, \qquad
\varepsilon^{jk}\dot E_k - \frac{\varepsilon^{jk\ell}}{\sqrt g}\,\partial_k H_\ell = 0 \tag{8.2}
\]
1 In Landau Lifshitz, E and B represent averages over a macroscopically small, but micro-
scopically large, ball.
2 Hopefully no confusion will arise between ε as dielectric constant and ε as Levi-Civita
tensor.

The vacuum may then be thought of as a dielectric whose dielectric tensor,
in c.g.s. units, is the metric tensor:
ε=µ=g
where g is the (Euclidean) metric. In a Euclidean metric g is proportional to the
identity and in any metric $g^j{}_k = \delta^j_k$. For example, in a two dimensional space
described by cylindrical coordinates the covariant components of the metric and
contravariant components of the dielectric tensors are
\[
g = \begin{pmatrix} 1 & 0 \\ 0 & \rho^2 \end{pmatrix}, \qquad
\varepsilon = \mu = \begin{pmatrix} 1 & 0 \\ 0 & 1/\rho^2 \end{pmatrix} \tag{8.3}
\]
In practice, it is often the case that µ ≈ g. Then Maxwell’s equations take the
(approximate) form
\[
\partial_j(\sqrt g\,g^{jk}H_k) = 0, \qquad
\sqrt g\,g^{jk}\dot H_k + \varepsilon^{jk\ell}\,\partial_k E_\ell = 0 \tag{8.4}
\]
and
\[
\partial_j(\sqrt g\,\varepsilon^{jk}E_k) = 0, \qquad
\sqrt g\,\varepsilon^{jk}\dot E_k - \varepsilon^{jk\ell}\,\partial_k H_\ell = 0 \tag{8.5}
\]

8.2 Invisible dielectrics


We have all, at one point or another in our lives, bumped into a glass door.
If conditions are right, i.e. the glass is clean and there is more transmitted light
than reflected light, the glass is almost invisible. The glass is, in fact, not quite
invisible. It also reflects light. You can sometimes make a glass almost reflection-
less, by making the reflections from the back surface interfere destructively with
the reflections from the front surface. You cannot do that for all wavelengths,
but you can do that for some directions. What we want to do here is more
ambitious: Make a finite object with ε ≠ g behave as free space no matter from
what direction it is viewed.
Let’s examine invisibility from the perspective of Eqs. 8.4, 8.5.
Let us first focus on the equations that involve ε, namely Eq. 8.5. Consider
now free propagation, i.e. no dielectric, in a different curvilinear coordinate
system. Let us denote the metric in the second coordinate system γ and the
corresponding coordinates ξ. Eq. 8.5 is now modified by replacing
\[
\sqrt g\,\varepsilon^{jk}(x) \mapsto \sqrt\gamma\,\gamma^{jk}(\xi) \tag{8.6}
\]
Consider now the case that both g and γ describe different coordinates of the
same physical space. Then
\[
\gamma_{ab} = \frac{\partial x^k}{\partial\xi^a}\,\frac{\partial x^j}{\partial\xi^b}\,g_{jk} \tag{8.7}
\]
Now, suppose that we choose the dielectric ε so that the two functions are
the same, i.e.
\[
\left(\sqrt g\,\varepsilon^{jk}\right)(x) = \left(\sqrt\gamma\,\gamma^{jk}\right)(x) \tag{8.8}
\]

Figure 8.1: Reflection from a mirror can be minimized when the reflection from
the front and back interfere destructively. This works for specific directions and
wavelengths.

This means that as far as Eq. 8.5 is concerned the dielectric behaves as if it is
a coordinate transformation of free space. This is also true for Eq. 8.4 which is
independent of ε, since it holds in any curvilinear coordinate system. If ε ≠ g
in a finite ball, the coordinate transformation is restricted to the ball and one
gets a picture such as in Fig. 8.3. We have thus created an invisible dielectric.

Figure 8.2: The figure illustrates a local coordinate transformation of the Eu-
clidean plane, so that the images of straight lines become the curves in the
figure. Such a coordinate transformation can be implemented by a choice of a
suitable dielectric tensor ε. The corresponding dielectric, the reddish disk, is
invisible.

Example 8.1. Under the local coordinate transformation of polar coordinates
\[
\rho = r + e^{-r} - 1, \qquad \theta \mapsto \theta \tag{8.9}
\]
the straight line ρ sin θ = const in the (ρ, θ) plane transforms to
$(r + e^{-r} - 1)\sin\theta = \text{const}$, a curve with straight asymptotics.

8.3 Cloaking
Now that we know how to make dielectrics that are invisible³ the next challenge
is to engineer dielectrics that can cloak arbitrary objects.
The basic idea behind cloaking is a singular coordinate transformation, that
maps the exterior of a ball in physical space to the Euclidean space (with the
origin removed).
This is best illustrated by working through an example in the plane.
Let r be the radial coordinate of physical space, which hosts a dielectric tensor
ε(r). Consider the mapping from the plane with radial coordinates (ρ, θ) into
the plane with radial coordinates (r, θ)

r 2 = ρ2 + 1 (8.10)

This maps the entire plane ρ ≥ 0 to the r-plane minus a disk, namely r ≥ 1.
Physical space (r, θ) is the Euclidean plane with the standard Euclidean polar
metric g. The induced metric γ on the (ρ, θ) plane is, by Eq. 8.7
\[
\gamma_{\rho\rho} = \left(\frac{\partial r}{\partial\rho}\right)^2 = \left(\frac{\rho}{r}\right)^2, \qquad
\gamma_{\theta\theta} = r^2, \qquad \det\gamma = \rho^2 \tag{8.11}
\]
Maxwell’s equation 8.5 is fully determined by the functions
\[
\sqrt\gamma\,\gamma^{\rho\rho} = \rho\left(\frac{r}{\rho}\right)^2 = \frac{r^2}{\rho} = \rho + \frac{1}{\rho}, \qquad
\sqrt\gamma\,\gamma^{\theta\theta} = \rho\left(\frac{1}{r}\right)^2 = \frac{\rho}{\rho^2 + 1} \tag{8.12}
\]
The corresponding functions in the physical space with a dielectric are
\[
\sqrt g\,\varepsilon^{rr} = r\,\varepsilon^{rr}, \qquad \sqrt g\,\varepsilon^{\theta\theta} = r\,\varepsilon^{\theta\theta} \tag{8.13}
\]
The differential equations in the two spaces are the same equations up to re-
naming ρ ↔ r if the functions are the same, i.e.
\[
r\,\varepsilon^{rr} = r + \frac{1}{r}, \qquad r\,\varepsilon^{\theta\theta} = \frac{r}{r^2 + 1}, \qquad r > 1 \tag{8.14}
\]
This is the equation that makes ε a cloaking material. After simplification
\[
\varepsilon^{rr} = 1 + \frac{1}{r^2} = g^{rr} + \frac{1}{r^2}, \qquad
\varepsilon^{\theta\theta} = \frac{1}{r^2 + 1} = g^{\theta\theta} - \frac{1}{r^2(r^2 + 1)}, \qquad \text{for } r > 1 \tag{8.15}
\]
The second term makes it manifest that for r ≫ 1 the dielectric is surrounded
by the vacuum. The dielectric ε for r < 1 can be arbitrary and it does not affect
the solutions of Maxwell’s equations outside the ball r > 1. This shows that the
enveloping dielectric for r > 1 is cloaking the inside.
Of course, a difficult issue we have not addressed is how to engineer ε. In
fact, you’d expect a cloaking dielectric to be a weird material. Landau and
Lifshitz argue that ordinary materials at thermal equilibrium have
\[
\varepsilon^{jk} \ge g^{jk} \tag{8.16}
\]

Figure 8.3: The figure illustrates schematically a cloaking envelope shown as a
reddish annulus. Anything inside the white disk is invisible from the outside.

This is violated by Eq. 8.15.
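
The algebra leading to Eq. 8.15, and the violation of Eq. 8.16, can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the bound of Eq. 8.16 is tested component by component for the diagonal tensors that appear here.

```python
import math


def induced_functions(rho):
    """sqrt(gamma) * gamma^{rho rho} and sqrt(gamma) * gamma^{theta theta}
    for the map r^2 = rho^2 + 1 (Eqs. 8.10-8.12)."""
    r = math.sqrt(rho ** 2 + 1)
    gamma_rr = (rho / r) ** 2            # (dr/drho)^2
    gamma_tt = r ** 2
    sqrt_det = math.sqrt(gamma_rr * gamma_tt)   # equals rho
    return sqrt_det / gamma_rr, sqrt_det / gamma_tt


def cloak_permittivity(r):
    """Contravariant components eps^{rr}, eps^{theta theta} of Eq. 8.15."""
    return 1 + 1 / r ** 2, 1 / (r ** 2 + 1)


def vacuum_permittivity(r):
    """Inverse polar metric g^{rr}, g^{theta theta}, the vacuum 'dielectric'."""
    return 1.0, 1 / r ** 2


for x in (1.5, 2.0, 10.0):
    f_rr, f_tt = induced_functions(x)        # functions of rho, evaluated at rho = x
    eps_rr, eps_tt = cloak_permittivity(x)   # after renaming rho -> r, evaluated at r = x
    g_rr, g_tt = vacuum_permittivity(x)
    print(x,
          math.isclose(f_rr, x * eps_rr), math.isclose(f_tt, x * eps_tt),
          eps_rr >= g_rr, eps_tt >= g_tt)
```

The radial component exceeds its vacuum value, but the angular component is smaller than g^{θθ}, which is the sense in which Eq. 8.15 conflicts with the bound of Eq. 8.16.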

Bibliography:
J. B. Pendry et al., “Controlling electromagnetic fields”, Science 312 (2006)

3 The geometric description of cloaking is due to Ulf Leonhardt (now at Weizmann).
Chapter 9

The Stress-Energy tensor

The energy, momentum and angular momentum of the electromagnetic field
are identified. The Maxwell stress tensor is derived from the variation of the action due to
variations of the metric.

9.1 Maxwell stress energy tensor


It is a common experience that the electromagnetic field carries energy: You feel
the warmth of the sun. It is not a common experience that the electromagnetic
field also carries momentum: You do not feel the sun pushing you. Anyway,
you may remember that the energy density, and energy flux (and momentum
density) of the electromagnetic field are
\[
\mathcal{E} = \frac{E^2 + B^2}{8\pi}, \qquad \mathbf{S} = \frac{\mathbf{E}\times\mathbf{B}}{4\pi} \tag{9.1}
\]
We want to derive the covariant generalization of this result.
We first want to argue that what we are looking for is a second rank tensor.
This is because in relativity energy and momentum are amalgamated into a
single tensor. Therefore energy density and momentum density will be members
of the same tensor, and so would be the density of energy currents and density
of momentum currents. T 00 will stand for the energy density and T 0j for the
energy current in the j-th direction. T j0 stands for the j-th component of the
momentum density and T jk for the j-th component of the momentum current
in the k-th direction.

Figure 9.1: The stress tensor represents the flow of momentum (red arrow)
through a cross section in space-time (blue line)

The second observation we make is that T µν must be quadratic in the field
F . This is because the energy density and Poynting vector have this form.
There are three second rank tensors that we can make that have these prop-
erties, using F , η and ε. T should therefore be a linear combination of

F µα F ν α , η µν F αβ Fαβ , εµναβ Fαβ Fµν (9.2)

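The right linear combination is
\[
T^{\mu\nu} = \frac{1}{4\pi}\left(F^{\mu\alpha}F^{\nu}{}_{\alpha} - \tfrac14\,\eta^{\mu\nu}\,F\cdot F\right) \tag{9.3}
\]
This is the Maxwell energy-stress tensor. You can easily verify that this is the
right choice by checking that, with $\eta^{00} = -1$,
\[
T^{00} = \frac{1}{4\pi}\Bigl(\underbrace{F^{0j}F^{0j}}_{E^2}
- \tfrac14\,\eta^{00}\underbrace{F^{\alpha\beta}F_{\alpha\beta}}_{-2(E^2 - B^2)}\Bigr) = \mathcal{E} \tag{9.4}
\]
Explicitly, T is
\[
4\pi T = \frac{E^2 + B^2}{2}\,\mathbb{1}
\underbrace{-}_{\text{relative sign}}
\begin{pmatrix} 0 & \mathbf{B}\times\mathbf{E} \\ \mathbf{B}\times\mathbf{E} & E_iE_j + B_iB_j \end{pmatrix}
\]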

You recognize the energy density and Poynting vector. We shall discuss the
other terms below.
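
The identification of the energy density and Poynting vector inside Eq. 9.3 can be verified numerically. The sketch below is an illustration, not part of the notes: it assumes the signature (−,+,+,+), the layout of F from Eq. 7.7, and the prefactor 1/4π in Eq. 9.3; the function names are hypothetical.

```python
import math

R4 = range(4)
# Assumed signature (-,+,+,+); eta^{mu nu} and eta_{mu nu} share these entries.
ETA = [[(-1 if m == 0 else 1) if m == n else 0 for n in R4] for m in R4]


def field_tensor(E, B):
    """Covariant F_{mu nu} laid out as in Eq. 7.7."""
    (Ex, Ey, Ez), (Bx, By, Bz) = E, B
    return [[0, -Ex, -Ey, -Ez],
            [Ex, 0, Bz, -By],
            [Ey, -Bz, 0, Bx],
            [Ez, By, -Bx, 0]]


def cross(a, b):
    """Ordinary three dimensional cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def stress_tensor(E, B):
    """T^{mu nu} = (F^{mu a} F^{nu}_a - eta^{mu nu} F.F / 4) / (4 pi), Eq. 9.3."""
    F_low = field_tensor(E, B)
    F_up = [[sum(ETA[m][a] * ETA[n][b] * F_low[a][b] for a in R4 for b in R4)
             for n in R4] for m in R4]
    # Mixed components F^{nu}{}_{alpha} = F^{nu beta} eta_{beta alpha}.
    F_mixed = [[sum(F_up[n][b] * ETA[b][a] for b in R4) for a in R4] for n in R4]
    FF = sum(F_up[a][b] * F_low[a][b] for a in R4 for b in R4)
    return [[(sum(F_up[m][a] * F_mixed[n][a] for a in R4)
              - 0.25 * ETA[m][n] * FF) / (4 * math.pi)
             for n in R4] for m in R4]


E, B = (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)
T = stress_tensor(E, B)
energy = (sum(e * e for e in E) + sum(b * b for b in B)) / (8 * math.pi)
S = [c / (4 * math.pi) for c in cross(E, B)]
print(math.isclose(T[0][0], energy))
print(all(math.isclose(T[0][j + 1], S[j]) for j in range(3)))
```

Under these conventions T^{00} reproduces the energy density and T^{0j} reproduces E × B / 4π of Eq. 9.1; the same function can be used to inspect the spatial block E_i E_j + B_i B_j.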

9.1.1 The stress-energy and conservation laws


Antennas are pumps of electromagnetic energy and momentum: antennas convert
the motion of charges into electromagnetic field, and vice
versa. The energy and momentum of the field can be converted to the energy
and momentum of the particle. For a charged particle we found for the 4-force
vector, Eq. 4.17,
\[
f_\nu = \frac{e}{c} F_{\nu\mu} u^\mu, \qquad p_\mu = m u_\mu = m\gamma(-c, \mathbf{v}) \tag{9.5}
\]
By Newton’s equation
ṗν = fν (9.6)
so the 4-force measures the rate of gain of the 4-momentum of the particles.
The rate of change of the energy is cf0 .

The loss of ν-momentum from the field is measured by its divergence

∂µ T µν = ∂µ T νµ (9.7)

since T is symmetric. The loss of momentum of the field is the gain of
momentum of the particles. In the case of a continuous charge distribution, we must
have
\[
\partial_\mu T^{\mu\nu} = \frac{1}{c}\, j_\mu F^{\mu\nu} \tag{9.8}
\]
T has dimension of energy density (in space) and ∂T has dimension of energy
density (in space-time). This gives the rhs of Eq. 9.8 the interpretation of source
of energy per unit of space-time volume. The identity follows from Maxwell’s
equations. Since this is a subtle computation, let us first observe that, since
T is bi-linear in F and Maxwell’s equations are linear with j as the source term, the
rhs has the structure one expects.
As preparation let us first compute the divergence of the first term in T:
\begin{align*}
\partial_\mu (F^{\mu\alpha} F^{\nu}{}_{\alpha}) &= (\partial_\mu F^{\mu\alpha}) F^{\nu}{}_{\alpha} + F^{\mu\alpha}\, \partial_\mu F^{\nu}{}_{\alpha} \\
&= -\frac{4\pi}{c}\, j^\alpha F^{\nu}{}_{\alpha} + F^{\mu\alpha}\, \partial_\mu F^{\nu}{}_{\alpha} \\
&= \frac{4\pi}{c}\, j_\mu F^{\mu\nu} + F_{\mu\alpha}\, \partial^\mu F^{\nu\alpha} \\
&= \frac{4\pi}{c}\, j_\mu F^{\mu\nu} + F_{\alpha\beta}\, \partial^\alpha F^{\nu\beta}
\end{align*}
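We have used the inhomogeneous Maxwell Eq. 6.8 in the second line, moved
indices up and down in the third and renamed µ ↦ α and α ↦ β in the last
line. For the second term we have
\begin{align*}
\partial_\mu (\eta^{\mu\nu} F\cdot F) &= 2 F_{\alpha\beta}\, \partial^\nu F^{\alpha\beta} \\
&= -2 F_{\alpha\beta} \left( \partial^\alpha F^{\beta\nu} + \partial^\beta F^{\nu\alpha} \right) \\
&= -2 F_{\alpha\beta}\, \partial^\alpha F^{\beta\nu} - 2 F_{\beta\alpha}\, \partial^\alpha F^{\nu\beta} \\
&= 4 F_{\alpha\beta}\, \partial^\alpha F^{\nu\beta}
\end{align*}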

where I have used the homogeneous Maxwell equation in the second line in the
form of Exercise 3.15. In the third line I replaced α ↔ β. Putting these in the
equation for T µν we get Eq. 9.8.

Exercise 9.1 (Plane waves). Show that the stress tensor for plane electromag-
netic waves
Aµ = aµ eik·x , k µ Aµ = 0, kµ k µ = 0
is
4πT µν = a · a k µ k ν

Exercise 9.2. Compute ∂µ (E · B)



9.2 Conservation laws


In the absence of currents j, the energy-momentum of the field is conserved.
This follows from
∂µ T µν = 0 (9.9)
To see this consider a space-time box as in Fig. 9.2. Suppose the spatial box


Figure 9.2: The field is represented by the green ellipse and the space-time box
by the red rectangle

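is large enough to embrace all the fields at any given time. Then
\[
0 = \int d\Omega\, \partial_\mu T^{\mu\nu} = \int dS_\mu\, T^{\mu\nu} = \int dS_0\, T^{0\nu} \Big|_{t_1}^{t_2} = \int dV\, T^{0\nu} \Big|_{t_1}^{t_2}
\]
It follows that the 4-vector
\[
\int dV\, T^{0\nu}
\]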

is conserved in time. We identify $T^{00}$ with the energy density and $T^{0j}/c$ with
the momentum density:
\[
T^{00} = \frac{\mathbf{E}^2 + \mathbf{B}^2}{8\pi}, \qquad T^{0j} = \frac{(\mathbf{B}\times\mathbf{E})_j}{4\pi} \tag{9.10}
\]
The Poynting vector is the momentum density up to factor c.
It may be worthwhile to note that even though T is symmetric the physical
interpretation of T µν is different from the interpretation of T νµ . For exam-
ple T 0j is interpreted as the momentum density (up to factor c) while T j0 is
interpreted as the energy flux.

Figure 9.3: You need to apply an external force to hold the capacitor plates
from collapsing on each other.

9.3 Stress tensor


The spatial part of the stress-tensor is
\[
8\pi T = \begin{pmatrix}
-E_x^2 + E_y^2 + E_z^2 & -2E_xE_y & -2E_xE_z \\
-2E_yE_x & E_x^2 - E_y^2 + E_z^2 & -2E_yE_z \\
-2E_zE_x & -2E_zE_y & E_x^2 + E_y^2 - E_z^2
\end{pmatrix} \tag{9.11}
\]

The term T k3 gives the density of the k-th momentum in the space-time volume
element
dP k = T k3 dx1 dx2 dt (9.12)

Since the force is the rate of change of momentum, the k-th component of the force $dF^k$
acting on the surface element $dx^1 dx^2$ is

dF k = T k3 dx1 dx2

This gives T jk the same meaning as the stress tensor in the theory of elasticity.

9.3.1 Case study: Capacitor plates


The two sides of Eq. 9.8 give us two ways to compute the forces in an
electromagnetic field. Let us illustrate this with the force of attraction between the
two capacitor plates in the figure.
Using the rhs of Eq. 9.8: The rhs gives the energy density per space time
volume. Consider the volume associated with the green box. The 4-current has
only the 0-component, given by the surface charge density, i.e.

j0 = cσδ(z − z0 ), 4πσ = −Ez (9.13)

and we took the top plate to be at $z_0$. The field is discontinuous at $z_0$:
\[
F^{0z} = \begin{cases} 0 & z > z_0 \\ E_z & z < z_0 \end{cases} \tag{9.14}
\]

The product of a delta function with a discontinuous function is mathematically
problematic, but the natural physical choice is to take the mean:
\[
\frac{1}{c} \int dz\, j_0 F^{03} = \tfrac12\, \sigma E_z = -\frac{E_z^2}{8\pi} \tag{9.15}
\]
The force per unit area pulls the upper plate down, as indicated by the minus sign,
\[
-\frac{E_z^2}{8\pi} \tag{9.16}
\]

Using the lhs of Eq. 9.8: The stress $T^{zz}$ is
\[
T^{zz} = \begin{cases} -\dfrac{E_z^2}{8\pi} & \text{bottom plate} \\[4pt] 0 & \text{top plate} \end{cases} \tag{9.17}
\]

The bottom of the green box has space-time volume

dS3 = −dx1 dx2 cdt (9.18)

The rate of flow of z-momentum out of the green box per unit area is
\[
\frac{E_z^2}{8\pi} \tag{9.19}
\]
Since the box is losing z-momentum, the force in the z-direction is negative.
As the green box can be shrunk to hug the top plate, the forces on the plate
computed in the two ways agree.

9.4 Field lines as rubber bands


Let us look at
\[
8\pi T^{xx} = E_y^2 + E_z^2 - E_x^2 + B_y^2 + B_z^2 - B_x^2
\]
The parallel and perpendicular components come with opposite signs: The stress
can be positive or negative, but the sign has nothing to do with the signs of E.
When the stress is negative the electromagnetic field provides negative pressure.

Exercise 9.3. Compute the components of T for Example 6.2. Show that the
current carrying wire pumps energy into the electromagnetic field at a constant
rate. Where does the energy come from and where is it dumped?

Figure 9.4: You need to apply a force to hold two oppositely charged capacitor
plates apart (red arrows). If the arrow is in the z-direction, T zz < 0. You get
this sign if you think of the electric field lines as rubber band: As if the pressure
is negative.

Figure 9.5: The magnetic field lines of a solenoid run parallel to the solenoid
axis. As a consequence the stress in the radial direction T ρρ > 0 inside the
solenoid. You get the right sign if you replaced the field lines by rubber bands:
Stretched rubber bands along the z-axis, that fan out in the radial direction,
will lead to a positive pressure in the radial direction and negative pressure in
the axial direction.

9.5 The stress tensor as variation of the metric

When Maxwell constructed his theory the queen of science was, of course,
mechanics. In particular, he understood elasticity and fluid mechanics well. In
elasticity theory the concepts of stress and strain are important, and it was
natural for Maxwell to ask what their analogs in electrodynamics are. One can
think of a strain as a deformation of the metric. For example, the strain shown
in the figure$^1$ can be represented by a deformation of the Euclidean metric
\[
g = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \;\to\; g = \begin{pmatrix} 1+\varepsilon & 0 \\ 0 & 1-\varepsilon \end{pmatrix}
\]

1 The vector field is divergence-less and curl-free.



Figure 9.6: In the theory of elasticity a strain is described by a vector field.


The figure shows the vector field associated with (uniform) contraction of y and
dilation of x: Namely (x, −y). The strain causes stress in the material. Energy
is stored in it like in a compressed spring.

9.5.1 Variation of the metric in mechanics


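The Lagrangian of a free classical particle is
\[
L = \frac{m}{2}\, g_{ij}\, \dot q^i \dot q^j, \qquad p_j = \frac{\partial L}{\partial \dot q^j} = m\, g_{jk}\, \dot q^k = m \dot q_j
\]
The variation of the metric gives
\[
\frac{\partial L}{\partial g_{ij}} = \tfrac12\, m\, \dot q^i \dot q^j = \tfrac12\, p^i \dot q^j
\]
The symmetric tensor describes the i-momentum current in the j direction.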

9.5.2 Variation of the metric in electrodynamics


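The action depends on the metric in two places. First, in the volume element
\[
d\Omega = \sqrt{|g|}\, dx^0 dx^1 dx^2 dx^3
\]
and also in the scalar product
\[
F\cdot F = F^{\alpha\beta} g_{\beta\gamma} F^{\gamma\delta} g_{\delta\alpha} \tag{9.20}
\]
Hence
\begin{align*}
\delta\big(\sqrt{|g|}\, F\cdot F\big) &= \delta(\sqrt{|g|})\, F\cdot F + \sqrt{|g|}\, F^{\alpha\beta} F^{\gamma\delta}\, \delta(g_{\beta\gamma} g_{\delta\alpha}) \\
&= \delta(\sqrt{|g|})\, F\cdot F + 2\sqrt{|g|}\, F^{\alpha\beta} F^{\gamma\delta} g_{\delta\alpha}\, \delta g_{\beta\gamma} \\
&= \delta(\sqrt{|g|})\, F\cdot F + 2\sqrt{|g|}\, F^{\alpha\beta} F^{\gamma}{}_{\alpha}\, \delta g_{\beta\gamma}
\end{align*}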

9.5.3 Matrix calculus


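To compute the variation of det g we need tools from matrix calculus. Being
symmetric, g can be diagonalized
\[
g = \sum_\mu P_\mu \gamma_\mu, \qquad P_\mu = |\mu\rangle\langle\mu|
\]
where $\gamma_\mu$ are its (real) eigenvalues and $P_\mu$ are orthogonal projections. By
definition
\[
\det g = \prod_\mu \gamma_\mu \;\Longrightarrow\; \log|g| = \sum_\mu \log|\gamma_\mu|
\]
and so
\[
\delta \log \sqrt{|\det g|} = \tfrac12\, \delta \log(|g|) = \tfrac12 \sum_\mu \frac{\delta\gamma_\mu}{\gamma_\mu}
\]
We want to express the right hand side in terms of g and its variation δg. To
do that observe that
\[
g^{-1} = \sum_\mu \frac{P_\mu}{\gamma_\mu}
\]
and so
\[
g^{-1}\delta g = \sum_{\mu,\nu} \frac{P_\mu (P_\nu\, \delta\gamma_\nu + \gamma_\nu\, \delta P_\nu)}{\gamma_\mu}
\]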

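Exercise 9.4 (Projections). Show that if $P_\mu$ are orthogonal projections,
$P_\mu P_\nu = \delta_{\mu\nu} P_\nu$, then
\[
\mathrm{Tr}(P_\mu\, \delta P_\nu) = 0
\]
We have thus shown that
\[
\delta \log \sqrt{-\det g} = \tfrac12\, \mathrm{Tr}(g^{-1}\delta g) = \tfrac12\, g^{\gamma\beta}\, \delta g_{\beta\gamma}
\]
and so finally
\[
\delta\sqrt{|g|} = \sqrt{|g|}\; \delta \log\sqrt{|g|} = \tfrac12 \sqrt{|g|}\, g^{\gamma\beta}\, \delta g_{\beta\gamma} \tag{9.21}
\]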

9.5.4 The stress tensor


Collecting terms we get
\[
\frac{\partial\big(\sqrt{|g|}\, F\cdot F\big)}{\partial g_{\beta\gamma}} = \sqrt{|g|} \left( F^{\beta\alpha} F^{\gamma}{}_{\alpha} - \tfrac14\, g^{\beta\gamma} F^{\mu\nu} F_{\mu\nu} \right) \tag{9.22}
\]
We now shift g back into the volume element and recover the energy-momentum
tensor.

9.6 Nöther: Symmetries


Conservation laws express symmetries, a relation due to Nöther. If you
compute the action of a given field configuration in a box, you get the same number
if you translate (and rotate) the coordinates. This is a trivial statement,
following from the homogeneity of Minkowski space-time. You may expect to get
no interesting identities from it. The genius of Nöther was to realize that
if the field is not an arbitrary field configuration, but one that solves the Euler-
Lagrange equations, then the statement reduces to a statement about the fields
on the boundaries of the box. This is what you expect from a conservation law:
What comes in through one boundary must leave through another.
Nöther introduced a technique which is amusing: She split the real operation
of coordinate shift into two virtual operations: One that shifts the box, but not
the fields, and one that shifts the fields, but not the box. Think of moving your
bag in two steps: First you move the content of the bag and second you move
the empty bag. This relates to non-trivial quantities. For fields that satisfy the
Euler-Lagrange equations, the variation lives on the boundary.


Figure 9.7: The action remains the same when the fields and the integration
box are both shifted in space-time. For a small shift the change in action can be
split into two virtual shifts: A shift of the field with the box held fixed, shown
in the middle figure, and a shift of the box with the field held fixed, shown
on the right.

9.6.1 Shifting the field


Consider the change in action due to a shift of the field without a shift of the
box. The variation of the action $S_F$ due to an arbitrary variation $\delta A_\mu$ of the fields
in a fixed box has been computed in Eq. 6.5 and was found to be
\[
4\pi c\, (\delta S_F) = -\underbrace{\int d\Omega\; \partial_\mu (F^{\mu\nu}\, \delta A_\nu)}_{\text{bdry term}} + \underbrace{\int d\Omega\; (\partial_\mu F^{\mu\nu})\, \delta A_\nu}_{\text{Euler-Lagrange}} \tag{9.23}
\]
For fields that satisfy the Euler-Lagrange equation, only the first term matters:
the variation is a boundary term
\[
4\pi c\, (\delta S_F) = -\int d\Omega\; \partial_\mu (F^{\mu\nu}\, \delta A_\nu) \tag{9.24}
\]

A uniform space-time shift by δξ α leads to variation in the fields


δAµ (x) = −(∂α Aµ )δξ α (9.25)
Exercise 9.5 (Signs-Sigh). Explain the minus sign.
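Inserting the variation into Eq. (9.24) and using the Maxwell equation$^2$ $\partial_\mu F^{\mu\nu} = 0$ one finds
for the integrand
\begin{align*}
-\partial_\mu (F^{\mu\nu}\, \delta A_\nu) &= \partial_\mu (F^{\mu\nu}\, \partial_\alpha A_\nu)\, \delta\xi^\alpha \\
&= \partial_\mu \big( F^{\mu\nu} F_{\alpha\nu} \big)\, \delta\xi^\alpha + \partial_\mu \big( F^{\mu\nu}\, \partial_\nu A_\alpha \big)\, \delta\xi^\alpha \\
&= \partial_\mu \big( F^{\mu\nu} F_{\alpha\nu} \big)\, \delta\xi^\alpha + F^{\mu\nu}\, \partial_\mu \partial_\nu A_\alpha\, \delta\xi^\alpha \\
&= \partial_\mu \big( F^{\mu\nu} F_{\alpha\nu} \big)\, \delta\xi^\alpha
\end{align*}
The last step uses the antisymmetry of $F^{\mu\nu}$ against the symmetry of $\partial_\mu\partial_\nu$.
We see that the change in action due to shifting the field is:
\[
4\pi c\, (\delta S_F) = \int d\Omega\; \partial_\mu \big( F^{\mu\nu} F_{\alpha\nu} \big)\, \delta\xi^\alpha \tag{9.26}
\]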

9.6.2 Shifting the box


As a warmup consider the variation of a one dimensional integral upon shifting
the boundary points by δξ:
\[
\delta \left( \int_a^b f(x)\, dx \right) = \int_{a+\delta\xi}^{b+\delta\xi} f(x)\, dx - \int_a^b f(x)\, dx = \delta\xi\, \big( f(b) - f(a) \big)
\]

The case at hand is the multidimensional version of this.


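Shifting the box without shifting the fields changes the action by boundary
terms of the form
\begin{align*}
4\pi c\, (\delta S_F) &= -\tfrac14 \int dS_\alpha\, (F\cdot F)\, \delta\xi^\alpha \\
&= -\tfrac14 \int d\Omega\; \partial_\alpha (F\cdot F)\, \delta\xi^\alpha \tag{9.27}
\end{align*}
where we used the fundamental theorem of calculus in the second line.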


2 Note that ξ α is a constant, not a function.

9.6.3 Joint box and field shift


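For the joint shift we get by combining Eq. 9.27 and 9.26
\begin{align*}
0 &= \int d\Omega \left( \partial_\mu \big( F^{\mu\nu} F_{\alpha\nu} \big) - \tfrac14\, \partial_\alpha (F\cdot F) \right) \delta\xi^\alpha \\
&= \int d\Omega\; \partial_\mu \left( F^{\mu\nu} F_{\alpha\nu} - \tfrac14\, g_\alpha{}^{\mu}\, (F\cdot F) \right) \delta\xi^\alpha \\
&= 4\pi \int d\Omega\; \big( \partial_\mu T^{\mu\alpha} \big)\, \delta\xi_\alpha
\end{align*}
Since this is supposed to hold for any (infinitesimal) box and any shift, we get
the conservation law (in the absence of source currents).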

9.6.4 Symmetry and traceless


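The tensor is symmetric and traceless. Symmetry is obvious from the definition,
as η is symmetric. Tracelessness follows from the fact that $\eta^\alpha{}_\alpha = 4$ and so
\[
T^\alpha{}_\alpha = \frac{1}{4\pi} \left( F^{\alpha\mu} F_{\alpha\mu} - \tfrac14\, \eta^\alpha{}_\alpha\, F^{\mu\nu} F_{\mu\nu} \right) = 0
\]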
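The algebraic properties of the stress tensor can be checked numerically. The sketch below builds $T^{\mu\nu}$ from Eq. 9.3 for given E and B and tests the energy density, the symmetry, the vanishing trace and the zz entry of Eq. 9.11. It assumes the signature η = diag(−1, 1, 1, 1) and the (hypothetical, convention-dependent) assignment $F^{0i} = E_i$, $F^{ij} = \varepsilon_{ijk} B_k$; the quantities tested are quadratic in the fields and do not depend on the overall sign chosen for E. The function names are illustrative only.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the metric (assumed signature -+++)

def field_tensor(E, B):
    """Contravariant F^{mu nu}, assuming F^{0i} = E_i and F^{ij} = eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    F[1][2], F[2][1] = B[2], -B[2]
    F[2][3], F[3][2] = B[0], -B[0]
    F[3][1], F[1][3] = B[1], -B[1]
    return F

def stress_energy(E, B):
    """T^{mu nu} = (1/4pi) (F^{mu a} F^{nu}_a - 1/4 eta^{mu nu} F.F)."""
    F = field_tensor(E, B)
    # F.F = F^{ab} F_{ab}; indices are lowered with the diagonal metric
    FF = sum(F[a][b] * F[a][b] * ETA[a] * ETA[b]
             for a in range(4) for b in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            first = sum(F[m][a] * F[n][a] * ETA[a] for a in range(4))
            eta_mn = ETA[m] if m == n else 0.0
            T[m][n] = (first - 0.25 * eta_mn * FF) / (4.0 * math.pi)
    return T

E, B = (1.0, 2.0, 3.0), (-0.5, 0.4, 2.0)
T = stress_energy(E, B)
energy_density = (sum(c * c for c in E) + sum(c * c for c in B)) / (8.0 * math.pi)
trace = sum(ETA[m] * T[m][m] for m in range(4))
print(T[0][0], energy_density, trace)
```

For a pure electric field the same function gives $8\pi T^{zz} = E_x^2 + E_y^2 - E_z^2$, in agreement with Eq. 9.11.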

9.7 Applications
9.7.1 Radiation pressure
The luminosity of the sun is $L = 3.6 \times 10^{26}$ [W], giving a stream of $10^{45}$
photons/sec. The radial component of the Maxwell energy-momentum tensor at a
distance R from the sun is then
\[
T^{0\hat r} = \frac{L}{4\pi R^2 c} \tag{9.28}
\]
Consider a macroscopic (black) particle of radius r that perfectly absorbs
radiation. The force on the particle at a distance R from the sun is then
\[
F_{\text{radiation}} = \frac{r^2}{4 R^2 c}\, L \tag{9.29}
\]
The gravitational force on a particle with density ρ is
\[
F_{\text{gravity}} = \frac{4\pi \rho r^3 M G}{3 R^2} \tag{9.30}
\]
where $M = 2 \times 10^{33}$ [gram] and Newton’s constant $G = 6.7 \times 10^{-8}$ [cgs]. The
ratio of the two is then
\[
\frac{F_{\text{radiation}}}{F_{\text{gravity}}} = \frac{3L}{16\pi\, r \rho\, c\, M G} \approx \frac{0.06\ [\text{mgr}/\text{cm}^2]}{\rho\, r} \tag{9.31}
\]

For water ρ = 1 [gm/cm³]. For earth, $r = 6 \times 10^{8}$ [cm], the ratio is minuscule:
$10^{-13}$. However, for very small grains, of radius less than $6 \times 10^{-5}$ [cm], radiation
dominates.
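The numbers above can be reproduced with a few lines of arithmetic. The sketch below evaluates the ratio of Eq. 9.31 in cgs units, using the values of L, M and G quoted in the text; the speed of light $c = 3 \times 10^{10}$ cm/s and the conversion 1 W = $10^7$ erg/s are assumptions added here, and the function name is illustrative.

```python
import math

L_SUN = 3.6e26 * 1e7   # erg/s, from L = 3.6e26 W as quoted in the text
M_SUN = 2.0e33         # gram
G_NEWTON = 6.7e-8      # cgs
C_LIGHT = 3.0e10       # cm/s (assumed value, not quoted in the text)

def force_ratio(rho, r):
    """F_radiation / F_gravity of Eq. 9.31 for a black grain of density rho [g/cm^3], radius r [cm]."""
    return 3.0 * L_SUN / (16.0 * math.pi * r * rho * C_LIGHT * M_SUN * G_NEWTON)

coefficient = force_ratio(1.0, 1.0)      # g/cm^2, the 0.06 mgr/cm^2 of Eq. 9.31
earth_like = force_ratio(1.0, 6.0e8)     # rho = 1 and the radius of the earth
critical_radius = coefficient / 1.0      # radius [cm] at which the ratio is 1 for rho = 1

print(coefficient * 1e3, "mg/cm^2")
print(earth_like)
print(critical_radius, "cm")
```

The coefficient comes out slightly above 0.05 mg/cm², which rounds to the 0.06 quoted above; the critical radius for water density is correspondingly a little below $6 \times 10^{-5}$ cm.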
Radiation pressure cleans the solar neighborhood from fine dust. This could
be a mechanism of transporting viruses from our solar system to distant parts
of the universe3 .

Exercise 9.6 (Comet tails). Can you figure out the shape of a comet tail?
Suppose the tail is associated with a planet in circular non-relativistic orbit.
Hint: Figure out the tail in the rotation frame.

9.7.2 Solar sails


The computation above can be applied to solar sails. A sail of area A and thickness
d can be used to sail away from the sun provided
\[
F_{\text{radiation}} = \frac{A}{4\pi R^2 c}\, L \;>\; F_{\text{gravity}} = \frac{\rho A d\, M G}{R^2} \tag{9.32}
\]
Cancelling the common factors we get
\[
\frac{1}{4\pi c}\, L > \rho\, d\, M G \tag{9.33}
\]
Up to factors of order unity we get the same estimate as above. You need very
thin sails to build solar sails.
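To get a feeling for "very thin", Eq. 9.33 can be turned into a bound on the areal density ρd of the sail. The sketch below uses the cgs values of L, M and G from the text; $c = 3 \times 10^{10}$ cm/s is an assumed value and the variable names are illustrative.

```python
import math

L_SUN = 3.6e33     # erg/s (3.6e26 W, as quoted in the text)
M_SUN = 2.0e33     # gram
G_NEWTON = 6.7e-8  # cgs
C_LIGHT = 3.0e10   # cm/s (assumed value)

# Eq. 9.33: L / (4 pi c) > rho d M G, i.e. rho d < L / (4 pi c M G)
max_areal_density = L_SUN / (4.0 * math.pi * C_LIGHT * M_SUN * G_NEWTON)  # g/cm^2

# For a (hypothetical) sail material with the density of water, rho = 1 g/cm^3:
max_thickness_cm = max_areal_density / 1.0

print(max_areal_density, "g/cm^2")
print(max_thickness_cm * 1e4, "micron")
```

Under these assumptions a perfectly absorbing sail with the density of water has to be thinner than about a micron.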

Figure 9.8: A planet encircling a star and the tail of dust it sprays (tail)

9.7.3 Halbach array


The Halbach array is shown in Fig. 9.9. Make a qualitative plot of its field
lines. You will find a proper plot here. The field is large and essentially
parallel to the array on one side of the array, and small on the other. You may
then wonder if the array is a Baron von Munchhausen: Properly oriented, it
will float in a gravitational field. Discuss the Maxwell stress tensor and resolve
this apparent paradox.
3 The assumption that the particle is black is not reasonable when the radiation penetrates

a distance comparable to the size.



Figure 9.9: Halbach array gives a large magnetic field above the array and small
one below it.
Chapter 10

Electrostatics and
magnetostatics

Now, that we have Maxwell equations, it remains to look at solutions for inter-
esting physical problems. We start with time independent problems.

10.1 Static electric fields:


The electric field E is determined by Gauss and Faraday’s law
\[
\nabla\cdot\mathbf{E} = 4\pi\rho, \qquad \nabla\times\mathbf{E} + \underbrace{\dot{\mathbf{B}}}_{=0} = 0
\]

and dot is a derivative with respect to x0 = ct. In the static case Ḃ = 0 and
the equations reduce to

∇ · E = 4πρ, ∇×E=0

The source ρ(x) is assumed to be known.


These two equations1 determine E completely. Faraday’s law implies that

E = −∇φ

Substitution in Gauss law gives Poisson’s equation

∆φ = −4πρ (10.1)

which is a partial differential equation for the potential (if the dimension n ≥ 2).
Remark 10.1 (Time dependent ρ). From a formal mathematical point of view,
one may also consider Poisson’s equation with ρ that is time dependent (e.g. a
moving point charge). However, time dependent ρ would normally entail Ḃ ≠ 0.
1 And the assumption that space is Minkowski and the fields vanish at infinity


10.2 Harmonic functions


Let us start with the simple, but important, case where ρ = 0. Functions that
satisfy
∆φ = 0 (10.2)
are called Harmonic. They are dear to both physicists and Mathematicians.
A fundamental fact about Harmonic functions is:
Theorem 10.2 (Harmonic functions). If φ is Harmonic, then φ(x) is the av-
erage value of φ on a sphere centered at x.

The theorem is evident in one dimension: A harmonic function is a linear


function and hence the average of equidistant neighbors. We shall postpone
the proof in the general case after we assemble some more tools. Let us first
consider an important consequence:

Corollary 10.3 (Earnshaw). Harmonic functions in a domain Ω assume (local)
maxima and minima on the boundary ∂Ω. As a consequence, charges can not
be stably trapped by electrostatic fields.
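The mean value property of Theorem 10.2 is easy to test numerically in two dimensions. The sketch below takes the harmonic function Re (x + iy)³ as an example (the choice of function, centre and radius is arbitrary) and compares its average over a circle with its value at the centre.

```python
import math

def phi(x, y):
    # Harmonic in the plane: the real part of (x + i y)^3
    return x**3 - 3.0 * x * y**2

def circle_average(f, cx, cy, radius, n=2000):
    """Average of f over n equally spaced points on the circle around (cx, cy)."""
    total = 0.0
    for k in range(n):
        t = 2.0 * math.pi * k / n
        total += f(cx + radius * math.cos(t), cy + radius * math.sin(t))
    return total / n

center = (0.7, -0.4)
average = circle_average(phi, center[0], center[1], radius=1.3)
print(average, phi(*center))
```

Replacing phi by a non-harmonic function such as x² + y² makes the two numbers differ, by radius² in that case.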
Exercise 10.4. Suppose that locally
\[
E_j(x) = e^{(0)}_j + e^{(2)}_{jk}\, x_k + O(x^2)
\]
Show that ∇ · E = 0 implies $\mathrm{Tr}\, e^{(2)} = 0$. In particular the matrix $e^{(2)}$ can not
have all eigenvalues of one sign.

10.2.1 Beating Earnshaw: Magnetic and electric traps


Levitron is a spinning magnetic dipole that hovers in a gravitational field above
a stationary magnet. You may wonder if this toy violates Earnshaw. It does
not, because the top is spinning and Earnshaw deals with a stationary case.
But this does not yet explain how Levitron works. Let’s look at the problem
in more detail.
The energy of a stationary magnetic dipole d whose mass is M in electro-
static fields E and gravitational field g is

E = −d · E + M g · x

The total force acting on the dipole is, using E = −∇φ,
\[
\mathbf{F} = -\nabla\mathcal{E} = -\nabla(\mathbf{d}\cdot\nabla\phi) - M\mathbf{g} \tag{10.3}
\]

The dipole is in equilibrium if F = 0. Suppose we found a point where F = 0.
We claim that such a point of equilibrium is unstable if d is a fixed vector and
φ is harmonic, which it must be in the electrostatic case. Indeed:
\[
\Delta\mathcal{E} = -\nabla\cdot\mathbf{F} = \Delta(\mathbf{d}\cdot\nabla)\phi = (\mathbf{d}\cdot\nabla)\Delta\phi = 0 \tag{10.4}
\]

How can we turn this setting to one where the equilibrium is stable? Clearly,
this can only occur if we let d be a function of position. Suppose that, for some
reason, the dipole wants to orient itself opposite to the local field,

d(x) = −d Ê(x)

Eq. 10.4 is modified to
\[
\Delta\mathcal{E} = -\nabla\cdot\mathbf{F} = d\, \Delta|\mathbf{E}| \tag{10.5}
\]
|E| is, of course, not a harmonic function, even if φ is. For example, near a
point where |E| = 0, the function looks like a cone, with a minimum at the
vertex of the cone. This removes the in-principle bar from finding a stable
equilibrium. This, of course, does not yet guarantee stability and it turns out
that the analysis is involved, so we stop here.

10.3 Laplace equation in two dimensions


Two dimensions are special in that the machinery of holomorphic functions
allows one to compute things explicitly in many cases.
A point in the plane can be described by the complex number z = x + iy
and its conjugate z̄ = x − iy. Here we regard z and z̄ as independent variables. Similarly,
\[
2\partial = 2\partial_z = \partial_x - i\partial_y, \qquad 2\bar\partial = 2\partial_{\bar z} = \partial_x + i\partial_y \tag{10.6}
\]
This language also works for any real vector in 2-D, which can be represented
by a single complex number
\[
E = \mathbf{E}\cdot\hat{x} + i\, \mathbf{E}\cdot\hat{y}
\]
A (complex) function f(x, y) can be written as a function f(z, z̄). Analytic
functions are functions of z only, and
\[
\bar\partial f = 0 \tag{10.7}
\]
is the Cauchy-Riemann equation for analytic functions.


In two dimensions, the vector field ∇φ for real φ can be identified with a
single complex function. Clearly, it must be some linear combination of ∂φ and
∂̄φ. From Eq. 10.6, one finds that actually
\[
E = -2\bar\partial\phi \tag{10.8}
\]
Now, in two dimensions both ∇ · E and ∇ × E are (scalar valued) functions,
which must be linear combinations of ∂E and ∂̄E. A computation shows that
\[
2\partial E = (\partial_x - i\partial_y)(E_x + iE_y) = \nabla\cdot\mathbf{E} + i\, \nabla\times\mathbf{E}, \qquad \nabla\times\mathbf{E} = \partial_x E_y - \partial_y E_x \tag{10.9}
\]
Hence, for any real vector field, ∇ · E is the real part of 2∂E.

An easy computation gives for the Laplacian
\[
\Delta\phi = 4\, \partial_z \partial_{\bar z}\, \phi \tag{10.10}
\]
A harmonic function in two dimensions is therefore of the form
\[
\Delta\phi = 0 \;\longrightarrow\; \phi(z, \bar z) = f(z) + g(\bar z)
\]
The general real solution is therefore
\[
\phi(z, \bar z) = f(z) + \overline{f(z)}
\]

10.3.1 Harmonic functions and Riemann mapping


Harmonic functions in a domain D are determined by their values on the bound-
ary ∂D.
Let us first see how this works for the unit disc. We assume that the
boundary values are given by the Fourier series$^2$
\[
\phi(\theta) = \sum_{n=-\infty}^{\infty} a_n e^{in\theta}
\]
Write the potential inside the disc as a sum of holomorphic and anti-holomorphic
functions
\[
\phi = f + \bar g + a_0, \qquad f(z) = \sum_{n=1}^{\infty} f_n z^n, \qquad g(z) = \sum_{n=1}^{\infty} g_n z^n
\]
Evidently, for n ≥ 1
\[
f_n = a_n, \qquad \bar g_n = a_{-n}
\]
f and g are analytic in the unit disc since the corresponding series are absolutely
convergent.
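The construction for the unit disc translates directly into a small numerical recipe: sample the boundary values, compute the Fourier coefficients $a_n$, and sum $a_0 + \sum a_n z^n + \sum a_{-n} \bar z^n$ inside the disc. The sketch below is one way to do this under the assumption that the boundary data is a trigonometric polynomial of low degree, so that a truncated sum is exact; the function names and the truncation parameter n_max are illustrative.

```python
import cmath
import math

def fourier_coefficients(boundary, n_max, samples=512):
    """a_n = (1/2pi) * integral of boundary(theta) e^{-i n theta}, by uniform sampling."""
    coeffs = {}
    for n in range(-n_max, n_max + 1):
        total = 0j
        for k in range(samples):
            theta = 2.0 * math.pi * k / samples
            total += boundary(theta) * cmath.exp(-1j * n * theta)
        coeffs[n] = total / samples
    return coeffs

def harmonic_extension(coeffs, r, theta):
    """phi = a_0 + sum_{n>=1} a_n z^n + sum_{n>=1} a_{-n} zbar^n at z = r e^{i theta}."""
    z = r * cmath.exp(1j * theta)
    total = coeffs[0]
    for n, a in coeffs.items():
        if n > 0:
            total += a * z**n                  # holomorphic part f(z)
        elif n < 0:
            total += a * z.conjugate()**(-n)   # anti-holomorphic part, conjugate of g(z)
    return total

# Boundary data cos(2 theta); the harmonic extension is Re z^2 = r^2 cos(2 theta).
coeffs = fourier_coefficients(lambda t: math.cos(2.0 * t), n_max=4)
value = harmonic_extension(coeffs, 0.5, 0.0)
print(value.real)
```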
We can now use Riemann mapping theorem to generalize this result for
the unit disc to any open, simply connected domain D in the plane. Riemann
mapping theorem states that if D is an open, simply connected domain in R2
then there is a holomorphic (and invertible) function ζ = ϕ(z) that maps D to
the unit disc |ζ| < 1.

10.4 Harmonic polynomials and spherical harmonics


Consider homogeneous polynomials of degree n ≥ 0 in d variables. They form
a vector space $V_{n,d}$ spanned by the monomials
\[
x_1^{n_1} \cdots x_d^{n_d}, \qquad n = \sum_{j=1}^{d} n_j, \qquad n_j \ge 0
\]
2 We therefore assume that $\sum |a_n| < \infty$. Note that φ need not be a continuous function
on the boundary.

Figure 10.1: The n balls correspond to the degree of the polynomial. (d − 1)
red partitions define d boxes. The j-th box represents $x_j$. The number of balls
in the j-th box gives the power of $x_j$. The number of different monomials is
the same as the number of ways of putting the d − 1 red partitions among
the n balls. There are n + d − 1 place holders, and we need to decide if we put a
partition or a ball in each place-holder. This is the problem of selecting d − 1 objects
out of n + d − 1 objects, and so is the binomial coefficient of Eq. 10.11.

The dimension of this space is
\[
\dim V_{n,d} = \binom{n+d-1}{d-1} \tag{10.11}
\]
This is explained in the figure. The Harmonic polynomials are the kernel of the
Laplacian$^3$
\[
\Delta : V_{n,d} \to V_{n-2,d} \tag{10.12}
\]
The homogeneous Harmonic polynomials of degree n form a vector space $H_{n,d}$
and its dimension is evidently
\[
\dim H_{n,d} = \dim V_{n,d} - \dim V_{n-2,d} = \binom{n+d-1}{d-1} - \binom{n+d-3}{d-1}
\]
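The dimension formula is easy to tabulate. The sketch below evaluates it with the standard library binomial coefficient, treating $V_{n-2,d}$ as the zero space for n < 2 as in the footnote, and checks the familiar values 2 (for n ≥ 1 in d = 2) and 2n + 1 (in d = 3). The function names are illustrative.

```python
from math import comb

def dim_V(n, d):
    """Dimension of homogeneous polynomials of degree n in d variables (Eq. 10.11)."""
    if n < 0:
        return 0  # there are no polynomials of negative degree
    return comb(n + d - 1, d - 1)

def dim_H(n, d):
    """Dimension of homogeneous harmonic polynomials of degree n in d variables."""
    return dim_V(n, d) - dim_V(n - 2, d)

for n in range(5):
    print(n, dim_H(n, 2), dim_H(n, 3), dim_H(n, 4))
```

In d = 4 the same formula gives (n + 1)², as the table printed by the loop shows.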

The harmonic polynomial of degree n can be written as
\[
H_n(x_1, \dots, x_d) = r^n\, Y_n(\Omega) \tag{10.13}
\]
where Ω denotes a point on the unit sphere in d dimensions. Y is a spherical
Harmonic. As we shall now see, it is an eigenfunction of the “spherical
Laplacian”.
The Laplacian in spherical coordinates splits into radial and angular pieces
by
\[
\Delta = \frac{1}{r^{d-1}} \frac{\partial}{\partial r}\, r^{d-1} \frac{\partial}{\partial r} - \frac{1}{r^2}\, L^2, \qquad d \ge 2 \tag{10.14}
\]
The second term is the angular part of the Laplacian. $L^2$ may be thought of as
(minus) the Laplacian on the unit sphere, or alternatively as the kinetic energy
associated with the angular motion.

3 $V_{n-2,d} = 0$ for n = 0, 1 since $\binom{n+d-3}{d-1} = 0$ for n < 2.

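Since $H_n$ is homogeneous and harmonic
\begin{align*}
0 = \Delta H_n &= \frac{1}{r^{d-1}} \frac{\partial}{\partial r}\, r^{d-1} \frac{\partial}{\partial r} H_n - \frac{1}{r^2}\, L^2 H_n \\
&= Y_n\, \frac{1}{r^{d-1}} \frac{\partial}{\partial r}\, r^{d-1} \frac{\partial}{\partial r}\, r^n - r^{n-2}\, L^2 Y_n \\
&= r^{n-2} \left( n(n+d-2) - L^2 \right) Y_n \tag{10.15}
\end{align*}
It follows that the “spherical harmonics” Y are eigenfunctions of $L^2$ with
eigenvalues
\[
L^2 Y_n = n(n+d-2)\, Y_n, \qquad d \ge 2 \tag{10.16}
\]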
There is a single harmonic function on the sphere, namely, the constant function
corresponding to eigenvalue n = 0 of the “spherical Laplacian”. For n ≥ 1
the dimension of the eigenspace of L2 is larger than 1 and there is freedom
in choosing a basis which leads to different conventions in the definitions of
spherical harmonics.
Exercise 10.5. In two dimensions the space of Harmonic polynomials of degree
n is generated by
\[
z^n = (x+iy)^n, \qquad \bar z^n = (x-iy)^n
\]
The “spherical Laplacian” is
\[
L^2 = -\frac{\partial^2}{\partial\theta^2}
\]
and the “spherical harmonics” are
\[
e^{\pm in\theta}, \qquad \dim H_{n,2} = 2, \quad n \ge 1
\]

Exercise 10.6. In three dimensions
\[
\dim H_{n,3} = 2n+1
\]
and the spectrum of the “spherical Laplacian” is
\[
\mathrm{Spectrum}(L^2) = n(n+1)
\]
In three dimensions we can write the Laplacian as
\[
\Delta = \frac{d^2}{dz^2} + 4\partial\bar\partial \tag{10.17}
\]
with ∂ and ∂̄ acting in the x-y plane. It follows that $(x+iy)^n$ and $(x-iy)^n$ are Harmonic. In particular
\[
(x+iy)^n = r^n \left( \frac{x+iy}{r} \right)^n \;\Longrightarrow\; Y_{n,n}(\theta, \phi) = e^{in\phi} \sin^n\theta \tag{10.18}
\]
up to normalization, and similarly for $(x-iy)^n$.

10.4.1 Harmonic functions and multipoles


Inversion is the map
\[
x \mapsto \frac{x}{r^2}, \qquad r^2 = x\cdot x \tag{10.19}
\]
Exercise 10.7. Show that under inversion a sphere in Rn is mapped to a sphere.

Here is an interesting fact I have learned from Barak Katzir

Theorem 10.8. Suppose φ(x) is a Harmonic function in $\mathbb{R}^d$. Then
\[
\psi(x) = r^{2-d}\, \phi\!\left( \frac{x}{r^2} \right), \qquad r^2 = x\cdot x \tag{10.20}
\]
is Harmonic.

Clearly, it is enough to prove this for a dense set of Harmonic functions,
and so we focus on proving instead that for a homogeneous harmonic function
$H_n(x)$ of degree n we have
\[
\Delta \left( r^{2-d}\, H_n\!\left( \frac{x}{r^2} \right) \right) = 0 \tag{10.21}
\]
This follows from
\[
r^{2-d}\, H_n\!\left( \frac{x}{r^2} \right) = r^{2-d-2n}\, H_n(x) = r^{2-d-n}\, Y_n \tag{10.22}
\]
But by the last line of Eq. 10.15, with n replaced by 2 − d − n,
\begin{align*}
\Delta \left( r^{2-d-n}\, Y_n \right) &= r^{-d-n} \big( (2-d-n)(2-d-n+d-2) - L^2 \big) Y_n \\
&= r^{-d-n} \big( n(n-2+d) - L^2 \big) Y_n \\
&= 0 \tag{10.23}
\end{align*}
and the last identity follows from Eq. 10.16.
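Theorem 10.8 can be probed numerically with a finite-difference Laplacian. The sketch below, in d = 3, takes the harmonic polynomial xy + z² − x² as an example (an arbitrary choice), forms ψ(x) = r^{2−d} φ(x/r²) and evaluates Δψ at a test point away from the origin. The step size h and the test point are assumptions of the sketch, and the result is zero only up to discretization and rounding error.

```python
def laplacian(f, p, h=1e-3):
    """Central second-difference approximation of the Laplacian of f at the point p."""
    total = 0.0
    for i in range(len(p)):
        up = list(p)
        dn = list(p)
        up[i] += h
        dn[i] -= h
        total += f(up) + f(dn) - 2.0 * f(p)
    return total / h**2

def phi(p):
    x, y, z = p
    return x * y + z**2 - x**2   # harmonic: the second derivatives sum to 0 + 2 - 2

def inverted(f, d):
    """psi(x) = r^(2-d) f(x / r^2), the transform of Eq. 10.20."""
    def psi(p):
        r2 = sum(c * c for c in p)
        q = [c / r2 for c in p]
        return r2 ** ((2 - d) / 2.0) * f(q)
    return psi

point = [0.8, -0.5, 1.1]
psi = inverted(phi, 3)
print(laplacian(phi, point), laplacian(psi, point))
```

As a sanity check of the difference scheme itself, the Laplacian of r² should come out as 2d = 6.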


These functions describe multipoles: Harmonic functions that decay
polynomially at infinity (see section 10.8.2). For example, in d = 3 the dipole potential
is
\[
H_1(x) = \mathbf{d}\cdot x \iff \frac{1}{r}\, H_1\!\left( \frac{x}{r^2} \right) = \frac{\mathbf{d}\cdot x}{|x|^3} \tag{10.24}
\]
Similarly, for Q a traceless symmetric 3 × 3 matrix, the quadrupole potential is
\[
H_2(x) = x\cdot Qx \iff \frac{1}{r}\, H_2\!\left( \frac{x}{r^2} \right) = \frac{x\cdot Qx}{|x|^5} \tag{10.25}
\]

10.5 Poisson’s equation


Poisson’s equation is
∆φ = −4πρ (10.26)

Clearly, if φ is a solution of the equation for some ρ, so is

φ + Harmonic (10.27)

So, Poisson’s equation needs to be supplemented by boundary conditions that
eliminate the freedom to add a harmonic function. It turns out that in
Euclidean space of dimension d ≥ 3 the
right boundary condition is that φ vanishes at infinity.
Since Poisson’s equation is linear, we can construct its solution by
constructing ρ from its point-like elements
\[
\rho(x) = \int dy\; \rho(y)\, \delta(x - y) \tag{10.28}
\]
We call the solution G of
\[
\Delta G(x) = \delta(x) \tag{10.29}
\]
that satisfies the boundary conditions the Green function of the Laplacian. If
you think of the linear operators G and ∆ as matrices and of Dirac δ as the
unit matrix, then it is natural to think of the Green function as
\[
G = \Delta^{-1}
\]
By translation invariance of the Laplacian
\[
\Delta G(x - y) = \delta(x - y)
\]
and the solution of Poisson’s equation is
\[
\phi(x) = -4\pi \int G(x - y)\, \rho(y)\, dy \tag{10.30}
\]

10.6 Green’s function


For a unit point charge in 3 dimensions the electric field, by symmetry, must be
radial. Gauss law gives
\[
\mathbf{E} = \frac{x}{|x|^3}, \qquad \phi(x) = \frac{1}{|x|} \tag{10.31}
\]
It follows that the Green’s function in d = 3 dimensions is
\[
\Delta_x G(x - y) = \delta^{(3)}(x - y), \qquad G(x) = -\frac{1}{4\pi |x|} \tag{10.32}
\]

10.6.1 Green function in arbitrary dimensions


The same method works in d dimensions. Let
\[
\mathbf{E} = \frac{1}{s_d} \frac{x}{|x|^d} = \frac{1}{s_d} \frac{\hat r}{r^{d-1}}, \qquad s_d = \frac{2\pi^{d/2}}{\Gamma\!\left( \frac{d}{2} \right)} \tag{10.33}
\]
$s_d$ is the area of the unit sphere in d dimensions. This clearly satisfies Gauss law
for a unit source
\[
\int_{|x|=R} \mathbf{E}\cdot dS_d = \int_{|x|\le R} \underbrace{\nabla\cdot\mathbf{E}}_{=\delta(x)}\, dV_d = 1 \tag{10.34}
\]
for any sphere of radius R. The source is a delta function. For d > 2 we have
\[
\nabla \frac{1}{r^{d-2}} = -\frac{d-2}{r^{d-1}}\, \hat r
\]
It follows that the Green function is
\[
G_d(r) = -\frac{1}{(d-2)\, s_d} \frac{1}{r^{d-2}}, \qquad G_2(r) = \frac{1}{2\pi} \log\frac{r}{a}, \qquad G_1(r) = \frac{r}{2} \tag{10.35}
\]
where a > 0 is an arbitrary (length) scale factor.
In 1 and 2 dimensions the Green function diverges at infinity and in d ≥ 2
it diverges at the origin.

10.6.2 Volumes of d balls and spheres


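Let
\[
V_d(r) = v_d\, r^d, \qquad S_d(r) = s_d\, r^{d-1}
\]
be the volume and area of the ball and sphere in d dimensions. Since
\[
V_d'(r) = S_d(r)
\]
we see that
\[
d\, v_d = s_d
\]
Let us find a recursion relation for the volumes. Writing $x_d$ for a point in $\mathbb{R}^d$ and r for the radius in the two additional coordinates, for d ≥ 1 we have
\begin{align*}
v_{d+2} &= \int_{|x_{d+2}| \le 1} dx_{d+2} \\
&= \int_{|x_d|^2 + r^2 \le 1} dx_d\, dx_{d+1}\, dx_{d+2} \\
&= 2\pi \int_{|x_d| \le 1} dx_d \int_0^{\sqrt{1 - |x_d|^2}} r\, dr \\
&= \pi \int_{|x_d| \le 1} dx_d\, \big( 1 - |x_d|^2 \big) \\
&= \pi v_d - \pi s_d \int_{r \le 1} r^{d+1}\, dr \\
&= \pi \left( v_d - \frac{s_d}{d+2} \right) \\
&= \pi \left( 1 - \frac{d}{d+2} \right) v_d \\
&= \frac{2\pi}{d+2}\, v_d
\end{align*}
Sanity check:
\[
v_3 = \frac{2\pi}{3}\, v_1 = \frac{4\pi}{3}
\]
Here is an amusing observation about spheres: Evidently
\[
v_{2d+1} = \frac{(2\pi)^d}{(2d+1)!!}\, v_1
\]
so $v_d \to 0$ super-exponentially.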
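The closed formula for $s_d$ in Eq. 10.33 and the recursion for $v_d$ can be cross-checked in a few lines. The sketch below uses only the gamma function from the standard library together with the relation $d\, v_d = s_d$; the function names are illustrative.

```python
import math

def sphere_area(d):
    """s_d = 2 pi^(d/2) / Gamma(d/2): area of the unit sphere in d dimensions."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def ball_volume(d):
    """v_d = s_d / d: volume of the unit ball in d dimensions."""
    return sphere_area(d) / d

for d in range(1, 11):
    # Compare the direct value of v_{d+2} with the recursion 2 pi v_d / (d + 2).
    print(d, ball_volume(d), ball_volume(d + 2), 2.0 * math.pi * ball_volume(d) / (d + 2))
```

The printed volumes peak around d = 5 and then decrease rapidly, which illustrates the super-exponential decay.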
Exercise 10.9. Compute $s_d$ using the Gaussian integral
\[
\int d^dx\; e^{-x^2} = s_d \int_0^\infty r^{d-1}\, dr\; e^{-r^2}
\]

10.7 Proof of the fundamental property of Harmonic functions


Let G(x) be the Green function of the Laplacian in d dimensions and φ
Harmonic:
\begin{align*}
\Delta \big( G(x)\phi(x) \big) &= G\, \Delta\phi + 2\nabla\phi\cdot\nabla G + \phi\, \Delta G \\
&= 2\nabla\phi\cdot\nabla G + \phi(x)\, \delta(x)
\end{align*}
Integrate this identity on a ball centered at the origin. The last term (on the right)
gives φ(0). The middle term can be organized so that one first integrates over
directions at fixed radius and last over the radial direction. Since G is a radial
function we pull it to the left and write
\[
\int_{|x| \le R} 2\nabla\phi\cdot\nabla G\; dV = \int_0^R s_d\, r^{d-1}\, dr\; G'(r) \underbrace{\int_{|x| = r} dS\cdot\nabla\phi}_{0 \text{ by Gauss}}
\]
(φ is Harmonic, so the flux of ∇φ through any closed surface vanishes and the integral
on the right vanishes for any r > 0.) It remains to integrate the term on the
left:
\begin{align*}
\int_{|x| \le R} \Delta(\phi G)\, dV &= \int_{|x| = R} dS\cdot\nabla \big( \phi(x) G(r) \big) \\
&= \int_{|x| = R} dS\cdot \Big( \underbrace{(\nabla\phi)\, G(R)}_{0 \text{ by Gauss}} + \phi(x)\, G'(R)\, \hat r \Big) \\
&= G'(R) \int_{|x| = R} dS\cdot\hat r\; \phi(x) \\
&= \frac{1}{s_d R^{d-1}} \int_{|x| = R} dS\cdot\hat r\; \phi(x)
\end{align*}
and Eq. 10.35 for G(R) has been used. This is precisely the average of φ over
the sphere of radius R.

10.8 Stationary magnetic fields


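The magnetic field is determined by Ampere’s law and “Gauss” law for the
magnetic field
\[
\nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{B} - \dot{\mathbf{E}} = \frac{4\pi}{c}\, \mathbf{J}
\]
In the case Ė = 0 this reduces to
\[
\nabla\cdot\mathbf{B} = 0, \qquad \underbrace{\nabla\times\mathbf{B} = \frac{4\pi}{c}\, \mathbf{J}}_{\text{Ampere}} \tag{10.36}
\]
Ampere’s law then implies ∇ · J = 0 (and then also ρ̇ = 0). To solve Ampere’s
differential equation we are free to use any gauge we please. In particular, in
the Coulomb gauge:
\[
\mathbf{B} = \nabla\times\mathbf{A}, \qquad \nabla\cdot\mathbf{A} = 0
\]
Using the identity
\[
\nabla\times(\nabla\times\mathbf{A}) = -\Delta\mathbf{A} + \nabla(\nabla\cdot\mathbf{A})
\]
we can write Ampere’s equation as Poisson’s equations for (the vector valued) A
\[
\Delta\mathbf{A} = -\frac{4\pi}{c}\, \mathbf{J}
\]
Remark 10.10 (Consistency). The equation is consistent with the gauge
condition ∇ · A = 0 since ∇ · J = 0.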

10.8.1 Biot-Savart law


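The vector valued version of Eq. 10.30 reduces solving Poisson’s equations to
integration. Namely,
\[
\mathbf{A}(x) = \frac{1}{c} \int dy\; \frac{\mathbf{J}(y)}{|x - y|} \tag{10.37}
\]
By taking the curl of the identity we obtain B as an explicit integral over the
current:
\begin{align*}
\mathbf{B}(x) &= \nabla\times\mathbf{A}(x) \\
&= \frac{1}{c} \int \underbrace{\nabla_x \left( \frac{1}{|x - y|} \right)}_{\text{Coulomb}} \times\, \mathbf{J}(y)\; dy \\
&= \frac{1}{c} \int \frac{\mathbf{J}(y)\times(x - y)}{|x - y|^3}\; dy
\end{align*}
This is the Biot-Savart law.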
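For a thin wire the volume integral becomes a line integral, J(y) dy → I dl, and the Biot-Savart law can be evaluated by direct summation. The sketch below does this for a circular loop of radius a in the x-y plane and compares the field on the axis with the elementary result $B_z = 2\pi I a^2 / \big( c\, (a^2 + h^2)^{3/2} \big)$. Units with c = 1, the midpoint discretization and the function name are assumptions of the sketch.

```python
import math

def loop_field(x, a=1.0, current=1.0, c=1.0, n=4000):
    """B(x) = (I/c) * sum over line elements of dl x (x - y) / |x - y|^3 for a circular loop."""
    bx = by = bz = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        # Source point y on the loop and the line element dl tangent to it
        yx, yy = a * math.cos(t), a * math.sin(t)
        dlx, dly = -a * math.sin(t) * dt, a * math.cos(t) * dt
        rx, ry, rz = x[0] - yx, x[1] - yy, x[2]
        r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
        # Cross product dl x r, with dl_z = 0
        bx += dly * rz / r3
        by += -dlx * rz / r3
        bz += (dlx * ry - dly * rx) / r3
    scale = current / c
    return (scale * bx, scale * by, scale * bz)

h = 0.5
on_axis = loop_field((0.0, 0.0, h))
expected_bz = 2.0 * math.pi / (1.0 + h * h) ** 1.5
print(on_axis, expected_bz)
```

The same function can be called at points off the axis, where no elementary closed form is available.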



Exercise 10.11 (A straight line of current). Consider a cylindrically symmetric
tube carrying constant current I along the z-axis. Using the cylindrical symmetry
of the problem and the integral version of Ampere’s law show that
\[
\mathbf{B} = 2I\, \frac{\hat z\times x}{|x|^2}
\]
Exercise 10.12 (Constant magnetic fields). The vector potential of a constant
magnetic field is a linear vector-valued function and so of the form
\[
\mathbf{A} = \mathbf{a}\times x + (\mathbf{b}\cdot x)\, \mathbf{c}
\]
Show that
\[
\mathbf{B} = 2\mathbf{a} + \mathbf{b}\times\mathbf{c}, \qquad \nabla\cdot\mathbf{A} = \mathbf{c}\cdot\mathbf{b}
\]

10.8.2 Magnetic dipole


The current associated with thin loop of radius a in the x − y plane carrying
current I is

2J = (−y, x, 0) Iδ(x2 + y 2 − a2 )δ(z)


= 21 (Iẑ × ∇)θ(a2 − x2 − y 2 ) δ(z)

where θ(x) = 1 for x > 0 and 0 otherwise is the standard step function. Consider
the limit a → 0 and I → ∞ such that $Ia^2$ is fixed.

Exercise 10.13 (Delta function). Show that

θ(a2 − x2 − y 2 )
lim δ(z) = δ(x)
a→0 πa2

The a → 0 limit represents a point dipole, characterized by a vector

π 2 Ia2
 
m= ẑ
c

Ampere equation takes the form

(∇ × B)(x) = 4π(m × ∇)δ(x)

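To find B we could plug the source into Biot-Savart. However, this is not much
simpler than retracing the derivation. For A we find
\begin{align*}
\mathbf{A}(x) &= \int dy\; \frac{(\mathbf{m}\times\nabla)_y\, \delta(y)}{|x - y|} \\
&= \int dy \left[ \underbrace{(\mathbf{m}\times\nabla)_y \frac{\delta(y)}{|x - y|}}_{\text{bdry term}} - (\mathbf{m}\times\nabla)_x \frac{\delta(y)}{|x - y|} \right] \\
&= -(\mathbf{m}\times\nabla)_x \int dy\; \frac{\delta(y)}{|x - y|} \\
&= -(\mathbf{m}\times\nabla) \frac{1}{|x|} \\
&= \frac{\mathbf{m}\times x}{|x|^3}
\end{align*}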
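To compute B we need a version of the vector identity
\[
\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b})
\]
where a = ∇ is a differential operator, b = m a fixed vector and $\mathbf{c} = x r^{-3}$ a
vector valued function. Reflection shows that the right form is
\[
\nabla\times(\mathbf{m}\times\mathbf{c}) = \mathbf{m}(\nabla\cdot\mathbf{c}) - (\mathbf{m}\cdot\nabla)\mathbf{c}
\]
From the solution of the Coulomb problem we know that
\[
\nabla\cdot(x r^{-3}) = 4\pi\delta(x)
\]
Hence
\[
\mathbf{B}(x) = 4\pi \mathbf{m}\, \delta(x) - (\mathbf{m}\cdot\nabla) \frac{x}{r^3}
\]
It follows that the magnetic field of a dipole is
\[
\mathbf{B} = \frac{-\mathbf{m} + 3(\mathbf{m}\cdot\hat x)\hat x}{|x|^3} + 4\pi \mathbf{m}\, \delta(x)
\]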
Exercise 10.14. Verify all steps.
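Part of the verification can be delegated to a computer algebra system. The following sketch, which assumes SymPy is available, checks that away from the origin the curl of A = m × x/|x|³ reproduces the dipole field (−m + 3(m · x̂)x̂)/|x|³. The delta function term at the origin is not captured by this pointwise check, and the helper `curl` is defined here only for the illustration.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
m1, m2, m3 = sp.symbols('m1 m2 m3', real=True)
X = sp.Matrix([x, y, z])
m = sp.Matrix([m1, m2, m3])
r = sp.sqrt(x**2 + y**2 + z**2)

def curl(F):
    """Cartesian curl of a 3-component column vector F(x, y, z)."""
    return sp.Matrix([
        sp.diff(F[2], y) - sp.diff(F[1], z),
        sp.diff(F[0], z) - sp.diff(F[2], x),
        sp.diff(F[1], x) - sp.diff(F[0], y),
    ])

A = m.cross(X) / r**3                                   # dipole vector potential
B = curl(A)
B_expected = (3 * m.dot(X) * X / r**2 - m) / r**3       # dipole field for x != 0
residual = (B - B_expected).applyfunc(sp.simplify)
print(residual.T)
```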
Remark 10.15 (Singularity). The magnetic field has a bad (non-integrable)
singularity at the origin. One way to see this is to consider the total flux through
the origin. Take the plane through the dipole with normal m. The flux density
through such a plane is
$$\begin{aligned}
\mathbf B\cdot\mathbf m &= -\frac{(\mathbf m\cdot\mathbf m)(\mathbf x\cdot\mathbf x) - 3\overbrace{(\mathbf m\cdot\mathbf x)^2}^{=0}}{|\mathbf x|^5} + 4\pi m^2\delta(\mathbf x)\\
&= -\frac{(\mathbf m\times\mathbf x)^2}{|\mathbf x|^5} + 4\pi m^2\delta(\mathbf x)
\end{aligned}$$
The first term has a non-integrable singularity at the origin. At the same time,
we know that the total flux through any surface must be zero.

Exercise 10.16 (Vanishing flux). Show that the total flux through any such
plane at distance ε from the origin vanishes.

Figure 10.2: Dipole field

10.9 Dirac monopoles


Dirac monopoles were invented, by Dirac of course, to explain the quantization
of charge: The charge of the proton is exactly minus that of the electron, in
contrast with, say, their mass ratio which does not look like a simple
fraction. Dirac realized that if there was even a single monopole of magnetic
charge e_m anywhere in the universe, say, behind Andromeda, then charge
quantization would be a consequence of quantum mechanics: The electric charge
e of any quantum particle will be constrained by
$$\frac{2 e_m e}{\hbar c}\in\mathbb Z$$
This is Dirac charge quantization. It can be viewed as a
consequence of the Aharonov-Bohm effect. We look for a vector potential whose
flux is the monopole charge e_m, and with a nice Coulombic field
$$\nabla\times\mathbf A = e_m\frac{\mathbf x}{|\mathbf x|^3} \tag{10.38}$$
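There is no smooth A that solves Eq. 10.38. Nevertheless, let us try. In spherical
coordinates the three equations for A are
$$(\nabla\times A)^r = \frac{1}{\sqrt g}\big(\partial_\theta A_\varphi - \cancel{\partial_\varphi A_\theta}\big) = \frac{e_m}{r^2},\qquad g = r^4\sin^2\theta \tag{10.39}$$
and
$$(\nabla\times A)^\theta = \frac{\cancel{\partial_\varphi A_r} - \partial_r A_\varphi}{\sqrt g} = 0,\qquad (\nabla\times A)^\varphi = \frac{\cancel{\partial_r A_\theta} - \cancel{\partial_\theta A_r}}{\sqrt g} = 0 \tag{10.40}$$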

Figure 10.3: An infinitesimal loop around the positive z-axis carries flux of 4πem
while the same loop around the negative z-axis carries zero flux.

The cancelled terms are a consequence of choosing Ar = Aθ = 0, which is suggested
by symmetry. As we shall see, this choice is good enough.
Eq. 10.39 reduces to
$$\partial_\theta A_\varphi = e_m\sin\theta \;\Longrightarrow\; A_\varphi = e_m(1-\cos\theta) \tag{10.41}$$
where we have chosen the integration constant so that Aφ = 0 on the positive
z-axis, θ = 0. Since Aφ is independent of r we also satisfy Eq. 10.40.
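The claim that Aφ = e_m(1 − cos θ) reproduces the radial monopole field, and the value of its circulation around the z-axis, can be checked symbolically. The sketch below assumes SymPy; the variable names are chosen for this illustration and are not part of the notes.

```python
import sympy as sp

r, theta, e_m = sp.symbols('r theta e_m', positive=True)

A_phi = e_m * (1 - sp.cos(theta))        # Eq. 10.41, with A_r = A_theta = 0
sqrt_g = r**2 * sp.sin(theta)            # sqrt(g) for g = r^4 sin^2(theta)

B_r = sp.simplify(sp.diff(A_phi, theta) / sqrt_g)     # radial component, Eq. 10.39
B_theta = sp.simplify(-sp.diff(A_phi, r) / sqrt_g)    # theta component, Eq. 10.40

# Circulation of A around a loop of fixed polar angle theta encircling the z-axis
circulation = 2 * sp.pi * A_phi
print(B_r, B_theta, circulation.subs(theta, 0), circulation.subs(theta, sp.pi))
```

The circulation vanishes as the loop shrinks onto the positive z-axis (θ → 0) but tends to 4πe_m on the negative z-axis (θ → π), which is where the potential fails to be smooth.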
It seems that we have successfully solved Eq. 10.38 and the solution appears
to be nice. But we know this cannot be, since an honest A implies ∇ · B = 0
and no monopoles. Indeed, Aφ ≠ 0 on the negative z-axis, where θ = π, hides a
singularity. Spherical coordinates are singular when sin θ = 0. A smooth vector
field must therefore have Aφ = 0 whenever sin θ = 0. The vector field we found
is nice on the positive z-axis but singular on the negative z-axis. The singularity
is called the Dirac string. You can see this from Stokes' theorem for a small loop
around the negative z-axis. If you integrate A · dℓ around the loop you find
$$\oint A_\varphi\,d\varphi = 4\pi e_m \tag{10.42}$$

This can be given two interpretations: As the total flux through the sphere
minus a tiny hole near the south pole, or as minus the total flux through the
hole. 4πe_m is indeed the total flux of a monopole. Hence the Dirac string can
be thought of as the solenoid that feeds the flux of the monopole. Now, how shall
we think of this Dirac string? On the one hand it is a gauge dependent property.
We could have chosen our coordinate system so that the string would go from
the origin to infinity along any direction we wish. On the other hand the Dirac
string carries magnetic field with flux 4πe_m. Magnetic fields are physical!
The problem would go away if we could use quantum mechanics to argue that
there are certain invisible magnetic fluxes. Indeed, this is what the Aharonov-Bohm
effect tells us: In an interference experiment involving a flux tube, one
cannot distinguish no flux from a flux with an integer number of flux
quanta. This says that the Dirac string becomes invisible if
$$4\pi e_m = \text{Integer}\times\Phi_0,\qquad \Phi_0 = \frac{hc}{e} \tag{10.43}$$
We can rearrange this as
$$e = \text{Integer}\times\frac{\hbar c}{2e_m} \tag{10.44}$$
This is what Dirac set out to show: All electric charges must be an integer
multiple of a basic charge.

Figure 10.4: The interference pattern from a flux tube is periodic in the flux
with period Φ0 = hc/e.

I'd like now to derive Dirac's result without appealing
to the Aharonov-Bohm effect. This argument is more mathematical, but you
will see it often.
We can get rid of the Dirac string by doing something that geographers do:
Plotting the earth on one sheet of paper leads to singularities at the poles. The
problem can be avoided by plotting the northern hemisphere on one sheet and the
southern hemisphere on another.
On the half space z ≥ 0 we use the gauge
$$A^N_\varphi = e_m(1-\cos\theta) \tag{10.45}$$
which has a Dirac string along z ≤ 0. On the half space z ≤ 0 we use the gauge
$$A^S_\varphi = e_m(-1-\cos\theta) \tag{10.46}$$
which has a Dirac string along z ≥ 0.
We now need to “glue” the two half spaces by a gauge transformation. It is
here that quantum mechanics enters. We need three facts:
• A unitary transformation of the state |ψ⟩ and operators W gives an equivalent
description of the system
$$(|\psi\rangle, W) \iff (U|\psi\rangle,\; U W U^\dagger) \tag{10.47}$$
• Charged particles couple to the electromagnetic field through minimal
coupling
$$-i\hbar\partial_j - \frac{e}{c}A_j \tag{10.48}$$
• A gauge transformation is a unitary U that is a function of the coordinates.
It follows that a gauge transformation affects minimal coupling:
$$-i\hbar\partial_j - \frac{e}{c}A_j \;\Longrightarrow\; U\Big(-i\hbar\partial_j - \frac{e}{c}A_j\Big)U^\dagger = -i\hbar\partial_j - \frac{e}{c}\Big(A_j \underbrace{-\, i\frac{\hbar c}{e}\,\partial_j\log U}_{\text{pure gauge}}\Big) \tag{10.49}$$
We can try to glue the two half spaces by taking U = e^{inφ}, which is single
valued only if n is an integer. This is possible provided
$$A^N_\varphi - A^S_\varphi = -i\frac{\hbar c}{e}\,\partial_\varphi\log U = n\frac{\hbar c}{e} \tag{10.50}$$
Comparing with Eqs. 10.45, 10.46 we see that
$$A^N_\varphi - A^S_\varphi = n\frac{\hbar c}{e} = 2e_m \tag{10.51}$$
We recovered Eq. 10.44. This gives the Dirac quantization rule.

10.10 Application to geometry


10.10.1 Vector fields in 3D: Source and vorticity
Given a vector field V in 3 dimensions, its source ρ and vorticity ω are defined
by:
∇ · V = 4πρ, ∇ × V = 4πω
Basic facts are:
• Vorticity is sourceless
∇·ω =0

• Radial vector fields are vorticity free.


• x is vector field with uniform source, ∇ · x = 3.
The converse is also true: The sources ρ and ω with ∇ · ω = 0 determine
the field V. By linearity, we can decompose the problem of constructing V into
two problems:
V =E+B
where E is irrotational (conservative)

∇ · E = 4πρ, ∇×E=0

and B is sourceless
∇ · B = 0, ∇ × B = 4πω
As we have seen, the equations for E and B are solved by the same technique.

10.10.2 Linking number


Suppose you have two loops γ1 and γ2 in space and you want to know if they
link. Imagine that the loop γ1 carries a unit current. Then, if the loop γ2 links
the loop γ1 n times, we have
$$\oint_{\gamma_2}\mathbf B(\mathbf x_2)\cdot d\mathbf x_2 = \frac{4\pi n}{c}$$

Figure 10.5: Left: An irrotational field with a source at the origin. Right: A
sourceless field with vorticity along the z-axis.

Now plug in B from the solution of Poisson's equation to get
$$\oint_{\gamma_2}\oint_{\gamma_1}\frac{d\mathbf x_2\cdot(\mathbf x_2-\mathbf x_1)\times d\mathbf x_1}{|\mathbf x_2-\mathbf x_1|^3} = 4\pi n \tag{10.52}$$
If n ≠ 0 the loops link. The converse is, however, not always true.
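The double integral of Eq. 10.52 is easy to evaluate numerically. The sketch below discretizes two closed curves into short chords and sums the integrand at chord midpoints. The helper names `circle` and `linking_number` and the particular circles are assumptions made for the illustration; the sign of the result depends on the orientation chosen for the loops.

```python
import numpy as np

def circle(center, e1, e2, radius=1.0, n=400):
    """Points on a circle spanned by the orthonormal vectors e1, e2."""
    t = np.arange(n) * 2 * np.pi / n
    return np.asarray(center, dtype=float) + radius * (np.outer(np.cos(t), e1) + np.outer(np.sin(t), e2))

def linking_number(curve1, curve2):
    """Discretized Gauss integral of Eq. 10.52 divided by 4*pi."""
    d1 = np.roll(curve1, -1, axis=0) - curve1      # chord vectors dx1
    d2 = np.roll(curve2, -1, axis=0) - curve2      # chord vectors dx2
    m1 = curve1 + 0.5 * d1                         # chord midpoints on gamma1
    m2 = curve2 + 0.5 * d2                         # chord midpoints on gamma2
    r = m2[:, None, :] - m1[None, :, :]            # x2 - x1 for every pair of chords
    cross = np.cross(r, d1[None, :, :])            # (x2 - x1) x dx1
    integrand = np.einsum('ik,ijk->ij', d2, cross) / np.linalg.norm(r, axis=2) ** 3
    return integrand.sum() / (4 * np.pi)

gamma1 = circle([0, 0, 0], [1, 0, 0], [0, 1, 0])   # unit circle in the x-y plane
gamma2 = circle([1, 0, 0], [1, 0, 0], [0, 0, 1])   # unit circle in the x-z plane, threading gamma1
gamma3 = circle([3, 0, 0], [1, 0, 0], [0, 0, 1])   # same circle moved away, not linked

n_linked = linking_number(gamma1, gamma2)
n_unlinked = linking_number(gamma1, gamma3)
print(n_linked, n_unlinked)
```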

Figure 10.6: Linking circles. The linking number, here 1, can be computed from
Eq. 10.52
Chapter 11

Electromagnetic waves

11.1 Maxwell’s equations in the Lorenz gauge


The inhomogeneous Maxwell equations are:
$$\partial^\mu F_{\mu\nu} = -\frac{4\pi}{c}\,j_\nu$$
The homogeneous Maxwell equations are automatically satisfied if we write F
in terms of A. Hence, all of Maxwell's equations are simply encoded in
$$\partial^\mu(\partial_\mu A_\nu - \partial_\nu A_\mu) = -\frac{4\pi}{c}\,j_\nu \tag{11.1}$$
Gauge freedom allows us to impose a gauge condition on A, and it is convenient
to pick the Lorenz gauge¹
$$\partial^\mu A_\mu = 0 \tag{11.2}$$
Maxwell's equations in the Lorenz gauge are a set of decoupled wave equations
for Aµ with source jµ
$$\partial^\mu\partial_\mu A_\nu = \Box A_\nu = -\frac{4\pi}{c}\,j_\nu \tag{11.3}$$
where
$$\Box = \partial^\mu\partial_\mu = -\frac{1}{c^2}\partial_{tt} + \Delta$$
is the d'Alembertian, the operator associated with the wave equation.

11.1.1 Ambiguity of the Lorenz gauge


Unlike the Coulomb gauge, the Lorenz gauge does not fix Aµ uniquely. Indeed,
if Aµ satisfies the Lorenz gauge condition so does
$$A_\mu \mapsto A_\mu + \partial_\mu\Lambda \tag{11.4}$$
provided Λ satisfies the wave equation
$$\partial^\mu\partial_\mu\Lambda = 0 \tag{11.5}$$

¹ I shall show later that this is always possible.

11.2 Electromagnetic waves


In the absence of 4-currents and in the Lorenz gauge the potentials satisfy the
wave equation
$$\Box A_\mu = 0 \tag{11.6}$$
Remark 11.1 (Lorentz invariance). Since the d'Alembertian □ and the Lorenz
gauge condition are manifestly Lorentz invariant, Lorentz transformations
of electromagnetic waves are electromagnetic waves.

11.2.1 Electric and Magnetic fields


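Since the gauge fields are not measurable objects, it is useful to see that the fields
too satisfy the wave equation with a source term. Taking the time derivative of
Faraday's law
$$\dot{\mathbf B} + c\nabla\times\mathbf E = 0 \;\Longrightarrow\; \ddot{\mathbf B} + c\nabla\times\dot{\mathbf E} = 0$$
Substituting Ampere's law
$$-\dot{\mathbf E} + c\nabla\times\mathbf B = 4\pi\mathbf J$$
gives
$$\ddot{\mathbf B} = c\nabla\times(-c\nabla\times\mathbf B + 4\pi\mathbf J) = c^2\Delta\mathbf B - c^2\nabla\underbrace{(\nabla\cdot\mathbf B)}_{=0} + 4\pi c\,\nabla\times\mathbf J$$
This leads to the wave equation with a source term that is proportional to
∇ × J:
$$\Box\mathbf B = -\frac{4\pi}{c}\nabla\times\mathbf J$$
In particular, when ∇ × J = 0 we get the free wave equation for the magnetic
field.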
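Similarly, for the electric field we have from the time derivative of Ampere's law
$$-\ddot{\mathbf E} + c\nabla\times\dot{\mathbf B} = 4\pi\dot{\mathbf J}$$
Substituting Faraday's law and making use of Gauss's law, ∇ · E = 4πρ, gives
$$\begin{aligned}
\ddot{\mathbf E} &= -c^2\nabla\times(\nabla\times\mathbf E) - 4\pi\dot{\mathbf J} = c^2\Delta\mathbf E - c^2\nabla(\nabla\cdot\mathbf E) - 4\pi\dot{\mathbf J}\\
&= c^2\Delta\mathbf E - 4\pi\big(c^2\nabla\rho + \dot{\mathbf J}\big)
\end{aligned}$$
The electric field satisfies the wave equation with a source term proportional to
∇ρ + J̇/c²:
$$\Box\mathbf E = 4\pi\Big(\nabla\rho + \frac{\dot{\mathbf J}}{c^2}\Big)$$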

In the absence of source terms, the electric field satisfies the free wave equation.
This was one of Maxwell’s great discoveries, namely, that even in the absence
of sources, the equations admit interesting wave-like solutions. This allowed
him to interpret light coming from the stars and propagating in vacuum, as
electromagnetic waves and eventually lead to the discovery of the radio.

11.3 Plane waves


Consider (the real part of)
$$A_\mu(x) = a_\mu e^{ik\cdot x},\qquad k\cdot x = k_\mu x^\mu \tag{11.7}$$
with aµ a 4-vector of fixed amplitudes and kµ = (−ω/c, k) a fixed 4-wave vector.
The associated field is
$$F_{\mu\nu} = i(k_\mu a_\nu - k_\nu a_\mu)e^{ik\cdot x}$$
The source-free Maxwell equations, Eq. (11.1) with j = 0, reduce to an algebraic
equation for k and a linear equation for aν
$$(k\cdot k)\,a_\nu - k_\nu\,(k\cdot a) = 0 \tag{11.8}$$
By the Lorenz gauge condition, (k · a) = 0 and hence also
$$k\cdot k = 0,\qquad k\cdot a = 0$$
This says that k is a light-like vector. The dispersion relation is
$$\omega = \pm c|\mathbf k|$$
To understand the Lorenz gauge condition, k · a = 0, let us orient the Euclidean
frame so that the wave propagates in the z-direction. The light-like vector kµ
and the amplitude aµ are then
$$k_\mu = \frac{\omega}{c}(-1, 0, 0, 1),\qquad a_\mu = \underbrace{(a_0, a_1, a_2, -a_0)}_{\text{Lorenz gauge}}$$
This is how a plane wave looks in the Lorenz gauge.
There is a remnant gauge freedom that allows us to choose a0. The part
proportional to a0,
$$a_0(1, 0, 0, -1)\,e^{-i\omega(t-z/c)} = i\frac{c\,a_0}{\omega}\,\partial_\mu e^{-i\omega(t-z/c)},$$
is a pure gauge. This allows us to set a0 = 0, reducing the Lorenz gauge to the
Coulomb gauge:
$$k_\mu = \frac{\omega}{c}(-1, 0, 0, 1),\qquad a_\mu = \underbrace{(0, a_1, a_2, 0)}_{\text{Coulomb gauge}} \tag{11.9}$$
In the Coulomb gauge, the amplitudes are orthogonal (in Euclidean space) to
the direction of propagation.

11.3.1 Electric and magnetic fields


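For plane waves in the Coulomb gauge, A ∝ e^{i(k·x − ωt)} and
$$\mathbf E = i\frac{\omega}{c}\mathbf A,\qquad \mathbf B = i\mathbf k\times\mathbf A$$
Since k · A = 0 this implies that E and B are orthogonal and have equal
magnitudes. E, B and k form an orthogonal triad.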

Figure 11.1: The triad of E, B, k for a plane wave. The wave propagates in
the k̂ = Ê × B̂ direction

11.3.2 Doppler
Since k · x is a Lorentz scalar
$$k\cdot x = k'\cdot x'$$
the Lorentz transformation of a plane wave is a plane wave. The wave has, in
general, different wave vectors and amplitudes in different frames:
$$k'_\mu = \Lambda_\mu{}^\nu k_\nu,\qquad a'_\mu = \Lambda_\mu{}^\nu a_\nu$$

Longitudinal Doppler
Consider a plane wave propagating in the z-direction, Eq. (11.9). Boosting the
wave with rapidity φ in the same direction is the same as viewing the wave from
an inertial frame boosted in the opposite direction. The associated Lorentz
transformation is
$$\Lambda_0{}^3 = \Lambda_3{}^0 = -\sinh\varphi,\qquad \Lambda_0{}^0 = \Lambda_3{}^3 = \cosh\varphi,\qquad \Lambda_1{}^1 = \Lambda_2{}^2 = 1 \tag{11.10}$$
The Lorentz transformation of the light-like vector kµ = (ω/c)(−1, 0, 0, 1) gives
k′µ = (ω′/c)(−1, 0, 0, 1) where
$$\omega' = \omega(\cosh\varphi + \sinh\varphi) = \omega e^{\varphi} = \omega\sqrt{\frac{1+\beta}{1-\beta}} \tag{11.11}$$
and we used the relation between rapidity and velocity
$$\beta = \tanh\varphi \tag{11.12}$$
The Doppler shift is linear in the velocity for small speeds. The amplitudes a1,2
are not affected by the boost.

Transverse Doppler
Consider, as before, a wave propagating in the z-direction, but a boost in the
x-direction so that
$$\Lambda_0{}^1 = \Lambda_1{}^0 = -\sinh\varphi,\qquad \Lambda_0{}^0 = \Lambda_1{}^1 = \cosh\varphi,\qquad \Lambda_2{}^2 = \Lambda_3{}^3 = 1 \tag{11.13}$$
The wave vector kµ = (ω/c)(−1, 0, 0, 1) is transformed to a light-like wave vector
k′µ = (ω/c)(−cosh φ, sinh φ, 0, 1) in the x-z plane. The new frequency is
$$\omega' = \omega\cosh\varphi = \omega\gamma$$
The shift is quadratic in the velocity for small speeds.
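Both Doppler formulas can be checked by applying the boost matrices to the wave vector numerically. The sketch below uses units with c = 1 and the covariant components kµ = ω(−1, 0, 0, 1); the helper `boost` and the numerical values are assumptions made for the illustration.

```python
import numpy as np

def boost(phi, axis):
    """Boost matrix of Eq. 11.10 (axis=3) or Eq. 11.13 (axis=1) for rapidity phi."""
    L = np.eye(4)
    L[0, 0] = L[axis, axis] = np.cosh(phi)
    L[0, axis] = L[axis, 0] = -np.sinh(phi)
    return L

omega, phi = 1.0, 0.3                            # units with c = 1
k = omega * np.array([-1.0, 0.0, 0.0, 1.0])      # covariant k_mu of a wave moving along z

k_long = boost(phi, 3) @ k                       # boost along the propagation direction
k_trans = boost(phi, 1) @ k                      # boost perpendicular to it
omega_long = -k_long[0]                          # frequency is minus the time component
omega_trans = -k_trans[0]
beta = np.tanh(phi)

eta = np.diag([-1.0, 1.0, 1.0, 1.0])             # Minkowski metric, signature (-,+,+,+)
print(omega_long, omega * np.exp(phi))
print(omega_trans, omega / np.sqrt(1 - beta**2))
print(k_trans @ eta @ k_trans)                   # should vanish: k' is still light-like
```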

11.4 Polarization
11.4.1 Amplitude and phase
Scalar plane waves are simply characterized by their frequency ω, wave vector
k, amplitude and phase. Electromagnetic waves, being vector valued, are more
complicated. In addition to the amplitude and phase they are also characterized
by their polarization.
The electric field of an electromagnetic plane wave propagating in the direction
k is the real part of
$$\mathbf E_0\,e^{i\phi},\qquad \phi = \mathbf k\cdot\mathbf x - \omega t \tag{11.14}$$
The amplitude, E0 , is a complex vector in the plane perpendicular to k since
k · E0 = 0. E0 therefore has 4 real amplitudes. What is their physical interpre-
tation? Suppose we scale
E0 7→ λE0 , λ = |λ|eiγ (11.15)
One would then say that the amplitude has been scaled by |λ| and the phase
shifted by γ. We have now identified 2 of the 4 parameters in E0 : An amplitude
and a phase. It remains to identify the remaining two parameters hidden in E0 .
We are now in a situation that is reminiscent of quantum mechanics: The wave
function is a complex vector with an equivalence relation by an overall complex
number.

Figure 11.2: Four Stokes parameters describe elliptically polarized light. Three
numbers identify the size of the ellipse, its tilt to the axes, its eccentricity. A
fourth number gives the purity (the coherence) of the light. The plane of the
ellipse is perpendicular to the direction of propagation k.

11.4.2 Polarization
Let x̂ and ŷ denote orthogonal unit vectors in the plane perpendicular to k. Let
me introduce basis vectors, with Dirac ket notation
$$|z_\pm\rangle = \frac{\hat{\mathbf x}\pm i\hat{\mathbf y}}{\sqrt 2} \tag{11.16}$$
The states are normalized and orthogonal
$$\langle z_\pm|z_\pm\rangle = 1,\qquad \langle z_\pm|z_\mp\rangle = 0 \tag{11.17}$$
The complex vector E0 can be represented as
$$\mathbf E_0 \iff E_+|z_+\rangle + E_-|z_-\rangle = |E\rangle \tag{11.18}$$
We are interested in the degrees of freedom that remain after we identify E0
with λE0. This is precisely the equivalence relation we have in quantum mechanics
for qubits. Quantum mechanics provides us with a procedure for factoring
out the normalization and the overall phase of a quantum state. This is the
density matrix representation of quantum states:
$$\rho = \frac{|E\rangle\langle E|}{\langle E|E\rangle} = \frac{1}{|E_+|^2+|E_-|^2}\begin{pmatrix}|E_+|^2 & E_+E_-^*\\ E_+^*E_- & |E_-|^2\end{pmatrix} \tag{11.19}$$
which does not care about the overall normalization and phase. Since Tr ρ = 1
while det ρ = 0 the two eigenvalues of ρ are 1 and 0: ρ is a projection
$$\rho^2 = \rho \tag{11.20}$$

Figure 11.3: The Poincare sphere associates with every point on the sphere
a polarization. The north and south poles represent right and left circularly
polarized light and the equator represents linearly polarized light.

11.4.3 Poincare sphere


Any 2 × 2 hermitian matrix of unit trace can be written as
$$\rho = \tfrac12(1 + \mathbf s\cdot\boldsymbol\sigma) = \frac12\begin{pmatrix}1+s_3 & s_1+is_2\\ s_1-is_2 & 1-s_3\end{pmatrix} \tag{11.21}$$
where s = (s1, s2, s3) ∈ R³ and σ = (σ1, σ2, σ3) is the vector of Pauli matrices:
$$\sigma_3 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix},\qquad \sigma_1 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix},\qquad \sigma_2 = -\begin{pmatrix}0 & -i\\ i & 0\end{pmatrix} \tag{11.22}$$
From Eq. (11.21)
$$4\det\rho = 1 - \mathbf s\cdot\mathbf s \tag{11.23}$$
so ρ is a projection if s is a unit vector ŝ. This identifies the remaining two
parameters with points on the unit sphere. The physical meaning of these degrees
of freedom is polarization. It is interesting that these natural degrees of freedom
are parametrized by the sphere.
Exercise 11.2. Show that antipodal points on the Poincare sphere represent
orthogonal states in the sense that ρŝ · ρ−ŝ = 0.

11.4.4 Circular polarization


The north pole of the Poincare sphere corresponds to |z+⟩ so that
$$\mathbf E \propto \mathrm{Re}\big[(\hat{\mathbf x} + i\hat{\mathbf y})e^{i\phi}\big] = \hat{\mathbf x}\cos\phi - \hat{\mathbf y}\sin\phi \tag{11.24}$$
As φ increases from 0 to 2π the vector E describes a circle in the x-y plane
which is turning counter clockwise. At a fixed t and as a function of z the
field rotates counter-clockwise like a right handed screw and so is called right
circularly polarized.

Exercise 11.3 (South pole). Show that the south pole represents left circular
polarization.

11.4.5 Linear polarization


The equator is s3 = 0. This implies that |E+| = |E−| and so the states on the
equator are parameterized by ψ:
$$\big(e^{-i\psi/2},\, e^{i\psi/2}\big)/\sqrt 2 \tag{11.25}$$
The corresponding E0 is a real vector in the plane:
$$\begin{aligned}
\mathbf E_0 &= \tfrac12(\hat{\mathbf x} + i\hat{\mathbf y})e^{-i\psi/2} + \tfrac12(\hat{\mathbf x} - i\hat{\mathbf y})e^{i\psi/2}\\
&= \hat{\mathbf x}\cos(\psi/2) + \hat{\mathbf y}\sin(\psi/2)
\end{aligned}$$
As ψ increases from 0 to 2π this describes a line element in the x-y plane at
angle ψ/2 to the x axis. ψ = 0 corresponds to an x̂ polarized wave and ψ = π to
a ŷ polarized wave.

11.4.6 Stokes parameters


Since we are oblivious to the overall phase, we may write
(E+, E−) = (cos χ, e^{iψ} sin χ), with cos χ ≥ 0, i.e. 0 ≤ χ ≤ π/2 and
0 ≤ ψ < 2π. The corresponding points on the unit sphere are
$$s_3 = \cos^2\chi - \sin^2\chi = \cos 2\chi,\qquad s_1 + is_2 = 2e^{-i\psi}\cos\chi\sin\chi = e^{-i\psi}\sin 2\chi$$
This makes 2χ and ψ the standard spherical coordinates.
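The map from amplitudes to the Poincare sphere is simple to compute. The following sketch builds ρ from (E+, E−) as in Eq. 11.19 and reads off s from Eq. 11.21; the function name `stokes_vector` and the sample angles are assumptions made for the illustration.

```python
import numpy as np

def stokes_vector(E_plus, E_minus):
    """Return (s, rho) for the polarization state E_plus|z+> + E_minus|z->."""
    E = np.array([E_plus, E_minus], dtype=complex)
    rho = np.outer(E, E.conj()) / np.vdot(E, E).real    # density matrix, Eq. 11.19
    s3 = (rho[0, 0] - rho[1, 1]).real                   # diagonal of Eq. 11.21
    s1 = 2 * rho[0, 1].real                             # rho[0, 1] = (s1 + i s2) / 2
    s2 = 2 * rho[0, 1].imag
    return np.array([s1, s2, s3]), rho

chi, psi = 0.4, 1.1
s, rho = stokes_vector(np.cos(chi), np.exp(1j * psi) * np.sin(chi))
expected = np.array([np.sin(2 * chi) * np.cos(psi),
                     -np.sin(2 * chi) * np.sin(psi),
                     np.cos(2 * chi)])
print(s, expected, np.linalg.norm(s))
```

For a pure state the vector s has unit length and ρ is a projection, which the sketch makes easy to confirm for any choice of amplitudes.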

11.4.7 Partially polarized light


The discussion so far addressed what are commonly known as coherent waves:
The waves are given by a function. In practice, one often encounters situations
where the light one gets is from an ensemble of different sources and one
has only statistical information about the plane waves. Such waves are called
incoherent. One way to model incoherence is as a statistical average
$$\mathbf E_0 = \sum_j p_j\mathbf E_j,\qquad p_j\ge 0,\qquad \sum_j p_j = 1$$
where Ej represent independent (normalized) light sources, namely
$$\langle(\mathbf E_j)_a(\mathbf E_k)_b\rangle = 0,\qquad j\ne k,\quad \forall a, b$$
In the case that all these sources emit plane waves all sharing the same direction
k̂ the polarization of the mixture is naturally defined as the mixture of
polarizations
$$\rho = \sum_j p_j\rho_j \tag{11.26}$$
Exercise 11.4. Show that
$$\langle\rho\rangle = \tfrac12\big(1 + \langle\mathbf s\rangle\cdot\boldsymbol\sigma\big) \tag{11.27}$$
It is still true that Tr ρ = 1. But now ⟨ρ⟩ need not be a projection since the
average of unit vectors is shorter, in general, than a unit vector: |⟨s⟩| ≤ 1;
the vector lies in the unit ball. The light we get from the sun is completely
unpolarized. It is associated with s = 0, the center of the Poincare ball.
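A small numerical experiment illustrates how mixing moves the state into the interior of the Poincare ball. The weights and the three sample sources below are arbitrary choices for the illustration, and the helper names are not part of the notes.

```python
import numpy as np

def density_matrix(E_plus, E_minus):
    """Pure-state density matrix of Eq. 11.19 in the |z+>, |z-> basis."""
    E = np.array([E_plus, E_minus], dtype=complex)
    return np.outer(E, E.conj()) / np.vdot(E, E).real

def stokes(rho):
    """Read s = (s1, s2, s3) off a unit-trace hermitian matrix, Eq. 11.21."""
    return np.array([2 * rho[0, 1].real, 2 * rho[0, 1].imag, (rho[0, 0] - rho[1, 1]).real])

weights = [0.5, 0.3, 0.2]                         # p_j, summing to 1
sources = [(1, 0), (1, 1), (1, 1j)]               # one circular and two linear polarizations
rho_mix = sum(p * density_matrix(*E) for p, E in zip(weights, sources))
s_mix = stokes(rho_mix)
degree = np.linalg.norm(s_mix)                    # |<s>|, the degree of polarization

# Equal mixture of the two circular polarizations: completely unpolarized light
rho_unpol = 0.5 * density_matrix(1, 0) + 0.5 * density_matrix(0, 1)
print(s_mix, degree, stokes(rho_unpol))
```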
Remark 11.5 (Combing a tennis ball). One amusing, essentially topological,
property of the transverse nature of electromagnetic waves is that it is not possible
to have a fully spherically symmetric electromagnetic wave. The point is
that a spherical wave, with k pointing radially, has E tangent to the sphere. It
is a basic fact in topology that any continuous tangent vector field on the sphere
must vanish somewhere. The field cannot be “the same” everywhere.

11.5 The wave equation


So far, we have discussed plane wave solutions of the electromagnetic wave
equation. We now turn to investigating the wave equation in general.

11.5.1 The wave equation in one dimension


The one dimensional wave equation, for a scalar field φ, in light-cone coordinates
u = x1 − x0, v = x1 + x0 takes the form
$$\Box\phi = 4\partial_{uv}\phi = 0$$
The general solution is
$$\phi(u, v) = f(u) + g(v)$$
with (essentially) arbitrary f and g. f describes a wave rigidly propagating to
the right at speed c and g a wave rigidly propagating to the left at speed c.
The functions f and g can be determined by the initial (Cauchy) data
$$\phi_0(x_1 = x, x_0 = 0) = f(x) + g(x),\qquad \dot\phi_0(x_1 = x, x_0 = 0) = -\big(f'(x) - g'(x)\big)$$
This allows for reconstructing f and g from the initial data by integration. One
can verify that the solution in terms of the initial data is
$$\phi(x, t) = \frac12\big(\phi_0(x+ct) + \phi_0(x-ct)\big) + \frac{1}{2c}\int_{x-ct}^{x+ct}\dot\phi_0(y)\,dy \tag{11.28}$$
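Eq. 11.28 translates directly into a few lines of code. The sketch below evaluates it with a trapezoid rule for the integral and compares with two standing-wave solutions that are known in closed form. The function name `dalembert`, the grid size and the sample data are assumptions made for the illustration; here the initial velocity is taken as the derivative with respect to t.

```python
import numpy as np

def dalembert(phi0, phidot0, x, t, c=1.0, n=2001):
    """Solution of the 1D wave equation at (x, t) from initial data, Eq. 11.28."""
    y = np.linspace(x - c * t, x + c * t, n)            # domain of dependence of (x, t)
    f = phidot0(y)
    h = y[1] - y[0]
    integral = h * (f.sum() - 0.5 * (f[0] + f[-1]))     # trapezoid rule
    return 0.5 * (phi0(x + c * t) + phi0(x - c * t)) + integral / (2 * c)

x, t, c = 0.3, 0.8, 2.0

# Initial displacement sin(x), zero initial velocity: phi = sin(x) cos(ct)
value_a = dalembert(np.sin, lambda y: 0.0 * y, x, t, c)
exact_a = np.sin(x) * np.cos(c * t)

# Zero initial displacement, initial velocity cos(x): phi = cos(x) sin(ct) / c
value_b = dalembert(lambda y: 0.0 * y, np.cos, x, t, c)
exact_b = np.cos(x) * np.sin(c * t) / c
print(value_a, exact_a, value_b, exact_b)
```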

11.5.2 Waves with Gaussian waists


A plane wave is an idealization: The field extends all the way to infinity in all
directions and has infinite power. Let us now consider a model of a narrow
pencil of light which has a finite width and power.

Figure 11.4: If the initial data, φ0 and φ̇0 are localized in the red interval, the
solution at later times lives in the forward light cone of the initial data. This is
called the domain of influence of the initial (red) data. This statement holds in
any inertial frame.

To do that it is convenient to write the wave equation in 3+1 dimensions as
$$(\partial_{uv} + \partial\bar\partial)\,\phi(u, v, z, \bar z) = 0,\qquad z = x_1 + ix_2,\quad \bar z = x_1 - ix_2$$
where u, v are light cone coordinates, v = x3 − x0, u = x3 + x0, with x0 = ct.
A monochromatic wave with a Gaussian waist is a model of a laser beam.
Consider a monochromatic wave with frequency ω = c/λ and the form
$$\phi(u, v, z, \bar z) = c(u)\,e^{-c(u)z\bar z}\,e^{iv/\lambda}$$
This ansatz solves the wave equation provided the function c(u) satisfies
$$\lambda c^2(u) - ic'(u) = 0.$$

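The solution of this equation has an integration constant ℓ², giving for c(u)
$$c(u) = \frac{1}{\ell^2 + i\lambda u}$$
ℓ is the minimal waist of the beam, attained at u = 0. Since
|e^{−c(u)z z̄}| = e^{−Re c(u) |z|²} with Re c(u) = ℓ²/(ℓ⁴ + λ²u²), the waist
disperses with the law
$$w(u) = \frac{(\ell^4 + \lambda^2u^2)^{1/2}}{\ell}$$
while the amplitude |c(u)| = (ℓ⁴ + λ²u²)^{−1/2} decays. Far from the focus,
|u| ≫ ℓ²/λ, the waist broadens linearly in u, with opening angle λ/ℓ.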
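The Gaussian ansatz can be checked symbolically. The sketch below assumes SymPy, treats z and z̄ as independent variables, and takes the solution of λc² − ic′ = 0 with c(0) = 1/ℓ², which is c(u) = 1/(ℓ² + iλu). It verifies both the ordinary differential equation and the wave equation, and extracts Re c(u), which controls the Gaussian width. All symbol names are chosen for this illustration.

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)
z, zbar = sp.symbols('z zbar')                     # treated as independent variables
ell, lam = sp.symbols('ell lambda', positive=True)

c = 1 / (ell**2 + sp.I * lam * u)                  # solution of lam*c^2 - i*c' = 0 with c(0) = 1/ell^2
phi = c * sp.exp(-c * z * zbar) * sp.exp(sp.I * v / lam)

ode_residual = sp.simplify(lam * c**2 - sp.I * sp.diff(c, u))
wave_residual = sp.simplify((sp.diff(phi, u, v) + sp.diff(phi, z, zbar)) / phi)

re_c = sp.simplify(sp.re(c))                       # |phi| ~ exp(-re_c * |z|^2)
waist_squared = sp.simplify(1 / re_c)              # squared width of the Gaussian profile
print(ode_residual, wave_residual, re_c, waist_squared)
```

The squared width 1/Re c(u) = (ℓ⁴ + λ²u²)/ℓ² equals ℓ² at the focus and grows like (λu/ℓ)² far from it.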

Exercise 11.6. A narrow red laser beam of diameter 1 mm and wavelength
λ = 700 nm is directed at the moon. What is the diameter of the beam on the
moon?

Figure 11.5: The waist of a Gaussian beam as function of z + ct.

11.6 Green’s function for the wave equation


We can use the linearity of the wave equation to represent the solutions of the
wave equation with a source, Eq. 11.3, as the integral
$$A_\mu(x) = \int dy\, j_\mu(y)\,G_4(x-y) \tag{11.29}$$
where Gd+1 is the Green function in space-time, i.e. a solution of the equation
$$\Box G_{d+1}(x) = \delta^{d+1}(x),\qquad \delta^{d+1}(x) = \prod_\mu\delta(x^\mu) \tag{11.30}$$
Eq. 11.30 alone does not fix Gd+1 uniquely, as we can add to
Gd+1 any solution of the free wave equation. Any solution Gd+1 satisfies the
free wave equation in the past, t < 0. We can therefore subtract from Gd+1 the
free wave that agrees with it in the past, to obtain a solution with Gd+1 = 0 in the
past. We can repeat the procedure for different rest frames and then we arrive
at the conclusion that we can impose on Gd+1 the condition that it vanishes
outside the forward light cone. It turns out that Gd+1 has different qualitative
properties depending on whether d is even or odd. Thus although we mostly
care about d = 3, we shall consider the general case.

11.6.1 Conservation law


For t ≠ 0, G solves the free wave equation. Integration on a fixed time slice in
d + 1 space-time gives
$$c^{-2}\int d^dx\,\partial_{tt}G = \int d^dx\,\Delta G = \int d^dx\,\nabla\cdot\nabla G = \oint d^{d-1}x\cdot\nabla G = 0$$
It follows that
$$\int d^dx\,\partial_t G$$
is a conserved quantity for t > 0 (and trivially so for t < 0). It has a jump at
t = 0:
$$\int d^dx\,\partial_t G = \begin{cases}1 & t > 0\\ 0 & t < 0\end{cases}$$

Figure 11.6: Gd+1 = 0 outside the forward light cone. In particular Gd+1 = 0
for xµ xµ > 0.

This is seen by integrating Eq. 11.30 on the space-time slice −ε < t < ε.

11.6.2 Recursion relation for the Green function


To find G4 we shall construct a recursion relation that relates G4 with G2. G2
can be computed explicitly. We shall now derive this recursion relation² that
relates Gd+1 to Gd−1.
The starting observation is the obvious identity
$$\delta^{d}(x_0,\dots,x_d) = \int dx_{d+1}\,\delta^{d+1}(x_0,\dots,x_{d+1}) \tag{11.31}$$
The identity has the interpretation that a point source in d space-time dimensions
is a section of a line-source in d + 1 space-time dimensions.
To proceed, we also observe that since the d'Alembertian is Lorentz invariant,
it is natural to look for Gd+1 which, for t > 0, is a function of the interval
$$s = -x_\mu x^\mu$$
We can figure out Gd from Gd+1 by superposition:
$$G_d(s) = \int_{-\infty}^{\infty}dx_{d+1}\,G_{d+1}(s - x_{d+1}^2)$$

² I have learned this recursion relation from Amos Ori.

Figure 11.7: A point source in 2 dimensions is a two dimensional section of a
line source in 3.

Iterating this relation one more time gives
$$\begin{aligned}
G_d(s) &= \int_{-\infty}^{\infty}dx_{d+1}\int_{-\infty}^{\infty}dx_d\;G_{d+2}(s - x_d^2 - x_{d+1}^2)\\
&= \pi\int_0^\infty d(r^2)\,G_{d+2}(s - r^2)\\
&= \pi\int_{-s}^{\infty}G_{d+2}(-r)\,dr\\
&= \pi\int_{-\infty}^{s}G_{d+2}(r)\,dr
\end{aligned}$$
Differentiating gives the sought-after recursion relation
$$\frac{dG_d(s)}{ds} = \pi G_{d+2}(s) \tag{11.32}$$
So if you know the Green function in 1 dimension, you can find it for all odd
dimensions by differentiation, and if you know it for 2 dimensions you get it for
all even dimensions.

11.6.3 Even space dimensions


In the case d = 0 the equation for the Green function degenerates to an ODE
$$\frac{d^2G_1}{dt^2} = c^2\delta(ct) \tag{11.33}$$
Integrating once, taking into account the light-cone condition, gives
$$\frac{dG_1}{dt} = c\,\theta(t) \tag{11.34}$$
Integrating once more
$$G_1(s) = ct\,\theta(t),\qquad s = (ct)^2 \tag{11.35}$$
For positive times we get
$$G_{0+1}(s) = \sqrt s \tag{11.36}$$
From the recursion relation we then get for G2+1 in the forward light cone
$$G_{2+1}(s) = \frac{1}{2\pi\sqrt s} \tag{11.37}$$
In even space dimensions Gd+1 fills the forward light cone. The Green function
does not quite satisfy the Huygens principle as the wave does not live on the
boundary of the light-cone, but rather fills it up. This is a feature of all even
spatial dimensions.

11.6.4 Odd space dimensions


In the case d = 1 we can make use of the light-cone coordinates to find G. The
equation for G is
$$4\frac{\partial^2G_{1+1}}{\partial u\,\partial v} = 2\delta(u)\delta(v) = \delta(ct)\delta(x) \tag{11.38}$$
Integrating gives
$$G_{1+1} = \frac12\theta(u)\theta(v) = \frac12\theta(t)\theta(s) \tag{11.39}$$
In summary
$$G_{1+1}(x) = \begin{cases}\frac12\theta(s) & \text{in the forward light cone}\\ 0 & \text{otherwise}\end{cases} \tag{11.40}$$
(Note that once again the solution fills the light cone.) From the recursion
relation we now find for G3+1
$$G_{3+1}(s) = \begin{cases}\frac{1}{2\pi}\delta(s) & \text{in the forward light cone}\\ 0 & \text{otherwise}\end{cases} \tag{11.41}$$
Note that G3+1 lives on the surface of the light cone. This is the Huygens
principle. This continues to hold in all higher odd dimensions.
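The two applications of the recursion relation, Eq. 11.32, can be reproduced with a computer algebra system. The sketch below assumes SymPy and simply differentiates the low-dimensional Green functions with respect to the interval; the variable names are chosen for the illustration.

```python
import sympy as sp

# Even space dimensions: start from G_{0+1}(s) = sqrt(s) inside the forward light cone
s = sp.symbols('s', positive=True)
G_0p1 = sp.sqrt(s)
G_2p1 = sp.diff(G_0p1, s) / sp.pi          # Eq. 11.32 with d + 1 -> 0 + 1

# Odd space dimensions: start from G_{1+1} = theta(s) / 2, with s allowed to change sign
sigma = sp.symbols('sigma', real=True)     # the interval s, now crossing the light cone
G_1p1 = sp.Heaviside(sigma) / 2
G_3p1 = sp.diff(G_1p1, sigma) / sp.pi      # derivative of the step is a delta function

print(G_2p1, G_3p1)
```

Differentiating a function that fills the cone smoothly gives another function that fills the cone, while differentiating the step function concentrates the result on the cone itself.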

Remark 11.7. G3+1 is often written in the equivalent form
$$G^{3+1}_{>}(x) = \theta(t)\,\frac{\delta(|\mathbf x| - ct)}{4\pi|\mathbf x|},\qquad x = (ct, \mathbf x) \tag{11.42}$$
Exercise 11.8. Show this.

11.7 Coulomb gauge


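The Coulomb gauge, aka the radiation gauge, aka the transverse gauge, is:
Theorem 11.9 (Coulomb gauge). It is always possible to choose the vector
potential Aµ = (−Φ, A) so that
$$\nabla\cdot\mathbf A = 0,\qquad \Delta\Phi = -4\pi\rho \tag{11.43}$$
Φ is the solution of Poisson's equation and the gauge field A satisfies the wave
equation with a source term:
$$-\Delta\mathbf A + \ddot{\mathbf A} = \frac{4\pi}{c}\mathbf J - \nabla\dot\Phi \tag{11.44}$$
where a dot denotes ∂/∂x0.
Suppose ∇ · A ≠ 0. Let Λ be a solution of Poisson's equation
$$\Delta\Lambda = \nabla\cdot\mathbf A$$
The gauge transformation
$$A'_\mu = A_\mu - \partial_\mu\Lambda$$
reproduces the same field F with A′ satisfying the Coulomb gauge condition
$$\nabla\cdot\mathbf A' = \nabla\cdot\mathbf A - \Delta\Lambda = 0$$
Φ′ is determined by
$$\mathbf E = -\nabla\Phi' - \dot{\mathbf A}'$$
Taking the divergence of this we see that Φ′ is a solution of Poisson's equation:
$$-\Delta\Phi' = \nabla\cdot\mathbf E = 4\pi\rho \tag{11.45}$$
The wave equation for A′ follows from Maxwell and the Coulomb gauge condition
(dropping the primes):
$$\begin{aligned}
\frac{4\pi}{c}\mathbf J &= \nabla\times\mathbf B - \dot{\mathbf E}\\
&= \nabla\times(\nabla\times\mathbf A) + \nabla\dot\Phi + \ddot{\mathbf A}\\
&= -\Delta\mathbf A + \nabla(\nabla\cdot\mathbf A) + \nabla\dot\Phi + \ddot{\mathbf A}\\
&= -\Delta\mathbf A + \ddot{\mathbf A} + \nabla\dot\Phi
\end{aligned} \tag{11.46}$$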

Remark 11.10 (Causality). The Coulomb gauge is acausal: The scalar potential
Φ is fixed by the instantaneous charge distribution. You move a charge here
and the scalar potential Φ changes immediately everywhere. The fact that the
scalar potential changes faster than light cannot be used to transfer information
faster than light because the fields are still causal, and only fields are measurable.
Remark 11.11 (Free space). When ρ = 0 we may take Φ = 0 together with
∇ · A = 0.

11.8 Appendices
11.8.1 Cosmic rays: GZK limit
The cosmic microwave background (CMB) provides a shield that screens ultra
high energy cosmic rays: A high energy proton can collide with a photon
to produce a neutral pion, converting much of the high kinetic energy of the
proton to the pion mass. This makes the CMB a screen for high energy cosmic
rays. The GZK limit says that protons with energies above 5 × 10¹³ MeV are
screened by the 3 K thermal photons of the CMB.
Let us compute the threshold for particle (pion) production. The total
energy-momentum of a proton with rapidity φ and a counter-propagating photon
in the plane is (in units with c = 1)
$$p_\mu = m_P(\cosh\varphi, \sinh\varphi) + \hbar\omega(1, -1) = (p_0, p_1)$$
The energy in the center of mass frame, Ecm, is the scalar
$$E_{cm} = \sqrt{p_0^2 - p_1^2} = \sqrt{m_P^2 + 2m_P\hbar\omega\,e^{\varphi}} = m_P + m_\pi$$
and the equality on the right expresses the fact that, at threshold, the two
particles are at rest. This gives the threshold for pion production as
$$m_\pi^2 + 2m_Pm_\pi = 2m_P\hbar\omega\,e^{\varphi}$$
Since mπ ≪ mP one finds a simple formula for the rapidity
$$e^{\varphi} \approx \frac{m_\pi}{\hbar\omega}$$
The corresponding energy threshold, taking ħω = kB T with T = 3 K, is
$$m_P\cosh\varphi \approx \tfrac12 m_Pe^{\varphi} \approx \frac{m_Pm_\pi}{2\times 3\,\mathrm K\times k_B} \approx 2.5\times10^{14}\ \mathrm{MeV}$$
(This is a factor 5 larger than the GZK estimate.)
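The arithmetic behind this estimate is short enough to spell out. The sketch below takes the typical photon energy to be kB T with T = 3 K, as in the formula above, and uses rounded values for the particle masses and the Boltzmann constant; these inputs are assumptions of the illustration and the result is only an order-of-magnitude estimate.

```python
k_B = 8.617333e-5        # Boltzmann constant in eV/K
T_cmb = 3.0              # CMB temperature in K, as used in the text
m_p = 938.272e6          # proton rest energy in eV
m_pi = 134.977e6         # neutral pion rest energy in eV

photon_energy = k_B * T_cmb                                        # hbar*omega ~ k_B T, in eV
exp_phi_exact = m_pi * (m_pi + 2 * m_p) / (2 * m_p * photon_energy)  # from the threshold condition
exp_phi_approx = m_pi / photon_energy                              # m_pi << m_P approximation

threshold_MeV = 0.5 * m_p * exp_phi_approx / 1e6                   # (1/2) m_P e^phi in MeV
ratio_to_gzk = threshold_MeV / 5e13                                # compare with the quoted GZK limit
print(f"{threshold_MeV:.2e} MeV, ratio to GZK: {ratio_to_gzk:.1f}")
```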

Exercise 11.12. Can you figure out why the estimate is too big?

11.8.2 Laser cooling and optical molasses


Laser cooling is a cool application of the Doppler effect to slow down atoms.
Think of the atom as a two level system with energy gap E. Suppose you point
a laser beam with photon energy ħω < E at the atom. Atoms that move towards
the light source will see bluer light and will be able to absorb the light if they
move fast enough. This will slow the atom down. At the same time, slow atoms
will be transparent to the light.

11.8.3 Covariant superposition


The wave equation is an algebraic equation in Fourier space. A solution can be
written as
$$\phi(x) = \frac{1}{(2\pi)^2}\int d^4k\;\delta(k\cdot k)\,\tilde\phi(k)\,e^{ik\cdot x} \tag{11.47}$$
with φ̃(k) an arbitrary function of the 4-vector k. Of course, because of the δ
function only the values that the function takes on the light-cone are relevant.
This expression for a scalar wave φ is manifestly Lorentz invariant.
It is instructive to split the solution into its forward and backward light-cone
parts:
$$\phi(k) = \theta(-t)\phi_<(k) + \theta(t)\phi_>(k) \tag{11.48}$$
We can carry out the time integration to get:
$$\phi_>(t, \mathbf x) = \frac{1}{(2\pi)^2}\int\frac{d\mathbf k}{2|\mathbf k|}\,\tilde\phi(|\mathbf k|, \mathbf k)\,e^{i(|\mathbf k|t - \mathbf k\cdot\mathbf x)} \tag{11.49}$$
The forward light cone is associated with outgoing (retarded) waves. Similarly
$$\phi_<(t, \mathbf x) = \frac{1}{(2\pi)^2}\int\frac{d\mathbf k}{2|\mathbf k|}\,\tilde\phi(-|\mathbf k|, \mathbf k)\,e^{-i(|\mathbf k|t + \mathbf k\cdot\mathbf x)} \tag{11.50}$$
the backward light-cone can be associated with incoming (advanced) waves.
Note that in both cases, Lorentz invariance induces a weight on the three
dimensional k space.

11.8.4 Monochromatic waves


Monochromatic waves are solutions of the wave equation whose time dependence
is e^{iωt}. Hence, with k0 = ω/c,
$$\Delta\phi = -k_0^2\,\phi \tag{11.51}$$
In Fourier space the solution is supported on the sphere
$$\mathbf k\cdot\mathbf k = k_0^2 \tag{11.52}$$
The smallest wave length that such a wave can accommodate is 2π/k0: The
frequency limits the spatial resolution.

11.8.5 Evanescent waves


Near a planar boundary one can arrange for monochromatic, evanescent
plane wave solutions to the wave equation: Plane waves in the half-space which
decay in x and propagate in the z-direction:
$$e^{-\kappa x}e^{ikz},\qquad k^2 = k_0^2 + \kappa^2 = \Big(\frac{\omega}{c}\Big)^2 + \kappa^2$$
The noteworthy fact about these waves is that the wave number k in the
z-direction can be much larger than the wave number associated with the
frequency ω. Near the boundary x = 0 one finds waves with short wave lengths:
k ≫ k0 = ω/c.

Example 11.13 (Transversal waves). The notion of transversality for evanescent
waves is different from ordinary plane waves. For the wave
$$\mathbf E = \mathbf E_0\,e^{-\kappa x}e^{ikz}$$
Gauss's law ∇ · E = 0 reduces to
$$-\kappa(E_0)_1 + ik(E_0)_3 = 0$$
which allows (E0)3 ≠ 0 for a wave propagating in the z-direction.

11.8.6 Waves in dielectric media: Birefringence:


In the absence of external sources, the time evolution of the fields in a dielectric
is dictated by Faraday and Ampere laws

Ė + ∇ × B = 0, Ḣ − ∇ × D = 0

subject to the constraints

∇ · D = 0, ∇·B=0 (11.53)

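In Fourier space (x, t) ↔ (ω, k) the differential equations reduce to algebraic
equations

\[ \omega\,\varepsilon^{-1}\mathbf{D} + \mathbf{k}\times\mu\mathbf{H} = 0, \qquad \omega\,\mathbf{H} - \mathbf{k}\times\mathbf{D} = 0, \qquad \mathbf{k}\cdot\mathbf{D} = 0, \qquad \mathbf{k}\cdot\mu\mathbf{H} = 0 \]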

where µ and ε are the constitutive relations. We assume that µ and ε are positive
matrices (possibly functions of ω). Substitution gives for D

\[ \omega^2\,\varepsilon^{-1}\mathbf{D} + \mathbf{k}\times(\mu\,\mathbf{k}\times\mathbf{D}) = 0 \tag{11.54} \]

which can be written as a (generalized) eigenvalue problem for the 3 × 3
symmetric matrix

\[ M_{jk}(\omega,\mathbf{k}) = \omega^2(\varepsilon^{-1})_{jk} + \epsilon_{jmn}\,\epsilon_{abk}\,\mu_{na}\,k_m k_b \]

Given k, a non-trivial solution for the eigenvector D exists provided det M = 0.
This fixes the dispersion relation $\omega_j^2(\mathbf{k})$ with j = 0, 1, 2, the three eigenvalues
of Eq. (11.54). One eigenvalue is always trivial, since the generalized eigenvalue
problem always has $\omega^2 = 0$ as an eigenvalue, corresponding to the eigenvector k.
There are, therefore, in general, only two non-trivial eigenvalues.

Exercise 11.14. Show that if ε and µ are symmetric, so is M.

In the frame where ε and µ are diagonal

\[ M_{jk}(\omega,\mathbf{k}) = \omega^2(\varepsilon^{-1})_j\,\delta_{jk} + \mu_n\,\epsilon_{njm}\,\epsilon_{nbk}\,k_m k_b \]

In the special case that µ is a scalar

\[ M_{jk}(\omega,\mathbf{k}) = \left(\omega^2(\varepsilon^{-1})_j - \mu\,\mathbf{k}\cdot\mathbf{k}\right)\delta_{jk} + \mu\,k_j k_k \]

For k in the principal direction j we get

\[ \omega^2 = \varepsilon_j\,\mu\,k^2 \]

The wave propagates at different speeds $\sqrt{\varepsilon_j\,\mu}$ along the principal directions of
ε. This is birefringence.
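The dispersion relation can be checked numerically. The following is a minimal sketch, assuming a scalar µ, a diagonal ε and the unit conventions of Eq. (11.54); the function name `dispersion` and the sample values are illustrative choices, not part of the notes.

```python
import numpy as np

def dispersion(eps_diag, mu, k):
    """Return the three generalized eigenvalues omega^2 of Eq. (11.54).

    eps_diag: principal values of epsilon (diagonal frame), mu: scalar,
    k: wave vector. All names here are illustrative assumptions.
    """
    eps = np.diag(eps_diag)
    k = np.asarray(k, dtype=float)
    # For scalar mu:  k x (mu k x D) = mu (k k^T - |k|^2 1) D
    K = mu * (np.outer(k, k) - (k @ k) * np.eye(3))
    # omega^2 eps^{-1} D + K D = 0   <=>   (-eps K) D = omega^2 D
    return np.sort(np.linalg.eigvals(-eps @ K).real)

# k along the first principal axis: one trivial root and two distinct branches
omega2 = dispersion(eps_diag=(2.0, 3.0, 5.0), mu=1.0, k=(1.5, 0.0, 0.0))
print(omega2)
```

For k along the first principal axis the trivial root is $\omega^2 = 0$ and the other two roots are $\varepsilon_2\mu k^2$ and $\varepsilon_3\mu k^2$, i.e. two different speeds for the two polarizations.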

11.8.7 3D glasses
When you view a 3D movie, the 3D glasses transmit a picture with right circular
polarization to, say, the right eye and left circular polarization to the left eye.

Figure 11.8: An arrangement that transmits right circular polarization.

Exercise 11.15. Can you give an (ergonomic) argument why spectators would
prefer circular to linear polarization?
Exercise 11.16. Define a quarter wave plate as the rotation of the Poincaré
sphere that turns circular polarization into linear polarization. Show that it is
represented by the Hadamard gate H:

\[ \sqrt{2}\,H = \sigma_3 + \sigma_1 \]

Exercise 11.17. Explain why the filtering associated with linear polarizers can
be described by the projections

\[ P_{H,V} = \frac{1 \pm \sigma_1}{2} \]

It follows from the two exercises above that the right and left glasses can be
represented by the 2 × 2 matrices

\[ g_1 = P_H H, \qquad g_2 = P_V H \]

Exercise 11.18. Give a physical interpretation of the identity

PV σ 3 = σ 3 PH

in terms of rotation of the glasses.

Exercise 11.19. Explain why holding the glasses backwards is represented by
transposition:

\[ g_j \Longleftrightarrow g_j^t \]

If you place glass 1, rotated by π/2, behind glass 2 inverted, the joint system
is represented by the matrix product $\sigma_3\, g_1\, g_2^t$. A computation
gives 0, which means that no light passes through.
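The computation can be reproduced in a few lines. This is a small sketch using the matrices defined in Exercises 11.16 and 11.17; the variable names are illustrative.

```python
import numpy as np

# Pauli matrices and the Hadamard gate, sqrt(2) H = sigma_3 + sigma_1
sigma1 = np.array([[0.0, 1.0], [1.0, 0.0]])
sigma3 = np.array([[1.0, 0.0], [0.0, -1.0]])
identity = np.eye(2)
H = (sigma3 + sigma1) / np.sqrt(2.0)

# Linear polarizers as projections P_{H,V} = (1 +/- sigma_1)/2
P_H = (identity + sigma1) / 2.0
P_V = (identity - sigma1) / 2.0

# The two glasses
g1 = P_H @ H
g2 = P_V @ H

# Glass 1 rotated by pi/2 placed behind glass 2 inverted (transposed)
joint = sigma3 @ g1 @ g2.T
print(np.round(joint, 12))
```

The product vanishes because H is symmetric with $H^2 = 1$, so that $g_1 g_2^t = P_H P_V = 0$.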
Exercise 11.20. Show that there are 64 ways of arranging the pair of glasses.
How many of these let no light through?
Chapter 12

Radiation

12.1 Wave equation with arbitrary source term


The retarded Green function for the wave equation in 3+1 dimensions is

\[ G_>(x) = \frac{1}{2\pi}\,\theta(x^0)\,\delta(s) = \frac{\delta(ct-r)}{4\pi r}, \qquad s = x\cdot x, \quad r = |\mathbf{x}| \ge 0 \tag{12.1} \]

The θ function guarantees causality: the past influences the present. The
delta function says that in 3+1 dimensions signals propagate on the light cone.
Huygens principle (in the strong form) holds.
By the homogeneity of Minkowski space, for a source term located at the
space-time point y

\[ \Box\, G_>(x-y) = \delta(x-y) \]
By the linearity of the wave equation, the retarded solution φ of the wave equation
generated by an arbitrary¹ source ρ,

\[ \Box\,\phi = \rho(x) \tag{12.2} \]

is

\[ \phi_\rho(x) = \int d^4y\; G_>(x-y)\,\rho(y) \tag{12.3} \]

Solving the scalar wave equation is reduced to computing an integral.

1 Some condition on the localization of the sources should be imposed. This is related to
Olbers' paradox: if you assume a constant density of stars, and that the intensity of radiation falls
like $r^{-2}$, the night sky should be as bright as the sun.

12.1.1 Scalar wave generated by a moving point source

As a preparation for studying the radiation of electromagnetic waves, consider
the simpler problem of radiation of scalar waves generated by a point source
moving on a world line $z = (ct, \mathbf{z}(t))$, −∞ < t < ∞. The motion is assumed to
be that of a real particle, so the 4-velocity is time-like. We take the source to be

\[ \rho(\mathbf{x},t) = \frac{4\pi}{c}\,\delta^{(3)}\big(\mathbf{x}-\mathbf{z}(t)\big) \tag{12.4} \]

Figure 12.1: The blue line is the world line of the point source. The wave is
observed at the black dot x. The backward light cone from x intersects the
(blue) world line at the black dot y. Since the velocity is time-like, the point of
intersection is unique. R is the 4-vector $x - z(y^0)$.

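The retarded wave is:

\begin{align*}
c\,\phi_\delta(x) &= 2\int d^4y\; \underbrace{\delta^{(3)}\big(\mathbf{y}-\mathbf{z}(y^0)\big)}_{\text{source}}\; \underbrace{\delta\big((x-y)\cdot(x-y)\big)\,\theta(x^0-y^0)}_{\text{Green}} \\
&= 2\int dy^0\; \delta(R\cdot R)\,\theta(x^0-y^0), \qquad R = \big(x^0-y^0,\ \mathbf{x}-\mathbf{z}(y^0)\big)
\end{align*}

Since the orbit $\mathbf{z}(y^0)$ is time-like, a single point contributes: a single particle
has a single image². To compute the remaining time integral use

\[ \int \delta\big(s(y)\big)\,dy = \frac{1}{|s'(y_0)|}, \qquad s = R\cdot R, \quad s(y_0) = 0. \tag{12.5} \]

2 Mirrors and lenses can create multiple images, of course.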
Clearly, the time derivative of s is related to the velocity of the source, and it
is natural to express it in terms of the 4-velocity

\begin{align*}
\frac{ds}{dy^0} &= 2R_\mu\frac{dR^\mu}{dy^0} = -2R_\mu\frac{dz^\mu}{dy^0} \\
&= -2R_\mu\frac{dz^\mu}{d\tau}\frac{d\tau}{dy^0} = -2\,\frac{R_\mu u^\mu}{c\gamma} \\
&= -2\,\frac{R\cdot u}{c\gamma}
\end{align*}

The wave φ(x) at the observing point x is therefore given by the deceptively
simple formula

\[ \phi(x) = \frac{\gamma(y^0)}{|R\cdot u(y^0)|}, \qquad R = x-y, \quad R\cdot R = 0 \tag{12.6} \]

which is manifestly causal and satisfies Huygens principle.

Figure 12.2: The blue 4-velocity is time-like and the 4-vector R is light-like.
Their Minkowski scalar product is negative and can be close to zero.

The formula is simple but implicit. The right hand side is not an explicit
function of the argument x: you need to know $\gamma(y^0)$, $u(y^0)$ and R = x − y and,
in particular, the earlier time $y^0$ (see Fig. 12.1). This time is determined as
the solution of the equation

\[ (x^0-y^0)^2 = \big|\mathbf{x}-\mathbf{z}(y^0)\big|^2 \]

which may be arbitrarily complicated if the orbit $\mathbf{z}(y^0)$ is complicated.

The amplitude of the wave decays like 1/R, which is characteristic of waves.
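Since Eq. (12.6) is implicit, a numerical evaluation has to solve for the retarded time first. The following is a minimal sketch, assuming a subluminal orbit supplied as two user-defined functions `z(y0)` (position) and `beta(y0)` (velocity in units of c); the function names and the bisection strategy are illustrative choices, not taken from the notes.

```python
import math

def retarded_time(x0, x, z, tol=1e-12):
    """Solve x0 - y0 = |x - z(y0)| for y0 (with y0 = c t) by bisection.

    f(y0) = (x0 - y0) - |x - z(y0)| is strictly decreasing for a
    subluminal orbit, so the root is unique.
    """
    def f(y0):
        return (x0 - y0) - math.dist(x, z(y0))

    hi = x0                      # f(hi) = -|x - z(x0)| <= 0
    step = 1.0
    lo = x0 - step
    while f(lo) <= 0.0:          # walk back in time until f changes sign
        step *= 2.0
        lo = x0 - step
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scalar_wave(x0, x, z, beta, c=1.0):
    """Evaluate phi = gamma / |R.u| of Eq. (12.6) at the event (x0, x)."""
    y0 = retarded_time(x0, x, z)
    R = [xi - zi for xi, zi in zip(x, z(y0))]
    r = math.sqrt(sum(Ri * Ri for Ri in R))
    R_dot_beta = sum(Ri * bi for Ri, bi in zip(R, beta(y0)))
    # With u = gamma c (1, beta) and signature (-,+,+,+):
    # R.u = -gamma c (r - R.beta), so gamma / |R.u| = 1 / (c (r - R.beta))
    return 1.0 / (c * (r - R_dot_beta))

# Sanity check: a source at rest at the origin gives phi = 1/(c r)
def at_rest(y0):
    return (0.0, 0.0, 0.0)

phi_static = scalar_wave(5.0, (3.0, 4.0, 0.0), at_rest, at_rest)
print(phi_static)
```

For a source at rest the retarded time is simply $x^0 - r$ and the wave reduces to 1/(cr), which is a convenient check before trying more complicated orbits.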

12.2 Maxwell equation in the Lorenz gauge


The inhomogeneous Maxwell equations are

\[ \partial^\mu F_{\mu\nu} = -\frac{4\pi}{c}\, j_\nu \tag{12.7} \]

Expressed in terms of the potentials (this guarantees the homogeneous
equations), one gets a system of second order PDE

\[ \partial^\mu\partial_\mu A_\nu - \partial^\mu\partial_\nu A_\mu = -\frac{4\pi}{c}\, j_\nu \tag{12.8} \]

In the Lorenz gauge, $\partial^\mu A_\mu = 0$, Maxwell equations reduce to 4 decoupled wave
equations

\[ \Box A_\nu = \partial^\mu\partial_\mu A_\nu = -\frac{4\pi}{c}\, j_\nu \tag{12.9} \]
The equations are coupled through the Lorenz gauge condition. If the current
j µ is not conserved then the derivation is inconsistent with the Lorenz gauge.
Conversely, if current is conserved, then Lorenz gauge condition follows for all
times provided the initial data for Aµ and Ȧµ satisfy it.
Exercise 12.1. Show that if one imposes the Lorenz gauge condition as initial
data for the wave equation, then the Lorenz gauge condition holds for all times
provided the current is conserved.

12.3 Lienard-Wiechert: Retarded potentials


The Maxwell equations, Eqs. (12.9), can be viewed as 4 independent copies of
the scalar wave equation with given source terms. We can therefore apply the
solution of the previous section to the source of Eq. (12.9), where

\[ \frac{4\pi}{c}\, j_\nu(y) = e\,\frac{4\pi}{c}\,\delta^{(3)}\big(\mathbf{y}-\mathbf{z}(y^0)\big)\,\underbrace{v_\nu(y^0)}_{\text{lab velocity}} = e\,\frac{4\pi}{c}\,\frac{\delta^{(3)}\big(\mathbf{y}-\mathbf{z}(y^0)\big)}{\gamma}\,\underbrace{u_\nu(y^0)}_{\text{4-velocity}} \tag{12.10} \]

Comparing with Eqs. (12.4, 12.6) for scalar waves we see that, conveniently, γ
disappears, and the retarded potentials are:

\[ A_\nu(x) = e\,\frac{u_\nu}{|R\cdot u|} = -e\,\frac{u_\nu(y^0)}{R\cdot u(y^0)} \tag{12.11} \]

The absolute value was removed by taking into account that R is forward
light-like and u forward time-like, so R · u < 0.
The result admits the following (a-posteriori) interpretation: The vector
potential, being a 4-vector, must be of the form

(scalar)(vector)µ

We have (at least) two 4-vectors at our disposal: R and u. Between these we
can form 3 scalars: one interesting, R · u, and two uninteresting, $u\cdot u = -c^2$ and
R · R = 0. This, plus dimensional analysis and the limiting case of a charge at rest,
determines Eq. (12.11).
The result can also be viewed as the covariant form of Coulomb law:

\[ A_\nu(x) = \frac{e}{|\mathbf{x}|}\,(1,0,0,0) = -\frac{e}{c|\mathbf{x}|}\,\underbrace{(-c,0,0,0)}_{u_\nu} \;\Longrightarrow\; -e\,\frac{u_\nu}{R\cdot u} \]

12.3.1 The Lorenz Gauge condition


We still need to verify the Lorenz gauge condition.
A clever argument: Since the condition is Lorentz invariant, it is sufficient
to verify it in some Lorentz frame. So let us do that in the frame where the
charge is instantaneously at rest at the origin (at the early time). Then, by
Eq. 12.11,

\[ \partial^\mu A_\mu = \partial^0 A_0, \qquad A_0 = -\frac{e}{|\mathbf{x}|} \]

When you change the time t of the event x, the distance |x| does not change because the
particle is at rest. Hence

\[ \partial^0 A_0 = 0 \]

An honest computation: The reason for doing an honest computation is
that this will force us to derive the identity that describes how the retarded
time depends on variations of the observation event x, which we shall need
when deriving a formula for the fields. The identity is:

\[ \partial_\mu\tau = \frac{R_\mu}{R\cdot u} \tag{12.12} \]

It follows by differentiating the light-like condition relating the events, R · R = 0:

\[ 0 = \tfrac12\,\partial_\mu(R\cdot R) = R_\alpha\,\partial_\mu(x^\alpha - z^\alpha) = R_\mu - (R\cdot u)\,(\partial_\mu\tau) \]

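Back to the honest verification of the Lorenz gauge condition:

\[ 0 = \partial_\mu A^\mu = -e\,\partial_\mu\!\left(\frac{u^\mu}{R\cdot u}\right) \]

This will hold provided

\[ (R\cdot u)^2\,\partial_\mu\!\left(\frac{u^\mu}{R\cdot u}\right) = (R\cdot u)\,\partial_\mu u^\mu - u^\mu\,\partial_\mu(R\cdot u) \overset{?}{=} 0 \tag{12.13} \]

To verify that this is indeed so, let us prepare

\[ \partial_\mu u^\alpha = \dot u^\alpha\,(\partial_\mu\tau), \qquad \partial_\mu R^\alpha = \delta_\mu^\alpha - u^\alpha\,(\partial_\mu\tau) \]

Substituting this in Eq. (12.13) and using Eq. (12.12) we find

\begin{align*}
(R\cdot u)\,\partial_\mu u^\mu - u^\mu\,\partial_\mu(R\cdot u) &= (R\cdot u)\,\dot u^\mu(\partial_\mu\tau) - u^\mu u_\alpha\,\partial_\mu R^\alpha - u^\mu R_\alpha\,\dot u^\alpha(\partial_\mu\tau) \\
&= (R\cdot u)\,\frac{\dot u\cdot R}{R\cdot u} - u^\mu u_\alpha\,\partial_\mu R^\alpha - u^\mu R_\alpha\,\dot u^\alpha\,\frac{R_\mu}{R\cdot u} \\
&= -u^\mu u_\alpha\,\partial_\mu R^\alpha \\
&= -u\cdot u + (u\cdot u)\,u^\mu\,\partial_\mu\tau \\
&= -u\cdot u + u\cdot u = 0
\end{align*}

We have verified that the solution, Eq. 12.11, indeed satisfies the Lorenz
condition.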

Figure 12.3: The self time τ parametrizes the blue orbit. It can be extended to
a function on space time by pushing the value of τ to the forward light-cone.
The figure illustrates how τ changes when the point of observation x changes.
The red lines are light-like.

12.4 Lienard Wiechert formula for retarded field


To find the fields we need to differentiate the potentials with respect to the
space-time coordinates $x^\mu$. Formally,

\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \partial_{[\mu}A_{\nu]} \tag{12.14} \]

and the right hand side is just a convenient notation. The word "formally" above
refers to the fact that in taking the partial derivatives we need to remember
that τ, the retarded time, is a function of the point of observation x, Fig. 12.3.
If we treat x and τ as independent variables, then $\partial_\mu$ of Eq. 12.14 needs to be interpreted as

\[ \partial_\mu = \left(\frac{\partial}{\partial x^\mu}\right) + \frac{\partial\tau}{\partial x^\mu}\left(\frac{\partial}{\partial\tau}\right) = \left(\frac{\partial}{\partial x^\mu}\right) + \frac{R_\mu}{R\cdot u}\left(\frac{\partial}{\partial\tau}\right) \tag{12.15} \]

The $x^\mu$ differentiation cares about the location of the observer, while the τ
differentiation cares about the location of the charge.
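Using the explicit form of the potentials of the previous section:

\[ -\partial_\mu A_\nu = e\,\partial_\mu\!\left(\frac{u_\nu}{R\cdot u}\right) = -e\,\frac{u_\mu u_\nu}{(R\cdot u)^2} + e\,\frac{R_\mu}{R\cdot u}\,\partial_\tau\!\left(\frac{u_\nu}{R\cdot u}\right) \tag{12.16} \]

Because F is anti-symmetric, the first term in Eq. (12.16) drops upon
anti-symmetrization and only the second term contributes.

The field F depends on the location, the velocity u and the acceleration u̇ of the
charge at the early time. It does not depend on any higher derivatives, e.g. the
jerk ü. Now compute:

\begin{align*}
\partial_\tau\!\left(\frac{u_\nu}{R\cdot u}\right) &= \frac{\dot u_\nu}{R\cdot u} - \frac{u_\nu}{(R\cdot u)^2}\,\partial_\tau(R\cdot u) \tag{12.17} \\
&= \frac{\dot u_\nu}{R\cdot u} + \frac{u_\nu}{(R\cdot u)^2}\,u\cdot u - \frac{u_\nu}{(R\cdot u)^2}\,R\cdot\dot u \\
&= \frac{\dot u_\nu}{R\cdot u} - c^2\,\frac{u_\nu}{(R\cdot u)^2} - \frac{u_\nu}{(R\cdot u)^2}\,R\cdot\dot u
\end{align*}

Consequently

\[ \frac{R_\mu}{R\cdot u}\,\partial_\tau\!\left(\frac{u_\nu}{R\cdot u}\right) = \frac{R_\mu\dot u_\nu}{(R\cdot u)^2} - c^2\,\frac{R_\mu u_\nu}{(R\cdot u)^3} - \frac{R_\mu u_\nu}{(R\cdot u)^3}\,R\cdot\dot u \tag{12.18} \]

We get F by anti-symmetrizing:

\[ F_{\mu\nu} = \underbrace{-e\left(\frac{R_{[\mu}\dot u_{\nu]}}{(R\cdot u)^2} - \frac{R_{[\mu}u_{\nu]}}{(R\cdot u)^3}\,R\cdot\dot u\right)}_{\text{radiation}} + \underbrace{e\,c^2\,\frac{R_{[\mu}u_{\nu]}}{(R\cdot u)^3}}_{\text{"Coulomb"}} \tag{12.19} \]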

Since u = O(c), the last term is of order $O(c^0)$. It decays with distance like $R^{-2}$.
This is, essentially, the Coulomb term. The first two terms are proportional to
the acceleration and their velocity dependence is of order $O(c^{-2})$. They decay
like $R^{-1}$. These are the radiating terms.

12.4.1 Interpretation
The Lienard-Wiechert formula is complicated and at first also opaque. It may
be useful to view it from general principles.
We have three vectors in the problem:
• R, the light-like vector connecting the point of observation and the source.
• u the particle 4-velocity.
• u̇ the 4-acceleration.
From these we can make three interesting scalars

\[ R\cdot u, \qquad R\cdot\dot u, \qquad \cancel{\dot u\cdot\dot u} \tag{12.20} \]

The remaining scalars are less interesting

\[ R\cdot R = 0, \qquad u\cdot u = -c^2, \qquad u\cdot\dot u = 0 \tag{12.21} \]

To see why u̇ · u̇ has been cancelled, observe that

• F must be (at most) linear in the acceleration.

This is because the potential did not depend on the acceleration at all. As a
consequence, the scalar u̇ · u̇ should not appear and the scalar R · u̇ can only
appear in the numerator.
We can now reconstruct all three terms in F just from the fact that F is
a tensor and dimensional analysis. From the tensorial properties of F it must be
of the form

(tensor)µν = (scalar) (vector)µ (vector)ν

Since F has dimension of [charge][length$^{-2}$] and u has the dimension of c, one
possible term is

\[ e\,c^2\,\frac{R_{[\mu}u_{\nu]}}{(R\cdot u)^3} \]

which is the last term in Eq. (12.19). You can even get the numerical factor
(and the sign) by looking at the limiting case of the Coulomb field of a particle
at rest, where $u_\mu = (-c, 0, 0, 0)$.
One possible term proportional to the acceleration u̇ is

\[ e\,\frac{R_{[\mu}\dot u_{\nu]}}{(R\cdot u)^2} \]

which gives the first term up to a numerical factor. The middle term is obtained
similarly.

12.5 Accelerating particle in its rest frame


The formula for F simplifies for a particle instantaneously at rest at the origin
(at the early time), i.e.

\[ y^0 = 0, \qquad \mathbf{z}(y^0) = 0, \qquad u^\mu = (c,0,0,0), \qquad R\cdot u = -c|\mathbf{x}|. \tag{12.22} \]

We can always achieve this by choosing an appropriate Lorentz frame. This
determines F on the forward light cone (|x|, x). Let us examine B first and then
E.

12.5.1 The Magnetic field:


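Since the spatial components of u vanish, we have (in Cartesian coordinates)

\begin{align*}
F_{ij}(|\mathbf{x}|,\mathbf{x}) &= \epsilon_{ijk}\,B^k(|\mathbf{x}|,\mathbf{x}) \\
&= -e\left(\frac{R_{[i}\dot u_{j]}}{(R\cdot u)^2} - \overbrace{\frac{R_{[i}u_{j]}}{(R\cdot u)^3}}^{=0}\,R\cdot\dot u\right) + e\,c^2\,\overbrace{\frac{R_{[i}u_{j]}}{(R\cdot u)^3}}^{=0} \\
&= -\frac{e}{c^2}\,\frac{R_{[i}\dot u_{j]}}{|\mathbf{x}|^2} \\
&= \frac{e}{c^2}\,\epsilon_{ijk}\,\frac{(\mathbf{a}(0)\times\mathbf{x})_k}{|\mathbf{x}|^2} \tag{12.23}
\end{align*}

where a, the 3-vector of acceleration, $\dot u^\mu = (0, \mathbf{a})$, is orthogonal to u = (c, 0).
It follows that the 3-vector of the magnetic field on the light cone emanating from
the origin, (|x|, x), is

\[ \mathbf{B}(|\mathbf{x}|,\mathbf{x}) = \frac{e}{c^2}\,\frac{\mathbf{a}(0)\times\mathbf{x}}{|\mathbf{x}|^2} \tag{12.24} \]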
The main conclusions we draw from this are

1. The field decays like the inverse distance from the source.

2. The field is perpendicular to both the line of sight and the acceleration
vector.

12.5.2 The electric field


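Recall that $F_{0j} = -E_j$. For a particle at rest at the origin at lab time $y^0 = 0$,
$R\cdot u = -c|\mathbf{x}|$ and $R_0 = -|\mathbf{x}|$. Hence, on the light cone emanating from the
origin, with r = |x|,

\begin{align*}
E_j = -F_{0j} &= e\left(\frac{R_{[0}\dot u_{j]}}{(R\cdot u)^2} - \frac{R_{[0}u_{j]}}{(R\cdot u)^3}\,R\cdot\dot u\right) - e\,c^2\,\frac{R_{[0}u_{j]}}{(R\cdot u)^3} \\
&= \frac{e}{c^2|\mathbf{x}|^3}\left(|\mathbf{x}|\,R_{[0}\dot u_{j]} + \frac{1}{c}\,R_{[0}u_{j]}\,R\cdot\dot u\right) + \frac{e}{c|\mathbf{x}|^3}\,R_{[0}u_{j]} \\
&= \frac{e}{c^2|\mathbf{x}|^3}\,\big(-(\mathbf{x}\cdot\mathbf{x})\,a_j + (\mathbf{x}\cdot\mathbf{a})\,x_j\big) + \frac{e}{r^2}\,\hat x_j
\end{align*}

Since

\[ (\mathbf{x}\cdot\mathbf{x})\,\mathbf{a} - (\mathbf{x}\cdot\mathbf{a})\,\mathbf{x} = -\,\mathbf{x}\times(\mathbf{x}\times\mathbf{a}) \]

we can collect the above into a vector identity

\begin{align*}
\mathbf{E}(|\mathbf{x}|,\mathbf{x}) &= \frac{e}{c^2|\mathbf{x}|}\,\hat{\mathbf{x}}\times(\hat{\mathbf{x}}\times\mathbf{a}) + \frac{e}{|\mathbf{x}|^2}\,\hat{\mathbf{x}} \\
&= \mathbf{B}(|\mathbf{x}|,\mathbf{x})\times\hat{\mathbf{x}} + \frac{e}{|\mathbf{x}|^2}\,\hat{\mathbf{x}} \tag{12.25}
\end{align*}

The longitudinal part is Coulomb and the transversal part is radiation. E and
B are mutually orthogonal. The two Lorentz scalars are

\[ \mathbf{E}\cdot\mathbf{B} = 0, \qquad \mathbf{E}^2 - \mathbf{B}^2 = \frac{e^2}{|\mathbf{x}|^4} \]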
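The rest-frame formulas can be cross-checked against the covariant expression, Eq. (12.19). The sketch below assumes the signature (−,+,+,+) with $u\cdot u = -c^2$, covariant components throughout, and arbitrary sample values for x and a; the helper names (`dot`, `antisym`, `field_tensor`) are illustrative and not part of the notes.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])    # signature (-,+,+,+): u.u = -c^2

def dot(a_low, b_low):
    # Minkowski product of two covariant vectors (eta^{-1} = eta here)
    return a_low @ ETA @ b_low

def antisym(a, b):
    # a_[mu b_nu] = a_mu b_nu - a_nu b_mu
    return np.outer(a, b) - np.outer(b, a)

def field_tensor(R, u, udot, e=1.0, c=1.0):
    """F_{mu nu} of Eq. (12.19) from covariant R, u and udot."""
    Ru, Rud = dot(R, u), dot(R, udot)
    radiation = -e * (antisym(R, udot) / Ru**2 - antisym(R, u) * Rud / Ru**3)
    coulomb = e * c**2 * antisym(R, u) / Ru**3
    return radiation + coulomb

# Charge instantaneously at rest at the origin, observed on the light cone
e, c = 1.0, 2.0
x = np.array([1.0, 2.0, 2.0])
a = np.array([0.3, -0.1, 0.5])
r = np.linalg.norm(x)
xhat = x / r

R = np.concatenate(([-r], x))            # R_0 = -|x|
u = np.array([-c, 0.0, 0.0, 0.0])        # u_mu = (-c, 0, 0, 0)
udot = np.concatenate(([0.0], a))        # udot = (0, a)

F = field_tensor(R, u, udot, e, c)
E = -F[0, 1:]                                  # E_j = -F_{0j}
B = np.array([F[2, 3], F[3, 1], F[1, 2]])      # F_{ij} = eps_{ijk} B^k

B_expected = e * np.cross(a, x) / (c**2 * r**2)           # Eq. (12.24)
E_expected = np.cross(B_expected, xhat) + e * xhat / r**2  # Eq. (12.25)
print(E, B)
```

The same arrays also give the two Lorentz scalars directly, `E @ B` and `E @ E - B @ B`, which is a quick way to check signs.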

Exercise 12.2. Show that the Poynting vector is

\[ \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{B} = \frac{e^2}{4\pi c^3|\mathbf{x}|^2}\,\hat{\mathbf{x}} \]

12.5.3 Magnetic field in the far field region


In the case that the charge is not quite at the origin and not quite stationary,
one can still write a not too ugly expression for B in the important case of the
far field, where |x| is the largest length scale in the problem. We now retain only
the terms that decay like 1/R:

\begin{align*}
F_{ij}(x) &= \epsilon_{ijk}\,B^k(|\mathbf{x}|,\mathbf{x}) \\
&\approx -e\left(\frac{R_{[i}\dot u_{j]}}{(R\cdot u)^2} - \frac{R_{[i}u_{j]}}{(R\cdot u)^3}\,R\cdot\dot u\right) \tag{12.26}
\end{align*}

In the far field |z| is small compared to |x| and we can approximate

\[ R = \big(x^0 - y^0,\ \mathbf{x} - \mathbf{z}(y^0)\big) \approx (|\mathbf{x}|, \mathbf{x}) \tag{12.27} \]

(We have used the fact that R is light-like.) If, in addition, the charge is
non-relativistic,

\[ u^\mu \approx (c, 0) \tag{12.28} \]

and we obtain

\[ \epsilon_{ijk}\,B^k(x) \approx -\frac{e}{c^2}\,\frac{R_{[i}\dot u_{j]}}{r^2} \approx -\frac{e}{c^2}\,\frac{x_i a_j - x_j a_i}{r^2}, \qquad r = |\mathbf{x}| \tag{12.29} \]

We have essentially recovered the result of Eq. 12.24 in the far field, also for
particles that need not be stationary and at the origin:

\[ \mathbf{B}(x) \approx \frac{e}{c^2}\,\frac{\mathbf{a}(y^0)\times\mathbf{x}}{|\mathbf{x}|^2} \tag{12.30} \]

Note that the result now implicitly depends on the retarded time $y^0$.

12.6 Retardation from a distant source


To compute $F_{\mu\nu}$ from the Lienard-Wiechert formula we need to know

\[ (R,\ u,\ \dot u) \tag{12.31} \]

at the time $y^0$ when the radiation was emitted. This time is determined
as the solution of the equation

\[ x^0 - y^0 = |\mathbf{x} - \mathbf{z}(y^0)| \]

where $\mathbf{z}(y^0)$, the orbit of the source (charge), is given. This is an implicit equation
for $y^0$. A geometric solution is given in Fig. 12.1. In general, one can not hope
to find an analytic solution: a priori, $\mathbf{z}(y^0)$ could be an arbitrarily complicated
function that can not be inverted explicitly. The best we can hope for is to find
approximate solutions in simple cases when there is a small or a large parameter
that we can use in making successive approximations.

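Let us address how accurately we need to estimate the radiation time $y^0$ in order to
estimate R. Recall that

\[ R = \big(x^0 - y^0,\ \mathbf{x} - \mathbf{z}(y^0)\big) \tag{12.32} \]

is light-like. So it is sufficient to have a good estimate of its spatial part. Now
suppose that the source is distant from the point of observation and that it is confined to a
relatively small ball in space,

\[ |\mathbf{z}| \le \ell \ll |\mathbf{x}| \tag{12.33} \]

When |x| is large, we need to approximate |x − z| not just to order |x|, but also
to order $|\mathbf{x}|^0 = O(1)$; we may neglect terms of order $|\mathbf{x}|^{-1}$:

\[ |\mathbf{x}-\mathbf{z}|^2 = |\mathbf{x}|^2 + |\mathbf{z}|^2 - 2\,\mathbf{x}\cdot\mathbf{z} = |\mathbf{x}|^2\left(1 - 2\,\frac{\mathbf{x}\cdot\mathbf{z}}{|\mathbf{x}|^2} + O\!\left(\left(\frac{\ell}{|\mathbf{x}|}\right)^{2}\right)\right) \tag{12.34} \]

Approximating $\sqrt{1+2\varepsilon} \approx 1+\varepsilon$ gives

\[ |\mathbf{x}-\mathbf{z}| \approx |\mathbf{x}|\left(1 - \frac{\mathbf{x}\cdot\mathbf{z}}{|\mathbf{x}|^2}\right) \tag{12.35} \]

The implicit equation for the retardation $y^0$ for a distant source simplifies to:

\[ x^0 - y^0 \approx |\mathbf{x}| - \hat{\mathbf{x}}\cdot\mathbf{z}(y^0) \tag{12.36} \]

Let us rewrite the equation in the form

\[ y^0 \approx x^0 - |\mathbf{x}| + \hat{\mathbf{x}}\cdot\mathbf{z}(y^0) \tag{12.37} \]

This is still an implicit equation for $y^0$, so we shall need to make some further
approximations if we want to find an explicit approximate expression for $y^0$.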

12.6.1 The dipole approximation


We now need to return to the question of how accurately we need to solve Eq.
12.37 to get sufficiently good estimates of $u(y^0)$ and $\dot u(y^0)$ or, equivalently,
$\dot{\mathbf{z}}(y^0)$ and $\ddot{\mathbf{z}}(y^0)$. If the source has characteristic frequency ω, then we need to
have an estimate of the time at the source with accuracy better than

\[ \omega\,\delta t \ll 1 \iff \omega\,\delta y^0 \ll c \tag{12.38} \]

For example, in the case that the source is harmonic, this means that we need
to know the phase of the source, $y^0\omega/c$, to an accuracy that is much better than
2π. We need increasingly better accuracy as ω gets large.

12.6.2 Dipole approximation: Successive approximations


We can solve Eq. 12.37 by iteration. Assume that for all times $|\mathbf{z}(x^0)| \le \ell$,
so that the source stays near the origin and ℓ is small compared to |x|. This motivates
starting the iteration with

\begin{align*}
y_1^0 &= x^0 - |\mathbf{x}| \\
y_2^0 &= x^0 - |\mathbf{x}| + \hat{\mathbf{x}}\cdot\mathbf{z}(y_1^0) = x^0 - |\mathbf{x}| + \hat{\mathbf{x}}\cdot\mathbf{z}(x^0 - |\mathbf{x}|) \tag{12.39} \\
&\;\;\vdots
\end{align*}

Let us see how good the successive approximations are. We can estimate the
error in $y_1^0$ by comparing with the defining equation for $y^0$:

\[ y^0 - y_1^0 = \hat{\mathbf{x}}\cdot\mathbf{z}(y^0) = O(\ell) \tag{12.40} \]

This is a dimensionful error and so has no natural notion of size: it is neither
small nor large. Eq. 12.38 allows us to translate the error to a dimensionless
quantity by multiplying by ω/c:

\[ \frac{\omega\ell}{c} = \frac{2\pi\ell}{\lambda} \tag{12.41} \]

We can phrase this as the statement that if the wavelength of the radiation λ
is much larger than the source, $y_1^0$ is a good approximation. This is the dipole
approximation.
$y_2^0$ is an even better approximation. Substituting $y_2^0$ in Eq. 12.37 gives the
error

\begin{align*}
y^0 - y_2^0 &= \hat{\mathbf{x}}\cdot\mathbf{z}(y^0) - \hat{\mathbf{x}}\cdot\mathbf{z}(x^0 - |\mathbf{x}|) \\
&\approx \hat{\mathbf{x}}\cdot\dot{\mathbf{z}}(y^0)\,\big(y^0 - x^0 + |\mathbf{x}|\big) \\
&= \hat{\mathbf{x}}\cdot\dot{\mathbf{z}}(y^0)\;\hat{\mathbf{x}}\cdot\mathbf{z}(y^0) \\
&= O(\omega\ell^2/c)
\end{align*}

This translates to the dimensionless error

\[ \left(\frac{\omega\ell}{c}\right)^2 = \left(\frac{2\pi\ell}{\lambda}\right)^2 \tag{12.42} \]

This is a quadratic improvement relative to the error in $y_1^0$. $y_2^0$ gives an explicit
equation for $y^0$ given x, the point of observation:

\[ y_2^0 \approx x^0 - |\mathbf{x}| + \hat{\mathbf{x}}\cdot\mathbf{z}\big(x^0 - |\mathbf{x}|\big) \tag{12.43} \]

This completes our discussion of the dipole approximation. The approximation
applies to atomic systems radiating visible light: the size of atomic systems
is O(1 [Å]) while visible light has wavelength O(5000 [Å]). The approximation
does not hold, of course, for X-rays.
In the case of your cell phone ωℓ/c = O(1) and one needs to go beyond the
dipole approximation.
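The error estimates for the successive approximations can be illustrated numerically. This is a minimal sketch, assuming a harmonic source along the line of sight, $\hat{\mathbf{x}}\cdot\mathbf{z}(y^0) = \ell\cos(k y^0)$ with $k = \omega/c$, and solving Eq. 12.37 by fixed-point iteration; the function name and the sample numbers are illustrative assumptions.

```python
import math

def solve_far_field(x0, r, n_dot_z, iterations=50):
    """Return (y1, y2, y): the first two iterates of Eq. (12.39) and the
    converged fixed point of y = x0 - r + n.z(y), Eq. (12.37).

    The map is a contraction when |d(n.z)/dy0| < 1, i.e. for a
    subluminal source, so plain iteration converges.
    """
    y1 = x0 - r
    y2 = y1 + n_dot_z(y1)
    y = y2
    for _ in range(iterations):
        y = y1 + n_dot_z(y)
    return y1, y2, y

ell = 1.0          # size of the source
k = 0.01           # omega / c, so the wavelength is 2 pi / k >> ell
x0, r = 1200.0, 500.0

def n_dot_z(y0):
    return ell * math.cos(k * y0)

y1, y2, y = solve_far_field(x0, r, n_dot_z)
err1 = abs(y - y1)           # expected O(ell)
err2 = abs(y - y2)           # expected O(k ell^2)
print(err1, err2, k * ell * err1, k * err2)
```

Multiplying the two errors by k gives the dimensionless quantities of Eqs. (12.41) and (12.42): the second is smaller than the first by roughly another factor of kℓ.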
Figure 12.4: The blue line gives the orbit of the charge. The lowest order dipole
approximation is $x^0 - |\mathbf{x}|$. A better approximation is $y_2^0$, which takes into account
the position of the charge at $x^0 - |\mathbf{x}|$. One could get better approximations by
also taking into account the velocity of the charge.

12.6.3 Radiation from a charge in Harmonic motion


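As an application consider a charge undergoing non-relativistic harmonic motion
with

\[ \mathbf{z}(y^0) = \mathbf{z}_0\,e^{i\omega\tau}, \qquad y^0 = c\tau \tag{12.44} \]

$\mathbf{z}_0$ is a vector with complex amplitudes and taking the real part is implicit in the
formula. Thus, for example, $\mathbf{z}_0 = \ell\,(1, i, 0)/\sqrt{2}$ describes circular motion of
radius $\ell/\sqrt{2}$ in the x-y plane. The dipole approximation gives for the retarded time τ

\begin{align*}
c\tau = y^0 &\approx x^0 - r + \hat{\mathbf{x}}\cdot\mathbf{z}\big(x^0 - r\big) \\
&= x^0 - r + \hat{\mathbf{x}}\cdot\mathbf{z}_0\,e^{i\omega(x^0 - r)/c}, \qquad r = |\mathbf{x}| \tag{12.45}
\end{align*}

By Eq. 12.30

\begin{align*}
\mathbf{B}(x) &\approx \frac{e}{c^2}\,\frac{\mathbf{a}(y^0)\times\hat{\mathbf{x}}}{r} \\
&= -\frac{e\,\omega^2}{c^2}\,\frac{e^{i\omega\tau}}{r}\;\mathbf{z}_0\times\hat{\mathbf{x}} \tag{12.46}
\end{align*}

The field is proportional to a spherical wave

\[ \frac{e^{ik(x^0 - r)}}{r} \tag{12.47} \]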

The most interesting thing to observe is that this outgoing spherical wave is a
consequence of retardation.
Exercise 12.3. Consider the radiation from two opposite charges ±e executing
harmonic motion with opposite amplitudes ±z0 . In this case one needs to con-
sider the retardation beyond the leading order in the dipole approximation, i.e.
one needs to keep the last term in Eq. 12.45.

12.6.4 Many particles


The radiation field from many particles with prescribed orbits is, by the linearity
of the Maxwell equations, the sum of the radiation fields of the individual ones.
In general, each particle will have its own retarded time, and the formulas
remain implicit. A simplification occurs in the dipole approximation for a "small
antenna", where all the charges share the same retardation.
The dipole moment of a large collection of charges is:

\[ \mathbf{d}(t) = \sum_j e_j\,\mathbf{z}_j(t) \tag{12.48} \]

and we assume that all the orbits $\mathbf{z}_j$ are such that the dipole approximation
applies. In this case all the charges have the same retardation and the magnetic
field is simple:

\[ \mathbf{B}(r, t) = \frac{\ddot{\mathbf{d}}(t - r/c)\times\hat{\mathbf{x}}}{c^2|\mathbf{x}|} \tag{12.49} \]

12.7 Power
The power emitted by a dipole can be computed from the Poynting vector

\[ \mathbf{P} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{B} \tag{12.50} \]

Both E and B lie in the plane perpendicular to the line of sight, so P is parallel
to r̂.
Linear Dipole: Suppose that a = a ẑ. The magnitude of P in the direction
of the spherical angle θ relative to the z-axis is

\[ P(\theta) = c\,\frac{E^2}{4\pi} = \frac{e^2a^2\sin^2\theta}{4\pi c^3 r^2} \tag{12.51} \]

The power through a spherical shell of radius r is then

\[ P_T = 2\pi r^2\int d\theta\,\sin\theta\;P(\theta) \tag{12.52} \]

Evidently

\[ \int d\theta\,\sin\theta\,\sin^2\theta = -\int d(\cos\theta)\,(1-\cos^2\theta) = 2\left(1 - \tfrac13\right) = \frac43 \tag{12.53} \]

Figure 12.5: Dipole oriented along the z-axis and the associated fields. φ is the
angle between the z-axis and the blue arrow.

Hence, the total power is

\[ P = \frac23\,\frac{e^2a^2}{c^3} = \frac23\,\frac{\ddot{\mathbf{d}}^2}{c^3} \tag{12.54} \]
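The angular integration can be checked numerically. The following is a small sketch, assuming Gaussian units and a simple midpoint rule; the function names and the sample values of e, a, c and r are illustrative.

```python
import math

def angular_power(theta, e, a, c, r):
    # P(theta) of Eq. (12.51)
    return e**2 * a**2 * math.sin(theta)**2 / (4.0 * math.pi * c**3 * r**2)

def total_power(e, a, c, r, n=20000):
    # Midpoint rule for P_T = 2 pi r^2 * integral of sin(theta) P(theta)
    h = math.pi / n
    s = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        s += math.sin(theta) * angular_power(theta, e, a, c, r)
    return 2.0 * math.pi * r**2 * s * h

def larmor(e, a, c):
    # Closed form, Eq. (12.54)
    return 2.0 * e**2 * a**2 / (3.0 * c**3)

p_near = total_power(e=1.0, a=1.0, c=1.0, r=3.0)
p_far = total_power(e=1.0, a=1.0, c=1.0, r=300.0)
print(p_near, p_far, larmor(1.0, 1.0, 1.0))
```

The result does not depend on the radius r of the shell, as it must for radiated power: the $r^2$ of the area element cancels the $r^{-2}$ decay of the Poynting vector.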

Figure 12.6: Polar plot of the power radiated by a dipole antenna as function of
the angle, Eq. 12.51. The maximal power is radiated in the plane perpendicular
to the dipole.

The radiation from a dipole antenna is not isotropic: it does not radiate at
all in the directions of the dipole. The good news is that you can play with the
geometry of the antenna to make it directional, so you do not waste power in
directions that you do not want.
Remark 12.4. You can not make an isotropic antenna. No matter how complicated
an antenna you make, the Poynting vector must vanish in at least two
directions. This is a consequence of topology: the vector B is tangent to the
sphere. It is a fact that every vector field tangent to the sphere must vanish at
two points, at least. (Or vanish quadratically at one point.) This is sometimes
expressed as: you can not comb a tennis ball. Hence P must vanish at two points
at least.

12.8 Classical instability of atoms


I now want to explain a puzzle in classical electrodynamics that turned out to
be a window that opened the way to quantum mechanics. In classical physics
atoms are unstable, and should collapse in a short time by emitting radiation with
diverging frequency.
Consider a charge −e in a circular Keplerian orbit around a nucleus of charge e.
The (non-relativistic) energy of the system is

\[ E = -\frac12\,\frac{e^2}{|\mathbf{x}|} \]

while the acceleration increases rapidly as the orbit gets smaller:

\[ a = \frac{e^2}{m|\mathbf{x}|^2} \]

The rate of loss of energy by radiation is

\[ -\dot E = \frac23\,\frac{e^2}{c^3}\,a^2 \]

We can now eliminate a and obtain a differential equation for the energy

\[ \dot E = -kE^4, \qquad k = \frac{2^5}{3(me)^2c^3} \tag{12.55} \]

Energy is lost at an accelerating rate, leading to blow-up in finite time. The
differential equation is easy to integrate since

\[ k = -\frac{\dot E}{E^4} = \frac13\,\frac{d(E^{-3})}{dt} \tag{12.56} \]

Hence

\[ E(t) = E_0\,(1-\gamma t)^{-1/3}, \qquad \gamma = -3E_0^3\,k > 0 \]

In other words, the charge would collapse on the nucleus in the finite time 1/γ.
The ground state energy of a hydrogen-like atom is

\[ 2E_0 = -mc^2\left(\frac{e^2}{\hbar c}\right)^2 = -mc^2\alpha^2 \]

and its period is $2\pi\hbar/|E_0|$. Therefore, the decay time, counted in periods, is

\[ \frac{|E_0|}{2\pi\hbar\gamma} = \frac{1}{16\pi}\times\alpha^{-3} \approx 5\times10^{4} \]

This means that the electron in hydrogen would fall onto the proton in about
$1.6\times10^{-11}$ seconds. The classical world is unprotected against collapse to the
nucleus.
The apparent instability of atoms in classical physics is one of the reasons
that led to quantum mechanics.
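The collapse time 1/γ can be evaluated directly from Eq. (12.55) and the ground state energy. This is a rough numerical sketch in Gaussian (CGS) units; the rounded values of the constants and the variable names are assumptions of the example.

```python
import math

# Physical constants in Gaussian (CGS) units, rounded
e = 4.8032e-10        # electron charge [esu]
m = 9.1094e-28        # electron mass [g]
c = 2.99792458e10     # speed of light [cm/s]
hbar = 1.0546e-27     # reduced Planck constant [erg s]

alpha = e**2 / (hbar * c)                 # fine structure constant
E0 = -0.5 * m * c**2 * alpha**2           # ground state energy, 2 E0 = -m c^2 alpha^2
k = 2**5 / (3.0 * (m * e)**2 * c**3)      # Eq. (12.55)
gamma = -3.0 * E0**3 * k                  # collapse rate, positive since E0 < 0

collapse_time = 1.0 / gamma               # [s]
period = 2.0 * math.pi * hbar / abs(E0)   # period as defined in the text [s]
n_periods = collapse_time / period        # equals 1 / (16 pi alpha^3)

print(f"alpha^-1      = {1.0 / alpha:.2f}")
print(f"collapse time = {collapse_time:.2e} s")
print(f"in periods    = {n_periods:.2e}")
```

With these constants the script gives a collapse time of roughly $1.6\times10^{-11}$ s, i.e. a few times $10^4$ periods.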

Figure 12.7: Blowup at finite time: The energy of a charged particle encircling
the nucleus goes to −∞ in finite time.
Chapter 13

Radiation reaction

13.1 Is electrodynamics a consistent theory?


13.1.1 Physics
Classical electrodynamics is not an accurate description of nature: Nature is
quantum mechanical. One can nevertheless ask if classical electrodynamics is a
self-consistent theory which is mathematically well defined.
A complete description of a problem in electrodynamics involves Newton
equations that describe the motion of charged particles in a given electromag-
netic field and Maxwell equations which describe the electromagnetic fields given
the 4-currents as sources. As we have seen, both Newton equations for point
charges and Maxwell’s equations for the fields are consistent with special rela-
tivity.
Electrodynamics for point charges has singular source terms $j^\mu$ in Maxwell
equations, made from a collection of delta functions. Since the source terms are
singular, the fields are also singular. For example, the total field energy of a
point charge diverges:

\[ \frac{1}{8\pi}\int \frac{e^2}{|\mathbf{x}|^4}\,d\mathbf{x} = \infty \tag{13.1} \]

The divergence comes from the singularity at x = 0. (The integral at infinity
is perfectly convergent in 3 spatial dimensions.) As a consequence, when we
discuss Maxwell's energy-momentum tensor and the associated conservation
laws, we will encounter diverging quantities.
We can not replace the point charges by rigid small spheres, since rigid
spheres are incompatible with the principle of finite propagation speed of special
relativity, and involve additional internal degrees of freedom such as rotations,
which will affect energy and angular momentum.
If we replace the point charges by other elementary objects such as tiny
fluctuating strings, then we cure the disease in the total field energy near the charge, but we now
need to supplement Newton and Maxwell by a theory describing the fluctuations
of the strings.
You may wonder if the problem with infinities is not cured by quantum mechanics.
After all, the Heisenberg uncertainty relation does not allow one to fully localize
a particle to a point. So, perhaps, a quantum theory of charges coupled with a
classical electromagnetic field is a consistent theory? The trouble is that it is not
possible to consistently couple a quantum theory for the charges with a classical
theory for the electromagnetic fields. This is most easily seen in the Heisen-
berg picture: In a quantum theory the observables are non-commuting matrices
while in a classical theory they are commuting functions. If the two theories are
coupled the classical theory will be “infected” by the non-commutativity of the
quantum theory.
What about QED: A relativistic quantum theory of both charges and fields?
It is now widely believed that this theory although practically useful, is in
principle, ill-defined. It is a “phenomenological” mirror of a more involved,
hopefully consistent, QCD.

13.1.2 Mathematics
Newton equations are non-linear ODE's coupled to Maxwell equations, which are
linear PDE's. As the charges are point-like, the source terms are singular, and
so are the fields. The standard theorems in the theory of differential equations
do not cover singular PDE coupled to non-linear ODE. This is worse than the
case of the Navier-Stokes equations, which is still open.

13.2 Non-relativistic interacting particles


There are limiting cases of electrodynamics which are well defined. In particular,
this is the case for non-relativistic point charges. This limit is well defined
mathematically¹ and practically useful. The electromagnetic field mediates the
interaction between the charges and the limit is encapsulated by the Hamiltonian
for mutually interacting particles

\[ H = \sum_j \frac{1}{2m_j}\,\mathbf{p}_j^2 + \frac12\sum_{j\neq k}\frac{e_j e_k}{|\mathbf{x}_j - \mathbf{x}_k|} \tag{13.2} \]

This is the starting point of non-relativistic atomic physics. To show this we
recall first the freedom to impose the Coulomb gauge condition

\[ \nabla\cdot\mathbf{A} = 0, \qquad \Delta\Phi = -4\pi\rho \tag{13.3} \]
Poisson's equation determines the gauge potential Φ, which responds
instantaneously to a change in ρ. A satisfies the wave equation with a source term:

− ∆A − Ä = J + ∇Φ̇ (13.4)
c
(See section 11.7.)

1 To be fair, we have thrown away the infinite terms associated with self-interaction in Eq. 13.2.

How much gauge freedom do we still have? We can add

\[ \mathbf{A} \mapsto \mathbf{A} + \nabla\times\mathbf{C} \tag{13.5} \]

where C is a solution of the free wave equation. In particular, we can always
impose on A the condition that it is linear in the source, i.e. that it is constructed
from the retarded Green function of the wave equation.
When the particles move slowly they are mostly affected by the electric field.
This follows from

\[ m\,\dot u_\mu + \frac{e}{c}\,F_{\mu\nu}\,u^\nu = 0 \]

since $u^\mu \approx c\,\delta^\mu_0$, so $F_{\mu\nu}u^\nu \approx F_{\mu 0}\,c$. By Eq. 13.4, A is small when the charges are
slow, and the electric field is mostly determined by Φ:

\[ \mathbf{E} = -\nabla\Phi - \underbrace{\frac1c\,\dot{\mathbf{A}}}_{O(v/c)} \approx -\nabla\Phi \tag{13.6} \]

Maxwell's equations for slow particles, to leading order in v/c, reduce to Poisson's
equation for Φ. The electromagnetic field is not a dynamical field anymore:
it is a slave of the motion of the charges ρ. Since we know how Poisson's equation
can be explicitly solved given the position of the charges, the dynamics is
only in the motion of the charges. This gives Eq. 13.2.

13.3 Radiation reaction: The Abraham-Lorentz


force
Trying to treat Maxwell and Newton self-consistently leads to deep conceptual
problems but at the same time, in practice, the effects are normaally rediculously
small.
Consider a charged particle moving non-relativistically in a circle due to the
action of a central force. We allow for non-electromagnetic forces that make sure
that the particle moves in a stationary orbit. The accelerating charge radiates
and we want to compute the force that acts back on the charge because of this
process.
For circular motion the acceleration is perpendicular to the velocity and

a2 = (a ·˙ v) − ȧ · v = −ȧ · v

The radiated power can now be written as

2 e2 2 2 e2
P = 3
a = − 3 ȧ · v = −FAL · v
3c 3c
If the particle is moving at constant speed, a (non-electromagnetic) force
opposite to F_AL must be applied to feed back the energy lost to radiation. F_AL
can be interpreted as the back reaction of the radiation:

\[ F_{AL} = \frac{2}{3}\frac{e^2}{c^3}\,\dot a \tag{13.7} \]

ȧ is known as the jerk. It is opposite to the velocity (Fig. 13.1), and the force acts a bit
like friction. This is the Abraham-Lorentz force. It is an unusual force, in two
ways. First, unlike the usual friction, it is not proportional to the velocity but rather
to the jerk. Second, it is ultimately a force that a particle applies on itself by
shedding radiation.
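The identity a² = −ȧ · v used above is easy to check numerically. The following is a minimal sketch, assuming uniform circular motion in the plane with an arbitrary radius and angular frequency (the numbers and the helper names are illustrative, not from the notes):

```python
import math

def circular_motion(r, omega, t):
    """Velocity, acceleration and jerk for x(t) = r (cos wt, sin wt)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    v = (-r * omega * s, r * omega * c)            # first derivative of x
    a = (-r * omega**2 * c, -r * omega**2 * s)     # second derivative of x
    jerk = (r * omega**3 * s, -r * omega**3 * c)   # third derivative of x
    return v, a, jerk

def dot(p, q):
    """Euclidean dot product of two planar vectors."""
    return p[0] * q[0] + p[1] * q[1]

# Illustrative values (assumptions): r = 2, omega = 3, t = 0.7
omega = 3.0
v, a, jerk = circular_motion(r=2.0, omega=omega, t=0.7)

a_squared = dot(a, a)
minus_jerk_dot_v = -dot(jerk, v)
print("a.v        =", dot(a, v))          # vanishes up to rounding
print("a^2        =", a_squared)
print("-jerk . v  =", minus_jerk_dot_v)   # agrees with a^2
```

In this sketch the jerk equals −ω² v, which is the statement that it is antiparallel to the velocity.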

Figure 13.1: In a circular motion every derivative is a rotation by π/2 (the figure shows the vectors v and ȧ).

Exercise 13.1 (Covariant form of the Abraham-Lorentz force). Newton's law

\[ m a_\mu = f_\mu \]

is consistent with u · u = −c² provided f · u = 0. Using a · a + u · ȧ = 0, show
that the covariant form of the Abraham-Lorentz force is

\[ m a_\mu = \frac{2e^2}{3c^3}\left( \dot a_\mu - \frac{a\cdot a}{c^2}\, u_\mu \right) \tag{13.8} \]

13.3.1 When is radiation reaction important?


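To appreciate the significance of the Abraham-Lorentz force consider the ratio

\[ \frac{|F_{AL}|}{m|a|} = O\!\left( \frac{e^2\,\dot a}{mc^3\, a} \right) = O(\tau\omega), \qquad \tau = \frac{e^2}{mc^3}, \quad \omega = \frac{\dot a}{a} \tag{13.9} \]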
ω is the characteristic frequency of the actual motion and τ is a fundamental time
scale associated with radiation reaction. To get some feeling for the meaning
of τ, let us express it in terms of the classical radius r0 of a charged particle of mass
m. This is defined by equating the field energy with the energy in the mass:

\[ mc^2 = \frac{e^2}{r_0} \tag{13.10} \]

Then τ is the time it takes light to cross the classical radius:

\[ \tau = \frac{r_0}{c} \tag{13.11} \]
For the electron

\[ \tau = O(10^{-23})\ \mathrm{s} \tag{13.12} \]

The ratio in Eq. 13.9 is a ridiculously small number unless the particle has
violent jerk, with characteristic frequencies $\omega \gtrsim 10^{23}$ Hz.
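To make the estimate concrete, here is a short sketch that evaluates Eqs. 13.10 and 13.11 for the electron. The constants are rounded Gaussian (CGS) values that we supply for illustration, and the optical frequency is likewise an assumed round number:

```python
# Rounded Gaussian (CGS) constants; these values are assumptions supplied
# for illustration and do not appear in the notes.
E_CHARGE = 4.803e-10   # electron charge [esu]
E_MASS = 9.109e-28     # electron mass [g]
C_LIGHT = 2.998e10     # speed of light [cm/s]

# Classical radius from m c^2 = e^2 / r0 (Eq. 13.10)
r0 = E_CHARGE**2 / (E_MASS * C_LIGHT**2)

# Light-crossing time of the classical radius (Eq. 13.11)
tau = r0 / C_LIGHT

# Assumed angular frequency of visible light, order of magnitude only
omega_optical = 3.0e15  # [1/s]

print(f"r0        = {r0:.3e} cm")
print(f"tau       = {tau:.3e} s")
print(f"tau*omega = {tau * omega_optical:.1e}  (optical frequencies)")
```

Even at optical frequencies the product τω stays many orders of magnitude below one, which is the sense in which the force is normally negligible.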

13.3.2 Friction
Small forces can still have an effect if they act for a long time. Consider the
equation of motion in a force field f with radiation reaction acting like friction:

\[ a = \frac{1}{m} f(x) + \tau\dot a, \qquad \tau = \frac{2}{3}\frac{e^2}{mc^3} \tag{13.13} \]
Viewing τ as the smallest time scale in the problem, we treat it as a perturbation
and solve the equation of motion by iteration. To leading order

\[ a_1 = \frac{1}{m} f(x_1) \tag{13.14} \]

To second order

\begin{align}
a_2 &= \frac{1}{m} f(x_2) + \tau\dot a_1 \nonumber\\
    &= \frac{1}{m}\left( f(x_2) + \tau \dot f(x_1) \right) \nonumber\\
    &= \frac{1}{m}\left( f(x_2) + \tau (v_1\cdot\nabla) f(x_1) \right) \tag{13.15}
\end{align}

which we approximate by an equation with a weak friction term

\[ ma = f(x) + \tau (v\cdot\nabla) f(x) \tag{13.16} \]

As an example consider the harmonic oscillator, where $f(x) = -m\omega_0^2 x$. The
equation of motion is

\[ a = -\omega_0^2 (x + \tau v) \tag{13.17} \]

which is solved by $x(t) = x(0)\, e^{i\omega_0 z t}$ where z is a solution of the quadratic
equation

\[ -z^2 + 1 + i z \tau\omega_0 = 0 \tag{13.18} \]

Since $\omega_0\tau \ll 1$, this is solved by

\[ z^2 \approx 1 \pm i\tau\omega_0 \tag{13.19} \]

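The two roots of Eq. 13.18 are $z \approx \pm 1 + \tfrac{i}{2}\tau\omega_0$. Both have a positive
imaginary part, so both solutions of Eq. 13.17 decay in the future, at the rate
$\tau\omega_0^2/2$: the friction term damps the oscillator. A solution exploding in the future,
like the one admitted by the third-order equation, Eq. 13.13, does not survive the iteration,
because it is not self-consistent with the assumption that the radiation reaction
is a small perturbation.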
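The roots of the quadratic, Eq. 13.18, can be checked directly. The sketch below solves it in closed form with the standard library and prints the imaginary part of each root; since x(t) ∝ exp(i ω0 z t), a positive imaginary part means decay. The value chosen for τω0 is an illustrative assumption, much larger than any realistic one, so that the effect is visible in the printout:

```python
import cmath

def roots_of_eq_13_18(eps):
    """Roots of -z**2 + 1j*eps*z + 1 = 0, with eps = tau * omega0."""
    # Equivalent to z**2 - 1j*eps*z - 1 = 0; discriminant (1j*eps)**2 + 4
    disc = cmath.sqrt(4 - eps**2)
    return ((1j * eps + disc) / 2, (1j * eps - disc) / 2)

eps = 1e-3  # illustrative assumption for tau * omega0
roots = roots_of_eq_13_18(eps)

for z in roots:
    residual = -z * z + 1j * eps * z + 1   # should vanish up to rounding
    # |x(t)| is proportional to exp(-omega0 * Im(z) * t)
    print(f"z = {z:.6f}, z^2 = {z * z:.6f}, |residual| = {abs(residual):.1e}")
```

For this reduced, second-order equation both roots come out with imaginary part eps/2, and their squares are 1 ± i eps to first order in eps.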

Exercise 13.2. Write the equations of motion for the Kepler problem with
radiation reaction friction term.

13.3.3 The Dumbbell

The derivation of the Abraham-Lorentz force is not completely convincing. We
expect a force also when a is constant, as the particle still radiates.

To address this consider, following Amos Ori, the forces that different parts
of an extended body apply on each other. In the limit that the size of the object
ε → 0, we shall recover Abraham-Lorentz, with the extra bonus that we get
an interpretation of the mass in terms of the energy of the field.

We shall derive the self-force on a dumbbell made of two point charges
separated by a rod of length ε.

We first compute the forces that the dumbbell applies on itself when it moves
in a prescribed way. The world-lines of the dumbbell are shown in the figure
(red) and the light-cone is drawn in blue. We shall see that, as a consequence of
retardation, Newton's third law is violated, and there is a net force acting on an
extended body.

[Figure: world-lines of the two charges of the dumbbell (red) and the light cone (blue); the horizontal axis is x.]


The dumbbell moves along the x-axis and is aligned with the y-axis, so the
world lines of the two charges are

x± (t) = (ct, q(t), ±ε/2, 0)

The two charges communicate when x± (t + τ ) − x∓ (t) is light like. The time
delay τ = O(ε) is small.
R± (τ ) = x± (τ ) − x∓ (0) is a light like vector. We choose a Lorentz frame so
that the dumbbell is at rest at time zero, i.e. q(0) = q̇(0) = 0 and so

R± (τ ) = (cτ, q(τ ), ±ε, 0) (13.20)

We want to find the forces at time τ and express them in terms of the
acceleration and jerk at the same time τ . For simplicity we take the jerk to be
constant. Then
a(τ ) = a = a(0) + ȧτ
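From this it follows that

\[ \dot q(\tau) = a(0)\tau + \tfrac{1}{2}\dot a\tau^2 = a\tau - \tfrac{1}{2}\dot a\tau^2 \tag{13.21} \]

Integrating the velocity gives the position

\[ q(\tau) = \tfrac{1}{2} a(0)\tau^2 + \tfrac{1}{6}\dot a\tau^3 = \tfrac{1}{2} a\tau^2 + \left( \tfrac{1}{6} - \tfrac{1}{2} \right)\dot a\tau^3 = \tfrac{1}{2} a\tau^2 - \tfrac{1}{3}\dot a\tau^3 \tag{13.22} \]

This fixes the function q(τ) in R±.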
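The two ways of writing the velocity and the position, in terms of a(0) or in terms of a = a(τ), can be compared with exact rational arithmetic. This is a small sketch; the numerical values of a(0), ȧ and τ are arbitrary choices for illustration:

```python
from fractions import Fraction as F

def velocity_and_position(a0, jerk, tau):
    """Integrate a constant jerk starting from rest: q(0) = q'(0) = 0."""
    v = a0 * tau + F(1, 2) * jerk * tau**2
    q = F(1, 2) * a0 * tau**2 + F(1, 6) * jerk * tau**3
    return v, q

# Arbitrary illustrative values (assumptions)
a0, jerk, tau = F(3), F(5), F(1, 7)

a = a0 + jerk * tau   # acceleration at time tau

v, q = velocity_and_position(a0, jerk, tau)

# The same quantities expressed through a(tau), as in Eqs. 13.21 and 13.22
v_alt = a * tau - F(1, 2) * jerk * tau**2
q_alt = F(1, 2) * a * tau**2 - F(1, 3) * jerk * tau**3

print(v == v_alt, q == q_alt)   # prints: True True
```

Because the arithmetic is exact, the comparison tests the algebra itself and not a floating point approximation of it.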



The retardation τ and the dumbbell size ε are related by the condition that
R is light like:

\[ (c\tau)^2 = \varepsilon^2 + q^2(\tau) \]

which is a polynomial equation for τ of order 6. However, q(τ) is quadratic
in τ to leading order, so q²(τ) = O(τ⁴) is negligible next to the other terms. Hence, to leading order

\[ c\tau \approx \varepsilon \tag{13.23} \]
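How good the leading-order relation cτ ≈ ε is can be seen by solving the light-cone condition numerically. The sketch below uses a simple fixed-point iteration, in units with c = 1 and with arbitrarily chosen acceleration and jerk (all assumptions made for illustration):

```python
import math

def retardation(eps, a0, jerk, c=1.0, iterations=50):
    """Solve (c*tau)**2 = eps**2 + q(tau)**2 by fixed-point iteration."""
    tau = eps / c                      # leading-order guess, c*tau = eps
    for _ in range(iterations):
        q = 0.5 * a0 * tau**2 + jerk * tau**3 / 6.0
        tau = math.sqrt(eps**2 + q**2) / c
    return tau

# Illustrative values in units with c = 1 (assumptions)
eps, a0, jerk, c = 1e-3, 1.0, 0.5, 1.0

tau = retardation(eps, a0, jerk, c)
correction = c * tau / eps - 1.0
print(f"c*tau/eps - 1 = {correction:.3e}")
```

The relative correction is of order ε², consistent with q² = O(τ⁴) being negligible in the light-cone condition.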
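The retarded field at time τ is determined by the velocity and acceleration of
the particle at time 0 when the signal was generated:

\[ F_{\mu\nu} = -e\left( \frac{R_{[\mu} a_{\nu]}}{(R\cdot u)^2} - \frac{R_{[\mu} u_{\nu]}}{(R\cdot u)^3}\, R\cdot a - c^2\, \frac{R_{[\mu} u_{\nu]}}{(R\cdot u)^3} \right) \tag{13.24} \]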
The four velocity is $u_\mu(0) = (-c, 0, 0, 0)$, and the four acceleration is
$a_\mu(0) = (0, a(0), 0, 0)$. We therefore have from Eq. 13.20

\[ R\cdot u = -c^2\tau \approx -c\varepsilon, \qquad R\cdot a = a(0)\, q(\tau) = O(\varepsilon^2) \tag{13.25} \]

Since R · u appears in the denominator of F, the limit ε → 0 is singular and
has to be taken carefully. In particular, we need to keep terms in the numerator
to O(ε³).
The electric field in the direction of motion on one of the charges due to the
other at time τ is

\[ -E_x = F_{01} = -e\left( \frac{R_0 a_1}{(c^2\tau)^2} + \frac{R_1 u_0}{(-c^2\tau)^3}\, R\cdot a + c^2\, \frac{R_1 u_0}{(-c^2\tau)^3} \right) \]

where we used $a_0 = 0$ and $u_1 = 0$.

We are interested in the terms that do not vanish when cτ = ε → 0.
By Eqs. 13.20 and 13.22, $R_1 = q(\tau) = O(\tau^2)$, so the middle term in the formula
above tends to zero with τ and can be dropped when ε → 0.
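The remaining terms are (to O(ε))

\begin{align}
E_x = F_{10} &\approx \frac{e}{c^3}\left( -\frac{a(0)}{\tau} + \frac{q(\tau)}{\tau^3} \right) \nonumber\\
&= \frac{e}{c^3}\left( -\frac{a - \dot a\tau}{\tau} + \frac{\tfrac{1}{2} a - \tfrac{1}{3}\dot a\tau}{\tau} \right) \tag{13.26}\\
&= -\frac{e}{c^3}\left( \frac{a}{2\tau} - \frac{2}{3}\dot a \right) \nonumber
\end{align}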
The retarded force that one charge applies on the other in the x direction is
made of two terms: The leading term diverges as ε → 0 and is proportional
to the acceleration a. The subleading term has a finite limit as ε → 0 and is
proportional to the jerk:

\[ F = eE = \underbrace{-\frac{e^2}{2\tau c^3}\, a}_{\text{divergent}} + \frac{2e^2}{3c^3}\,\dot a \tag{13.27} \]
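The algebra leading from the retarded field to the two-term force can be checked with exact rational arithmetic. The sketch below compares the bracket multiplying e/c³ before and after a(0) is eliminated in favour of a = a(τ); the numerical values are arbitrary and the function names are ours:

```python
from fractions import Fraction as F

def bracket_from_field(a0, jerk, tau):
    """-a(0)/tau + q(tau)/tau**3, the bracket multiplying e/c**3 in E_x."""
    q = F(1, 2) * a0 * tau**2 + F(1, 6) * jerk * tau**3
    return -a0 / tau + q / tau**3

def bracket_final(a, jerk, tau):
    """-(a/(2 tau) - (2/3) jerk), with a the acceleration at time tau."""
    return -a / (2 * tau) + F(2, 3) * jerk

# Arbitrary illustrative values (assumptions)
a0, jerk = F(3), F(5)

for tau in (F(1, 7), F(1, 100), F(1, 10**6)):
    a = a0 + jerk * tau                # constant jerk: a(tau) = a(0) + jerk*tau
    lhs = bracket_from_field(a0, jerk, tau)
    rhs = bracket_final(a, jerk, tau)
    print(tau, lhs == rhs)
```

For constant jerk the two expressions agree exactly for every τ, so the 1/τ term and the finite jerk term are the complete result in this model.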

As expected, in addition to the Abraham-Lorentz force proportional to ȧ, we
find a force proportional to a. Newton's law for a dumbbell with (bare) mass $m_b$,
in an external force $F_{ex}$ and with the radiation self-force, is then

\[ \left( m_b + \frac{e^2}{2\tau c^3} \right) a = \frac{2e^2}{3c^3}\,\dot a + F_{ex} \]

We interpret the brackets as the effective mass which gets a contribution from
the electromagnetic energy in the field.
This is quite tantalizing, for it offers a new interpretation of the mass as
being generated by the field. The interpretation is not completely satisfactory
for as we send τ → 0 we need to take the bare mass large and negative in order
to get a finite effective mass.
Remark 13.3. In this computation we neglected the self-force of the point
charge on itself. One can get the self radiation reaction using the following
trick Amos Ori taught me. Let f(e) denote this self-force. It is a quadratic
function of e. To find it, use the fact that in the limit τ → 0 the dumbbell
becomes a point charge 2e, so the total force on the dumbbell gives an equation for f(e):

\[ F_t = 2f(e) + \frac{4e^2}{3c^3}\,\dot a = f(2e) = 4f(e) \]

This says that the radiation reaction force is $f(e) = \frac{2e^2}{3c^3}\,\dot a$, as we have seen before.

13.4 Conceptual difficulties


Radiation reaction is conceptually problematic for several reasons. First, it
changes the order of Newton equations of motion from second order to third,
see Eq. (13.13). This means that fixing the initial position and velocity is not
sufficient to determine the orbit. This is in contrast with common experience.
Another problem is that the equations of motion, Eq. (13.13), admit non-physical
solutions. For example, with f = 0 the equation is

\[ a = \tau\dot a, \qquad \tau = \frac{2e^2}{3mc^3} \]

It admits the solution

\[ a(t) = a(0)\, e^{t/\tau} \implies v(t) = v(0) + \tau a(0)\left( e^{t/\tau} - 1 \right) \tag{13.28} \]
A particle, initially at rest, self-accelerates to large velocities. This can only be
avoided if you tune a(0) = 0 precisely. In practice, we only tune x(0) and v(0),
so how come we do not see these self-accelerations? Classical electrodynamics
for point particles is fundamentally flawed. You could use this to argue for going
quantum.
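To get a feeling for how violent the runaway is, one can integrate a(t) = a(0) exp(t/τ) with initial velocity v(0), which gives v(t) = v(0) + τ a(0) (exp(t/τ) − 1), and evaluate it for the electron. In this sketch the value of τ is a rounded CGS estimate of (2/3) e²/(mc³), and the initial acceleration is an arbitrary tiny number; both are assumptions made for illustration:

```python
import math

TAU = 6.27e-24      # s; rounded (2/3) e^2 / (m c^3) for the electron (assumed)
C_LIGHT = 2.998e10  # cm/s (rounded)

def runaway_velocity(v0, a0, t, tau=TAU):
    """v(t) obtained by integrating a(t) = a0 * exp(t / tau) from 0 to t."""
    # expm1(x) = exp(x) - 1, accurate also for small x
    return v0 + tau * a0 * math.expm1(t / tau)

a0 = 1.0  # cm/s^2, an arbitrary tiny initial acceleration (assumption)

for n in (1, 10, 100):
    v = runaway_velocity(0.0, a0, n * TAU)
    print(f"t = {n:3d} tau: v = {v:.3e} cm/s, v/c = {v / C_LIGHT:.3e}")
```

After about a hundred τ, a time far shorter than any laboratory time scale, the non-relativistic formula already returns a velocity exceeding c. This is another way of seeing that these solutions cannot be physical.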
