Introduction to General Relativity

Gerard 't Hooft 

Institute for Theoretical Physics 
Utrecht University 

Spinoza Institute 
Postbox 80.195 
3508 TD Utrecht, the Netherlands 

internet : http : //www. staff . science t 101/ 

Version December 2012 



General relativity is a beautiful scheme for describing the gravitational field and the 
equations it obeys. Nowadays this theory is often used as a prototype for other, more 
intricate constructions to describe forces between elementary particles or other branches 
of fundamental physics. This is why in an introduction to general relativity it is of 
importance to separate as clearly as possible the various ingredients that together give 
shape to this paradigm. After explaining the physical motivations we first introduce 
curved coordinates, then add to this the notion of an affine connection field and only as a 
later step add to that the metric field. One then sees clearly how space and time get more 
and more structure, until finally all we have to do is deduce Einstein's field equations. 

These notes materialized when I was asked to present some lectures on General Relativity.
Small changes were made over the years. I decided to make them freely available
on the web, via my home page. Some readers expressed their irritation over the fact that
after 12 pages I switch notation: the i in the time components of vectors disappears, and
the metric becomes the − + + + metric. Why this "inconsistency" in the notation?

There were two reasons for this. The transition is made where we proceed from special 
relativity to general relativity. In special relativity, the i has a considerable practical 
advantage: Lorentz transformations are orthogonal, and all inner products only come 
with + signs. No confusion over signs remains. The use of a − + + + metric, or worse
even, a + − − − metric, inevitably leads to sign errors. In general relativity, however,
the i is superfluous. Here, we need to work with the quantity $g_{00}$ anyway. Choosing it
to be negative rarely leads to sign errors or other problems.

But there is another pedagogical point. I see no reason to shield students against 
the phenomenon of changes of convention and notation. Such transitions are necessary 
whenever one switches from one field of research to another. They had better get used to it.

As for applications of the theory, the usual ones such as the gravitational red shift, 
the Schwarzschild metric, the perihelion shift and light deflection are pretty standard. 
They can be found in the cited literature if one wants any further details. Finally, I do 
pay extra attention to an application that may well become important in the near future: 
gravitational radiation. The derivations given are often tedious, but they can be produced 
rather elegantly using standard Lagrangian methods from field theory, which is what will
be demonstrated. When teaching this material, I found that this last chapter is still a 
bit too technical for an elementary course, but I leave it there anyway, just because it is 
omitted from introductory text books a bit too often. 

I thank A. van der Ven for a careful reading of the manuscript. 



C.W. Misner, K.S. Thorne and J.A. Wheeler, "Gravitation", W.H. Freeman and Comp.,
San Francisco, 1973, ISBN 0-7167-0344-0.

R. Adler, M. Bazin and M. Schiffer, "Introduction to General Relativity", McGraw-Hill, 1965.

R.M. Wald, "General Relativity", Univ. of Chicago Press, 1984.

P.A.M. Dirac, "General Theory of Relativity", Wiley Interscience, 1975.

S. Weinberg, "Gravitation and Cosmology: Principles and Applications of the General
Theory of Relativity", J. Wiley & Sons, 1972.

S.W. Hawking and G.F.R. Ellis, "The large scale structure of space-time", Cambridge Univ.
Press, 1973.

S. Chandrasekhar, "The Mathematical Theory of Black Holes", Clarendon Press, Oxford
Univ. Press, 1983.

Dr. A.D. Fokker, "Relativiteitstheorie", P. Noordhoff, Groningen, 1929.

J.A. Wheeler, "A Journey into Gravity and Spacetime", Scientific American Library, New
York, 1990, distr. by W.H. Freeman & Co, New York.

H. Stephani, "General Relativity: An introduction to the theory of the gravitational
field", Cambridge University Press, 1990.


Prologue

Literature

1 Summary of the theory of Special Relativity. Notations.

2 The Eötvös experiments and the Equivalence Principle.

3 The constantly accelerated elevator. Rindler Space.

4 Curved coordinates.

5 The affine connection. Riemann curvature.

6 The metric tensor.

7 The perturbative expansion and Einstein's law of gravity.

8 The action principle.

9 Special coordinates.

10 Electromagnetism.

11 The Schwarzschild solution.

12 Mercury and light rays in the Schwarzschild metric.

13 Generalizations of the Schwarzschild solution.

14 The Robertson-Walker metric.

15 Gravitational radiation.

16 Concluding remarks


1. Summary of the theory of Special Relativity. Notations. 

Special Relativity is the theory claiming that space and time exhibit a particular symmetry 
pattern. This statement contains two ingredients which we further explain: 

(i) There is a transformation law, and these transformations form a group. 

(ii) Consider a system in which a set of physical variables is described as being a correct 
solution to the laws of physics. Then if all these physical variables are transformed 
appropriately according to the given transformation law, one obtains a new solution 
to the laws of physics. 

As a prototype example, one may consider the set of rotations in a three dimensional 
coordinate frame as our transformation group. Many theories of nature, such as Newton's
law $\vec F = m\,\vec a$, are invariant under this transformation group. We say that Newton's
laws have rotational symmetry.

A "point-event" is a point in space, given by its three coordinates $\vec x = (x, y, z)$, at a
given instant t in time. For short, we will call this a "point" in space-time, and it is a
four component vector,

$$ x = \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} = \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} . \qquad (1.1) $$

Here c is the velocity of light. Clearly, space-time is a four dimensional space. These
vectors are often written as $x^\mu$, where $\mu$ is an index running from 0 to 3. It will however
be convenient to use a slightly different notation, $x^\mu$, $\mu = 1, \dots, 4$, where $x^4 = ict$ and
$i = \sqrt{-1}$. Note that we do this only in Sections 1 and 3, where special relativity in
flat space-time is discussed (see the Prologue). The intermittent use of superscript indices
( ${}^\mu$ ) and subscript indices ( ${}_\mu$ ) is of no significance in these sections, but will become
important later.

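In Special Relativity, the transformation group is what one could call the "velocity
transformations", or Lorentz transformations. It is the set of linear transformations,

$$ (x^\mu)' = \sum_{\nu=1}^{4} L^\mu{}_\nu\, x^\nu \qquad (1.2) $$

subject to the extra condition that the quantity $\sigma$ defined by

$$ \sigma^2 = \sum_\mu (x^\mu)^2 = |\vec x|^2 - c^2 t^2 \qquad (\sigma \ge 0) \qquad (1.3) $$

remains invariant. This condition implies that the coefficients $L^\mu{}_\nu$ form an orthogonal
matrix:

$$ \sum_\nu L^\mu{}_\nu\, L^\alpha{}_\nu = \delta^{\mu\alpha} ; \qquad \sum_\alpha L^\alpha{}_\mu\, L^\alpha{}_\beta = \delta_{\mu\beta} . \qquad (1.4) $$

Because of the i in the definition of $x^4$, the coefficients $L^i{}_4$ and $L^4{}_i$ must be purely
imaginary. The quantities $\delta^{\mu\alpha}$ and $\delta_{\mu\beta}$ are Kronecker delta symbols:

$$ \delta^{\mu\nu} = \delta_{\mu\nu} = 1 \ \text{if}\ \mu = \nu , \ \text{and 0 otherwise.} \qquad (1.5) $$

One can enlarge the invariance group with the translations:

$$ (x^\mu)' = \sum_\nu L^\mu{}_\nu\, x^\nu + a^\mu , \qquad (1.6) $$

in which case it is referred to as the Poincaré group.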

We introduce summation convention:
If an index occurs exactly twice in a multiplication (at one side of the = sign) it will
automatically be summed over from 1 to 4 even if we do not indicate explicitly the
summation symbol $\sum$. Thus, Eqs. (1.2)-(1.4) can be written as:

$$ (x^\mu)' = L^\mu{}_\nu\, x^\nu ; \qquad \sigma^2 = x^\mu x^\mu = (x^\mu)^2 ; \qquad L^\mu{}_\nu\, L^\alpha{}_\nu = \delta^{\mu\alpha} . \qquad (1.7) $$

If we do not want to sum over an index that occurs twice, or if we want to sum over an
index occurring three times (or more), we put one of the indices between brackets so as
to indicate that it does not participate in the summation convention. Remarkably, we
nearly never need to use such brackets.

Greek indices $\mu, \nu, \dots$ run from 1 to 4; Latin indices $i, j, \dots$ indicate spacelike
components only and hence run from 1 to 3.

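A special element of the Lorentz group is

$$ L^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cosh\chi & i\sinh\chi \\ 0 & 0 & -i\sinh\chi & \cosh\chi \end{pmatrix} , \qquad (1.8) $$

where $\chi$ is a parameter. Or

$$ x' = x ; \qquad y' = y ; $$
$$ z' = z\cosh\chi - ct\sinh\chi ; $$
$$ t' = -\frac{z}{c}\sinh\chi + t\cosh\chi . \qquad (1.9) $$

This is a transformation from one coordinate frame to another with velocity

$$ v = c\tanh\chi \quad \text{(in the z direction)} \qquad (1.10) $$

with respect to each other.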

For convenience, units of length and time will henceforth be chosen such that

$$ c = 1 . \qquad (1.11) $$

Note that the velocity v given in (1.10) will always be less than that of light. The light
velocity itself is Lorentz-invariant. This indeed has been the requirement that led to the
introduction of the Lorentz group.
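The properties claimed for the special Lorentz transformation (1.8) can be checked numerically. The following is only a sketch, assuming c = 1 and the $x^4 = it$ convention of this section; the function names and the sample event are illustrative choices, not part of the notes.

```python
import math

def boost(chi):
    """Matrix (1.8), acting on column vectors (x, y, z, i*t) with c = 1."""
    ch, sh = math.cosh(chi), math.sinh(chi)
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, ch, 1j * sh],
        [0, 0, -1j * sh, ch],
    ]

def apply(L, x):
    """(x^mu)' = L^mu_nu x^nu, summed over nu."""
    return [sum(L[m][n] * x[n] for n in range(4)) for m in range(4)]

def sigma2(x):
    """sigma^2 = x^mu x^mu; a plain square, with no complex conjugation."""
    return sum(c * c for c in x)

def orthogonality_defect(L):
    """Largest entry of |L L^T - 1|, testing the first relation in (1.4)."""
    return max(
        abs(sum(L[m][n] * L[a][n] for n in range(4)) - (1 if m == a else 0))
        for m in range(4)
        for a in range(4)
    )

chi = 0.7                           # illustrative value of the parameter
L = boost(chi)
x = [0.3, -1.2, 2.0, 1j * 0.5]      # sample event with z = 2.0, t = 0.5
xp = apply(L, x)

print("orthogonality defect:", orthogonality_defect(L))
print("sigma^2 before and after:", sigma2(x), sigma2(xp))
print("v = tanh(chi) =", math.tanh(chi))
```

The orthogonality test uses the transpose rather than the Hermitian conjugate, which is exactly why the factor i in $x^4$ makes the Lorentz matrices orthogonal.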

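Many physical quantities are not invariant but covariant under Lorentz transformations.
For instance, energy E and momentum $\vec p$ transform as a four-vector:

$$ p^\mu = \begin{pmatrix} p_x \\ p_y \\ p_z \\ iE \end{pmatrix} ; \qquad (p^\mu)' = L^\mu{}_\nu\, p^\nu . \qquad (1.12) $$

Electro-magnetic fields transform as a tensor:

$$ F^{\mu\nu} = \begin{pmatrix} 0 & B_3 & -B_2 & -iE_1 \\ -B_3 & 0 & B_1 & -iE_2 \\ B_2 & -B_1 & 0 & -iE_3 \\ iE_1 & iE_2 & iE_3 & 0 \end{pmatrix} ; \qquad (F^{\mu\nu})' = L^\mu{}_\alpha\, L^\nu{}_\beta\, F^{\alpha\beta} . \qquad (1.13) $$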





It is of importance to realize what this implies: although we have the well-known 
postulate that an experimenter on a moving platform, when doing some experiment, 
will find the same outcomes as a colleague at rest, we must rearrange the results before 
comparing them. What could look like an electric field for one observer could be a
superposition of an electric and a magnetic field for the other. And so on. This is what
we mean by covariance as opposed to invariance. Many more symmetry groups could be
found in Nature than the ones known, if only we knew how to rearrange the phenomena.
The transformation rule could be very complicated. 

We now have formulated the theory of Special Relativity in such a way that it has become
very easy to check if some suspect Law of Nature actually obeys Lorentz invariance.
Left- and right-hand side of an equation must transform the same way, and this is guaranteed
if they are written as vectors or tensors with Lorentz indices always transforming
as follows:

$$ (X^{\mu\nu\dots}{}_{\alpha\beta\dots})' = L^\mu{}_\kappa\, L^\nu{}_\lambda \cdots L^\alpha{}_\gamma\, L^\beta{}_\delta \cdots X^{\kappa\lambda\dots}{}_{\gamma\delta\dots} . \qquad (1.14) $$

Note that this transformation rule is just as if we were dealing with products of vectors
$X^\mu Y^\nu$, etc. Quantities transforming as in Eq. (1.14) are called tensors. Due to the
orthogonality (1.4) of $L^\mu{}_\nu$ one can multiply and contract tensors covariantly, e.g.:

$$ X^\mu = Y^{\mu\alpha} Z^{\alpha\beta\beta} \qquad (1.15) $$


is a "tensor" (a tensor with just one index is called a "vector"), if Y and Z are tensors. 
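The relativistically covariant form of Maxwell's equations is:

$$ \partial_\mu F_{\mu\nu} = -J_\nu ; \qquad (1.16) $$
$$ \partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha} + \partial_\gamma F_{\alpha\beta} = 0 ; \qquad (1.17) $$
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu , \qquad (1.18) $$
$$ \partial_\mu J_\mu = 0 . \qquad (1.19) $$

Here $\partial_\mu$ stands for $\partial/\partial x^\mu$, and the current four-vector $J_\mu$ is defined as $J_\mu(x) =
(\vec j(x), ic\rho(x))$, in units where $\mu_0$ and $\varepsilon_0$ have been normalized to one. A special tensor
is $\varepsilon_{\mu\nu\alpha\beta}$, which is defined by

$$ \varepsilon_{1234} = 1 ; $$
$$ \varepsilon_{\mu\nu\alpha\beta} = \varepsilon_{\mu\alpha\beta\nu} = -\varepsilon_{\nu\mu\alpha\beta} ; $$
$$ \varepsilon_{\mu\nu\alpha\beta} = 0 \ \text{if any two of its indices are equal.} \qquad (1.20) $$

This tensor is invariant under the set of homogeneous Lorentz transformations, in fact for
all Lorentz transformations $L^\mu{}_\nu$ with det(L) = 1. One can rewrite Eq. (1.17) as

$$ \varepsilon_{\mu\nu\alpha\beta}\, \partial_\nu F_{\alpha\beta} = 0 . \qquad (1.21) $$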

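A particle with mass m and electric charge q moves along a curve $x^\mu(s)$, where s runs
from $-\infty$ to $+\infty$, with

$$ (\partial_s x^\mu)^2 = -1 ; \qquad (1.22) $$
$$ m\, \partial_s^2 x^\mu = q\, F_{\mu\nu}\, \partial_s x^\nu . \qquad (1.23) $$

The tensor $T^{\rm em}_{\mu\nu}$ defined by^1

$$ T^{\rm em}_{\mu\nu} = T^{\rm em}_{\nu\mu} = F_{\mu\lambda} F_{\lambda\nu} + \tfrac14\, \delta_{\mu\nu}\, F_{\lambda\sigma} F_{\lambda\sigma} , \qquad (1.24) $$

describes the energy density, momentum density and mechanical tension of the fields $F_{\alpha\beta}$.
In particular the energy density is

$$ T^{\rm em}_{44} = -\tfrac12 F_{4i}^2 + \tfrac14 F_{ij} F_{ij} = \tfrac12 (\vec E^2 + \vec B^2) , \qquad (1.25) $$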
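The energy density formula can be verified by building the field tensor explicitly and evaluating the definition (1.24). This is a sketch under the assumption that the components are laid out as $F_{ij} = \varepsilon_{ijk} B_k$ and $F_{4i} = iE_i$, in units with c = 1; the helper names and the sample field values are illustrative only.

```python
def field_tensor(E, B):
    """F_{mu nu} with x^4 = i t (assumed layout): F_ij = eps_ijk B_k, F_4i = i E_i."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [
        [0, B3, -B2, -1j * E1],
        [-B3, 0, B1, -1j * E2],
        [B2, -B1, 0, -1j * E3],
        [1j * E1, 1j * E2, 1j * E3, 0],
    ]

def stress_tensor(F):
    """T_{mu nu} = F_{mu l} F_{l nu} + (1/4) delta_{mu nu} F_{l s} F_{l s}, Eq. (1.24)."""
    n = 4
    FF = sum(F[l][s] * F[l][s] for l in range(n) for s in range(n))
    return [
        [
            sum(F[m][l] * F[l][v] for l in range(n)) + (0.25 * FF if m == v else 0)
            for v in range(n)
        ]
        for m in range(n)
    ]

E = (0.3, -1.1, 0.7)    # sample electric field components
B = (2.0, 0.4, -0.5)    # sample magnetic field components
T = stress_tensor(field_tensor(E, B))

energy_density = 0.5 * (sum(e * e for e in E) + sum(b * b for b in B))
print("T_44            :", T[3][3])
print("(E^2 + B^2) / 2 :", energy_density)
print("trace of T      :", sum(T[m][m] for m in range(4)))
```

Besides reproducing (1.25), the same computation shows that this tensor is symmetric and traceless, as follows from the antisymmetry of $F_{\mu\nu}$.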

where we remind the reader that Latin indices $i, j, \dots$ only take the values 1, 2 and 3.
Energy and momentum conservation implies that, if at any given space-time point x,
we add the contributions of all fields and particles to $T_{\mu\nu}(x)$, then for this total
energy-momentum tensor, we have

$$ \partial_\mu T_{\mu\nu} = 0 . \qquad (1.26) $$

The equation $\partial_0 T_{44} = -\partial_i T_{i0}$ may be regarded as a continuity equation, and so one
must regard the vector $T_{i0}$ as the energy current. It is also the momentum density, and,
in the case of electro-magnetism, it is usually called the Poynting vector. In turn, it
obeys the equation $\partial_0 T_{i0} = \partial_j T_{ij}$, so that $-T_{ij}$ can be regarded as the momentum flow.
However, the time derivative of the momentum is always equal to the force acting on a
system, and therefore, $T_{ij}$ can be seen as the force density, or more precisely: the tension,
or the force $F_i$ through a unit surface in the direction j. In a neutral gas with pressure
p, we have

$$ T_{ij} = -p\, \delta_{ij} . \qquad (1.27) $$

^1 N.B. Sometimes $T_{\mu\nu}$ is defined in different units, so that extra factors $4\pi$ appear in the denominator.

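2. The Eötvös experiments and the Equivalence Principle.

Suppose that objects made of different kinds of material would react slightly differently
to the presence of a gravitational field $\vec g$, by having not exactly the same constant of
proportionality between gravitational mass and inertial mass:

$$ \vec F^{(1)} = M^{(1)}_{\rm inert}\, \vec a^{(1)} = M^{(1)}_{\rm grav}\, \vec g , $$
$$ \vec F^{(2)} = M^{(2)}_{\rm inert}\, \vec a^{(2)} = M^{(2)}_{\rm grav}\, \vec g ; $$
$$ \vec a^{(2)} = \frac{M^{(2)}_{\rm grav}}{M^{(2)}_{\rm inert}}\, \vec g \ne \frac{M^{(1)}_{\rm grav}}{M^{(1)}_{\rm inert}}\, \vec g = \vec a^{(1)} . \qquad (2.1) $$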

These objects would show different accelerations a and this would lead to effects that 
can be detected very accurately. In a space ship, the acceleration would be determined 
by the material the space ship is made of; any other kind of material would be accelerated
differently, and the relative acceleration would be experienced as a weak residual
gravitational force. On earth we can also do such experiments. Consider for example a 
rotating platform with a parabolic surface. A spherical object would be pulled to the 
center by the earth's gravitational force but pushed to the rim by the centrifugal counter 
forces of the circular motion. If these two forces just balance out, the object could find 
stable positions anywhere on the surface, but an object made of different material could 
still feel a residual force. 

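Actually the Earth itself is such a rotating platform, and this enabled the Hungarian
baron Loránd Eötvös to check extremely accurately the equivalence between inertial mass
and gravitational mass (the "Equivalence Principle"). The gravitational force on an object
on the Earth's surface is

$$ \vec F_g = -G_N M_\oplus M_{\rm grav}\, \frac{\vec r}{r^3} , \qquad (2.2) $$

where $G_N$ is Newton's constant of gravity, and $M_\oplus$ is the Earth's mass. The centrifugal
force is

$$ \vec F_\omega = M_{\rm inert}\, \omega^2\, \vec r_{\rm axis} , \qquad (2.3) $$

where $\omega$ is the Earth's angular velocity and

$$ \vec r_{\rm axis} = \vec r - \frac{(\vec\omega \cdot \vec r)\, \vec\omega}{\omega^2} \qquad (2.4) $$

is the distance from the Earth's rotational axis. The combined force an object (i) feels
on the surface is $\vec F^{(i)} = \vec F_g^{(i)} + \vec F_\omega^{(i)}$. If for two objects, (1) and (2), these forces, $\vec F^{(1)}$
and $\vec F^{(2)}$, are not exactly parallel, one could measure

$$ \alpha = \frac{|\vec F^{(1)} \wedge \vec F^{(2)}|}{|\vec F^{(1)}|\,|\vec F^{(2)}|} \approx \left| \frac{M^{(1)}_{\rm inert}}{M^{(1)}_{\rm grav}} - \frac{M^{(2)}_{\rm inert}}{M^{(2)}_{\rm grav}} \right| \frac{|\vec r \wedge (\vec\omega \cdot \vec r)\, \vec\omega|\, r}{G_N M_\oplus} , \qquad (2.5) $$

where we assumed that the gravitational force is much stronger than the centrifugal one.
Actually, for the Earth we have:

$$ \frac{G_N M_\oplus}{\omega^2 r^3} \approx 300 . \qquad (2.6) $$

From (2.5) we see that the misalignment $\alpha$ is given by

$$ \alpha \approx (1/300) \cos\theta \sin\theta \left| \frac{M^{(1)}_{\rm inert}}{M^{(1)}_{\rm grav}} - \frac{M^{(2)}_{\rm inert}}{M^{(2)}_{\rm grav}} \right| , \qquad (2.7) $$

where $\theta$ is the latitude of the laboratory in Hungary, fortunately sufficiently far from
both the North Pole and the Equator.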
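The order of magnitude quoted in (2.6) is easy to reproduce. The sketch below uses commonly quoted reference values for the Earth (the product of Newton's constant and the Earth's mass, the sidereal rotation rate and the mean radius); these numbers are assumptions brought in for illustration and do not come from the notes, and the function name is hypothetical.

```python
import math

GM_EARTH = 3.986004e14   # m^3 / s^2, Newton's constant times the Earth's mass (assumed value)
OMEGA = 7.2921e-5        # rad / s, sidereal rotation rate of the Earth (assumed value)
R_EARTH = 6.371e6        # m, mean radius of the Earth (assumed value)

# Ratio of gravitational to centrifugal acceleration scale, Eq. (2.6).
ratio = GM_EARTH / (OMEGA ** 2 * R_EARTH ** 3)

def misalignment(latitude_deg, delta):
    """Angle alpha of Eq. (2.7), for a given |difference of M_inert / M_grav|."""
    theta = math.radians(latitude_deg)
    return abs(delta) * math.cos(theta) * math.sin(theta) / ratio

print("G M / (omega^2 r^3) =", ratio)
print("alpha at 47.5 deg latitude for delta = 1e-9:", misalignment(47.5, 1e-9))
```

With these inputs the ratio comes out slightly below 300, consistent with the rounded figure in (2.6). The factor $\cos\theta \sin\theta$ vanishes at the poles and at the equator and is largest at 45 degrees latitude.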

Eötvös found no such effect, reaching an accuracy of about one part in $10^9$ for the
equivalence principle. By observing that the Earth also revolves around the Sun one can
repeat the experiment using the Sun's gravitational field. The advantage one then has
is that the effect one searches for fluctuates daily. This was R.H. Dicke's experiment,
in which he established an accuracy of one part in $10^{11}$. There are plans to launch a
dedicated satellite named STEP (Satellite Test of the Equivalence Principle), to check
the equivalence principle with an accuracy of one part in $10^{17}$. One expects that there
will be no observable deviation. In any case it will be important to formulate a theory
of the gravitational force in which the equivalence principle is postulated to hold exactly.
Since Special Relativity is also a theory from which no deviations have ever been detected,
it is natural to ask for our theory of the gravitational force also to obey the postulates of
special relativity. The theory resulting from combining these two demands is the topic of
these lectures.

3. The constantly accelerated elevator. Rindler Space. 

The equivalence principle implies a new symmetry and associated invariance. The realization
of this symmetry and its subsequent exploitation will enable us to give a unique
formulation of this gravity theory. This solution was first discovered by Einstein in 1915.
We will now describe the modern ways to construct it.

Consider an idealized "elevator", that can make any kind of vertical movement,
including a free fall. When it makes a free fall, all objects inside it will be accelerated
equally, according to the Equivalence Principle. This means that during the time the
elevator makes a free fall, its inhabitants will not experience any gravitational field at all;
they are weightless.^2

Conversely, we can consider a similar elevator in outer space, far away from any star or
planet. Now give it a constant acceleration upward. All inhabitants will feel the pressure
from the floor, just as if they were living in the gravitational field of the Earth or any other
planet. Thus, we can construct an "artificial" gravitational field. Let us consider such
an artificial gravitational field more closely. Suppose we want this artificial gravitational
field to be constant in space^3 and time. The inhabitants will feel a constant acceleration.

An essential ingredient in relativity theory is the notion of a coordinate grid. So let us
introduce a coordinate grid $\xi^\mu$, $\mu = 1, \dots, 4$, inside the elevator, such that points on its
walls in the x-direction are given by $\xi^1$ = constant, the two other walls are given by $\xi^2$ =
constant, and the floor and the ceiling by $\xi^3$ = constant. The fourth coordinate, $\xi^4$, is
i times the time as measured from the inside of the elevator. An observer in outer space
uses a Cartesian grid (inertial frame) $x^\mu$ there. The motion of the elevator is described
by the functions $x^\mu(\xi)$. Let the origin of the $\xi$ coordinates be a point in the middle of
the floor of the elevator, and let it coincide with the origin of the x coordinates. Suppose
that we know the acceleration $\vec g$ as experienced by the inhabitants of the elevator. How
do we determine the functions $x^\mu(\xi)$?

We must assume that $\vec g = (0, 0, g)$, and that $g(\tau) = g$ is constant. We assumed that
at $\tau = 0$ the $\xi$ and x coordinates coincide, so

$$ \begin{pmatrix} \vec x(\vec\xi, 0) \\ 0 \end{pmatrix} = \begin{pmatrix} \vec\xi \\ 0 \end{pmatrix} . \qquad (3.1) $$

Now consider an infinitesimal time lapse, $d\tau$. After that, the elevator has a velocity
$\vec v = \vec g\, d\tau$. The middle of the floor of the elevator is now at

$$ \begin{pmatrix} \vec x \\ it \end{pmatrix} (\vec 0, i\,d\tau) = \begin{pmatrix} \vec 0 \\ i\,d\tau \end{pmatrix} \qquad (3.2) $$

(ignoring terms of order $d\tau^2$), but the inhabitants of the elevator will see all other points
Lorentz transformed, since they have velocity $\vec v$. The Lorentz transformation matrix is
only infinitesimally different from the identity matrix:

$$ I + \delta L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -ig\,d\tau \\ 0 & 0 & ig\,d\tau & 1 \end{pmatrix} . \qquad (3.3) $$

Therefore, the other points $(\vec\xi, i\,d\tau)$ will be seen at the coordinates $(\vec x, it)$ given by

$$ \begin{pmatrix} \vec x \\ it \end{pmatrix} - \begin{pmatrix} \vec 0 \\ i\,d\tau \end{pmatrix} = (I + \delta L) \begin{pmatrix} \vec\xi \\ 0 \end{pmatrix} . \qquad (3.4) $$

^2 Actually, objects in different locations inside the elevator might be inclined to fall in slightly different
directions, with different speeds, because the Earth's gravitational field varies slightly from place to place.
This must be ignored. As soon as situations might arise in which this effect is important, our idealized elevator
must be chosen to be smaller. One might want to choose it to be as small as a subatomic particle, but
then quantum effects will compound our arguments, so this is not allowed. Clearly therefore, the theory
we are dealing with will have limited accuracy. Theorists hope to be able to overcome this difficulty by
formulating "quantum gravity", but this is way beyond the scope of these lectures.

^3 We shall discover shortly, however, that the field we arrive at is constant in the x, y and t direction,
but not constant in the direction of the field itself, the z direction.

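Now, we perform a little trick. Eq. (3.4) is a Poincaré transformation, that is, a
combination of a Lorentz transformation and a translation in time. In many instances (but
not always), a Poincaré transformation can be rewritten as a pure Lorentz transformation
with respect to a carefully chosen reference point as the origin. Here, we can find such a
reference point:

$$ A^\mu = (0, 0, -1/g, 0) , \qquad (3.5) $$

by observing that

$$ \begin{pmatrix} \vec 0 \\ i\,d\tau \end{pmatrix} = \delta L \begin{pmatrix} \vec g / g^2 \\ 0 \end{pmatrix} , \qquad (3.6) $$

so that, at $t = d\tau$,

$$ \begin{pmatrix} \vec x - \vec A \\ it \end{pmatrix} = (I + \delta L) \begin{pmatrix} \vec\xi - \vec A \\ 0 \end{pmatrix} . \qquad (3.7) $$

It is important to see what this equation means: after an infinitesimal lapse of time $d\tau$
inside the elevator, the coordinates $(\vec x, it)$ are obtained from the previous set by means
of an infinitesimal Lorentz transformation with the point $x^\mu = A^\mu$ as its origin. The
inhabitants of the elevator can identify this point. Now consider another lapse of time
$d\tau$. Since the elevator is assumed to feel a constant acceleration, the new position can
then again be obtained from the old one by means of the same Lorentz transformation.
So, at time $\tau = N d\tau$, the coordinates $(\vec x, it)$ are given by

$$ \begin{pmatrix} \vec x + \vec g / g^2 \\ it \end{pmatrix} = (I + \delta L)^N \begin{pmatrix} \vec\xi + \vec g / g^2 \\ 0 \end{pmatrix} . \qquad (3.8) $$

All that remains to be done is compute $(I + \delta L)^N$. This is not hard:

$$ \tau = N d\tau , \qquad L(\tau) = (I + \delta L)^N ; \qquad L(\tau + d\tau) = (I + \delta L)\, L(\tau) ; \qquad (3.9) $$

$$ \delta L = \begin{pmatrix} 0 & & & \\ & 0 & & \\ & & 0 & -ig \\ & & ig & 0 \end{pmatrix} d\tau ; \qquad L(\tau) = \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & A(\tau) & -iB(\tau) \\ & & iB(\tau) & A(\tau) \end{pmatrix} . \qquad (3.10) $$

$$ dA/d\tau = gB , \quad dB/d\tau = gA ; \qquad A(0) = 1 , \quad B(0) = 0 ; \qquad A = \cosh(g\tau) , \quad B = \sinh(g\tau) . \qquad (3.11) $$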
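The step from the infinitesimal transformation to the finite one can also be checked by brute force: multiply many copies of $I + \delta L$ together and compare with the hyperbolic functions. This is a rough numerical sketch restricted to the (3, 4) block of the matrices; the values of g, $\tau$ and N, and the helper names, are arbitrary illustrative choices.

```python
import math

def matmul2(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def compounded_boost(g, tau, N):
    """(I + dL)^N on the (3, 4) block, with d tau = tau / N."""
    dtau = tau / N
    step = [[1, -1j * g * dtau],
            [1j * g * dtau, 1]]
    L = [[1, 0], [0, 1]]
    for _ in range(N):
        L = matmul2(step, L)    # L(tau + d tau) = (I + dL) L(tau)
    return L

g, tau = 1.3, 0.8
L = compounded_boost(g, tau, 100_000)

A, B = math.cosh(g * tau), math.sinh(g * tau)
exact = [[A, -1j * B], [1j * B, A]]
err = max(abs(L[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print("largest deviation from the cosh/sinh form:", err)
```

The deviation shrinks roughly in proportion to 1/N, as expected for a product of first-order steps.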



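Combining all this, we derive

$$ x^\mu(\vec\xi, i\tau) = \begin{pmatrix} \xi^1 \\ \xi^2 \\ \cosh(g\tau) \left( \xi^3 + \frac{1}{g} \right) - \frac{1}{g} \\ i \sinh(g\tau) \left( \xi^3 + \frac{1}{g} \right) \end{pmatrix} . \qquad (3.12) $$

Figure 1: Rindler Space. The curved solid line represents the floor of the
elevator, $\xi^3 = 0$. A signal emitted from point a can never be received by an
inhabitant of Rindler Space, who lives in the quadrant at the right.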

The 3, 4 components of the $\xi$ coordinates, imbedded in the x coordinates, are pictured
in Fig. 1. The description of a quadrant of space-time in terms of the $\xi$ coordinates
is called "Rindler space". From Eq. (3.12) it should be clear that an observer inside the
elevator feels no effects that depend explicitly on his time coordinate $\tau$, since a transition
from $\tau$ to $\tau'$ is nothing but a Lorentz transformation. We also notice some important
effects:

(i) We see that the equal $\tau$ lines converge at the left. It follows that the local clock
speed, which is given by $\varrho = \sqrt{-(\partial x^\mu / \partial\tau)^2}$, varies with height $\xi^3$:

$$ \varrho = 1 + g\, \xi^3 . \qquad (3.13) $$

(ii) The gravitational field strength felt locally is $\varrho^{-2}\, \vec g(\xi)$, which is inversely proportional
to the distance to the point $x^\mu = A^\mu$. So even though our field is constant
in the transverse direction and with time, it decreases with height.

(iii) The region of space-time described by the observer in the elevator is only part of
all of space-time (the quadrant at the right in Fig. 1, where $x^3 + 1/g > |x^0|$). The
boundary lines are called (past and future) horizons.


All these are typically relativistic effects. In the non-relativistic limit ($g \to 0$) Eq. (3.12)
simply becomes:

$$ x^3 = \xi^3 + \tfrac12 g \tau^2 ; \qquad x^4 = i\tau = \xi^4 . \qquad (3.14) $$

According to the equivalence principle the relativistic effects we discovered here should
also be features of gravitational fields generated by matter. Let us inspect them one by
one.

Observation (i) suggests that clocks will run slower if they are deep down a gravitational
field. Indeed one may suspect that Eq. (3.13) generalizes into

$$ \varrho = 1 + V(x) , \qquad (3.15) $$

where V(x) is the gravitational potential. Indeed this will turn out to be true, provided
that the gravitational field is stationary. This effect is called the gravitational red shift.

(ii) is also a relativistic effect. It could have been predicted by the following argument.
The energy density of a gravitational field is negative. Since the energy of two masses $M_1$
and $M_2$ at a distance r apart is $E = -G_N M_1 M_2 / r$ we can calculate the energy density
of a field $\vec g$ as $T_{44} = -(1/8\pi G_N)\, \vec g^{\,2}$. Since we had normalized c = 1 this is also its mass
density. But then this mass density in turn should generate a gravitational field! This
would imply^4

$$ \vec\partial \cdot \vec g \overset{?}{=} 4\pi G_N T_{44} = -\tfrac12\, \vec g^{\,2} , \qquad (3.16) $$

so that indeed the field strength should decrease with height. However this reasoning is
apparently too simplistic, since our field obeys a differential equation as Eq. (3.16) but
without the coefficient $\tfrac12$.
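The last remark can be made explicit in one line. Writing f for the locally felt field strength, and assuming (as observation (ii) suggests) that it is inversely proportional to the distance from the point $\xi^3 = -1/g$ and equal to g at the floor $\xi^3 = 0$, one finds:

```latex
f(\xi^3) = \frac{g}{1 + g\,\xi^3} ,
\qquad
\frac{\mathrm{d}f}{\mathrm{d}\xi^3}
  = -\frac{g^2}{(1 + g\,\xi^3)^2}
  = -f^2 .
```

This has the same form as Eq. (3.16), but with coefficient 1 instead of $\tfrac12$ on the right-hand side. The normalization f(0) = g is an assumption made here for the illustration.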

The possible emergence of horizons, our observation (iii), will turn out to be a very 
important new feature of gravitational fields. Under normal circumstances of course the 
fields are so weak that no horizon will be seen, but gravitational collapse may produce 
horizons. If this happens there will be regions in space-time from which no signals can 
be observed. In Fig. 1 we see that signals from a radio station at the point a will never 
reach an observer in Rindler space.

The most important conclusion to be drawn from this chapter is that in order to
describe a gravitational field one may have to perform a transformation from the coordinates
$\xi^\mu$ that were used inside the elevator where one feels the gravitational field,
towards coordinates $x^\mu$ that describe empty space-time, in which freely falling objects
move along straight lines. Now we know that in an empty space without gravitational
fields the clock speeds, and the lengths of rulers, are described by a distance function $\sigma$
as given in Eq. (1.3). We can rewrite it as

$$ d\sigma^2 = g_{\mu\nu}\, dx^\mu dx^\nu ; \qquad g_{\mu\nu} = {\rm diag}(1, 1, 1, 1) . \qquad (3.17) $$

We wrote here $d\sigma$ and $dx^\mu$ to indicate that we look at the infinitesimal distance between
two points close together in space-time. In terms of the coordinates $\xi^\mu$ appropriate for
the elevator we have for infinitesimal displacements $d\xi^\mu$,

$$ dx^3 = \cosh(g\tau)\, d\xi^3 + (1 + g\xi^3) \sinh(g\tau)\, d\tau , $$
$$ dx^4 = i \sinh(g\tau)\, d\xi^3 + i\, (1 + g\xi^3) \cosh(g\tau)\, d\tau , \qquad (3.18) $$

implying

$$ d\sigma^2 = -(1 + g\xi^3)^2\, d\tau^2 + (d\vec\xi)^2 . \qquad (3.19) $$

If we write this as

$$ d\sigma^2 = g_{\mu\nu}(\xi)\, d\xi^\mu d\xi^\nu = (d\vec\xi)^2 + (1 + g\xi^3)^2 (d\xi^4)^2 , \qquad (3.20) $$

then we see that all effects that gravitational fields have on rulers and clocks can be
described in terms of a space (and time) dependent field $g_{\mu\nu}(\xi)$. Only in the gravitational
field of a Rindler space can one find coordinates $x^\mu$ such that in terms of these the
function $g_{\mu\nu}$ takes the simple form of Eq. (3.17). We will see that $g_{\mu\nu}(\xi)$ is all we need
to describe the gravitational field completely.

Spaces in which the infinitesimal distance $d\sigma$ is described by a space (time) dependent
function $g_{\mu\nu}(\xi)$ are called curved or Riemann spaces. Space-time is a Riemann space. We
will now investigate such spaces more systematically.

^4 Temporarily we do not show the minus sign usually inserted to indicate that the field is pointed
downward.
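The Rindler form of the distance function can be checked numerically from the coordinate transformation (3.12) alone, by taking a small displacement in the elevator coordinates and measuring it with the flat expression. This is a sketch with c = 1; the real time t is used instead of $x^4 = it$, so the flat interval reads $(dx^3)^2 - dt^2$. All function names and sample values are illustrative assumptions.

```python
import math

def x_of_xi(xi3, tau, g):
    """(x^3, t) from Eq. (3.12); since x^4 = i t, the real time t is returned."""
    x3 = math.cosh(g * tau) * (xi3 + 1 / g) - 1 / g
    t = math.sinh(g * tau) * (xi3 + 1 / g)
    return x3, t

def interval_flat(xi3, tau, dxi3, dtau, g):
    """d sigma^2 = (dx^3)^2 - dt^2, evaluated by differencing Eq. (3.12)."""
    x3_a, t_a = x_of_xi(xi3, tau, g)
    x3_b, t_b = x_of_xi(xi3 + dxi3, tau + dtau, g)
    return (x3_b - x3_a) ** 2 - (t_b - t_a) ** 2

def interval_rindler(xi3, dxi3, dtau, g):
    """Right-hand side of Eq. (3.19) for a displacement in xi^3 and tau only."""
    return dxi3 ** 2 - (1 + g * xi3) ** 2 * dtau ** 2

g, xi3, tau = 2.0, 0.5, 0.3
dxi3, dtau = 1e-4, 2e-4

flat = interval_flat(xi3, tau, dxi3, dtau, g)
rindler = interval_rindler(xi3, dxi3, dtau, g)
print("flat expression   :", flat)
print("Rindler expression:", rindler)

# Clock rate at fixed height: sqrt(-d sigma^2) / d tau should approach 1 + g xi^3.
clock_rate = math.sqrt(-interval_flat(xi3, tau, 0.0, dtau, g)) / dtau
print("clock rate:", clock_rate, " expected:", 1 + g * xi3)
```

The two expressions agree up to terms of third order in the displacements, and the result does not depend on the value of $\tau$ at which the check is made, in line with the remark that a shift in $\tau$ is just a Lorentz transformation.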

4. Curved coordinates. 

Eq. (3.12) is a special case of a coordinate transformation relevant for inspecting the 
Equivalence Principle for gravitational fields. It is not a Lorentz transformation since 
it is not linear in $\tau$. We see in Fig. 1 that the $\xi^\mu$ coordinates are curved. The empty
space coordinates could be called "straight" because in terms of them all particles move in 
straight lines. However, such a straight coordinate frame will only exist if the gravitational 
field has the same Rindler form everywhere, whereas in the vicinity of stars and planets 
it takes much more complicated forms. 

But in the latter case we can also use the Equivalence Principle: the laws of gravity 
should be formulated in such a way that any coordinate frame that uniquely describes the 
points in our four-dimensional space-time can be used in principle. None of these frames 
will be superior to any of the others since in any of these frames one will feel some sort of 
gravitational field^5. Let us start with just one choice of coordinates $x^\mu = (t, x, y, z)$.
From this chapter onwards it will no longer be useful to keep the factor i in the time
component because it doesn't simplify things. It has become convention to define $x^0 = t$
and drop the $x^4$ which was it. So now $\mu$ runs from 0 to 3. It will be of importance now
that the indices for the coordinates be indicated as superscripts ${}^\mu$, ${}^\nu$.

^5 There will be some limitations in the sense of continuity and differentiability as we will see.


Let there now be some one-to-one mapping onto another set of coordinates $u^\mu$,

$$ u^\mu \Leftrightarrow x^\mu ; \qquad x = x(u) . \qquad (4.1) $$

Quantities depending on these coordinates will simply be called "fields". A scalar field $\phi$
is a quantity that depends on x but does not undergo further transformations, so that
in the new coordinate frame (we distinguish the functions of the new coordinates u from
the functions of x by using the tilde, ~ )

$$ \phi = \tilde\phi(u) = \phi(x(u)) . \qquad (4.2) $$

Now define the gradient (and note that we use a subscript index)

$$ \phi_\mu(x) = \left. \frac{\partial}{\partial x^\mu} \phi(x) \right|_{x^\nu\ {\rm constant,\ for}\ \nu \ne \mu} . \qquad (4.3) $$

Remember that the partial derivative is defined by using an infinitesimal displacement $dx^\mu$,

$$ \phi(x + dx) = \phi(x) + \phi_\mu\, dx^\mu + O(dx^2) . \qquad (4.4) $$

We derive

$$ \tilde\phi(u + du) = \tilde\phi(u) + \frac{\partial x^\mu}{\partial u^\nu}\, \phi_\mu\, du^\nu + O(du^2) = \tilde\phi(u) + \tilde\phi_\nu(u)\, du^\nu . \qquad (4.5) $$

Therefore in the new coordinate frame the gradient is

$$ \tilde\phi_\nu(u) = x^\mu{}_{,\nu}\, \phi_\mu(x(u)) , \qquad (4.6) $$

where we use the notation

$$ x^\mu{}_{,\nu} \overset{\rm def}{=} \left. \frac{\partial}{\partial u^\nu} x^\mu(u) \right|_{u^{\alpha \ne \nu}\ {\rm constant}} , \qquad (4.7) $$

so the comma denotes partial derivation.

Notice that in all these equations superscript indices and subscript indices always
keep their position and they are used in such a way that in the summation convention
one subscript and one superscript occur:

$$ \sum_\mu (\dots)^\mu (\dots)_\mu $$

Of course one can transform back from the x to the u coordinates:

$$ \phi_\mu(x) = u^\nu{}_{,\mu}\, \tilde\phi_\nu(u(x)) . \qquad (4.8) $$

Indeed,

$$ u^\nu{}_{,\mu}\, x^\mu{}_{,\alpha} = \delta^\nu{}_\alpha , \qquad (4.9) $$

(the matrix $u^\nu{}_{,\mu}$ is the inverse of $x^\mu{}_{,\alpha}$). A special case would be if the matrix $x^\mu{}_{,\alpha}$ would
be an element of the Lorentz group. The Lorentz group is just a subgroup of the much
larger set of coordinate transformations considered here. We see that $\phi_\mu(x)$ transforms
as a vector. All fields $A_\mu(x)$ that transform just like the gradients $\phi_\mu(x)$, that is,

$$ \tilde A_\nu(u) = x^\mu{}_{,\nu}\, A_\mu(x(u)) , \qquad (4.10) $$

will be called covariant vector fields, co-vector for short, even if they cannot be written
as the gradient of a scalar field.

Note that the product of a scalar field $\phi$ and a co-vector $A_\mu$ transforms again as a
co-vector:

$$ B_\mu = \phi A_\mu ; $$
$$ \tilde B_\nu(u) = \tilde\phi(u)\, \tilde A_\nu(u) = \phi(x(u))\, x^\mu{}_{,\nu}\, A_\mu(x(u)) = x^\mu{}_{,\nu}\, B_\mu(x(u)) . \qquad (4.11) $$

Now consider the direct product $B_{\mu\nu} = A^{(1)}_\mu A^{(2)}_\nu$. It transforms as follows:

$$ \tilde B_{\mu\nu}(u) = x^\alpha{}_{,\mu}\, x^\beta{}_{,\nu}\, B_{\alpha\beta}(x(u)) . \qquad (4.12) $$

A collection of field components that can be characterized with a certain number of indices
$\mu, \nu, \dots$ and that transforms according to (4.12) is called a covariant tensor.

Warning: In a tensor such as $B_{\mu\nu}$ one may not sum over repeated indices to obtain a
scalar field. This is because the matrices $x^\alpha{}_{,\mu}$ in general do not obey the orthogonality
conditions (1.4) of the Lorentz transformations $L^\alpha{}_\mu$. One is not advised to sum over
two repeated subscript indices. Nevertheless we would like to formulate things such as
Maxwell's equations in General Relativity, and there of course inner products of vectors do
occur. To enable us to do this we introduce another type of vectors: the so-called contravariant
vectors and tensors. Since a contravariant vector transforms differently from a
covariant vector we have to indicate this somehow. This we do by putting its indices
upstairs: $F^\mu(x)$. The transformation rule for such a superscript index is postulated to
be

$$ \tilde F^\mu(u) = u^\mu{}_{,\alpha}\, F^\alpha(x(u)) , \qquad (4.13) $$

as opposed to the rules (4.10), (4.12) for subscript indices; and contravariant tensors
$F^{\mu\nu\alpha\dots}$ transform as products

$$ F^{(1)\mu}\, F^{(2)\nu}\, F^{(3)\alpha} \dots . \qquad (4.14) $$

We will also see mixed tensors having both upper (superscript) and lower (subscript) 
indices. They transform as the corresponding products. 

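Exercise: check that the transformation rules (4.10) and (4.13) form groups, i.e. the
transformation $x \to u$ yields the same tensor as the sequence $x \to v \to u$. Make
use of the fact that partial differentiation obeys

$$ \frac{\partial x^\mu}{\partial u^\nu} = \frac{\partial x^\mu}{\partial v^\alpha}\, \frac{\partial v^\alpha}{\partial u^\nu} . \qquad (4.15) $$

Summation over repeated indices is admitted if one of the indices is a superscript and one
is a subscript:

$$ \tilde F^\mu(u)\, \tilde A_\mu(u) = u^\mu{}_{,\alpha}\, F^\alpha(x(u))\, x^\beta{}_{,\mu}\, A_\beta(x(u)) , \qquad (4.16) $$

and since the matrix $u^\nu{}_{,\alpha}$ is the inverse of $x^\beta{}_{,\mu}$ (according to 4.9), we have

$$ u^\mu{}_{,\alpha}\, x^\beta{}_{,\mu} = \delta^\beta{}_\alpha , \qquad (4.17) $$

so that the product $F^\mu A_\mu$ indeed transforms as a scalar:

$$ \tilde F^\mu(u)\, \tilde A_\mu(u) = F^\alpha(x(u))\, A_\alpha(x(u)) . \qquad (4.18) $$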

Note that since the summation convention makes us sum over repeated indices with the 
same name, we must ensure in formulae such as (4.16) that indices not summed over are 
each given a different name. 

We recognize that in Eqs. (4.4) and (4.5) the infinitesimal displacement $dx^\mu$ of a
coordinate transforms as a contravariant vector. This is why coordinates are given
superscript indices. Eq. (4.17) also tells us that the Kronecker delta symbol (provided it has
one subscript and one superscript index) is an invariant tensor: it has the same form in
all coordinate grids.
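As a small numerical illustration of (4.16)-(4.18), the following Python sketch takes the plane with Cartesian coordinates x and polar coordinates u = (r, phi) as an assumed example, and compares the contraction of a contravariant with a covariant vector in both frames. The function names and the sample components are chosen here for illustration only; they are not part of these notes.

```python
import math

def jacobians(r, phi):
    """Jacobian matrices between x = (r cos phi, r sin phi) and u = (r, phi)."""
    c, s = math.cos(phi), math.sin(phi)
    dx_du = [[c, -r * s],      # dx_du[a][m] = d x^a / d u^m
             [s, r * c]]
    du_dx = [[c, s],           # du_dx[m][a] = d u^m / d x^a
             [-s / r, c / r]]
    return dx_du, du_dx

def transform_contravariant(F, du_dx):
    """Rule (4.13): F~^m = (d u^m / d x^a) F^a."""
    return [sum(du_dx[m][a] * F[a] for a in range(2)) for m in range(2)]

def transform_covariant(A, dx_du):
    """Rule for subscript indices: A~_m = (d x^a / d u^m) A_a."""
    return [sum(dx_du[a][m] * A[a] for a in range(2)) for m in range(2)]

def contract(first, second):
    """Plain sum over a repeated index."""
    return sum(p * q for p, q in zip(first, second))

dx_du, du_dx = jacobians(2.0, 0.7)
F = [1.5, 0.7]     # contravariant components in the x frame (sample values)
A = [0.4, 2.0]     # covariant components in the x frame (sample values)
F_u = transform_contravariant(F, du_dx)
A_u = transform_covariant(A, dx_du)
scalar_x = contract(F, A)       # F^mu A_mu in the x frame
scalar_u = contract(F_u, A_u)   # the same contraction in the u frame
print(scalar_x, scalar_u)
```

The two printed numbers agree up to rounding. Summing instead over the components of two covariant vectors, without a superscript index, gives different numbers in the two frames, which is the content of the warning at the start of this passage.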

Gradients of tensors 

The gradient of a scalar field $\phi$ transforms as a covariant vector. Are gradients of
covariant vectors and tensors again covariant tensors? Unfortunately no. Let us from
now on indicate partial differentiation $\partial/\partial x^\mu$ simply as $\partial_\mu$. Sometimes we will use an even shorter
notation:

$\frac{\partial}{\partial x^\mu}\,\phi = \partial_\mu \phi = \phi_{,\mu}$ . (4.19)


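From (4.10) we find

$\partial_\alpha \tilde A_\nu(u) = \frac{\partial}{\partial u^\alpha}\,\tilde A_\nu(u) = \frac{\partial}{\partial u^\alpha}\Big( \frac{\partial x^\mu}{\partial u^\nu}\, A_\mu(x(u)) \Big)$
$\quad = \frac{\partial x^\mu}{\partial u^\nu}\, \frac{\partial x^\beta}{\partial u^\alpha}\, \frac{\partial}{\partial x^\beta} A_\mu(x(u)) + \frac{\partial^2 x^\mu}{\partial u^\alpha\, \partial u^\nu}\, A_\mu(x(u))$
$\quad = x^\mu{}_{,\nu}\, x^\beta{}_{,\alpha}\, \partial_\beta A_\mu(x(u)) + x^\mu{}_{,\alpha,\nu}\, A_\mu(x(u))$ . (4.20)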

The last term here deviates from the postulated tensor transformation rule (4.12). 


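Now notice that

$x^\mu{}_{,\alpha,\nu} = x^\mu{}_{,\nu,\alpha}$ , (4.21)

which always holds for ordinary partial differentiations. From this it follows that the
antisymmetric part of $\partial_\alpha A_\mu$ is a covariant tensor:

$F_{\alpha\mu} = \partial_\alpha A_\mu - \partial_\mu A_\alpha$ ;
$\tilde F_{\alpha\mu}(u) = x^\beta{}_{,\alpha}\, x^\nu{}_{,\mu}\, F_{\beta\nu}(x(u))$ . (4.22)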

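This is an essential ingredient in the mathematical theory of differential forms. We can
continue this way: if $A_{\alpha\beta} = -A_{\beta\alpha}$ then

$F_{\alpha\beta\gamma} = \partial_\alpha A_{\beta\gamma} + \partial_\beta A_{\gamma\alpha} + \partial_\gamma A_{\alpha\beta}$ (4.23)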

is a fully antisymmetric covariant tensor. 

Next, consider a fully antisymmetric tensor $g_{\mu\nu\alpha\beta}$ having as many indices as the
dimensionality of space-time (let's keep space-time four-dimensional). Then one can write

$g_{\mu\nu\alpha\beta} = \omega\, \varepsilon_{\mu\nu\alpha\beta}$ , (4.24)

(see the definition of $\varepsilon$ in Eq. (1.20)) since the antisymmetry condition fixes the values of
all coefficients of $g_{\mu\nu\alpha\beta}$ apart from one common factor $\omega$. Although $\omega$ carries no indices
it will turn out not to transform as a scalar field. Instead, we find:

$\tilde\omega(u) = \det(x^\mu{}_{,\nu})\, \omega(x(u))$ . (4.25)

A quantity transforming this way will be called a density. 

The determinant in (4.25) can act as the Jacobian of a transformation in an integral.
If $\phi(x)$ is some scalar field (or the inner product of tensors with matching superscript
and subscript indices) then the integral

$\int \omega(x)\, \phi(x)\, d^4x$ (4.26)

is independent of the choice of coordinates, because

$\int d^4x \ldots = \int d^4u \cdot \det(\partial x^\mu / \partial u^\nu) \ldots$ . (4.27)

This can also be seen from the definition (4.24):

$\int \tilde g_{\mu\nu\alpha\beta}\, du^\mu \wedge du^\nu \wedge du^\alpha \wedge du^\beta = \int g_{\kappa\lambda\gamma\delta}\, dx^\kappa \wedge dx^\lambda \wedge dx^\gamma \wedge dx^\delta$ . (4.28)
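The role of the density in (4.25)-(4.27) can be checked numerically in a two-dimensional toy case. The sketch below assumes a plane with density equal to 1 in Cartesian coordinates, so that in polar coordinates the density becomes the Jacobian determinant r, and it integrates an assumed Gaussian scalar field in both frames. Grid sizes, cut-offs and function names are illustrative choices only.

```python
import math

def phi(x, y):
    """A scalar field: its value at a point does not depend on the frame."""
    return math.exp(-(x * x + y * y))

def integral_cartesian(half_width=6.0, n=400):
    """Midpoint rule for the integral of omega * phi with omega = 1 in the x frame."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += phi(x, y) * h * h
    return total

def integral_polar(r_max=6.0, n_r=2000, n_angle=64):
    """Same integral in u = (r, angle); the density is det(dx/du) = r."""
    dr = r_max / n_r
    dangle = 2.0 * math.pi / n_angle
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_angle):
            angle = (j + 0.5) * dangle
            x, y = r * math.cos(angle), r * math.sin(angle)
            total += r * phi(x, y) * dr * dangle
    return total

value_x = integral_cartesian()
value_u = integral_polar()
print(value_x, value_u, math.pi)
```

Both sums approximate the same number (here the exact value is pi, up to the negligible tails beyond the cut-off). Dropping the factor r in the polar sum would give a different result, which is why the quantity multiplying the scalar field has to transform as a density.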

Two important properties of tensors are: 


1) The decomposition theorem. 

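Every tensor $X^{\mu\nu\alpha\beta\ldots}_{\kappa\lambda\sigma\tau\ldots}$ can be written as a finite sum of products of covariant and
contravariant vectors:

$X^{\mu\nu\ldots}_{\kappa\lambda\ldots} = \sum_{t=1}^{N} A^\mu_{(t)}\, B^\nu_{(t)} \ldots P^{(t)}_\kappa\, Q^{(t)}_\lambda \ldots$ . (4.29)

The number of terms, N , does not have to be larger than the number of components
of the tensor (see the footnote on the number of terms). By choosing in one coordinate
frame the vectors A, B, . . . each such that they are non-vanishing for only one value of
the index the proof can easily be given.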

2) The quotient theorem. 

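Let there be given an arbitrary set of components $X^{\mu\nu\ldots\alpha\beta\ldots}_{\kappa\lambda\ldots\sigma\tau\ldots}$ . Let it be known that
for all tensors $A^{\sigma\tau\ldots}_{\alpha\beta\ldots}$ (with a given, fixed number of superscript and/or subscript
indices) the quantity

$B^{\mu\nu\ldots}_{\kappa\lambda\ldots} = X^{\mu\nu\ldots\alpha\beta\ldots}_{\kappa\lambda\ldots\sigma\tau\ldots}\, A^{\sigma\tau\ldots}_{\alpha\beta\ldots}$

transforms as a tensor. Then it follows that X itself also transforms as a tensor.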

The proof can be given by induction. First one chooses A to have just one index. Then
in one coordinate frame we choose it to have just one non-vanishing component. One then
uses (4.9) or (4.17). If A has several indices one decomposes it using the decomposition
theorem.
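The construction used in the proof of the decomposition theorem is easy to spell out for a tensor with two superscript indices. In the sketch below, which is only an illustration with hypothetical function names, each vector A_(t) is non-vanishing for a single value of its index, and the matching vector B_(t) is the corresponding row of X, so that n terms suffice in n dimensions.

```python
def decompose_rank2(X):
    """Write X^{mu nu} as a sum over t of A_(t)^mu * B_(t)^nu, using n terms."""
    n = len(X)
    terms = []
    for t in range(n):
        # A_(t) is non-vanishing for one value of its index only.
        A_t = [1.0 if mu == t else 0.0 for mu in range(n)]
        # B_(t)^nu is the row X^{t nu}.
        B_t = list(X[t])
        terms.append((A_t, B_t))
    return terms

def recompose(terms):
    """Sum the products A_(t)^mu * B_(t)^nu back into a two-index array."""
    n = len(terms[0][0])
    return [[sum(A[mu] * B[nu] for A, B in terms) for nu in range(n)]
            for mu in range(n)]

X = [[1.0, 2.0, 0.5],
     [-1.0, 0.0, 3.0],
     [4.0, 2.5, -2.0]]
terms = decompose_rank2(X)
print(len(terms), recompose(terms) == X)
```

For a tensor with more indices the same idea applies index by index; the number of terms then grows with the number of index combinations that have to be singled out.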

What has been achieved in this chapter is that we learned to work with tensors in 
curved coordinate frames. They can be differentiated and integrated. But before we can 
construct physically interesting theories in curved spaces two more obstacles will have to 
be overcome: 

(i) Thus far we have only been able to differentiate antisymmetrically, otherwise the 
resulting gradients do not transform as tensors. 

(ii) There still are two types of indices. Summation is only permitted if one index
is a superscript and one is a subscript index. This is too much of a limitation
for constructing covariant formulations of the existing laws of nature, such as the
Maxwell laws.

We shall deal with these obstacles one by one.

5. The affine connection. Riemann curvature.

The space described in the previous chapter does not yet have enough structure to
formulate all known physical laws in it. For a good understanding of the structure now to
be added we first must define the notion of "affine connection". Only in the next chapter
will we define distances in time and space.

Footnote on the number of terms in (4.29): if n is the dimensionality of spacetime, and r the number
of indices (the rank of the tensor), then one needs at most $N \le n^{r-1}$ terms.





Figure 2: Two contravariant vectors close to each other on a curve S . 

Let $\xi^\mu(x)$ be a contravariant vector field, and let $x^\mu(\tau)$ be the space-time trajectory
S of an observer. We now assume that the observer has a way to establish whether
$\xi^\mu(x)$ is constant or varies as his eigentime $\tau$ goes by. Let us indicate the observed time
derivative by a dot:

$\dot\xi^\mu = \frac{d}{d\tau}\, \xi^\mu(x(\tau))$ . (5.1)

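The observer will have used a coordinate frame x where he stays at the origin O of
three-space. What will equation (5.1) be like in some other coordinate frame u?

$\xi^\mu(x) = x^\mu{}_{,\nu}\, \tilde\xi^\nu(u(x))$ ;

$x^\mu{}_{,\nu}\, \dot{\tilde\xi}{}^\nu \stackrel{\rm def}{=} \frac{d}{d\tau}\, \xi^\mu(x(\tau)) = x^\mu{}_{,\nu}\, \frac{d}{d\tau}\, \tilde\xi^\nu(u(x(\tau))) + x^\mu{}_{,\nu,\lambda}\, \frac{du^\lambda}{d\tau}\, \tilde\xi^\nu(u)$ . (5.2)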

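Using $F^\mu = x^\mu{}_{,\nu}\, u^\nu{}_{,\alpha}\, F^\alpha$ , and replacing the repeated index $\nu$ in the second term by $\kappa$ ,
we write this as

$x^\mu{}_{,\nu}\, \dot{\tilde\xi}{}^\nu = x^\mu{}_{,\nu} \Big( \frac{d}{d\tau}\, \tilde\xi^\nu(u(\tau)) + u^\nu{}_{,\alpha}\, x^\alpha{}_{,\kappa,\lambda}\, \frac{du^\lambda}{d\tau}\, \tilde\xi^\kappa(u(\tau)) \Big)$ .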

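Thus, if we wish to define a quantity $\dot\xi^\nu$ that transforms as a contravector then in a
general coordinate frame this is to be written as

$\dot\xi^\nu(u(\tau)) \stackrel{\rm def}{=} \frac{d}{d\tau}\, \xi^\nu(u(\tau)) + \Gamma^\nu_{\kappa\lambda}\, \frac{du^\lambda}{d\tau}\, \xi^\kappa(u(\tau))$ . (5.3)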

Here, $\Gamma^\nu_{\kappa\lambda}$ is a new field, and near the point u the local observer can use a "preferred
coordinate frame" x such that

$u^\nu{}_{,\alpha}\, x^\alpha{}_{,\kappa,\lambda} = \Gamma^\nu_{\kappa\lambda}$ . (5.4)

In this preferred coordinate frame, $\Gamma$ will vanish, but only on the curve S ! In
general it will not be possible to find a coordinate frame such that $\Gamma$ vanishes everywhere.
Eq. (5.3) defines the parallel displacement of a contravariant vector along a curve S . To
do this a new field was introduced, $\Gamma^\mu_{\lambda\kappa}(u)$ , called "affine connection field" by Levi-Civita.
It is a field, but not a tensor field, since it transforms as

$\tilde\Gamma^\nu_{\kappa\lambda}(u(x)) = u^\nu{}_{,\mu} \big[ x^\alpha{}_{,\kappa}\, x^\beta{}_{,\lambda}\, \Gamma^\mu_{\alpha\beta}(x) + x^\mu{}_{,\kappa,\lambda} \big]$ . (5.5)

Exercise: Prove (5.5) and show that two successive transformations of this type
again produce a transformation of the form (5.5).

We now observe that Eq. (5.4) implies

$\Gamma^\nu_{\lambda\kappa} = \Gamma^\nu_{\kappa\lambda}$ , (5.6)

and since

$x^\mu{}_{,\kappa,\lambda} = x^\mu{}_{,\lambda,\kappa}$ , (5.7)

this symmetry will also hold in any other coordinate frame. Now, in principle, one can
consider spaces with a parallel displacement according to (5.3) where $\Gamma$ does not obey
(5.6). In this case there are no local inertial frames where in some given point x one
has $\Gamma^\mu_{\lambda\kappa} = 0$. This is called torsion. We will not pursue this, apart from noting that
the antisymmetric part of $\Gamma^\mu_{\kappa\lambda}$ would be an ordinary tensor field, which could always be
added to our models at a later stage. So we limit ourselves now to the case that Eq. (5.6)
always holds.

A geodesic is a curve $x^\mu(\sigma)$ that obeys

$\frac{d^2}{d\sigma^2}\, x^\mu(\sigma) + \Gamma^\mu_{\kappa\lambda}\, \frac{dx^\kappa}{d\sigma}\, \frac{dx^\lambda}{d\sigma} = 0$ . (5.8)

Since $dx^\mu/d\sigma$ is a contravariant vector this is a special case of Eq. (5.3) and the equation
for the curve will look the same in all coordinate frames.
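To see Eq. (5.8) at work one can integrate it numerically in a case where the answer is known. The sketch below assumes the flat plane, where $\Gamma$ vanishes in Cartesian coordinates; by the transformation rule (5.5) the polar coordinates u = (r, phi) then have $\Gamma^r_{\varphi\varphi} = -r$ and $\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r$, all other components being zero. The Runge-Kutta integrator, the step count and the function names are illustrative choices, not part of these notes.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of (5.8) in polar coordinates of the flat plane."""
    r, phi, v_r, v_phi = state
    a_r = r * v_phi * v_phi            # -Gamma^r_{phi phi} (v^phi)^2, Gamma^r_{phi phi} = -r
    a_phi = -2.0 * v_r * v_phi / r     # -2 Gamma^phi_{r phi} v^r v^phi, Gamma^phi_{r phi} = 1/r
    return (v_r, v_phi, a_r, a_phi)

def shifted(state, slope, factor):
    """Return state + factor * slope, component by component."""
    return tuple(s + factor * k for s, k in zip(state, slope))

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, sigma_end, steps):
    """Follow the geodesic from sigma = 0 to sigma_end."""
    h = sigma_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at (x, y) = (1, 0) moving in the y direction: r = 1, phi = 0, dr = 0, dphi = 1.
final = integrate((1.0, 0.0, 0.0, 1.0), 1.0, 1000)
x_end = final[0] * math.cos(final[1])
y_end = final[0] * math.sin(final[1])
print(x_end, y_end)
```

Converted back to Cartesian coordinates, the end point lies on the straight line x = 1, although neither r nor phi depends linearly on $\sigma$. The same curve is described by the same equation in both frames; only the values of $\Gamma$ differ.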

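N.B. If one chooses an arbitrary, different parametrization of the curve (5.8), using
a parameter $\tilde\sigma$ that is an arbitrary differentiable function of $\sigma$ , one obtains a different
equation,

$\frac{d^2}{d\tilde\sigma^2}\, x^\mu(\tilde\sigma) + \alpha(\tilde\sigma)\, \frac{d}{d\tilde\sigma}\, x^\mu(\tilde\sigma) + \Gamma^\mu_{\kappa\lambda}\, \frac{dx^\kappa}{d\tilde\sigma}\, \frac{dx^\lambda}{d\tilde\sigma} = 0$ , (5.8a)

where $\alpha(\tilde\sigma)$ can be any function of $\tilde\sigma$ . Apparently the shape of the curve in coordinate
space does not depend on the function $\alpha(\tilde\sigma)$ .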

Exercise: check Eq. (5.8a). 

Curves described by Eq. (5.8) could be defined to be the space-time trajectories of particles
moving in a gravitational field. Indeed, in every point x there exists a coordinate frame
such that $\Gamma$ vanishes there, so that the trajectory goes straight (the coordinate frame of
the freely falling elevator). In an accelerated elevator, the trajectories look curved, and
an observer inside the elevator can attribute this curvature to a gravitational field. The
gravitational field is hereby identified as an affine connection field.


Since now we have a field that transforms according to Eq. (5.5) we can use it to
eliminate the offending last term in Eq. (4.20). We define a covariant derivative of a
co-vector field:

$D_\alpha A_\mu = \partial_\alpha A_\mu - \Gamma^\nu_{\alpha\mu}\, A_\nu$ . (5.9)

This quantity $D_\alpha A_\mu$ neatly transforms as a tensor:

$D_\alpha \tilde A_\nu(u) = x^\mu{}_{,\nu}\, x^\beta{}_{,\alpha}\, D_\beta A_\mu(x)$ . (5.10)

Notice that

$D_\alpha A_\mu - D_\mu A_\alpha = \partial_\alpha A_\mu - \partial_\mu A_\alpha$ , (5.11)

so that Eq. (4.22) is kept unchanged.

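Similarly one can now define the covariant derivative of a contravariant vector:

$D_\alpha A^\mu = \partial_\alpha A^\mu + \Gamma^\mu_{\alpha\beta}\, A^\beta$ . (5.12)

(Notice the differences with (5.9)!) It is not difficult now to define covariant derivatives of
all other tensors:

$D_\alpha X^{\mu\nu\ldots}_{\kappa\lambda\ldots} = \partial_\alpha X^{\mu\nu\ldots}_{\kappa\lambda\ldots} + \Gamma^\mu_{\alpha\beta}\, X^{\beta\nu\ldots}_{\kappa\lambda\ldots} + \Gamma^\nu_{\alpha\beta}\, X^{\mu\beta\ldots}_{\kappa\lambda\ldots} \ldots$
$\qquad - \Gamma^\beta_{\kappa\alpha}\, X^{\mu\nu\ldots}_{\beta\lambda\ldots} - \Gamma^\beta_{\lambda\alpha}\, X^{\mu\nu\ldots}_{\kappa\beta\ldots} \ldots$ . (5.13)

Expressions (5.12) and (5.13) also transform as tensors.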

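We also easily verify a "product rule". Let the tensor Z be the product of two tensors
X and Y:

$Z^{\kappa\lambda\ldots\pi\rho\ldots}_{\mu\nu\ldots\alpha\beta\ldots} = X^{\kappa\lambda\ldots}_{\mu\nu\ldots}\, Y^{\pi\rho\ldots}_{\alpha\beta\ldots}$ . (5.14)

Then one has (in a notation where we temporarily suppress the indices)

$D_\alpha Z = (D_\alpha X)\, Y + X\, (D_\alpha Y)$ . (5.15)

Furthermore, if one sums over repeated indices (one subscript and one superscript, we
will call this a contraction of indices):

$(D_\alpha X)^{\mu\kappa\ldots}_{\mu\beta\ldots} = D_\alpha \big( X^{\mu\kappa\ldots}_{\mu\beta\ldots} \big)$ , (5.16)

so that we can just as well omit the brackets in (5.16). Eqs. (5.15) and (5.16) can easily
be proven to hold in any point x , by choosing the reference frame where $\Gamma$ vanishes at
that point x .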

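The covariant derivative of a scalar field $\phi$ is the ordinary derivative:

$D_\alpha \phi = \partial_\alpha \phi$ , (5.17)

but this does not hold for a density function $\omega$ (see Eq. (4.24)),

$D_\alpha \omega = \partial_\alpha \omega - \Gamma^\mu_{\mu\alpha}\, \omega$ . (5.18)

$D_\alpha \omega$ is a density times a covector. This one derives from (4.24) and

$\varepsilon^{\alpha\mu\nu\lambda}\, \varepsilon_{\beta\mu\nu\lambda} = 6\, \delta^\alpha_\beta$ . (5.19)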

Thus we have found that if one introduces in a space or space-time a field that
transforms according to Eq. (5.5), called 'affine connection', then one can define: 1)
geodesic curves such as the trajectories of freely falling particles, and 2) the covariant
derivative of any vector and tensor field. But what we do not yet have is (i) a unique
definition of distance between points and (ii) a way to identify covectors with contravectors.
Summation over repeated indices only makes sense if one of them is a superscript and the
other is a subscript index.


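Now again consider a curve S as in Fig. 2, but close it (Fig. 3). Let us have a
contravector field $\xi^\nu(x)$ with

$\dot\xi^\nu(x(\tau)) = 0$ ; (5.20)

We take the curve to be very small (see the footnote on the word 'small') so that we can write

$\xi^\nu(x) = \xi^\nu + \xi^\nu{}_{,\mu}\, x^\mu + O(x^2)$ . (5.21)

Figure 3: Parallel displacement along a closed curve in a curved space.

Will this contravector return to its original value if we follow it while going around the
curve one full loop? According to (5.3) it certainly will if the connection field vanishes:
$\Gamma = 0$ . But if there is a strong gravity field there might be a deviation $\delta\xi^\nu$ . We find:

$\oint d\tau\, \dot\xi = 0$ ;

$\delta\xi^\nu = \oint d\tau\, \frac{d}{d\tau}\, \xi^\nu(x(\tau)) = - \oint \Gamma^\nu_{\kappa\lambda}\, \frac{dx^\lambda}{d\tau}\, \xi^\kappa(x(\tau))\, d\tau$
$\qquad = - \oint d\tau\, \big( \Gamma^\nu_{\kappa\lambda} + \Gamma^\nu_{\kappa\lambda,\alpha}\, x^\alpha \big)\, \frac{dx^\lambda}{d\tau}\, \big( \xi^\kappa + \xi^\kappa{}_{,\mu}\, x^\mu \big)$ . (5.22)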

Footnote on the word 'small': In an affine space without metric the words 'small' and 'large' appear
to be meaningless. However, since differentiability is required, the small size limit is well defined.
Thus, it is more precise to state that the curve is infinitesimally small.


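where we chose the function $x(\tau)$ to be very small, so that terms $O(x^2)$ could be
neglected. We have a closed curve, so

$\oint d\tau\, \frac{dx^\lambda}{d\tau} = 0$ and

$D_\mu \xi^\kappa \approx 0 \ \rightarrow\ \xi^\kappa{}_{,\mu} \approx - \Gamma^\kappa_{\mu\beta}\, \xi^\beta$ , (5.23)

so that Eq. (5.22) becomes

$\delta\xi^\nu = \frac{1}{2} \Big( \oint x^\alpha\, \frac{dx^\lambda}{d\tau}\, d\tau \Big) R^\nu{}_{\kappa\lambda\alpha}\, \xi^\kappa$ + higher orders in x . (5.24)

Since

$\oint x^\alpha\, \frac{dx^\lambda}{d\tau}\, d\tau + \oint x^\lambda\, \frac{dx^\alpha}{d\tau}\, d\tau = 0$ , (5.25)

only the antisymmetric part of R matters. We choose

$R^\nu{}_{\kappa\lambda\alpha} = - R^\nu{}_{\kappa\alpha\lambda}$ (5.26)

(the factor $\frac{1}{2}$ in (5.24) is conventionally chosen this way). Thus we find:

$R^\nu{}_{\kappa\lambda\alpha} = \partial_\lambda \Gamma^\nu_{\kappa\alpha} - \partial_\alpha \Gamma^\nu_{\kappa\lambda} + \Gamma^\nu_{\lambda\sigma}\, \Gamma^\sigma_{\kappa\alpha} - \Gamma^\nu_{\alpha\sigma}\, \Gamma^\sigma_{\kappa\lambda}$ . (5.27)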

We now claim that this quantity must transform as a true tensor. This should be
surprising since $\Gamma$ itself is not a tensor, and since there are ordinary derivatives $\partial_\lambda$
instead of covariant derivatives. The argument goes as follows. In Eq. (5.24) the l.h.s.,
$\delta\xi^\nu$ , is a true contravector, and also the quantity

$S^{\alpha\lambda} = \oint x^\alpha\, \frac{dx^\lambda}{d\tau}\, d\tau$ , (5.28)

transforms as a tensor. Now we can choose $\xi^\kappa$ any way we want and also the surface
elements $S^{\alpha\lambda}$ may be chosen freely. Therefore we may use the quotient theorem (expanded
to cover the case of antisymmetric tensors) to conclude that in that case the set of
coefficients $R^\nu{}_{\kappa\lambda\alpha}$ must also transform as a genuine tensor. Of course we can check explicitly
by using (5.5) that the combination (5.27) indeed transforms as a tensor, showing that
the inhomogeneous terms cancel out.
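Formula (5.27) can be evaluated mechanically once a connection field is given. The sketch below assumes, purely as an example, the symmetric connection usually associated with the unit sphere in coordinates (theta, phi), namely $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$; the derivatives of $\Gamma$ are estimated by central differences. The array layout, step size and function names are illustrative assumptions.

```python
import math

DIM = 2  # coordinates (theta, phi)

def gamma(point):
    """G[nu][kappa][lam] = Gamma^nu_{kappa lam}; assumed unit-sphere connection."""
    theta = point[0]
    G = [[[0.0] * DIM for _ in range(DIM)] for _ in range(DIM)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def d_gamma(point, direction, eps=1e-5):
    """Central-difference estimate of the derivative of Gamma along one coordinate."""
    plus, minus = list(point), list(point)
    plus[direction] += eps
    minus[direction] -= eps
    Gp, Gm = gamma(plus), gamma(minus)
    return [[[(Gp[n][k][l] - Gm[n][k][l]) / (2 * eps) for l in range(DIM)]
             for k in range(DIM)] for n in range(DIM)]

def riemann(point):
    """R[nu][kappa][lam][alpha] = R^nu_{kappa lam alpha} according to Eq. (5.27)."""
    G = gamma(point)
    dG = [d_gamma(point, d) for d in range(DIM)]  # dG[lam][nu][kappa][alpha]
    R = [[[[0.0] * DIM for _ in range(DIM)] for _ in range(DIM)] for _ in range(DIM)]
    for n in range(DIM):
        for k in range(DIM):
            for l in range(DIM):
                for a in range(DIM):
                    quadratic = sum(G[n][l][s] * G[s][k][a] - G[n][a][s] * G[s][k][l]
                                    for s in range(DIM))
                    R[n][k][l][a] = dG[l][n][k][a] - dG[a][n][k][l] + quadratic
    return R

R = riemann([1.0, 0.3])
print(R[0][1][0][1], math.sin(1.0) ** 2)
```

For this connection the component $R^\theta{}_{\varphi\theta\varphi}$ comes out as $\sin^2\theta$, and the result is antisymmetric in its last two indices, as (5.26) requires. For the flat-plane connection in polar coordinates the same routine would return zero in every component.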

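$R^\nu{}_{\kappa\lambda\alpha}$ tells us something about the extent to which this space is curved. It is called
the Riemann curvature tensor. From (5.27) we derive

$R^\nu{}_{\kappa\lambda\alpha} + R^\nu{}_{\lambda\alpha\kappa} + R^\nu{}_{\alpha\kappa\lambda} = 0$ , (5.29)

and

$D_\alpha R^\nu{}_{\kappa\beta\gamma} + D_\beta R^\nu{}_{\kappa\gamma\alpha} + D_\gamma R^\nu{}_{\kappa\alpha\beta} = 0$ . (5.30)

The latter equation, called Bianchi identity, can be derived most easily by noting that
for every point x a coordinate frame exists such that at that point x one has $\Gamma^\nu_{\kappa\alpha} = 0$
(though its derivative $\partial\Gamma$ cannot be tuned to zero). One then only needs to take into
account those terms of Eq. (5.30) that are linear in $\partial\Gamma$ .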

Partial derivatives have the property that the order may be interchanged, $\partial_\mu \partial_\nu = \partial_\nu \partial_\mu$ .
This is no longer true for covariant derivatives. For any covector field $A_\mu(x)$ we
find

$D_\mu D_\nu A_\alpha - D_\nu D_\mu A_\alpha = - R^\lambda{}_{\alpha\mu\nu}\, A_\lambda$ , (5.31)

and for any contravector field $A^\alpha$ :

$D_\mu D_\nu A^\alpha - D_\nu D_\mu A^\alpha = R^\alpha{}_{\lambda\mu\nu}\, A^\lambda$ , (5.32)

which we can verify directly from the definition of $R^\lambda{}_{\alpha\mu\nu}$ . These equations also show
clearly why the Riemann curvature transforms as a true tensor; (5.31) and (5.32) hold for
all $A_\lambda$ and $A^\lambda$ and the l.h.s. transform as tensors.

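An important theorem is that the Riemann tensor completely specifies the extent to
which space or space-time is curved, if this space-time is simply connected. We shall not
give a mathematically rigorous proof of this, but an acceptable argument can be found as
follows. Assume that $R^\nu{}_{\kappa\lambda\alpha} = 0$ everywhere. Consider then a point x and a coordinate
frame such that $\Gamma^\nu_{\kappa\lambda}(x) = 0$ . We assume our manifold to be $C^\infty$ at the point x . Then
consider a Taylor expansion of $\Gamma$ around x :

$\Gamma^\nu_{\kappa\lambda}(x') = \Gamma^{[1]\nu}_{\kappa\lambda,\alpha}\, (x'-x)^\alpha + \frac{1}{2}\, \Gamma^{[2]\nu}_{\kappa\lambda,\alpha\beta}\, (x'-x)^\alpha (x'-x)^\beta \ldots$ , (5.33)

From the fact that (5.27) vanishes we deduce that $\Gamma^{[1]\nu}_{\kappa\lambda,\alpha}$ is symmetric:

$\Gamma^{[1]\nu}_{\kappa\lambda,\alpha} = \Gamma^{[1]\nu}_{\kappa\alpha,\lambda}$ , (5.34)

and furthermore, from the symmetry (5.6) we have

$\Gamma^{[1]\nu}_{\kappa\lambda,\alpha} = \Gamma^{[1]\nu}_{\lambda\kappa,\alpha}$ , (5.35)

so that there is complete symmetry in the lower indices. From this we derive that

$\Gamma^\nu_{\kappa\lambda} = \partial_\lambda \partial_\kappa Y^\nu + O(x'-x)^2$ , (5.36)

with

$Y^\nu = \frac{1}{6}\, \Gamma^{[1]\nu}_{\kappa\lambda,\alpha}\, (x'-x)^\alpha (x'-x)^\lambda (x'-x)^\kappa$ . (5.37)

If now we turn to the coordinates $u^\mu = x'^\mu + Y^\mu$ then, according to the transformation
rule (5.5), $\Gamma$ vanishes in these coordinates up to terms of order $(x'-x)^2$ . So, here, the
coefficients $\Gamma^{[1]}$ vanish.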

The argument can now be repeated to prove that, in (5.33), all coefficients $\Gamma^{[i]}$ can be
made to vanish by choosing suitable coordinates. Unless our space-time were extremely
singular at the point x , one finds a domain this way around x where, given suitable
coordinates, $\Gamma$ vanishes completely. All domains treated this way can be glued together,
and unless there is an obstruction because our space-time isn't simply connected, this
leads to coordinates where the $\Gamma$ vanish everywhere.

Thus we see that if the Riemann curvature vanishes a coordinate frame can be
constructed in terms of which all geodesics are straight lines and all covariant derivatives are
ordinary derivatives. This is a flat space.

Warning: there is no universal agreement in the literature about sign conventions in
the definitions of $d\sigma^2$ , $\Gamma^\nu_{\kappa\lambda}$ , $R^\nu{}_{\kappa\lambda\alpha}$ , $T_{\mu\nu}$ and the field $g_{\mu\nu}$ of the next chapter. This
should be no impediment against studying other literature. One frequently has to adjust
signs and pre-factors.

6. The metric tensor. 

In a space with affine connection we have geodesics, but no clocks and rulers. These we
will introduce now. In Chapter 3 we saw that in flat space one has a matrix

$g_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$ , (6.1)

so that for the Lorentz invariant distance $\sigma$ we can write

$\sigma^2 = -t^2 + \vec x^{\,2} = g_{\mu\nu}\, x^\mu x^\nu$ (6.2)

(time will be the zeroth coordinate, which is agreed upon to be the convention if all
coordinates are chosen to stay real numbers). For a particle running along a timelike
curve $C = \{x(\sigma)\}$ the increase in eigentime T is

$T = \int_C dT$ , with $dT^2 = - g_{\mu\nu}\, \frac{dx^\mu}{d\sigma}\, \frac{dx^\nu}{d\sigma} \cdot d\sigma^2 \stackrel{\rm def}{=} - g_{\mu\nu}\, dx^\mu dx^\nu$ . (6.3)

This expression is coordinate independent, provided that $g_{\mu\nu}$ is treated as a co-tensor
with two subscript indices. It is symmetric under interchange of these. In curved
coordinates we get

$g_{\mu\nu} = g_{\nu\mu} = g_{\mu\nu}(x)$ . (6.4)

This is the metric tensor field. Only far away from stars and planets can we find
coordinates such that it will coincide with (6.1) everywhere. In general it will deviate from this
slightly, but usually not very much. In particular we will demand that upon
diagonalization one will always find three positive and one negative eigenvalue. This property can
be shown to be unchanged under coordinate transformations. The inverse of $g_{\mu\nu}$ , which
we will simply refer to as $g^{\mu\nu}$ , is uniquely defined by

$g_{\mu\nu}\, g^{\nu\alpha} = \delta^\alpha_\mu$ . (6.5)

This inverse is also symmetric under interchange of its indices.

It now turns out that the introduction of such a two-index co-tensor field gives
space-time more structure than the three-index affine connection of the previous chapter. First
of all, the tensor $g_{\mu\nu}$ induces one special choice for the affine connection field. Let
us elucidate this first by using a physical argument. Consider a freely falling elevator
(or spaceship). Assume that the elevator is so small that the gravitational pull from
stars and planets surrounding it appears to be the same everywhere inside the elevator.
Then an observer inside the elevator will not experience any gravitational field anywhere
inside the elevator. He or she should be able to introduce a Cartesian coordinate grid
inside the elevator, as if gravitational forces did not exist. He or she could use as metric
tensor $g_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ . Since there is no gravitational field, clocks run equally fast
everywhere, and rulers show the same lengths everywhere (as long as we stay inside the
elevator). Therefore, the inhabitant must conclude that $\partial_\alpha g_{\mu\nu} = 0$ . Since there is no
need of curved coordinates, one would also have $\Gamma^\lambda_{\mu\nu} = 0$ at the location of the elevator.
Note: the gradient of $\Gamma$ , and the second derivative of $g_{\mu\nu}$ , would be difficult to detect, so
we put no constraints on those.

Clearly, we conclude that, at the location of the elevator, the covariant derivative of
$g_{\mu\nu}$ should vanish:

$D_\alpha g_{\mu\nu} = 0$ . (6.6)

In fact, we shall now argue that Eq. (6.6) can be used as a definition of the affine
connection $\Gamma$ for a space or space-time where a metric tensor $g_{\mu\nu}(x)$ is given. This argument
goes as follows.

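From (6.6) we see:

$\partial_\alpha g_{\mu\nu} = \Gamma^\lambda_{\alpha\mu}\, g_{\lambda\nu} + \Gamma^\lambda_{\alpha\nu}\, g_{\mu\lambda}$ . (6.7)

Write

$\Gamma_{\lambda\alpha\mu} = g_{\lambda\nu}\, \Gamma^\nu_{\alpha\mu}$ , (6.8)

$\Gamma_{\lambda\alpha\mu} = \Gamma_{\lambda\mu\alpha}$ . (6.9)

Then one finds from (6.7)

$\frac{1}{2} \big( \partial_\mu g_{\lambda\nu} + \partial_\nu g_{\lambda\mu} - \partial_\lambda g_{\mu\nu} \big) = \Gamma_{\lambda\mu\nu}$ , (6.10)

$\Gamma^\lambda_{\mu\nu} = g^{\lambda\alpha}\, \Gamma_{\alpha\mu\nu}$ . (6.11)

These equations now define an affine connection field. Indeed Eq. (6.6) follows from (6.10),
(6.11). In the literature one also finds the "Christoffel symbol" $\{{}^{\ \mu}_{\kappa\lambda}\}$ , which means the
same thing. The convention used here is that of Hawking and Ellis. Since

$D_\alpha \delta^\lambda_\mu = \partial_\alpha \delta^\lambda_\mu = 0$ , (6.12)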


we also have for the inverse of $g_{\mu\nu}$

$D_\alpha g^{\mu\nu} = 0$ , (6.13)

which follows from (6.5) in combination with the product rule (5.15).
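The definitions (6.10) and (6.11) are straightforward to evaluate numerically once a metric is chosen. The sketch below assumes, as an example only, the metric of the unit sphere, $g = \mathrm{diag}(1, \sin^2\theta)$ in coordinates (theta, phi), and estimates the derivatives of the metric by central differences. The function names and the step size are illustrative assumptions.

```python
import math

DIM = 2  # coordinates (theta, phi)

def metric(point):
    """Assumed example metric: the unit sphere, g = diag(1, sin^2 theta)."""
    theta = point[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def d_metric(point, direction, eps=1e-6):
    """Central-difference estimate of the derivative of g along one coordinate."""
    plus, minus = list(point), list(point)
    plus[direction] += eps
    minus[direction] -= eps
    gp, gm = metric(plus), metric(minus)
    return [[(gp[m][n] - gm[m][n]) / (2 * eps) for n in range(DIM)]
            for m in range(DIM)]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, used for g^{mu nu} as in (6.5)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(point):
    """upper[lam][mu][nu] = Gamma^lam_{mu nu} from Eqs. (6.10) and (6.11)."""
    dg = [d_metric(point, a) for a in range(DIM)]   # dg[a][m][n] = d_a g_{mn}
    g_inv = inverse_2x2(metric(point))
    # Eq. (6.10): Gamma_{lam mu nu} with all indices down.
    lower = [[[0.5 * (dg[m][l][n] + dg[n][l][m] - dg[l][m][n])
               for n in range(DIM)] for m in range(DIM)] for l in range(DIM)]
    # Eq. (6.11): raise the first index with the inverse metric.
    return [[[sum(g_inv[l][a] * lower[a][m][n] for a in range(DIM))
              for n in range(DIM)] for m in range(DIM)] for l in range(DIM)]

G = christoffel([0.8, 0.2])
print(G[0][1][1], -math.sin(0.8) * math.cos(0.8))
print(G[1][0][1], math.cos(0.8) / math.sin(0.8))
```

The two printed pairs agree, reproducing $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$ for this metric. The result is automatically symmetric in its two lower indices, in accordance with (5.6).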

But the metric tensor $g_{\mu\nu}$ not only gives us an affine connection field, it now also
enables us to replace subscript indices by superscript indices and back. For every covector
$A_\mu(x)$ we define a contravector $A^\nu(x)$ by

$A_\mu(x) = g_{\mu\nu}(x)\, A^\nu(x)$ ; $A^\nu = g^{\nu\mu}\, A_\mu$ . (6.14)

Very important is what is implied by the product rule (5.15), together with (6.6) and
(6.13):

$D_\alpha A^\mu = g^{\mu\nu}\, D_\alpha A_\nu$ ,

$D_\alpha A_\mu = g_{\mu\nu}\, D_\alpha A^\nu$ . (6.15)

It follows that raising or lowering indices by multiplication with $g_{\mu\nu}$ or $g^{\mu\nu}$ can be done
before or after covariant differentiation.

The metric tensor also generates a density function $\omega$ :

$\omega = \sqrt{-\det(g_{\mu\nu})}$ . (6.16)

It transforms according to Eq. (4.25). This can be understood by observing that in a
coordinate frame with in some point x

$g_{\mu\nu}(x) = \mathrm{diag}(-a, b, c, d)$ , (6.17)

the volume element is given by $\sqrt{abcd}$ .

The space of the previous chapter is called an "affine space". In the present chapter
we have a subclass of the affine spaces called a metric space or Riemann space; indeed we
can call it a Riemann space-time. The presence of a time coordinate is betrayed by the
one negative eigenvalue of $g_{\mu\nu}$ .

The geodesics

Consider two arbitrary points X and Y in our metric space. For every curve
$C = \{x^\mu(\sigma)\}$ that has X and Y as its end points,

$x^\mu(0) = X^\mu$ ; $x^\mu(1) = Y^\mu$ , (6.18)

we consider the integral

$\ell = \int_C ds = \int_{\sigma=0}^{\sigma=1} ds$ , (6.19)

with either

$ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu$ , (6.20)

when the curve is spacelike, or

$ds^2 = - g_{\mu\nu}\, dx^\mu dx^\nu$ , (6.21)

wherever the curve is timelike. For simplicity we choose the curve to be spacelike,
Eq. (6.20). The timelike case goes exactly analogously.

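Consider now an infinitesimal displacement of the curve, keeping however X and Y
in their places:

$x'^\mu(\sigma) = x^\mu(\sigma) + \eta^\mu(\sigma)$ , $\eta$ infinitesimal,

$\eta^\mu(0) = \eta^\mu(1) = 0$ , (6.22)

then what is the infinitesimal change in $\ell$ ?

$\delta\ell = \int \delta\, ds$ ;
$2\, ds\, \delta\, ds = (\delta g_{\mu\nu})\, dx^\mu dx^\nu + 2\, g_{\mu\nu}\, dx^\mu d\eta^\nu + O(d\eta^2)$
$\qquad = (\partial_\alpha g_{\mu\nu})\, \eta^\alpha\, dx^\mu dx^\nu + 2\, g_{\mu\nu}\, dx^\mu\, \frac{d\eta^\nu}{d\sigma}\, d\sigma$ . (6.23)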

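Now we make a restriction for the original curve:

$\frac{ds}{d\sigma} = 1$ , (6.24)

which one can always realize by choosing an appropriate parametrization of the curve.
(6.23) then reads

$\delta\ell = \int d\sigma \Big( \frac{1}{2}\, \eta^\alpha\, g_{\mu\nu,\alpha}\, \frac{dx^\mu}{d\sigma}\, \frac{dx^\nu}{d\sigma} + g_{\mu\alpha}\, \frac{dx^\mu}{d\sigma}\, \frac{d\eta^\alpha}{d\sigma} \Big)$ . (6.25)

We can take care of the $d\eta/d\sigma$ term by partial integration; using

$\frac{d}{d\sigma}\, g_{\mu\alpha} = g_{\mu\alpha,\lambda}\, \frac{dx^\lambda}{d\sigma}$ , (6.26)

we get

$\delta\ell = \int d\sigma \Big( \eta^\alpha \Big[ \frac{1}{2}\, g_{\mu\nu,\alpha}\, \frac{dx^\mu}{d\sigma}\, \frac{dx^\nu}{d\sigma} - g_{\mu\alpha,\lambda}\, \frac{dx^\lambda}{d\sigma}\, \frac{dx^\mu}{d\sigma} - g_{\mu\alpha}\, \frac{d^2x^\mu}{d\sigma^2} \Big] + \frac{d}{d\sigma} \Big( g_{\mu\alpha}\, \frac{dx^\mu}{d\sigma}\, \eta^\alpha \Big) \Big)$
$\qquad = - \int d\sigma\, \eta^\alpha\, g_{\mu\alpha} \Big( \frac{d^2x^\mu}{d\sigma^2} + \Gamma^\mu_{\kappa\lambda}\, \frac{dx^\kappa}{d\sigma}\, \frac{dx^\lambda}{d\sigma} \Big)$ . (6.27)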

The pure derivative term vanishes since we require $\eta$ to vanish at the end points,
Eq. (6.22). We used symmetry under interchange of the indices $\lambda$ and $\mu$ in the first
line and the definitions (6.10) and (6.11) for $\Gamma$ . Now, strictly following standard
procedure in mathematical physics, we can demand that $\delta\ell$ vanishes for all choices of the
infinitesimal function $\eta^\alpha(\sigma)$ obeying the boundary condition. We obtain exactly the
equation for geodesics, (5.8). If we hadn't imposed Eq. (6.24) we would have obtained
Eq. (5.8a).

We have spacelike geodesics (with Eq. (6.20)) and timelike geodesics (with Eq. (6.21)).
One can show that for timelike geodesics $\ell$ is a relative maximum. For spacelike geodesics
it is on a saddle point. Only in spaces with a positive definite $g_{\mu\nu}$ is the length $\ell$ of the
path a minimum for the geodesic.
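The stationarity of $\ell$ can be made concrete in the simplest positive definite case. The sketch below assumes the Euclidean plane and a one-parameter family of curves $x(\sigma) = (\sigma, \epsilon \sin \pi\sigma)$ between the fixed end points (0, 0) and (1, 0), and approximates each curve by a polyline. The family of curves, the number of segments and the function name are illustrative assumptions.

```python
import math

def curve_length(eps, n=2000):
    """Length of the polyline through x(sigma) = (sigma, eps * sin(pi * sigma))."""
    total = 0.0
    previous = (0.0, 0.0)
    for i in range(1, n + 1):
        sigma = i / n
        point = (sigma, eps * math.sin(math.pi * sigma))
        total += math.hypot(point[0] - previous[0], point[1] - previous[1])
        previous = point
    return total

straight = curve_length(0.0)     # the geodesic: a straight line of length 1
bumped = curve_length(1e-3)      # a slightly displaced curve
second_order = (bumped - straight) / 1e-3 ** 2
print(straight, bumped, second_order, math.pi ** 2 / 4)
```

The length changes only at second order in the displacement: the ratio printed last stays finite, close to $\pi^2/4$ for this family, and it is positive, so the straight line is a minimum here. With an indefinite metric the sign of the second-order term is no longer fixed, which is why timelike geodesics maximize and spacelike geodesics sit at a saddle point.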


As for the Riemann curvature tensor defined in the previous chapter, we can now raise
and lower all its indices:

$R_{\mu\nu\alpha\beta} = g_{\mu\lambda}\, R^\lambda{}_{\nu\alpha\beta}$ , (6.28)

and we can check if there are any further symmetries, apart from (5.26), (5.29) and (5.30).
By writing down the full expressions for the curvature in terms of $g_{\mu\nu}$ one finds

$R_{\mu\nu\alpha\beta} = - R_{\nu\mu\alpha\beta} = R_{\alpha\beta\mu\nu}$ . (6.29)

By contracting two indices one obtains the Ricci tensor:

$R_{\mu\nu} = R^\lambda{}_{\mu\lambda\nu}$ . (6.30)

It now obeys

$R_{\mu\nu} = R_{\nu\mu}$ . (6.31)

We can contract further to obtain the Ricci scalar,

$R = g^{\mu\nu}\, R_{\mu\nu} = R^\mu_\mu$ . (6.32)

Now that we have the metric tensor $g_{\mu\nu}$ , we may use a generalized version of the
summation convention: If there is a repeated subscript index, it means that one of them
must be raised using the metric tensor $g^{\mu\nu}$ , after which we sum over the values. Similarly,
repeated superscript indices can now be summed over:

$A_\mu B_\mu \equiv A_\mu B^\mu \equiv A^\mu B^\mu \equiv A_\mu B_\nu\, g^{\mu\nu}$ . (6.33)

The Bianchi identity (5.30) implies for the Ricci tensor:

$D_\mu R_{\mu\nu} - \frac{1}{2}\, D_\nu R = 0$ . (6.34)

We define the Einstein tensor $G_{\mu\nu}(x)$ as

$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\, R\, g_{\mu\nu}$ , $D_\mu G_{\mu\nu} = 0$ . (6.35)

The formalism developed in this chapter can be used to describe any kind of curved
space or space-time. Every choice for the metric $g_{\mu\nu}$ (under certain constraints concerning
its eigenvalues) can be considered. We obtain the trajectories (geodesics) of particles
moving in gravitational fields. However, so far we have not discussed the equations that
determine the gravity field configurations given some configuration of stars and planets
in space and time. This will be done in the next chapters.

7. The perturbative expansion and Einstein's law of gravity. 

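We have a law of gravity if we have some prescription to pin down the values of the
curvature tensor $R^\mu{}_{\nu\alpha\beta}$ near a given matter distribution in space and time. To obtain
such a prescription we want to make use of the given fact that Newton's law of gravity
holds whenever the non-relativistic approximation is justified. This will be the case in any
region of space and time that is sufficiently small so that a coordinate frame can be devised
there that is approximately flat. The gravitational fields are then sufficiently weak and
then at that spot we not only know fairly well how to describe the laws of matter, but we
also know how these weak gravitational fields are determined by the matter distribution
there. In our small region of space-time we write

$g_{\mu\nu}(x) = \eta_{\mu\nu} + h_{\mu\nu}$ , (7.1)

where

$\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ , (7.2)

and $h_{\mu\nu}$ is a small perturbation. We find (see (6.10)):

$\Gamma_{\lambda\mu\nu} = \frac{1}{2} \big( \partial_\mu h_{\lambda\nu} + \partial_\nu h_{\lambda\mu} - \partial_\lambda h_{\mu\nu} \big)$ ; (7.3)

$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} + h^\mu_\alpha\, h^{\alpha\nu} - \ldots$ . (7.4)

In this latter expression the indices were raised and lowered using $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$ instead
of the $g^{\mu\nu}$ and $g_{\mu\nu}$ . This is a revised index- and summation convention that we only
apply on expressions containing $h_{\mu\nu}$ . Note that the indices in $\eta_{\mu\nu}$ need not be raised or
lowered.

$\Gamma^\alpha_{\mu\nu} = \eta^{\alpha\lambda}\, \Gamma_{\lambda\mu\nu} + O(h^2)$ . (7.5)

The curvature tensor is

$R^\alpha{}_{\beta\gamma\delta} = \partial_\gamma \Gamma^\alpha_{\beta\delta} - \partial_\delta \Gamma^\alpha_{\beta\gamma} + O(h^2)$ , (7.6)

and the Ricci tensor

$R_{\mu\nu} = \frac{1}{2} \big( - \partial^2 h_{\mu\nu} + \partial_\alpha \partial_\mu h^\alpha_\nu + \partial_\alpha \partial_\nu h^\alpha_\mu - \partial_\mu \partial_\nu h^\alpha_\alpha \big) + O(h^2)$ . (7.7)

The Ricci scalar is

$R = - \partial^2 h_{\mu\mu} + \partial_\mu \partial_\nu h_{\mu\nu} + O(h^2)$ . (7.8)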

A slowly moving particle has

$\frac{dx^\mu}{d\tau} \approx (1, 0, 0, 0)$ , (7.9)

so that the geodesic equation (5.8) becomes

$\frac{d^2}{d\tau^2}\, x^i(\tau) = - \Gamma^i_{00}$ . (7.10)

Apparently, $F_i = - \Gamma^i_{00}$ is to be identified with the gravitational field. Now in a stationary
system one may ignore time derivatives $\partial_0$ . Therefore Eq. (7.3) for the gravitational field
reduces to

$F_i = - \Gamma_{i00} = \frac{1}{2}\, \partial_i h_{00}$ , (7.11)

so that one may identify $-\frac{1}{2} h_{00}$ as the gravitational potential. This confirms the suspicion
expressed in Chapter 3 that the local clock speed, which is $\varrho = \sqrt{-g_{00}} \approx 1 - \frac{1}{2} h_{00}$ , can
be identified with the gravitational potential, Eq. (3.19) (apart from an additive constant,
of course).
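The identification in (7.11) can be tried out on the familiar potential of a point mass. The sketch below assumes $V = -G_N M / r$, so that $h_{00} = -2V$, and works in units where $G_N M = 1$; the derivative is estimated by central differences. The constant GM, the sample point and the function names are assumptions made for this illustration.

```python
import math

GM = 1.0  # assumed units in which G_N * M = 1

def h00(x, y, z):
    """h_00 = -2 V for the Newtonian point-mass potential V = -GM / r."""
    return 2.0 * GM / math.sqrt(x * x + y * y + z * z)

def gravitational_field(point, eps=1e-6):
    """F_i = (1/2) d_i h_00, Eq. (7.11), estimated by central differences."""
    field = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += eps
        minus[i] -= eps
        field.append(0.5 * (h00(*plus) - h00(*minus)) / (2 * eps))
    return field

F = gravitational_field([3.0, 4.0, 0.0])
newton = [-GM * 3.0 / 125.0, -GM * 4.0 / 125.0, 0.0]   # -GM x_i / r^3 with r = 5
print(F, newton)
```

The field obtained from $h_{00}$ points towards the mass and falls off as the inverse square of the distance, as Newton's law requires.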

Now let $T_{\mu\nu}$ be the energy-momentum-stress-tensor; $T_{44} = -T_{00}$ is the mass-energy
density and since in our coordinate frame the distinction between covariant derivative and
ordinary derivatives is negligible, Eq. (1.26) for energy-momentum conservation reads

$D_\mu T_{\mu\nu} = 0$ . (7.12)

In other coordinate frames this deviates from ordinary energy-momentum conservation
just because the gravitational fields can carry away energy and momentum; the $T_{\mu\nu}$
we work with presently will be only the contribution from stars and planets, not their
gravitational fields. Now Newton's equations for slowly moving matter imply

$F_i = - \Gamma^i_{00} = - \partial_i V(x) = \frac{1}{2}\, \partial_i h_{00}$ ;
$\partial_i F_i = - 4\pi G_N\, T_{44} = 4\pi G_N\, T_{00}$ ;
$\vec\partial^{\,2} h_{00} = 8\pi G_N\, T_{00}$ . (7.13)

This we now wish to rewrite in a way that is invariant under general coordinate
transformations. This is a very important step in the theory. Instead of having one
component of the $T_{\mu\nu}$ depend on certain partial derivatives of the connection fields $\Gamma$
we want a relation between covariant tensors. The energy momentum density for matter,
$T_{\mu\nu}$ , satisfying Eq. (7.12), is clearly a covariant tensor. The only covariant tensors one
can build from the expressions in Eq. (7.13) are the Ricci tensor $R_{\mu\nu}$ and the scalar R .
The two independent components that are scalars under spacelike rotations are

$R_{00} = - \frac{1}{2}\, \vec\partial^{\,2} h_{00}$ ; (7.14)

and $R = \partial_i \partial_j h_{ij} + \vec\partial^{\,2} ( h_{00} - h_{ii} )$ . (7.15)

Now these equations strongly suggest a relationship between the tensors $T_{\mu\nu}$ and $R_{\mu\nu}$ ,
but we now have to be careful. Eq. (7.15) cannot be used since it is not a priori clear
whether we can neglect the spacelike components of $h_{ij}$ (we cannot). The most general
tensor relation one can expect of this type would be

$R_{\mu\nu} = A\, T_{\mu\nu} + B\, g_{\mu\nu}\, T^\alpha_\alpha$ , (7.16)

where A and B are constants yet to be determined. Here the trace of the energy
momentum tensor is, in the non-relativistic approximation

$T^\alpha_\alpha = - T_{00} + T_{ii}$ , (7.17)

so the 00 component can be written as

$R_{00} = - \frac{1}{2}\, \vec\partial^{\,2} h_{00} = (A + B)\, T_{00} - B\, T_{ii}$ , (7.18)

to be compared with (7.13). It is of importance to realize that in the Newtonian limit
the $T_{ii}$ term (the pressure p ) vanishes, not only because the pressure of ordinary
(non-relativistic) matter is very small, but also because it averages out to zero as a source: in
the stationary case we have

$0 = \partial_\mu T_{\mu i} = \partial_j T_{ji}$ , (7.19)

$\frac{d}{dx^1} \int T_{11}\, dx^2 dx^3 = - \int dx^2 dx^3\, ( \partial_2 T_{21} + \partial_3 T_{31} ) = 0$ , (7.20)

and therefore, if our source is surrounded by a vacuum, we must have

$\int T_{11}\, dx^2 dx^3 = 0 \ \rightarrow\ \int d^3\vec x\; T_{11} = 0$ ,

and similarly, $\int d^3\vec x\; T_{22} = \int d^3\vec x\; T_{33} = 0$ . (7.21)

We must conclude that all one can deduce from (7.18) and (7.13) is

$A + B = - 4\pi G_N$ . (7.22)

Fortunately we have another piece of information. The trace of (7.16) is
$R = (A + 4B)\, T^\alpha_\alpha$ . The quantity $G_{\mu\nu}$ in Eq. (6.35) is then

$G_{\mu\nu} = A\, T_{\mu\nu} - ( \frac{1}{2} A + B )\, T^\alpha_\alpha\, g_{\mu\nu}$ , (7.23)

and since we have both the Bianchi identity (6.35) and the energy conservation law (7.12)
we get (using the modified summation convention, Eq. (6.33))

$D_\mu G_{\mu\nu} = 0$ ; $D_\mu T_{\mu\nu} = 0$ ; therefore $( \frac{1}{2} A + B )\, \partial_\nu ( T^\alpha_\alpha ) = 0$ . (7.24)

Now $T^\alpha_\alpha$ , the trace of the energy-momentum tensor, is dominated by $-T_{00}$ . This will in
general not be space-time independent. So our theory would be inconsistent unless

$B = - \frac{1}{2} A$ ; $A = - 8\pi G_N$ , (7.25)

using (7.22). We conclude that the only tensor equation consistent with Newton's equation
in a locally flat coordinate frame is

$R_{\mu\nu} - \frac{1}{2}\, R\, g_{\mu\nu} = - 8\pi G_N\, T_{\mu\nu}$ , (7.26)

where the sign of the energy-momentum tensor is defined by ( $\varrho$ is the energy density)

$T_{44} = - T_{00} = \varrho$ . (7.27)

This is Einstein's celebrated law of gravitation. From the equivalence principle it follows
that if this law holds in a locally flat coordinate frame it should hold in any other frame
as well.
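The two conditions that fix the constants, (7.22) and the vanishing of $\frac{1}{2}A + B$ required by (7.24), form a small linear system. The sketch below solves it exactly with rational arithmetic, measuring A and B in units of $\pi G_N$ (a choice made here only to keep the numbers simple). The helper name solve_2x2 is hypothetical.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*A + a12*B = b1 and a21*A + a22*B = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    A = (b1 * a22 - a12 * b2) / det
    B = (a11 * b2 - b1 * a21) / det
    return A, B

# Unknowns A and B in units of pi * G_N (an assumption of this sketch).
# Row 1:  A + B = -4      from (7.22)
# Row 2:  A/2 + B = 0     from (7.24)
A, B = solve_2x2(Fraction(1), Fraction(1), Fraction(-4),
                 Fraction(1, 2), Fraction(1), Fraction(0))
print(A, B)
```

The solution is $A = -8\pi G_N$ and $B = 4\pi G_N = -\frac{1}{2}A$, in agreement with (7.25).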

Since both left and right of Eq. (7.26) are symmetric under interchange of the indices
we have here 10 equations. We know however that both sides obey the conservation law

$D_\mu G_{\mu\nu} = 0$ . (7.28)

These are 4 equations that are automatically satisfied. This leaves 6 non-trivial equations.
They should determine the 10 components of the metric tensor $g_{\mu\nu}$ , so one expects a
remaining freedom of 4 equations. Indeed the coordinate transformations are as yet
undetermined, and there are 4 coordinates. Counting degrees of freedom this way suggests
that Einstein's gravity equations should indeed determine the space-time metric uniquely
(apart from coordinate transformations) and could replace Newton's gravity law. However
one has to be extremely careful with arguments of this sort. In the next chapter we show
that the equations are associated with an action principle, and this is a much better
way to get some feeling for the internal self-consistency of the equations. Fundamental
difficulties are not completely resolved, in particular regarding the possible emergence of
singularities in the solutions.

Note that (7.26) implies

$8\pi G_N\, T^\mu_\mu = R$ ;

$R_{\mu\nu} = - 8\pi G_N\, ( T_{\mu\nu} - \frac{1}{2}\, T^\alpha_\alpha\, g_{\mu\nu} )$ , (7.29)

therefore in parts of space-time where no matter is present one has

$R_{\mu\nu} = 0$ , (7.30)

but the complete Riemann tensor $R^\alpha{}_{\beta\gamma\delta}$ will not vanish.

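The Weyl tensor is defined by subtracting from $R_{\alpha\beta\gamma\delta}$ a part in such a way that all
contractions of any pair of indices give zero:

$C_{\alpha\beta\gamma\delta} = R_{\alpha\beta\gamma\delta} + \frac{1}{2} \big[ g_{\alpha\delta}\, R_{\gamma\beta} + g_{\beta\gamma}\, R_{\alpha\delta} + \frac{1}{3}\, R\, g_{\alpha\gamma}\, g_{\beta\delta} - ( \gamma \Leftrightarrow \delta ) \big]$ . (7.31)

This construction is such that $C_{\alpha\beta\gamma\delta}$ has the same symmetry properties (5.26), (5.29)
and (6.29) and furthermore

$C^\mu{}_{\beta\mu\gamma} = 0$ . (7.32)

If one carefully counts the number of independent components one finds in a given point
x that $R_{\alpha\beta\gamma\delta}$ has 20 degrees of freedom, and $R_{\mu\nu}$ and $C_{\alpha\beta\gamma\delta}$ each 10.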
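The counting can be reproduced with the standard closed formulas for n dimensions, which are assumed here and not derived in these notes: a tensor with the symmetries (5.26), (5.29) and (6.29) has $n^2(n^2-1)/12$ independent components, a symmetric two-index tensor has $n(n+1)/2$, and for $n \ge 3$ the Weyl tensor carries the difference. The function names are illustrative.

```python
def riemann_components(n):
    """Independent components of R_{alpha beta gamma delta} in n dimensions."""
    return n * n * (n * n - 1) // 12

def ricci_components(n):
    """Independent components of the symmetric tensor R_{mu nu}."""
    return n * (n + 1) // 2

def weyl_components(n):
    """Independent components of the Weyl tensor; this difference holds for n >= 3."""
    return riemann_components(n) - ricci_components(n)

counts = (riemann_components(4), ricci_components(4), weyl_components(4))
print(counts)
```

In four dimensions this gives 20, 10 and 10. In three dimensions the same formulas give 6, 6 and 0, so that there the Ricci tensor already determines the full curvature.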

The cosmological constant 

We have seen that Eq. (7.26) can be derived uniquely; there is no room for correction
terms if we insist that both the equivalence principle and the Newtonian limit are
valid. But if we allow for a small deviation from Newton's law then another term can be
imagined. Apart from (7.28) we also have

$D_\mu g_{\mu\nu} = 0$ ,   (7.33)

and therefore one might replace (7.26) by

$R_{\mu\nu} - \tfrac12 R\, g_{\mu\nu} + \Lambda g_{\mu\nu} = -8\pi G_N\, T_{\mu\nu}$ ,   (7.34)

where $\Lambda$ is a constant of Nature, with a very small numerical value, called the cosmological
constant. The extra term may also be regarded as a 'renormalization':

$\delta T_{\mu\nu} \propto g_{\mu\nu}$ ,   (7.35)

implying some residual energy and pressure in the vacuum. Einstein first introduced 
such a term in order to obtain interesting solutions, but later "regretted this". In any 
case, a residual gravitational field emanating from the vacuum, if it exists at all, must be 
extraordinarily weak. For a long time, it was presumed that the cosmological constant
$\Lambda = 0$ . Only very recently, strong indications were reported for a tiny, positive value of $\Lambda$ .
Whether or not the term exists, it is very mysterious why $\Lambda$ should be so close to zero. In
modern field theories it is difficult to understand why the energy and momentum density
of the vacuum state (which just happens to be the state with lowest energy content) are
tuned to zero. So we do not know why $\Lambda = 0$ , exactly or approximately, with or without
Einstein's regrets.

8. The action principle. 

We saw that a particle's trajectory in a space-time with a gravitational field is determined 
by the geodesic equation (5.8), but also by postulating that the quantity 

$\ell = \int ds$ , with $(ds)^2 = -g_{\mu\nu}\, dx^\mu dx^\nu$ ,   (8.1)

is stationary under infinitesimal displacements $x^\mu(\tau) \to x^\mu(\tau) + \delta x^\mu(\tau)$ :

$\delta\ell = 0$ .   (8.2)

This is an example of an action principle, $\ell$ being the action for the particle's motion in
its orbit. The advantage of this action principle is its simplicity as well as the fact that
the expressions are manifestly covariant so that we see immediately that they will give
the same results in any coordinate frame. Furthermore the existence of solutions of (8.2)
is very plausible in particular if the expression for this action is bounded. For example,
for most timelike geodesics $\ell$ is an absolute maximum.
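The statement that a timelike geodesic maximizes $\ell$ can be illustrated in flat space-time, where the geodesic between two events is a straight line. The following is a minimal sketch, assuming units with c = 1, one space dimension, and piecewise straight paths; the function name and the sample events are illustrative only.

```python
import math


def proper_time(events):
    """Sum of sqrt(dt^2 - dx^2) over straight timelike segments (c = 1).

    `events` is a list of (t, x) pairs visited in order.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval = dt * dt - dx * dx
        if interval < 0:
            raise ValueError("segment is not timelike")
        total += math.sqrt(interval)
    return total


# Straight path between two events versus a path with a detour in between.
straight = proper_time([(0.0, 0.0), (10.0, 0.0)])
detour = proper_time([(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)])

if __name__ == "__main__":
    print(straight, detour)
```

Any detour between the same two end points gives a smaller value than the straight path, which is the maximum property referred to above.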

Now let

$g \overset{\rm def}{=} \det(g_{\mu\nu})$ .   (8.3)

Then consider in some volume $V$ of 4 dimensional space-time the so-called Einstein-Hilbert
action:

$I = \int_V \sqrt{-g}\, R\, d^4x$ ,   (8.4)

where $R$ is the Ricci scalar (6.32). We saw in chapters 4 and 6 that with this factor $\sqrt{-g}$
the integral (8.4) is invariant under coordinate transformations, but if we keep $V$ finite
then of course the boundary should be kept unaffected. Consider now an infinitesimal
variation of the metric tensor $g_{\mu\nu}$ :

$\tilde g_{\mu\nu} = g_{\mu\nu} + \delta g_{\mu\nu}$ ,   (8.5)

so that its inverse, $g^{\mu\nu}$ , changes as

$\tilde g^{\mu\nu} = g^{\mu\nu} - \delta g^{\mu\nu}$ .   (8.6)

We impose that $\delta g_{\mu\nu}$ and its first derivatives vanish on the boundary of $V$ . What effect
does this have on the Ricci tensor $R_{\mu\nu}$ and the Ricci scalar $R$ ?

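First, compute to lowest order in $\delta g_{\mu\nu}$ the variation $\delta\Gamma^\lambda_{\mu\nu}$ of the connection field

$\tilde\Gamma^\lambda_{\mu\nu} = \Gamma^\lambda_{\mu\nu} + \delta\Gamma^\lambda_{\mu\nu}$ .

Using this, and Eqs. (6.8), (6.10) and (6.11), we find :

$\delta\Gamma^\lambda_{\mu\nu} = \tfrac12 g^{\lambda\alpha} \left( \partial_\mu \delta g_{\alpha\nu} + \partial_\nu \delta g_{\alpha\mu} - \partial_\alpha \delta g_{\mu\nu} \right) - \delta g^{\lambda\alpha}\, \Gamma_{\alpha\mu\nu}$ .

Now, we make an important observation. Since $\delta\Gamma^\lambda_{\mu\nu}$ is the difference between two
connection fields, it transforms as a true tensor. Therefore, this last expression can be
written in such a way that we see only covariant derivatives:

$\delta\Gamma^\lambda_{\mu\nu} = \tfrac12 g^{\lambda\alpha} \left( D_\mu \delta g_{\alpha\nu} + D_\nu \delta g_{\alpha\mu} - D_\alpha \delta g_{\mu\nu} \right)$ .

This, of course, we can check explicitly. Similarly, again using the fact that these expressions
must transform as true tensors, we derive (see Eq. (5.27)):

$\tilde R^\nu_{\ \alpha\beta\gamma} = R^\nu_{\ \alpha\beta\gamma} + D_\beta\, \delta\Gamma^\nu_{\alpha\gamma} - D_\gamma\, \delta\Gamma^\nu_{\alpha\beta}$ ,

so that the variation in the Ricci tensor $R_{\mu\nu}$ to lowest order in $\delta g_{\mu\nu}$ is given by

$\tilde R_{\mu\nu} = R_{\mu\nu} + \tfrac12 \left( -D^2 \delta g_{\mu\nu} + D_\alpha D_\mu \delta g^\alpha_{\ \nu} + D_\alpha D_\nu \delta g^\alpha_{\ \mu} - D_\mu D_\nu \delta g^\alpha_{\ \alpha} \right)$ .   (8.7)

Exercise: check the derivation of Eq. (8.7).
With $R = g^{\mu\nu} R_{\mu\nu}$ we have

$\tilde R = R - R_{\mu\nu}\, \delta g^{\mu\nu} + \left( D_\mu D_\nu \delta g^{\mu\nu} - D^2 \delta g^\alpha_{\ \alpha} \right)$ .   (8.8)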

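Finally, the determinant of $\tilde g_{\mu\nu}$ is obtained by

$\det(\tilde g_{\mu\nu}) = \det\left( g_{\mu\lambda} (\delta^\lambda_\nu + g^{\lambda\alpha} \delta g_{\alpha\nu}) \right) = \det(g_{\mu\lambda})\, \det(\delta^\lambda_\nu + g^{\lambda\alpha} \delta g_{\alpha\nu}) = g\, (1 + \delta g^\mu_{\ \mu})$ ;   (8.9)

$\sqrt{-\tilde g} = \sqrt{-g}\, \left( 1 + \tfrac12 \delta g^\alpha_{\ \alpha} \right)$ ,   (8.10)

and so we find for the variation of the integral $I$ as a consequence of the variation (8.5):

$\tilde I = I + \int_V \sqrt{-g} \left( -R^{\mu\nu} + \tfrac12 R\, g^{\mu\nu} \right) \delta g_{\mu\nu} + \int_V \sqrt{-g} \left( D_\mu D_\nu - g_{\mu\nu} D^2 \right) \delta g^{\mu\nu}$ .   (8.11)

However,

$\sqrt{-g}\, D_\mu X^\mu = \partial_\mu \left( \sqrt{-g}\, X^\mu \right)$ ,   (8.12)

and therefore the second half in (8.11) is an integral over a pure derivative and since
we demanded that $\delta g_{\mu\nu}$ (and its derivatives) vanish at the boundary the second half of
Eq. (8.11) vanishes. So we find

$\delta I = -\int_V \sqrt{-g}\, G^{\mu\nu}\, \delta g_{\mu\nu}$ ,   (8.13)

with $G^{\mu\nu}$ as defined in (6.35). Note that in these derivations we mixed superscript and
subscript indices. Only in (8.12) it is essential that $X^\mu$ is a contra-vector since we insist
on having an ordinary rather than a covariant derivative in order to be able to do partial
integration. Here we see that partial integration using covariant derivatives works out
fine provided we have the factor $\sqrt{-g}$ inside the integral as indicated.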
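The first-order expansion of the determinant in (8.9) and (8.10) is easy to test numerically. The sketch below assumes a flat background metric diag(-1, 1, 1, 1) and an arbitrary small symmetric perturbation; the numbers in `pattern` are purely illustrative.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


# Flat background metric diag(-1, 1, 1, 1); it is its own inverse.
eta = [[-1.0 if i == j == 0 else float(i == j) for j in range(4)]
       for i in range(4)]

# Small symmetric perturbation delta g_{mu nu} (illustrative numbers only).
eps = 1e-4
pattern = [[1.0, 0.5, 0.0, 0.2],
           [0.5, -2.0, 0.3, 0.0],
           [0.0, 0.3, 0.7, 0.1],
           [0.2, 0.0, 0.1, 1.5]]
dg = [[eps * pattern[i][j] for j in range(4)] for i in range(4)]
g_tilde = [[eta[i][j] + dg[i][j] for j in range(4)] for i in range(4)]

# Trace delta g^mu_mu = g^{mu nu} delta g_{mu nu}; eta is diagonal.
trace = sum(eta[i][i] * dg[i][i] for i in range(4))

exact_sqrt = math.sqrt(-det(g_tilde))                          # left side of (8.10)
first_order = math.sqrt(-det(eta)) * (1.0 + 0.5 * trace)       # right side of (8.10)

if __name__ == "__main__":
    print(exact_sqrt, first_order, exact_sqrt - first_order)
```

The two numbers differ only at second order in the perturbation, as the derivation requires.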

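We read off from Eq. (8.13) that Einstein's equations for the vacuum, $G_{\mu\nu} = 0$ , are
equivalent with demanding that

$\delta I = 0$ ,   (8.14)

for all smooth variations $\delta g_{\mu\nu}(x)$ . In the previous chapter a connection was suggested
between the gauge freedom in choosing the coordinates on the one hand and the conservation
law (Bianchi identity) for $G_{\mu\nu}$ on the other. We can now expatiate on this.

For any system, even if it does not obey Einstein's equations, $I$ will be invariant under
infinitesimal coordinate transformations:

$\tilde x^\mu = x^\mu + u^\mu(x)$ ,
$\tilde g_{\mu\nu}(x) = \frac{\partial \tilde x^\alpha}{\partial x^\mu} \frac{\partial \tilde x^\beta}{\partial x^\nu}\, g_{\alpha\beta}(\tilde x)$ ;
$g_{\alpha\beta}(\tilde x) = g_{\alpha\beta}(x) + u^\lambda \partial_\lambda g_{\alpha\beta}(x) + O(u^2)$ ;
$\frac{\partial \tilde x^\alpha}{\partial x^\mu} = \delta^\alpha_\mu + u^\alpha_{\ ,\mu} + O(u^2)$ ,   (8.15)

so that

$\tilde g_{\mu\nu}(x) = g_{\mu\nu} + u^\alpha \partial_\alpha g_{\mu\nu} + g_{\alpha\nu}\, u^\alpha_{\ ,\mu} + g_{\mu\alpha}\, u^\alpha_{\ ,\nu} + O(u^2)$ .   (8.16)

This combination precisely produces the covariant derivatives of $u_\mu$ . Again the reason
is that all other tensors in the equation are true tensors so that non-covariant derivatives
are outlawed. And so we find that the variation in $g_{\mu\nu}$ is

$\tilde g_{\mu\nu} = g_{\mu\nu} + D_\mu u_\nu + D_\nu u_\mu$ .   (8.17)

This leaves $I$ always invariant:

$\delta I = -2 \int \sqrt{-g}\, G^{\mu\nu} D_\mu u_\nu = 0$ ;   (8.18)

for any $u_\nu(x)$ . By partial integration one finds that the equation

$\sqrt{-g}\, u_\nu D_\mu G^{\mu\nu} = 0$   (8.19)

is automatically obeyed for all $u_\nu(x)$ . This is why the Bianchi identity $D_\mu G_{\mu\nu} = 0$ ,
Eq. (6.35), is always automatically obeyed.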

The action principle can be expanded for the case that matter is present. Take for
instance scalar fields $\phi(x)$ . In ordinary flat space-time these obey the Klein-Gordon equation:

$(\partial^2 - m^2)\phi = 0$ .   (8.20)

In a gravitational field this will have to be replaced by the covariant expression

$(D^2 - m^2)\phi = (g^{\mu\nu} D_\mu D_\nu - m^2)\phi = 0$ .   (8.21)

It is not difficult to verify that this equation also follows by demanding that

$\delta J = 0$ ;

$J = \tfrac12 \int \sqrt{-g}\, d^4x\, \phi\, (D^2 - m^2)\, \phi = \int \sqrt{-g}\, d^4x \left( -\tfrac12 (D_\mu\phi)^2 - \tfrac12 m^2 \phi^2 \right)$ ,   (8.22)

for all infinitesimal variations $\delta\phi$ in $\phi$ . (Note that (8.21) follows from (8.22) via partial
integrations which are allowed for covariant derivatives in the presence of the $\sqrt{-g}$ term.)


Now consider the sum

$S = \frac{1}{16\pi G_N}\, I + J = \int_V \sqrt{-g}\, d^4x \left( \frac{R}{16\pi G_N} - \tfrac12 (D_\mu\phi)^2 - \tfrac12 m^2 \phi^2 \right)$ ,   (8.23)

and remember that

$(D_\mu\phi)^2 = g^{\mu\nu}\, \partial_\mu\phi\, \partial_\nu\phi$ .   (8.24)

Then variation in $\phi$ will yield the Klein-Gordon equation (8.21) for $\phi$ as usual. Variation
in $g_{\mu\nu}$ now gives

$\delta S = \int_V \sqrt{-g}\, d^4x \left( -\frac{G^{\mu\nu}}{16\pi G_N} + \tfrac12 D^\mu\phi\, D^\nu\phi - \tfrac14 \left( (D_\alpha\phi)^2 + m^2\phi^2 \right) g^{\mu\nu} \right) \delta g_{\mu\nu}$ .   (8.25)

So we have

$G_{\mu\nu} = -8\pi G_N\, T_{\mu\nu}$ ,   (8.26)

if we write

$T_{\mu\nu} = -D_\mu\phi\, D_\nu\phi + \tfrac12 \left( (D_\alpha\phi)^2 + m^2\phi^2 \right) g_{\mu\nu}$ .   (8.27)

Now since $J$ is invariant under coordinate transformations, Eqs. (8.15), it must obey a
continuity equation just as (8.18), (8.19):

$D_\mu T_{\mu\nu} = 0$ .   (8.28)

This equation holds only if the matter field(s) $\phi(x)$ obey the matter field equations. That
is because we should add to Eqs. (8.15) the transformation rule for these fields:

$\tilde\phi(x) = \phi(x) + u^\lambda \partial_\lambda \phi(x) + O(u^2)$ .

Precisely if the fields obey the field equations, the action is stationary under such variations
of these fields, so that we could omit this contribution and use an equation similar to (8.18)
to derive (8.28). It is important to observe that, by varying the action with respect to
the metric tensor $g_{\mu\nu}$ , as is done in Eq. (8.25), we can always find a symmetric tensor
$T_{\mu\nu}(x)$ that obeys a conservation law (8.28) as soon as the field equations are obeyed.

Since we also have

$T_{44} = \tfrac12 (\vec D\phi)^2 + \tfrac12 m^2\phi^2 + \tfrac12 (D_0\phi)^2 = \mathcal{H}(x)$ ,   (8.29)

which can be identified as the energy density for the field $\phi$ , the $\{i0\}$ components of
(8.28) must represent the energy flow, which is the momentum density, and this implies
that this $T_{\mu\nu}$ has to coincide exactly with the ordinary energy-momentum density for the
scalar field. In conclusion, demanding (8.25) to vanish also for all infinitesimal variations
in $g_{\mu\nu}$ indeed gives us the correct Einstein equation (8.26).


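Finally, there is room for a cosmological term in the action:

$S = \int_V \sqrt{-g} \left( \frac{R - 2\Lambda}{16\pi G_N} - \tfrac12 (D_\mu\phi)^2 - \tfrac12 m^2\phi^2 \right)$ .   (8.30)

This example with the scalar field $\phi$ can immediately be extended to other kinds of
matter such as other fields, fields with further interaction terms (such as $\lambda\phi^4$ ), and
electromagnetism, and even liquids and free point particles. Every time, all we need is
the classical action $S$ which we rewrite in a covariant way: $S_{\rm matter} = \int \sqrt{-g}\, \mathcal{L}_{\rm matter}$ ,
to which we then add the Einstein-Hilbert action:

$S = \int_V \sqrt{-g} \left( \frac{R - 2\Lambda}{16\pi G_N} + \mathcal{L}_{\rm matter} \right)$ .   (8.31)

Of course we will often omit the $\Lambda$ term. Unless stated otherwise the integral symbol
$\int$ will stand short for $\int d^4x$ .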

9. Special coordinates. 

In the preceding chapters no restrictions were made concerning the choice of coordinate
frame. Every choice is equivalent to any other choice (provided the mapping is one-to-one
and differentiable). Complete invariance was ensured. However, when one wishes to calculate
in detail the properties of some particular solution such as space-time surrounding
a point particle or the history of the universe, one is forced to make a choice. Since
we have a four-fold freedom for the use of coordinates we can in general formulate four
equations and then try to choose our coordinates in such a way that these equations are
obeyed. Such equations are called "gauge conditions". Of course one should choose the
gauge conditions in such a way that one can easily see how to obey them, and demonstrate
that coordinates obeying these equations exist. We discuss some examples.

1) The temporal gauge. 


$g_{00} = -1$ ;   (9.1)
$g_{0i} = 0$ , $(i = 1, 2, 3)$ .   (9.2)

At first sight it seems easy to show that one can always obey these. If in an arbitrary
coordinate frame the equations (9.1) and (9.2) are not obeyed, one writes

$\tilde g_{00} = g_{00} + 2 D_0 u_0 = -1$ ,   (9.3)
$\tilde g_{0i} = g_{0i} + D_i u_0 + D_0 u_i = 0$ .   (9.4)

$u_0(\vec x, t)$ can be solved from eq. (9.3) by integrating (9.3) in the time direction, after
which we can find $u_i$ by integrating (9.4) with respect to time. We then apply Eq. (8.17)
to observe that $\tilde g_{\mu\nu}(x - u)$ obeys the equations (9.1) and (9.2) up to terms of order
$(u)^2$ (note that Eqs. (9.3) and (9.4) only correspond to coordinate transformations when
$u$ is infinitesimal). Iterating the procedure, it seems easy to obey (9.1) and (9.2) with
increasing accuracy. Will such an iteration procedure converge? These are coordinates in
which there is no gravitational field (only space, not space-time, is curved), hence all lines
of the form $\vec x(t)$ = constant are actually geodesics, as one can easily check (in Eq. (5.8),
$\Gamma^i_{00} = 0$ ). Therefore they are "freely falling" coordinates, but of course freely falling
objects in general will go into orbits and hence either wander away from or collide against
each other, at which instances these coordinates generate singularities.

2) The gauge:

$\partial_\mu g_{\mu\nu} = 0$ .   (9.5)

This gauge has the advantage of being Lorentz invariant. The equations for infinitesimal
$u_\mu$ become

$\partial_\mu \tilde g_{\mu\nu} = \partial_\mu g_{\mu\nu} + \partial_\mu D_\mu u_\nu + \partial_\mu D_\nu u_\mu = 0$ .   (9.6)

(Note that ordinary and covariant derivatives must now be distinguished carefully.) In an
iterative procedure we first solve for $\partial_\nu u_\nu$ . Let $\partial_\nu$ act on (9.6):

$2 \partial^2 \partial_\nu u_\nu = -\partial_\nu \partial_\mu g_{\mu\nu}$ + higher orders,   (9.7)

after which

$\partial^2 u_\nu = -\partial_\mu g_{\mu\nu} - \partial_\nu (\partial_\mu u_\mu)$ + higher orders.   (9.8)

These are d'Alembert equations of which the solutions are less singular than those of Eqs.
(9.3) and (9.4).

A smarter choice is

3) the harmonic or De Donder gauge:

$g^{\mu\nu}\, \Gamma^\lambda_{\mu\nu} = 0$ .   (9.9)

Coordinates obeying this condition are called harmonic coordinates, for the following
reason. Consider a scalar field $V$ obeying

$D^2 V = 0$ ,   (9.10)

or $g^{\mu\nu} \left( \partial_\mu \partial_\nu V - \Gamma^\lambda_{\mu\nu} \partial_\lambda V \right) = 0$ .   (9.11)

Now let us choose four coordinates $x^{1,\dots,4}$ that obey this equation. Note that these then
are not covariant equations because the index $\alpha$ of $x^\alpha$ is not participating:

$g^{\mu\nu} \left( \partial_\mu \partial_\nu x^\alpha - \Gamma^\lambda_{\mu\nu} \partial_\lambda x^\alpha \right) = 0$ .   (9.12)

Now of course, in the gauge (9.9),

$\partial_\mu \partial_\nu x^\alpha = 0$ ; $\partial_\lambda x^\alpha = \delta^\alpha_\lambda$ .   (9.13)

Hence, in these coordinates, the equations (9.12) imply (9.9). Eq. (9.10) can be solved
quite generally (it helps a lot that the equation is linear!). For

$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$   (9.14)

with infinitesimal $h_{\mu\nu}$ this gauge differs slightly from gauge # 2:

$f_\nu = \partial_\mu h_{\mu\nu} - \tfrac12 \partial_\nu h_{\mu\mu} = 0$ ,   (9.15)

and for infinitesimal $u_\nu$ we have

$\tilde f_\nu = f_\nu + \partial^2 u_\nu + \partial_\mu \partial_\nu u_\mu - \partial_\nu \partial_\mu u_\mu$
$= f_\nu + \partial^2 u_\nu = 0$ (apart from higher orders)   (9.16)

so (of course) we get directly a d'Alembert equation for $u_\nu$ . Observe also that the equation
(9.10) is the massless Klein-Gordon equation that extremises the action $J$ of Eq. (8.22)
when $m = 0$ . In this gauge the infinitesimal expression (7.7) for $R_{\mu\nu}$ simplifies into

$R_{\mu\nu} = -\tfrac12 \partial^2 h_{\mu\nu}$ ,   (9.17)

which simplifies practical calculations.

The action principle for Einstein's equations can be extended such that the gauge
condition also follows from varying the same action as the one that generates the field
equations. This can be done in various ways. Suppose the gauge condition is phrased as

$f_\mu \left( \{ g_{\alpha\beta} \}, x \right) = 0$ ,   (9.18)

and that it has been shown that a coordinate choice that obeys (9.18) always exists. Then
one adds to the invariant action (8.23), which we now call $S_{\rm inv}$ :

$S_{\rm gauge} = \int \sqrt{-g}\, \lambda^\mu(x)\, f_\mu(g, x)\, d^4x$ ,   (9.19)

$S_{\rm total} = S_{\rm inv} + S_{\rm gauge}$ ,   (9.20)

where $\lambda^\mu(x)$ is a new dynamical variable, called a Lagrange multiplier. Variation
$\lambda \to \lambda + \delta\lambda$ immediately yields (9.18) as Euler-Lagrange equation. However, we can also
consider as a variation the gauge transformation

$\tilde g_{\mu\nu}(x) = \tilde x^\alpha_{\ ,\mu}\, \tilde x^\beta_{\ ,\nu}\, g_{\alpha\beta}(\tilde x(x))$ .   (9.21)

Then

$\delta S_{\rm inv} = 0$ ,   (9.22)

$\delta S_{\rm gauge} = \int \lambda^\mu\, \delta f_\mu \overset{?}{=} 0$ .   (9.23)

Now we must assume that there exists a gauge transformation that produces

$\delta f_\mu(x) = \delta^\alpha_\mu\, \delta(x - x^{(1)})$ ,   (9.24)

for any choice of the point $x^{(1)}$ and the index $\alpha$ . This is precisely the assumption that
under any circumstance a gauge transformation exists that can tune $f_\mu$ to zero. Then
the Euler-Lagrange equation tells us that

$\delta S_{\rm gauge} = \lambda^\alpha(x^{(1)}) \ \to\ \lambda^\alpha(x^{(1)}) = 0$ .   (9.25)

All other variations of $g_{\mu\nu}$ that are not coordinate transformations then produce the usual
equations as described in the previous chapter.

A technical detail: often Eq. (9.24) cannot be realized by gauge transformations that
vanish everywhere on the boundary. Therefore we must allow $\delta f_\mu$ also to be non-vanishing
on the boundary. If now we impose $\lambda = 0$ on the boundary then this insures (9.25): $\lambda = 0$
everywhere. This means that the equations generated by the action (9.20) may generate
solutions with $\lambda \neq 0$ that have to be discarded. There will always be solutions with
$\lambda = 0$ everywhere, and these are the solutions we want.

Another way to implement the gauge condition in the Lagrangian is by choosing

$S_{\rm gauge} = \int -\tfrac12 \sqrt{-g}\, g^{\mu\nu} f_\mu f_\nu$ .   (9.26)

Let us write this as $\int -\tfrac12 (f_\alpha)^2$ , where $f_\alpha$ is defined as $\left( \sqrt{\sqrt{-g}\; g}\, \right)^{\alpha\mu} f_\mu$ . If now we
perform an infinitesimal gauge transformation (8.17), and again assume that it can be
done such that Eq. (9.24) is realized for $\delta f_\alpha$ , we find

$\delta S_{\rm total} = \delta S_{\rm gauge} = -f_\alpha(x^{(1)})$ .   (9.27)

Requiring $S_{\rm total}$ to be stationary then implies $f_\mu(x^{(1)}) = 0$ , and all other equations can
be seen to be compatible with the ones from $S_{\rm inv}$ alone.

Here, one must impose $f_\mu(x) = 0$ on the boundary, which then will guarantee that
$f_\mu = 0$ everywhere in space-time. By choosing to fix the gauge this way, one can often
realize that $S_{\rm total}$ has a simpler form than $S_{\rm inv}$ , so that calculations at a later stage
simplify, for instance when gravitational radiation is considered (Chapter 15).

10. Electromagnetism. 

We write the Lagrangian for the Maxwell equations as^

$\mathcal{L} = -\tfrac14 F_{\mu\nu} F_{\mu\nu} + J_\mu A_\mu$ ,   (10.1)

with

$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ .   (10.2)

^Note that conventions used here differ from others such as Jackson, Classical Electrodynamics by
factors such as $4\pi$ . The reader may have to adapt the expressions here to his or her own notation. Again
the modified summation convention of Eq. (6.33) is implied.

This means that for any variation

$A_\mu \to A_\mu + \delta A_\mu$ ,   (10.3)

the action

$S = \int \mathcal{L}\, d^4x$ ,   (10.4)

should be stationary when the Maxwell equations are obeyed. We see indeed that, if $\delta A_\nu$
vanishes on the boundary,

$\delta S = \int \left( -F_{\mu\nu}\, \partial_\mu \delta A_\nu + J_\mu\, \delta A_\mu \right) d^4x$

$= \int d^4x\, \delta A_\nu \left( \partial_\mu F_{\mu\nu} + J_\nu \right)$ ,   (10.5)

using partial integration. Therefore (in our simplified units)

$\delta S = 0 \ \to\ \partial_\mu F_{\mu\nu} = -J_\nu$ .   (10.6)

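Describing now the interactions of the Maxwell field with the gravitational field is
easy. We first have to make $S$ covariant:

$S^{\rm Max} = \int d^4x\, \sqrt{-g} \left( -\tfrac14 g^{\mu\alpha} g^{\nu\beta} F_{\mu\nu} F_{\alpha\beta} + g^{\mu\nu} J_\mu A_\nu \right)$ ,   (a)
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ (unchanged) ,   (b)   (10.7)

and

$S = \int \sqrt{-g} \left( \frac{R - 2\Lambda}{16\pi G_N} \right) + S^{\rm Max}$ .   (10.8)

Indices may be raised or lowered with the usual conventions.

The energy-momentum tensor can be read off from (10.8) by varying with respect to
$g_{\mu\nu}$ (and multiplying by 2):

$T_{\mu\nu} = -F_{\mu\alpha} F_\nu^{\ \alpha} + \left( \tfrac14 F_{\alpha\beta} F^{\alpha\beta} - J^\alpha A_\alpha \right) g_{\mu\nu}$ ;   (10.9)

here $J^\alpha$ (with the superscript index) was kept as an external fixed source. We have, in
flat space-time, the energy density

$\varrho = -T_{00} = \tfrac12 (\vec E^2 + \vec B^2) - J^\alpha A_\alpha$ ,   (10.10)

as usual.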

We also see that: 


1) The interaction of the Maxwell field with gravitation is unique, there is no freedom 
to add an as yet unknown term. 

2) The Maxwell field is a source of gravitational fields via its energy-momentum tensor, 
as was to be expected. 

3) The homogeneous equation in Maxwell's laws, which follows from Eq. (10.7b),

$\partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha} + \partial_\gamma F_{\alpha\beta} = 0$ ,   (10.11)

remains unchanged.

4) Varying $A_\mu$ , we find that the inhomogeneous equation becomes

$D_\mu F^\mu_{\ \nu} = g^{\mu\beta} D_\mu F_{\beta\nu} = -J_\nu$ ,   (10.12)

and hence receives a contribution from the gravitational field $\Gamma^\lambda_{\mu\nu}$ and the potential
$g^{\alpha\beta}$ .

Exercise: show, both with formal arguments and explicitly, that Eq. (10.11) does not
change if we replace the derivatives by covariant derivatives.

Exercise: show that Eq. (10.12) can also be written as

$\partial_\mu \left( \sqrt{-g}\, F^{\mu\nu} \right) = -\sqrt{-g}\, J^\nu$ ,   (10.13)

and that

$\partial_\mu \left( \sqrt{-g}\, J^\mu \right) = 0$ .   (10.14)

Thus $\sqrt{-g}\, J^\mu$ is the real conserved current, and Eq. (10.13) implies that $\sqrt{-g}$ acts as
the dielectric constant of the vacuum.

11. The Schwarzschild solution. 

Einstein's equation, (7.26), should be exactly valid. Therefore it is interesting to search
for exact solutions. The simplest and most important one is empty space surrounding a
static star or planet. There, one has

$T_{\mu\nu} = 0$ .   (11.1)

If the planet does not rotate very fast, the effects of this rotation (which do exist!) may
be ignored. Then there is spherical symmetry. Take spherical coordinates,

$(x^0, x^1, x^2, x^3) = (t, r, \theta, \varphi)$ .   (11.2)

Spherical symmetry then implies

$g_{02} = g_{03} = g_{12} = g_{13} = g_{23} = 0$ ,   (11.3)

as well as

$g_{33} = \sin^2\theta\; g_{22}$ ,   (11.4)

and time-reversal symmetry

$g_{01} = 0$ .   (11.5)

The metric tensor is then specified by writing down the length $ds$ of the infinitesimal line
element:

$ds^2 = -A\, dt^2 + B\, dr^2 + C r^2 \left( d\theta^2 + \sin^2\theta\, d\varphi^2 \right)$ ,   (11.6)

where $A$ , $B$ , and $C$ are positive functions depending only on $r$ . At large distance from
the source we expect:

$r \to \infty$ ; $A, B, C \to 1$ .   (11.7)

Our freedom to choose the coordinates can be used to choose a new $r$ coordinate:

$\tilde r = \sqrt{C(r)}\; r$ , so that $C r^2 = \tilde r^2$ .   (11.8)

We then have

$B\, dr^2 = B \left( \sqrt{C} + \frac{r}{2\sqrt{C}} \frac{dC}{dr} \right)^{-2} d\tilde r^2 \overset{\rm def}{=} \tilde B\, d\tilde r^2$ .   (11.9)

In the new coordinate one has (henceforth omitting the tilde ~):

$ds^2 = -A\, dt^2 + B\, dr^2 + r^2 \left( d\theta^2 + \sin^2\theta\, d\varphi^2 \right)$ ,   (11.10)

where $A, B \to 1$ as $r \to \infty$ . The signature of this metric must be $(-, +, +, +)$ , so that

$A > 0$ and $B > 0$ .   (11.11)

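Now for general $A$ and $B$ we must find the affine connection $\Gamma$ they generate. There
is a method that saves us space in writing (but does not save us from having to do the
calculations), because many of its coefficients will be zero. If we know all geodesics

$\ddot x^\mu + \Gamma^\mu_{\kappa\lambda}\, \dot x^\kappa \dot x^\lambda = 0$ ,   (11.12)

then they uniquely determine all $\Gamma$ coefficients. The variational principle for a geodesic is

$0 = \delta \int ds = \delta \int \sqrt{ g_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} }\; d\sigma$ ,   (11.13)

where $\sigma$ is an arbitrary parametrization of the curve. In chapter 6 we saw that the
original curve is chosen to have

$\sigma = s$ .   (11.14)

The square root is then one, and Eq. (6.23) then corresponds to

$\tfrac12\, \delta \int g_{\mu\nu} \frac{dx^\mu}{ds} \frac{dx^\nu}{ds}\, ds = 0$ .   (11.15)

We write

$-A \dot t^2 + B \dot r^2 + r^2 \dot\theta^2 + r^2 \sin^2\theta\, \dot\varphi^2 = F(s)$ ; $\delta \int F\, ds = 0$ .   (11.16)

The dot stands for differentiation with respect to $s$ .
(11.16) generates the Lagrange equation

$\frac{d}{ds} \frac{\partial F}{\partial \dot x^\mu} = \frac{\partial F}{\partial x^\mu}$ .   (11.17)

For $\mu = 0$ this is

$\frac{d}{ds} (-2A \dot t) = 0$ ,   (11.18)

or

$\ddot t + \frac{1}{A} \left( \frac{\partial A}{\partial r}\, \dot r \right) \dot t = 0$ .   (11.19)

Comparing (11.12) we see that all $\Gamma^0_{\mu\nu}$ vanish except

$\Gamma^0_{10} = \Gamma^0_{01} = A'/2A$   (11.20)

(the accent, $'$ , stands for differentiation with respect to $r$ ; the 2 comes from symmetrization
of the subscript indices 0 and 1). For $\mu = 1$ Eq. (11.17) implies

$\ddot r + \frac{B'}{2B} \dot r^2 + \frac{A'}{2B} \dot t^2 - \frac{r}{B} \dot\theta^2 - \frac{r}{B} \sin^2\theta\, \dot\varphi^2 = 0$ ,   (11.21)

so that all $\Gamma^1_{\mu\nu}$ are zero except

$\Gamma^1_{00} = A'/2B$ ; $\Gamma^1_{11} = B'/2B$ ;
$\Gamma^1_{22} = -r/B$ ; $\Gamma^1_{33} = -(r/B) \sin^2\theta$ .   (11.22)

For $\mu = 2$ and 3 we find similarly:

$\Gamma^2_{21} = \Gamma^2_{12} = 1/r$ ; $\Gamma^2_{33} = -\sin\theta \cos\theta$ ;
$\Gamma^3_{23} = \Gamma^3_{32} = \cot\theta$ ; $\Gamma^3_{13} = \Gamma^3_{31} = 1/r$ .   (11.23)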
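The connection coefficients (11.20) - (11.23) can be cross-checked against the general formula for the connection of a diagonal metric, using numerical derivatives. This is only a sketch: it assumes a sample choice $A = 1 - 2M/r$ , $B = 1/A$ with $M = 1$ (any positive functions would do), and the helper names are illustrative.

```python
import math

M = 1.0  # illustrative mass parameter (units with G_N = c = 1)


def A(r):
    return 1.0 - 2.0 * M / r


def B(r):
    return 1.0 / A(r)


def metric_diag(x):
    """Diagonal of g_{mu nu} for the line element (11.10) at x = (t, r, theta, phi)."""
    _, r, th, _ = x
    return [-A(r), B(r), r * r, (r * math.sin(th)) ** 2]


def d_metric(x, k, h=1e-5):
    """Central difference of the metric diagonal with respect to x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return [(a - b) / (2.0 * h) for a, b in zip(metric_diag(xp), metric_diag(xm))]


def christoffel(x, lam, mu, nu):
    """Gamma^lam_{mu nu} = (1/2) g^{lam lam} (d_mu g_{lam nu} + d_nu g_{lam mu}
    - d_lam g_{mu nu}), specialised to a diagonal metric."""
    g = metric_diag(x)
    dg = [d_metric(x, k) for k in range(4)]  # dg[k][m] = d_k g_{mm}
    term = 0.0
    if lam == nu:
        term += dg[mu][lam]
    if lam == mu:
        term += dg[nu][lam]
    if mu == nu:
        term -= dg[lam][mu]
    return 0.5 * term / g[lam]


if __name__ == "__main__":
    point = (0.0, 5.0, 1.0, 0.0)
    print(christoffel(point, 0, 0, 1), christoffel(point, 1, 2, 2))
```

At a sample point the numerical values agree with $A'/2A$ , $-r/B$ , $\cot\theta$ and the other entries listed above.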

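Furthermore we have

$\sqrt{-g} = r^2 \sin\theta\, \sqrt{AB}$ ,   (11.24)

and from Eq. (5.18)

$\Gamma^\mu_{\mu\beta} = (\partial_\beta \sqrt{-g}) / \sqrt{-g} = \partial_\beta \log\sqrt{-g}$ .   (11.25)

Therefore

$\Gamma^\mu_{\mu 1} = A'/2A + B'/2B + 2/r$ ,
$\Gamma^\mu_{\mu 2} = \cot\theta$ .   (11.26)

The equation

$R_{\mu\nu} = 0$ ,   (11.27)

now becomes (see (5.27))

$R_{\mu\nu} = -(\log\sqrt{-g})_{,\mu,\nu} + \Gamma^\alpha_{\mu\nu,\alpha} - \Gamma^\beta_{\alpha\mu} \Gamma^\alpha_{\nu\beta} + \Gamma^\alpha_{\mu\nu} (\log\sqrt{-g})_{,\alpha} = 0$ .   (11.28)

Explicitly:

$R_{00} = \Gamma^1_{00,1} - 2\Gamma^1_{00} \Gamma^0_{01} + \Gamma^1_{00} (\log\sqrt{-g})_{,1}$
$= (A'/2B)' - A'^2/2AB + (A'/2B) \left( \frac{A'}{2A} + \frac{B'}{2B} + \frac{2}{r} \right)$
$= \frac{1}{2B} \left( A'' - \frac{A'B'}{2B} - \frac{A'^2}{2A} + \frac{2A'}{r} \right) = 0$ ,   (11.29)

and

$R_{11} = -(\log\sqrt{-g})_{,1,1} + \Gamma^1_{11,1} - \Gamma^0_{10}\Gamma^0_{10} - \Gamma^1_{11}\Gamma^1_{11} - \Gamma^2_{21}\Gamma^2_{21} - \Gamma^3_{31}\Gamma^3_{31} + \Gamma^1_{11} (\log\sqrt{-g})_{,1} = 0$ .   (11.30)

This produces

$\frac{1}{2A} \left( -A'' + \frac{A'B'}{2B} + \frac{A'^2}{2A} + \frac{2AB'}{rB} \right) = 0$ .   (11.31)

Combining (11.29) and (11.31) we obtain

$\frac{2}{rB} (AB)' = 0$ .   (11.32)

Therefore $AB$ = constant. Since at $r \to \infty$ we have $A$ and $B \to 1$ we conclude

$B = 1/A$ .   (11.33)

In the $\theta\theta$ direction one has

$R_{22} = (-\log\sqrt{-g})_{,2,2} + \Gamma^1_{22,1} - 2\Gamma^1_{22}\Gamma^2_{21} - \Gamma^3_{23}\Gamma^3_{23} + \Gamma^1_{22} (\log\sqrt{-g})_{,1} = 0$ .   (11.34)

This becomes

$R_{22} = -\frac{\partial}{\partial\theta} \cot\theta - \left( \frac{r}{B} \right)' + \frac{2}{B} - \cot^2\theta - \frac{r}{B} \left( \frac{2}{r} + \frac{(AB)'}{2AB} \right) = 0$ .   (11.35)

Using (11.32) one obtains

$(r/B)' = 1$ .   (11.36)

Upon integration,

$r/B = r - 2M$ ,   (11.37)

$A = 1 - \frac{2M}{r}$ ; $B = \left( 1 - \frac{2M}{r} \right)^{-1}$ .   (11.38)

Here $2M$ is an integration constant. We found the solution even though we did not yet
use all equations $R_{\mu\nu} = 0$ available to us (and only a linear combination of $R_{00}$ and
$R_{11}$ was used). It is not hard to convince oneself that indeed all equations $R_{\mu\nu} = 0$ are
satisfied, first by substituting (11.38) in (11.29) or (11.31), and then spherical symmetry
with (11.35) will also ensure that $R_{33} = 0$ . The reason why the equations are over-determined
is the Bianchi identity:

$D_\mu G_{\mu\nu} = 0$ .   (11.39)

It will always be obeyed automatically, and implies that if most components of $G_{\mu\nu}$ have
been set equal to zero the remainder will be forced to be zero too.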
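The substitution of (11.38) into (11.29), (11.31) and (11.35) can also be carried out numerically. The sketch below evaluates the three expressions with analytic derivatives of $A$ and $B$ ; the function names and the sample values of $r$ and $M$ are illustrative assumptions.

```python
def schwarzschild(r, M):
    """Return A, A', A'', B, B' for A = 1 - 2M/r and B = 1/A (analytic derivatives)."""
    A = 1.0 - 2.0 * M / r
    A1 = 2.0 * M / r**2
    A2 = -4.0 * M / r**3
    B = 1.0 / A
    B1 = -A1 / A**2
    return A, A1, A2, B, B1


def r00(r, M):
    """Last line of (11.29)."""
    A, A1, A2, B, B1 = schwarzschild(r, M)
    return (A2 - A1 * B1 / (2 * B) - A1**2 / (2 * A) + 2 * A1 / r) / (2 * B)


def r11(r, M):
    """Left-hand side of (11.31)."""
    A, A1, A2, B, B1 = schwarzschild(r, M)
    return (-A2 + A1 * B1 / (2 * B) + A1**2 / (2 * A) + 2 * A * B1 / (r * B)) / (2 * A)


def r22(r, M):
    """(11.35) after the cot(theta) terms cancel:
    1 - (r/B)' - (r/B) (AB)' / (2AB)."""
    A, A1, A2, B, B1 = schwarzschild(r, M)
    d_r_over_B = 1.0 / B - r * B1 / B**2
    return 1.0 - d_r_over_B - (r / B) * (A1 * B + A * B1) / (2 * A * B)


if __name__ == "__main__":
    for radius in (3.0, 10.0, 100.0):
        print(radius, r00(radius, 1.0), r11(radius, 1.0), r22(radius, 1.0))
```

All three expressions vanish up to rounding errors for any $r > 2M$ , in line with the remark that the remaining equations are satisfied automatically.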

The solution we found is the Schwarzschild solution (Schwarzschild, 1916):

$ds^2 = -\left( 1 - \frac{2M}{r} \right) dt^2 + \frac{dr^2}{1 - \frac{2M}{r}} + r^2 \left( d\theta^2 + \sin^2\theta\, d\varphi^2 \right)$ .   (11.40)

In (11.37) we inserted $2M$ as an arbitrary integration constant. We see that far from the
origin,

$-g_{00} = 1 - \frac{2M}{r} \to 1 + 2V(\vec x)$ .   (11.41)

So the gravitational potential $V(\vec x)$ goes to $-M/r$ , as near an object with mass $m$ , if

$M = G_N m$   $(c = 1)$ .   (11.42)

Often we will normalize mass units such that $G_N = 1$ .
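To get a feeling for the size of $M$ in conventional units one may restore the factor $c^2$ , so that $M = G_N m / c^2$ is a length. The sketch below uses rounded values for the constants and for the solar mass; these numbers are assumptions inserted for illustration, not taken from these notes.

```python
G_N = 6.674e-11        # m^3 kg^-1 s^-2 (rounded)
C = 2.998e8            # m/s (rounded)
SOLAR_MASS = 1.989e30  # kg (rounded)


def geometric_mass(m_kg):
    """M = G_N m / c^2, expressed in metres."""
    return G_N * m_kg / C**2


def critical_radius(m_kg):
    """The radius r = 2M at which the coefficients in (11.40) become singular."""
    return 2.0 * geometric_mass(m_kg)


if __name__ == "__main__":
    print(geometric_mass(SOLAR_MASS), critical_radius(SOLAR_MASS))
```

For the Sun, $2M$ comes out at roughly three kilometres, far inside the solar radius.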

The Schwarzschild solution^ is singular at $r = 2M$ , but this can be seen to be an
artifact of our coordinate choice. By studying the geodesics in this region one can discover
different coordinate frames in terms of which no singularity is seen. We here give the result
of such a procedure. Introduce new coordinates ("Kruskal coordinates")

$(t, r, \theta, \varphi) \to (x, y, \theta, \varphi)$ ,   (11.43)

defined by

$\left( \frac{r}{2M} - 1 \right) e^{r/2M} = xy$ ,   (a)
$e^{t/2M} = x/y$ ,   (b)   (11.44)

so that

$\frac{dx}{x} + \frac{dy}{y} = \frac{dr}{2M (1 - 2M/r)}$ ;
$\frac{dx}{x} - \frac{dy}{y} = \frac{dt}{2M}$ .   (11.45)

The Schwarzschild line element is now given by

$ds^2 = 16 M^2 \left( 1 - \frac{2M}{r} \right) \frac{dx\, dy}{xy} + r^2 d\Omega^2$
$= \frac{32 M^3}{r}\, e^{-r/2M}\, dx\, dy + r^2 d\Omega^2$   (11.46)

with

$d\Omega^2 \overset{\rm def}{=} d\theta^2 + \sin^2\theta\, d\varphi^2$ .   (11.47)

^In his original paper, using a slightly different notation, Karl Schwarzschild replaced $\sqrt[3]{r^3 - (2M)^3}$
by a new coordinate $r$ that vanishes at the horizon, since he insisted that what he saw as a singularity
should be at the origin, claiming that only this way the solution becomes "eindeutig" (unique), so that
you can calculate phenomena such as the perihelion movement (see Chapter 12) unambiguously. The
substitution had to be of this form as he was using the equation that only holds if $g = 1$ . He did not
know that one may choose the coordinates freely, nor that the singularity is not a true singularity at all.
This was 1916. The fact that he was the first to get the analytic form justifies the name Schwarzschild
solution.

The singularity at $r = 2M$ disappeared. Remark that Eqs. (11.44) possess two solutions
$(x, y)$ for every $r, t$ . This implies that the completely extended vacuum solution (=
solution with no matter present as a source of gravitational fields) consists of two universes
connected to each other at the center. Apart from a rotation over 45° the relation between
Kruskal coordinates $x, y$ and Schwarzschild coordinates $r, t$ close to the point $r = 2M$
can be seen to be exactly as the one between the flat space coordinates $x^3, x^0$ and the
Rindler coordinates $\xi^3, \tau$ as discussed in chapter 3.
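The defining relations (11.44) and the two forms of the $dx\, dy$ coefficient in (11.46) can be checked with a few lines of code. The sketch below covers only the branch $x > 0$ , $y > 0$ with $r > 2M$ ; the function names and sample values are illustrative assumptions.

```python
import math


def kruskal(t, r, M):
    """Map exterior Schwarzschild (t, r), r > 2M, to Kruskal (x, y) with x, y > 0."""
    if r <= 2.0 * M:
        raise ValueError("this branch only covers r > 2M")
    product = (r / (2.0 * M) - 1.0) * math.exp(r / (2.0 * M))  # xy, Eq. (11.44a)
    ratio = math.exp(t / (2.0 * M))                             # x/y, Eq. (11.44b)
    return math.sqrt(product * ratio), math.sqrt(product / ratio)


def dxdy_coefficient(r, x, y, M):
    """Coefficient of dx dy in the first line of (11.46)."""
    return 16.0 * M**2 * (1.0 - 2.0 * M / r) / (x * y)


def dxdy_coefficient_regular(r, M):
    """Coefficient of dx dy in the second line of (11.46); finite at r = 2M."""
    return 32.0 * M**3 / r * math.exp(-r / (2.0 * M))


if __name__ == "__main__":
    M, t, r = 1.5, 0.7, 5.0
    x, y = kruskal(t, r, M)
    print(x, y, dxdy_coefficient(r, x, y, M), dxdy_coefficient_regular(r, M))
```

The two coefficients agree wherever both are defined, and the second form stays finite at $r = 2M$ , which is the disappearance of the singularity noted above.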

The points $r = 0$ however remain singular in the Schwarzschild solution. The regular
region of the "universe" has the line

$xy = -1$   (11.48)

as its boundary. The region $x > 0$ , $y > 0$ will be identified with the "ordinary world"
extending far from our source. The second universe, the region of space-time with $x < 0$
and $y < 0$ , has the same metric as the first one. It is connected to the first one by
something one could call a "wormhole". The physical significance of this extended region
however is very limited, because:

1) "ordinary" stars and planets contain matter ( $T_{\mu\nu} \neq 0$ ) within a certain radius
$r > 2M$ , so that for them the validity of the Schwarzschild solution stops there.

2) Even if further gravitational contraction produces a "black hole" one finds that
there will still be imploding matter around ( $T_{\mu\nu} \neq 0$ ) that will cut off the second
"universe" completely from the first.

3) even if there were no imploding matter present the second universe could only be
reached by moving faster than the local speed of light.

Exercise: Check these statements by drawing an xy diagram and indicating where the 
two universes are and how matter and space travellers can move about. Show that also 
signals cannot be exchanged between the two universes. 

Figure 4: Penrose diagrams. (a) The Penrose diagram for the Schwarzschild
metric. The shaded region does not exist in black holes with a collapse in
their past; (b) A black hole after collapse. The shaded region is where the
collapsing matter is. Lightrays moving radially ( $\theta = \varphi = 0$ ) here always move
at 45°.

If one draws an "imploding star" in the x y diagram one notices that the future 
horizon may be physically relevant. One then has the so-called black hole solution. 

We define the Penrose coordinates, $\tilde x$ and $\tilde y$ , by

$x = \tan(\tfrac12 \pi \tilde x)$ ; $y = \tan(\tfrac12 \pi \tilde y)$ .   (11.49)

In these coordinates, we see that

i. the lightcone is again at 45° ;
ii. the allowed values for $\tilde x$ and $\tilde y$ are:

$|\tilde x| < 1$ , $|\tilde y| < 1$ , $|\tilde x - \tilde y| < 1$ .   (11.50)

This region is sketched in Fig. 4a. We call this a Penrose diagram. The shaded part
is not accessible if the black hole has a collapsing object in its distant past. Then the
appropriate Penrose diagram is the one of Fig. 4b.
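The allowed region (11.50) is the image of the regular Kruskal region $xy > -1$ of Eq. (11.48) under the map (11.49). A quick way to convince oneself is to sample points at random; the sketch below does this, with illustrative function names and an arbitrary sampling range.

```python
import math
import random


def penrose(x, y):
    """Invert (11.49): x = tan(pi*xt/2), y = tan(pi*yt/2)."""
    return 2.0 / math.pi * math.atan(x), 2.0 / math.pi * math.atan(y)


def in_regular_region(x, y):
    """Kruskal points on the regular side of the line xy = -1."""
    return x * y > -1.0


def in_penrose_region(xt, yt):
    """The three conditions of (11.50)."""
    return abs(xt) < 1.0 and abs(yt) < 1.0 and abs(xt - yt) < 1.0


def count_mismatches(samples=2000, seed=0, span=50.0):
    """Number of random Kruskal points for which the two tests disagree."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(samples):
        x, y = rng.uniform(-span, span), rng.uniform(-span, span)
        if abs(x * y + 1.0) < 1e-6:
            continue  # skip points numerically on the boundary
        if in_regular_region(x, y) != in_penrose_region(*penrose(x, y)):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print(count_mismatches())
```

The agreement follows from $\cos(a - b) = \cos a \cos b\, (1 + \tan a \tan b)$ with $a = \tfrac12\pi\tilde x$ and $b = \tfrac12\pi\tilde y$ .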

12. Mercury and light rays in the Schwarzschild metric. 

Historically the orbital motion of the planet Mercury in the Sun's gravitational field has
played an important role as a test for the validity of General Relativity (although Einstein
would have launched his theory also if such tests had not been available).

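To describe this motion we have the variation equation (11.16) for the functions $t(\tau)$ ,
$r(\tau)$ , $\theta(\tau)$ and $\varphi(\tau)$ , where $\tau$ parametrizes the space-time trajectory. Writing
$\dot r = dr/d\tau$ , etc. we have

$\delta \int \left\{ -\left( 1 - \frac{2M}{r} \right) \dot t^2 + \left( 1 - \frac{2M}{r} \right)^{-1} \dot r^2 + r^2 \left( \dot\theta^2 + \sin^2\theta\, \dot\varphi^2 \right) \right\} d\tau = 0$ ,   (12.1)

in which we put $ds^2/d\tau^2 = -1$ because the trajectory is timelike. The equations of
motion follow as Lagrange equations:

$\frac{d}{d\tau} (r^2 \dot\theta) = r^2 \sin\theta \cos\theta\, \dot\varphi^2$ ;   (12.2)

$\frac{d}{d\tau} (r^2 \sin^2\theta\, \dot\varphi) = 0$ ;   (12.3)

$\frac{d}{d\tau} \left[ \left( 1 - \frac{2M}{r} \right) \dot t \right] = 0$ .   (12.4)

We did not yet write the equation for $\ddot r$ . Instead of that it is more convenient to divide
Eq. (11.40) by $-ds^2$ :

$1 = \left( 1 - \frac{2M}{r} \right) \dot t^2 - \left( 1 - \frac{2M}{r} \right)^{-1} \dot r^2 - r^2 \left( \dot\theta^2 + \sin^2\theta\, \dot\varphi^2 \right)$ .   (12.5)

Now even in the completely relativistic metric of the Schwarzschild solution all orbits
will be in flat planes through the origin, since spherical symmetry allows us to choose as
our initial condition

$\theta = \pi/2$ ; $\dot\theta = 0$ ,   (12.6)

and then this will remain valid throughout because of Eq. (12.2). Eqs. (12.3) and (12.4)
tell us:

$r^2 \dot\varphi = J$ = constant.   (12.7)

$\left( 1 - \frac{2M}{r} \right) \dot t = E$ = constant.   (12.8)

Eq. (12.5) then becomes

$1 = \left( 1 - \frac{2M}{r} \right)^{-1} E^2 - \left( 1 - \frac{2M}{r} \right)^{-1} \dot r^2 - J^2/r^2$ .   (12.9)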

Just as in the Kepler problem it is convenient to treat $r$ as a function of $\varphi$ . $t$ has already
been eliminated. We now also eliminate $\tau$ . Let us, for the remainder of this chapter,
write differentiation with respect to $\varphi$ with an accent:

$r' = \dot r / \dot\varphi$ .   (12.10)

From (12.7) and (12.9) one derives:

$1 - 2M/r = E^2 - J^2 r'^2 / r^4 - J^2 \left( 1 - \frac{2M}{r} \right) / r^2$ .   (12.11)

Notice that we can interpret $E$ as energy and $J$ as angular momentum. Write, just as
in the Kepler problem:

$r = 1/u$ , $r' = -u'/u^2$ ;   (12.12)
$1 - 2Mu = E^2 - J^2 u'^2 - J^2 u^2 (1 - 2Mu)$ .   (12.13)

From this we find

$\frac{du}{d\varphi} = \sqrt{ (2Mu - 1) \left( u^2 + \frac{1}{J^2} \right) + E^2/J^2 }$ .   (12.14)

The formal solution is

$\varphi - \varphi_0 = \int_{u_0}^{u} du \left( \frac{E^2 - 1}{J^2} + \frac{2Mu}{J^2} - u^2 + 2Mu^3 \right)^{-1/2}$ .   (12.15)

Exercise: show that in the Newtonian limit the $u^3$ term can be neglected and then
compute the integral.

The relativistic perihelion shift will be the extent to which the complete integral
from $u_{\rm min}$ to $u_{\rm max}$ (two roots of the third degree polynomial), multiplied by two, differs
from $2\pi$ .

A neat way to obtain the perihelion shift is by differentiating Eq. (12.13) once more
with respect to $\varphi$ :

$\frac{2M}{J^2}\, u' - 2u'u'' - 2uu' + 6Mu^2 u' = 0$ .   (12.16)

Now of course

$u' = 0$   (12.17)

can be a solution (the circular orbit). If $u' \neq 0$ we divide by $u'$ :

$u'' + u = \frac{M}{J^2} + 3Mu^2$ .   (12.18)

Figure 5: Perihelion shift of a planet in its orbit around a central star.

The last term is the relativistic correction. Suppose it is small. Then we have a well-known
problem in mathematical physics:

$u'' + u = A + \varepsilon u^2$ .   (12.19)

One could expand $u$ as a perturbative expansion in powers of $\varepsilon$ , but we wish an expansion
that converges for all values of the independent variable $\varphi$ . Note that Eq. (12.13) allows
for every value of $u$ only two possible values for $u'$ so that the solution has to be periodic
in $\varphi$ . The unperturbed period is $2\pi$ . But with the $u^2$ term present we do not know the
period exactly. Assume that it can be written as

$2\pi \left( 1 + \alpha\varepsilon + O(\varepsilon^2) \right)$ .   (12.20)

Write

$u = A + B \cos[(1 - \alpha\varepsilon)\varphi] + \varepsilon u_1(\varphi) + O(\varepsilon^2)$ ,   (12.21)

$u'' = -B (1 - 2\alpha\varepsilon) \cos[(1 - \alpha\varepsilon)\varphi] + \varepsilon u_1''(\varphi) + O(\varepsilon^2)$ ;   (12.22)

$\varepsilon u^2 = \varepsilon \left( A^2 + 2AB \cos[(1 - \alpha\varepsilon)\varphi] + B^2 \cos^2[(1 - \alpha\varepsilon)\varphi] \right) + O(\varepsilon^2)$ .   (12.23)

We find for $u_1$ :

$u_1'' + u_1 = (-2\alpha B + 2AB) \cos\varphi + B^2 \cos^2\varphi + A^2$ ,   (12.24)

where now the $O(\varepsilon)$ terms were omitted since they do not play any further role. This is
just the equation for a forced pendulum. If we do not want the pendulum to oscillate
with an ever increasing amplitude ( $u_1$ must stay small for all values of $\varphi$ ) then the
external force is not allowed to have a Fourier component with the same periodicity as
the pendulum itself. Now the term with $\cos\varphi$ in (12.24) is exactly in resonance^10 unless
we choose $\alpha = A$ . Then one has

$u_1'' + u_1 = \tfrac12 B^2 (\cos 2\varphi + 1) + A^2$ ,   (12.25)

$u_1 = \tfrac12 B^2 \left( 1 - \tfrac13 \cos 2\varphi \right) + A^2$ ,   (12.26)

which is exactly periodic. Apparently one has to choose the period to be $2\pi (1 + A\varepsilon)$ if
the orbit is to be periodic in $\varphi$ . We find that after every passage through the perihelion
its position is shifted by

$\delta\varphi = 2\pi A \varepsilon = 2\pi\, \frac{3M^2}{J^2}$ ,   (12.27)

(plus higher order corrections) in the direction of the planet itself (see Fig. 5).
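The perturbative result can be compared with a direct numerical integration of Eq. (12.19). The sketch below uses a classical fourth-order Runge-Kutta step and locates the next perihelion as the next maximum of $u$ ; the step size and the sample values of $A$ , $B$ and $\varepsilon$ are arbitrary illustrative choices.

```python
import math


def perihelion_advance(A, eps, B, h=1e-3):
    """Integrate u'' + u = A + eps*u^2 starting at a perihelion (u' = 0, u maximal)
    and return the angle to the next perihelion minus 2*pi."""

    def accel(u):
        return A + eps * u * u - u

    def rk4_step(u, v):
        k1u, k1v = v, accel(u)
        k2u, k2v = v + 0.5 * h * k1v, accel(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, accel(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, accel(u + h * k3u)
        return (u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0,
                v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)

    phi, u, v = 0.0, A + B, 0.0
    # Step past the starting perihelion so that u' is strictly negative.
    u, v = rk4_step(u, v)
    phi += h
    while True:
        u_next, v_next = rk4_step(u, v)
        # The next perihelion is where u' crosses zero from above.
        if v > 0.0 and v_next <= 0.0:
            phi_peri = phi + h * v / (v - v_next)  # linear interpolation
            return phi_peri - 2.0 * math.pi
        u, v, phi = u_next, v_next, phi + h


if __name__ == "__main__":
    A, eps, B = 1.0, 1e-3, 0.2
    print(perihelion_advance(A, eps, B), 2.0 * math.pi * A * eps)
```

For small $\varepsilon$ the numerically found advance agrees with $2\pi A \varepsilon$ up to the higher order corrections mentioned in the text.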

Now wc wish to compute the trajectory of a light ray. It is also a geodesic. Now 
however ds = . In this limit we still have (12.1) - (12.4), but now we set 

ds/dr = , 

so that Eq. (12.5) becomes 

2M\ .o A 2M\-i 

^ (l-^Y-[l-^Y f^-r\e^ + sin^e^^). (12.28) 

Since now the parameter r is determined up to an arbitrary multiplicative constant, only 
the ratio J/E will be relevant. Call this j . Then Eq. (12.15) becomes 

^ ip^^+ duij-"^ - + 2Mu^)~^ . (12.29) 

J Uo 

As the left hand side of Eq. (12.13) must now be replaced by zero, Eq. (12.18) becomes 

u" + u ^ 3Mu^ . (12.30) 

An expansion in powers of M is now permitted (because the angle ip is now confined 
within an interval a little larger than tt ) : 

u — Acoscp + v, (12.31) 


v" + v = 3M^^cosV = ;^^^^(l + cos2(^) , (12.32) 

V = ^MA2(1 - |cos2(^) = MA2(2 -cos» . (12.33) 

^''Note here and in the following that the solution of an equation of the form v!' -\-u = cos Wji^ 

is u = ^ . Ai cos ijJi^p /{I — Lof) + Ci cos + C2 sin ^p. This is singular when w — )• 1 . 


So we have for small $M$

$$\frac1r = u = A\cos\varphi + MA^2\left(2 - \cos^2\varphi\right) . \tag{12.34}$$

The angles $\varphi$ at which the ray enters and exits are determined by

$$1/r = 0 , \qquad \cos\varphi = \frac{1 \pm \sqrt{1 + 8M^2A^2}}{2MA} . \tag{12.35}$$

Since $M$ is a small expansion parameter and $|\cos\varphi| \le 1$ we must choose the minus sign:

$$\cos\varphi \approx -2MA = -2M/r_0 , \tag{12.36}$$

$$\varphi \approx \pm\left(\frac{\pi}{2} + 2M/r_0\right) , \tag{12.37}$$

where $r_0$ is the smallest distance of the light ray to the central source. In total the angle
of deflection between in- and outgoing ray is in lowest order:

$$\Delta = 4M/r_0 . \tag{12.38}$$

In conventional units this equation reads

$$\Delta = \frac{4Gm_\odot}{r_0\,c^2} . \tag{12.39}$$

Here $m_\odot$ is the mass of the central star.
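
As an illustration of the size of this effect, the sketch below evaluates the lowest-order deflection $4Gm/(r_0c^2)$ for a light ray grazing the Sun. It is not part of the original notes; the function name and the rounded values used for Newton's constant, the solar mass and the solar radius are assumptions made for this example.

```python
import math

# Rounded SI values, assumed for this illustration only.
G_NEWTON = 6.674e-11      # m^3 kg^-1 s^-2
LIGHT_SPEED = 2.998e8     # m / s
SOLAR_MASS = 1.989e30     # kg
SOLAR_RADIUS = 6.957e8    # m


def deflection_arcsec(mass_kg, closest_approach_m):
    """Lowest-order light deflection 4 G m / (r0 c^2), converted to arcseconds."""
    angle_rad = 4.0 * G_NEWTON * mass_kg / (closest_approach_m * LIGHT_SPEED ** 2)
    return math.degrees(angle_rad) * 3600.0


if __name__ == "__main__":
    print("deflection at the solar limb [arcsec]:",
          deflection_arcsec(SOLAR_MASS, SOLAR_RADIUS))
```

With these inputs the result is close to the 1.75 arcseconds usually quoted for starlight passing the edge of the Sun.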

Exercise: show that this is twice what one would expect if a light ray could be 
regarded as a non-relativistic particle in a hyperbolic orbit around the star. 

Exercise: show that expression (12.27) in ordinary units reads as

$$\delta\varphi = \frac{6\pi Gm_\odot}{a\left(1 - e^2\right)c^2} ,$$

where $a$ is the semi-major axis of the orbit, $e$ its eccentricity and $c$ the velocity of light.
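
For orientation, the sketch below evaluates the perihelion shift in ordinary units for Mercury, assuming the standard form $6\pi Gm_\odot/(a(1-e^2)c^2)$ per revolution with $a$ the semi-major axis. It is an illustration rather than a solution of the exercise; the orbital data and constants are rounded values assumed here, and the function name is hypothetical.

```python
import math

# Rounded SI values, assumed for this illustration only.
G_NEWTON = 6.674e-11          # m^3 kg^-1 s^-2
LIGHT_SPEED = 2.998e8         # m / s
SOLAR_MASS = 1.989e30         # kg
MERCURY_SEMI_MAJOR = 5.791e10 # m
MERCURY_ECCENTRICITY = 0.2056
MERCURY_PERIOD_DAYS = 87.969
DAYS_PER_CENTURY = 36525.0


def shift_per_orbit_rad(mass_kg, semi_major_m, eccentricity):
    """Perihelion shift per revolution, 6 pi G m / (a (1 - e^2) c^2), in radians."""
    return (6.0 * math.pi * G_NEWTON * mass_kg
            / (semi_major_m * (1.0 - eccentricity ** 2) * LIGHT_SPEED ** 2))


def mercury_shift_arcsec_per_century():
    """Accumulated shift for Mercury over one century, in arcseconds."""
    orbits = DAYS_PER_CENTURY / MERCURY_PERIOD_DAYS
    total_rad = orbits * shift_per_orbit_rad(
        SOLAR_MASS, MERCURY_SEMI_MAJOR, MERCURY_ECCENTRICITY)
    return math.degrees(total_rad) * 3600.0


if __name__ == "__main__":
    print("Mercury perihelion shift [arcsec/century]:",
          mercury_shift_arcsec_per_century())
```

The number that comes out is close to the famous 43 arcseconds per century.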

13. Generalizations of the Schwarzschild solution. 

a). The Reissner-Nordstrom solution. 

Spherical symmetry can still be used as a starting point for the construction of a
solution of the combined Einstein-Maxwell equations for the fields surrounding a "planet"
with electric charge $Q$ and mass $m$. Just as Eq. (11.10) we choose

$$ds^2 = -A\,dt^2 + B\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) , \tag{13.1}$$

but now also a static electric field:

$$E_r = E(r) ; \qquad E_\theta = E_\varphi = 0 ; \qquad \vec B = 0 . \tag{13.2}$$

This implies that $F_{01} = -F_{10} = E(r)$ and all other components of $F_{\mu\nu}$ are zero. Let us
assume that the source $J^\mu$ of this field is inside the planet and we are only interested in
the solution outside the planet. So there we have

$$J^\mu = 0 . \tag{13.3}$$

If we move the indices upstairs we get

$$F^{10} = E(r)/AB , \tag{13.4}$$

and using

$$\sqrt{-g} = \sqrt{AB}\,r^2\sin\theta , \tag{13.5}$$

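we find that according to (10.13)

$$\partial_r\left(\frac{E(r)\,r^2}{\sqrt{AB}}\right) = 0 . \tag{13.6}$$

Thus the inhomogeneous Maxwell law tells us that

$$E(r) = \frac{Q\sqrt{AB}}{4\pi r^2} , \tag{13.7}$$

where $Q$ is an integration constant, to be identified with electric charge since at $r \rightarrow \infty$
both $A$ and $B$ tend to 1.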

The homogeneous Maxwell law (10.11) is automatically obeyed because there is a field
$A_0$ (potential field) with

$$E_r = -\partial_rA_0 . \tag{13.8}$$

The field (13.7) contributes to $T_{\mu\nu}$:

$$T_{00} = -E^2/2B = -AQ^2/32\pi^2r^4 ; \tag{13.9}$$

$$T_{11} = E^2/2A = BQ^2/32\pi^2r^4 ; \tag{13.10}$$

$$T_{22} = -E^2r^2/2AB = -Q^2/32\pi^2r^2 , \tag{13.11}$$

$$T_{33} = T_{22}\sin^2\theta = -Q^2\sin^2\theta/32\pi^2r^2 . \tag{13.12}$$

We find

$$T^\mu_{\ \mu} = g^{\mu\nu}T_{\mu\nu} = 0 ; \qquad R = 0 , \tag{13.13}$$

a general property of the free Maxwell field. In this case we have ($G_N = 1$)

$$R_{\mu\nu} = -8\pi T_{\mu\nu} . \tag{13.14}$$


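Herewith the equations (11.29) - (11.31) become

$$A'' - \frac{A'B'}{2B} - \frac{A'^2}{2A} + \frac{2A'}{r} = ABQ^2/2\pi r^4 ,$$

$$-A'' + \frac{A'B'}{2B} + \frac{A'^2}{2A} + \frac{2AB'}{rB} = -ABQ^2/2\pi r^4 . \tag{13.15}$$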
We find that Eq. (11.32) still holds so that here also

$$B = 1/A . \tag{13.16}$$

Eq. (11.36) is now replaced by

$$(r/B)' - 1 = -Q^2/4\pi r^2 . \tag{13.17}$$

This gives upon integration

$$r/B = r - 2M + Q^2/4\pi r . \tag{13.18}$$

So now we have instead of Eq. (11.38),

$$A = 1 - \frac{2M}{r} + \frac{Q^2}{4\pi r^2} ; \qquad B = 1/A . \tag{13.19}$$

This is the Reissner-Nordström solution (1916, 1918).

If we choose $Q^2/4\pi < M^2$ there are two "horizons", the roots of the equation $A = 0$:

$$r = r_\pm = M \pm \sqrt{M^2 - Q^2/4\pi} . \tag{13.20}$$
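
The horizon condition can be made concrete with a few lines of code. The sketch below is an illustration added here, not part of the original notes: it assumes the metric function $A(r) = 1 - 2M/r + Q^2/4\pi r^2$ in units with $G_N = c = 1$, computes the two roots $r_\pm$, and returns nothing when $Q^2/4\pi > M^2$ so that no horizon exists. The function names are hypothetical.

```python
import math


def rn_A(r, M, Q):
    """Metric function A(r) = 1 - 2M/r + Q^2 / (4 pi r^2), units with G_N = c = 1."""
    return 1.0 - 2.0 * M / r + Q ** 2 / (4.0 * math.pi * r ** 2)


def rn_horizons(M, Q):
    """Return (r_minus, r_plus), the roots of A(r) = 0, or None if there are none."""
    discriminant = M ** 2 - Q ** 2 / (4.0 * math.pi)
    if discriminant < 0.0:
        return None  # charge too large: A(r) has no real zero
    root = math.sqrt(discriminant)
    return M - root, M + root


if __name__ == "__main__":
    M, Q = 1.0, 2.0
    r_minus, r_plus = rn_horizons(M, Q)
    print("r_- =", r_minus, " A(r_-) =", rn_A(r_minus, M, Q))
    print("r_+ =", r_plus, " A(r_+) =", rn_A(r_plus, M, Q))
    print("uncharged case, outer horizon:", rn_horizons(M, 0.0)[1])
```

For $Q = 0$ the outer root reduces to the Schwarzschild value $2M$, while the inner root moves to $r = 0$.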

Again these singularities are artifacts of our coordinate choice and can be removed by 
generalizations of the Kruskal coordinates. Now one finds that there would be an infinite 
sequence of ghost universes connected to ours, if the horizons hadn't been blocked by 
imploding matter. See Hawking and Ellis for a much more detailed description. 

b) The Kerr solution 

A fast rotating planet has a gravitational field that is no longer spherically symmetric 
but only cylindrically. We here only give the solution: 

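$$ds^2 = -dt^2 + \left(r^2 + a^2\right)\sin^2\theta\,d\varphi^2 + \frac{2Mr\left(dt - a\sin^2\theta\,d\varphi\right)^2}{r^2 + a^2\cos^2\theta} \tag{13.21}$$

$$\qquad + \left(r^2 + a^2\cos^2\theta\right)\left(d\theta^2 + \frac{dr^2}{r^2 - 2Mr + a^2}\right) . \tag{13.22}$$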

This solution was found by Kerr in 1963. To prove that this is indeed a solution of 
Einstein's equations requires patience but is not difficult. For a derivation using more ele- 
mentary principles more powerful techniques and machinery of mathematical physics are 
needed. The free parameter a in this solution can be identified with angular momentum. 


c) The Newman et al solution 

For sake of completeness we also mention that rotating planets can also be electrically 
charged. The solution for that case was found by Newman et al in 1965. The metric is: 

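$$ds^2 = -\frac{\Delta}{Y}\left(dt - a\sin^2\theta\,d\varphi\right)^2 + \frac{\sin^2\theta}{Y}\left(a\,dt - (r^2 + a^2)\,d\varphi\right)^2 + \frac{Y}{\Delta}\,dr^2 + Y\,d\theta^2 , \tag{13.23}$$

$$Y = r^2 + a^2\cos^2\theta , \tag{13.24}$$

$$\Delta = r^2 - 2Mr + Q^2/4\pi + a^2 . \tag{13.25}$$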

The vector potential is 

$$A_0 = \frac{Qr}{4\pi Y} ; \qquad A_3 = -\frac{Qra\sin^2\theta}{4\pi Y} . \tag{13.26}$$

Exercise: show that when $Q = 0$ Eqs. (13.22) and (13.23) coincide.

Exercise: find the non-rotating magnetic monopole solution by postulating a radial
magnetic field.

Exercise for the advanced student: describe geodesics in the Kerr solution.

14. The Robertson-Walker metric.

General relativity plays an important role in cosmology. The simplest theory is that at a
certain moment "$t = 0$", the universe started off from a singularity, after which it began
to expand. We assume maximal symmetry by taking as our metric

$$ds^2 = -dt^2 + a^2(t)\,d\omega^2 . \tag{14.1}$$

Here $d\omega^2$ stands short for some fully isotropic 3-dimensional space, and $a(t)$ describes
the (increasing) distance between two neighboring galaxies in space. Although we do
embrace here the Copernican principle that all points in space look the same, we abandon
the idea that there should be invariance with respect to time translations and also Lorentz
invariance for this metric - the galaxies contain clocks that were set to zero at $t = 0$ and
each provides for a local inertial frame.

First, we concentrate on the three-dimensional space described by $d\omega^2$. Here, we take
polar coordinates $\varrho, \theta, \varphi$:

$$d\omega^2 = B(\varrho)\,d\varrho^2 + \varrho^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) , \tag{14.2}$$


then in this three dimensional space the Ricci tensor is (by using the same techniques as 
in chapter 11) 

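$$R_{11} = B'(\varrho)/\varrho B(\varrho) , \tag{14.3}$$

$$R_{22} = 1 - \frac1B + \frac{\varrho B'}{2B^2} . \tag{14.4}$$

In an isotropic (3-dimensional) space, one must have

$$R_{ij} = \lambda g_{ij} ; \qquad R_{11} = \lambda B , \quad R_{22} = \lambda\varrho^2 , \tag{14.5}$$

for some constant $\lambda$, and therefore

$$B'/B = \lambda B\varrho , \tag{14.6}$$

$$1 - \frac1B + \frac{\varrho B'}{2B^2} = \lambda\varrho^2 . \tag{14.7}$$

Together they give

$$B = \frac{1}{1 - \tfrac12\lambda\varrho^2} , \tag{14.8}$$

which indeed also obeys (14.6) and (14.7) separately.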

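Exercise: show that with $\varrho = \sqrt{2/\lambda}\,\sin\psi$, this gives the metric of the 3-sphere, in
terms of its three angular coordinates $\psi, \theta, \varphi$. Indeed, the metric for an $N$-sphere
can be written as

$$d\omega_N^2 = d\psi_N^2 + \sin^2\psi_N\,d\omega_{N-1}^2 , \qquad R_{ij} = \lambda_N g_{ij} ; \quad \lambda_N = N - 1 . \tag{14.9}$$

Back to the 3-sphere, often one chooses a new coordinate $u$:

$$\varrho \stackrel{\rm def}{=} \frac{\sqrt{2k/\lambda}\;u}{1 + (k/4)u^2} ; \qquad u \stackrel{\rm def}{=} \frac{(2/\sqrt{k})\sin\psi}{1 + \cos\psi} . \tag{14.10}$$

One observes that

$$d\varrho = \sqrt{\frac{2k}{\lambda}}\;\frac{1 - \frac k4u^2}{\left(1 + \frac k4u^2\right)^2}\,du \quad\hbox{and}\quad B = \left(\frac{1 + \frac k4u^2}{1 - \frac k4u^2}\right)^2 , \tag{14.11}$$

so that

$$d\omega^2 = \frac{2k}{\lambda}\;\frac{du^2 + u^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right)}{\left(1 + (k/4)u^2\right)^2} . \tag{14.12}$$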


The parameter $k$ is arbitrary except for its sign, which must be the same as the sign of
$\lambda$. The factor in front of Eq. (14.12) may be absorbed in $a(t)$. Therefore we write

$$ds^2 = -dt^2 + a^2(t)\,\frac{d\vec x^{\,2}}{\left(1 + \tfrac14k\,\vec x^{\,2}\right)^2} . \tag{14.13}$$

If $k = 1$ the spacelike piece is a sphere, if $k = 0$ it is flat, if $k = -1$ the curvature is
negative and space is unbounded (in spite of the fact that then $|\vec x|$ is bounded, which is
an artifact of our coordinate choice).

After some elementary calculations,

$$R^0_{\ 0} = \frac{3\ddot a}{a} , \tag{14.14}$$

$$R^1_{\ 1} = R^2_{\ 2} = R^3_{\ 3} = \frac{\ddot a}{a} + \frac{2}{a^2}\left(\dot a^2 + k\right) , \tag{14.15}$$

$$R = R^\mu_{\ \mu} = \frac{6}{a^2}\left(a\ddot a + \dot a^2 + k\right) . \tag{14.16}$$


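The tensor $G_{\mu\nu}$ becomes (taking for simplicity $\vec x = 0$):

$$G_{00} = \frac{3}{a^2}\left(\dot a^2 + k\right) = 8\pi G_N\varrho + \Lambda , \tag{14.17}$$

$$G_{11} = G_{22} = G_{33} = -2a\ddot a - \dot a^2 - k = a^2\left(8\pi G_Np - \Lambda\right) . \tag{14.18}$$

Here, $\varrho = T_{44} = T_{00}/g_{00}$ is the energy density and $p$ is the pressure: $T_{ij} = -p\,g_{ij}$. We
define the Hubble parameter $H(t)$ as

$$H = \frac{\dot a}{a} , \tag{14.19}$$

so that Eq. (14.17), also called Friedmann's equation, can be written as:

$$H^2 = \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G_N}{3}\varrho + \frac{\Lambda}{3} - \frac{k}{a^2} . \tag{14.20}$$

Eliminating $\dot a$ out of Eqs. (14.17) and (14.18) we get

$$\frac{\ddot a}{a} = -\frac{4\pi G_N}{3}\left(\varrho + 3p\right) + \frac{\Lambda}{3} . \tag{14.21}$$

This is a simple equation of motion for our parameter $a(t)$. From Eq. (14.20) we derive
a function $V(a)$, acting as a potential, in which a "particle" $a(t)$ is moving with zero
total energy:

$$-\tfrac12\dot a^2 = V(a) = \tfrac12k - a^2\left(\frac{\Lambda}{6} + \frac{4\pi G_N}{3}\varrho\right) . \tag{14.22}$$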


Figure 6: The potential (14.22) for the cases a) $k = 0$, $\Lambda < 0$, b) $k = -1$, $\Lambda = 0$ and
c) $k = 0$, $\Lambda > 0$. In the case (a), there is a turning point at $a = a_{\max}$.

Assuming internal consistency of these equations, by multiplying Eq. (14.20) with $a^2$,
differentiating that in time, and comparing this with Eq. (14.21), we find the continuity
equation (Bianchi equation) for $p$ and $\varrho$:

$$\dot\varrho = -3H\left(\varrho + p\right) . \tag{14.23}$$

Let us now make some assumption about matter in the universe, and its equations of
state, i.e. the relation between the energy density $\varrho$ and the pressure $p$. The simplest
case is to assume that there is no pressure (a "dust-filled universe", $p = 0$). In this case,
the energy density $\varrho$ is just the matter density, which is inversely proportional to the
volume. We see that then, in agreement with Eq. (14.23):

$$\varrho = \frac{\varrho_0}{a^3} , \qquad \hbox{(dust)} \tag{14.24}$$

where $\varrho_0$ is constant.

We see that, as a increases, in Eq. (14.22), first the matter term dominates, then the 
space-curvature term (with k), and finally the cosmological constant dominates. 

Using supernova data and observations on the fluctuations of the cosmological
background radiation, one was able to measure the function $a(t)$. From this, it was derived
that $k \approx 0$, and the total mass density of the universe, $\varrho_{\rm total} = \varrho + \Lambda/8\pi G_N$, consists of

    74%    "dark energy" (cosm. const.), $\Lambda$ ,
    22%    "dark matter" ,
    3.6%   interstellar and intergalactic gas ,
    0.4%   stars, planets, etc.

It is instructive to consider the solutions to the equations (14.21) and (14.22), when
there are other relations between the pressure and the density. For instance in a
radiation-filled universe, we have $p = \varrho/3$, and since we may assume that the radiation is thermal,
and the number of photons is conserved, we may conclude that $\varrho = \varrho_0/a^4$ instead of
Eq. (14.24). Indeed, this agrees with Eqs. (14.21) - (14.23).
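
The scaling laws for dust and radiation can be checked directly against the continuity equation. The sketch below is an added illustration under stated assumptions: it writes the equation of state as $p = w\varrho$ (the symbol $w$ is introduced here and is not used in the notes), uses $H = \dot a/a$ to rewrite (14.23) as $d\varrho/d\ln a = -3(1 + w)\varrho$, and integrates this with a simple midpoint rule. The function name is hypothetical.

```python
import math


def density_after_expansion(w, a_final, rho_initial=1.0, steps=10000):
    """Integrate d(rho)/d(ln a) = -3 (1 + w) rho from a = 1 to a = a_final.

    Uses the explicit midpoint rule in the variable ln a; p = w * rho is the
    assumed equation of state (w = 0: dust, w = 1/3: radiation).
    """
    d_ln_a = math.log(a_final) / steps
    rho = rho_initial
    for _ in range(steps):
        slope_start = -3.0 * (1.0 + w) * rho
        rho_mid = rho + 0.5 * d_ln_a * slope_start
        rho += d_ln_a * (-3.0 * (1.0 + w) * rho_mid)
    return rho


if __name__ == "__main__":
    a = 2.0
    print("dust      :", density_after_expansion(0.0, a), " vs a^-3 =", a ** -3)
    print("radiation :", density_after_expansion(1.0 / 3.0, a), " vs a^-4 =", a ** -4)
    print("p = -rho  :", density_after_expansion(-1.0, a), " (stays constant)")
```

Doubling the scale factor dilutes dust by a factor 8 and radiation by a factor 16, while the case $p = -\varrho$ leaves the density unchanged, as expected for a cosmological constant.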

The case $\Lambda = 0$, $k \neq 0$, has now become obsolete, presumably. But the solutions to
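the Friedmann Equation (14.20) are well-known mathematical curves, and as such still
interesting. Regard this now as an exercise: we have

$$a\dot a^2 + ka = D ; \tag{14.26}$$

$$\dot a^2 = D/a - k , \tag{14.27}$$

and from (14.18):

$$\ddot a = -D/2a^2 . \tag{14.28}$$

Write Eq. (14.27) as

$$\frac{dt}{da} = \sqrt{\frac{a}{D - ka}} , \tag{14.29}$$

then we try

$$a = \frac{D}{k}\sin^2\varphi , \tag{14.30}$$

$$\frac{dt}{d\varphi} = \frac{dt}{da}\,\frac{da}{d\varphi} = \frac{2D}{k\sqrt{k}}\,\frac{\sin\varphi}{\cos\varphi}\,\sin\varphi\cos\varphi , \tag{14.31}$$

$$t(\varphi) = \frac{D}{k\sqrt{k}}\left(\varphi - \tfrac12\sin 2\varphi\right) , \tag{14.32}$$

$$a(\varphi) = \frac{D}{2k}\left(1 - \cos 2\varphi\right) . \tag{14.33}$$

These are the equations for a cycloid. Since $D > 0$, $t > 0$ and $a > 0$ we demand

$$k > 0 \ \rightarrow\ \varphi \hbox{ real} ;$$

$$k < 0 \ \rightarrow\ \varphi \hbox{ imaginary} ;$$

$$k = 0 \ \rightarrow\ \varphi \hbox{ infinitesimal} . \tag{14.34}$$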

See Fig. 7. 

All solutions start with a "big bang" at $t = 0$. Only the cycloid in the $k = 1$ case also
shows a "big crunch" in the end. If $k \le 0$ not only space but also time are unbounded.
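
The cycloid solution for $k > 0$ can be verified numerically. The sketch below is an added illustration, not part of the original notes: it assumes the parametrization $t(\varphi) = (D/k\sqrt{k})(\varphi - \tfrac12\sin 2\varphi)$, $a(\varphi) = (D/2k)(1 - \cos 2\varphi)$, estimates $\dot a$ by a central finite difference along the curve, and evaluates the residual of $\dot a^2 = D/a - k$. The function names and the step size are choices made here.

```python
import math


def cycloid(phi, D=1.0, k=1.0):
    """Return (t, a) on the cycloid for parameter phi, assuming k > 0."""
    t = D / (k * math.sqrt(k)) * (phi - 0.5 * math.sin(2.0 * phi))
    a = D / (2.0 * k) * (1.0 - math.cos(2.0 * phi))
    return t, a


def friedmann_residual(phi, D=1.0, k=1.0, dphi=1e-6):
    """Residual of (da/dt)^2 = D/a - k, with da/dt from a central difference."""
    t_lo, a_lo = cycloid(phi - dphi, D, k)
    t_hi, a_hi = cycloid(phi + dphi, D, k)
    _, a = cycloid(phi, D, k)
    a_dot = (a_hi - a_lo) / (t_hi - t_lo)
    return a_dot ** 2 - (D / a - k)


if __name__ == "__main__":
    for phi in (0.3, 0.8, 1.2, 2.0):
        print("phi =", phi, " residual =", friedmann_residual(phi))
    print("big crunch at (t, a) =", cycloid(math.pi))
```

The residuals vanish up to rounding errors, and at $\varphi = \pi$ the scale factor returns to zero after a finite time $\pi D/k\sqrt{k}$, which is the "big crunch" of the closed universe.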

Other cases, such as $p = -\varrho/3$ and $p = -\varrho$ are good exercises. But, since recent
findings very strongly indicated that $\Lambda > 0$, it is also of interest to see how a universe
evolves when it is dominated by a cosmological constant. We then have

$$\ddot a = \tfrac13\Lambda a , \qquad \dot a^2 = \tfrac13\Lambda a^2 , \tag{14.35}$$

so that

$$a = Ae^{\kappa t} \quad\hbox{or}\quad a = Ae^{-\kappa t} , \qquad \kappa = \sqrt{\Lambda/3} , \tag{14.36}$$

where $A$ is a constant. We see that, if $\Lambda > 0$, the universe expands exponentially (the
shrinking solution is physically unrealistic). It seems that our universe is heading for such
an exponential expansion. It also may have been in such a phase during a short period
right after the Big Bang. It is suspected that a scalar field $\varphi$ at that time may have been
stabilized towards a large, fixed value $\varphi_1$ where its self interaction $V(\varphi_1)$ may have taken
a non-vanishing value. The universe could have expanded by some 60 or so "efolds" (a
factor $e^{60} \approx 10^{26}$), and this is taken as a possible explanation of the question why our present
universe is so large, while it looks very uniform, as if all regions visible today may have
had a common past.
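
To get a feeling for the numbers, the sketch below evaluates the exponentially expanding solution of Eq. (14.36), with rate $\sqrt{\Lambda/3}$, and the growth factor that corresponds to a given number of e-folds. It is an added illustration; the function names and the choice $\Lambda = 3$ (so that the rate equals 1 in the chosen time unit) are assumptions made here.

```python
import math


def scale_factor(t, cosmological_constant, amplitude=1.0):
    """Expanding solution a(t) = A exp(sqrt(Lambda/3) t) for Lambda > 0."""
    rate = math.sqrt(cosmological_constant / 3.0)
    return amplitude * math.exp(rate * t)


def number_of_efolds(t, cosmological_constant):
    """Number of e-folds accumulated between time 0 and time t."""
    return math.sqrt(cosmological_constant / 3.0) * t


if __name__ == "__main__":
    lam = 3.0  # illustrative value, giving an expansion rate of exactly 1
    growth = scale_factor(60.0, lam) / scale_factor(0.0, lam)
    print("e-folds:", number_of_efolds(60.0, lam))
    print("growth factor after 60 e-folds:", growth)
```

Sixty e-folds correspond to a growth of the scale factor by roughly 26 orders of magnitude.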

15. Gravitational radiation. 

Fast moving objects form a time dependent source of the gravitational field, and causality 
arguments (information in the gravitational fields should not travel faster than light) then 
suggest that gravitational effects spread like waves in all directions from the source. Far 
from the source the metric will stay close to that of flat space-time. To calculate this 
effect one can adopt a linearized approximation. In contrast to what we did in previous 
chapters it is now convenient to choose units such that 

$$16\pi G_N = 1 . \tag{15.1}$$

The linearized Einstein equations were already treated in chapter 7, and in chapter 9
we see that, after gauge fixing, wave equations can be derived (in the absence of matter,
Eq. (9.17) can be set to zero). It is instructive to recast these equations in Euler-Lagrange
form. The Lagrangian for a linear equation however is itself quadratic. So we have to
expand the Einstein-Hilbert action to second order in the perturbations $h_{\mu\nu}$ in the metric:

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{15.2}$$

and after some calculations we find that the terms quadratic in $h_{\mu\nu}$ can be written as:

$$\mathcal L = \tfrac18\left(\partial_\sigma h_{\alpha\alpha}\right)^2 - \tfrac14\left(\partial_\sigma h_{\alpha\beta}\right)^2 - \tfrac12T_{\mu\nu}h_{\mu\nu}$$

$$\qquad + \tfrac12A_\sigma^2 + \hbox{total derivative} + \hbox{higher orders in } h , \tag{15.3}$$

where

$$A_\sigma = \partial_\mu h_{\mu\sigma} - \tfrac12\partial_\sigma h_{\mu\mu} , \tag{15.4}$$

and $T_{\mu\nu}$ is the energy momentum tensor of matter when present. Indices are summed
over with the flat metric $\eta_{\mu\nu}$, Eq. (7.2).

The Lagrangian is invariant under the linearized gauge transformation (compare (8.16)
and (8.17))

$$h_{\mu\nu} \rightarrow h_{\mu\nu} + \partial_\mu u_\nu + \partial_\nu u_\mu , \tag{15.5}$$

which transforms the quantity $A_\sigma$ into

$$A_\sigma \rightarrow A_\sigma + \partial^2u_\sigma . \tag{15.6}$$

One possibility to fix the gauge is to choose

$$A_\sigma = 0 \tag{15.7}$$

(the linearized De Donder gauge). For calculations this is a convenient gauge. But for a
better understanding of the real physical degrees of freedom in a radiating gravitational
field it is instructive first to look at the "radiation gauge" (which is analogous to the
electromagnetic case $\partial_iA_i = 0$):

$$\partial_ih_{ij} = 0 ; \qquad \partial_ih_{i4} = 0 , \tag{15.8}$$

where we stick to the earlier agreement that indices from the middle of the alphabet,
$i, j, \ldots$, in a summation run from 1 to 3. So we do not impose (15.7).

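First go to "momentum representation":

$$h(\vec x, t) = (2\pi)^{-3/2}\int d^3\vec k\;\hat h(\vec k, t)\,e^{i\vec k\cdot\vec x} ; \tag{15.9}$$

$$\partial_i \rightarrow ik_i . \tag{15.10}$$

We will henceforth omit the hat ($\hat{\ }$) since confusion is hardly possible. The advantage
of the momentum representation is that the different values of $\vec k$ will decouple, so we
can concentrate on just one $\vec k$ vector, and choose coordinates such that it is in the $z$
direction: $k_1 = k_2 = 0$, $k_3 = k$. We now decide to let indices from the beginning of the
alphabet run from 1 to 2. Then one has in the radiation gauge (15.8):

$$h_{3a} = h_{33} = h_{30} = 0 . \tag{15.11}$$

The components of $A_\sigma$, Eq. (15.4), then become

$$A_a = -\dot h_{0a} ,$$

$$A_3 = -\tfrac12ik\left(h_{aa} - h_{00}\right) ,$$

$$A_0 = \tfrac12\left(-\dot h_{00} - \dot h_{aa}\right) . \tag{15.12}$$

Let us split off the trace of $h_{ab}$:

$$h_{ab} = \tilde h_{ab} + \tfrac12\delta_{ab}h , \tag{15.13}$$

with

$$h = h_{aa} ; \qquad \tilde h_{aa} = 0 . \tag{15.14}$$

Then we find that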


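$$\mathcal L = \mathcal L_1 + \mathcal L_2 + \mathcal L_3 ,$$

$$\mathcal L_1 = \tfrac14\dot{\tilde h}_{ab}^{\,2} - \tfrac14k^2\tilde h_{ab}^2 - \tfrac12\tilde h_{ab}\tilde T_{ab} , \tag{15.15}$$

$$\mathcal L_2 = \tfrac12k^2h_{0a}^2 + h_{0a}T_{0a} , \tag{15.16}$$

$$\mathcal L_3 = -\tfrac18\dot h^2 + \tfrac18k^2h^2 - \tfrac12k^2h\,h_{00} - \tfrac12h_{00}T_{00} - \tfrac14h\,T_{aa} . \tag{15.17}$$

Here we used the abbreviated notation:

$$h^2 \stackrel{\rm def}{=} \int d^3\vec k\;h(\vec k, t)\,h(-\vec k, t) ,$$

$$k^2h^2 \stackrel{\rm def}{=} \int d^3\vec k\;k^2\,h(\vec k, t)\,h(-\vec k, t) . \tag{15.18}$$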

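The Lagrangian $\mathcal L_1$ has the usual form of a harmonic oscillator. Since $\tilde h_{ab} = \tilde h_{ba}$ and
$\tilde h_{aa} = 0$, there are only two degrees of freedom (forming a spin 2 representation of the
rotation group around the $\vec k$ axis: "gravitons" are particles with spin 2). $\mathcal L_2$ has no
kinetic term. It generates the following Euler-Lagrange equation:

$$h_{0a} = -\frac{1}{k^2}\,T_{0a} . \tag{15.19}$$

We can substitute this back into $\mathcal L_2$:

$$\mathcal L_2 = -\frac{1}{2k^2}\,T_{0a}^2 . \tag{15.20}$$

Since there are no further kinetic terms this Lagrangian produces directly a term in the
Hamiltonian:

$$H_2 = -\int\mathcal L_2\,d^3\vec k = \tfrac12\int T_{0i}(\vec x)\left[\Delta(\vec x - \vec y)\,\delta_{ij} - E_{ij}(\vec x - \vec y)\right]T_{0j}(\vec y)\,d^3\vec x\,d^3\vec y ; \tag{15.21}$$

with $\partial^2\Delta(\vec x - \vec y) = -\delta^3(\vec x - \vec y)$ and $\Delta = \dfrac{1}{4\pi|\vec x - \vec y|}$,

whereas $E_{ij}$ is obtained by solving the equations

$$\partial^2E_{ij}(\vec x - \vec y) = \partial_i\partial_j\Delta(\vec x - \vec y) \quad\hbox{and}\quad (x_i - y_i)\,E_{ij}(\vec x - \vec y) = 0 , \tag{15.22}$$

so that

$$E_{ij} = \frac{\delta_{ij}}{8\pi|\vec x - \vec y|} - \frac{(x - y)_i(x - y)_j}{8\pi|\vec x - \vec y|^3} . \tag{15.23}$$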

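Thus, $\mathcal L_2$ produces effects which are usually only very tiny relativistic corrections
to the instantaneous interactions between the Poynting components of the
stress-energy-momentum tensor.

In $\mathcal L_3$ we find that $h_{00}$ acts as a Lagrange multiplier. So the Euler-Lagrange equation
it generates is simply:

$$h = -\frac{1}{k^2}\,T_{00} , \tag{15.24}$$

leading to

$$\mathcal L_3 = -\dot T_{00}^2/8k^4 + T_{00}^2/8k^2 + T_{00}T_{aa}/4k^2 . \tag{15.25}$$

Now for the source we have in a good approximation

$$\partial_\mu T_{\mu\nu} = 0 , \tag{15.26}$$

$$\hbox{so}\quad ikT_{3\nu} = \dot T_{0\nu} \quad\hbox{and}\quad ikT_{30} = \dot T_{00} , \tag{15.27}$$

and therefore one can write

$$\mathcal L_3 = -T_{30}^2/8k^2 + T_{00}^2/8k^2 + T_{00}T_{aa}/4k^2 ; \tag{15.28}$$

$$H_3 = -\int\mathcal L_3\,d^3\vec k . \tag{15.29}$$

Here the second term is the dominant one:

$$-\int d^3\vec k\;T_{00}^2/8k^2 = -\int\frac{T_{00}(\vec x)\,T_{00}(\vec y)\,d^3\vec x\,d^3\vec y}{8\cdot4\pi|\vec x - \vec y|} = -\frac{G_N}{2}\int\frac{d^3\vec x\,d^3\vec y}{|\vec x - \vec y|}\,T_{00}(\vec x)\,T_{00}(\vec y) , \tag{15.30}$$

where we re-inserted Newton's constant. This is the linearized gravitational potential for
stationary mass distributions. The other terms have to be processed as in Eqs. (15.21) - (15.23).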
We observe that in the radiation gauge, $\mathcal L_2$ and $\mathcal L_3$ generate contributions to the forces
between the sources. It looks as if these forces are instantaneous, without time delay, but
this is an artifact peculiar to this gauge choice. There is gravitational radiation, but it is
all described by $\mathcal L_1$. We see that $\tilde T_{ab}$, the traceless, spacelike, transverse part (called the
TT part) of the energy momentum tensor acts as a source. Let us now consider a small,
localized source; only in a small region $V$ with dimensions much smaller than $1/k$. Then
we can use:

$$\int T_{ij}\,d^3\vec x = \tfrac12\,\partial_0^2\int x^ix^j\,T_{00}\,d^3\vec x . \tag{15.31}$$

This means that, when integrated, the space-space components of the energy momentum
tensor can be identified with the second time derivative of the quadrupole moment of the
mass distribution $T_{00}$.

We would like to know how much energy is emitted by this radiation. To do this let 
us momentarily return to electrodynamics, or even simpler, a scalar field theory. Take a 
Lagrangian of the form 

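$$\mathcal L = -\tfrac12\left(\partial_\mu\varphi\right)^2 - \varphi J . \tag{15.32}$$

Let $J$ be periodic in time:

$$J(\vec x, t) = J(\vec x)\,e^{-i\omega t} , \tag{15.33}$$

then the solution of the field equation (see the lectures about classical electrodynamics)
is at large $r$:

$$\varphi(\vec x, t) = -\frac{e^{i(kr - \omega t)}}{4\pi r}\int J(\vec x')\,d^3\vec x' ; \qquad k = \omega , \tag{15.34}$$

where $\vec x'$ is the retarded position where one measures $J$. Since we took the support $V$ of
our source to be very small compared to $1/k$ the integral here is just a spacelike integral.
The energy $P$ emitted per unit of time is

$$P = \frac{dE}{dt} = \frac{k^2}{4\pi}\left|\int J(\vec x')\,d^3\vec x'\right|^2 = \frac{1}{4\pi}\left|\int\partial_0J(\vec x)\,d^3\vec x\right|^2 . \tag{15.35}$$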

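Now this derivation was simple because we have been dealing with a scalar field. How
does one handle the more complicated Lagrangian $\mathcal L_1$ of Eq. (15.15)?

The traceless tensor

$$\tilde T_{ij} = T_{ij} - \tfrac13\delta_{ij}T_{kk} , \tag{15.36}$$

has 5 mutually independent components. Let us now define inner products for these 5
components by

$$\tilde T^{(1)}\cdot\tilde T^{(2)} = \tfrac12\tilde T^{(1)}_{ij}\tilde T^{(2)}_{ij} , \tag{15.37}$$

then (15.15) has the same form as (15.32), except that in every direction only 2 of the
5 components of $\tilde T_{ij}$ act. If we integrate over all directions we find that all components
of $\tilde T_{ij}$ contribute equally (because of rotational invariance), but the total intensity is just
2/5 of what it would have been if we had $\tilde T_{ij}$ in $\mathcal L_1$ instead of $\tilde T_{ab}$. Therefore, the energy
emitted in total will be

$$P = \frac25\cdot\frac{1}{4\pi}\cdot\frac12\left(\tfrac12\partial_0^3t_{ij}\right)^2 = \frac{1}{20\cdot4\pi}\left(\partial_0^3t_{ij}\right)^2 = \frac{G_N}{5}\left(\partial_0^3t_{ij}\right)^2 , \tag{15.38}$$

with, according to (15.31),

$$t_{ij} = \int\left(x^ix^j - \tfrac13\vec x^{\,2}\delta_{ij}\right)T_{00}\,d^3\vec x . \tag{15.39}$$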

For a bar with length $L$, in the $x$-direction, one has

$$t_{11} = \tfrac{1}{18}ML^2 , \qquad t_{22} = t_{33} = -\tfrac{1}{36}ML^2 . \tag{15.40}$$

If now we let it rotate in the $xy$ plane with angular velocity $\Omega$ then $t_{11}$, $t_{12}$ and $t_{22}$
each rotate with angular velocity $2\Omega$. This is easy to understand: Since, in Eq. (15.39),
every term rotates as $a + b\cos\Omega t$, the products of two of these must depend on
$\cos^2\Omega t = \tfrac12(\cos 2\Omega t + 1)$ and on $\cos\Omega t$. Because the rod turns into itself after half a
rotation, the dependence on the linear term $\cos\Omega t$ must vanish, and we are left with
$\cos 2\Omega t$:

$$t_{11} = a + b\cos 2\Omega t ; \qquad t_{22} = a - b\cos 2\Omega t , \qquad t_{33} = \hbox{const.} \tag{15.41}$$

where $a$ and $b$ now follow from: $a + b = ML^2/18$ and $a - b = -ML^2/36$. Therefore:

$$t_{11} = ML^2\left(\tfrac{1}{72} + \tfrac{1}{24}\cos 2\Omega t\right) , \tag{15.42}$$

$$t_{22} = ML^2\left(\tfrac{1}{72} - \tfrac{1}{24}\cos 2\Omega t\right) , \tag{15.43}$$

$$t_{12} = ML^2\left(\tfrac{1}{24}\sin 2\Omega t\right) . \tag{15.44}$$

Here, the expression for $t_{12}$ follows from the fact that a rotation of the bar by 45° turns
$t_{11} - t_{22}$ into $2t_{12}$ and $\cos 2\Omega t$ into $\sin 2\Omega t$.

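Only the rotating part contributes to the emitted energy per unit of time:

$$P = \frac{G_N}{5c^5}\,(2\Omega)^6\left(\frac{ML^2}{24}\right)^2\left[2\cos^2(2\Omega t) + 2\sin^2(2\Omega t)\right] = \frac{2G_N}{45c^5}\,M^2L^4\Omega^6 , \tag{15.46}$$

where we re-inserted the light velocity $c$ to balance the dimensionalities.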
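
The result for the rotating bar is easy to evaluate and to cross-check. The sketch below is an added illustration under stated assumptions: it takes the quadrupole components of the bar to be $t_{11} = ML^2(\tfrac1{72} + \tfrac1{24}\cos 2\Omega t)$, $t_{22} = ML^2(\tfrac1{72} - \tfrac1{24}\cos 2\Omega t)$, $t_{12} = ML^2\,\tfrac1{24}\sin 2\Omega t$, inserts their third time derivatives into $P = (G_N/5c^5)(\partial_0^3t_{ij})^2$, and compares with the closed form $2G_NM^2L^4\Omega^6/45c^5$. The constants are rounded SI values and the bar parameters are arbitrary example numbers.

```python
import math

# Rounded SI values, assumed for this illustration only.
G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2
LIGHT_SPEED = 2.998e8  # m / s


def bar_power_closed_form(M, L, Omega):
    """Radiated power 2 G M^2 L^4 Omega^6 / (45 c^5) of a rotating bar, in watts."""
    return 2.0 * G_NEWTON * M ** 2 * L ** 4 * Omega ** 6 / (45.0 * LIGHT_SPEED ** 5)


def bar_power_from_quadrupole(M, L, Omega, t=0.0):
    """Same power from P = (G / 5 c^5) * sum_ij (d^3 t_ij / dt^3)^2."""
    b = M * L ** 2 / 24.0      # amplitude of the rotating part
    w = 2.0 * Omega            # the quadrupole rotates at twice the bar frequency
    d3_t11 = b * w ** 3 * math.sin(w * t)    # third derivative of  b cos(w t)
    d3_t22 = -b * w ** 3 * math.sin(w * t)   # third derivative of -b cos(w t)
    d3_t12 = -b * w ** 3 * math.cos(w * t)   # third derivative of  b sin(w t)
    sum_of_squares = d3_t11 ** 2 + d3_t22 ** 2 + 2.0 * d3_t12 ** 2  # t12 = t21
    return G_NEWTON / (5.0 * LIGHT_SPEED ** 5) * sum_of_squares


if __name__ == "__main__":
    M, L, Omega = 4.9e5, 20.0, 28.0  # kg, m, rad/s: a large laboratory-scale bar
    print("closed form     [W]:", bar_power_closed_form(M, L, Omega))
    print("quadrupole sum  [W]:", bar_power_from_quadrupole(M, L, Omega, t=0.1))
```

Even for a bar of several hundred tonnes the emitted power is many orders of magnitude below anything measurable, which illustrates why only astrophysical sources are relevant.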

Eq. (15.38) for the emission of gravitational radiation remains valid as long as the 
movements are much slower than the speed of light and the linearized approximation is 
allowed. It also holds if the moving objects move just because they are in each other's
gravitational fields (a binary pulsar for example), but this does not follow from the above
derivation without any further discussion, because in our derivation it was assumed that
$\partial_\mu T_{\mu\nu} = 0$.
16. Concluding remarks 

Over the years, General Relativity has become a mature doctrine. Various precision
measurements have now been performed to verify various of its predictions. Until today, 
the theory has only received strong support from these experiments. As is well-known, 
a theory can never be proven to be correct by making experimental checks; it just came 
out stronger than it has been before. 

Every now and then, the theory is attacked by investigators who did not quite understand
its internal logic, which has actually been shown to be impeccable, or who believe
that various dubious experimental observations can be used to "overthrow" the theory.
One can find these on the Internet. They receive little or no support from the community, 
although one is always allowed to ask critical questions. 

A much more delicate situation exists when it comes to the question how General
Relativity should be reconciled with Quantum Mechanics. One simple observation can readily
be made. A theory that includes Quantum Mechanics will feature three fundamental
physical constants: Planck's constant $\hbar$, the velocity of light, $c$, and Newton's constant,
$G_N$. These can be combined to find natural units of length, time and mass:

$$L_{\rm Planck} = \sqrt{\hbar G_N/c^3} = 1.616\times10^{-33}\ \hbox{cm} ; \tag{16.1}$$

$$T_{\rm Planck} = \sqrt{\hbar G_N/c^5} = 5.391\times10^{-44}\ \hbox{sec} ; \tag{16.2}$$

$$M_{\rm Planck} = \sqrt{\hbar c/G_N} = 21.76\ \mu\hbox{g} ; \tag{16.3}$$

$$E_{\rm Planck} = M_{\rm Planck}\,c^2 = 1.221\times10^{28}\ \hbox{eV} . \tag{16.4}$$
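
These natural units follow from the three constants by simple arithmetic. The sketch below is an added illustration: it assumes the standard combinations $\sqrt{\hbar G_N/c^3}$, $\sqrt{\hbar G_N/c^5}$ and $\sqrt{\hbar c/G_N}$ and uses SI values of the constants as quoted in recent CODATA tables; the function name is hypothetical.

```python
import math

# SI values of the constants (CODATA-style values, assumed for this illustration).
HBAR = 1.054571817e-34        # J s
LIGHT_SPEED = 2.99792458e8    # m / s
G_NEWTON = 6.67430e-11        # m^3 kg^-1 s^-2
ELECTRON_VOLT = 1.602176634e-19  # J


def planck_units():
    """Return Planck length [cm], time [s], mass [microgram] and energy [eV]."""
    length_m = math.sqrt(HBAR * G_NEWTON / LIGHT_SPEED ** 3)
    time_s = math.sqrt(HBAR * G_NEWTON / LIGHT_SPEED ** 5)
    mass_kg = math.sqrt(HBAR * LIGHT_SPEED / G_NEWTON)
    energy_ev = mass_kg * LIGHT_SPEED ** 2 / ELECTRON_VOLT
    return {
        "length_cm": length_m * 100.0,
        "time_s": time_s,
        "mass_microgram": mass_kg * 1.0e9,
        "energy_eV": energy_ev,
    }


if __name__ == "__main__":
    for name, value in planck_units().items():
        print(name, "=", value)
```

The printed values reproduce the numbers quoted in Eqs. (16.1) - (16.4) to the precision given there.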

This is a domain of physics that is very difficult to reach experimentally, and all 
we can do is try to obtain as much as possible indirect evidence to construct a theory 
that combines the two doctrines. There exist various competing approaches: Superstring 
Theory, Loop Quantum Gravity, and Causal Dynamical Triangulation, to name a few. As 
of today, it seems that superstring theory is the most advanced of these theories, but it 
is based on rather bold assumptions and direct evidence in its favor is still lacking. The 
reason why it receives strong support is the quite delicate and elegant internal mathematical
coherence. Yet there are difficulties: the theory is not unambiguous, and its logical
foundations are still considered to be weak. 

An interesting, fairly robust result is Hawking's observation that black holes should 
actually radiate away their mass and energy by emitting particles of all types, at a tem- 
perature inversely proportional to the black hole mass. This appears to follow directly 
from the assumption of invariance under general coordinate transformations, and it is 
highly likely to be correct. This phenomenon provides us with a glimpse of what physics 
at the Planck scale should be like. For instance, one can conclude that the density of 
distinct quantum states at these very tiny length scales must be limited by strict bounds, 
as if the whole world is discretized there. Even professional physicists are puzzled and 
confused by this finding. 
