Ray d'Inverno, Introducing Einstein's Relativity
There is little doubt that Einstein's theory of relativity captures
the imagination. Not only has it radically altered the way we
view the universe, but the theory has a considerable number
[illegible]
topics of current interest that this book reaches, namely: black
holes, gravitational waves, and cosmology.
The main aim of this textbook is to provide students with a
sound mathematical introduction coupled to an understanding
of the physical insights needed to explore the subject. Indeed,
the book follows Einstein in that it introduces the theory very
[illegible]
special theory of relativity, the basic field equations of
[illegible]
to first solving them in simple cases and then exploring the
[illegible]
greatest achievements of the human mind. Yet, in this book,
the author makes it possible for students with a wide range
of abilities to deal confidently with the subject. Based on the
author's fifteen years' experience of teaching this subject, this
is mainly achieved by breaking down the main arguments
into simple logical steps. The book includes numerous
illustrative diagrams and exercises (of varying degrees of
difficulty), and as a result this book makes an excellent course
for any student coming to the subject for the first time.
ISBN 0-19-859686-3
OXFORD UNIVERSITY PRESS
Oxford University Press, Great Clarendon Street, Oxford OX2 6DP
Oxford New York
Athens Auckland Bangkok Bogota Bombay Buenos Aires Calcutta
Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul
Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai
Nairobi Paris São Paulo Singapore Taipei Tokyo Toronto Warsaw
and associated companies in
Berlin Ibadan
Oxford is a trade mark of Oxford University Press
Published in the United States
by Oxford University Press Inc, New York
© Ray d'Inverno, 1992
Reprinted 1993, 1995 (with corrections), 1996, 1998
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the prior
permission in writing of Oxford University Press. Within the UK, exceptions are
allowed in respect of any fair dealing for the purpose of research or private study, or
criticism or review, as permitted under the Copyright, Designs and Patents Act,
1988, or in the case of reprographic reproduction in accordance with the terms of
licences issued by the Copyright Licensing Agency. Enquiries concerning
reproduction outside those terms and in other countries should be sent to the Rights
Department, Oxford University Press, at the address above.
This book is sold subject to the condition that it shall not, by way
of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated
without the publisher's prior consent in any form of binding or cover
other than that in which it is published and without a similar condition
including this condition being imposed on the subsequent purchaser.
A catalogue record for this book is
available from the British Library
Library of Congress Cataloging in Publication Data
d'Inverno, R. A.
Introducing Einstein's relativity/R. A. d'Inverno.
Includes bibliographical references and index.
1. Relativity (Physics) 2. Black holes (Astronomy)
3. Gravitation. 4. Cosmology. 5. Calculus of tensors. I. Title.
QC173.55.I58 1992 530.1'1-dc20 91-24894
ISBN 0 19 859653 7 (Hbk)
ISBN 0 19 859686 3 (Pbk)
Printed in Malta by Interprint Limited
Contents
Overview
1. The organization of the book
1.1 Notes for the student
1.2 Acknowledgements
1.3 A brief survey of relativity theory
1.4 Notes for the teacher
1.5 A final note for the less able student
Exercises
Part A. Special Relativity
2. The k-calculus
2.1 Model building
2.2 Historical background
2.3 Newtonian framework
2.4 Galilean transformations
2.5 The principle of special relativity
2.6 The constancy of the velocity of light
2.7 The k-factor
2.8 Relative speed of two inertial observers
2.9 Composition law for velocities
2.10 The relativity of simultaneity
2.11 The clock paradox
2.12 The Lorentz transformations
2.13 The four-dimensional world view
Exercises
3. The key attributes of special relativity
3.1 Standard derivation of the Lorentz transformations
3.2 Mathematical properties of Lorentz transformations
3.3 Length contraction
3.4 Time dilation
3.5 Transformation of velocities
3.6 Relationship between space-time diagrams of inertial observers
3.7 Acceleration in special relativity
3.8 Uniform acceleration
3.9 The twin paradox
3.10 The Doppler effect
Exercises
4. The elements of relativistic mechanics
4.1 Newtonian theory
4.2 Isolated systems of particles in Newtonian mechanics
4.3 Relativistic mass
4.4 Relativistic energy
4.5 Photons
Exercises
Part B. The Formalism of Tensors
5. Tensor algebra
5.1 Introduction
5.2 Manifolds and coordinates
5.3 Curves and surfaces
5.4 Transformation of coordinates
5.5 Contravariant tensors
5.6 Covariant and mixed tensors
5.7 Tensor fields
5.8 Elementary operations with tensors
5.9 Index-free interpretation of contravariant vector fields
Exercises
6. Tensor calculus
6.1 Partial derivative of a tensor
6.2 The Lie derivative
6.3 The affine connection and covariant differentiation
6.4 Affine geodesics
6.5 The Riemann tensor
6.6 Geodesic coordinates
6.7 Affine flatness
6.8 The metric
6.9 Metric geodesics
6.10 The metric connection
6.11 Metric flatness
6.12 The curvature tensor
6.13 The Weyl tensor
Exercises
7. Integration, variation, and symmetry
7.1 Tensor densities
7.2 The Levi-Civita alternating symbol
7.3 The metric determinant
7.4 Integrals and Stokes' theorem
7.5 The Euler-Lagrange equations
7.6 The variational method for geodesics
7.7 Isometries
Exercises
Part C. General Relativity
8. Special relativity revisited
8.1 Minkowski space-time
8.2 The null cone
8.3 The Lorentz group
8.4 Proper time
8.5 An axiomatic formulation of special relativity
8.6 A variational principle approach to classical mechanics
8.7 A variational principle approach to relativistic mechanics
8.8 Covariant formulation of relativistic mechanics
Exercises
9. The principles of general relativity
9.1 The role of physical principles
9.2 Mach's principle
9.3 Mass in Newtonian theory
9.4 The principle of equivalence
9.5 The principle of general covariance
9.6 The principle of minimal gravitational coupling
9.7 The correspondence principle
Exercises
10. The field equations of general relativity
10.1 Non-local lift experiments
10.2 The Newtonian equation of deviation
10.3 The equation of geodesic deviation
10.4 The Newtonian correspondence
10.5 The vacuum field equations of general relativity
10.6 The story so far
10.7 The full field equations of general relativity
Exercises
11. General relativity from a variational principle
11.1 The Palatini equation
11.2 Differential constraints on the field equations
11.3 A simple example
11.4 The Einstein Lagrangian
11.5 Indirect derivation of the field equations
11.6 An equivalent Lagrangian
11.7 The Palatini approach
11.8 The full field equations
Exercises
23.4 The de Sitter model
23.5 The first models
23.6 The time-scale problem
23.7 Later models
23.8 The missing matter problem
23.9 The standard models
23.10 Early epochs of the universe
23.11 Cosmological coincidences
23.12 The steady-state theory
23.13 The event horizon of the de Sitter universe
23.14 Particle and event horizons
23.15 Conformal structure of Robertson-Walker space-times
23.16 Conformal structure of de Sitter space-time
23.17 Inflation
23.18 The anthropic principle
23.19 Conclusion
Exercises
Answers to exercises
Further reading
Selected bibliography
Index
1.1 Notes for the student
There is little doubt that relativity theory captures the imagination. Nor is it
surprising: the anti-intuitive properties of special relativity, the bizarre
characteristics of black holes, the exciting prospect of gravitational wave
detection and with it the advent of gravitational wave astronomy, and the
sheer scope and nature of cosmology and its posing of ultimate questions;
these and other issues combine to excite the minds of the inquisitive. Yet, if we
are to look at these issues meaningfully, then we really require both physical
insight and a sound mathematical foundation. The aim of this book is to help
provide these.
The book grew out of some notes I wrote in the mid-1970s to accompany a
UK course on general relativity. Originally, the course was a third-year
undergraduate option aimed at mathematicians and physicists. It subsequently
grew to include M.Sc. students and some first-year Ph.D. students.
Consequently, the notes, and with them the book, are pitched principally at the
undergraduate level, but they contain sufficient depth and coverage to
interest many students at the first-year graduate level. To help fulfil this dual
purpose, I have indicated the more advanced sections (level-two material) by
a grey shaded bar alongside the appropriate section. Level-one material is
essential to the understanding of the book, whereas level two is enrichment
material included for the more advanced student. To help put a bit more light
and shade into the book, the more important equations and results are given
in tint panels.
In designing the course, I set myself two main objectives. First of all, I
wanted the student to gain insight into, and confidence in handling, the basic
equations of the theory. From the mathematical viewpoint, this requires good
manipulative ability with tensors. Part B is devoted to developing the
necessary expertise in tensors for the rest of the book. It is essentially written
as a self-study unit. Students are urged to attempt all the exercises which
accompany the various sections. Experience has shown that this is the only
real way to be in a position to deal confidently with the ensuing material.
From the physical viewpoint, I think the best route to understanding
relativity theory is to follow the one taken by Einstein. Thus the second
chapter of Part C is devoted to discussing the principles which guided
Einstein in his search for a relativistic theory of gravitation. The field
equations are approached first from a largely physical viewpoint using these
principles and subsequently from a purely mathematical viewpoint using the
Indeed, the intellectual leap required by Einstein to move from the special
theory to the general theory is, there can be little doubt, one of the greatest in
the history of human thought. So it is not surprising that the theory has the
reputation it does. However, general relativity has been with us for some
three-quarters of a century and our understanding is such that we can now
build it up in a series of simple logical steps. This brings the theory within the
grasp of most undergraduates equipped with the right background.
1.2 Acknowledgements
Quite clearly, I owe a huge debt to all the authors who have provided the
source material for and inspiration of this book. However, I cannot make the
proper detailed acknowledgements to all these authors, because some of them
are not known even to me, and I would otherwise run the risk of leaving
somebody out. Most of the sources can be found in the bibliography given at
the end of the book, and some specific references can be found in the section
on further reading. I sincerely hope I have not offended anyone (authors or
publishers) in adopting this approach. I have written this book in the spirit
that any explanation that aids understanding should ultimately reside in the
pool of human knowledge and thence in the public domain. None the less, I
would like to thank all those who, wittingly or unwittingly, have made this
book possible. In particular, I would like to thank my old Oxford tutor, Alan
Tayler, since it was largely his backing that led finally to the book being
produced. In the process of converting the notes to a book, I have made a
number of changes, and have added sections, further exercises, and answers.
Consequently this new material, unlike the earlier, has not been vetted by the
student body and it seems more than likely that it may contain errors of one
sort or another. If this is the case, I hope that it does not detract too much
from the book and, of course, I would be delighted to receive corrections from
readers. However, I have sought some help and, in this respect, I would
particularly like to thank my colleague James Vickers for a critical reading of
much of the book.
Having said I do not wish to cite my sources, I now wish to make one
important exception. I think it would generally be accepted in the relativity
community that the most authoritative text in existence in the field is
The large scale structure of space-time by Stephen Hawking and George
Ellis (published by Cambridge University Press). Indeed, this has taken on
something akin to the status of the Bible in the field. However, it is written at
a level which is perhaps too sophisticated for most undergraduates (in parts
too sophisticated for most specialists!). When I compiled the notes, I had in
mind the aspiration that they might provide a small stepping stone to
Hawking and Ellis. In particular, I hoped it might become the next port of
call for anyone wishing to pursue their interest further. To that end, and
because I cannot improve on it, I have in places included extracts from that
source virtually verbatim. I felt that, if students were to consult this text, then
the familiarity of some of the material might instil confidence and encourage
them to delve deeper. I am hugely indebted to the authors for allowing me to
borrow from their superb book.
1.3 A brief survey of relativity theory
It might be useful, before embarking on the course proper, to attempt to give
some impression of the areas which come under the umbrella of relativity
theory. I have attempted this schematically in Fig. 1.1. This is a rather partial
and incomplete view, but should help to convey some idea of our planned
route. Most of the topics mentioned are being actively researched today. Of
course, they are interrelated in a much more complex way than the diagram
suggests.
Fig. 1.1 An individual survey of relativity. [Diagram: relativity, comprising special and general relativity and linked to differential geometry, differential topology, astrophysics, and cosmology, branches into the areas listed below.]
Experimental tests: orbits; gravitational waves; black holes; gravitational red shift; radar signals; light bending; gyroscopes.
Exact solutions: classification; equivalence problem; analytic extensions; singularities; cosmic strings; complex techniques; transformation groups; algebraic computing.
Formalisms: tensors; frames; forms; spinors; spin coefficients; twistors; variational principles; group representations.
Gravitational radiation: waves; energy transfer; conservation laws; equations of motion; asymptotic structure of space-time.
Gravitational collapse: black holes; singularity theorems; global techniques; cosmic censorship.
Initial value problem: Hamiltonian formulation; stability theorems; superspace; positive mass theorems; numerical relativity.
Alternative theories: torsion theories; Brans-Dicke; Hoyle-Narlikar; Whitehead; bimetric theories; etc.
Unified field theory: Kaluza-Klein theory.
Quantum gravity: canonical gravity; quantum theory on curved backgrounds; path-integral approach; supergravity; superstrings; etc.
Every few years since 1955 (in fact every three since 1959), the relativity
community comes together in an international conference of general
relativity and gravitation. The first such conference held in Berne in 1955 is now
referred to as GR0, with the subsequent ones numbered accordingly. The list,
to date, of the GR conferences is given in Table 1.1. At these conferences,
there are specialist discussion groups which are held covering the whole area
of interest. Prior to GR8, a list was published giving some detailed idea of
what each discussion group would cover. This is presented below and may be
used as an alternative to Fig. 1.1 to give an idea of the topics which comprise
the subject.
Table 1.1
GR0   1955  Bern, Switzerland
GR1   1957  Chapel Hill, North Carolina, USA
GR2   1959  Royaumont, France
GR3   1962  Jablonna, Poland
GR4   1965  London, England
GR5   1968  Tbilisi, USSR
GR6   1971  Copenhagen, Denmark
GR7   1974  Tel-Aviv, Israel
GR8   1977  Waterloo, Canada
GR9   1980  Jena, DDR
GR10  1983  Padua, Italy
GR11  1986  Stockholm, Sweden
GR12  1989  Boulder, Colorado, USA
I. Relativity and astrophysics
Relativistic stars and binaries; pulsars and quasars; gravitational waves and
gravitational collapse; black holes; X-ray sources and accretion models.
II. Relativity and classical physics
Equations of motion; conservation laws; kinetic theory; asymptotic flatness
and the positivity of energy; Hamiltonian theory, Lagrangians, and field
theory; relativistic continuum mechanics, electrodynamics, and
thermodynamics.
III. Mathematical relativity
Differential geometry and fibre bundles; the topology of manifolds;
applications of complex manifolds; twistors; causal and conformal structures;
partial differential equations and exact solutions; stability; geometric
singularities and catastrophe theory; spin and torsion; Einstein-Cartan theory.
IV. Relativity and quantum physics
Quantum theory on curved backgrounds; quantum gravity; gravitation and
elementary particles; black hole evaporation; quantum cosmology.
V. Cosmology
Galaxy formation; super-clustering; cosmological consequences of spontaneous
symmetry breakdown; domain structures; current estimates of cosmological
parameters; radio source counts; microwave background; the isotropy
of the universe; singularities.
VI. Observational and experimental relativity
Theoretical frameworks and viable theories; tests of relativity; gravitational
wave detection; solar oblateness.
VII. Computers in relativity
Numerical methods; solution of field equations; symbolic manipulation
systems in general relativity.
1.4 Notes for the teacher
In my twenty years as a university lecturer, I have undergone two major
conversions which have profoundly affected the way I teach. These have, in
their way, contributed to the existence of this book. The first conversion was
to the efficacy of the printed word. I began teaching, probably like most of my
colleagues, by giving lectures using the medium of chalk and talk. I soon
discovered that this led to something of a conflict in that the main thing that
students want from a course (apart from success in the exam) is a good set of
lecture notes, whereas what I really wanted was that they should understand
the course. The process of trying to give students a good set of lecture notes
meant that there was, to me, a lot of time wasted in the process of note taking.
I am sure colleagues know the caricature of the conventional lecture: notes
are copied from the lecturer's notebook to the student's notebook without
their going through the heads of either - a definition which is perhaps too
entrance interview. He followed up by asking me to define a tensor, and when
I rattled off a definition, he seemed somewhat surprised. Indeed, as it turned
out, we did not cover very much more than I first knew in the Oxford third
year specialist course on general relativity. So how was this possible?
I, too, had heard the story about how only a few people in the world really
understood relativity, and it had aroused my curiosity. I went to the local
library and, as luck would have it, I pulled out a book entitled Einstein's
Theory of Relativity by Lillian Lieber (1949). This is a very bizarre
book in appearance. The book is not set out in the usual way but rather as
though it were concrete poetry. Moreover, it is interspersed by surrealist
drawings by Hugh Lieber involving the symbols from the text (Fig. 1.2).
I must confess that at first sight the book looks rather cranky; but it is not.
I worked through it, filling in all the details missing from the calculations as
I went. What was amazing was that the book did not make too many
assumptions about what mathematics the reader needed to know. For
example, I had not then met partial differentiation in my school mathematics,
and yet there was sufficient coverage in the book for me to cope. It felt almost
as if the book had been written just for me. The combination of the intrinsic
interest of the material and the success I had in doing the intervening
calculations provided sufficient motivation for me to see the enterprise
through to the end.
Perhaps, if you consider yourself a less able student, you are a bit daunted
by the intellectual challenge that lies ahead. I will not deny that the book
includes some very demanding ideas (indeed, I do not understand every facet
of all of these ideas myself). But I hope the two facts that the arguments are
broken down into small steps and that the calculations are doable, will help
you on your way. Even if you decide to cut out after Part C, you will have
come a long way. Take heart from my little story: I am certain that if you
persevere you will consider it worth the effort in the end.
Fig. 1.2 “The product of two tensors is equal to another” according to Hugh Lieber.
Exercises
1.1 (§1.3) Go to the library and see if you can locate current
copies of the following journals:
(i) General Relativity and Gravitation;
(ii) Classical and Quantum Gravity;
(iii) Journal of Mathematical Physics;
(iv) Physical Review D.
See if you can relate any of the articles in them to any of the
topics contained in Fig. 1.1.
1.2 Look back through copies of Scientific American for
future reference, to see what articles there have been in
recent years on relativity theory, especially black holes,
gravitational waves, and cosmology.
1.3 Read a biography of Einstein (see Part A of the Selected
Bibliography at the end of this book).
2.1 Model building
Before we start, we should be clear what we are about. The essential activity
of mathematical physics, or theoretical physics, is that of modelling or model
building. The activity consists of constructing a mathematical model which
we hope in some way captures the essentials of the phenomena we are
investigating. I think we should never fail to be surprised that this turns out to
be such a productive activity. After all, the first thing you notice about the
world we inhabit is that it is an extremely complex place. The fact that so
much of this rich structure can be captured by what are, in essence, a set of
simple formulae is to me quite astonishing. Just think how simple Newton's
universal law of gravitation is; and yet it encompasses a whole spectrum of
phenomena from a falling apple to the shape of a globular cluster of stars. As
Einstein said, “The most incomprehensible thing about the world is that it is
comprehensible.”
The very success of the activity of modelling has, throughout the history of
science, turned out to be counterproductive. Time and again, the successful
model has been confused with the ultimate reality, and this in turn has
stultified progress. Newtonian theory provides an outstanding example of
this. So successful had it been in explaining a wide range of phenomena, that,
after more than two centuries of success, the laws had taken on an absolute
character. Thus it was that, when at the end of the nineteenth century it was
becoming increasingly clear that something was fundamentally wrong with
the current theories, there was considerable reluctance to make any funda-
mental changes to them. Instead, a number of artificial assumptions were
made in an attempt to explain the unexpected phenomena. It eventually
required the genius of Einstein to overthrow the prejudices of centuries and
demonstrate in a number of simple thought experiments that some of the
most cherished assumptions of Newtonian theory were untenable. This he
did in a number of brilliant papers written in 1905 proposing a theory which
has become known today as the special theory of relativity.
We should perhaps be discouraged from using words like right or wrong
when discussing a physical theory. Remembering that the essential activity is
model building, a model should then rather be described as good or bad,
depending on how well it describes the phenomena it encompasses. Thus,
Newtonian theory is an excellent theory for describing a whole range of
phenomena. For example, if one is concerned with describing the motion of a
car, then the Newtonian framework is likely to be the appropriate one.
Fig. 2.4 Two observed bodies and their inertial frames.
Fig. 2.5 Two frames in standard configuration at time t.
others are at rest or travel with constant velocity relative to it (for otherwise
Newton's first law would no longer be true). The transformation which
connects one inertial frame with another is called a Galilean transformation.
To fix ideas, let us consider two inertial frames called S and S' in standard
configuration, that is, with axes parallel and S' moving along S's positive
x-axis with constant velocity v (Fig. 2.5). We also assume that the observers
synchronize their clocks so that the origins of time are set when the origins of
the frames coincide. It follows from Fig. 2.5 that the Galilean transformation
connecting the two frames is given by
\[ x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t. \]
The last equation provides a manifestation of the assumption of absolute
time in Newtonian theory. Now, Newton's laws hold only in inertial frames.
From a mathematical viewpoint, this means that Newton's laws must be
invariant under a Galilean transformation.
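As a rough illustration of this invariance, the following Python sketch applies the Galilean transformation in its usual form for frames in standard configuration (x' = x - vt, y' = y, z' = z, t' = t, with v the speed of S' relative to S) to a uniformly accelerated particle, and checks numerically that both frames assign it the same acceleration. The function names, the sample trajectory, and the finite-difference check are illustrative assumptions and do not come from the text.

```python
def galilean(event, v):
    """Map an event (t, x, y, z) in S to (t', x', y', z') in S'.

    Assumes standard configuration: S' moves along the positive x-axis
    of S with constant speed v, and the clocks agree at the origin.
    """
    t, x, y, z = event
    return (t, x - v * t, y, z)


def trajectory(t, a=2.0, u=1.0):
    """Hypothetical particle in S: constant acceleration a along x."""
    return (t, u * t + 0.5 * a * t**2, 0.0, 0.0)


def x_acceleration(events):
    """Second finite difference of x over three equally spaced events."""
    (t0, x0, _, _), (t1, x1, _, _), (t2, x2, _, _) = events
    h = t1 - t0
    return (x2 - 2.0 * x1 + x0) / h**2


v = 3.0  # speed of S' relative to S (arbitrary units, assumed value)
times = [0.0, 0.5, 1.0]
in_S = [trajectory(t) for t in times]
in_S_prime = [galilean(e, v) for e in in_S]

acc_S = x_acceleration(in_S)
acc_S_prime = x_acceleration(in_S_prime)
print(acc_S, acc_S_prime)  # the two accelerations agree
```

Because the term vt is linear in t, it drops out of the second difference, so the acceleration, and with it the form of Newton's second law, is the same in both frames (assuming the force itself is frame independent).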
2.5 The principle of special relativity
We begin by stating the relativity principle which underpins Newtonian
theory
This means that, if one inertial observer carries out some dynamical
experiments and discovers a physical law, then any other inertial observer
performing the same experiments must discover the same law. Put another
way, these laws must be invariant under a Galilean transformation. That is to
say, if the law involves the coordinates x, y, z, t of an inertial observer S, then
the law relative to another observer S' will be the same with x, y, z, t replaced
by x', y', z', t', respectively. Many fundamental principles of physics are
statements of impossibility, and the above statement of the relativity principle
is equivalent to the statement of the impossibility of deciding, by performing
dynamical experiments, whether a body is absolutely at rest or in
uniform motion. In Newtonian theory, we cannot determine the absolute
position in space of an event, but only its position relative to some other
event. In exactly the same way, uniform velocity has only a relative significance;
we can only talk about the velocity of a body relative to some other.
Thus, both position and velocity are relative concepts.
Einstein realized that the principle as stated above is empty because there
is no such thing as a purely dynamical experiment. Even on a very elementary
level, any dynamical experiment we think of performing involves observation,
i.e. looking, and looking is a part of optics, not dynamics. In fact, the more
one analyses any one experiment, the more it becomes apparent that practically
all the branches of physics are involved in the experiment. Thus, Einstein
took the logical step of removing the restriction of dynamics in the principle
and took the following as his first postulate.
Hence we see that this principle is in no way a contradiction of Newtonian
thought, but rather constitutes its logical completion.
2.6 The constancy of the velocity of light
We previously defined an observer in Newtonian theory as someone equipped
with a clock and ruler with which to map the events of the universe.
However, the approach of the k-calculus is to dispense with the rigid ruler
and use radar methods for measuring distances. (What is rigidity anyway? If a
moving frame appears non-rigid in another frame, which, if either, is the rigid
one?) Thus, an observer measures the distance of an object by sending out a
light signal which is reflected off the object and received back by the observer.
The distance is then simply defined as half the time difference between
emission and reception. Note that by this method distances are measured in
intervals of time, like the light year or the light second (~10^10 cm).
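The radar definition can be sketched in a few lines of Python. The function below returns the distance as half the interval between emission and reception, so the result is in units of time (light seconds if the clock reads seconds); a second helper converts to metres using the defined value of c. The function names and the sample round-trip time of 2.56 s (roughly that of a signal bounced off the Moon) are assumptions made for illustration.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s (exact by definition)


def radar_distance(t_emit, t_receive):
    """Distance in light seconds: half the emission-reception interval."""
    if t_receive < t_emit:
        raise ValueError("reception cannot precede emission")
    return 0.5 * (t_receive - t_emit)


def light_seconds_to_metres(d):
    """Convert a distance expressed in light seconds to metres."""
    return d * C_M_PER_S


# Assumed example: the echo returns 2.56 s after emission.
d = radar_distance(0.0, 2.56)
print(d, light_seconds_to_metres(d))
```

Only one clock, carried by the observer, is needed; no ruler enters the measurement.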
Why use light? The reason is that we know that the velocity of light is
independent of many things. Observations from double stars tell us that the
velocity of light in vacuo is independent of the motion of the sources as well as
independent of colour, intensity, etc. For, if we suppose that the velocity of
light were dependent on the motion of the source relative to an observer (so
that if the source was coming towards us the light would be travelling faster
and vice versa) then we would no longer see double stars moving in Keplerian
orbits (circles, ellipses) about each other: their orbits would appear distorted;
yet no such distortion is observed. There are many experiments which
confirm this assumption. However, these were not known to Einstein in 1905,
who adopted the second postulate purely on heuristic grounds. We state the
second postulate in the following form.
Or stated another way: there is no overtaking of light by light in empty space.
The speed of light is conventionally denoted by c and has the exact numerical
value 2.997 924 580 × 10^8 m s^-1, but in this chapter we shall adopt relativistic
units in which c is taken to be unity (i.e. c = 1). Note, in passing, that another
reason for using radar methods is that other methods are totally impracticable
for large distances. In fact, these days, distances from the Earth to the
Moon and Venus can be measured very accurately by bouncing radar signals
off them.
2.7 The k-factor
For simplicity, we shall begin by working in two dimensions, one spatial
dimension and one time dimension. Thus, we consider a system of observers
distributed along a straight line, each equipped with a clock and a flashlight.
We plot the events they map in a two-dimensional space-time diagram. Let us
assume we have two observers, A at rest and B moving away from A with
uniform (constant) speed. Then, in a space-time diagram, the world-line of A
will be represented by a vertical straight line and the world-line of B by a
straight line at an angle to A's, as shown in Fig. 2.6.
A light signal in the diagram will be denoted by a straight line making an
angle of π/4 with the axes, because we are taking the speed of light to be 1. Now,
suppose A sends out a series of flashes of light to B, where the interval
between the flashes is denoted by T according to A's clock. Then it is
plausible to assume that the intervals of reception by B's clock are proportional
to T, say kT. Moreover, the quantity k, which we call the k-factor, is
Fig. 2.6 The world-lines of observers A and B (time plotted vertically, space horizontally).
Fig. 2.7 The reciprocal nature of the k-factor.
not exist or, if they do, they do not interact with ordinary matter. This
would seem to be just as well, for otherwise they could be used to signal
back into the past and so would appear to violate causality. For example,
it would be possible theoretically to construct a device which sent out a
tachyon at a given time and which would trigger a mechanism in the
device to blow it up before the tachyon was sent out!
2.10 The relativity of simultaneity
Consider two events P and Q which take place at the same time, according to
A, and also at points equal but opposite distances away. A could establish
this by sending out and receiving the light rays as shown in Fig. 2.12
(continuous lines). Suppose now that another inertial observer B meets A at
the time these events occur according to A. B also sends out light rays RQU
and SPV to illuminate the events, as shown (dashed lines). By symmetry
RU = SV and so these events are equidistant according to B. However, the
signal RQ was sent before the signal SP and so B concludes that the event Q
took place well before P. Hence, events that A judges to be simultaneous, B
judges not to be simultaneous. Similarly, A maintains that P, O, and Q
occurred simultaneously, whereas B maintains that they occurred in the
order Q, then O, and then P.
This relativity of simultaneity lies at the very heart of special relativity and
resolves many of the paradoxes that the classical theory gives rise to, such as
the Michelson-Morley experiment. Einstein realized the crucial role that
simultaneity plays in the theory and gave the following simple thought
experiment to illustrate its dependence on the observer. Imagine a train
travelling along a straight track with velocity v relative to an observer A on
the bank of the track. In the train, B is an observer situated at the centre of
one of the carriages. We assume that there are two electrical devices on the
track which are the length of the carriage apart and equidistant from A.
When the carriage containing B goes over these devices, they fire and activate
two light sources situated at each end of the carriage (Fig. 2.13). From the
configuration, it is clear that A will judge that the two events, when the light
sources first switch on, occur simultaneously. However, B is travelling
towards the light emanating from light source 2 and away from the light
emanating from light source 1. Since the speed of light is a constant, B will see
the light from source 2 before seeing the light from source 1, and so will
conclude that one light source comes on before the other.
Fig. 2.12 Relativity of simultaneity.
Fig. 2.13 Light signals emanating from the two sources (light sources 1 and 2 at the ends of the carriage, firing devices 1 and 2 on the track, B at the centre of the carriage moving with velocity v).
Fig. 2.14 Event relationships in special relativity (regions: absolute future, absolute past, elsewhere; boundary: the light cone).
Fig. 2.15 The clock paradox.
Fig. 2.16 Spatial analogue of clock paradox.
We can now classify event relationships in space and time in the following
manner. Consider any event O on A's world-line and the four regions, as
shown in Fig. 2.14, given by the light rays ending and commencing at O. Then
the event E is on the light ray leaving O and so occurs after O. Any other
inertial observer agrees on this; that is, no observer sees E illuminated before
A sends out the signal from O. The fact that E is illuminated (because A
originally sends out a signal at O) subsequent to O is a manifestation of
causality: the event O ultimately causes the event E. Similarly, the event F
can be reached by an inertial observer travelling from O with finite speed.
Again, all inertial observers agree that F occurs after O. Hence all the events
in this region are called the absolute future of O. In the same way, any event
occurring in the region vertically below takes place in O's absolute past.
However, the temporal relationship to O of events in the other two regions,
called elsewhere (or sometimes the relative past and relative future) will not
be something all observers will agree upon. For example, one class of
observers will say that G took place after O, another class before, and a third
class will say they took place simultaneously. The light rays entering and
leaving O constitute what is called the light cone or null cone at O (the fact
that it is a cone will become clearer later when we take all the spatial
dimensions into account). Note that the world-line of any inertial observer or
material particle passing through O must lie within the light cone at O.
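The classification above can be sketched numerically. The following is a minimal illustration, not part of the original text: it assumes units with c = 1, takes O as the origin, and uses the sign of $t^2 - x^2$ together with the special Lorentz transformation derived in §2.12. The function names and the sample event G are hypothetical.

```python
import math

def classify_event(t, x, tol=1e-12):
    """Classify the event (t, x) relative to O = (0, 0), in units with c = 1."""
    s2 = t * t - x * x  # the sign of t^2 - x^2 decides the region
    if abs(s2) <= tol:
        return "light cone"
    if s2 > 0:
        return "absolute future" if t > 0 else "absolute past"
    return "elsewhere"

def time_for_moving_observer(t, x, v):
    """Time coordinate of (t, x) for an inertial observer with speed v, |v| < 1."""
    return (t - v * x) / math.sqrt(1.0 - v * v)

# A sample event G in the 'elsewhere' region (illustrative values only).
G = (0.5, 2.0)
print(classify_event(*G))
for v in (0.0, 0.25, 0.5):
    # Different observers disagree on whether G is after, with, or before O.
    print(v, time_for_moving_observer(G[0], G[1], v))
```

For the sample event, the observer with v = 0 places G after O, the one with v = 0.25 judges it simultaneous with O, and the one with v = 0.5 places it before O, which is the disagreement described in the text.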
2.11 The clock paradox
Consider three inertial observers as shown in Fig. 2.15, with the relative
velocity $v_{AC} = -v_{AB}$. Assume that A and B synchronize their clocks at O and
that C's clock is synchronized with B's at P. Let B and C meet after a time T
according to B, whereupon they emit a light signal to A. According to the
k-calculus, A receives the signal at R after a time kT since meeting B.
Remembering that C is moving with the opposite velocity to B (so that
$k \to k^{-1}$), then A will meet C at Q after a subsequent time lapse of $k^{-1}T$. The
total time that A records between events O and Q is therefore $(k + k^{-1})T$. For
$k \neq 1$, this is greater than the combined time intervals 2T recorded between
events OP and PQ by B and C, since $k + k^{-1} - 2 = (k - 1)^2/k > 0$. But should not the time lapse between the two
events agree? This is one form of the so-called clock paradox.
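As a quick numerical check of the inequality, here is a small sketch in units with c = 1. It assumes the k-factor formula $k = ((1+v)/(1-v))^{1/2}$, which the text refers to as equation (2.4); the function names are illustrative only.

```python
import math

def k_factor(v):
    """k-factor for relative speed v (units with c = 1), assumed form of (2.4)."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def clock_paradox_times(v, T):
    """Return (time recorded by A between O and Q, combined time of B and C)."""
    k = k_factor(v)
    time_A = (k + 1.0 / k) * T
    time_B_and_C = 2.0 * T
    return time_A, time_B_and_C

time_A, time_BC = clock_paradox_times(0.6, 1.0)
print(time_A, time_BC)  # A records more time than B and C together
```

With v = 0.6 the k-factor is 2, so A records 2.5T against the 2T accumulated by B and C.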
However, it is not really a paradox, but rather what it shows is that in
relativity time, like distance, is a route-dependent quantity. The point is that
the 2T measurement is made by two inertial observers, not one. Some people
have tried to reverse the argument by setting B and C to rest, but this is not
possible since they are in relative motion to each other. Another argument
says that, when B and C meet, C should take B's clock and use it. But, in this
case, the clock would have to be accelerated when being transferred to C and
so it is no longer inertial. Again, some opponents of special relativity (e.g.
H. Dingle) have argued that the short period of acceleration should not make
such a difference, but this is analogous to saying that a journey between two
points which is straight nearly all the time is about the same length as one
which is wholly straight (as shown), which is absurd (Fig. 2.16). The moral is
that in special relativity time is a more difficult concept to work with than the
absolute time of Newton.
A more subtle point revolves around the implicit assumption that the
clocks of A and B are 'good' clocks, i.e. that the seconds of A's clock are the
same as those of B's clock. One suggestion is that A has two clocks and
adjusts the tick rate until they are the same and then sends one of them to B at
a very slow rate of acceleration. The assumption here is that the very slow
rate of acceleration will not affect the tick rate of the clock. However, what is
there to say that a clock may not be able to somehow add up the small bits of
acceleration and so affect its performance? A more satisfactory approach
would be for A and B to use identically constructed atomic clocks (which is
after all what physicists use today to measure time). The objection then arises
that their construction is based on ideas in quantum physics which is, a priori,
outside the scope of special relativity. However, this is a manifestation of a
point raised earlier, that virtually any real experiment which one can imagine
carrying out involves more than one branch of physics. The whole structure is
intertwined in a way which cannot easily be separated.
2.12 The Lorentz transformations
We have derived a number of important results in special relativity, which
only involve one spatial dimension, by use of the k-calculus. Other results
follow essentially from the transformations connecting inertial observers, the
famous Lorentz transformations. We shall finally use the k-calculus to derive
these transformations.
Let event P have coordinates (t, x) relative to A and (t', x') relative to B
(Fig. 2.17). Observer A must send out a light ray at time t - x to illuminate P
at time t and also receive the reflected ray back at t + x (check this from
(2.2)). The world-line of A is given by x = 0, and the origin of A's time
coordinate t is arbitrary. Similar remarks apply to B, where we use primed
quantities for B's coordinates (t', x'). Assuming A and B synchronize their
clocks when they meet, then the k-calculus immediately gives
$$t' - x' = k(t - x), \qquad t + x = k(t' + x'). \qquad (2.7)$$
After some rearrangement, and using equation (2.4), we obtain the so-called
special Lorentz transformation
$$t' = \frac{t - vx}{(1 - v^2)^{1/2}}, \qquad x' = \frac{x - vt}{(1 - v^2)^{1/2}}. \qquad (2.8)$$
This is also referred to as a boost in the x-direction with speed v, since it takes
one from A's coordinates to B's coordinates and B is moving away from A
with speed v. Some simple algebra reveals the result (exercise)
$$t'^2 - x'^2 = t^2 - x^2,$$
showing that the quantity $t^2 - x^2$ is an invariant under a special Lorentz
transformation or boost.
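The rearrangement from the k-calculus relations to the boost can be checked numerically. The sketch below works in units with c = 1 and assumes the k-factor formula $k = ((1+v)/(1-v))^{1/2}$ referred to as (2.4); the function names are illustrative, not from the text.

```python
import math

def k_factor(v):
    """Assumed form of equation (2.4), in units with c = 1."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def boost_via_k(t, x, v):
    """Solve t' - x' = k (t - x) and t + x = k (t' + x') for (t', x')."""
    k = k_factor(v)
    diff = k * (t - x)      # t' - x'
    total = (t + x) / k     # t' + x'
    return (total + diff) / 2.0, (total - diff) / 2.0

def boost_direct(t, x, v):
    """Special Lorentz transformation in its usual closed form."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

t, x, v = 3.0, 1.0, 0.6
tp, xp = boost_via_k(t, x, v)
print((tp, xp), boost_direct(t, x, v))
print(t * t - x * x, tp * tp - xp * xp)  # the invariant t^2 - x^2
```

Both routes give the same primed coordinates, and $t^2 - x^2$ is unchanged by the boost.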
To obtain the corresponding formulas in the case of three spatial dimensions
we consider Fig. 2.5 with two inertial frames in standard configuration.
Now, since by assumption the xz-plane (y = 0) of A must coincide with the
x'z'-plane (y' = 0) of B, then the y and y' coordinates must be connected by a
transformation of the form
$$y = n y', \qquad (2.9)$$
Fig. 2.17 Coordinatization of events by
inertial observers.
this direction with speed v transforms S to a frame which is at rest relative to
S'. A final rotation lines up the coordinate frame with that of S'. The spatial
rotations introduce no new physics. The only new physical information arises
from the boost and that is why we can, without loss of generality, restrict our
attention to a special Lorentz transformation.
Exercises
2.1 (§2.4) Write down the Galilean transformation from
observer S to observer S', where S' has velocity $v_1$ relative to
S. Find the transformation from S' to S and state in simple
terms how the transformations are related. Write down the
Galilean transformation from S' to S'', where S'' has velocity
$v_2$ relative to S'. Find the transformation from S to S''. Prove
that the Galilean transformations form an Abelian (commutative)
group.
2.2 (§2.7) Draw the four fundamental k-factor diagrams
(see Fig. 2.7) for the cases of two inertial observers A and B
approaching and receding with uniform velocity v:
(i) as seen by A;
(ii) as seen by B.
2.3 (§2.8) Show that $v \to -v$ corresponds to $k \to k^{-1}$. If
k > 1 corresponds physically to a red shift of recession, what
does k < 1 correspond to?
2.4 (§2.9) Show that (2.6) follows from (2.5). Use the composition
law for velocities to prove that if $0 < v_{AB} < 1$ and
$0 < v_{BC} < 1$, then $0 < v_{AC} < 1$.
2.5 (§2.9) Establish the fact that if $v_{AB}$ and $v_{BC}$ are small
compared with the velocity of light, then the composition
law for velocities reduces to the standard additive law of
Newtonian theory.
2.6 (§2.10) In the event diagram of Fig. 2.14, find a geometrical
construction for the world-line of an inertial observer
passing through O who considers event G as occurring
simultaneously with O. Hence describe the world-lines of
inertial observers passing through O who consider G as
occurring before or after O.
2.7 (§2.11) Draw Fig. 2.15 from B's point of view. Coordinatize
the events O, R, and Q with respect to B and find
the times between O and R, and R and Q, and compare them
with A's timings.
2.8 (§2.12) Deduce (2.8) from (2.7). Use (2.7) to deduce
directly that
$$t'^2 - x'^2 = t^2 - x^2.$$
Confirm the equality under the transformation formula (2.8).
2.9 (§2.12) In S, two events occur at the origin and a
distance X along the x-axis simultaneously at t = 0. The
time interval between the events in S' is T. Show that the
spatial distance between the events in S' is $(X^2 + T^2)^{1/2}$
and determine the relative velocity v of the frames in terms of
X and T.
2.10 (§2.13) Show that the interval between two events
$(t_1, x_1, y_1, z_1)$ and $(t_2, x_2, y_2, z_2)$ defined by
$$s^2 = (t_1 - t_2)^2 - (x_1 - x_2)^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2$$
is invariant under a special Lorentz transformation.
Deduce the Minkowski line element (2.13) for infinitesimally
separated events. What does $s^2$ become if $t_1 = t_2$, and how is
it related to the Euclidean distance $\sigma$ between the two
events?
3.1 Standard derivation of the Lorentz
transformations
We start this chapter by deriving again the Lorentz transformations, but
this time by using a more standard approach. We shall work in non-relativistic
units in which the speed of light is denoted by c. We restrict
attention to two inertial observers S and S' in standard configuration. As
before, we shall show that the Lorentz transformations follow from the two
postulates, namely, the principle of special relativity and the constancy of the
velocity of light.
Now, by the first postulate, if the observer S sees a free particle, that is, a
particle with no forces acting on it, travelling in a straight line with constant
velocity, then so will S'. Thus, using vector notation, it follows that under a
transformation connecting the two frames
$$\mathbf{r} = \mathbf{r}_0 + \mathbf{u}t \iff \mathbf{r}' = \mathbf{r}'_0 + \mathbf{u}'t'.$$
Since straight lines get mapped into straight lines, it suggests that the
transformation between the frames is linear and so we shall assume that the
transformation from S to S' can be written in matrix form
$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = L \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}, \qquad (3.1)$$
where L is a 4 x 4 matrix of quantities which can only depend on the speed of
separation v. Using exactly the same argument as we used at the end of
§2.12, the assumption that space is isotropic leads to the transformations of y
and z being
$$y' = y \quad\text{and}\quad z' = z. \qquad (3.2)$$
We next use the second postulate. Let us assume that, when the origins of S
and S' are coincident, they zero their clocks, i.e. t = t' = 0, and emit a flash of
light. Then, according to S, the light flash moves out radially from the origin
with speed c. The wave front of light will constitute a sphere. If we
define the quantity I by
$$I(t, x, y, z) = x^2 + y^2 + z^2 - c^2t^2,$$
then the events comprising this sphere must satisfy I = 0. By the second
30 | The key attributes of special relativity
Fig. 3.1 A rotation in (x, T)-space.
postulate, S' must also see the light move out in a spherical wave front with
speed c and satisfy
$$I' = x'^2 + y'^2 + z'^2 - c^2t'^2 = 0.$$
Thus it follows that, under a transformation connecting S and S',
$$I = 0 \iff I' = 0, \qquad (3.3)$$
and since the transformation is linear by (3.1), we may conclude
$$I = nI', \qquad (3.4)$$
where n is a quantity which can only depend on v. Using the same argument
as we did in §2.12, we can reverse the role of S and S' and so by the relativity
principle we must also have
$$I' = nI. \qquad (3.5)$$
Combining the last two equations we find
$$n^2 = 1 \implies n = \pm 1.$$
In the limit as $v \to 0$, the two frames coincide and $I' \to I$, from which we
conclude that we must take n = 1.
Substituting n = 1 in (3.4), this becomes
$$x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2,$$
and, using (3.2), this reduces to
$$x^2 - c^2t^2 = x'^2 - c^2t'^2. \qquad (3.6)$$
We next introduce imaginary time coordinates T and T' defined by
$$T = ict, \qquad (3.7)$$
$$T' = ict', \qquad (3.8)$$
in which case equation (3.6) becomes
$$x^2 + T^2 = x'^2 + T'^2.$$
In a two-dimensional (x, T)-space, the quantity $x^2 + T^2$ represents the
squared distance of a point P from the origin. This will only remain invariant under a
rotation in (x, T)-space (Fig. 3.1). If we denote the angle of rotation by $\theta$, then
a rotation is given by
$$x' = x\cos\theta + T\sin\theta, \qquad (3.9)$$
$$T' = -x\sin\theta + T\cos\theta. \qquad (3.10)$$
Now, the origin of S' (x' = 0), as seen by S, moves along the positive x-axis of
S with speed v and so must satisfy x = vt. Thus, we require
$$x' = 0 \iff x = vt \iff x = vT/ic,$$
using (3.7). Substituting this into (3.9) gives
$$\tan\theta = iv/c, \qquad (3.11)$$
from which we see that the angle $\theta$ is imaginary as well. We can obtain an
expression for $\cos\theta$, using
$$\cos\theta = \frac{1}{\sec\theta} = \frac{1}{(1 + \tan^2\theta)^{1/2}} = \frac{1}{(1 - v^2/c^2)^{1/2}}.$$
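The rotation through an imaginary angle can be tried out with complex arithmetic. This is only a sketch under stated assumptions: the angle is written as $\theta = i\,\tanh^{-1}(v/c)$, which satisfies (3.11), and the function name and sample values are hypothetical.

```python
import cmath
import math

def boost_by_imaginary_rotation(t, x, v, c=1.0):
    """Apply the rotation (3.9)-(3.10) in (x, T)-space with T = i c t."""
    theta = 1j * math.atanh(v / c)   # chosen so that tan(theta) = i v / c
    T = 1j * c * t                   # imaginary time coordinate (3.7)
    x_new = x * cmath.cos(theta) + T * cmath.sin(theta)
    T_new = -x * cmath.sin(theta) + T * cmath.cos(theta)
    t_new = T_new / (1j * c)         # invert (3.8) to recover t'
    return t_new.real, x_new.real

t, x, v = 3.0, 1.0, 0.6
t_new, x_new = boost_by_imaginary_rotation(t, x, v)
gamma = 1.0 / math.sqrt(1.0 - v * v)
print(t_new, x_new)
print(gamma * (t - v * x), gamma * (x - v * t))  # usual boost, for comparison
```

The rotated coordinates come out real and agree with the usual form of the boost, which is the content of the derivation above.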
then, subtracting the formulae in (3.14), we find the result
$$l = \beta^{-1} l_0, \qquad \beta = (1 - v^2/c^2)^{-1/2}.$$
Since
$$|v| < c \iff \beta > 1 \iff l < l_0,$$
the result shows that the length of a body in the direction of its motion with
uniform velocity v is reduced by a factor $(1 - v^2/c^2)^{1/2}$. This phenomenon is
called length contraction. Clearly, the body will have greatest length in its
rest frame, in which case it is called the rest length or proper length. Note also
that the length approaches zero as the velocity approaches the velocity of
light.
In an attempt to explain the null result of the Michelson-Morley experiment,
Fitzgerald had suggested the apparent shortening of a body in motion
relative to the ether. This is rather different from the length contraction of
special relativity, which is not to be regarded as illusory but is a very real
effect. It is closely connected with the relativity of simultaneity and indeed can
be deduced as a direct consequence of it. Unlike the Fitzgerald contraction,
the effect is relative, i.e. a rod fixed in S appears contracted in S'. Note also
that there are no contraction effects in directions transverse to the direction of
motion.
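The link between length contraction and simultaneity can be illustrated with a short calculation. The sketch below assumes a rod at rest in S' with ends at x' = 0 and x' = rest_length, and uses $x' = \beta(x - vt)$ to locate both ends at one and the same time t in S; the names are illustrative only.

```python
import math

def beta_factor(v, c=1.0):
    """The factor beta = (1 - v^2/c^2)^(-1/2) used in the text."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def length_measured_in_S(rest_length, v, t=0.0, c=1.0):
    """Length in S of a rod at rest in S', both ends located at the same S-time t."""
    beta = beta_factor(v, c)
    x_rear = 0.0 / beta + v * t            # from x' = beta (x - v t) with x' = 0
    x_front = rest_length / beta + v * t   # same relation with x' = rest_length
    return x_front - x_rear

print(length_measured_in_S(10.0, 0.6))         # contracted length at t = 0
print(length_measured_in_S(10.0, 0.6, t=3.0))  # same result at any common time t
```

The measured length is the rest length divided by $\beta$, independent of the common time chosen, as long as both ends are marked simultaneously in S.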
3.4 Time dilation
Let a clock fixed at $x' = x'_A$ in S' record two successive events separated by an
interval of time $T_0$ (Fig. 3.3). The successive events in S' are $(x'_A, t'_1)$ and
$(x'_A, t'_1 + T_0)$, say. Using the Lorentz transformation, we have in S
$$t_1 = \beta(t'_1 + vx'_A/c^2), \qquad t_2 = \beta(t'_1 + T_0 + vx'_A/c^2).$$
On subtracting, we find the time interval in S defined by
$$T = t_2 - t_1$$
is given by
$$T = \beta T_0.$$
Fig. 3.3 Successive events recorded by a clock fixed in S'.
Thus, moving clocks go slow by a factor $(1 - v^2/c^2)^{1/2}$. This phenomenon is
called time dilation. The fastest rate of a clock is in its rest frame and is called
its proper rate. Again, the effect has a reciprocal nature.
Let us now consider an accelerated clock. We define an ideal clock to be
one unaffected by its acceleration; in other words, its instantaneous rate
depends only on its instantaneous speed v, in accordance with the above
phenomenon of time dilation. This is often referred to as the clock hypothesis.
The time recorded by an ideal clock is called the proper time $\tau$ (Fig. 3.4).
Thus, the proper time of an ideal clock between $t_0$ and $t_1$ is given by
$$\tau = \int_{t_0}^{t_1} \left(1 - \frac{v^2}{c^2}\right)^{1/2} dt.$$
Fig. 3.4 Proper time recorded by an accelerated clock.
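Under the clock hypothesis, the proper time is the integral of $(1 - v^2/c^2)^{1/2}$ over coordinate time, which is easy to approximate numerically. The following is a minimal sketch using the midpoint rule; the function name, the step count and the sample speed profiles are assumptions made for illustration.

```python
import math

def proper_time(speed, t0, t1, steps=10000, c=1.0):
    """Approximate the proper time of an ideal clock with speed(t) on [t0, t1]."""
    dt = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t0 + (i + 0.5) * dt   # midpoint of the i-th subinterval
        v = speed(t_mid)
        total += math.sqrt(1.0 - (v / c) ** 2) * dt
    return total

# Uniform speed: reduces to ordinary time dilation.
print(proper_time(lambda t: 0.6, 0.0, 10.0))

# An accelerating clock (c = 1) with v(t) = a t / sqrt(1 + a^2 t^2), a = 1.
print(proper_time(lambda t: t / math.sqrt(1.0 + t * t), 0.0, 2.0))
```

For the uniform case the result is the coordinate interval divided by $\beta$; for the accelerating profile chosen here the integral has the closed form $\sinh^{-1}(at)/a$, which the numerical value approaches.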
The general question of what constitutes a clock or an ideal clock is a non-
trivial one. However, an experiment has been performed where an atomic
clock was flown round the world and then compared with an identical clock
left back on the ground. The travelling clock was found on return to be
running slow by precisely the amount predicted by time dilation. Another
instance occurs in the study of cosmic rays. Certain mesons reaching us from
the top of the Earth's atmosphere are so short-lived that, even had they been
travelling at the speed of light, their travel time in the absence of time dilation
would exceed their known proper lifetimes by factors of the order of 10.
However, these particles are in fact detected at the Earth's surface because
their very high velocities keep them young, as it were. Of course, whether or
not time dilation affects the human clock, that is, biological ageing, is still an
open question. But the fact that we are ultimately made up of atoms, which
do appear to suffer time dilation, would suggest that there is no reason why
we should be an exception.
3.5 Transformation of velocities
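Consider a particle in motion (Fig. 3.5) with its Cartesian components of
velocity being
$$(u_1, u_2, u_3) = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right) \quad\text{in } S$$
and
$$(u'_1, u'_2, u'_3) = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right) \quad\text{in } S'.$$
Taking differentials of a Lorentz transformation
$$t' = \beta(t - vx/c^2), \qquad x' = \beta(x - vt), \qquad y' = y, \qquad z' = z,$$
we get
$$dt' = \beta(dt - v\,dx/c^2), \qquad dx' = \beta(dx - v\,dt), \qquad dy' = dy, \qquad dz' = dz,$$
and hence
$$u'_1 = \frac{dx'}{dt'} = \frac{\beta(dx - v\,dt)}{\beta(dt - v\,dx/c^2)} = \frac{dx/dt - v}{1 - (v/c^2)\,dx/dt} = \frac{u_1 - v}{1 - u_1v/c^2}, \qquad (3.18)$$
$$u'_2 = \frac{dy'}{dt'} = \frac{dy}{\beta(dt - v\,dx/c^2)} = \frac{dy/dt}{\beta[1 - (v/c^2)\,dx/dt]} = \frac{u_2}{\beta(1 - u_1v/c^2)}, \qquad (3.19)$$
$$u'_3 = \frac{dz'}{dt'} = \frac{dz}{\beta(dt - v\,dx/c^2)} = \frac{dz/dt}{\beta[1 - (v/c^2)\,dx/dt]} = \frac{u_3}{\beta(1 - u_1v/c^2)}. \qquad (3.20)$$
Fig. 3.5 Particle in motion relative to S and S'.
Notice that the velocity components $u_2$ and $u_3$ transverse to the direction of
motion of the frame S' are affected by the transformation. This is due to the
time difference in the two frames. To obtain the inverse transformations,
simply interchange primes and unprimes and replace v by -v.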
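The velocity transformation lends itself to a direct numerical check. This sketch implements the three component formulas as reconstructed here and verifies two expected properties: a light ray keeps speed c, and applying the transformation with -v undoes it. The function name and sample values are illustrative.

```python
import math

def transform_velocity(u, v, c=1.0):
    """Components of velocity in S' from those in S, for frame speed v along x."""
    u1, u2, u3 = u
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - u1 * v / c ** 2
    return ((u1 - v) / denom, u2 / (beta * denom), u3 / (beta * denom))

# A light ray moving along the y-axis of S, seen from S' with v = 0.6 (c = 1).
u_prime = transform_velocity((0.0, 1.0, 0.0), 0.6)
print(u_prime, math.sqrt(sum(w * w for w in u_prime)))

# Inverse transformation: replace v by -v.
print(transform_velocity(u_prime, -0.6))
```

Note how the transverse component changes from 1 to 0.8 even though the frame moves only along x, while the overall speed of the light ray stays equal to c.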
3.6 Relationship between space-time
diagrams of inertial observers
We now show how to relate the space-time diagrams of S and S' (see Fig. 3.6).
We start by taking ct and x as the coordinate axes of S, so that a light ray is
inclined at $\tfrac{1}{4}\pi$ to the axes (as in relativistic units). Then, to draw the ct'- and x'-axes of S', we
note from the Lorentz transformation equations (3.12)
$$ct' = 0 \iff ct = (v/c)x,$$
that is, the x'-axis, ct' = 0, is the straight line ct = (v/c)x with slope v/c < 1.
Similarly,
$$x' = 0 \iff ct = (c/v)x,$$
that is, the ct'-axis, x' = 0, is the straight line ct = (c/v)x with slope c/v > 1.
The lines parallel to O(ct') are the world-lines of fixed points in S'. The lines
parallel to Ox' are the lines connecting points at a fixed time according to S'
and are called lines of simultaneity in S'. The coordinates of a general event
P are (ct, x) = (OR, OQ) relative to S and (ct', x') = (OV, OU) relative to S'.
However, the diagram is somewhat misleading because the length scales
along the axes are not the same. To relate them, we draw in the hyperbolae
$$x^2 - c^2t^2 = x'^2 - c^2t'^2 = \pm 1,$$
as shown in Fig. 3.7. Then, if we first consider the positive sign, setting ct' = 0,
we get $x' = \pm 1$. It follows that OA is a unit distance on Ox'. Similarly, taking
the negative sign and setting x' = 0 we get $ct' = \pm 1$ and so OB is the unit
measure on Oct'. Then the coordinates of P in the frame S' are given by
$$(ct', x') = \left(\frac{OV}{OB}, \frac{OU}{OA}\right).$$
Note the following properties from Fig. 3.7.
- A boost can be thought of as a rotation through an imaginary angle in the
(x, T)-plane, where T is imaginary time. We have seen that this is equivalent,
in the real (x, ct)-plane, to a skewing of the coordinate axes inwards
through the same angle. (This was not appreciated by some past opponents
of special relativity, who gave some erroneous counter-arguments
based on the mistaken idea that a boost could be represented
by a real rotation in the (x, ct)-plane.)
- The hyperbolae are the same for all frames and so we can draw in any
number of frames in the same diagram and use the hyperbolae to calibrate
them.
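The calibration by hyperbolae can be made concrete by computing where the unit points A and B of S' sit in the diagram of S. This is a sketch only: it assumes the inverse boost $x = \beta(x' + vt')$, $ct = \beta(ct' + (v/c)x')$, and the function name is hypothetical.

```python
import math

def unit_points_of_S_prime(v, c=1.0):
    """S-coordinates (ct, x) of A (x' = 1, ct' = 0) and B (ct' = 1, x' = 0)."""
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A = (beta * v / c, beta)   # lies on the x'-axis, slope ct/x = v/c
    B = (beta, beta * v / c)   # lies on the ct'-axis, slope ct/x = c/v
    return A, B

A, B = unit_points_of_S_prime(0.6)
print(A, A[1] ** 2 - A[0] ** 2)   # A lies on x^2 - c^2 t^2 = +1
print(B, B[1] ** 2 - B[0] ** 2)   # B lies on x^2 - c^2 t^2 = -1
print(math.hypot(*A))             # Euclidean length of OA on the page exceeds 1
```

The last line shows why the diagram is misleading if read with a ruler: the unit distance on Ox' is drawn longer than the unit distance on Ox.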
Fig. 3.6 The world-lines in S of the fixed points and simultaneity lines of S'.
Fig. 3.7 Length scales in S and S' (with a light ray marked).
Fig. 3.8 Hyperbolic motions.
Fig. 3.9 The twin paradox (stages: uniform acceleration away from the Earth, uniform velocity, uniform deceleration, uniform reversal of direction).
Fig. 3.10 Simultaneity lines of $\bar{A}$ on the outward and return journeys.
This can be rewritten in the form
$$\frac{(x - x_0 + c^2/a)^2}{(c^2/a)^2} - \frac{(ct - ct_0)^2}{(c^2/a)^2} = 1,$$
which is a hyperbola in (x, ct)-space. If, in particular, we take $x_0 - c^2/a = t_0 = 0$,
then we obtain a family of hyperbolae for different values of a (Fig. 3.8). These
world-lines are known as hyperbolic motions and, as we shall see in
Chapter 23, they have significance in cosmology. It can be shown that the
radar distance between the world-lines is a constant. Moreover, consider the
regions I and II bounded by the light rays passing through O, and a system of
particles undergoing hyperbolic motions as shown in Fig. 3.8 (in some
cosmological models, the particles would be galaxies). Then, remembering
that light rays emanating from any point in the diagram do so at 45°, no
particle in region I can communicate with another particle in region II, and
vice versa. The light rays are called event horizons and act as barriers beyond
which no knowledge can ever be gained. We shall see that event horizons will
play an important role later in this book.
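A short numerical sketch shows why the light ray through O acts as a horizon for a hyperbolic motion. It assumes the special case $x_0 - c^2/a = t_0 = 0$, so that the world-line is $x^2 - c^2t^2 = (c^2/a)^2$ on the branch x > 0; the function name and sample values are illustrative.

```python
import math

def hyperbolic_x(t, a, c=1.0):
    """Position x(t) on the hyperbolic world-line with x0 - c^2/a = t0 = 0."""
    return math.sqrt((c * c / a) ** 2 + (c * t) ** 2)

a = 2.0
for t in (0.0, 1.0, 10.0, 1000.0):
    x = hyperbolic_x(t, a)
    # The gap x - ct to the light ray x = ct shrinks but never reaches zero.
    print(t, x, x - t)
```

Since x(t) always exceeds ct, a light signal travelling along x = ct from O never catches the particle, however long one waits.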
3.9 The twin paradox
This is a form of the clock paradox which has caused the most controversy,
a controversy which raged on and off for over 50 years. The paradox concerns
two twins whom we shall call A and $\bar{A}$. The twin $\bar{A}$ takes off in a spaceship for
a return trip to some distant star. The assumption is that $\bar{A}$ is uniformly
accelerated to some given velocity which is retained until the star is reached,
whereupon the motion is uniformly reversed, as shown in Fig. 3.9. According
to A, $\bar{A}$'s clock records slowly on the outward and return journeys and so, on
return, $\bar{A}$ will be younger than A. If the periods of acceleration are negligible
compared with the periods of uniform velocity, then could not $\bar{A}$ reverse the
argument and conclude that it is A who should appear to be the younger?
This is the basis of the paradox.
The resolution rests on the fact that the accelerations, however brief, have
immediate and finite effects on $\bar{A}$ but not on A who remains inertial
throughout. One striking way of seeing this effect is to draw in the simultaneity
lines of $\bar{A}$ for the periods of uniform velocity, as in Fig. 3.10. Clearly,
the period of uniform reversal has a marked effect on the simultaneity lines.
Another way of looking at it is to see the effect that the periods of acceleration
have on shortening the length of the journey as viewed by $\bar{A}$. Let us be
specific: we assume that the periods of acceleration are $T_1$, $T_2$, and $T_3$, and
that, after the period $T_1$, $\bar{A}$ has attained a speed $v = \sqrt{3}c/2$. Then, from $\bar{A}$'s
viewpoint, during the period $T_1$, $\bar{A}$ finds that more than half the outward
journey has been accomplished, in that $\bar{A}$ has transferred to a frame in which
the distance between the Earth and the star is more than halved by length
contraction. Thus, $\bar{A}$ accomplishes the outward trip in about half the time
which A ascribes to it, and the same applies to the return trip. In fact, we
could use the machinery of previous sections to calculate the time elapsed in
both the periods of uniform acceleration and uniform velocity, and we would
again reach the conclusion that on return $\bar{A}$ will be younger than A. As we
have said before, this points out the fact that in special relativity time is a
route-dependent quantity. The fact that in Fig. 3.9 $\bar{A}$'s world-line is longer
than A's, and yet takes less time to travel, is connected with the Minkowskian
metric
$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2$$
and the negative signs which appear in it compared with the positive signs
occurring in the usual three-dimensional Euclidean metric.
3.10 The Doppler effect
All kinds of waves appear lengthened when the source recedes from the
observer: sounds are deepened, light is reddened. Exactly the opposite occurs
when the source, instead, approaches the observer. We first of all calculate the
classical Doppler effect.
Consider a source of light emitting radiation whose wavelength in its rest
frame is $\lambda_0$. Consider an observer S relative to whose frame the source is in
motion with radial velocity $u_r$. Then, if two successive pulses are emitted at
times differing by dt' as measured by S', the distance these pulses have to travel
will differ by an amount $u_r\,dt'$ (see Fig. 3.11). Since the pulses travel with speed
c, it follows that they arrive at S with a time difference
$$\Delta t = dt' + u_r\,dt'/c,$$
giving
$$\Delta t/dt' = 1 + u_r/c.$$
Now, using the fundamental relationship between wavelength and velocity,
set
$$\lambda = c\,\Delta t \quad\text{and}\quad \lambda_0 = c\,dt'.$$
We then obtain the classical Doppler formula
$$\frac{\lambda}{\lambda_0} = 1 + \frac{u_r}{c}.$$
Fig. 3.11 The Doppler effect: (a) first pulse; (b) second pulse.
Let us now consider the special relativistic formula. Because of time
dilation (see Fig. 3.3), the time interval between successive pulses according
to S is $\beta\,dt'$ (Fig. 3.12). Hence, by the same argument, the pulses arrive at S
with a time difference
$$\Delta t = \beta\,dt' + u_r\beta\,dt'/c,$$
and so this time we find that the special relativistic Doppler formula is
$$\frac{\lambda}{\lambda_0} = \frac{1 + u_r/c}{(1 - v^2/c^2)^{1/2}}. \qquad (3.26)$$
Fig. 3.12 The special relativistic Doppler shift.
If the velocity of the source is purely radial, then $u_r = v$ and (3.26) reduces to
$$\frac{\lambda}{\lambda_0} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2}.$$
This is the radial Doppler shift, and, if we set c = 1, we obtain (2.4), which is
the formula for the k-factor. Combining Figs. 2.7 and 3.12, the radial Doppler
shift is illustrated in Fig. 3.13, where dt' is replaced by T. From equation
(3.26), we see that there is also a change in wavelength, even when the
radial velocity of the source is zero. For example, if the source is moving in a
circle about the origin of S with speed v (as measured by an instantaneous
co-moving frame), then the transverse Doppler shift is given by
$$\frac{\lambda}{\lambda_0} = \frac{1}{(1 - v^2/c^2)^{1/2}}.$$
Fig. 3.13 The radial Doppler shift k.
This is a purely relativistic effect due to the time dilation of the moving
source. Experiments with revolving apparatus using the so-called Mössbauer
effect have directly confirmed the transverse Doppler shift in full agreement
with the relativistic formula, thus providing another striking verification of
the phenomenon of time dilation.
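The three Doppler cases can be compared side by side in a few lines. This is a sketch of the formulas as reconstructed here, with $u_r$ the radial velocity and v the speed of the source; the function names are illustrative.

```python
import math

def classical_doppler(u_r, c=1.0):
    """Classical ratio lambda / lambda_0 for radial velocity u_r."""
    return 1.0 + u_r / c

def relativistic_doppler(u_r, v, c=1.0):
    """Special relativistic ratio lambda / lambda_0, assumed form of (3.26)."""
    return (1.0 + u_r / c) / math.sqrt(1.0 - (v / c) ** 2)

v = 0.6
print(classical_doppler(v))              # classical, receding source
print(relativistic_doppler(v, v))        # radial shift, equals the k-factor
print(math.sqrt((1.0 + v) / (1.0 - v)))  # k-factor, for comparison (c = 1)
print(relativistic_doppler(0.0, v))      # transverse shift, pure time dilation
```

For v = 0.6 the classical ratio is 1.6, the relativistic radial ratio is 2 (the k-factor), and the transverse ratio is 1.25 even though the radial velocity vanishes.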
Exercises
3.1 (§3.1) S and S' are in standard configuration with $v = \alpha c$
($0 < \alpha < 1$). If a rod at rest in S' makes an angle of 45° with
Ox in S and 30° with O'x' in S', then find $\alpha$.
3.2 (§3.1) Note from the previous question that perpendicular
lines in one frame need not be perpendicular in another
frame. This shows that there is no obvious meaning to the
phrase 'two inertial frames are parallel' unless their relative
velocity is along a common axis, because the axes of either
frame need not appear rectangular in the other. Verify that
the Lorentz transformation between frames in standard
configuration with relative velocity $\mathbf{v} = (v, 0, 0)$ may be
written in vector form
$$\mathbf{r}' = \mathbf{r} + \left(\frac{\mathbf{r}\cdot\mathbf{v}}{v^2}(\beta - 1) - \beta t\right)\mathbf{v}, \qquad t' = \beta\left(t - \frac{\mathbf{r}\cdot\mathbf{v}}{c^2}\right),$$
where $\mathbf{r} = (x, y, z)$. The formulae are said to comprise the
'Lorentz transformation without relative rotation'. Justify
this name by showing that the formulae remain valid when
the frames are not in standard configuration, but are parallel
in the sense that the same rotation must be applied to each
frame to bring the two into standard configuration (in which
case $\mathbf{v}$ is the velocity of S' relative to S, but $\mathbf{v} = (v, 0, 0)$ no
longer applies).
3.3 (§3.1) Prove that the first two equations of the special
Lorentz transformation can be written in the form
$$ct' = -x\sinh\phi + ct\cosh\phi, \qquad x' = x\cosh\phi - ct\sinh\phi,$$
where the rapidity $\phi$ is defined by $\phi = \tanh^{-1}(v/c)$.
Establish also the following version of these equations:
$$ct' + x' = e^{-\phi}(ct + x),$$
$$ct' - x' = e^{\phi}(ct - x),$$
$$e^{2\phi} = (1 + v/c)/(1 - v/c).$$
What relation does $\phi$ have to $\theta$ in equation (3.11)?
If, in particular, the mass is a constant, then
$$\mathbf{F} = m\frac{d\mathbf{v}}{dt} = m\mathbf{a}, \qquad (4.2)$$
where a is the acceleration.
Now, strictly speaking, in Newtonian theory, all observable quantities
should be defined in terms of their measurement. We have seen how an
observer equipped with a frame of reference, ruler, and clock can map the
events of the universe, and hence measure such quantities as position,
velocity, and acceleration. However, Newton's laws introduce the new con-
cepts of force and mass, and so we should give a prescription for their
measurement. Unfortunately, any experiment designed to measure these
quantities involves Newton's laws themselves in its interpretation. Thus,
Newtonian mechanics has the rather unexpected property that the opera-
tional definitions of force and mass which are required to make the laws
physically significant are actually contained in the laws themselves.
To make this more precise, let us discuss how we might use the laws to
measure the mass of a body. We consider two bodies isolated from all other
influences other than the force acting on one due to the influence of the other
and vice versa (Fig. 4.1). Since the masses are assumed to be constant, we
have, by Newton's second law in the form (4.2),
$$\mathbf{F}_1 = m_1\mathbf{a}_1 \quad\text{and}\quad \mathbf{F}_2 = m_2\mathbf{a}_2.$$
In addition, by Newton's third law, $\mathbf{F}_1 = -\mathbf{F}_2$. Hence, we have
$$\frac{m_1}{m_2} = \frac{a_2}{a_1}, \qquad (4.3)$$
where $a_1$ and $a_2$ denote the magnitudes of the two accelerations.
Therefore, if we take one standard body and define it to have unit mass, then
we can find the mass of the other body, by using (4.3). We can keep doing this
with any other body and in this way we can calibrate masses. In fact, this
method is commonly used for comparing the masses of elementary particles.
Of course, in practice, we cannot remove all other influences, but it may be
possible to keep them almost constant and so neglect them.
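The calibration procedure amounts to a one-line computation. The sketch below assumes the relation $m_1/m_2 = a_2/a_1$ between masses and acceleration magnitudes obtained from the second and third laws; the function name and numbers are hypothetical.

```python
def mass_from_accelerations(standard_mass, a_standard, a_other):
    """Mass of a body from the magnitudes of mutually induced accelerations.

    Uses m_standard * |a_standard| = m_other * |a_other|.
    """
    if a_other == 0:
        raise ValueError("the other body must have a non-zero acceleration")
    return standard_mass * abs(a_standard) / abs(a_other)

# The unit-mass standard body accelerates three times as hard as the other body,
# so the other body has three times the mass.
print(mass_from_accelerations(1.0, 6.0, -2.0))
```

Only the ratio of the accelerations enters, which is why the method calibrates masses against a chosen standard rather than fixing them absolutely.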
We have described how to use Newton's laws to measure mass. How do we
measure force? One approach is simply to use Newton's second law, work
out ma for a body and then read off from the law the force acting on it. This is
consistent, although rather circular, especially since a force has independent
properties of its own. For example, Newton has provided us with a way for
working out the force in the case of gravitation in his universal law of
gravitation (UG).
Fig. 4.1 Measuring mass by mutually induced accelerations.
Fig. 4.2 Newton's universal law of gravitation.
If we denote the constant of proportionality by G (with value $6.67 \times 10^{-11}$ in
m.k.s. units), the so-called Newtonian constant, then the law is (see Fig. 4.2)
$$\mathbf{F} = -G\frac{m_1 m_2}{r^2}\hat{\mathbf{r}}, \tag{4.4}$$
where a hat denotes a unit vector. There are other force laws which can be
stated separately. Again, another independent property which holds for
certain forces is contained in Newton's third law. The standard approach to
defining force is to consider it as being fundamental, in which case force laws
can be stated separately or they can be worked out from other considerations.
We postpone a more detailed critique of Newton's laws until Part C of the
book.
Special relativity is concerned with the behaviour of material bodies and
light rays in the absence of gravitation. So we shall also postpone a detailed
consideration of gravitation until we discuss general relativity in Part C of the
book. However, since we have stated Newton's universal law of gravitation
in (4.4), we should, for completeness, include a statement of Newtonian
gravitation for a distribution of matter. A distribution of matter of mass
density $\rho = \rho(x, y, z, t)$ gives rise to a gravitational potential $\phi$ which satisfies
Poisson's equation
$$\nabla^2\phi = 4\pi G\rho \tag{4.5}$$
at points inside the distribution, where the Laplacian operator $\nabla^2$ is given in
Cartesian coordinates by
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$
At points external to the distribution, this reduces to Laplace's equation
$$\nabla^2\phi = 0. \tag{4.6}$$
We assume that the reader is familiar with this background to Newtonian
theory.
4.2 Isolated systems of particles in
Newtonian mechanics
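In this section, we shall, for completeness, derive the conservation of linear
momentum in Newtonian mechanics for a system of $n$ particles. Let the $i$th
particle have constant mass $m_i$ and position vector $\mathbf{r}_i$ relative to some
arbitrary origin. Then the $i$th particle possesses linear momentum $\mathbf{p}_i$ defined
by $\mathbf{p}_i = m_i\dot{\mathbf{r}}_i$, where the dot denotes differentiation with respect to time $t$. If $\mathbf{F}_i$
is the total force on $m_i$, then, by Newton's second law, we have
$$\mathbf{F}_i = \dot{\mathbf{p}}_i = m_i\ddot{\mathbf{r}}_i. \tag{4.7}$$
The total force $\mathbf{F}_i$ on the $i$th particle can be divided into an external force $\mathbf{F}_i^{\text{ext}}$
due to any external fields present and the resultant of the internal forces.
We write
$$\mathbf{F}_i = \mathbf{F}_i^{\text{ext}} + \sum_{j=1}^{n}\mathbf{F}_{ij},$$
where $\mathbf{F}_{ij}$ is the force on the $i$th particle due to the $j$th particle and where, for
convenience, we define $\mathbf{F}_{ii} = 0$. If we sum over $i$ in (4.7), we find
$$\frac{d}{dt}\sum_{i=1}^{n}\mathbf{p}_i = \sum_{i=1}^{n}\dot{\mathbf{p}}_i = \sum_{i=1}^{n}\mathbf{F}_i^{\text{ext}} + \sum_{i=1}^{n}\sum_{j=1}^{n}\mathbf{F}_{ij}.$$
Using Newton's third law, namely, $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$, then the last term is zero and
we obtain $\dot{\mathbf{P}} = \mathbf{F}^{\text{ext}}$, where $\mathbf{P} = \sum_{i=1}^{n}\mathbf{p}_i$ is termed the total linear momentum
of the system and $\mathbf{F}^{\text{ext}} = \sum_{i=1}^{n}\mathbf{F}_i^{\text{ext}}$ is the total external force on the system.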
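If, in particular, the system of particles is isolated, then
$$\mathbf{F}^{\text{ext}} = 0 \;\Rightarrow\; \mathbf{P} = \mathbf{c},$$
where $\mathbf{c}$ is a constant vector. This leads to the law of the conservation of
linear momentum of the system, namely,
$$\mathbf{P}_{\text{initial}} = \mathbf{P}_{\text{final}}. \tag{4.8}$$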
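The argument above can be checked numerically. The following sketch, which is only an illustration under stated assumptions, evolves three particles in one dimension with a hypothetical spring-like internal force $F_{ij} = -k(x_i - x_j)$ (so that $F_{ij} = -F_{ji}$) and no external force, and compares the total momentum before and after. The force law, the step size and the initial data are all invented for the example.

```python
def total_momentum(masses, velocities):
    """P = sum_i m_i v_i for one-dimensional motion."""
    return sum(m * v for m, v in zip(masses, velocities))

def internal_forces(positions, k=1.0):
    """Net internal force on each particle from pairwise forces F_ij = -k (x_i - x_j)."""
    n = len(positions)
    forces = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                # F_ij = -F_ji, as Newton's third law requires
                forces[i] += -k * (positions[i] - positions[j])
    return forces

def evolve(masses, positions, velocities, dt, steps):
    """Advance an isolated system (no external force) with simple Euler steps."""
    x, v = list(positions), list(velocities)
    for _ in range(steps):
        f = internal_forces(x)
        v = [vi + fi / mi * dt for vi, fi, mi in zip(v, f, masses)]
        x = [xi + vi * dt for xi, vi in zip(x, v)]
    return x, v

masses = [1.0, 2.0, 3.0]
x0, v0 = [0.0, 1.0, 3.0], [0.5, -0.2, 0.1]
p_initial = total_momentum(masses, v0)
_, v1 = evolve(masses, x0, v0, dt=0.01, steps=1000)
p_final = total_momentum(masses, v1)
```

The individual velocities change at every step, but `p_final` agrees with `p_initial` up to rounding error, because the internal forces cancel in pairs.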
4.3 Relativistic mass
The transition from Newtonian to relativistic mechanics is not, in fact,
completely straightforward, because it involves at some point or another
the introduction of ad hoc assumptions about the behaviour of particles in
relativistic situations. We shall adopt the approach of trying to keep as close
to the non-relativistic definition of energy and momentum as we can. This
leads to results which in the end must be confronted with experiment. The
ultimate justification of the formulae we shall derive resides in the fact that
they have been repeatedly confirmed in numerous laboratory experiments in
particle physics. We shall only derive them in a simple case and state that the
arguments can be extended to a more general situation.
It would seem plausible that, since length and time measurements are
dependent on the observer, then mass should also be an observer-dependent
quantity. We thus assume that a particle which is moving with a velocity u
relative to an inertial observer has a mass, which we shall term its relativistic
mass, which is some function of u, that is,
$$m = m(u), \tag{4.9}$$
where the problem is to find the explicit dependence of m on u. We restrict
attention to motion along a straight line and consider the special case of two
equal particles colliding inelastically (in which case they stick together), and
look at the collision from the point of view of two inertial observers S and S'
(see Fig. 4.3). Let one of the particles be at rest in the frame S and the other
possess a velocity u before they collide. We then assume that they coalesce
and that the combined object moves with velocity U. The masses of the two
particles are respectively m(0) and m(u) by (4.9). We denote m(0) by $m_0$ and
term it the rest mass of the particle. In addition, we denote the mass of the
combined object by M(U). If we take S' to be the centre-of-mass frame, then
it should be clear that, relative to S', the two equal particles collide with equal
and opposite speeds, leaving the combined object with mass $M_0$ at rest. It
follows that S' must have velocity U relative to S.
(4.14) suggest that we regard the energy E of a particle as given by
$$E = mc^2. \tag{4.17}$$
This is one of the most famous equations in physics. However, it is not just a
mathematical relationship between two different quantities, namely energy
and mass, but rather states that energy and mass are equivalent concepts.
Because of the arbitrariness in the actual value of E, a better way of stating
the relationship is to say that a change in energy is equal to a change in
relativistic mass, namely,
$$\Delta E = \Delta m\,c^2.$$
Using conventional units, $c^2$ is a large number and indicates that a small
change in mass is equivalent to an enormous change in energy. As is well
known, this relationship and the deep implications it carries with it for peace
and war, have been amply verified. For obvious reasons, the term $m_0c^2$ is
termed the rest energy of the particle. Finally, we point out that conservation
of linear momentum, using relativistic mass, leads to the usual conservation
law in the Newtonian approximation. For example (exercise), the collision
problem considered above leads to the usual conservation of linear
momentum equation for slow-moving particles:
mobi + Hiod, = moda + Moda. (4.18)
Extending these ideas to three spatial dimensions, then a particle moving
with velocity u relative to an inertial frame S has relativistic mass m, energy E,
and linear momentum p given by
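$$m = \gamma m_0 = m_0(1 - u^2/c^2)^{-1/2}, \qquad E = mc^2, \qquad \mathbf{p} = m\mathbf{u}. \tag{4.19}$$
Some straightforward algebra (exercise) reveals that
$$(E/c)^2 - p_x^2 - p_y^2 - p_z^2 = (m_0c)^2, \tag{4.20}$$
where $m_0c$ is an invariant, since it is the same for all inertial observers. If we
compare this with the invariant (3.13), i.e.
$$(ct)^2 - x^2 - y^2 - z^2 = s^2,$$
then it suggests that the quantities $(E/c, p_x, p_y, p_z)$ transform under a Lorentz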
transformation in the same way as the quantities (ct, x, y, 2). We shall see in
Part C that the language of tensors provides a better framework for dis-
cussing transformation laws. For the moment, we shall assume that energy
and momentum transform in an identical manner and quote the results.
Thus, in a frame S' moving in standard configuration with velocity v relative
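to S, the transformation equations are (see (3.12))
$$E' = \beta(E - vp_x), \qquad p_x' = \beta(p_x - vE/c^2), \qquad p_y' = p_y, \qquad p_z' = p_z. \tag{4.21}$$
The inverse transformations are obtained in the usual way, namely, by
interchanging primes and unprimes and replacing $v$ by $-v$, which gives
$$E = \beta(E' + vp_x'), \qquad p_x = \beta(p_x' + vE'/c^2), \qquad p_y = p_y', \qquad p_z = p_z'. \tag{4.22}$$
If, in particular, we take S' to be the instantaneous rest frame of the
particle, then $\mathbf{p}' = 0$ and $E' = E_0 = m_0c^2$. Substituting in (4.22), we find
$$E = \beta E' = \frac{m_0c^2}{(1 - v^2/c^2)^{1/2}} = mc^2,$$
where $m = m_0(1 - v^2/c^2)^{-1/2}$ and $\mathbf{p} = (\beta vE'/c^2, 0, 0) = (mv, 0, 0) = m\mathbf{v}$, which
are precisely the values of the energy, mass, and momentum arrived at in
(4.19) with $u$ replaced by $v$.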
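A short numerical check may help here. The sketch below assumes motion along the x-axis only, takes the energy and momentum as $E = mc^2$, $p_x = mu$ with $m = m_0(1 - u^2/c^2)^{-1/2}$, and takes the transformation in the form $E' = \beta(E - vp_x)$, $p_x' = \beta(p_x - vE/c^2)$. It then confirms that $(E/c)^2 - p_x^2$ equals $(m_0c)^2$ and that boosting to the particle's rest frame gives $p_x' = 0$, $E' = m_0c^2$. The function names and sample values are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_factor(speed):
    """(1 - speed^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def energy_momentum(rest_mass, u):
    """(E, p_x) for motion along x with velocity u: m = gamma*m0, E = m c^2, p = m u."""
    m = lorentz_factor(u) * rest_mass
    return m * C ** 2, m * u

def boost(E, px, v):
    """Energy and x-momentum in a frame moving with velocity v along x."""
    beta = lorentz_factor(v)
    return beta * (E - v * px), beta * (px - v * E / C ** 2)

def invariant(E, px):
    """(E/c)^2 - p_x^2, which should equal (m0 c)^2 in every inertial frame."""
    return (E / C) ** 2 - px ** 2

m0, u = 1.0, 0.6 * C                  # sample rest mass (kg) and velocity
E, px = energy_momentum(m0, u)
E_rest, px_rest = boost(E, px, u)     # boost into the particle's rest frame
E_other, px_other = boost(E, px, -0.3 * C)  # and into some other frame
```

The invariant takes the same value for `(E, px)`, `(E_rest, px_rest)` and `(E_other, px_other)`, which is the content of (4.20).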
4.5 Photons
At the end of the last century, there was considerable conflict between theory
and experiment in the investigation of radiation in enclosed volumes. In an
attempt to resolve the difficulties, Max Planck proposed that light and other
electromagnetic radiation consisted of individual 'packets' of energy, which
he called quanta. He suggested that the energy E of each quantum was to
depend on its frequency $\nu$, and proposed the simple law, called Planck's
hypothesis,
$$E = h\nu, \tag{4.23}$$
where h is a universal constant known now as Planck's constant. The idea of
the quantum was developed further by Einstein, especially in attempting to
explain the photoelectric effect. The effect is to do with the ejection of
electrons from a metal surface by incident light (especially ultraviolet) and is
strongly in support of Planck's quantum hypothesis. Nowadays, the quan-
tum theory is well established and applications of it to explain properties of
molecules, atoms, and fundamental particles are at the heart of modern
physics. Theories of light now give it a dual wave-particle nature. Some
properties, such as diffraction and interference, are wavelike in nature, while
the photoelectric effect and other cases of the interaction of light and atoms
are best described on a particle basis.
The particle description of light consists in treating it as a stream of quanta
called photons. Using equation (4.19) and substituting in the speed of light,
u=c, we find
$$m_0 = \gamma^{-1}m = (1 - u^2/c^2)^{1/2}m = 0, \tag{4.24}$$
that is, the rest mass of a photon must be zero! This is not so bizarre as it first
seems, since no inertial observer ever sees a photon at rest — its speed is
always c — and so the rest mass of a photon is merely a notional quantity. If
we let $\hat{\mathbf{n}}$ be a unit vector denoting the direction of travel of the photon, then
$$\mathbf{p} = (p_x, p_y, p_z) = p\hat{\mathbf{n}},$$
and equation (4.20) becomes
$$(E/c)^2 - p^2 = 0.$$
Taking square roots (and remembering c and p are positive), we find that the
energy E of a photon is related to the magnitude p of its momentum by
$$E = pc. \tag{4.25}$$
Finally, using the energy-mass relationship $E = mc^2$, we find that the
relativistic mass of a photon is non-zero and is given by
$$m = p/c. \tag{4.26}$$
Combining these results with Planck's hypothesis, we obtain the following
formulae for the energy E, relativistic mass m, and linear momentum $\mathbf{p}$ of the
photon:
$$E = h\nu, \qquad m = h\nu/c^2, \qquad \mathbf{p} = (h\nu/c)\hat{\mathbf{n}}. \tag{4.27}$$
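To give a feel for the magnitudes involved, the sketch below evaluates the energy $E = h\nu$, relativistic mass $E/c^2$ and momentum magnitude $E/c$ of a single photon. The sample frequency is a hypothetical value in the visible range, and the helper name `photon` is invented for the example.

```python
H = 6.62607015e-34   # Planck's constant in J s
C = 299_792_458.0    # speed of light in m/s

def photon(nu):
    """Energy, relativistic mass and momentum magnitude of a photon of frequency nu."""
    E = H * nu
    return E, E / C ** 2, E / C

E, m, p = photon(5.0e14)   # hypothetical visible-light frequency in Hz
```

The three numbers are tied together by $E = pc$ and $E = mc^2$, so any one of them fixes the other two.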
It is gratifying to discover that special relativity, which was born to reconcile
conflicts in the kinematical properties of light and matter, also includes their
mechanical properties in a single all-inclusive system.
We finish this section with an argument which shows that Planck's
hypothesis can be derived directly within the framework of special relativity.
We have already seen in the last chapter that the radial Doppler effect for a
moving source is given by (3.27), namely
$$\frac{\lambda}{\lambda_0} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2},$$
where $\lambda_0$ is the wavelength in the frame of the source and $\lambda$ is the wavelength
in the frame of the observer. We write this result, instead, in terms of
frequency, using the fundamental relationships $c = \lambda\nu$ and $c = \lambda_0\nu_0$, to
obtain
$$\frac{\nu_0}{\nu} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2}. \tag{4.28}$$
Now, suppose that the source emits a light flash of total energy $E_0$. Let us use
the equations (4.22) to find the energy received in the frame of the observer S.
Since, recalling Fig. 3.11, the light flash is travelling along the negative x-
direction of both frames, the relationship (4.25) leads to the result
$p_x' = -E_0/c$, with the other primed components of momentum zero.
Substituting in the first equation of (4.22), namely,
$$E = \beta(E' + vp_x'),$$
we get
$$E = \beta(E_0 - vE_0/c) = \frac{E_0(1 - v/c)}{(1 - v^2/c^2)^{1/2}} = E_0\left(\frac{1 - v/c}{1 + v/c}\right)^{1/2},$$
or
$$\frac{E_0}{E} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2}.$$
Combining this with equation (4.28), we obtain
$$\frac{E_0}{\nu_0} = \frac{E}{\nu}.$$
Since this relationship holds for any pair of inertial observers, it follows that
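the ratio $E/\nu$ takes the same value for all inertial observers, in agreement with Planck's hypothesis (4.23).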
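The argument can be checked numerically. The sketch below, an illustration only, assumes a source receding at speed $v$, with the observed frequency given by $\nu = \nu_0[(1 - v/c)/(1 + v/c)]^{1/2}$ and the received energy by $E = \beta(E_0 - vE_0/c)$ with $\beta = (1 - v^2/c^2)^{-1/2}$. It confirms that $E/\nu$ equals $E_0/\nu_0$ for several speeds. The function names and sample values are hypothetical.

```python
import math

def doppler_frequency(nu0, v_over_c):
    """Observed frequency for a receding source: nu = nu0*sqrt((1 - v/c)/(1 + v/c))."""
    return nu0 * math.sqrt((1 - v_over_c) / (1 + v_over_c))

def received_energy(E0, v_over_c):
    """E = beta*(E' + v p'_x) with E' = E0 and p'_x = -E0/c (flash sent along negative x)."""
    beta = 1 / math.sqrt(1 - v_over_c ** 2)
    return beta * (E0 - v_over_c * E0)

E0, nu0 = 2.0, 5.0e14          # sample source-frame energy and frequency
ratios = []
for v_over_c in (0.1, 0.5, 0.9):
    E = received_energy(E0, v_over_c)
    nu = doppler_frequency(nu0, v_over_c)
    ratios.append(E / nu)      # should equal E0/nu0 for every speed
```

Energy and frequency are each reduced by the same Doppler factor, so their ratio does not depend on the observer.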
5.1 Introduction
To work effectively in Newtonian theory, one really needs the language of
vectors. This language, first of all, is more succinct, since it summarizes a set
of three equations in one. Moreover, the formalism of vectors helps to solve
certain problems more readily, and, most important of all, the language
reveals structure and thereby offers insight. In exactly the same way, in
relativity theory, one needs the language of tensors. Again, the language helps
to summarize sets of equations succinctly and to solve problems more readily,
and it reveals structure in the equations. This part of the book is devoted to
learning the formalism of tensors which is a pre-condition for the rest of the
book.
The approach we adopt is to concentrate on the technique of tensors
without taking into account the deeper geometrical significance behind the
theory. We shall be concerned more with what you do with tensors rather
than what tensors actually are. There are two distinct approaches to the
teaching of tensors: the abstract or index-free (coordinate-free) approach and
the conventional approach based on indices. There has been a move in recent
years in some quarters to introduce tensors from the start using the more
modern abstract approach (although some have subsequently changed their
mind and reverted to the conventional approach). The main advantage of this
approach is that it offers deeper geometrical insight. However, it has two
disadvantages. First of all, it requires much more of a mathematical back-
ground, which in turn takes time to develop. The other disadvantage is that,
for all its elegance, when one wants to do a real calculation with tensors, as
one frequently needs to, then recourse has to be made to indices. We shall
adopt the more conventional index approach, because it will prove faster and
more practical. However, we advise those who wish to take their study of the
subject further to look at the index-free approach at the first opportunity.
We repeat that the exercises are seen as integral to this part of the book and
should not be omitted.
5.2 Manifolds and coordinates
We shall start by working with tensors defined in n dimensions since, and it is
part of the power of the formalism, there is little extra effort involved. A
tensor is an object defined on a geometric entity called a (differential)
manifold. We shall not define a manifold precisely because it would involve
us in too much of a digression. But, in simple terms, a manifold is something
which 'locally' looks like a bit of n-dimensional Euclidean space $\mathbb{R}^n$. For
example, compare a 2-sphere $S^2$ with the Euclidean plane $\mathbb{R}^2$. They are
clearly different. But a small bit of $S^2$ looks very much like a small bit of $\mathbb{R}^2$ (if
we neglect metrical properties). The fact that $S^2$ is 'compact', i.e. in some sense
finite, whereas $\mathbb{R}^2$ 'goes off to infinity' is a global property rather than a local
property. We shall not say anything precise about global properties (the
topology of the manifold), although the issue will surface when we start
to look carefully at solutions of Einstein's equations in general relativity.
Fig. 5.1 Plane polar coordinate curves.
Fig. 5.2 Two non-degenerate coordinate systems covering an $S^2$.
Fig. 5.3 Overlapping coordinate patches in a manifold.
We shall simply take an n-dimensional manifold M to be a set of points
such that each point possesses a set of n coordinates $(x^1, x^2, \ldots, x^n)$, where
each coordinate ranges over a subset of the reals, which may, in particular,
range from $-\infty$ to $+\infty$. To start off with, we can think of these coordinates
as corresponding to distances or angles in Euclidean space. The reason why
the coordinates are written as superscripts rather than subscripts will become
clear later. Now the key thing about a manifold is that it may not be possible
to cover the whole manifold by one non-degenerate coordinate system,
namely, one which ascribes a unique set of n coordinate numbers to each
point. Sometimes it is simply convenient to use coordinate systems with
degenerate points. For example, plane polar coordinates $(R, \phi)$ in the plane
have a degeneracy at the origin because $\phi$ is indeterminate there (Fig. 5.1).
However, here we could avoid the degeneracy at the origin by using
Cartesian coordinates. But in other circumstances we have no choice in the
matter. For example, it can be shown that there is no coordinate system
which covers the whole of a 2-sphere $S^2$ without degeneracy. The smallest
number needed is two, which is shown schematically in Fig. 5.2. We therefore
Fig. 5.2 labels: first non-degenerate coordinate system covering North Pole; overlap of coordinate systems at equator; second non-degenerate coordinate system covering South Pole.
Fig. 5.3 labels: manifold M; two coordinate patches; overlap of coordinate patches.
It follows from the product rule for determinants that, if we define the
Jacobian of the inverse transformation by
$$J = \left|\frac{\partial x^a}{\partial x'^b}\right|,$$
then $J = 1/J'$.
If, in three dimensions, the equation of a surface is given by $z = f(x, y)$, then
its total differential is defined to be
$$dz = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy.$$
Then, in an exactly analogous manner, starting from (5.6), we define the total
differential
$$dx'^a = \frac{\partial x'^a}{\partial x^1}dx^1 + \frac{\partial x'^a}{\partial x^2}dx^2 + \cdots + \frac{\partial x'^a}{\partial x^n}dx^n$$
for each a running from 1 to n. We can write this more economically by
introducing an explicit summation sign:
$$dx'^a = \sum_{b=1}^{n}\frac{\partial x'^a}{\partial x^b}dx^b. \tag{5.10}$$
This can be written more economically still by introducing the Einstein
summation convention: whenever a literal index is repeated, it is understood
to imply a summation over the index from 1 to n, the dimension of the
manifold. Hence, we can write (5.10) simply as
$$dx'^a = \frac{\partial x'^a}{\partial x^b}dx^b. \tag{5.11}$$
The index a occurring on each side of this equation is said to be free and may
take on separately any value from 1 to n. The index b on the right-hand side is
repeated and hence there is an implied summation from 1 to n. A repeated
index is called bound or dummy because it can be replaced by any other
index not already in use. For example,
$$\frac{\partial x'^a}{\partial x^b}dx^b = \frac{\partial x'^a}{\partial x^c}dx^c,$$
because c was not already in use in the expression. We define the Kronecker
delta $\delta^a_b$ to be a quantity which is either 0 or 1 according to
$$\delta^a_b = \begin{cases}1 & \text{if } a = b,\\ 0 & \text{if } a \neq b.\end{cases} \tag{5.12}$$
It therefore follows directly from the definition of partial differentiation
(check) that
$$\frac{\partial x'^a}{\partial x'^b} = \frac{\partial x^a}{\partial x^b} = \delta^a_b. \tag{5.13}$$
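A concrete case may make the summation convention and the Jacobian statement less abstract. The sketch below takes, as an assumed example, the change from Cartesian coordinates $(x, y)$ to plane polar coordinates $(R, \phi)$, writes out both transformation matrices, and multiplies them with an explicit sum over the repeated index. The product should be the Kronecker delta, and the two determinants should be reciprocals. The function names are hypothetical.

```python
import math

def jacobian_to_polar(x, y):
    """Matrix of partial derivatives d(R, phi)/d(x, y) at the point (x, y)."""
    R = math.hypot(x, y)
    return [[x / R, y / R], [-y / R ** 2, x / R ** 2]]

def jacobian_to_cartesian(R, phi):
    """Matrix of partial derivatives d(x, y)/d(R, phi) at the point (R, phi)."""
    return [[math.cos(phi), -R * math.sin(phi)],
            [math.sin(phi), R * math.cos(phi)]]

def matmul(A, B):
    """(AB)^a_b = A^a_c B^c_b, with the sum over the dummy index c written out."""
    n = len(A)
    return [[sum(A[a][c] * B[c][b] for c in range(n)) for b in range(n)]
            for a in range(n)]

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

x, y = 3.0, 4.0
R, phi = math.hypot(x, y), math.atan2(y, x)
J_prime = jacobian_to_polar(x, y)        # dx'^a/dx^b
J = jacobian_to_cartesian(R, phi)        # dx^a/dx'^b
product = matmul(J_prime, J)             # should be the Kronecker delta
```

At the origin $R = 0$ the first matrix is undefined, which reflects the degeneracy of polar coordinates there.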
Fig. 5.4 Infinitesimal vector $\overrightarrow{PQ}$ attached to P.
Fig. 5.5 The tangent vector at two points of a curve $x^a = x^a(u)$.
5.5 Contravariant tensors
The approach we are going to adopt is to define a geometrical quantity in
terms of its transformation properties under a coordinate transformation
(5.6). We shall start with a prototype and then give the general definition.
Consider two neighbouring points in the manifold P and Q with coordinates
$x^a$ and $x^a + dx^a$, respectively. The two points define an infinitesimal
displacement or infinitesimal vector $\overrightarrow{PQ}$. The vector is not to be regarded as
free, but as being attached to the point P (Fig. 5.4). The components of this
vector in the $x^a$-coordinate system are $dx^a$. The components in another
coordinate system, say the $x'^a$-coordinate system, are $dx'^a$, which are
connected to $dx^a$ by (5.11), namely,
$$dx'^a = \frac{\partial x'^a}{\partial x^b}dx^b. \tag{5.14}$$
The transformation matrix appearing in this equation is to be regarded as
being evaluated at the point P, i.e. strictly speaking we should write
$$dx'^a = \left[\frac{\partial x'^a}{\partial x^b}\right]_P dx^b, \tag{5.15}$$
but with this understood we shall stick to the notation of (5.14). Thus,
$[\partial x'^a/\partial x^b]_P$ consists of an $n \times n$ matrix of real numbers. The transformation
is therefore a linear homogeneous transformation. This is our prototype.
A contravariant vector or contravariant tensor of rank (order) 1 is a set of
quantities, written $X^a$ in the $x^a$-coordinate system, associated with a point P,
which transforms under a change of coordinates according to
$$X'^a = \frac{\partial x'^a}{\partial x^b}X^b, \tag{5.16}$$
where the transformation matrix is evaluated at P. The infinitesimal vector
$dx^a$ is a special case of (5.16) where the components $X^a$ are infinitesimal. An
example of a vector with finite components is provided by the tangent vector
$dx^a/du$ to the curve $x^a = x^a(u)$. It is important to distinguish between the
actual geometric object like the tangent vector in Fig. 5.5 (depicted by an
arrow) and its representation in a particular coordinate system, like the n
numbers $[dx^a/du]_P$ in the $x^a$-coordinate system and the (in general) different
numbers $[dx'^a/du]_P$ in the $x'^a$-coordinate system.
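The distinction between the tangent vector and its components can be seen numerically. In the sketch below, an assumed example curve $x^a(u) = (u, u^2)$ is given in Cartesian coordinates, its tangent components are transformed to plane polar coordinates with the contravariant law $X'^a = (\partial x'^a/\partial x^b)X^b$, and the result is compared with a direct finite-difference derivative of the polar coordinates along the curve. The curve, the point and the helper names are all hypothetical.

```python
import math

def curve(u):
    return (u, u * u)             # Cartesian coordinates x^a(u)

def tangent(u):
    return (1.0, 2.0 * u)         # components dx^a/du in Cartesian coordinates

def to_polar(x, y):
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian_to_polar(x, y):
    """Matrix of partial derivatives d(R, phi)/d(x, y)."""
    R = math.hypot(x, y)
    return [[x / R, y / R], [-y / R ** 2, x / R ** 2]]

def transform_contravariant(J, X):
    """X'^a = (dx'^a/dx^b) X^b, with the sum over b written out."""
    return [sum(J[a][b] * X[b] for b in range(len(X))) for a in range(len(J))]

u, h = 1.5, 1e-6
x, y = curve(u)
X_polar = transform_contravariant(jacobian_to_polar(x, y), tangent(u))

# Independent check: differentiate the polar coordinates of the curve directly.
p_plus, p_minus = to_polar(*curve(u + h)), to_polar(*curve(u - h))
X_polar_fd = [(p_plus[a] - p_minus[a]) / (2 * h) for a in range(2)]
```

The two sets of polar components agree, although they differ from the Cartesian components `(1.0, 3.0)` of the same arrow.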
We now generalize the definition (5.16) to obtain contravariant tensors of
higher rank or order. Thus, a contravariant tensor of rank 2 is a set of $n^2$
quantities associated with a point P, denoted by $X^{ab}$ in the $x^a$-coordinate
system, which transform according to
$$X'^{ab} = \frac{\partial x'^a}{\partial x^c}\frac{\partial x'^b}{\partial x^d}X^{cd}. \tag{5.17}$$
The quantities $X'^{ab}$ are the components in the $x'^a$-coordinate system, the
transformation matrices are evaluated at P, and the law involves two dummy
indices c and d. An example of such a quantity is provided by the product
$Y^aZ^b$, say, of two contravariant vectors $Y^a$ and $Z^b$. The definition of third-
and higher-order contravariant tensors proceeds in an analogous manner. An
important case is a tensor of zero rank, called a scalar or scalar invariant $\phi$,
which transforms according to
$$\phi' = \phi \tag{5.18}$$
at P.
5.6 Covariant and mixed tensors
As in the last section, we begin by considering the transformation of a
prototype quantity. Let
$$\phi = \phi(x^a) \tag{5.19}$$
be a real-valued function on the manifold, i.e. at every point P in the
manifold, $\phi(P)$ produces a real number. We also assume that $\phi$ is continuous
and differentiable, so that we can obtain the differential coefficients $\partial\phi/\partial x^a$.
Now, remembering from equation (5.9) that $x^a$ can be thought of as a
function of $x'^b$, equation (5.19) can be written equivalently as
$$\phi = \phi(x^a(x')).$$
Differentiating this with respect to $x'^b$, using the function of a function rule,
we obtain
$$\frac{\partial\phi}{\partial x'^b} = \frac{\partial\phi}{\partial x^a}\frac{\partial x^a}{\partial x'^b}.$$
Then changing the order of the terms, the dummy index, and the free index
(from b to a) gives
$$\frac{\partial\phi}{\partial x'^a} = \frac{\partial x^b}{\partial x'^a}\frac{\partial\phi}{\partial x^b}. \tag{5.20}$$
This is the prototype equation we are looking for. Notice that it involves the
inverse transformation matrix $\partial x^b/\partial x'^a$. Thus, a covariant vector or covariant
tensor of rank (order) 1 is a set of quantities, written $X_a$ in the $x^a$-coordinate
system, associated with a point P, which transforms according to
$$X'_a = \frac{\partial x^b}{\partial x'^a}X_b. \tag{5.21}$$
Again, the transformation matrix occurring is assumed to be evaluated at P.
Similarly, we define a covariant tensor of rank 2 by the transformation law
$$X'_{ab} = \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}X_{cd}, \tag{5.22}$$
and so on for higher-rank tensors. Note the convention that contravariant
tensors have raised indices whereas covariant tensors have lowered indices.
(The way to remember this is that co goes below.) The fact that the
differentials $dx^a$ transform as a contravariant vector explains the convention
that the coordinates themselves are written as $x^a$ rather than $x_a$, although
(A way to remember the above expression is to note that the positive terms
are obtained by cycling the indices to the right and the corresponding
negative terms by flipping the last two indices.) A totally symmetric tensor is
defined to be one equal to its symmetric part, and a totally anti-symmetric
tensor is one equal to its anti-symmetric part.
We can multiply two tensors of type $(p_1, q_1)$ and $(p_2, q_2)$ together and
obtain a tensor of type $(p_1 + p_2, q_1 + q_2)$, e.g.
$$X^a{}_{bcd} = Y^a{}_b Z_{cd}. \tag{5.30}$$
In particular, a tensor of type (p, q) when multiplied by a scalar field $\phi$ is
again a tensor of type (p, q). Given a tensor of mixed type (p, q), we can form a
tensor of type (p − 1, q − 1) by the process of contraction, which simply
involves setting a raised and lowered index equal. For example,
$$X^a{}_{bcd} \;\xrightarrow{\text{contraction on } a \text{ and } b}\; X^a{}_{acd} = Y_{cd},$$
i.e. a tensor of type (1, 3) has become a tensor of type (0, 2). Notice that we can
contract a tensor by multiplying by the Kronecker tensor $\delta^b_a$, e.g.
$$X^a{}_{acd} = \delta^b_a X^a{}_{bcd}. \tag{5.31}$$
In effect, multiplying by $\delta^b_a$ turns the index b into a (or equivalently the index
a into b).
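In components, multiplication and contraction are just bookkeeping over indices. The sketch below stores the components at a single point as nested lists (a hypothetical layout, with `X[a][b][c][d]` standing for $X^a{}_{bcd}$), forms the product of a type (1, 1) and a type (0, 2) object, and then contracts the raised index with the first lowered index. The sample numbers are arbitrary.

```python
def outer(Y, Z, n):
    """X^a_{bcd} = Y^a_b Z_{cd}, stored as X[a][b][c][d]."""
    return [[[[Y[a][b] * Z[c][d] for d in range(n)]
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]

def contract_first_pair(X, n):
    """Y_{cd} = X^a_{acd}: sum over the repeated index a."""
    return [[sum(X[a][a][c][d] for a in range(n)) for d in range(n)]
            for c in range(n)]

n = 2
Y = [[1, 2], [3, 4]]          # sample components Y^a_b
Z = [[5, 6], [7, 8]]          # sample components Z_{cd}
X = outer(Y, Z, n)
contracted = contract_first_pair(X, n)
```

Here the contraction gives $Y^a{}_a Z_{cd}$, that is, the trace of `Y` (which is 5) times `Z`.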
5.9 Index-free interpretation of contravariant
vector fields
As we pointed out in §5.5, we must distinguish between the actual geometric
object itself and its components in a particular coordinate system. The
important point about tensors is that we want to make statements which are
independent of any particular coordinate system being used. This is
abundantly clear in the index-free approach to tensors. We shall get a feel for this
approach in this section by considering the special case of a contravariant
vector field, although similar index-free interpretations can be given for any
tensor field. The key idea is to interpret the vector field as an operator which
maps real-valued functions into real-valued functions. Thus, if X represents a
contravariant vector field, then X operates on any real-valued function f to
produce another function g, i.e. Xf = g. We shall show how actually to
compute Xf by introducing a coordinate system. However, as we shall see,
we could equally well introduce any other coordinate system, and the
computation would lead to the same result.
In the $x^a$-coordinate system, we introduce the notation
$$\partial_a = \frac{\partial}{\partial x^a},$$
and then X is defined as the operator
$$X = X^a\partial_a, \tag{5.32}$$
so that
$$Xf = (X^a\partial_a)f = X^a(\partial_a f) \tag{5.33}$$
for any real-valued function f. Let us compute X in some other $x'^a$-coordinate
system. We need to use the result (5.13) expressed in the following form: we
may take $x^a$ to be a function of $x'^b$ by (5.9) and $x'^b$ to be a function of $x^c$ by
(5.6), and so, using the function of a function rule, we find
$$\delta^a_b = \frac{\partial x^a}{\partial x^b} = \frac{\partial x^a}{\partial x'^c}\frac{\partial x'^c}{\partial x^b}. \tag{5.34}$$
Then, using the transformation law (5.16) and (5.20) together with the above
trick, we get
$$X'^a\partial'_a = X'^a\frac{\partial}{\partial x'^a}
= \frac{\partial x'^a}{\partial x^b}X^b\,\frac{\partial x^c}{\partial x'^a}\frac{\partial}{\partial x^c}
= \frac{\partial x^c}{\partial x'^a}\frac{\partial x'^a}{\partial x^b}X^b\frac{\partial}{\partial x^c}
= \delta^c_b X^b\frac{\partial}{\partial x^c}
= X^b\frac{\partial}{\partial x^b}
= X^a\frac{\partial}{\partial x^a}
= X^a\partial_a.$$
Thus the result of operating on f by X will be the same irrespective of the
coordinate system employed in (5.32).
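This coordinate independence can be tested on a simple case. The sketch below is an illustration under assumptions: it uses the hypothetical field with Cartesian components $X^a = (-y, x)$ and the function $f = x^2y$, evaluates $X^a\partial_a f$ in Cartesian coordinates, then transforms the components of $X$ to plane polar coordinates and evaluates $X'^a\partial'_a f$ there. Partial derivatives are approximated by central differences.

```python
import math

def f_cart(x, y):
    return x * x * y

def f_polar(R, phi):
    """The same function expressed in polar coordinates."""
    return f_cart(R * math.cos(phi), R * math.sin(phi))

def X_cart(x, y):
    return (-y, x)      # hypothetical example field (generator of rotations)

def partials(func, p, h=1e-6):
    """Central-difference partial derivatives of func at the point p."""
    grads = []
    for a in range(len(p)):
        up, dn = list(p), list(p)
        up[a] += h
        dn[a] -= h
        grads.append((func(*up) - func(*dn)) / (2 * h))
    return grads

def jacobian_to_polar(x, y):
    """Matrix of partial derivatives d(R, phi)/d(x, y)."""
    R = math.hypot(x, y)
    return [[x / R, y / R], [-y / R ** 2, x / R ** 2]]

x, y = 1.2, 0.7
R, phi = math.hypot(x, y), math.atan2(y, x)

X = X_cart(x, y)
Xf_cart = sum(X[a] * d for a, d in enumerate(partials(f_cart, (x, y))))

J = jacobian_to_polar(x, y)
X_prime = [sum(J[a][b] * X[b] for b in range(2)) for a in range(2)]
Xf_polar = sum(X_prime[a] * d for a, d in enumerate(partials(f_polar, (R, phi))))
```

For this field the polar components are $(0, 1)$, so in polar coordinates $Xf$ is simply $\partial f/\partial\phi$, and both routes give $x^3 - 2xy^2$ at the chosen point.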
In any coordinate system, we may think of the quantities $[\partial/\partial x^a]_P$ as
forming a basis for all the vectors at P, since any vector at P is, by (5.32), given
by
$$X_P = [X^a]_P\left[\frac{\partial}{\partial x^a}\right]_P,$$
that is, a linear combination of the $[\partial/\partial x^a]_P$. The vector space of all the
contravariant vectors at P is known as the tangent space at P and is written
$T_P(M)$ (Fig. 5.6). In general, the tangent space at any point in a manifold is
different from the underlying manifold. For this reason, we need to be careful
in representing a finite contravariant vector by an arrow in our figures since,
strictly speaking, the arrow lies in the tangent space not the manifold. Two
exceptions to this are Euclidean space and Minkowski space-time, where the
tangent space at each point coincides with the manifold.
Fig. 5.6 The tangent space at P. (Labels: contravariant vectors; tangent space $T_P(M)$; manifold M.)
Given two vector fields X and Y we can define a new vector field called the
commutator or Lie bracket of X and Y by
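$$[X, Y] = XY - YX. \tag{5.35}$$
Letting [X, Y] = Z and operating with it on some arbitrary function f gives
$$\begin{aligned}
Zf &= [X, Y]f \\
&= (XY - YX)f \\
&= X(Yf) - Y(Xf) \\
&= X(Y^a\partial_a f) - Y(X^a\partial_a f) \\
&= X^b\partial_b(Y^a\partial_a f) - Y^b\partial_b(X^a\partial_a f) \\
&= (X^b\partial_b Y^a - Y^b\partial_b X^a)\partial_a f + X^aY^b(\partial_a\partial_b f - \partial_b\partial_a f).
\end{aligned}$$
The last term vanishes since we assume commutativity of second mixed
partial derivatives, i.e.
$$\partial_a\partial_b f = \frac{\partial^2 f}{\partial x^a\partial x^b} = \frac{\partial^2 f}{\partial x^b\partial x^a} = \partial_b\partial_a f.$$
Since f is arbitrary, we obtain the result
$$[X, Y]^a = Z^a = X^b\partial_b Y^a - Y^b\partial_b X^a, \tag{5.36}$$
from which it clearly follows that the commutator of two vector fields is itself
a vector field. It also follows, directly from the definition (5.35), that
$$[X, X] = 0, \tag{5.37}$$
$$[X, Y] = -[Y, X], \tag{5.38}$$
$$[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0. \tag{5.39}$$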
Equation (5.38) shows that the Lie bracket is anti-commutative. The result
(5.39) is known as Jacobi's identity. Notice it states that the left-hand side is
not just equal to zero but is identically zero. What does this mean? The
equation $x^2 - 4 = 0$ is only satisfied by particular values of x, namely, +2
and −2. The identity $x^2 - x^2 = 0$ is satisfied for all values of x. But, you may
argue, the $x^2$ terms cancel out, and this is precisely the point. An expression is
identically zero if, when all the terms are written out fully, they all cancel in
pairs.
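The component formula $[X, Y]^a = X^b\partial_b Y^a - Y^b\partial_b X^a$ is easy to evaluate numerically. The sketch below is a minimal illustration that approximates the partial derivatives by central differences and applies the formula to three hypothetical fields in the plane: a translation, a rotation and a dilation. All names and sample fields are invented for the example.

```python
def lie_bracket(X, Y, p, h=1e-5):
    """Components [X, Y]^a = X^b d_b Y^a - Y^b d_b X^a at the point p."""
    n = len(p)

    def d(field, b):
        # central difference d_b of every component of the field
        up, dn = list(p), list(p)
        up[b] += h
        dn[b] -= h
        fu, fd = field(up), field(dn)
        return [(fu[a] - fd[a]) / (2 * h) for a in range(n)]

    Xp, Yp = X(p), Y(p)
    dX = [d(X, b) for b in range(n)]   # dX[b][a] = d_b X^a
    dY = [d(Y, b) for b in range(n)]   # dY[b][a] = d_b Y^a
    return [sum(Xp[b] * dY[b][a] - Yp[b] * dX[b][a] for b in range(n))
            for a in range(n)]

translation = lambda p: [1.0, 0.0]        # d/dx
rotation = lambda p: [-p[1], p[0]]        # -y d/dx + x d/dy
dilation = lambda p: [p[0], p[1]]         # x d/dx + y d/dy

point = [0.3, -1.1]
b_tr = lie_bracket(translation, rotation, point)   # expected (0, 1), i.e. d/dy
b_rt = lie_bracket(rotation, translation, point)   # expected (0, -1)
b_rd = lie_bracket(rotation, dilation, point)      # expected (0, 0)
```

The first two results illustrate the anti-commutativity (5.38), and the third shows two fields whose bracket vanishes.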
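the form (5.25), we see that
$$X'^a_P = \left[\frac{\partial x'^a}{\partial x^b}\right]_P X^b_P \quad\text{and}\quad X'^a_Q = \left[\frac{\partial x'^a}{\partial x^b}\right]_Q X^b_Q.$$
This involves the transformation matrix evaluated at different points, from
which it should be clear that $X^a_P - X^a_Q$ is not a tensor. Similar remarks hold
for differentiating tensors in general.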
It turns out that if we wish to differentiate a tensor in a tensorial manner
then we need to introduce some auxiliary field onto the manifold. We shall
meet three different types of differentiation. First of all, in the next section, we
shall introduce a contravariant vector field onto the manifold and use it to
define the Lie derivative. Then we shall introduce a quantity called an affine
connection and use it to define covariant differentiation. Finally, we shall
introduce a tensor called a metric and from it build a special affine
connection, called the metric connection, and again define covariant
differentiation but relative to this specific connection.
6.2 The Lie derivative
The argument we present in this section is rather intricate. It rests on the idea
of interpreting a coordinate transformation actively as a point
transformation, rather than passively as we have done up to now. The important results
occur at the end of the section and consist of the formula for the
Lie derivative of a general tensor field and the basic properties of Lie
differentiation.
We start by considering a congruence of curves defined such that only one
curve goes through each point in the manifold. Then, given any one curve of
the congruence,
$$x^a = x^a(u),$$
we can use it to define the tangent vector field $dx^a/du$ along the curve. If we do
this for every curve in the congruence, then we end up with a vector field $X^a$
(given by $dx^a/du$ at each point) defined over the whole manifold (Fig. 6.1).
Conversely, given a non-zero vector field $X^a(x)$ defined over the manifold,
then this can be used to define a congruence of curves in the manifold called
the orbits or trajectories of $X^a$. The procedure is exactly the same as the way
in which a vector field gives rise to field lines or streamlines in vector analysis.
These curves are obtained by solving the ordinary differential equations
$$\frac{dx^a}{du} = X^a(x(u)). \tag{6.2}$$
The existence and uniqueness theorem for ordinary differential equations
guarantees a solution, at least for some subset of the reals. In what follows, we
are really only interested in what happens locally (Fig. 6.2).
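As an illustration of how a vector field generates its orbits, the sketch below integrates $dx^a/du = X^a(x(u))$ with a classical fourth-order Runge-Kutta scheme. The scheme, the step count and the sample field $X^a = (-y, x)$ are choices made for this example only. For this field the orbits are circles about the origin, so starting from $(1, 0)$ and integrating to $u = \pi/2$ should arrive close to $(0, 1)$.

```python
import math

def rk4_orbit(X, start, u_end, steps):
    """Integrate dx^a/du = X^a(x) from u = 0 to u = u_end with classical RK4."""
    p = list(start)
    h = u_end / steps
    for _ in range(steps):
        k1 = X(p)
        k2 = X([q + 0.5 * h * k for q, k in zip(p, k1)])
        k3 = X([q + 0.5 * h * k for q, k in zip(p, k2)])
        k4 = X([q + h * k for q, k in zip(p, k3)])
        p = [q + h * (a + 2 * b + 2 * c + d) / 6
             for q, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

rotation = lambda p: [-p[1], p[0]]     # hypothetical example field
end_point = rk4_orbit(rotation, [1.0, 0.0], math.pi / 2, 1000)
radius = math.hypot(*end_point)        # stays close to 1 along the orbit
```

Starting points on different circles give different orbits, and exactly one orbit passes through each point away from the origin, where this particular field vanishes.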
We therefore assume that $X^a$ has been given and we have constructed the
local congruence of curves. Suppose we have some tensor field $T^{a\cdots}_{b\cdots}(x)$ which
we wish to differentiate using $X^a$. Then the essential idea is to use the
congruence of curves to drag the tensor at some point P (i.e. $T^{a\cdots}_{b\cdots}(P)$) along
the curve passing through P to some neighbouring point Q, and then
compare this 'dragged-along' tensor with the tensor already there (i.e.
$T^{a\cdots}_{b\cdots}(Q)$) (Fig. 6.3). Since the dragged-along tensor will be of the same type as
the tensor already at Q, we can subtract the two tensors at Q and so define a
derivative by some limiting process as Q tends to P. The technique for
dragging involves viewing the coordinate transformation from P to Q
actively, and applying it to the usual transformation law for tensors. We shall
consider the detailed calculation in the case of a contravariant tensor field of
rank 2, $T^{ab}(x)$ say.
Fig. 6.1 The tangent vector field resulting from a congruence of curves.
Fig. 6.2 The local congruence of curves resulting from a vector field.
Fig. 6.3 Using the congruence to compare tensors at neighbouring points. (Labels: tensor at P; tensor at Q; 'dragged-along' tensor at Q.)
Fig. 6.4 The point P transformed to Q in the same $x^a$-coordinate system. (Label: $x^a$-coordinate chart.)
Consider the transformation
$$x'^a = x^a + \delta u\, X^a(x), \tag{6.3}$$
where $\delta u$ is small. This is called a point transformation and is to be regarded
actively as sending the point $P$, with coordinates $x^a$, to the point $Q$, with
coordinates $x^a + \delta u\, X^a(x)$, where the coordinates of each point are given in
the same $x^a$-coordinate system, i.e.
$$P \to Q, \qquad x^a \to x^a + \delta u\, X^a(x).$$
The point $Q$ clearly lies on the curve of the congruence through $P$ which $X^a$
generates (Fig. 6.4). Differentiating (6.3), we get
$$\frac{\partial x'^a}{\partial x^b} = \delta^a_b + \delta u\, \partial_b X^a. \tag{6.4}$$
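Next, consider the tensor field $T^{ab}$ at the point $P$. Then its components at $P$
are $T^{ab}(x)$ and, under the point transformation (6.3), we have the mapping
$$T^{ab}(x) \to T'^{ab}(x'),$$
i.e. the transformation 'drags' the tensor $T^{ab}$ along from $P$ to $Q$. The
components of the dragged-along tensor are given by the usual transformation
law for tensors (see (5.25)), and so, using (6.4),
$$T'^{ab}(x') = \frac{\partial x'^a}{\partial x^c}\frac{\partial x'^b}{\partial x^d}\, T^{cd}(x)$$
$$= (\delta^a_c + \delta u\, \partial_c X^a)(\delta^b_d + \delta u\, \partial_d X^b)\, T^{cd}(x)$$
$$= T^{ab}(x) + [\partial_c X^a\, T^{cb}(x) + \partial_d X^b\, T^{ad}(x)]\,\delta u + O(\delta u^2). \tag{6.5}$$
Applying Taylor's theorem to first order, we get
$$T^{ab}(x') = T^{ab}(x^c + \delta u\, X^c(x)) = T^{ab}(x) + \delta u\, X^c \partial_c T^{ab}. \tag{6.6}$$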
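We are now in a position to define the Lie derivative of $T^{ab}$ with respect to
$X^a$, which is denoted by $L_X T^{ab}$, as
$$L_X T^{ab} = \lim_{\delta u \to 0} \frac{T^{ab}(x') - T'^{ab}(x')}{\delta u}. \tag{6.7}$$
This involves comparing the tensor $T^{ab}(x')$ already at $Q$ with $T'^{ab}(x')$, the
dragged-along tensor at $Q$. Using (6.5) and (6.6), we find
$$L_X T^{ab} = X^c \partial_c T^{ab} - T^{ac}\partial_c X^b - T^{cb}\partial_c X^a. \tag{6.8}$$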
It can be shown that it is always possible to introduce a coordinate system
such that the curve passing through $P$ is given by $x^1$ varying, with $x^2, x^3,
\ldots, x^n$ all constant along the curve, and such that
$$X^a \overset{*}{=} \delta^a_1 = (1, 0, 0, \ldots, 0) \tag{6.9}$$
along this curve. The notation $\overset{*}{=}$ used in (6.9) means that the equation holds
only in a particular coordinate system. Then it follows that
$$X = X^a \partial_a \overset{*}{=} \partial_1,$$
and equation (6.8) reduces to
$$L_X T^{ab} \overset{*}{=} \partial_1 T^{ab}. \tag{6.10}$$
Thus, in this special coordinate system, Lie differentiation reduces to ordinary
differentiation. In fact, one can define Lie differentiation starting from
this viewpoint.
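The component formula (6.8) can be checked numerically on simple fields. The sketch below is only an illustration under stated assumptions: it works in the flat plane with Cartesian coordinates, estimates partial derivatives by central differences, and uses hypothetical helper names (`partial`, `lie_derivative_rank2`) and example fields that do not come from the text.

```python
def partial(f, x, c, h=1e-5):
    """Central-difference estimate of the partial derivative of the scalar f along x^c."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    return (f(xp) - f(xm)) / (2 * h)

def lie_derivative_rank2(X, T, x):
    """Components of L_X T^{ab} at x, following (6.8):
    X^c d_c T^{ab} - T^{ac} d_c X^b - T^{cb} d_c X^a."""
    n = len(x)
    Xx, Tx = X(x), T(x)
    # dX[c][a] approximates d_c X^a at x
    dX = [[partial(lambda p, a=a: X(p)[a], x, c) for a in range(n)] for c in range(n)]
    result = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            transport = sum(Xx[c] * partial(lambda p, a=a, b=b: T(p)[a][b], x, c)
                            for c in range(n))
            twist = sum(Tx[a][c] * dX[c][b] + Tx[c][b] * dX[c][a] for c in range(n))
            result[a][b] = transport - twist
    return result

# Hypothetical example fields on the plane.
def rotation(x):        # X^a = (-y, x)
    return [-x[1], x[0]]

def dilation(x):        # X^a = (x, y)
    return [x[0], x[1]]

def identity_tensor(x): # T^{ab} with constant components delta^{ab}
    return [[1.0, 0.0], [0.0, 1.0]]

def radial_square(x):   # T^{ab} = Y^a Y^b with Y^a = (x, y)
    return [[x[0] * x[0], x[0] * x[1]], [x[1] * x[0], x[1] * x[1]]]

point = [0.3, 0.7]
lie_rot_identity = lie_derivative_rank2(rotation, identity_tensor, point)
lie_dil_identity = lie_derivative_rank2(dilation, identity_tensor, point)
lie_rot_radial = lie_derivative_rank2(rotation, radial_square, point)
```

Dragging along the rotation leaves both example tensors unchanged, whereas dragging along the dilation rescales the constant tensor, so only the last of these Lie derivatives is non-zero.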
We end the section by collecting together some important properties of Lie
differentiation with respect to $X$ which follow from its definition.
1. It is linear; for example
$$L_X(\lambda Y^a + \mu Z^a) = \lambda L_X Y^a + \mu L_X Z^a, \tag{6.11}$$
where $\lambda$ and $\mu$ are constants. Thus, in particular, the Lie derivative of the
sum and difference of two tensors is the sum and difference, respectively, of
the Lie derivatives of the two tensors.
2. It is Leibniz; that is, it satisfies the usual product rule for differentiation, for
example
$$L_X(Y^a Z_{bc}) = Y^a (L_X Z_{bc}) + (L_X Y^a) Z_{bc}. \tag{6.12}$$
3. It is type-preserving; that is, the Lie derivative of a tensor of type $(p, q)$ is
again a tensor of type $(p, q)$.
4. It commutes with contraction; for example
$$\delta^a_b\, L_X T^b{}_a = L_X T^a{}_a. \tag{6.13}$$
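If we now demand that covariant differentiation satisfies the Leibniz rule,
then we find
$$\nabla_c X_a = \partial_c X_a - \Gamma^b_{ac} X_b. \tag{6.26}$$
Notice again that the differentiation index comes last in the $\Gamma$-term and that
this term enters with a minus sign. The name covariant derivative stems from
the fact that the derivative of a tensor of type $(p, q)$ is of type $(p, q + 1)$, i.e. it
has one extra covariant rank. The expression in the case of a general tensor is
(compare and contrast with (6.17))
$$\nabla_c T^{a\cdots}_{b\cdots} = \partial_c T^{a\cdots}_{b\cdots} + \Gamma^a_{dc} T^{d\cdots}_{b\cdots} + \cdots - \Gamma^d_{bc} T^{a\cdots}_{d\cdots} - \cdots. \tag{6.27}$$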
It follows directly from the transformation laws that the sum of two
connections is not a connection or a tensor. However, the difference of two
connections is a tensor of valence (1, 2), because the inhomogeneous term
cancels out in the transformation. For the same reason, the anti-symmetric
part of $\Gamma^a_{bc}$, namely,
$$T^a_{bc} = \Gamma^a_{bc} - \Gamma^a_{cb},$$
is a tensor called the torsion tensor. If the torsion tensor vanishes, then the
connection is symmetric, i.e.
$$\Gamma^a_{bc} = \Gamma^a_{cb}.$$
From now on, unless we state otherwise, we shall restrict ourselves to
symmetric connections, in which case the torsion vanishes. The assumption
that the connection is symmetric leads to the following useful result. In the
expression for a Lie derivative of a tensor, all occurrences of the partial
derivatives may be replaced by covariant derivatives. For example, in the case
of a vector (exercise)
$$L_X Y^a = X^b \partial_b Y^a - Y^b \partial_b X^a = X^b \nabla_b Y^a - Y^b \nabla_b X^a. \tag{6.29}$$
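To see why the connection terms drop out of the vector formula (6.29), one can expand the covariant derivatives explicitly. The short derivation below is a sketch that assumes the convention $\nabla_b Y^a = \partial_b Y^a + \Gamma^a_{cb} Y^c$, with the differentiation index last in the $\Gamma$-term as the text describes; the relabelling of dummy indices in the second line is ours.

```latex
\begin{align*}
X^b \nabla_b Y^a - Y^b \nabla_b X^a
  &= X^b\left(\partial_b Y^a + \Gamma^a_{cb} Y^c\right)
   - Y^b\left(\partial_b X^a + \Gamma^a_{cb} X^c\right) \\
  &= X^b \partial_b Y^a - Y^b \partial_b X^a
   + \left(\Gamma^a_{cb} - \Gamma^a_{bc}\right) X^b Y^c \\
  &= X^b \partial_b Y^a - Y^b \partial_b X^a
   \qquad \text{when } \Gamma^a_{bc} = \Gamma^a_{cb}.
\end{align*}
```

The bracket in the second line is minus the torsion tensor, so the replacement of partial by covariant derivatives is valid precisely because the connection has been assumed symmetric.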
6.4 Affine geodesics
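If $X^a$ is any contravariant vector field, then we define
$$\nabla_X T^{a\cdots}_{b\cdots} = X^c \nabla_c T^{a\cdots}_{b\cdots}, \tag{6.30}$$
that is, $\nabla_X$ of a tensor is its covariant derivative contracted with $X$. Now in
§6.2 we saw that a contravariant vector field $X$ determines a local congruence
of curves,
$$x^a = x^a(u),$$
where the tangent vector field to the congruence is
$$\frac{dx^a}{du} = X^a.$$
We next define the absolute derivative of a tensor $T^{a\cdots}_{b\cdots}$ along a curve $C$ of
the congruence, written $DT^{a\cdots}_{b\cdots}/Du$, by
$$\frac{D}{Du} T^{a\cdots}_{b\cdots} = \nabla_X T^{a\cdots}_{b\cdots}. \tag{6.31}$$
The tensor $T^{a\cdots}_{b\cdots}$ is said to be parallely propagated or transported along the
curve $C$ if
$$\frac{D}{Du} T^{a\cdots}_{b\cdots} = 0. \tag{6.32}$$
This is a first-order ordinary differential equation for $T^{a\cdots}_{b\cdots}$, and so given an
initial value for $T^{a\cdots}_{b\cdots}$, say $T^{a\cdots}_{b\cdots}(P)$, equation (6.32) determines a tensor along
$C$ which is everywhere parallel to $T^{a\cdots}_{b\cdots}(P)$.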
Using this notation, an affine geodesic is defined as a privileged curve
along which the tangent vector is propagated parallel to itself. In other words,
the parallely propagated vector at any point of the curve is parallel, that is,
proportional, to the tangent vector at that point:
$$\frac{D}{Du}\left(\frac{dx^a}{du}\right) = \lambda(u)\frac{dx^a}{du}. \tag{6.33}$$
Using (6.31), the equation for an affine geodesic can be written in the form
$$\nabla_X X^a = \lambda X^a,$$
or equivalently (exercise)
$$\frac{d^2x^a}{du^2} + \Gamma^a_{bc}\frac{dx^b}{du}\frac{dx^c}{du} = \lambda\frac{dx^a}{du}. \tag{6.34}$$
The last result is very important and so we shall establish it afresh from first
principles using the notation of the last section. Let the neighbouring points
$P$ and $Q$ on $C$ be given by $x^a(u)$ and
$$x^a(u + \delta u) = x^a(u) + \frac{dx^a}{du}\,\delta u$$
to first order in $\delta u$, respectively. Then in the notation of the last section
$$\delta x^a = \frac{dx^a}{du}\,\delta u. \tag{6.35}$$
Fig. 6.6 Two affine geodesics passing through P with given directions.
Fig. 6.7 Two affine geodesics from P refocusing at Q.
The vector $X^a(x)$ at $P$ is now the tangent vector $(dx^a/du)(u)$. The vector at $Q$
parallel to $dx^a/du$ is, by (6.21) and (6.35),
$$\frac{dx^a}{du} - \Gamma^a_{bc}\frac{dx^b}{du}\frac{dx^c}{du}\,\delta u.$$
The vector already at $Q$ is
$$\frac{dx^a}{du}(u + \delta u) = \frac{dx^a}{du} + \frac{d^2x^a}{du^2}\,\delta u$$
to first order in $\delta u$. These last two vectors must be parallel, so we require
$$\frac{dx^a}{du} + \frac{d^2x^a}{du^2}\,\delta u = [1 + \lambda(u)\,\delta u]\left(\frac{dx^a}{du} - \Gamma^a_{bc}\frac{dx^b}{du}\frac{dx^c}{du}\,\delta u\right),$$
where we have written the proportionality factor as $1 + \lambda(u)\,\delta u$ without loss
of generality, since the equation must hold in the limit $\delta u \to 0$. Subtracting
$dx^a/du$ from each side, dividing by $\delta u$ and taking the limit as $\delta u$ tends to zero
produces the result (6.34). Note that $\Gamma^a_{bc}$ appears in the equation multiplied by
the symmetric quantity $(dx^b/du)(dx^c/du)$, and so even if we had not assumed
that $\Gamma^a_{bc}$ was symmetric the equation picks out its symmetric part only.
If the curve is parametrized in such a way that $\lambda$ vanishes (that is, by the
above, so that the tangent vector is transported into itself), then the parameter
is a privileged parameter called an affine parameter, often conventionally
denoted by $s$, and the affine geodesic equation reduces to
$$\frac{D}{Ds}\left(\frac{dx^a}{ds}\right) = 0, \tag{6.36}$$
or equivalently
$$\frac{d^2x^a}{ds^2} + \Gamma^a_{bc}\frac{dx^b}{ds}\frac{dx^c}{ds} = 0. \tag{6.37}$$
An affine parameter $s$ is only defined up to an affine transformation (exercise)
$$s \to \alpha s + \beta,$$
where $\alpha$ and $\beta$ are constants. We can use the affine parameter $s$ to define the
affine length of the geodesic between two points $P_1$ and $P_2$ by $\int_{P_1}^{P_2} ds$, and so
we can compare lengths on the same geodesic. However, we cannot compare
lengths on different geodesics (without a metric) because of the arbitrariness
in the parameter $s$. From the existence and uniqueness theorem for ordinary
differential equations, it follows that corresponding to every direction at a
point there is a unique geodesic passing through the point (Fig. 6.6). Similarly,
any point can be joined to any other point, as long as the points are
sufficiently 'close', by a unique geodesic. However, in the large, geodesics may
focus, that is, meet again (Fig. 6.7).
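The affinely parametrized geodesic equation can be explored numerically once a connection is supplied. The sketch below is an illustration only: it assumes the flat plane in polar coordinates $(r, \phi)$, for which the non-zero connection components are taken to be $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r$, and all function names are hypothetical. The integrated curve should then be an ordinary straight line.

```python
import math

def polar_connection(x):
    """Assumed connection of the flat plane in polar coordinates; G[a][b][c] = Gamma^a_{bc}."""
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G

def geodesic_rhs(connection, state):
    """First-order form of d^2x^a/ds^2 = -Gamma^a_{bc} (dx^b/ds)(dx^c/ds)."""
    n = len(state) // 2
    x, v = state[:n], state[n:]
    G = connection(x)
    acc = [-sum(G[a][b][c] * v[b] * v[c] for b in range(n) for c in range(n))
           for a in range(n)]
    return v + acc

def integrate_geodesic(connection, x0, v0, s_end, steps=2000):
    """Integrate the geodesic from x0 with initial tangent v0 using Runge-Kutta steps."""
    state = list(x0) + list(v0)
    h = s_end / steps
    for _ in range(steps):
        k1 = geodesic_rhs(connection, state)
        k2 = geodesic_rhs(connection, [y + 0.5 * h * k for y, k in zip(state, k1)])
        k3 = geodesic_rhs(connection, [y + 0.5 * h * k for y, k in zip(state, k2)])
        k4 = geodesic_rhs(connection, [y + h * k for y, k in zip(state, k3)])
        state = [y + h * (a + 2 * b + 2 * c + d) / 6.0
                 for y, a, b, c, d in zip(state, k1, k2, k3, k4)]
    n = len(x0)
    return state[:n], state[n:]

# Start at r = 1, phi = 0 moving in the phi direction: in Cartesian terms
# this is the point (1, 0) with velocity (0, 1), so the line x = 1, y = s.
(r_end, phi_end), _ = integrate_geodesic(polar_connection, [1.0, 0.0], [0.0, 1.0], 1.0)
x_end = r_end * math.cos(phi_end)
y_end = r_end * math.sin(phi_end)
```

The initial point and direction fix the curve completely, in line with the uniqueness statement above.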
the manifold by parallely propagating $X^a$. The equation for parallely
propagating $X^a$ is
$$\frac{DX^a}{Du} = \frac{dx^b}{du}\nabla_b X^a = 0,$$
and, since $dx^a/du$ is arbitrary, it follows that the covariant derivative of $X^a$
vanishes, i.e.
$$\nabla_b X^a = \partial_b X^a + \Gamma^a_{cb} X^c = 0. \tag{6.43}$$
Hence, this equation must possess solutions. A necessary condition for a
solution of this first-order partial differential equation is
$$\partial_c \partial_d X^a = \partial_d \partial_c X^a, \tag{6.44}$$
namely, the second mixed partial derivatives should commute. In the
previous section, we met the identity for the commutator of a vector field
(6.38), namely
$$\nabla_c \nabla_d X^a - \nabla_d \nabla_c X^a = \partial_c \partial_d X^a - \partial_d \partial_c X^a + R^a{}_{bcd} X^b.$$
The left-hand side of this equation vanishes by construction, that is, by (6.43);
hence it follows that (6.44) will hold if and only if
$$R^a{}_{bcd} X^b = 0.$$
Finally, since $X^a$ is arbitrary at every point, a necessary condition for
integrability is $R^a{}_{bcd} = 0$ everywhere.
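We next prove sufficiency. We start by considering the difference in
parallely propagating a vector $X^a$ around an infinitesimal loop connecting $x^a$
to $x^a + \delta x^a + dx^a$, first via $x^a + \delta x^a$ and then via $x^a + dx^a$ (Fig. 6.9). From
§6.3, if we parallely transport $X^a$ from $x^a$ to $x^a + \delta x^a$, we obtain the vector
$$X^a(x + \delta x) = X^a(x) + \bar{\delta}X^a(x),$$
where, by (6.21),
$$\bar{\delta}X^a(x) = -\Gamma^a_{bc}(x) X^b(x)\,\delta x^c.$$
Similarly, if we transport this vector subsequently to $x^a + \delta x^a + dx^a$, we
obtain the vector
$$X^a(x + \delta x + dx) = X^a(x + \delta x) + \bar{\delta}X^a(x + \delta x),$$
where, in this case,
$$\bar{\delta}X^a(x + \delta x) = -\Gamma^a_{bc}(x + \delta x) X^b(x + \delta x)\,dx^c.$$
Expanding by Taylor's theorem and using the previous results, we obtain
(where everything is assumed evaluated at $x^a$)
$$\bar{\delta}X^a(x + \delta x) = -(\Gamma^a_{bc} + \partial_d \Gamma^a_{bc}\,\delta x^d)(X^b - \Gamma^b_{ef} X^e \delta x^f)\,dx^c$$
$$= -\Gamma^a_{bc} X^b dx^c - \partial_d \Gamma^a_{bc} X^b \delta x^d dx^c + \Gamma^a_{bc}\Gamma^b_{ef} X^e \delta x^f dx^c + \partial_d \Gamma^a_{bc}\Gamma^b_{ef} X^e \delta x^d \delta x^f dx^c.$$
Neglecting the last term, which is third order, we have
$$X^a(x + \delta x + dx) = X^a - \Gamma^a_{bc} X^b \delta x^c - \Gamma^a_{bc} X^b dx^c - \partial_d \Gamma^a_{bc} X^b \delta x^d dx^c + \Gamma^a_{bc}\Gamma^b_{ef} X^e \delta x^f dx^c.$$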
Fig. 6.9 Transporting $X^a$ around an infinitesimal loop.
Fig. 6.10 Deforming $C_1$ into $C_2$ (infinitesimally at each stage).
To obtain the equivalent result for the path connecting $x^a$ to $x^a + \delta x^a + dx^a$
via $x^a + dx^a$, we simply interchange $\delta x^a$ and $dx^a$ to give
$$X^a(x + dx + \delta x) = X^a - \Gamma^a_{bc} X^b dx^c - \Gamma^a_{bc} X^b \delta x^c - \partial_d \Gamma^a_{bc} X^b dx^d \delta x^c + \Gamma^a_{bc}\Gamma^b_{ef} X^e dx^f \delta x^c.$$
Hence, the difference between these two vectors is
$$\Delta X^a = X^a(x + \delta x + dx) - X^a(x + dx + \delta x)$$
$$= (\partial_d \Gamma^a_{bc} - \partial_c \Gamma^a_{bd} + \Gamma^a_{ed}\Gamma^e_{bc} - \Gamma^a_{ec}\Gamma^e_{bd})\, X^b \delta x^c dx^d$$
$$= R^a{}_{bdc} X^b \delta x^c dx^d$$
$$= -R^a{}_{bcd} X^b \delta x^c dx^d$$
by (6.39) and the fact that the Riemann tensor is anti-symmetric on its
last pair of indices (see (6.77)). Thus, the vector $X^a$ will be the same at
$x^a + \delta x^a + dx^a$, irrespective of which path is taken, if and only if $R^a{}_{bcd} = 0$. It
follows that if the Riemann tensor vanishes then the vector $X^a$ will not change
if parallely transported around any infinitesimal closed loop. Using this result
and assuming the manifold has no holes (that is, the manifold is simply
connected), then we can continuously deform one curve into another by
deforming the curves infinitesimally at each stage (Fig. 6.10), which establishes
that the connection is integrable (check).
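The path dependence of parallel transport when the Riemann tensor does not vanish can be seen in a simple numerical experiment. The sketch below is not taken from the text: it assumes the unit 2-sphere with coordinates $(\theta, \phi)$ and the connection components $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$ (the metric connection of the sphere, which anticipates §6.10), and it transports a vector once around the circle of latitude $\theta = \pi/3$ using the transport law $dX^a/d\phi = -\Gamma^a_{bc} X^b\, dx^c/d\phi$. The function names are hypothetical.

```python
import math

def sphere_connection(theta):
    """Assumed connection of the unit sphere; gamma[a][b][c] = Gamma^a_{bc}, index 0 = theta, 1 = phi."""
    s, c = math.sin(theta), math.cos(theta)
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -s * c
    gamma[1][0][1] = gamma[1][1][0] = c / s
    return gamma

def transport_rhs(theta0, X):
    """dX^a/dphi = -Gamma^a_{bc} X^b dx^c/dphi along the latitude circle theta = theta0."""
    gamma = sphere_connection(theta0)
    velocity = [0.0, 1.0]  # dx^c/dphi: only phi changes along the circle
    return [-sum(gamma[a][b][c] * X[b] * velocity[c]
                 for b in range(2) for c in range(2)) for a in range(2)]

def transport_around_latitude(theta0, X0, steps=4000):
    """Parallely transport X0 once around the circle (phi from 0 to 2*pi)."""
    X = list(X0)
    h = 2.0 * math.pi / steps
    for _ in range(steps):
        k1 = transport_rhs(theta0, X)
        k2 = transport_rhs(theta0, [x + 0.5 * h * k for x, k in zip(X, k1)])
        k3 = transport_rhs(theta0, [x + 0.5 * h * k for x, k in zip(X, k2)])
        k4 = transport_rhs(theta0, [x + h * k for x, k in zip(X, k3)])
        X = [x + h * (p + 2 * q + 2 * r + s) / 6.0
             for x, p, q, r, s in zip(X, k1, k2, k3, k4)]
    return X

# Start with a vector pointing along the theta direction.
final_vector = transport_around_latitude(math.pi / 3, [1.0, 0.0])
```

At this latitude the vector returns to its starting point reversed, so on the sphere the result of parallel transport depends on the path and the connection is not integrable.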
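The second lemma is as follows: a necessary and sufficient condition for a
manifold to be affine flat is that the connection is symmetric and integrable.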
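Sufficiency is established by first choosing $n$ linearly independent vectors
$$X_{\mathbf{i}}{}^a \qquad (\mathbf{i} = 1, 2, \ldots, n)$$
at $P$, where the bold index $\mathbf{i}$ runs from 1 to $n$ and labels the vectors. Using the
integrability assumption we can construct the parallel vector fields $X_{\mathbf{i}}{}^a(x)$ and
these will also be linearly independent everywhere. Therefore, at each point $P$,
$X_{\mathbf{i}}{}^a(P)$ is a non-singular matrix of numbers and so we can construct its
inverse, denoted by $X^{\mathbf{i}}{}_b$, which must satisfy
$$X_{\mathbf{i}}{}^a X^{\mathbf{i}}{}_b = \delta^a_b, \tag{6.45}$$
where there is a summation over $\mathbf{i}$. Multiplying the propagation equation
$$\partial_b X_{\mathbf{i}}{}^a + \Gamma^a_{cb} X_{\mathbf{i}}{}^c = 0$$
by $X^{\mathbf{i}}{}_d$ produces
$$\Gamma^a_{db} = -X^{\mathbf{i}}{}_d\, \partial_b X_{\mathbf{i}}{}^a. \tag{6.46}$$
Differentiating (6.45), we obtain
$$X_{\mathbf{i}}{}^a\, \partial_c X^{\mathbf{i}}{}_b = -X^{\mathbf{i}}{}_b\, \partial_c X_{\mathbf{i}}{}^a = \Gamma^a_{bc} \tag{6.47}$$
by (6.46). Using (6.47), we find that
$$X_{\mathbf{i}}{}^a(\partial_c X^{\mathbf{i}}{}_b - \partial_b X^{\mathbf{i}}{}_c) = \Gamma^a_{bc} - \Gamma^a_{cb} = 0$$
because the connection is symmetric by assumption. Since the determinant of
$X_{\mathbf{i}}{}^a$ is non-zero, it follows that the quantity in brackets must vanish, from
which we get
$$\partial_c X^{\mathbf{i}}{}_b = \partial_b X^{\mathbf{i}}{}_c.$$
This in turn implies that $X^{\mathbf{i}}{}_b$ must be the gradient of $n$ scalar fields, $f^{\mathbf{i}}(x)$ say,
that is,
$$X^{\mathbf{i}}{}_b = \partial_b f^{\mathbf{i}}(x).$$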
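If we consider the transformation
$$x^a \to x'^a = f^{\mathbf{a}}(x),$$
then
$$\frac{\partial x'^a}{\partial x^b} = \partial_b f^{\mathbf{a}}(x) = X^{\mathbf{a}}{}_b, \tag{6.48}$$
and so, taking inverses,
$$\frac{\partial x^a}{\partial x'^b} = X_{\mathbf{b}}{}^a. \tag{6.49}$$
Multiplying (6.23) by $X_{\mathbf{a}}{}^g$ and using (6.48) and (6.49) and then (6.45) and
(6.47), we find
$$X_{\mathbf{a}}{}^g\, \Gamma'^a_{bc} = X_{\mathbf{a}}{}^g X^{\mathbf{a}}{}_d\, X_{\mathbf{b}}{}^e X_{\mathbf{c}}{}^f\, \Gamma^d_{ef} - X_{\mathbf{b}}{}^d X_{\mathbf{c}}{}^e\, X_{\mathbf{a}}{}^g\, \partial_d X^{\mathbf{a}}{}_e$$
$$= \delta^g_d\, X_{\mathbf{b}}{}^e X_{\mathbf{c}}{}^f\, \Gamma^d_{ef} - X_{\mathbf{b}}{}^d X_{\mathbf{c}}{}^e\, \Gamma^g_{ed} = 0.$$
Again, since the determinant of $X_{\mathbf{a}}{}^g$ is non-zero, $\Gamma'^a_{bc}$ vanishes everywhere in
this coordinate system and hence the manifold is affine flat. The necessity is
straightforward and is left as an exercise.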
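If we put these two lemmas together, we get the result we have been looking
for: a necessary and sufficient condition for a manifold to be affine flat is that
the connection is symmetric and the Riemann tensor vanishes.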
6.8 The metric
Any symmetric covariant tensor field of rank 2, say $g_{ab}(x)$, defines a metric. A
manifold endowed with a metric is called a Riemannian manifold. A metric
can be used to define distances and lengths of vectors. The infinitesimal
distance (or interval in relativity), which we call $ds$, between two neighbouring
points $x^a$ and $x^a + dx^a$ is defined by
$$ds^2 = g_{ab}(x)\, dx^a dx^b. \tag{6.50}$$
Note that this gives the square of the infinitesimal distance, $(ds)^2$, which is
conventionally written as $ds^2$. The equation (6.50) is also known as the line
element and $g_{ab}$ is also called the metric form or first fundamental form. The
square of the length or norm of a contravariant vector $X^a$ is defined by
$$X^2 = g_{ab}(x)\, X^a X^b.$$
be shown that these curves can be parametrized by a special parameter $u$,
called an affine parameter, such that their equation does not possess a right-hand
side, that is,
$$\frac{d^2x^a}{du^2} + \left\{{a \atop bc}\right\}\frac{dx^b}{du}\frac{dx^c}{du} = 0,$$
where
$$g_{ab}\frac{dx^a}{du}\frac{dx^b}{du} = 0.$$
The last equation follows since the distance between any two points is zero, or
equivalently the tangent vector is null. Again, any other affine parameter is
related to $u$ by the transformation
$$u \to \alpha u + \beta,$$
where $\alpha$ and $\beta$ are constants.
6.10 The metric connection
In general, if we have a manifold endowed with both an affine connection and
metric, then it possesses two classes of curves, affine geodesics and metric
geodesics, which will be different (Fig. 6.11). However, comparing (6.37) with
(6.66), the two classes will coincide if we take
$$\Gamma^a_{bc} = \left\{{a \atop bc}\right\} \tag{6.70}$$
or, using (6.64) and (6.62), if
$$\Gamma^a_{bc} = \tfrac{1}{2} g^{ad}(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}). \tag{6.71}$$
Fig. 6.11 Affine and metric geodesics on a manifold.
It follows from the last equation that the connection is necessarily symmetric,
i.e.
$$\Gamma^a_{bc} = \Gamma^a_{cb}. \tag{6.72}$$
In fact, if one checks the transformation properties of $\left\{{a \atop bc}\right\}$ from first
principles, it does indeed transform like a connection (exercise). This special
connection built out of the metric and its derivatives is called the metric
connection. From now on, we shall always work with the metric connection
and we shall denote it by $\Gamma^a_{bc}$ rather than $\left\{{a \atop bc}\right\}$, where $\Gamma^a_{bc}$ is defined by (6.71).
This definition leads immediately to the identity (exercise)
$$\nabla_c g_{ab} = 0. \tag{6.73}$$
Conversely, if we require that (6.73) holds for an arbitrary symmetric
connection, then it can be deduced (exercise) that the connection is necessarily
the metric connection. Thus, we have the following important result: for a
symmetric connection, the covariant derivative of the metric vanishes if and
only if the connection is the metric connection.
In addition, we can show that
$$\nabla_c \delta^a_b = 0 \tag{6.74}$$
and
$$\nabla_c g^{ab} = 0. \tag{6.75}$$
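The definition (6.71) of the metric connection, $\Gamma^a_{bc} = \tfrac12 g^{ad}(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc})$, lends itself to a direct numerical check. The sketch below is illustrative only: it estimates the metric derivatives by central differences, takes the inverse metric as a user-supplied function, and uses the plane in polar coordinates, $ds^2 = dr^2 + r^2 d\phi^2$, as an assumed example. All names are hypothetical.

```python
def metric_connection(metric, inverse_metric, x, h=1e-5):
    """Return G with G[a][b][c] ~ Gamma^a_{bc} at the point x, following (6.71)."""
    n = len(x)

    def dg(k, b, c):
        """Central-difference estimate of d_k g_{bc} at x."""
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (metric(xp)[b][c] - metric(xm)[b][c]) / (2 * h)

    ginv = inverse_metric(x)
    return [[[0.5 * sum(ginv[a][d] * (dg(b, d, c) + dg(c, d, b) - dg(d, b, c))
                        for d in range(n))
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]

# Assumed example: the plane in polar coordinates x = (r, phi).
def polar_metric(x):
    return [[1.0, 0.0], [0.0, x[0] ** 2]]

def polar_inverse_metric(x):
    return [[1.0, 0.0], [0.0, 1.0 / x[0] ** 2]]

gamma = metric_connection(polar_metric, polar_inverse_metric, [2.0, 0.5])
gamma_r_phiphi = gamma[0][1][1]   # expected to be close to -r
gamma_phi_rphi = gamma[1][0][1]   # expected to be close to 1/r
```

The symmetry (6.72) in the lower indices shows up directly in the computed array.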
6.11 Metric flatness
Now at any point $P$ of a manifold, $g_{ab}$ is a symmetric matrix of real numbers.
Therefore, by standard matrix theory, there exists a transformation which
reduces the matrix to diagonal form with every diagonal term either $+1$
or $-1$. The excess of plus signs over minus signs in this form is called the
signature of the metric. Assuming that the metric is continuous over the
manifold and non-singular, then it follows that the signature is an invariant.
In general, it will not be possible to find a coordinate system in which the
metric reduces to this diagonal form everywhere. If, however, there does exist
a coordinate system in which the metric reduces to diagonal form with $\pm 1$
diagonal elements everywhere, then the metric is called flat.
How does metric flatness relate to affine flatness in the case we are
interested in, that is, when the connection is the metric connection? The
answer is contained in the following result: a necessary and sufficient
condition for a metric to be flat is that its Riemann tensor vanishes.
Necessity follows from the fact that there exists a coordinate system in
which the metric is diagonal with $\pm 1$ diagonal elements. Since the metric is
constant everywhere, its partial derivatives vanish and therefore the metric
connection $\Gamma^a_{bc}$ vanishes as a consequence of the definition (6.71). Since $\Gamma^a_{bc}$
vanishes everywhere then so must its derivatives. (One way to see this is to
recall the definition of partial differentiation which involves subtracting
quantities at neighbouring points. If the quantities are always zero, then their
difference vanishes, and so does the resulting limit.) The Riemann tensor
therefore vanishes by the definition (6.39).
Conversely, if the Riemann tensor vanishes, then by the theorem of §6.7,
there exists a special coordinate system in which the connection vanishes
everywhere. Since this is the metric connection, by (6.73),
$$\nabla_c g_{ab} = \partial_c g_{ab} - \Gamma^d_{ac} g_{db} - \Gamma^d_{bc} g_{ad} = 0,$$
from which we get
$$\partial_c g_{ab} = \Gamma^d_{ac} g_{db} + \Gamma^d_{bc} g_{ad}, \tag{6.76}$$
and it follows that $\partial_c g_{ab} = 0$. The metric is therefore constant everywhere and
hence can be transformed into diagonal form with diagonal elements $\pm 1$.
Note the result (6.76) which expresses the ordinary derivative of the metric in
terms of the connection. This equation will prove useful later.
Combining this theorem with the theorem of §6.7, we see that if we use the
metric connection then metric flatness coincides with affine flatness.
6.12 The curvature tensor
The curvature tensor or Riemann-Christoffel tensor (Riemann tensor for
short) is defined by (6.39), namely,
$$R^a{}_{bcd} = \partial_c \Gamma^a_{bd} - \partial_d \Gamma^a_{bc} + \Gamma^e_{bd}\Gamma^a_{ec} - \Gamma^e_{bc}\Gamma^a_{ed},$$
where $\Gamma^a_{bc}$ is the metric connection, which by (6.71) is given as
$$\Gamma^a_{bc} = \tfrac{1}{2} g^{ad}(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}).$$
Thus, $R^a{}_{bcd}$ depends on the metric and its first and second derivatives. It
follows immediately from the definition that it is anti-symmetric on its last
pair of indices
$$R^a{}_{bcd} = -R^a{}_{bdc}. \tag{6.77}$$
The fact that the connection is symmetric leads to the identity
$$R^a{}_{bcd} + R^a{}_{dbc} + R^a{}_{cdb} = 0. \tag{6.78}$$
Lowering the first index with the metric, then it is easy to establish, for
example by using geodesic coordinates, that the lowered tensor is symmetric
under interchange of the first and last pair of indices, that is,
$$R_{abcd} = R_{cdab}. \tag{6.79}$$
Combining this with equation (6.77), we see that the lowered tensor is
anti-symmetric on its first pair of indices as well:
$$R_{abcd} = -R_{bacd}. \tag{6.80}$$
Collecting these symmetries together, we see that the lowered curvature
tensor satisfies
$$R_{abcd} = -R_{abdc} = -R_{bacd} = R_{cdab}, \qquad R_{abcd} + R_{adbc} + R_{acdb} = 0. \tag{6.81}$$
These symmetries considerably reduce the number of independent components;
in fact, in $n$ dimensions, the number is reduced from $n^4$ to $\tfrac{1}{12} n^2(n^2 - 1)$.
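For a feel of how strong this reduction is, the count $\tfrac{1}{12}n^2(n^2 - 1)$ can be tabulated for small $n$. The helper name below is hypothetical; the formula is the one quoted in the text.

```python
def riemann_independent_components(n):
    """Number of independent components of the lowered Riemann tensor in n dimensions."""
    return n * n * (n * n - 1) // 12

# Compare the unconstrained count n^4 with the reduced count for n = 2, 3, 4.
counts = {n: (n ** 4, riemann_independent_components(n)) for n in (2, 3, 4)}
```

In four dimensions the 256 components of a general rank-4 tensor are cut down to 20, and in two dimensions a single component remains.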
In addition to the algebraic identities, it can be shown, again most easily
by using geodesic coordinates, that the curvature tensor satisfies a set of
Exercises
6.1 (§6.2) Prove (6.13) by showing that $L_X \delta^a_b = 0$ in two
ways: (i) using (6.17); (ii) from first principles (remembering
Exercise 5.8).
6.2 (§6.2) Use (6.17) to find expressions for $L_X Z_{bc}$ and
$L_X(Y^a Z_{bc})$. Use these expressions and (6.15) to check the
Leibniz property in the form (6.12).
6.3 (§6.3) Establish (6.23) by assuming that the quantity
defined by (6.22) has the tensor character indicated. Take the
partial derivative of
$$\delta^a_b = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^d}{\partial x'^b}$$
with respect to $x'^c$ to establish the alternative form (6.24).
6.4 (§6.3) Show that covariant differentiation commutes
with contraction by checking that $\nabla_c \delta^a_b = 0$.
6.5 (§6.3) Assuming (6.22) and (6.25), apply the Leibniz rule
to the covariant derivative of $X_a X^a$, where $X^a$ is arbitrary, to
verify (6.26).
6.6 (§6.3) Check (6.29).
6.7 (§6.4) If $X$, $Y$, and $Z$ are vector fields, $f$ and $g$ smooth
functions, and $\lambda$ and $\mu$ constants, then show that
(i) $\nabla_X(\lambda Y + \mu Z) = \lambda \nabla_X Y + \mu \nabla_X Z$,
(ii) $\nabla_{fX + gY} Z = f\nabla_X Z + g\nabla_Y Z$,
(iii) $\nabla_X(fY) = (Xf)Y + f\nabla_X Y$.
6.8 (§6.4) Show that (6.33) leads to (6.34).
6.9 (§6.4) If $s$ is an affine parameter, then show that, under
the transformation
$$s \to \bar{s} = \bar{s}(s),$$
the parameter $\bar{s}$ will be affine only if $\bar{s} = \alpha s + \beta$, where $\alpha$ and
$\beta$ are constants.
6.10 (§6.5) Show that
$$\nabla_c \nabla_d X^a{}_b - \nabla_d \nabla_c X^a{}_b = R^a{}_{ecd} X^e{}_b - R^e{}_{bcd} X^a{}_e.$$
6.11 (§6.5) Show that
$$\nabla_X(\nabla_Y Z^a) - \nabla_Y(\nabla_X Z^a) - \nabla_{[X,Y]} Z^a = R^a{}_{bcd} Z^b X^c Y^d.$$
6.12 (§6.7) Prove that if a manifold is affine flat then the
connection is necessarily integrable and symmetric.
6.13 (§6.8) Show that if $g_{ab}$ is diagonal, i.e. $g_{ab} = 0$ if $a \neq b$,
then $g^{ab}$ is diagonal with corresponding reciprocal diagonal
elements.
6.14 (§6.8) The line elements of $\mathbb{R}^3$ in Cartesian, cylindrical
polar, and spherical polar coordinates are given respectively
by
(i) $ds^2 = dx^2 + dy^2 + dz^2$,
(ii) $ds^2 = dR^2 + R^2 d\phi^2 + dz^2$,
(iii) $ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2$.
Find $g_{ab}$, $g^{ab}$, and $g$ in each case.
6.15 (§6.8) Express $T_{ab}$ in terms of $T^{cd}$.
6.16 (§6.9) Write down the tensor transformation law of
$g_{ab}$. Show directly that
$$\left\{{a \atop bc}\right\} = \tfrac{1}{2} g^{ad}(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc})$$
transforms like a connection.
6.17 (§6.9) Find the geodesic equation for $\mathbb{R}^3$ in cylindrical
polars. [Hint: use the results of Exercise 6.14(ii) to compute
the metric connection and substitute in (6.68).]
6.18 (§6.9) Consider a 3-space with coordinates
$(x^a) = (x, y, z)$ and line element
$$ds^2 = dx^2 + dy^2 - dz^2.$$
Prove that the null geodesics are given by
$$x = lu + l', \quad y = mu + m', \quad z = nu + n',$$
where $u$ is a parameter and $l$, $l'$, $m$, $m'$, $n$, $n'$ are arbitrary
constants satisfying $l^2 + m^2 - n^2 = 0$.
6.19 (§6.10) Prove that $\nabla_c g_{ab} = 0$. Deduce that
$\nabla_b X_a = g_{ac} \nabla_b X^c$.
6.20 (§6.10) Suppose we have an arbitrary symmetric connection
$\Gamma^a_{bc}$ satisfying $\nabla_c g_{ab} = 0$. Deduce that $\Gamma^a_{bc}$ must be the
metric connection. [Hint: use the equation to find expressions
for $\partial_b g_{dc}$, $\partial_c g_{db}$ and $-\partial_d g_{bc}$, as in (6.76), add the
equations together, and multiply by $\tfrac{1}{2} g^{ad}$.]
6.21 (§6.11) The Minkowski line element in Minkowski
coordinates
$$(x^a) = (x^0, x^1, x^2, x^3) = (t, x, y, z)$$
is given by
$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2.$$
(i) What is the signature?
(ii) Is the metric non-singular?
(iii) Is the metric flat?
6.22 (§6.11) The line element of $\mathbb{R}^3$ in a particular coordinate
system is
$$ds^2 = (dx^1)^2 + (x^1)^2 (dx^2)^2 + (x^1 \sin x^2)^2 (dx^3)^2.$$
(i) Identify the coordinates.
(ii) Is the metric flat?
6.23 (§6.12) Establish the identities (6.78) and (6.79). [Hint:
choose an arbitrary point $P$ and introduce geodesic coordinates
at $P$.] Show that (6.78) is equivalent to $R^a{}_{[bcd]} = 0$.
6.24 (§6.12) Establish the identity (6.82). [Hint: use geodesic
coordinates.] Show that (6.82) is equivalent to
$R_{de[ab;c]} = 0$. Deduce (6.86).
6.25 (§6.12) Show that $G_{ab} = 0$ if and only if $R_{ab} = 0$.
6.26 (§6.13) Establish the identity (6.89). Deduce that the
Weyl tensor is trace-free on all pairs of indices.
6.27 (§6.13) Show that angles between vectors and ratios of
lengths of vectors, but not lengths, are the same for conformally
related metrics.
6.28 (§6.13) Prove that the null geodesics of two conformally
related metrics coincide. [Hint: the two classes of geodesics
need not both be affinely parametrized.]
6.29 (§6.13) Establish (6.91).
6.30 (§6.13) Establish the theorem that any two-dimensional
Riemannian manifold is conformally flat in the case of a
metric of signature 0, i.e. at any point the metric can be
reduced to the diagonal form $(+1, -1)$ say. [Hint: use null
curves as coordinate curves, that is, change to new coordinates
$$\lambda = \lambda(x^0, x^1), \qquad \nu = \nu(x^0, x^1)$$
satisfying
$$g^{ab}\lambda_{,a}\lambda_{,b} = g^{ab}\nu_{,a}\nu_{,b} = 0$$
and show that the line element reduces to the form
$$ds^2 = e^{2\mu}\, d\lambda\, d\nu$$
and finally introduce new coordinates $\tfrac{1}{2}(\lambda + \nu)$ and
$\tfrac{1}{2}(\lambda - \nu)$.]
6.31 This final exercise consists of a long calculation which
will be needed later in the book. If we take coordinates
$$(x^a) = (x^0, x^1, x^2, x^3) = (t, r, \theta, \phi),$$
then the four-dimensional spherically symmetric line element
is
$$ds^2 = e^{\nu} dt^2 - e^{\lambda} dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2,$$
where $\nu = \nu(t, r)$ and $\lambda = \lambda(t, r)$ are arbitrary functions of
$t$ and $r$.
(i) Find $g_{ab}$, $g$, and $g^{ab}$ (see Exercise 6.13).
(ii) Use the expressions in (i) to calculate $\Gamma^a_{bc}$. [Hint:
remember $\Gamma^a_{bc} = \Gamma^a_{cb}$.]
(iii) Calculate $R^a{}_{bcd}$. [Hint: use the symmetry relations
(6.80).]
(iv) Calculate $R_{ab}$, $R$, and $G_{ab}$.
(v) Calculate $G^a{}_b$ ($= g^{ac} G_{cb} = G_b{}^a$).
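7.1 Tensor densities
A tensor density of weight $W$, denoted conventionally by a gothic letter,
$\mathfrak{T}^{a\cdots}_{b\cdots}$, transforms like an ordinary tensor, except that in addition the $W$th
power of the Jacobian
$$J = \left|\frac{\partial x^a}{\partial x'^b}\right|$$
appears as a factor, i.e.
$$\mathfrak{T}'^{a\cdots}_{b\cdots} = J^W \frac{\partial x'^a}{\partial x^c}\cdots\frac{\partial x^d}{\partial x'^b}\cdots \mathfrak{T}^{c\cdots}_{d\cdots}. \tag{7.1}$$
Then, with certain modifications, we can combine tensor densities in much
the same way as we do tensors. One exception, which follows from (7.1), is
that the product of two tensor densities of weight $W_1$ and $W_2$ is a tensor
density of weight $W_1 + W_2$. There is some arbitrariness in defining the
covariant derivative of a tensor density, but we shall adhere to the definition
that if $\mathfrak{T}^{a\cdots}_{b\cdots}$ is a tensor density of weight $W$ then
$$\nabla_c \mathfrak{T}^{a\cdots}_{b\cdots} = (\text{the usual expression if } \mathfrak{T}^{a\cdots}_{b\cdots} \text{ were a tensor}) - W\,\Gamma^d_{dc}\,\mathfrak{T}^{a\cdots}_{b\cdots}.$$
For example, the covariant derivative of a vector density of weight $W$ is
$$\nabla_c \mathfrak{T}^a = \partial_c \mathfrak{T}^a + \Gamma^a_{bc}\mathfrak{T}^b - W\,\Gamma^b_{bc}\mathfrak{T}^a.$$
In the special case when $W = +1$ and $c = a$, we get the important result
(check)
$$\nabla_a \mathfrak{T}^a = \partial_a \mathfrak{T}^a,$$
that is, the covariant divergence of a vector density of weight $+1$ is identical
to its ordinary divergence. It can be shown that both these quantities are
scalar densities of weight $+1$ (exercise).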
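The check that the covariant divergence of a vector density of weight $+1$ reduces to its ordinary divergence takes only a few lines. The sketch below assumes the vector-density formula $\nabla_c \mathfrak{T}^a = \partial_c \mathfrak{T}^a + \Gamma^a_{bc}\mathfrak{T}^b - W\,\Gamma^b_{bc}\mathfrak{T}^a$ and a symmetric connection, as adopted earlier in the chapter; the relabelling of dummy indices is ours.

```latex
\begin{align*}
\nabla_a \mathfrak{T}^a
  &= \partial_a \mathfrak{T}^a + \Gamma^a_{ba}\mathfrak{T}^b - \Gamma^b_{ba}\mathfrak{T}^a
     && (W = +1,\ c = a) \\
  &= \partial_a \mathfrak{T}^a + \Gamma^a_{ba}\mathfrak{T}^b - \Gamma^a_{ab}\mathfrak{T}^b
     && (\text{relabel } a \leftrightarrow b \text{ in the last term}) \\
  &= \partial_a \mathfrak{T}^a
     && (\Gamma^a_{ba} = \Gamma^a_{ab}).
\end{align*}
```

The weight term is exactly what is needed to cancel the connection term, which is why the result holds for weight $+1$ and not for an ordinary vector ($W = 0$).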